IEEE/CAA Journal of Automatica Sinica
Citation: | Xiong Yang and Bo Zhao, "Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints," IEEE/CAA J. Autom. Sinica, vol. 7, no. 2, pp. 575-583, Mar. 2020. doi: 10.1109/JAS.2020.1003063 |
[1] |
D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. London: IET, 2013.
|
[2] |
X. Yang and H. B. He, “Self-learning robust optimal control for continuous-time nonlinear systems with mismatched disturbances,” Neural Networks, vol. 99, pp. 19–30, 2018. doi: 10.1016/j.neunet.2017.11.022
|
[3] |
W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd ed. Hoboken, NJ: John Wiley & Sons, 2007.
|
[4] |
D. Liu, Q. Wei, D. Wang, X. Yang, and H. Li, Adaptive Dynamic Programming with Applications in Optimal Control. Cham, Switzerland: Springer, 2017.
|
[5] |
X. N. Zhong, Z. Ni, and H. B. He, “Gr-GDHP: a new architecture for globalized dual heuristic dynamic programming,” IEEE Trans. Cybernetics, vol. 47, no. 10, pp. 3318–3330, Oct. 2017. doi: 10.1109/TCYB.2016.2598282
|
[6] |
D. Wang and X. N. Zhong, “Advanced policy learning near-optimal regulation,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 3, pp. 743–749, May 2019. doi: 10.1109/JAS.2019.1911489
|
[7] |
Q. L. Wei, D. R. Liu, Y. Liu, and R. Z. Song, “Optimal constrained self-learning battery sequential management in microgrid via adaptive dynamic programming,” IEEE/CAA J. Autom. Sinica, vol. 4, no. 2, pp. 168–176, Apr. 2017. doi: 10.1109/JAS.2016.7510262
|
[8] |
L. Dong, X. N. Zhong, C. Y. Sun, and H. B. He, “Event-triggered adaptive dynamic programming for continuous-time systems with control constraints,” IEEE Trans. Neural Networks and Learning Systems, vol. 28, no. 8, pp. 1941–1952, Aug. 2017. doi: 10.1109/TNNLS.2016.2586303
|
[9] |
B. Zhao and D. R. Liu, “Event-triggered decentralized tracking control of modular reconfigurable robots through adaptive dynamic programming,” IEEE Trans. Industrial Electronics, vol. 67, no. 4, pp. 3054–3064, Apr. 2020. doi: 10.1109/TIE.2019.2914571
|
[10] |
Y. Jiang and Z.-P. Jiang, Robust Adaptive Dynamic Programming. Hoboken, New Jersey: John Wiley & Sons, 2017.
|
[11] |
R. Z. Song, F. L. Lewis, and Q. L. Wei, “Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero-sum games,” IEEE Trans. Neural Networks and Learning Systems, vol. 28, no. 3, pp. 704–713, Mar. 2017. doi: 10.1109/TNNLS.2016.2582849
|
[12] |
H. G. Zhang, K. Zhang, Y. L. Cai, and J. Han, “Adaptive fuzzy fault-tolerant tracking control for partially unknown systems with actuator faults via integral reinforcement learning method,” IEEE Trans. Fuzzy Systems, vol. 27, no. 10, pp. 1986–1998, Oct. 2019. doi: 10.1109/TFUZZ.2019.2893211
|
[13] |
L. Liu, Z. S. Wang, and H. G. Zhang, “Adaptive fault-tolerant tracking control for MIMO discrete-time systems via reinforcement learning algorithm with less learning parameters,” IEEE Trans. Automation Science and Engineering, vol. 14, no. 1, pp. 299–313, Jan. 2017. doi: 10.1109/TASE.2016.2517155
|
[14] |
Y.-J. Liu, S. Li, S. C. Tong, and C. L. P. Chen, “Adaptive reinforcement learning control based on neural approximation for nonlinear discrete-time systems with unknown nonaffine dead-zone input,” IEEE Trans. Neural Networks and Learning Systems, vol. 30, no. 1, pp. 295–305, Jan. 2019. doi: 10.1109/TNNLS.2018.2844165
|
[15] |
J. N. Li, H. Modares, T. Y. Chai, F. L. Lewis, and L. H. Xie, “Off-policy reinforcement learning for synchronization in multiagent graphical games,” IEEE Trans. Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2434–2445, Oct. 2017. doi: 10.1109/TNNLS.2016.2609500
|
[16] |
J. H. Qin, M. Li, Y. Shi, Q. C. Ma, and W. X. Zheng, “Optimal synchronization control of multiagent systems with input saturation via off-policy reinforcement learning,” IEEE Trans. Neural Networks and Learning Systems, vol. 30, no. 1, pp. 85–96, Jan. 2019. doi: 10.1109/TNNLS.2018.2832025
|
[17] |
X. Yang and H. B. He, “Event-triggered robust stabilization of nonlinear input-constrained systems using single network adaptive critic designs,” IEEE Trans. Systems, Man, and Cybernetics: Systems, doi: 10.1109/TSMC.2018.2853089, Jul. 2018.
|
[18] |
B. Widrow, N. K. Gupta, and S. Maitra, “Punish/reward: learning with a critic in adaptive threshold systems,” IEEE Trans. Systems, Man, and Cybernetics, vol. 3, no. 5, pp. 455–465, Sept. 1973.
|
[19] |
D. V. Prokhorov and D. C. Wunsch, “Adaptive critic designs,” IEEE Trans. Neural Networks, vol. 8, no. 5, pp. 997–1007, Sept. 1997. doi: 10.1109/72.623201
|
[20] |
R. Padhi, N. Unnikrishnan, X. H. Wang, and S. N. Balakrishnan, “A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems,” Neural Networks, vol. 19, no. 10, pp. 1648–1660, 2006. doi: 10.1016/j.neunet.2006.08.010
|
[21] |
D. Wang, D. R. Liu, Q. C. Zhang, and D. B. Zhao, “Data-based adaptive critic designs for nonlinear robust optimal control with uncertain dynamics,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 46, no. 11, pp. 1544–1555, Nov. 2016. doi: 10.1109/TSMC.2015.2492941
|
[22] |
B. Luo, D. R. Liu, T. W. Huang, and D. Wang, “Model-free optimal tracking control via critic-only Q-learning,” IEEE Trans. Neural Networks and Learning Systems, vol. 27, no. 10, pp. 2134–2144, Oct. 2016. doi: 10.1109/TNNLS.2016.2585520
|
[23] |
H. G. Zhang, Y. H. Luo, and D. R. Liu, “Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints,” IEEE Trans. Neural Networks, vol. 20, no. 9, pp. 1490–1503, Sept. 2009. doi: 10.1109/TNN.2009.2027233
|
[24] |
M. M. Ha, D. Wang, and D. R. Liu, “Event-triggered adaptive critic control design for discrete-time constrained nonlinear systems,” IEEE Trans. Systems, Man, and Cybernetics: Systems, doi: 10.1109/TSMC.2018.2868510, Sept. 2018.
|
[25] |
M. Abu-Khalaf and F. L. Lewis, “Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach,” Automatica, vol. 41, no. 5, pp. 779–791, May 2005. doi: 10.1016/j.automatica.2004.11.034
|
[26] |
H. Modares, F. L. Lewis, and M. Naghibi-Sistani, “Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks,” IEEE Trans. Neural Networks and Learning Systems, vol. 24, no. 10, pp. 1513–1525, Oct. 2013. doi: 10.1109/TNNLS.2013.2276571
|
[27] |
Y. H. Zhu, D. B. Zhao, H. B. He, and J. H. Ji, “Event-triggered optimal control for partially-unknown constrained-input systems via adaptive dynamic programming,” IEEE Trans. Industrial Electronics, vol. 64, no. 5, pp. 4101–4109, May 2017. doi: 10.1109/TIE.2016.2597763
|
[28] |
D. Wang, H. B. He, and D. R. Liu, “Adaptive critic nonlinear robust control: a survey,” IEEE Trans. Cybernetics, vol. 47, no. 10, pp. 3429–3451, Oct. 2017. doi: 10.1109/TCYB.2017.2712188
|
[29] |
H. G. Zhang, K. Zhang, G. Y. Xiao, and H. Jiang, “Robust optimal control scheme for unknown constrained-input nonlinear systems via a plug-n-play event-sampled critic-only algorithm,” IEEE Trans. Systems, Man, and Cybernetics: Systems, doi: 10.1109/TSMC.2018.2889377, Feb. 2019.
|
[30] |
L. L. Cui, X. P. Xie, X. W. Wang, Y. H. Luo, and J. B. Liu, “Event-triggered single-network ADP method for constrained optimal tracking control of continuous-time nonlinear systems,” Applied Mathematics and Computation, vol. 352, pp. 220–234, Jul. 2019. doi: 10.1016/j.amc.2019.01.066
|
[31] |
Y. Jiang, J. L. Fan, T. Y. Chai, and F. L. Lewis, “Dual-rate operational optimal control for flotation industrial process with unknown operational model,” IEEE Trans. Industrial Electronics, vol. 66, no. 6, pp. 4587–4599, Jun. 2019. doi: 10.1109/TIE.2018.2856198
|
[32] |
L. H. Kong, W. He, Y. T. Dong, L. Cheng, C. G. Yang, and Z. J. Li, “Asymmetric bounded neural control for an uncertain robot by state feedback and output feedback,” IEEE Trans. Systems, Man, and Cybernetics: Systems, doi: 10.1109/TSMC.2019.2901277, Apr. 2019.
|
[33] |
W. Zhou, H. C. Liu, H. B. He, J. Yi, and T. F. Li, “Neuro-optimal tracking control for continuous stirred tank reactor with input constraints,” IEEE Trans. Industrial Informatics, vol. 15, no. 8, pp. 4516–4524, Aug. 2019. doi: 10.1109/TII.2018.2884214
|
[34] |
X. Yang, D. R. Liu, D. Wang, and Q. L. Wei, “Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning,” Neural Networks, vol. 55, pp. 30–41, 2014. doi: 10.1016/j.neunet.2014.03.008
|
[35] |
Y. H. Zhu, D. B. Zhao, X. Yang, and Q. C. Zhang, “Policy iteration for H∞ optimal control of polynomial nonlinear systems via sum of squares programming,” IEEE Trans. Cybernetics, vol. 48, no. 2, pp. 500–509, Feb. 2018. doi: 10.1109/TCYB.2016.2643687
|
[36] |
W. Rudin, Principles of Mathematical Analysis, 3rd ed. New York: McGraw-Hill Publishing Co., 1976.
|
[37] |
K. Hornik, M. Stinchcombe, and H. White, “Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks,” Neural Networks, vol. 3, no. 5, pp. 551–560, 1990. doi: 10.1016/0893-6080(90)90005-6
|
[38] |
K. G. Vamvoudakis and F. L. Lewis, “Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem,” Automatica, vol. 46, no. 5, pp. 878–888, May 2010. doi: 10.1016/j.automatica.2010.02.018
|
[39] |
Z. J. Fu, W. F. Xie, S. Rakheja, and J. Na, “Observer-based adaptive optimal control for unknown singularly perturbed nonlinear systems with input constraints,” IEEE/CAA J. Autom. Sinica, vol. 4, no. 1, pp. 48–57, Jan. 2017. doi: 10.1109/JAS.2017.7510322
|
[40] |
D. S. Mitrinovic and P. M. Vasic, Analytic Inequalities. Berlin: Springer, 1970.
|
[41] |
X. Yang, D. R. Liu, H. W. Ma, and Y. C. Xu, “Online approximate solution of HJI equation for unknown constrained-input nonlinear continuous-time systems,” Information Sciences, vol. 328, pp. 435–454, Jan. 2016. doi: 10.1016/j.ins.2015.09.001
|
[42] |
D. R. Liu, X. Yang, D. Wang, and Q. L. Wei, “Reinforcement-learning-based robust controller design for continuous-time uncertain nonlinear systems subject to input constraints,” IEEE Trans. Cybernetics, vol. 45, no. 7, pp. 1372–1385, Jul. 2015. doi: 10.1109/TCYB.2015.2417170
|
[43] |
X. Yang and H. B. He, “Adaptive critic learning and experience replay for decentralized event-triggered control of nonlinear interconnected systems,” IEEE Trans. Systems, Man, and Cybernetics: Systems, doi: 10.1109/TSMC.2019.2898370, Mar. 2019.
|
[44] |
Z. Ni, N. Malla, and X. N. Zhong, “Prioritizing useful experience replay for heuristic dynamic programming-based learning systems,” IEEE Trans. Cybernetics, vol. 49, no. 11, pp. 3911–3922, Nov. 2019. doi: 10.1109/TCYB.2018.2853582
|
[45] |
L. Liu, Z. S. Wang, and H. G. Zhang, “Neural-network-based robust optimal tracking control for MIMO discrete-time systems with unknown uncertainty using adaptive critic design,” IEEE Trans. Neural Networks and Learning Systems, vol. 29, no. 4, pp. 1239–1251, Apr. 2018. doi: 10.1109/TNNLS.2017.2660070
|
[46] |
Z. S. Wang, L. Liu, Y. M. Wu, and H. G. Zhang, “Optimal fault-tolerant control for discrete-time nonlinear strict-feedback systems based on adaptive critic design,” IEEE Trans. Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2179–2191, Jun. 2018. doi: 10.1109/TNNLS.2018.2810138
|