Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints

Xiong Yang; Bo Zhao

doi:10.1109/JAS.2020.1003063

Volume 7 Issue 2

Mar. 2020

IEEE/CAA Journal of Automatica Sinica



Xiong Yang and Bo Zhao, "Optimal Neuro-Control Strategy for Nonlinear Systems With Asymmetric Input Constraints," IEEE/CAA J. Autom. Sinica, vol. 7, no. 2, pp. 575-583, Mar. 2020. doi: 10.1109/JAS.2020.1003063


Funds: This work was supported by the National Natural Science Foundation of China (61973228, 61973330)


Author Bio:
Xiong Yang (M’19) received the B.S. degree in mathematics and applied mathematics from Central China Normal University, in 2008, the M.S. degree in pure mathematics from Shandong University, in 2011, and the Ph.D. degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, in 2014. From 2014 to 2016, he was an Assistant Professor with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. From 2016 to 2018, he was a Post-Doctoral Fellow with the Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA. He is currently an Associate Professor with the School of Electrical and Information Engineering, Tianjin University. His research interests include intelligent control, reinforcement learning, deep neural networks, event-triggered control, and their applications. Dr. Yang was a recipient of the Excellent Award of Presidential Scholarship of the Chinese Academy of Sciences in 2014 and the Outstanding Paper Award of IEEE Transactions on Neural Networks and Learning Systems in 2018.

Bo Zhao (M’16) received the B.S. degree in automation and the Ph.D. degree in control science and engineering, both from Jilin University, in 2009 and 2014, respectively. From 2014 to 2017, he was a Post-Doctoral Fellow with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. From 2017 to 2018, he was with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. He is currently an Associate Professor with the School of Systems Science, Beijing Normal University. He has authored or coauthored over 80 journal and conference papers. His research interests include adaptive dynamic programming, robot control, fault diagnosis and fault-tolerant control, optimal control, and artificial intelligence-based control.
Corresponding author: X. Yang is with the School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China (e-mail: xiong.yang@tju.edu.cn).
Received Date: 2019-08-14
Revised Date: 2019-11-27
Accepted Date: 2020-01-21

Available Online: 2020-02-12

Abstract


In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. First, we introduce a discounted cost function for the CT nonlinear systems in order to handle the asymmetric input constraints. Then, we develop the Hamilton-Jacobi-Bellman equation (HJBE) that arises in the discounted-cost optimal control problem. To obtain the optimal neurocontroller, we use a critic neural network (CNN) to solve the HJBE under the framework of reinforcement learning. The CNN’s weight vector is tuned via the gradient descent approach. Based on the Lyapunov method, we prove that the CNN’s weight vector and the closed-loop system are uniformly ultimately bounded. Finally, we verify the effectiveness of the present optimal neuro-control strategy through simulations of two examples.
Keywords:
- Adaptive critic designs (ACDs)
- asymmetric input constraint
- critic neural network (CNN)
- nonlinear systems
- optimal control
- reinforcement learning (RL)
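
The abstract names a discounted cost function and an HJBE but gives no formulas. The block below is a sketch of one formulation that is common in the constrained-input adaptive dynamic programming literature, written for a scalar input to keep it short. All symbols (f, g, Q, R, alpha, beta, a, W) are assumptions introduced here for illustration, and the expressions in the paper itself may differ.

```latex
% Assumed notation (not given in the abstract): state x, scalar input u,
% bounds u_min <= u <= u_max with u_min != -u_max, discount rate alpha > 0.
\begin{align}
\dot{x} &= f(x) + g(x)\,u, \qquad u_{\min} \le u \le u_{\max}, \\
\beta &= \tfrac{1}{2}\,(u_{\max} - u_{\min}), \qquad
a = \tfrac{1}{2}\,(u_{\max} + u_{\min}), \\
V(x(t)) &= \int_{t}^{\infty} e^{-\alpha (s - t)}
  \bigl( x(s)^{\top} Q\, x(s) + W(u(s)) \bigr)\, ds, \\
W(u) &= 2 \beta R \int_{a}^{u} \tanh^{-1}\!\bigl( (v - a)/\beta \bigr)\, dv, \\
0 &= \min_{u} \Bigl[ x^{\top} Q x + W(u) - \alpha V^{*}(x)
  + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr], \\
u^{*}(x) &= a - \beta \tanh\!\Bigl( \tfrac{1}{2 \beta R}\,
  g(x)^{\top} \nabla V^{*}(x) \Bigr).
\end{align}
```

In this sketch the last line follows from setting the derivative of the bracketed term with respect to u to zero, and the tanh keeps the control inside the bounds by construction. When the bounds are asymmetric, the midpoint a is nonzero, so the control does not vanish when the gradient of the value function does. The integrand then need not decay to zero, which is one plausible reason a discount factor is used to keep the cost finite.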




Highlights

An optimal neuro-control scheme is proposed for nonlinear systems with asymmetric input constraints.
A discounted cost function is introduced to handle the asymmetric input constraints.
Only a critic neural network is used to implement the optimal neuro-control scheme.
Uniform ultimate boundedness of all signals in the closed-loop system is proved.
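
To make the critic-only idea concrete, here is a minimal Python sketch of how a single critic network can both generate a bounded control and be tuned by gradient descent on the discounted Bellman residual. It is not the authors' implementation: the scalar dynamics `f` and `g`, the polynomial basis `phi`, the bounds, the cost weights, and the normalized update rule are all assumptions chosen for illustration.

```python
import numpy as np

# All numbers, dynamics, and basis functions below are illustrative
# assumptions; none of them are taken from the paper.
U_MIN, U_MAX = -0.5, 1.5              # asymmetric input bounds
BETA = (U_MAX - U_MIN) / 2.0          # half-width of the admissible interval
CENTER = (U_MAX + U_MIN) / 2.0        # midpoint, nonzero for asymmetric bounds
ALPHA, Q, R = 0.5, 1.0, 1.0           # discount rate and cost weights


def f(x):
    """Hypothetical drift dynamics of a scalar system."""
    return -x


def g(x):
    """Hypothetical input gain of the scalar system."""
    return 1.0


def phi(x):
    """Critic basis; the value function is approximated as w @ phi(x)."""
    return np.array([x, x ** 2])


def dphi(x):
    """Derivative of the critic basis with respect to x."""
    return np.array([1.0, 2.0 * x])


def control(w, x):
    """Bounded control derived from the critic; always in [U_MIN, U_MAX]."""
    grad_v = dphi(x) @ w
    return CENTER - BETA * np.tanh(g(x) * grad_v / (2.0 * BETA * R))


def input_cost(u):
    """Closed form of 2*BETA*R * integral of atanh((v - CENTER)/BETA) from CENTER to u."""
    z = np.clip((u - CENTER) / BETA, -1.0 + 1e-12, 1.0 - 1e-12)
    v = BETA * z
    return 2.0 * BETA * R * v * np.arctanh(z) + BETA ** 2 * R * np.log(1.0 - z ** 2)


def residual(w, x, u):
    """Discounted Bellman (HJB) residual for critic weights w and control u."""
    x_dot = f(x) + g(x) * u
    return Q * x ** 2 + input_cost(u) - ALPHA * (phi(x) @ w) + (dphi(x) @ w) * x_dot


def critic_step(w, x, lr=0.5):
    """One normalized gradient-descent step on 0.5 * residual**2, holding u fixed."""
    u = control(w, x)
    sigma = dphi(x) * (f(x) + g(x) * u) - ALPHA * phi(x)  # d(residual)/dw
    e = residual(w, x, u)
    return w - lr * e * sigma / (1.0 + sigma @ sigma) ** 2


w0 = np.array([0.0, 1.0])
x0 = 0.8
u0 = control(w0, x0)
w1 = critic_step(w0, x0)
print("control:", u0)
print("residual before and after one step:", residual(w0, x0, u0), residual(w1, x0, u0))
```

With the control held fixed, the residual is linear in the weights, so a single normalized step with a modest learning rate shrinks its magnitude at the sampled state. A full scheme would repeat this update along the closed-loop trajectory, and the boundedness proof mentioned above concerns that online process rather than this one-step illustration.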
