A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 1 Issue 4
Oct. 2014

IEEE/CAA Journal of Automatica Sinica

Citation: Jing Na and Guido Herrmann, "Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 4, pp. 412-422, 2014.

Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems

Funds:

This work was supported by the National Natural Science Foundation of China (61203066).

  • This paper proposes an online adaptive approximate solution to the infinite-horizon optimal tracking control problem for continuous-time nonlinear systems with unknown dynamics. The need for complete knowledge of the system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at steady state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is used to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, and this approximation is used to construct the optimal controller. The two NNs, i.e., the identifier NN and the critic NN, are trained continuously and simultaneously by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system, consisting of the identifier NN, the critic NN and the optimal tracking control, is guaranteed using Lyapunov theory, and convergence to a near-optimal control law is proved. Simulation results illustrate the effectiveness of the proposed method.
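
To make the role of the critic concrete, the sketch below works through a scalar linear-quadratic special case in Python. It is an illustration only, and everything in it is an assumption not taken from the paper: the dynamics `x' = a*x + b*u` are taken as known, the cost integrand is `q*x**2 + r*u**2`, the critic has the single basis function `x**2`, and the weight is tuned by plain gradient descent on the squared HJB residual. This is not the adaptive law based on the parameter estimation error that the abstract describes. In this special case the HJB equation reduces to a scalar Riccati equation, so the learned weight can be compared against a closed-form value.

```python
import math


def riccati_weight(a, b, q, r):
    """Positive root p of q + 2*a*p - (b**2 / r) * p**2 = 0, so V*(x) = p * x**2."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def hjb_residual(w, x, a, b, q, r):
    """HJB residual for the critic V(x) = w * x**2 (hypothetical one-term critic)."""
    u = -(b * w / r) * x   # control that minimizes the Hamiltonian for this critic
    dV = 2.0 * w * x       # gradient of the critic output with respect to x
    return q * x * x + r * u * u + dV * (a * x + b * u)


def train_critic(a, b, q, r, w0=0.0, lr=0.05, steps=200, x=1.0):
    """Gradient descent on 0.5 * residual**2 at one fixed sample state x (assumed)."""
    w = w0
    for _ in range(steps):
        res = hjb_residual(w, x, a, b, q, r)
        # Analytic derivative of the residual with respect to the critic weight.
        dres_dw = (2.0 * a - 2.0 * b * b * w / r) * x * x
        w -= lr * res * dres_dw
    return w


if __name__ == "__main__":
    a, b, q, r = -1.0, 1.0, 1.0, 1.0   # illustrative values, not from the paper
    print("closed-form weight:", riccati_weight(a, b, q, r))
    print("learned weight:    ", train_critic(a, b, q, r))
```

With the illustrative values `a = -1` and `b = q = r = 1`, the closed-form weight is `sqrt(2) - 1`, and the resulting feedback `u = -(b * w / r) * x` keeps the closed loop stable. A nonlinear, unknown system of the kind treated in the paper would need a richer critic basis and an identifier in place of the known `a` and `b`.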

     

