A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 8, Issue 1
Jan. 2021

IEEE/CAA Journal of Automatica Sinica

Xiaofeng Li, Lu Dong and Changyin Sun, "Data-Based Optimal Tracking of Autonomous Nonlinear Switching Systems," IEEE/CAA J. Autom. Sinica, vol. 8, no. 1, pp. 227-238, Jan. 2021. doi: 10.1109/JAS.2020.1003486

Data-Based Optimal Tracking of Autonomous Nonlinear Switching Systems

doi: 10.1109/JAS.2020.1003486
Funds: This work was supported by the National Natural Science Foundation of China (61921004, U1713209, 61803085, and 62041301).
Abstract: In this paper, a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems. The system state is forced to track the reference signal by minimizing a performance function. First, the problem is transformed into solving the corresponding Bellman optimality equation in terms of the Q-function (also called the action-value function). Then, an iterative algorithm based on adaptive dynamic programming (ADP) is developed to find the optimal solution using only sampled data. A linear-in-parameter (LIP) neural network is used as the value function approximator. Taking into account the approximation error at each iteration step, the generated sequence of approximate value functions is proved to remain bounded around the exact optimal solution under some verifiable assumptions. Moreover, the effect of terminating the learning process after a finite number of iterations is investigated. A sufficient condition for asymptotic stability of the tracking error is derived. Finally, the effectiveness of the algorithm is demonstrated with three simulation examples.
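The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar subsystems x_{k+1} = a_v x_k, the constant reference, the quadratic utility (x - r)^2, the discount factor GAMMA, the feature basis [x^2, x r, r^2], and all function names are assumptions chosen for illustration. The sketch applies the recursion Q_{i+1}(x, r, v) = U(x, r) + GAMMA * min_{v'} Q_i(x', r', v'), starting from Q_0 = 0, and fits one weight vector per mode by least squares on sampled transitions only.

```python
import random

GAMMA = 0.9          # discount factor (assumption, not taken from the paper)
MODES = (0.9, 1.1)   # hypothetical subsystems: x_{k+1} = a_v * x_k

def step(x, v):
    """Autonomous subsystem v; used only to generate sample data."""
    return MODES[v] * x

def ref_step(r):
    """Reference generator; a constant reference is assumed here."""
    return r

def utility(x, r):
    """Quadratic tracking cost (assumed form)."""
    return (x - r) ** 2

def features(x, r):
    """Basis of the linear-in-parameter approximator: Q(x, r, v) ~ w_v . phi(x, r)."""
    return [x * x, x * r, r * r]

def q_value(w, x, r):
    return sum(wi * pi for wi, pi in zip(w, features(x, r)))

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def least_squares(phis, targets):
    """Fit weights via the normal equations (Phi^T Phi) w = Phi^T y."""
    n = len(phis[0])
    a = [[sum(p[i] * p[j] for p in phis) for j in range(n)] for i in range(n)]
    b = [sum(p[i] * y for p, y in zip(phis, targets)) for i in range(n)]
    return solve(a, b)

def collect_samples(n, seed=0):
    """Random transitions (x, r, v, x_next, r_next); the learner sees only these."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x, r = rng.uniform(-2, 2), rng.uniform(-2, 2)
        v = rng.randrange(len(MODES))
        out.append((x, r, v, step(x, v), ref_step(r)))
    return out

def fitted_q_iteration(samples, n_iter):
    """Run n_iter value-iteration sweeps on the Q-function, starting from Q_0 = 0."""
    weights = [[0.0, 0.0, 0.0] for _ in MODES]
    for _ in range(n_iter):
        new_weights = []
        for v in range(len(MODES)):
            phis, targets = [], []
            for x, r, mode, x_next, r_next in samples:
                if mode != v:
                    continue
                best_next = min(q_value(w, x_next, r_next) for w in weights)
                phis.append(features(x, r))
                targets.append(utility(x, r) + GAMMA * best_next)
            new_weights.append(least_squares(phis, targets))
        weights = new_weights
    return weights

def greedy_mode(weights, x, r):
    """Switching law: pick the mode with the smallest approximate Q-value."""
    return min(range(len(MODES)), key=lambda v: q_value(weights[v], x, r))
```

A possible usage is `w = fitted_q_iteration(collect_samples(400), n_iter=10)` followed by `greedy_mode(w, x, r)` at each time step. In this toy setting the first two iterates are exactly quadratic, so the chosen basis represents them without error; from the third iterate on, the minimum over modes is in general no longer a single quadratic form, so the least-squares fit introduces the kind of per-iteration approximation error whose effect the abstract says is analyzed.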

     



    Highlights

    • Develops a data-based method for optimal tracking of autonomous switching systems.
    • Considers the effects of approximation error and of a finite number of iterations.
    • Provides theoretical analysis of continuity, convergence, and stability.
