A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 5 Issue 1
Jan.  2018

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
    CiteScore: 23.5, Top 2% (Q1)
    Google Scholar h5-index: 77, TOP 5
Majid Mazouchi, Mohammad Bagher Naghibi-Sistani and Seyed Kamal Hosseini Sani, "A Novel Distributed Optimal Adaptive Control Algorithm for Nonlinear Multi-Agent Differential Graphical Games," IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 331-341, Jan. 2018. doi: 10.1109/JAS.2017.7510784

A Novel Distributed Optimal Adaptive Control Algorithm for Nonlinear Multi-Agent Differential Graphical Games

doi: 10.1109/JAS.2017.7510784
  • This paper proposes an online optimal distributed learning algorithm to solve the leader-synchronization problem of nonlinear multi-agent differential graphical games. Each player approximates its optimal control policy using single-network approximate dynamic programming (ADP), in which only one critic neural network (NN) is employed instead of the typical actor-critic structure composed of two NNs. The proposed distributed weight tuning laws for the critic NNs guarantee stability in the sense of uniform ultimate boundedness (UUB) and convergence of the control policies to the Nash equilibrium. By introducing novel distributed local operators in the weight tuning laws, the algorithm no longer requires initial stabilizing control policies. Furthermore, overall closed-loop system stability is guaranteed by Lyapunov stability analysis. Finally, simulation results show the effectiveness of the proposed algorithm.
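
The abstract gives no formulas. As a rough illustration of what leader synchronization means on a communication graph, the sketch below computes a local neighborhood tracking error of the form delta_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0), a formulation commonly used in the graphical-games literature. This form, the scalar states, the function name `local_tracking_errors`, and the 3-agent example graph are all assumptions for illustration and are not taken from this page; the paper's own error dynamics and critic tuning laws are not reproduced here.

```python
from typing import List, Sequence


def local_tracking_errors(
    adjacency: Sequence[Sequence[float]],
    pinning: Sequence[float],
    states: Sequence[float],
    leader_state: float,
) -> List[float]:
    """Return delta_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0) for scalar states.

    adjacency[i][j] > 0 means agent i receives information from agent j;
    pinning[i] > 0 means agent i can observe the leader directly.
    (Assumed, illustrative formulation; not the paper's stated equations.)
    """
    n = len(states)
    errors = []
    for i in range(n):
        # Disagreement with each neighbor, weighted by the edge weight.
        neighbor_term = sum(
            adjacency[i][j] * (states[i] - states[j]) for j in range(n)
        )
        # Disagreement with the leader, nonzero only for pinned agents.
        leader_term = pinning[i] * (states[i] - leader_state)
        errors.append(neighbor_term + leader_term)
    return errors


# Hypothetical 3-agent chain: agent 0 sees the leader, 1 sees 0, 2 sees 1.
adjacency = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
pinning = [1, 0, 0]

errors = local_tracking_errors(adjacency, pinning, [2.0, 1.0, 4.0], leader_state=0.0)
synced = local_tracking_errors(adjacency, pinning, [3.0, 3.0, 3.0], leader_state=3.0)
```

Each agent uses only its own state, its neighbors' states, and (if pinned) the leader's state, which is the sense in which such a scheme is distributed. When every agent matches the leader, all local errors vanish, as the `synced` case shows.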

     


    Figures(6)  / Tables(1)
