Citation: | Y. Zhang, Y. Wang, and Y. Cai, “Value iteration-based distributed adaptive dynamic programming for multi-player differential game with incomplete information,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 2, pp. 1–12, Feb. 2025. |
[1] |
T. Başar and G. Zaccour, Handbook of Dynamic Game Theory. Springer, Aug. 2018.
|
[2] |
P. An, M. Liu, Y. Wan, and F. L. Lewis, “Multi-player H∞ differential game using on-policy and off-policy reinforcement learning,” in Proc. 16th IEEE Int. Conf. Control and Automation, pp. 1137–1142, Oct. 2020.
|
[3] |
E. Garcia, D. W. Casbeer, A. Von Moll, and M. Pachter, “Multiple pursuer multiple evader differential games,” IEEE Trans. Automatic Control, vol. 66, pp. 2345–2350, May 2021. doi: 10.1109/TAC.2020.3003840
|
[4] |
Z. Zhou and H. Xu, “Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning,” Neurocomputing, vol. 484, pp. 46–58, May 2022. doi: 10.1016/j.neucom.2021.01.141
|
[5] |
J. Sun and Z. Ming, “Cooperative differential game-based distributed optimal synchronization control of heterogeneous nonlinear multiagent systems,” IEEE Trans. Cybernetics, vol. 53, pp. 7933–7942, Dec. 2023. doi: 10.1109/TCYB.2023.3240983
|
[6] |
D. Wang, N. Gao, D. Liu, J. Li, and F. L. Lewis, “Recent progress in reinforcement learning and adaptive dynamic programming for advanced control applications,” IEEE/CAA J. Autom. Sinica, vol. 11, pp. 18–36, Jan. 2024. doi: 10.1109/JAS.2023.123843
|
[7] |
J. Zhao, “Data-driven adaptive dynamic programming for optimal control of continuous-time multicontroller systems with unknown dynamics,” IEEE Access, vol. 10, pp. 41503–41511, 2022. doi: 10.1109/ACCESS.2022.3168032
|
[8] |
X. Li, L. Wang, Y. An, Q.-L. Huang, Y.-H. Cui, and H.-S. Hu, “Dynamic path planning of mobile robots using adaptive dynamic programming,” Expert Systems With Applications, vol. 235, p. 121112, Jan. 2024. doi: 10.1016/j.eswa.2023.121112
|
[9] |
Y. Zhu, D. Zhao, X. Li, and D. Wang, “Control-limited adaptive dynamic programming for multi-battery energy storage systems,” IEEE Trans. Smart Grid, vol. 10, pp. 4235–4244, Jul. 2019. doi: 10.1109/TSG.2018.2854300
|
[10] |
T. Lyu, H. Xu, L. Zhang, and Z. Han, “Source selection and resource allocation in wireless-powered relay networks: An adaptive dynamic programming-based approach,” IEEE Internet of Things J., vol. 11, pp. 8973–8988, Mar. 2024. doi: 10.1109/JIOT.2023.3321673
|
[11] |
Z. Lin, J. Ma, J. Duan, S. E. Li, H. Ma, B. Cheng, and T. H. Lee, “Policy iteration based approximate dynamic programming toward autonomous driving in constrained dynamic environment,” IEEE Trans. Intelligent Transportation Systems, vol. 24, pp. 5003–5013, May 2023. doi: 10.1109/TITS.2023.3237568
|
[12] |
T. Liu, L. Cui, B. Pang, and Z.-P. Jiang, “A unified framework for data-driven optimal control of connected vehicles in mixed traffic,” IEEE Trans. Intelligent Vehicles, vol. 8, pp. 4131–4145, Aug. 2023. doi: 10.1109/TIV.2023.3287131
|
[13] |
R. Song, Q. Wei, H. Zhang, and F. L. Lewis, “Discrete-time non-zero-sum games with completely unknown dynamics,” IEEE Trans. Cybernetics, vol. 51, pp. 2929–2943, Jun. 2021. doi: 10.1109/TCYB.2019.2957406
|
[14] |
J. Li, Z. Xiao, J. Fan, T. Chai, and F. L. Lewis, “Off-policy Q-learning: Solving Nash equilibrium of multi-player games with network-induced delay and unmeasured state,” Automatica, vol. 136, p. 110076, Feb. 2022. doi: 10.1016/j.automatica.2021.110076
|
[15] |
H. Jiang, B. Zhou, and G.-R. Duan, “Modified λ-policy iteration based adaptive dynamic programming for unknown discrete-time linear systems,” IEEE Trans. Neural Networks and Learning Systems, vol. 35, pp. 3291–3301, Mar. 2024. doi: 10.1109/TNNLS.2023.3244934
|
[16] |
F. F. M. El-Sousy, M. M. Amin, and A. Al-Durra, “Adaptive optimal tracking control via actor-critic-identifier based adaptive dynamic programming for permanent-magnet synchronous motor drive system,” IEEE Trans. Industry Applications, vol. 57, pp. 6577–6591, Nov. 2021. doi: 10.1109/TIA.2021.3110936
|
[17] |
J. Na, Y. Lv, K. Zhang, and J. Zhao, “Adaptive identifier-critic-based optimal tracking control for nonlinear systems with experimental validation,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 52, pp. 459–472, Jan. 2022. doi: 10.1109/TSMC.2020.3003224
|
[18] |
H. Li, D. Liu, and D. Wang, “Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics,” IEEE Trans. Autom. Science and Engineering, vol. 11, pp. 706–714, Jul. 2014. doi: 10.1109/TASE.2014.2300532
|
[19] |
C. Chen, H. Modares, K. Xie, F. L. Lewis, Y. Wan, and S. Xie, “Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics,” IEEE Trans. Autom. Control, vol. 64, pp. 4423–4438, Nov. 2019. doi: 10.1109/TAC.2019.2905215
|
[20] |
J. Sun and C. Liu, “Distributed zero-sum differential game for multi-agent nonlinear systems via adaptive dynamic programming,” in Proc. 37th Chinese Control Conf., pp. 2770–2775, 2018.
|
[21] |
J. Sun and T. Long, “Event-triggered distributed zero-sum differential game for nonlinear multi-agent systems using adaptive dynamic programming,” ISA Trans., vol. 110, pp. 39–52, 2021. doi: 10.1016/j.isatra.2020.10.043
|
[22] |
K. A. Cavalieri, N. Satak, and J. E. Hurtado, “Incomplete information pursuit-evasion games with uncertain relative dynamics,” in Proc. AIAA Guidance, Navigation, and Control Conf. (National Harbor, Maryland), American Institute of Aeronautics and Astronautics, Jan. 2014.
|
[23] |
D. Cappello and T. Mylvaganam, “Distributed control of multi-agent systems via linear quadratic differential games with partial information,” in Proc. IEEE Conf. Decision and Control, pp. 4565–4570, Dec. 2018.
|
[24] |
F. Koepf, S. Ebbert, M. Flad, and S. Hohmann, “Adaptive dynamic programming for cooperative control with incomplete information,” in Proc. IEEE Int. Conf. on Systems, Man, and Cybernetics, pp. 2632–2638, 2018.
|
[25] |
Y. Zhang, L. Zhang, and Y. Cai, “Value iteration-based cooperative adaptive optimal control for multi-player differential games with incomplete information,” IEEE/CAA J. Autom. Sinica, vol. 11, pp. 690–697, Mar. 2024. doi: 10.1109/JAS.2023.124125
|
[26] |
M. Abu-Khalaf and F. L. Lewis, “Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach,” Automatica, vol. 41, pp. 779–791, May 2005. doi: 10.1016/j.automatica.2004.11.034
|
[27] |
K. G. Vamvoudakis, F. L. Lewis, and G. R. Hudas, “Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality,” Automatica, vol. 48, pp. 1598–1611, Aug. 2012. doi: 10.1016/j.automatica.2012.05.074
|
[28] |
T. Bian and Z.-P. Jiang, “Reinforcement learning and adaptive optimal control for continuous-time nonlinear systems: A value iteration approach,” IEEE Trans. Neural Networks and Learning Systems, vol. 33, pp. 2781–2790, Jul. 2022. doi: 10.1109/TNNLS.2020.3045087
|
[29] |
H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2002.
|
[30] |
P. G. Ciarlet, Linear and Nonlinear Functional Analysis With Applications. Philadelphia: Society for Industrial and Applied Mathematics, 2013.
|
[31] |
Y. Zhang, B. Zhao, D. Liu, and S. Zhang, “Adaptive dynamic programming-based event-triggered robust control for multiplayer nonzero-sum games with unknown dynamics,” IEEE Trans. Cybernetics, vol. 53, pp. 5151–5164, Aug. 2023. doi: 10.1109/TCYB.2022.3175650
|