Citation: | Y. Zhang, Y. Wang, and Y. Cai, “Value iteration-based distributed adaptive dynamic programming for multi-player differential game with incomplete information,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 2, pp. 1–12, Feb. 2025. |
[1] |
T. Başar and G. Zaccour, Handbook of Dynamic Game Theory. Springer, Aug. 2018.
|
[2] |
P. An, M. Liu, Y. Wan, and F. L. Lewis, “Multi-player H∞ differential game using on-policy and off-policy reinforcement learning,” in Proc. 16th IEEE Int. Conf. Control and Automation, pp. 1137–1142, Oct. 2020.
|
[3] |
E. Garcia, D. W. Casbeer, A. Von Moll, and M. Pachter, “Multiple pursuer multiple evader differential games,” IEEE Trans. Autom. Control, vol. 66, no. 5, pp. 2345–2350, May 2021. doi: 10.1109/TAC.2020.3003840
|
[4] |
Z. Zhou and H. Xu, “Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning,” Neurocomputing, vol. 484, pp. 46–58, May 2022. doi: 10.1016/j.neucom.2021.01.141
|
[5] |
J. Sun and Z. Ming, “Cooperative differential game-based distributed optimal synchronization control of heterogeneous nonlinear multiagent systems,” IEEE Trans. Cybernetics, vol. 53, no. 12, pp. 7933–7942, Dec. 2023. doi: 10.1109/TCYB.2023.3240983
|
[6] |
D. Wang, N. Gao, D. Liu, J. Li, and F. L. Lewis, “Recent progress in reinforcement learning and adaptive dynamic programming for advanced control applications,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 18–36, Jan. 2024. doi: 10.1109/JAS.2023.123843
|
[7] |
J. Zhao, “Data-driven adaptive dynamic programming for optimal control of continuous-time multicontroller systems with unknown dynamics,” IEEE Access, vol. 10, pp. 41503–41511, 2022. doi: 10.1109/ACCESS.2022.3168032
|
[8] |
X. Li, L. Wang, Y. An, Q.-L. Huang, Y.-H. Cui, and H.-S. Hu, “Dynamic path planning of mobile robots using adaptive dynamic programming,” Expert Systems With Applications, vol. 235, p. 121112, Jan. 2024. doi: 10.1016/j.eswa.2023.121112
|
[9] |
Y. Zhu, D. Zhao, X. Li, and D. Wang, “Control-limited adaptive dynamic programming for multi-battery energy storage systems,” IEEE Trans. Smart Grid, vol. 10, no. 4, pp. 4235–4244, Jul. 2019. doi: 10.1109/TSG.2018.2854300
|
[10] |
T. Lyu, H. Xu, L. Zhang, and Z. Han, “Source selection and resource allocation in wireless-powered relay networks: An adaptive dynamic programming-based approach,” IEEE Internet of Things J., vol. 11, pp. 8973–8988, Mar. 2024. doi: 10.1109/JIOT.2023.3321673
|
[11] |
Z. Lin, J. Ma, J. Duan, S. E. Li, H. Ma, B. Cheng, and T. H. Lee, “Policy iteration based approximate dynamic programming toward autonomous driving in constrained dynamic environment,” IEEE Trans. Intelligent Transportation Systems, vol. 24, no. 5, pp. 5003–5013, May 2023. doi: 10.1109/TITS.2023.3237568
|
[12] |
T. Liu, L. Cui, B. Pang, and Z.-P. Jiang, “A unified framework for data-driven optimal control of connected vehicles in mixed traffic,” IEEE Trans. Intelligent Vehicles, vol. 8, no. 8, pp. 4131–4145, Aug. 2023. doi: 10.1109/TIV.2023.3287131
|
[13] |
R. Song, Q. Wei, H. Zhang, and F. L. Lewis, “Discrete-time non-zero-sum games with completely unknown dynamics,” IEEE Trans. Cybernetics, vol. 51, no. 6, pp. 2929–2943, Jun. 2021. doi: 10.1109/TCYB.2019.2957406
|
[14] |
J. Li, Z. Xiao, J. Fan, T. Chai, and F. L. Lewis, “Off-policy Q-learning: Solving Nash equilibrium of multi-player games with network-induced delay and unmeasured state,” Automatica, vol. 136, p. 110076, Feb. 2022. doi: 10.1016/j.automatica.2021.110076
|
[15] |
H. Jiang, B. Zhou, and G.-R. Duan, “Modified λ-policy iteration based adaptive dynamic programming for unknown discrete-time linear systems,” IEEE Trans. Neural Networks and Learning Systems, vol. 35, no. 3, pp. 3291–3301, Mar. 2024. doi: 10.1109/TNNLS.2023.3244934
|
[16] |
F. F. M. El-Sousy, M. M. Amin, and A. Al-Durra, “Adaptive optimal tracking control via actor-critic-identifier based adaptive dynamic programming for permanent-magnet synchronous motor drive system,” IEEE Trans. Industry Applications, vol. 57, no. 6, pp. 6577–6591, Nov. 2021. doi: 10.1109/TIA.2021.3110936
|
[17] |
J. Na, Y. Lv, K. Zhang, and J. Zhao, “Adaptive identifier-critic-based optimal tracking control for nonlinear systems with experimental validation,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 52, no. 1, pp. 459–472, Jan. 2022. doi: 10.1109/TSMC.2020.3003224
|
[18] |
H. Li, D. Liu, and D. Wang, “Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics,” IEEE Trans. Autom. Science and Engineering, vol. 11, no. 3, pp. 706–714, Jul. 2014. doi: 10.1109/TASE.2014.2300532
|
[19] |
C. Chen, H. Modares, K. Xie, F. L. Lewis, Y. Wan, and S. Xie, “Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics,” IEEE Trans. Autom. Control, vol. 64, no. 11, pp. 4423–4438, Nov. 2019. doi: 10.1109/TAC.2019.2905215
|
[20] |
J. Sun and C. Liu, “Distributed zero-sum differential game for multi-agent nonlinear systems via adaptive dynamic programming,” in Proc. 37th Chinese Control Conf., pp. 2770–2775, 2018.
|
[21] |
J. Sun and T. Long, “Event-triggered distributed zero-sum differential game for nonlinear multi-agent systems using adaptive dynamic programming,” ISA Trans., vol. 110, pp. 39–52, 2021. doi: 10.1016/j.isatra.2020.10.043
|
[22] |
K. A. Cavalieri, N. Satak, and J. E. Hurtado, “Incomplete information pursuit-evasion games with uncertain relative dynamics,” in Proc. AIAA Guidance, Navigation, and Control Conf. National Harbor, Maryland: American Institute of Aeronautics and Astronautics, Jan. 2014.
|
[23] |
D. Cappello and T. Mylvaganam, “Distributed control of multi-agent systems via linear quadratic differential games with partial information,” in Proc. IEEE Conf. Decision and Control, pp. 4565–4570, Dec. 2018.
|
[24] |
F. Koepf, S. Ebbert, M. Flad, and S. Hohmann, “Adaptive dynamic programming for cooperative control with incomplete information,” in Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, pp. 2632–2638, 2018.
|
[25] |
Y. Zhang, L. Zhang, and Y. Cai, “Value iteration-based cooperative adaptive optimal control for multi-player differential games with incomplete information,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 3, pp. 690–697, Mar. 2024. doi: 10.1109/JAS.2023.124125
|
[26] |
M. Abu-Khalaf and F. L. Lewis, “Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach,” Automatica, vol. 41, no. 5, pp. 779–791, May 2005. doi: 10.1016/j.automatica.2004.11.034
|
[27] |
K. G. Vamvoudakis, F. L. Lewis, and G. R. Hudas, “Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality,” Automatica, vol. 48, no. 8, pp. 1598–1611, Aug. 2012. doi: 10.1016/j.automatica.2012.05.074
|
[28] |
T. Bian and Z.-P. Jiang, “Reinforcement learning and adaptive optimal control for continuous-time nonlinear systems: A value iteration approach,” IEEE Trans. Neural Networks and Learning Systems, vol. 33, no. 7, pp. 2781–2790, Jul. 2022. doi: 10.1109/TNNLS.2020.3045087
|
[29] |
H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2002.
|
[30] |
P. G. Ciarlet, Linear and Nonlinear Functional Analysis With Applications: With 401 Problems and 52 Figures. Philadelphia: Society for Industrial and Applied Mathematics, 2013.
|
[31] |
Y. Zhang, B. Zhao, D. Liu, and S. Zhang, “Adaptive dynamic programming-based event-triggered robust control for multiplayer nonzero-sum games with unknown dynamics,” IEEE Trans. Cybernetics, vol. 53, no. 8, pp. 5151–5164, Aug. 2023. doi: 10.1109/TCYB.2022.3175650
|