IEEE/CAA Journal of Automatica Sinica
Citation: N. Chen, L. Li, and W. Mao, “Equilibrium strategy of the pursuit-evasion game in three-dimensional space,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 446–458, Feb. 2024. doi: 10.1109/JAS.2023.123996
The pursuit-evasion game models the strategic interaction among players and has attracted attention in many realistic scenarios, such as missile guidance, unmanned aerial vehicles, and target defense. Existing studies mainly concentrate on the cooperative pursuit of multiple players in two-dimensional pursuit-evasion games. However, these approaches can hardly be applied to practical situations where players usually move in three-dimensional space with a three-degree-of-freedom control. In this paper, we make the first attempt to investigate the equilibrium strategy of the realistic pursuit-evasion game, in which the pursuer follows a three-degree-of-freedom control and the evader moves freely. First, we describe the pursuer’s three-degree-of-freedom control and the evader’s relative coordinate. We then rigorously derive the equilibrium strategy by solving the retrogressive path equation according to the Hamilton-Jacobi-Bellman-Isaacs (HJBI) method, which divides the pursuit-evasion process into the navigation and acceleration phases. In addition, we analyze the maximum allowable speed for the pursuer to capture the evader successfully and provide the strategy with which the evader can escape when the pursuer’s speed exceeds the threshold. We further conduct comparison tests with various unilateral deviations to verify that the proposed strategy forms a Nash equilibrium.