IEEE/CAA Journal of Automatica Sinica
Citation: | Rushikesh Kamalapurkar, Justin R. Klotz and Warren E. Dixon, "Concurrent Learning-based Approximate Feedback-Nash Equilibrium Solution of N-player Nonzero-sum Differential Games," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 3, pp. 239-247, 2014. |
[1] |
Kirk D E. Optimal Control Theory: An Introduction. New York: Dover Publications, 2004.
|
[2] |
Lewis F L, Vrabie D, Syrmos V L. Optimal Control (Third edition). New York: John Wiley & Sons, 2012.
|
[3] |
Isaacs R. Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York: Dover Publications, 1999.
|
[4] |
Tijs S. Introduction to Game Theory. Hindustan Book Agency, 2003.
|
[5] |
Basar T, Olsder G. Dynamic Noncooperative Game Theory (Second edition). Philadelphia, PA: SIAM, 1999.
|
[6] |
Nash J. Non-cooperative games. The Annals of Mathematics, 1951, 54(2): 286-295
|
[7] |
Case J H. Toward a theory of many player differential games. SIAM Journal on Control, 1969, 7(2): 179-197
|
[8] |
Starr A W, Ho C Y. Nonzero-sum differential games. Journal of Optimization Theory and Applications, 1969, 3(3): 184-206
|
[9] |
Starr A, Ho C Y. Further properties of nonzero-sum differential games. Journal of Optimization Theory and Applications, 1969, 3(4): 207-219
|
[10] |
Friedman A. Differential Games. New York: John Wiley and Sons, 1971.
|
[11] |
Littman M. Value-function reinforcement learning in Markov games. Cognitive Systems Research, 2001, 2(1): 55-66
|
[12] |
Wei Q L, Zhang H G. A new approach to solve a class of continuoustime nonlinear quadratic zero-sum game using ADP. In: Proceedings of the IEEE International Conference on Networking Sensing and Control. Sanya, China: IEEE, 2008. 507-512
|
[13] |
Vamvoudakis K, Lewis F. Online synchronous policy iteration method for optimal control. In: Proceedings of the Recent Advances in Intelligent Control Systems. London: Springer, 2009. 357-374
|
[14] |
Zhang H G, Wei Q L, Liu D R. An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games. Automatica, 2010, 47: 207-214
|
[15] |
Zhang X, Zhang H G, Luo Y H, Dong M. Iteration algorithm for solving the optimal strategies of a class of nonaffine nonlinear quadratic zero-sum games. In: Proceedings of the IEEE Conference Decision and Control. Xuzhou, China: IEEE, 2010. 1359-1364
|
[16] |
Vrabie D, Lewis F. Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games. In: Proceedings of the 49th IEEE Conference Decision and Control. Atlanta, GA: IEEE, 2010. 3066-3071
|
[17] |
Vamvoudakis K G, Lewis F L. Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica, 2011, 47(8): 1556-1569
|
[18] |
Zhang H, Liu D, Luo Y, Wang D. Adaptive Dynamic Programming for Control-Algorithms and Stability (Communications and Control Engineering). London: Springer-Verlag, 2013
|
[19] |
Johnson M, Bhasin S, Dixon W E. Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm. In: Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference. Orlando, FL, USA: IEEE, 2011. 142-147
|
[20] |
Mehta P, Meyn S. Q-learning and pontryagin's minimum principle. In: Proceedings of the IEEE Conference on Decision and Control. Shanghai, China: IEEE, 2009. 3598-3605
|
[21] |
Vrabie D, Abu-Khalaf M, Lewis F L, Wang Y Y. Continuous-time ADP for linear systems with partially unknown dynamics. In: Proceedings of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning. Honolulu, HI: IEEE, 2007. 247-253
|
[22] |
Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 1998. 4
|
[23] |
Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis K, Lewis F L, Dixon W. A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica, 2013, 49(1): 89-92
|
[24] |
Chowdhary G, Yucelen T, Mühlegg M, Johnson E N. Concurrent learning adaptive control of linear systems with exponentially convergent bounds. International Journal of Adaptive Control and Signal Processing, 2012, 27(4): 280-301
|
[25] |
Chowdhary G V, Johnson E N. Theory and flight-test validation of a concurrent-learning adaptive controller. Journal of Guidance, Control, and Dynamics, 2011, 34(2): 592-607
|
[26] |
Ioannou P, Sun J. Robust Adaptive Control. New Jersey: Prentice Hall, 1996. 198
|
[27] |
Khalil H K. Nonlinear Systems (Third edition). New Jersey: Prentice Hall, 2002. 172
|
[28] |
Nevistić V, Primbs J A. Constrained nonlinear optimal control: a converse HJB approach. California Institute of Technology, Pasadena, CA 91125, Techical Report CIT-CDS 96-021, 1996. 5
|
[29] |
Savitzky A, Golay M J E. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 1964, 36(8): 1627-1639
|