IEEE/CAA Journal of Automatica Sinica
Citation: Lei Xue, Changyin Sun, Donald Wunsch, Yingjiang Zhou and Fang Yu, "An Adaptive Strategy via Reinforcement Learning for the Prisoner's Dilemma Game," IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 301-310, Jan. 2018. doi: 10.1109/JAS.2017.7510466
[1] J. Seiffertt, S. Mulder, R. Dua, and D. C. Wunsch, "Neural networks and Markov models for the iterated prisoner's dilemma," in Proc. Int. Joint Conf. Neural Networks, Atlanta, GA, USA, 2009, pp. 2860-2866. http://dl.acm.org/citation.cfm?id=1704398
[2] H. Y. Quek, K. C. Tan, C. K. Goh, and H. A. Abbass, "Evolution and incremental learning in the iterated prisoner's dilemma," IEEE Trans. Evol. Comput., vol. 13, no. 2, pp. 303-320, Apr. 2009. http://ieeexplore.ieee.org/document/4703197/
[3] R. Axelrod, The Evolution of Cooperation. New York, USA: Basic, 1984.
[4] M. A. Nowak and R. M. May, "Evolutionary games and spatial chaos," Nature, vol. 359, no. 6398, pp. 826-829, Oct. 1992. http://www.jstor.org/servlet/linkout?suffix=rf92&dbid=16&doi=10.1086%2F670192&key=10.1038%2F359826a0
[5] F. Fu, M. A. Nowak, and C. Hauert, "Invasion and expansion of cooperators in lattice populations: Prisoner's dilemma vs. snowdrift games," J. Theor. Biol., vol. 266, no. 3, pp. 358-366, Oct. 2010. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2927800/?tool=pubmed
[6] J. Liu, Y. Li, C. Xu, and P. M. Hui, "Evolutionary behavior of generalized zero-determinant strategies in iterated prisoner's dilemma," Phys. A Stat. Mech. Appl., vol. 430, pp. 81-92, Jul. 2015. http://www.sciencedirect.com/science/article/pii/S0378437115002034
[7] G. Szabó and G. Fáth, "Evolutionary games on graphs," Phys. Rep., vol. 446, no. 4-6, pp. 97-216, Jul. 2007. http://www.sciencedirect.com/science/article/pii/S0370157307001810
[8] D. C. Wunsch and S. Mulder, "Evolutionary algorithms, Markov decision processes, adaptive critic designs, and clustering: Commonalities, hybridization and performance," in Proc. Int. Conf. Intelligent Sensing and Information Processing, Chennai, India, 2004, pp. 477-482. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=1287704
[9] H. Ishibuchi and N. Namikawa, "Evolution of iterated prisoner's dilemma game strategies in structured demes under random pairing in game playing," IEEE Trans. Evol. Comput., vol. 9, no. 6, pp. 552-561, Dec. 2005. http://ieeexplore.ieee.org/document/1545934/
[10] H. Ishibuchi, H. Ohyanagi, and Y. Nojima, "Evolution of strategies with different representation schemes in a spatial iterated prisoner's dilemma game," IEEE Trans. Comput. Intell. AI Games, vol. 3, no. 1, pp. 67-82, Mar. 2011. http://ieeexplore.ieee.org/document/5705567/
[11] D. Ashlock and E. Y. Kim, "Fingerprinting: Visualization and automatic analysis of prisoner's dilemma strategies," IEEE Trans. Evol. Comput., vol. 12, no. 5, pp. 647-659, Oct. 2008. http://ieeexplore.ieee.org/document/4492964/
[12] D. Ashlock, E. Y. Kim, and W. Ashlock, "Fingerprint analysis of the noisy prisoner's dilemma using a finite-state representation," IEEE Trans. Comput. Intell. AI Games, vol. 1, no. 2, pp. 154-167, Jun. 2009. http://ieeexplore.ieee.org/document/4804733/
[13] D. Ashlock and C. Lee, "Agent-case embeddings for the analysis of evolved systems," IEEE Trans. Evol. Comput., vol. 17, no. 2, pp. 227-240, Apr. 2013. http://ieeexplore.ieee.org/document/6384730/
[14] J. S. Wu, Y. Q. Hou, L. C. Jiao, and H. J. Li, "Community structure inhibits cooperation in the spatial prisoner's dilemma," Phys. A Stat. Mech. Appl., vol. 412, pp. 169-179, Oct. 2014. http://www.sciencedirect.com/science/article/pii/S0378437114005172
[15] Y. Z. Cui and X. Y. Wang, "Uncovering overlapping community structures by the key bi-community and intimate degree in bipartite networks," Phys. A Stat. Mech. Appl., vol. 407, pp. 7-14, Aug. 2014. http://www.sciencedirect.com/science/article/pii/S037843711400288X
[16] S. P. Nageshrao, G. A. D. Lopes, D. Jeltsema, and R. Babuška, "Port-Hamiltonian systems in adaptive and learning control: A survey," IEEE Trans. Autom. Control, vol. 61, no. 5, pp. 1223-1238, May 2016. doi: 10.1109/TAC.2015.2458491
[17] C. M. Liu, X. Xu, and D. W. Hu, "Multiobjective reinforcement learning: A comprehensive overview," IEEE Trans. Syst. Man Cybern. Syst., vol. 45, no. 3, pp. 385-398, Mar. 2015.
[18] Y. J. Liu, Y. Gao, S. C. Tong, and Y. M. Li, "Fuzzy approximation-based adaptive backstepping optimal control for a class of nonlinear discrete-time systems with dead-zone," IEEE Trans. Fuzzy Syst., vol. 24, no. 1, pp. 16-28, Feb. 2016. http://ieeexplore.ieee.org/document/7072483/
[19] Y. Gao and Y. J. Liu, "Adaptive fuzzy optimal control using direct heuristic dynamic programming for chaotic discrete-time system," J. Vibrat. Control, vol. 22, no. 2, pp. 595-603, 2016. doi: 10.1177/1077546314534286
[20] Y. J. Liu, L. Tang, S. C. Tong, C. L. P. Chen, and D. J. Li, "Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 1, pp. 165-176, Jan. 2015. http://www.ncbi.nlm.nih.gov/pubmed/25438326
[21] K. G. Vamvoudakis, F. L. Lewis, and G. R. Hudas, "Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality," Automatica, vol. 48, no. 8, pp. 1598-1611, Aug. 2012.
[22] P. Hingston and G. Kendall, "Learning versus evolution in iterated prisoner's dilemma," in Proc. Congr. Evolutionary Computation, Portland, OR, USA, 2004, pp. 364-372. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=1330880
[23] S. Y. Chong and X. Yao, "Multiple choices and reputation in multiagent interactions," IEEE Trans. Evol. Comput., vol. 11, no. 6, pp. 689-711, Dec. 2007. http://ieeexplore.ieee.org/document/4358753/
[24] E. Semsar-Kazerooni and K. Khorasani, "Multi-agent team cooperation: A game theory approach," Automatica, vol. 45, no. 10, pp. 2205-2213, Oct. 2009. http://www.sciencedirect.com/science/article/pii/S0005109809002970
[25] D. Ashlock, J. A. Brown, and P. Hingston, "Multiple opponent optimization of prisoner's dilemma playing agents," IEEE Trans. Comput. Intell. AI Games, vol. 7, no. 1, pp. 53-65, Mar. 2015. http://ieeexplore.ieee.org/document/6819427/
[26] J. W. Li and G. Kendall, "The effect of memory size on the evolutionary stability of strategies in iterated prisoner's dilemma," IEEE Trans. Evol. Comput., vol. 18, no. 6, pp. 819-826, Dec. 2014. http://ieeexplore.ieee.org/document/6642072
[27] K. Moriyama, "Learning-rate adjusting Q-learning for prisoner's dilemma games," in Proc. IEEE/WIC/ACM Int. Conf. Web Intelligence and Intelligent Agent Technology, Sydney, NSW, Australia, 2008, pp. 322-325. http://ieeexplore.ieee.org/document/4740642/
[28] X. Y. Deng, Z. P. Zhang, Y. Deng, Q. Liu, and S. H. Chang, "Self-adaptive win-stay-lose-shift reference selection mechanism promotes cooperation on a square lattice," Appl. Math. Comput., vol. 284, pp. 322-331, Jul. 2016. http://www.sciencedirect.com/science/article/pii/S0096300316302028
[29] F. C. Santos and J. M. Pacheco, "Scale-free networks provide a unifying framework for the emergence of cooperation," Phys. Rev. Lett., vol. 95, no. 9, Art. no. 098104, Aug. 2005. http://www.ncbi.nlm.nih.gov/pubmed/16197256?dopt=Abstract
[30] F. C. Santos and J. M. Pacheco, "A new route to the evolution of cooperation," J. Evol. Biol., vol. 19, no. 3, pp. 726-733, May 2006. doi: 10.1111/jeb.2006.19.issue-3