IEEE/CAA Journal of Automatica Sinica
Citation: Xin Chen, Bo Fu, Yong He and Min Wu, "Timesharing-tracking Framework for Decentralized Reinforcement Learning in Fully Cooperative Multi-agent System," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 2, pp. 127-133, 2014.
[1] Gao Yang, Chen Shi-Fu, Lu Xin. Research on reinforcement learning technology: a review. Acta Automatica Sinica, 2004, 30(1):86-100 (in Chinese)
[2] Busoniu L, Babuska R, De Schutter B. Decentralized reinforcement learning control of a robotic manipulator. In: Proceedings of the 9th International Conference on Control, Automation, Robotics and Vision. Singapore, Singapore: IEEE, 2006. 1347-1352
[3] Maravall D, De Lope J, Domínguez R. Coordination of communication in robot teams by reinforcement learning. Robotics and Autonomous Systems, 2013, 61(7):661-666
[4] Gabel T, Riedmiller M. The cooperative driver: multi-agent learning for preventing traffic jams. International Journal of Traffic and Transportation Engineering, 2013, 1(4):67-76
[5] Tumer K, Agogino A K. Distributed agent-based air traffic flow management. In: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems. Honolulu, Hawaii, USA: ACM, 2007. 330-337
[6] Tang Hao, Wan Hai-Feng, Han Jiang-Hong, Zhou Lei. Coordinated lookahead control of multiple CSPS system by multi-agent reinforcement learning. Acta Automatica Sinica, 2010, 36(2):330-337 (in Chinese)
[7] Busoniu L, Babuska R, De Schutter B. A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, 2008, 38(2):156-172
[8] Abdallah S, Lesser V. A multiagent reinforcement learning algorithm with non-linear dynamics. Journal of Artificial Intelligence Research, 2008, 33:521-549
[9] Xu Xin, Shen Dong, Gao Yan-Qing, Wang Kai. Learning control of dynamical systems based on Markov decision processes: research frontiers and outlooks. Acta Automatica Sinica, 2012, 38(5):673-687 (in Chinese)
[10] Fulda N, Ventura D. Predicting and preventing coordination problems in cooperative Q-learning systems. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc, 2007. 780-785
[11] Chen X, Chen G, Cao W H, Wu M. Cooperative learning with joint state value approximation for multi-agent systems. Journal of Control Theory and Applications, 2013, 11(2):149-155
[12] Wang Y, de Silva C W. Multi-robot box-pushing: single-agent Q-learning vs. team Q-learning. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. Beijing, China: IEEE, 2006. 3694-3699
[13] Cheng Yu-Hu, Feng Huan-Ting, Wang Xue-Song. Expectation-maximization policy search with parameter-based exploration. Acta Automatica Sinica, 2012, 38(1):38-45 (in Chinese)
[14] Teboul O, Kokkinos I, Simon L, Koutsourakis P, Paragios N. Parsing facades with shape grammars and reinforcement learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(7):1744-1756
[15] Matignon L, Laurent G J, Fort-Piat N L. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems. The Knowledge Engineering Review, 2012, 27:1-31
[16] Bowling M, Veloso M. Multiagent learning using a variable learning rate. Artificial Intelligence, 2002, 136(2):215-250
[17] Kapetanakis S, Kudenko D. Reinforcement learning of coordination in heterogeneous cooperative multi-agent systems. In: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems. New York, USA: IEEE, 2004. 1258-1259
[18] Matignon L, Laurent G J, Fort-Piat N L. Hysteretic Q-learning: an algorithm for decentralized reinforcement learning in cooperative multiagent teams. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. San Diego, California, USA: IEEE, 2007. 64-69
[19] Tsitsiklis J N. On the convergence of optimistic policy iteration. The Journal of Machine Learning Research, 2003, 3:59-72
[20] Wang Y, de Silva C W. A machine-learning approach to multi-robot coordination. Engineering Applications of Artificial Intelligence, 2008, 21(3):470-484