Traffic Signal Timing via Deep Reinforcement Learning

Li Li; Yisheng Lv; Fei-Yue Wang

Volume 3 Issue 3

Jul. 2016

IEEE/CAA Journal of Automatica Sinica



Li Li, Yisheng Lv and Fei-Yue Wang, "Traffic Signal Timing via Deep Reinforcement Learning," IEEE/CAA J. of Autom. Sinica, vol. 3, no. 3, pp. 247-254, 2016.


1. Department of Automation, Tsinghua University, Beijing 100084, China, and also with Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Nanjing 210096, China;
2. State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Funds:

This work was supported by the National Natural Science Foundation of China (61533019, 71232006, 61233001).

Abstract

In this paper, we propose a set of algorithms to design signal timing plans via deep reinforcement learning. The core idea of this approach is to set up a deep neural network (DNN) to learn the Q-function of reinforcement learning from the sampled traffic state/control inputs and the corresponding traffic system performance output. Based on the obtained DNN, we can find the appropriate signal timing policies by implicitly modeling the control actions and the change of system states. We explain the possible benefits and implementation tricks of this new approach. The relationships between this new approach and some existing approaches are also carefully discussed.
Keywords: traffic control, reinforcement learning, deep learning, deep reinforcement learning
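
The core idea in the abstract, a neural network that learns the Q-function from sampled traffic states, control inputs, and observed performance, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy two-approach intersection (`step`), the reward (negative total queue length), the one-hidden-layer network (`QNetwork`), and all hyperparameters. The network architecture and training procedure used by the authors are not described in this section.

```python
import numpy as np

rng = np.random.default_rng(0)


def step(queues, action, arrivals, served=3):
    """Hypothetical toy intersection with two approaches.

    The approach given green (`action`) discharges up to `served` vehicles.
    Returns the new queue lengths and the reward (negative total queue).
    """
    q = queues.astype(float) + arrivals
    q[action] = max(q[action] - served, 0.0)
    reward = -q.sum()
    return q, reward


class QNetwork:
    """One-hidden-layer network mapping a state to one Q-value per action."""

    def __init__(self, n_state=2, n_hidden=16, n_action=2):
        self.W1 = rng.normal(0, 0.1, (n_state, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_action))
        self.b2 = np.zeros(n_action)

    def forward(self, s):
        h = np.tanh(s @ self.W1 + self.b1)
        return h, h @ self.W2 + self.b2

    def q_values(self, s):
        return self.forward(s)[1]

    def update(self, s, a, target, lr=1e-3):
        """One semi-gradient step on 0.5 * (Q(s, a) - target)^2."""
        h, q = self.forward(s)
        err = q[a] - target
        # Backpropagate through the hidden layer before changing W2.
        grad_h = err * self.W2[:, a] * (1 - h ** 2)
        self.W2[:, a] -= lr * err * h
        self.b2[a] -= lr * err
        self.W1 -= lr * np.outer(s, grad_h)
        self.b1 -= lr * grad_h
        return err


def train(net, steps=2000, gamma=0.9, epsilon=0.1, scale=10.0):
    """Epsilon-greedy Q-learning on the toy intersection."""
    s = np.zeros(2)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = int(rng.integers(2))  # explore: random phase
        else:
            a = int(np.argmax(net.q_values(s / scale)))  # exploit
        arrivals = rng.poisson([1.0, 1.0])
        s_next, r = step(s, a, arrivals)
        # Bootstrapped target: reward plus discounted best next Q-value.
        target = r / scale + gamma * np.max(net.q_values(s_next / scale))
        net.update(s / scale, a, target)
        s = np.minimum(s_next, 50.0)  # cap queues to keep inputs bounded
    return net
```

After training, a signal timing decision for an observed state would be the action with the largest predicted Q-value, for example `np.argmax(net.q_values(np.array([8, 2]) / 10.0))`. This greedy read-out is the sense in which the policy is obtained from the learned Q-function without an explicit model of how actions change the traffic state.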


