A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation.
Volume 11 Issue 9
Sep. 2024

IEEE/CAA Journal of Automatica Sinica

Y. Li, Y. Zhang, X. Li, and C. Sun, “Regional multi-agent cooperative reinforcement learning for city-level traffic grid signal control,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 9, pp. 1987–1998, Sept. 2024. doi: 10.1109/JAS.2024.124365

Regional Multi-Agent Cooperative Reinforcement Learning for City-Level Traffic Grid Signal Control

doi: 10.1109/JAS.2024.124365
Funds: This work was supported by the National Science and Technology Major Project (2021ZD0112702), the National Natural Science Foundation (NNSF) of China (62373100, 62233003), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
  • This article studies traffic signal control for multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm, RegionSTLight, is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed that equivalently decomposes the global Q value of the traffic system into the local values of several regions. Based on this framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. To achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. Experiments in synthetic, real-world, and city-level scenarios show that RegionSTLight converges more quickly, is more stable, and achieves better asymptotic performance than state-of-the-art models.
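To make the idea of decomposing a global Q value into regional values concrete, the following is a minimal sketch under a simplifying assumption: the global value is taken to be a plain sum of per-region values, which is only one possible form of decomposition and is not confirmed by the abstract as the one RegionSTLight uses. The region count, action count, and random Q table are hypothetical. Under this assumed additive form, each region can pick its own greedy action and the result matches the best joint action found by brute force.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 regions, each choosing one of 4 regional actions.
n_regions, n_actions = 3, 4

# regional_q[k, a] stands in for Q_k(s_k, a) at one fixed state (random placeholder values).
regional_q = rng.normal(size=(n_regions, n_actions))


def global_q(joint_action):
    """Assumed additive decomposition: Q_total(a) = sum over regions k of Q_k(a_k)."""
    return sum(regional_q[k, a] for k, a in enumerate(joint_action))


# Decentralized greedy choice: each region maximizes only its own Q_k.
greedy_per_region = tuple(int(a) for a in regional_q.argmax(axis=1))

# Centralized brute force over all 4**3 joint actions, for comparison.
best_joint = max(
    itertools.product(range(n_actions), repeat=n_regions), key=global_q
)

print(greedy_per_region == best_joint)
```

The point of the comparison is the cost: the per-region search looks at 3 × 4 values, while the joint search enumerates 4³ combinations, and the gap grows exponentially with the number of regions.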

     



    Figures(10)  / Tables(5)


    Highlights

    • For the city-level traffic signal control problem, a regional multi-agent Q-learning framework is developed to reduce the overall complex control problem to several regional control problems.
    • Based on the idea of human-machine cooperation, a dynamic zoning approach is designed to divide the entire traffic network into several strongly coupled regions.
    • A lightweight spatio-temporal fusion feature extraction network is designed to achieve better cooperation inside each region.
    • Numerical experiments are conducted in a synthetic scenario, a real-world scenario, and a city-level scenario to illustrate the effectiveness of the proposed method.
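    The zoning highlight can be illustrated with a small sketch. This is not the paper's dynamic zoning method, whose details (including the human-machine cooperation step) are not given on this page. It is a stand-in that assumes one simple rule: two neighbouring intersections belong to the same region when the traffic density on the link between them is at or above a threshold. The function name `zone_by_density`, the density values, and the threshold are all hypothetical.

```python
def zone_by_density(n_nodes, link_density, threshold):
    """Group intersections into regions by merging across dense links.

    Assumed rule (illustrative only): a link with density >= threshold
    couples its two endpoints strongly enough to put them in one region.
    Implemented with a small union-find structure.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), density in link_density.items():
        if density >= threshold:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[max(root_u, root_v)] = min(root_u, root_v)

    regions = {}
    for node in range(n_nodes):
        regions.setdefault(find(node), []).append(node)
    return sorted(regions.values())


# Hypothetical corridor of 6 intersections with normalized link densities.
densities = {(0, 1): 0.8, (1, 2): 0.7, (2, 3): 0.1, (3, 4): 0.9, (4, 5): 0.2}
zones = zone_by_density(6, densities, threshold=0.5)
print(zones)  # [[0, 1, 2], [3, 4], [5]]
```

    Re-running the function as the densities change yields a new partition, which is the sense in which such a zoning would be dynamic.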
