A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation.
Volume 7, Issue 2, Mar. 2020

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Teng Liu, Bin Tian, Yunfeng Ai and Fei-Yue Wang, "Parallel Reinforcement Learning-Based Energy Efficiency Improvement for a Cyber-Physical System," IEEE/CAA J. Autom. Sinica, vol. 7, no. 2, pp. 617-626, Mar. 2020. doi: 10.1109/JAS.2020.1003072

Parallel Reinforcement Learning-Based Energy Efficiency Improvement for a Cyber-Physical System

doi: 10.1109/JAS.2020.1003072
Funds: The work was supported in part by the National Natural Science Foundation of China (61533019, 91720000), the Beijing Municipal Science and Technology Commission (Z181100008918007), and the Intel Collaborative Research Institute for Intelligent and Automated Connected Vehicles (ICRI-IACV).
  • As a complex and critical cyber-physical system (CPS), the hybrid electric powertrain plays a significant role in mitigating air pollution and improving fuel economy, and the energy management strategy (EMS) is key to improving the energy efficiency of this CPS. This paper presents a novel bidirectional long short-term memory (LSTM) network based parallel reinforcement learning (PRL) approach to construct the EMS for a hybrid tracked vehicle (HTV). The method contains two levels. The higher level first establishes a parallel system that includes a real powertrain system and an artificial system; the data synthesized by this parallel system are then used to train a bidirectional LSTM network. The lower level determines the optimal EMS using the trained state-action function within a model-free reinforcement learning (RL) framework. PRL is a fully data-driven and learning-enabled approach that does not depend on any prediction or predefined rules. Finally, real-vehicle testing is conducted, and the relevant experimental data are collected and calibrated. Experimental results validate that the proposed EMS achieves a considerable energy-efficiency improvement compared with the conventional RL approach and deep RL.
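To make the two-level workflow described in the abstract more concrete, the following is a minimal, illustrative sketch, not the authors' implementation. It assumes a one-dimensional power-demand signal, PyTorch for the bidirectional LSTM, and a small discrete set of engine-power actions for tabular Q-learning; the names artificial_system, BiLSTMDemand, and powertrain_step are hypothetical, and the toy battery/engine model and reward stand in for the real HTV dynamics.

```python
import numpy as np
import torch
import torch.nn as nn

# ---- Higher level: parallel system = real measurements + artificial data ----
def artificial_system(n_steps, seed=0):
    """Hypothetical artificial system: a bounded random walk of power demand (kW)."""
    rng = np.random.default_rng(seed)
    demand = np.cumsum(rng.normal(0.0, 2.0, n_steps)) + 30.0
    return np.clip(demand, 0.0, 80.0).astype(np.float32)

class BiLSTMDemand(nn.Module):
    """Bidirectional LSTM mapping a window of past power demand to the next value."""
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                    # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])      # predicted next demand, (batch, 1)

def make_windows(series, window=10):
    xs = np.stack([series[i:i + window] for i in range(len(series) - window)])
    ys = series[window:]
    return torch.tensor(xs).unsqueeze(-1), torch.tensor(ys).unsqueeze(-1)

# In the paper the synthesized data mix real and artificial trajectories;
# only the artificial component is generated here, for brevity.
demand = np.concatenate([artificial_system(500, seed=s) for s in range(4)])
X, Y = make_windows(demand)

model = BiLSTMDemand()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for epoch in range(5):                       # short demonstration training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    optimizer.step()

# ---- Lower level: model-free Q-learning for the power split (the EMS) ----
soc_bins, dem_bins = 10, 8
engine_actions = np.linspace(0.0, 60.0, 7)   # candidate engine power levels (kW)
Q = np.zeros((soc_bins, dem_bins, len(engine_actions)))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(1)

def powertrain_step(soc, p_dem, p_eng):
    """Toy battery/engine model: the battery covers the demand gap; the reward
    penalizes fuel-related engine power and deviation from a reference SoC."""
    p_batt = p_dem - p_eng
    soc_next = float(np.clip(soc - 0.001 * p_batt, 0.0, 1.0))
    reward = -0.02 * p_eng - 5.0 * abs(soc_next - 0.6)
    return soc_next, reward

def state(soc, p_dem):
    return (min(int(soc * soc_bins), soc_bins - 1),
            min(int(p_dem / 10.0), dem_bins - 1))

soc = 0.6
for t in range(len(demand) - 1):
    s = state(soc, demand[t])
    a = rng.integers(len(engine_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
    soc, r = powertrain_step(soc, demand[t], engine_actions[a])
    s_next = state(soc, demand[t + 1])
    Q[s][a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s][a])
```

In the full approach, the bidirectional LSTM trained on the parallel-system data would supply the transitions from which the lower-level state-action function is learned; the toy loop above trains directly on the generated demand trace only to keep the sketch self-contained.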

     



Figures (10) / Tables (2)

    Article Metrics

    Article views (1227), PDF downloads (113)

    Highlights

    • Parallel reinforcement learning is used to construct the energy management strategy.
    • A parallel system including a real powertrain and an artificial system is built.
    • Data from the parallel system are used to train a bidirectional long short-term memory network.
    • Real-vehicle testing is implemented, and experimental data are collected and calibrated.
