Barrier-Certified Learning-Enabled Safe Control Design for Systems Operating in Uncertain Environments

Zahra Marvi; Bahare Kiumarsi

doi:10.1109/JAS.2021.1004347

Volume 9 Issue 3

Mar. 2022

IEEE/CAA Journal of Automatica Sinica


Z. Marvi and B. Kiumarsi, “Barrier-certified learning-enabled safe control design for systems operating in uncertain environments,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 3, pp. 437–449, Mar. 2022. doi: 10.1109/JAS.2021.1004347


Zahra Marvi^1, Bahare Kiumarsi^1

1. Department of Electrical and Computer Engineering, Michigan State University, MI 48824 USA


Author Bio:
Zahra Marvi (Student Member, IEEE) is currently a Ph.D. candidate with the Department of Electrical and Computer Engineering at Michigan State University, USA. Prior to that, she was with Advanced Robotics and Automated Systems (ARAS), K. N. Toosi University of Technology, Iran, where she received the B.Sc. degree in electrical engineering in 2013 and the M.Sc. degree in mechatronics engineering in 2016. Her research interests include nonlinear control, reinforcement learning, multi-agent systems, and robotics. Her current research focuses on designing controllers for safety-critical systems under model and environmental uncertainty.

Bahare Kiumarsi (Member, IEEE) received the B.S. degree in electrical engineering from the Shahrood University of Technology, Iran, in 2009, the M.S. degree in electrical engineering from the Ferdowsi University of Mashhad, Iran, in 2013, and the Ph.D. degree in electrical engineering from the University of Texas at Arlington (UT Arlington), USA, in 2017. In 2018, she was a Post-Doctoral Research Associate with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, USA. She is currently an Assistant Professor with the Department of Electrical and Computer Engineering, Michigan State University, USA. Her current research interests include machine learning in control, security of cyber-physical systems, game theory, and distributed control. Dr. Kiumarsi was a recipient of the UT Arlington N. M. Stelmakh Outstanding Student Research Award and the UT Arlington Graduate Dissertation Fellowship in 2017.
Corresponding author: Bahare Kiumarsi, e-mail: kiumarsi@msu.edu
Received Date: 2021-05-04
Revised Date: 2021-07-08
Accepted Date: 2021-09-23

Available Online: 2021-10-19

Abstract


This paper presents learning-enabled barrier-certified safe controllers for systems that operate in a shared environment in which multiple systems with uncertain dynamics and behaviors interact. That is, safety constraints are imposed not only by the ego system’s own physical limitations but also by other systems operating nearby. Since a model of the external agent is required to impose control barrier functions (CBFs) as safety constraints, a safety-aware loss function is defined and minimized to learn the uncertain and unknown behavior of external agents. More specifically, the loss function is defined based on the barrier function error, instead of the system model error, and is minimized over both current samples and past samples stored in memory to assure a fast and generalizable learning algorithm for approximating the safe set. The proposed model learning and CBF are then integrated to form a learning-enabled zeroing CBF (L-ZCBF), which employs the approximated trajectory information of the external agents provided by the learned model but shrinks the safety boundary in case of an imminent safety violation using instantaneous sensory observations. It is shown that the proposed L-ZCBF assures safety during learning, even in the face of an inaccurate or simplified approximation of external agents, which is crucial in safety-critical applications in highly interactive environments. The efficacy of the proposed method is examined in a simulation of safe maneuver control of a vehicle in an urban area.
Key words:
- Control barrier functions (CBFs)
- experience replay
- learning
- safety-critical systems
- uncertainty
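
The abstract describes a loss built on the barrier function error rather than the model error, minimized over the current sample together with past samples held in memory. The toy sketch below illustrates that idea only; it is not the authors' algorithm. Everything in it is an assumption made for illustration: the scalar states, the barrier h(x, x_e) = (x - x_e)^2 - d^2, the polynomial basis in `features`, the made-up "true" external dynamics `w_true`, and the plain gradient-descent update.

```python
import numpy as np


def features(x_e):
    """Hypothetical basis functions for the external agent's unknown drift."""
    return np.array([1.0, x_e, x_e ** 2])


def dh_dxe(x, x_e):
    """d/dx_e of the toy barrier h(x, x_e) = (x - x_e)**2 - d**2 (d constant)."""
    return -2.0 * (x - x_e)


def barrier_error(w, sample):
    """Error in the barrier derivative caused by the model mismatch.

    sample = (x, x_e, measured rate of x_e). The model error is weighted by
    dh/dx_e, so mismatch matters only as far as it changes h_dot.
    """
    x, x_e, xe_dot = sample
    return dh_dxe(x, x_e) * (xe_dot - w @ features(x_e))


def replay_loss(w, current, memory):
    """Mean squared barrier error over the current sample and stored samples."""
    errors = np.array([barrier_error(w, s) for s in [current, *memory]])
    return 0.5 * np.mean(errors ** 2)


def replay_gradient(w, current, memory):
    """Analytic gradient of replay_loss with respect to the weights w."""
    batch = [current, *memory]
    grad = np.zeros_like(w)
    for x, x_e, xe_dot in batch:
        err = barrier_error(w, (x, x_e, xe_dot))
        grad += err * (-dh_dxe(x, x_e) * features(x_e))
    return grad / len(batch)


def train(current, memory, steps=500, lr=0.02):
    """Plain gradient descent from zero weights (an illustrative choice)."""
    w = np.zeros(3)
    for _ in range(steps):
        w = w - lr * replay_gradient(w, current, memory)
    return w


rng = np.random.default_rng(0)
w_true = np.array([0.2, -0.5, 0.0])  # made-up "true" external dynamics
samples = []
for _ in range(21):
    x, x_e = rng.uniform(-1.0, 1.0, size=2)
    samples.append((x, x_e, w_true @ features(x_e)))
current, memory = samples[-1], samples[:-1]  # newest sample plus replay memory

loss_before = replay_loss(np.zeros(3), current, memory)
w_hat = train(current, memory)
loss_after = replay_loss(w_hat, current, memory)
print(loss_before, loss_after)
```

Weighting the model mismatch by the barrier gradient means that samples where the model error barely affects the barrier derivative contribute little to the update, which is one plausible reading of "safety-aware" learning in this setting.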




Figures(10) / Tables(1)


Highlights

- The problem of safe control design for systems operating in uncertain shared environments is formulated as two sets of decoupled dynamics, with a safety criterion defined as a function of both the ego system’s and the external agent’s states, giving a more inclusive scheme for safety-critical systems operating in cluttered environments.
- A novel learning-enabled ZCBF is proposed that guarantees safety while the unknown dynamics are being learned.
- Safety-aware model learning is proposed for rapid convergence of the approximated safe set to the exact one.
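
For orientation, the decoupled formulation can be written out in a generic form. The notation below is assumed for illustration and is not taken from the paper: control-affine ego dynamics with state $x$ and input $u$, an external agent with state $x_e$ and unknown drift $f_e$, and a safe set defined by $h(x, x_e) \ge 0$. Under these assumptions, the standard zeroing CBF condition from the CBF literature requires an extended class-$\mathcal{K}$ function $\alpha$ satisfying the last line.

```latex
\begin{align}
  \dot{x}   &= f(x) + g(x)\,u, \qquad \dot{x}_e = f_e(x_e), \\
  \mathcal{C} &= \{ (x, x_e) : h(x, x_e) \ge 0 \}, \\
  \sup_{u} \Big[ \tfrac{\partial h}{\partial x}\big(f(x) + g(x)\,u\big)
    &+ \tfrac{\partial h}{\partial x_e}\, f_e(x_e)
    + \alpha\big(h(x, x_e)\big) \Big] \ge 0 .
\end{align}
```

The term $\tfrac{\partial h}{\partial x_e} f_e(x_e)$ is where the external agent enters the constraint, which is why some model of its behavior has to be learned before the condition can be enforced.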
