A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 9 Issue 3
Mar.  2022

IEEE/CAA Journal of Automatica Sinica

Citation: Z. Marvi and B. Kiumarsi, “Barrier-certified learning-enabled safe control design for systems operating in uncertain environments,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 3, pp. 437–449, Mar. 2022. doi: 10.1109/JAS.2021.1004347

Barrier-Certified Learning-Enabled Safe Control Design for Systems Operating in Uncertain Environments

doi: 10.1109/JAS.2021.1004347
  • This paper presents learning-enabled barrier-certified safe controllers for systems that operate in a shared environment in which multiple systems with uncertain dynamics and behaviors interact. That is, safety constraints are imposed not only by the ego system’s own physical limitations but also by other systems operating nearby. Because imposing control barrier functions (CBFs) as safety constraints requires a model of the external agent, a safety-aware loss function is defined and minimized to learn the uncertain and unknown behavior of external agents. More specifically, the loss function is defined in terms of the barrier function error rather than the system model error, and it is minimized over both current samples and past samples stored in memory to obtain a fast and generalizable learning algorithm for approximating the safe set. The proposed model learning and CBF are then integrated to form a learning-enabled zeroing CBF (L-ZCBF), which uses the approximated trajectory information of the external agents provided by the learned model but shrinks the safety boundary, based on instantaneous sensory observations, when a safety violation is imminent. It is shown that the proposed L-ZCBF guarantees safety during learning, even when the approximation of the external agents is inaccurate or simplified, which is crucial in safety-critical applications in highly interactive environments. The efficacy of the proposed method is examined in a simulation of safe maneuver control of a vehicle in an urban area.
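
For orientation, the standard zeroing CBF condition from the CBF literature can be written as follows. This is only a sketch under assumptions: the control-affine form, the symbols $f$, $g$, $f_e$, $h$, and $\alpha$, and the treatment of the external agent's state $x_e$ are notation chosen here for illustration, and the abstract does not state how the L-ZCBF modifies this condition.

```latex
\begin{align}
  \dot{x} &= f(x) + g(x)\,u, \qquad \dot{x}_e = f_e(x_e), \\
  \mathcal{C} &= \{(x, x_e) : h(x, x_e) \ge 0\}, \\
  \sup_{u}\Big[\,
    \frac{\partial h}{\partial x}\big(f(x) + g(x)\,u\big)
    &+ \frac{\partial h}{\partial x_e}\, f_e(x_e)
    + \alpha\big(h(x, x_e)\big)
  \Big] \ge 0 ,
\end{align}
```

Here $\alpha$ is an extended class-$\mathcal{K}$ function and $\mathcal{C}$ is the safe set. The term containing $f_e$ is where a model of the external agent enters the constraint, which is why an unknown $f_e$ has to be learned or approximated before the condition can be enforced.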

     



    Figures(10)  / Tables(1)


    Highlights

    • Safe control design for systems operating in uncertain shared environments is formulated as two sets of decoupled dynamics, with a safety criterion defined as a function of both the ego and the external agents’ states. This gives a more inclusive scheme for safety-critical systems operating in cluttered environments.
    • A novel learning-enabled ZCBF is proposed that guarantees safety while the unknown dynamics are being learned.
    • Safety-aware model learning is proposed for rapid convergence of the approximated safe set to the exact one.
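
The ideas in these highlights can be illustrated with a small one-dimensional sketch. Everything in it is an assumption made for illustration and not the paper's algorithm: a single-integrator ego vehicle, an external agent whose velocity is linear in hypothetical features `phi`, a distance barrier with margin `d_min`, and the function names `barrier`, `barrier_rate_error`, `replay_update`, and `safe_input`. The loss penalizes the error in the barrier derivative over the current sample plus stored samples, and the safety filter is the closed-form solution of a scalar CBF quadratic program.

```python
from collections import deque

import numpy as np


def barrier(x_ego, x_ext, d_min=1.0):
    """h >= 0 means the ego is at least d_min behind the external agent."""
    return (x_ext - x_ego) - d_min


def barrier_rate_error(theta_hat, sample):
    """Measured minus model-predicted barrier derivative for one sample."""
    phi, u, h_dot_meas = sample
    h_dot_pred = theta_hat @ phi - u  # external velocity model minus ego velocity
    return h_dot_meas - h_dot_pred


def replay_update(theta_hat, current, memory, lr=0.05):
    """One gradient step on 0.5 * error^2 summed over current and stored samples."""
    grad = np.zeros_like(theta_hat)
    for sample in [current, *memory]:
        phi = sample[0]
        grad += -barrier_rate_error(theta_hat, sample) * phi
    return theta_hat - lr * grad


def safe_input(u_nom, h, v_ext_hat, alpha=1.0):
    """Solve min (u - u_nom)^2 subject to v_ext_hat - u + alpha * h >= 0."""
    return min(u_nom, v_ext_hat + alpha * h)


# Synthetic, noise-free data; theta_true plays the role of the unknown behavior.
theta_true = np.array([0.5, 0.2])
theta_hat = np.zeros(2)
memory = deque(maxlen=10)  # past samples kept for replay

for k in range(200):
    phi = np.array([1.0, np.sin(0.5 * k)])  # hypothetical features
    u = 0.1                                  # ego input applied while logging
    h_dot_meas = theta_true @ phi - u        # what a sensor would report
    current = (phi, u, h_dot_meas)
    theta_hat = replay_update(theta_hat, current, memory)
    memory.append(current)

# Filter a nominal input with the learned external-agent velocity estimate.
h_now = barrier(x_ego=0.0, x_ext=1.5)
v_ext_hat = theta_hat @ np.array([1.0, 0.0])
u_safe = safe_input(u_nom=2.0, h=h_now, v_ext_hat=v_ext_hat)
```

Reusing stored samples in each update is what lets the estimate converge from data that a single sample could not identify. The sketch omits the boundary-shrinking mechanism the paper describes for imminent violations, since the page does not specify it.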
