A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: L. Xu, J. Liu, X. Chang, X. Liu, and C. Sun, “Hazard-aware weighted advantage combination for UAV target tracking and obstacle avoidance,” IEEE/CAA J. Autom. Sinica, 2024. doi: 10.1109/JAS.2024.124920

Hazard-Aware Weighted Advantage Combination for UAV Target Tracking and Obstacle Avoidance

doi: 10.1109/JAS.2024.124920
Funds: This work was supported by the National Natural Science Foundation of China (62236002, 61921004).
  • In recent years, the rapid evolution of unmanned aerial vehicles (UAVs) has brought transformative changes across many industries. However, fundamental challenges in UAV technology, particularly target tracking and obstacle avoidance, remain crucial for applications such as wildlife protection and military security. Many existing reinforcement-learning methods for UAV multi-task problems must be redesigned and retrained for each new setting and therefore do not extend quickly or effectively to other scenarios. To this end, we propose a novel solution based on a hazard-aware weighted advantage combination for UAV target tracking and obstacle avoidance. First, we independently train a target-tracking network and an obstacle-avoidance network using the Dueling Double Deep Q-Network reinforcement learning algorithm. In the multi-task scenario, we then reuse the two pre-trained networks and design a weight determined by the risk level the UAV currently faces. This weight is used to compute a weighted sum of the advantage values from both networks, so the final action is obtained without any retraining. We validate our approach through extensive simulation experiments in the CoppeliaSim robotics simulator. The results demonstrate that our method outperforms current state-of-the-art techniques in both tracking accuracy and collision avoidance.
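The following is a minimal, hypothetical Python sketch of the combination rule summarized in the abstract: two pre-trained Dueling Double DQN networks expose their advantage values, a hazard weight derived from the UAV's distance to the nearest obstacle blends them, and the final action is the argmax of the blended advantages, with no retraining. The names (hazard_weight, select_action, d_safe) and the linear distance-to-weight mapping are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hazard_weight(obstacle_distance: float, d_safe: float = 2.0) -> float:
    """Map the UAV's current risk level to a weight in [0, 1].
    Closer obstacles -> larger weight on the avoidance task.
    (Illustrative linear mapping; the paper's exact rule may differ.)"""
    return float(np.clip(1.0 - obstacle_distance / d_safe, 0.0, 1.0))

def select_action(advantage_track: np.ndarray,
                  advantage_avoid: np.ndarray,
                  obstacle_distance: float) -> int:
    """Weighted sum of the two advantage vectors from the pre-trained
    tracking and avoidance networks; the action is chosen greedily."""
    w = hazard_weight(obstacle_distance)
    combined = (1.0 - w) * advantage_track + w * advantage_avoid
    return int(np.argmax(combined))

# Example: five discrete actions, obstacle 0.5 m away (high risk),
# so the avoidance advantages dominate the choice.
a_track = np.array([0.9, 0.1, 0.2, 0.0, 0.3])
a_avoid = np.array([0.0, 0.8, 0.1, 0.2, 0.0])
print(select_action(a_track, a_avoid, obstacle_distance=0.5))  # -> 1
```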

     

  • [1]
    S. Li, T. Liu, C. Zhang, D.-Y. Yeung, and S. Shen, “Learning unmanned aerial vehicle control for autonomous target following,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018, pp. 4936–4942.
    [2]
    S. P. Bharati, Y. Wu, Y. Sui, C. Padgett, and G. Wang, “Real-time obstacle detection and tracking for sense-and-avoid mechanism in uavs,” IEEE Transactions on Intelligent Vehicles, vol. 3, no. 2, pp. 185–197, 2018. doi: 10.1109/TIV.2018.2804166
    [3]
    B. Li, Z. Gan, D. Chen, and D. Sergey Aleksandrovich, “Uav maneuvering target tracking in uncertain environments based on deep reinforcement learning and meta-learning,” Remote Sensing, vol. 12, no. 22, p. 3789, 2020. doi: 10.3390/rs12223789
    [4]
    S. Bhagat and P. Sujit, “Uav target tracking in urban environments using deep reinforcement learning,” in 2020 International conference on unmanned aircraft systems (ICUAS). IEEE, 2020, pp. 694–701.
    [5]
    J. Kim, “Target following and close monitoring using an unmanned surface vehicle,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 50, no. 11, pp. 4233–4242, 2018.
    [6]
    L. Xu, T. Wang, W. Cai, and C. Sun, “Uav target following in complex occluded environments with adaptive multi-modal fusion,” Applied Intelligence, vol. 53, no. 13, pp. 16998–17014, 2023. doi: 10.1007/s10489-022-04317-2
    [7]
    Y. Xue and W. Chen, “Multi-agent deep reinforcement learning for uavs navigation in unknown complex environment,” IEEE Transactions on Intelligent Vehicles, 2023.
    [8]
    D. Wang, Q. Pan, Y. Shi, J. Hu, and C. Zhao, “Efficient nonlinear model predictive control for quadrotor trajectory tracking: Algorithms and experiment,” IEEE Transactions on Cybernetics, vol. 51, no. 10, pp. 5057–5068, 2021. doi: 10.1109/TCYB.2020.3043361
    [9]
    Y. Yang, L. Liao, H. Yang, and S. Li, “An optimal control strategy for multi-uavs target tracking and cooperative competition,” IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 12, pp. 1931–1947, 2021. doi: 10.1109/JAS.2020.1003012
    [10]
    P. Sun, S. Li, B. Zhu, Z. Zuo, and X. Xia, “Vision-based fixed-time uncooperative aerial target tracking for uav,” IEEE/CAA Journal of Automatica Sinica, vol. 10, no. 5, pp. 1322–1324, 2023. doi: 10.1109/JAS.2023.123510
    [11]
    H. Huang, Y. Yang, H. Wang, Z. Ding, H. Sari, and F. Adachi, “Deep reinforcement learning for uav navigation through massive mimo technique,” IEEE Transactions on Vehicular Technology, vol. 69, no. 1, pp. 1117–1121, 2019.
    [12]
    S. Feng, L. Zeng, J. Liu, Y. Yang, and W. Song, “Multi-uavs collaborative path planning in the cramped environment,” IEEE/CAA Journal of Automatica Sinica, vol. 11, no. 2, pp. 529–538, 2024. doi: 10.1109/JAS.2023.123945
    [13]
    H. Kandath, T. Bera, R. Bardhan, and S. Sundaram, “Autonomous navigation and sensorless obstacle avoidance for ugv with environment information from uav,” in 2018 Second IEEE International Conference on Robotic Computing (IRC). IEEE, 2018, pp. 266–269.
    [14]
    Z. Han, R. Zhang, N. Pan, C. Xu, and F. Gao, “Fast-tracker: A robust aerial system for tracking agile target in cluttered environments,” in 2021 IEEE international conference on robotics and automation (ICRA). IEEE, 2021, pp. 328–334.
    [15]
    N. Pan, R. Zhang, T. Yang, C. Cui, C. Xu, and F. Gao, “Fast-tracker 2.0: Improving autonomy of aerial tracking with active vision and human location regression,” IET Cyber-Systems and Robotics, vol. 3, no. 4, pp. 292–301, 2021. doi: 10.1049/csy2.12033
    [16]
    X. Zhou, X. Wen, Z. Wang, Y. Gao, H. Li, Q. Wang, T. Yang, H. Lu, Y. Cao, C. Xu, et al, “Swarm of micro flying robots in the wild,” Science Robotics, vol. 7, no. 66, p. eabm595, 2022.
    [17]
    S. S. Mansouri, C. Kanellakis, B. Lindqvist, F. Pourkamali-Anaraki, A.- A. Agha-Mohammadi, J. Burdick, and G. Nikolakopoulos, “A unified nmpc scheme for mavs navigation with 3d collision avoidance under position uncertainty,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 5740–5747, 2020. doi: 10.1109/LRA.2020.3010485
    [18]
    J. Li, H. He, and A. Tiwari, “Simulation of autonomous uav navigation with collision avoidance and space awareness,” in 2020 3rd International Conference on Intelligent Robotic and Control Engineering (IRCE). IEEE, 2020, pp. 110–116.
    [19]
    C. Yan, X. Xiang, and C. Wang, “Towards real-time path planning through deep reinforcement learning for a uav in dynamic environments,” Journal of Intelligent & Robotic Systems, vol. 98, pp. 297–309, 2020.
    [20]
    G. Xu, W. Jiang, Z. Wang, and Y. Wang, “Autonomous obstacle avoidance and target tracking of uav based on deep reinforcement learning,” Journal of Intelligent & Robotic Systems, vol. 104, no. 4, p. 60, 2022.
    [21]
    C. Sampedro, A. Rodriguez-Ramos, H. Bavle, A. Carrio, P. de la Puente, and P. Campoy, “A fully-autonomous aerial robot for search and rescue applications in indoor environments using learning-based techniques,” Journal of Intelligent & Robotic Systems, vol. 95, pp. 601–627, 2019.
    [22]
    C. Wang, J. Wang, Y. Shen, and X. Zhang, “Autonomous navigation of uavs in large-scale complex environments: A deep reinforcement learning approach,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 2124–2136, 2019. doi: 10.1109/TVT.2018.2890773
    [23]
    J. Moon, S. Papaioannou, C. Laoudias, P. Kolios, and S. Kim, “Deep reinforcement learning multi-uav trajectory control for target tracking,” IEEE Internet of Things Journal, vol. 8, no. 20, pp. 15441–15455, 2021. doi: 10.1109/JIOT.2021.3073973
    [24]
    N. Patrizi, G. Fragkos, K. Ortiz, M. Oishi, and E. E. Tsiropoulou, “A uav-enabled dynamic multi-target tracking and sensing framework,” in GLOBECOM 2020-2020 IEEE Global Communications Conference. IEEE, 2020, pp. 1–6.
    [25]
    Z. Xia, J. Du, J. Wang, C. Jiang, Y. Ren, G. Li, and Z. Han, “Multi-agent reinforcement learning aided intelligent uav swarm for target tracking,” IEEE Transactions on Vehicular Technology, vol. 71, no. 1, pp. 931–945, 2021.
    [26]
    Z. Zheng, J. Li, Z. Guan, and Z. Zuo, “Constrained moving path following control for uav with robust control barrier function,” IEEE/CAA Journal of Automatica Sinica, vol. 10, no. 7, pp. 1557–1570, 2023. doi: 10.1109/JAS.2023.123573
    [27]
    J. Kang, J. Chen, M. Xu, Z. Xiong, Y. Jiao, L. Han, D. Niyato, Y. Tong, and S. Xie, “Uav-assisted dynamic avatar task migration for vehicular metaverse services: A multi-agent deep reinforcement learning approach,” IEEE/CAA Journal of Automatica Sinica, vol. 11, no. 2, pp. 430–445, 2024. doi: 10.1109/JAS.2023.123993
    [28]
    P. Sun, B. Zhu, Z. Zuo, and M. V. Basin, “Vision-based finite-time uncooperative target tracking for uav subject to actuator saturation,” Automatica, vol. 130, p. 109708, 2021. doi: 10.1016/j.automatica.2021.109708
    [29]
    W.-C. Chen, C.-L. Lin, Y.-Y. Chen, and H.-H. Cheng, “Quadcopter drone for vision-based autonomous target following,” Aerospace, vol. 10, no. 1, p. 82, 2023. doi: 10.3390/aerospace10010082
    [30]
    Y. Liu, Z. Meng, Y. Zou, and M. Cao, “Visual object tracking and servoing control of a nano-scale quadrotor: System, algorithms, and experiments.,” IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 2, pp. 344–360, 2021. doi: 10.1109/JAS.2020.1003530
    [31]
    P. Sun, B. Zhu, and S. Li, “Vision-based prescribed performance control for uav target tracking subject to actuator saturation,” IEEE Transactions on Intelligent Vehicles, 2023.
    [32]
    N. Bashir, S. Boudjit, G. Dauphin, and S. Zeadally, “An obstacle avoidance approach for uav path planning,” Simulation Modelling Practice and Theory, vol. 129, p. 102815, 2023. doi: 10.1016/j.simpat.2023.102815
    [33]
    W. Hematulin, P. Kamsing, P. Torteeka, T. Somjit, T. Phisannupawong, and T. Jarawan, “Trajectory planning for multiple uavs and hierarchical collision avoidance based on nonlinear kalman filters,” Drones, vol. 7, no. 2, p. 142, 2023. doi: 10.3390/drones7020142
    [34]
    A. Sonny, S. R. Yeduri, and L. R. Cenkeramaddi, “Q-learning-based unmanned aerial vehicle path planning with dynamic obstacle avoidance,” Applied Soft Computing, vol. 147, p. 110773, 2023. doi: 10.1016/j.asoc.2023.110773
    [35]
    C. Yan, C. Wang, X. Xiang, K. H. Low, X. Wang, X. Xu, and L. Shen, “Collision-avoiding flocking with multiple fixed-wing uavs in obstacle-cluttered environments: A task-specific curriculum-based madrl approach,” IEEE Transactions on Neural Networks and Learning Systems, 2023.
    [36]
    C. Hu, Z. Meng, G. Qu, H.-S. Shin, and A. Tsourdos, “Distributed cooperative path planning for tracking ground moving target by multiple fixed-wing uavs via dmpc-gvd in urban environment,” International Journal of Control, Automation and Systems, vol. 19, pp. 823–836, 2021. doi: 10.1007/s12555-019-0625-0
    [37]
    S. Zhao, X. Wang, H. Chen, and Y. Wang, “Cooperative path following control of fixed-wing unmanned aerial vehicles with collision avoidance,” Journal of Intelligent & Robotic Systems, vol. 100, no. 3-4, pp. 1569–1581, 2020.
    [38]
    L. Xu, T. Wang, J. Wang, J. Liu, and C. Sun, “Attention-based policy distillation for uav simultaneous target tracking and obstacle avoidance,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 2, pp. 3768–3781, 2024. doi: 10.1109/TIV.2023.3342174
    [39]
    A. Singletary, K. Klingebiel, J. Bourne, A. Browning, P. Tokumaru, and A. Ames, “Comparative analysis of control barrier functions and artificial potential fields for obstacle avoidance,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 8129–8136.
    [40]
    Q. Yuan and X. Li, “Distributed model predictive formation control for a group of uavs with spatial kinematics and unidirectional data transmissions,” IEEE Transactions on Network Science and Engineering, vol. 10, no. 6, pp. 3209–3222, 2023. doi: 10.1109/TNSE.2023.3252724
    [41]
    B. Li, S. Wen, Z. Yan, G. Wen, and T. Huang, “A survey on the control lyapunov function and control barrier function for nonlinear-affine control systems,” IEEE/CAA Journal of Automatica Sinica, vol. 10, no. 3, pp. 584–602, 2023. doi: 10.1109/JAS.2023.123075
    [42]
    I. Misra, A. Shrivastava, A. Gupta, and M. Hebert, “Cross-stitch networks for multi-task learning,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 3994–4003.
    [43]
    S. Liu, E. Johns, and A. J. Davison, “End-to-end multi-task learning with attention,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 1871–1880.
    [44]
    L. Duong, T. Cohn, S. Bird, and P. Cook, “Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser,” in Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: short papers), 2015, pp. 845–850.
    [45]
    M. Long, Z. Cao, J. Wang, and P. S. Yu, “Learning multiple tasks with multilinear relationship networks,” Advances in neural information processing systems, vol. 30, 2017.
    [46]
    N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. Le, G. Hinton, and J. Dean, “Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,” arXiv preprint arXiv: 1701.06538, 2017.
    [47]
    J. Ma, Z. Zhao, X. Yi, J. Chen, L. Hong, and E. H. Chi, “Modeling task relationships in multi-task learning with multi-gate mixture-of-experts,” in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 2018, pp. 1930–1939.
    [48]
    H. Tang, J. Liu, M. Zhao, and X. Gong, “Progressive layered extraction (ple): A novel multi-task learning (mtl) model for personalized recommendations,” in Proceedings of the 14th ACM Conference on Recommender Systems, 2020, pp. 269–278.
    [49]
    X. Sun, R. Panda, R. Feris, and K. Saenko, “Adashare: Learning what to share for efficient deep multi-task learning,” Advances in Neural Information Processing Systems, vol. 33, pp. 8728–8740, 2020.
    [50]
    S. Lee, S. Behpour, and E. Eaton, “Sharing less is more: Lifelong learning in deep networks with selective layer transfer,” in International Conference on Machine Learning. PMLR, 2021, pp. 6065–6075.
    [51]
    D. Bhattacharjee, T. Zhang, S. Süsstrunk, and M. Salzmann, “Mult: An end-to-end multitask learning transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12 031–12 041.
