A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
    CiteScore: 23.5, Top 2% (Q1)
    Google Scholar h5-index: 77, TOP 5
Citation: J. Jiang, N. Xia, and S. Zhou, “A multi-type feature fusion network based on importance weighting for occluded human pose estimation,” IEEE/CAA J. Autom. Sinica.

A Multi-Type Feature Fusion Network Based on Importance Weighting for Occluded Human Pose Estimation

Funds: This work was supported by the Ministry of Education Industry-University Cooperation and Collaborative Education Project (China) (220603231024713).
  • Human pose estimation is a challenging task in computer vision. Most algorithms perform well in regular scenes but degrade in occlusion scenarios. We therefore propose a multi-type feature fusion network based on importance weighting, which consists of three modules. The first module is a multi-resolution backbone with two feature enhancement sub-modules, which extracts features at different scales and strengthens feature expression. The second module enhances the expressiveness of keypoint features by suppressing obstacle features and compensating for the unique and shared attributes of keypoints and topology. The third module applies importance weighting to the adjacency matrix so that it describes the correlation among nodes, thereby improving feature extraction. We conduct comparative experiments on the keypoint detection datasets Common Objects in Context 2017 (COCO2017), COCO-WholeBody, and CrowdPose, achieving accuracies of 78.9%, 67.1%, and 77.6%, respectively. A series of ablation experiments further demonstrates the performance of our work. Finally, we visualize results in different scenarios to verify the effectiveness of our work.
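The idea of importance weighting on an adjacency matrix can be illustrated with a small, self-contained sketch. This is not the authors' network: the function names, the three-keypoint chain, and the importance scores below are all hypothetical, chosen only to show how re-weighting edges changes which neighbours a node draws on when features are propagated over a keypoint graph.

```python
# Hypothetical illustration only: a weighted-adjacency aggregation step,
# not the network described in the abstract.

def weighted_adjacency(adjacency, importance):
    """Element-wise weight a binary adjacency matrix, then row-normalize."""
    n = len(adjacency)
    weighted = [[adjacency[i][j] * importance[i][j] for j in range(n)]
                for i in range(n)]
    normalized = []
    for row in weighted:
        total = sum(row)
        # Leave all-zero rows untouched to avoid division by zero.
        normalized.append([v / total for v in row] if total > 0 else row[:])
    return normalized


def aggregate(adjacency, features):
    """One propagation step: each node's new feature is a weighted
    average of the features of the nodes it is connected to."""
    dim = len(features[0])
    return [[sum(a * f[k] for a, f in zip(row, features)) for k in range(dim)]
            for row in adjacency]


# Three keypoints in a chain (e.g. shoulder - elbow - wrist), with self-loops.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
# Assumed importance scores; in a trained model these would be learned.
W = [[1.0, 1.0, 0.0],
     [3.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]
# One 2-D feature vector per keypoint; the middle keypoint carries no
# signal of its own, standing in for an occluded joint.
X = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]

A_hat = weighted_adjacency(A, W)
X_new = aggregate(A_hat, X)
```

In this toy setup the middle keypoint's updated feature is dominated by the first keypoint, because that edge was assigned the largest importance score; with uniform scores it would average its neighbours equally.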

     


    Figures(9)  / Tables(8)
