Instance by Instance: An Iterative Framework for Multi-Instance 3D Registration

Jiaqi Yang; Xinyue Cao; Xiyu Zhang; Yuxin Cheng; Zhaoshuai Qi; Siwen Quan

doi:10.1109/JAS.2024.125058

Volume 12 Issue 6

Jun. 2025

IEEE/CAA Journal of Automatica Sinica



J. Yang, X. Cao, X. Zhang, Y. Cheng, Z. Qi, and S. Quan, “Instance by instance: An iterative framework for multi-instance 3D registration,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 6, pp. 1117–1128, Jun. 2025. doi: 10.1109/JAS.2024.125058


Funds: This work was supported in part by the National Natural Science Foundation of China (62372377).


Author Bio:
Jiaqi Yang received the B.S. and Ph.D. degrees from Huazhong University of Science and Technology in 2014 and 2019, respectively. Sponsored by the China Scholarship Council, he visited the GRASP Laboratory, University of Pennsylvania, USA, from 2017 to 2018. He is currently an Associate Professor with the School of Computer Science, Northwestern Polytechnical University. His research interests include local geometric description, 3D registration, 3D feature matching, and 3D object recognition. He has served as a reviewer for several international journals and conferences, including Pattern Recognition, Computer Vision and Image Understanding, International Journal of Remote Sensing, International Conference on Computer Vision, and International Conference on 3D Vision.

Xinyue Cao received the B.E. degree from Northwestern Polytechnical University in 2023. She is currently working toward the master's degree with the School of Computer Science, Northwestern Polytechnical University. Her main research interest is multi-instance 3D point cloud registration.

Xiyu Zhang received the B.S. degree from Northwestern Polytechnical University in 2022. He is currently working toward the Ph.D. degree with the School of Computer Science, Northwestern Polytechnical University. His research interests include 3D feature matching and 3D point cloud registration.

Yuxin Cheng received the B.S. degree from Northwestern Polytechnical University in 2021. He is currently working toward the Ph.D. degree with the Department of Electronic and Computer Engineering, Chinese University of Hong Kong, Hong Kong, China. His research interests include 3D point clouds, 3D neural representation, and optimization acceleration.

Zhaoshuai Qi received the B.S. degree from Hebei University of Technology in 2013, and the Ph.D. degree from Xi’an Jiaotong University in 2019. He is an Associate Professor with the School of Computer Science, Northwestern Polytechnical University. His research interests include computer vision, point cloud acquisition, and 3D reconstruction.

Siwen Quan received the B.S. degree from Chang’an University in 2015, and the Ph.D. degree from Huazhong University of Science and Technology in 2019. She is currently an Associate Professor with the School of Electronic and Control Engineering, Chang’an University. Her research interests include local geometric shape description, 3D object recognition, and image fusion.
Corresponding author: Siwen Quan, e-mail: siwenquan@chd.edu.cn
Received Date: 2024-06-12
Revised Date: 2024-07-30
Accepted Date: 2024-11-18

Available Online: 2025-02-13

Abstract

Multi-instance registration is a challenging problem in computer vision and robotics, in which multiple instances of an object need to be registered in a standard coordinate system. Pioneering methods follow a non-extensible one-shot framework that prioritizes the registration of simple and isolated instances and often struggles to accurately register challenging or occluded ones. To address these challenges, we propose the first iterative framework for multi-instance 3D registration (MI-3DReg), termed instance-by-instance (IBI). It registers instances successively, starting from the easiest and progressing to more challenging ones, while systematically reducing outliers. Instances that are overlooked in early iterations therefore have a better chance of being registered in subsequent ones. Under the IBI framework, we further propose a sparse-to-dense correspondence-based multi-instance registration method (IBI-S2DC) to enhance the robustness of MI-3DReg. Experiments on both synthetic and real datasets demonstrate the effectiveness of IBI and show that IBI-S2DC achieves new state-of-the-art performance; e.g., our mean registration F1 score is 12.02%/12.35% higher than that of the existing state of the art on the synthetic/real datasets. The source code is available online at https://github.com/caoxy01/IBI.
Keywords:
- 3D registration
- iterative framework
- pose estimation
- point cloud
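
The register-then-remove loop described in the abstract can be illustrated with a toy sketch. This is not the authors' IBI or IBI-S2DC implementation: it assumes a deliberately simplified setting in which each correspondence is a pair of 1D coordinates and each "instance" is a pure shift, so that the single-instance estimator is a simple vote. The names `register_one`, `instance_by_instance`, `tol`, `min_inliers`, and `max_instances` are hypothetical and chosen only for illustration.

```python
def register_one(corrs, tol=0.05):
    """Toy single-instance estimator (assumption: 1D shift model).

    Each correspondence (x, y) proposes the shift y - x; the proposal
    supported by the most correspondences wins.
    """
    best_shift, best_inliers = None, []
    for x, y in corrs:
        shift = y - x
        inliers = [(a, b) for a, b in corrs if abs((b - a) - shift) <= tol]
        if len(inliers) > len(best_inliers):
            best_shift, best_inliers = shift, inliers
    return best_shift, best_inliers


def instance_by_instance(corrs, min_inliers=3, max_instances=10):
    """Register the best-supported instance, drop its correspondences, repeat."""
    remaining = list(corrs)
    found = []
    while remaining and len(found) < max_instances:
        shift, inliers = register_one(remaining)
        if len(inliers) < min_inliers:
            break  # what is left is treated as outliers
        found.append(shift)
        explained = set(inliers)
        # Removing explained correspondences makes weaker instances
        # easier to find in the next iteration.
        remaining = [c for c in remaining if c not in explained]
    return found, remaining


# Two instances (shifts +2 and -5) plus two outlier correspondences.
instance_a = [(float(i), float(i) + 2.0) for i in range(6)]
instance_b = [(float(i), float(i) - 5.0) for i in range(10, 14)]
outliers = [(20.0, 27.5), (21.0, 10.0)]

shifts, leftover = instance_by_instance(instance_a + instance_b + outliers)
print(shifts, leftover)
```

In this toy run the instance with the most support is found first, and the smaller instance is recovered only after the first one's correspondences have been removed. A real pipeline would replace `register_one` with a rigid 6-DoF pose estimator over 3D correspondences.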

