A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 7, Issue 4
July 2020

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Luping Wang and Hui Wei, "Avoiding Non-Manhattan Obstacles Based on Projection of Spatial Corners in Indoor Environment," IEEE/CAA J. Autom. Sinica, vol. 7, no. 4, pp. 1190-1200, July 2020. doi: 10.1109/JAS.2020.1003117

Avoiding Non-Manhattan Obstacles Based on Projection of Spatial Corners in Indoor Environment

doi: 10.1109/JAS.2020.1003117
Funds: This work was supported by the National Natural Science Foundation of China (61771146, 61375122), the National Thirteenth Five-Year Plan for Science and Technology (2017YFC1703303), and in part by the Shanghai Science and Technology Development Funds (13dz2260200, 13511504300).
  • Monocular vision-based navigation is an essential capability for a home mobile robot. However, owing to diverse disturbances, helping robots avoid obstacles, especially non-Manhattan obstacles, remains a significant challenge. Indoor environments contain many spatial right-angle corners whose two-dimensional projections have special geometric configurations. These projections, each consisting of three lines, make it possible to estimate the corners' position and orientation in the 3D scene. In this paper, we present a method that enables a home robot to avoid non-Manhattan obstacles in indoor environments using a monocular camera. The approach first detects non-Manhattan obstacles; then, by analyzing geometric features and constraints, it estimates the posture difference between the robot's orientation and that of each obstacle. Finally, guided by the convergence of this posture difference, the robot adjusts its orientation to match the pose of the detected obstacle, allowing it to avoid the obstacle on its own. Because it rests on geometric inference, the proposed approach requires no prior training and no knowledge of the camera's intrinsic parameters, making it practical for robot navigation; it is also robust to calibration errors and image noise. We compared the corners of estimated non-Manhattan obstacles against ground truth, and evaluated the convergence of the difference between the robot's orientation and the posture of the obstacles. The experimental results showed that our method is capable of avoiding non-Manhattan obstacles, meeting the requirements of indoor robot navigation.
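The full geometric derivation appears in the paper itself; purely as an illustration of the control loop the abstract describes, the following minimal Python sketch shows how a robot might steer until the posture difference converges. The helpers `detect_corner_projections` and `estimate_posture_difference`, along with the gain and tolerance values, are hypothetical stand-ins for the paper's corner-detection and geometric-inference steps, not the authors' implementation.

```python
import math

# Hypothetical stand-ins for the paper's vision pipeline (not the
# authors' code): detect_corner_projections(image) would return the
# three-line projections of spatial right-angle corners found in the
# image, and estimate_posture_difference(projections) would return the
# angular difference (radians) between the robot's heading and the
# detected non-Manhattan obstacle's orientation.

TOLERANCE = math.radians(2.0)  # assumed convergence threshold
GAIN = 0.5                     # assumed proportional steering gain

def align_with_obstacle(robot, camera, max_steps=100):
    """Turn the robot until its orientation matches the obstacle pose.

    Mirrors the abstract's idea: repeatedly estimate the posture
    difference from corner projections and steer until it converges.
    """
    for _ in range(max_steps):
        projections = detect_corner_projections(camera.capture())
        if not projections:
            return False            # no non-Manhattan obstacle in view
        error = estimate_posture_difference(projections)
        if abs(error) < TOLERANCE:  # posture difference has converged
            return True
        robot.turn(GAIN * error)    # proportional heading correction
    return False                    # did not converge within max_steps
```

Once the posture difference falls below the tolerance, the robot's heading is aligned with the detected obstacle's pose, which is the condition the paper's method uses before the robot moves past the obstacle.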





    Highlights

    • A method is presented for avoiding non-Manhattan obstacles in indoor environments using a monocular camera.
    • The method copes with non-Manhattan obstacles without prior training, making it practical and efficient for a navigating robot.
    • The approach is robust to changes in illumination and color in 3D scenes, and requires neither the camera's intrinsic parameters nor the relation between the camera and the world.
