A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation.
Volume 9, Issue 6
Jun. 2022

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, Top 5
Citation: X. Li, H. B. Duan, Y. L. Tian, and F.-Y. Wang, “Exploring image generation for UAV change detection,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 1061–1072, Jun. 2022. doi: 10.1109/JAS.2022.105629

Exploring Image Generation for UAV Change Detection

doi: 10.1109/JAS.2022.105629
Funds: This work was supported in part by the Science and Technology Innovation 2030-Key Project of “New Generation Artificial Intelligence” (2018AAA0102303), the Young Elite Scientists Sponsorship Program of China Association of Science and Technology (YESS20210289), the China Postdoctoral Science Foundation (2020TQ1057, 2020M682823), and the National Natural Science Foundation of China (U20B2071, U1913602, 91948204)
Abstract: Change detection (CD) is becoming indispensable for unmanned aerial vehicles (UAVs), especially in the domains of water landing, rescue, and search. However, even the most advanced models require large amounts of data for training and testing, so sufficient labeled images covering different imaging conditions are needed. Inspired by computer graphics, we present a cloning method to simulate an inland-water scene and collect an auto-labeled simulated dataset. The simulated dataset consists of six challenges designed to test the effects of dynamic background, weather, and noise on change detection models. We then propose an image translation framework that translates simulated images into synthetic images. This framework uses shared parameters (encoder and generator) and 22 × 22 receptive fields (discriminator) to generate realistic synthetic images as model training sets. The experimental results indicate that: 1) different imaging challenges affect the performance of change detection models; and 2) compared with simulated images, synthetic images can effectively improve the accuracy of supervised models.
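The abstract's 22 × 22 receptive field for the discriminator is the defining trait of a PatchGAN-style design, where each output unit scores the realism of one overlapping input patch rather than the whole image. As a rough illustration only (the authors' exact architecture is not reproduced here; the class name, channel widths, and normalization below are assumptions), a stack of two stride-2 and one stride-1 4 × 4 convolutions yields exactly this receptive field: working backward from one output unit with r = r·s + (k − s) per layer gives 4, then 10, then 22 pixels at the input.

import torch
import torch.nn as nn

class PatchDiscriminator22(nn.Module):
    """Sketch of a PatchGAN-style discriminator in which every output
    unit sees a 22 x 22 input patch (receptive field, back to front:
    4 -> 10 -> 22). Channel widths and InstanceNorm are illustrative
    assumptions, not the paper's implementation."""

    def __init__(self, in_channels: int = 3, base: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            # 4x4 conv, stride 2: widens the receptive field from 10 to 22
            nn.Conv2d(in_channels, base, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            # 4x4 conv, stride 2: widens the receptive field from 4 to 10
            nn.Conv2d(base, base * 2, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # 4x4 conv, stride 1: 1-channel map of per-patch realism scores
            nn.Conv2d(base * 2, 1, kernel_size=4, stride=1, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    d = PatchDiscriminator22()
    scores = d(torch.randn(1, 3, 256, 256))
    print(scores.shape)  # torch.Size([1, 1, 63, 63]): one score per patch

Compared with a whole-image discriminator, such a small patch-level critic constrains only local texture statistics, which is one common motivation for patch discriminators when the goal is to make simulated imagery look photo-realistic without altering scene layout.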

     



    Highlights

    • In this work, we present a typical inland-water scenario and generate simulated multi-challenge sequences for testing the visual intelligence of UAVs.
    • In addition, an image translation network is proposed to synthesize photo-realistic images. All generated datasets are publicly available online, which may benefit the change detection community.
    • Furthermore, we use the synthetic datasets and corresponding real datasets to conduct change detection experiments. The experimental results demonstrate that synthetic datasets can effectively improve deep learning-based detectors.
