A journal of the IEEE and the CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 7, Issue 5
Sep. 2020

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Xuesong Li, Yating Liu, Kunfeng Wang and Fei-Yue Wang, "A Recurrent Attention and Interaction Model for Pedestrian Trajectory Prediction," IEEE/CAA J. Autom. Sinica, vol. 7, no. 5, pp. 1361-1370, Sept. 2020. doi: 10.1109/JAS.2020.1003300

A Recurrent Attention and Interaction Model for Pedestrian Trajectory Prediction

doi: 10.1109/JAS.2020.1003300
Funds:  This work was supported by the National Natural Science Foundation of China (U1811463) and the Fundamental Research Funds for the Central Universities (12060093192)
Abstract
  • The movement of pedestrians involves temporal continuity, spatial interactivity, and random diversity, which makes pedestrian trajectory prediction rather challenging. Most existing trajectory prediction methods tend to focus on just one of these challenges, ignoring the temporal information of the trajectory and making too many assumptions. In this paper, we propose a recurrent attention and interaction (RAI) model to predict pedestrian trajectories. The RAI model consists of a temporal attention module, a spatial pooling module, and a randomness modeling module. The temporal attention module assigns different weights to the input sequence of a target and reduces the speed deviation among different pedestrians. The spatial pooling module models not only the social information of neighbors in historical frames, but also the intention of neighbors at the current time. The randomness modeling module models the uncertainty and diversity of trajectories by introducing random noise. We conduct extensive experiments on several public datasets. The results demonstrate that our method outperforms several state-of-the-art methods.

     

  • *Xuesong Li and Yating Liu contributed equally to this work.
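
To make the three components described in the abstract concrete, below is a minimal, illustrative PyTorch sketch: a recurrent encoder with temporal attention over the target's own history, a crude max-pooling of neighbor features as a stand-in for the spatial pooling module, and injected Gaussian noise for the randomness modeling module. All module names, dimensions, and the pooling scheme here are assumptions for illustration only; this is not the authors' implementation.

# Illustrative sketch only; not the RAI model from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAISketch(nn.Module):
    def __init__(self, embed_dim=32, hidden_dim=64, noise_dim=8, pred_len=12):
        super().__init__()
        self.embed = nn.Linear(2, embed_dim)            # (x, y) -> embedding
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)            # temporal attention scores
        self.pool = nn.Linear(hidden_dim, hidden_dim)   # neighbor feature projection
        self.decoder = nn.LSTM(hidden_dim * 2 + noise_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)
        self.noise_dim = noise_dim
        self.pred_len = pred_len

    def forward(self, obs_traj):
        # obs_traj: (num_peds, obs_len, 2) observed positions of all pedestrians
        h_seq, _ = self.encoder(self.embed(obs_traj))            # (N, T, H)
        # Temporal attention: weight each observed step of the target's own history.
        w = F.softmax(self.attn(h_seq), dim=1)                   # (N, T, 1)
        target_feat = (w * h_seq).sum(dim=1)                     # (N, H)
        # Spatial pooling (stand-in): max-pool the last-step features of all pedestrians.
        social_feat = self.pool(h_seq[:, -1]).max(dim=0, keepdim=True)[0]
        social_feat = social_feat.expand_as(target_feat)         # (N, H)
        # Randomness: concatenate Gaussian noise so repeated calls give diverse futures.
        z = torch.randn(target_feat.size(0), self.noise_dim)
        dec_in = torch.cat([target_feat, social_feat, z], dim=-1)
        dec_in = dec_in.unsqueeze(1).repeat(1, self.pred_len, 1)
        dec_out, _ = self.decoder(dec_in)                        # (N, pred_len, H)
        return self.out(dec_out)                                 # (N, pred_len, 2)

if __name__ == "__main__":
    model = RAISketch()
    obs = torch.randn(5, 8, 2)      # 5 pedestrians, 8 observed frames
    print(model(obs).shape)         # torch.Size([5, 12, 2])

Because of the noise term, calling the model repeatedly on the same observed trajectories yields different predictions, which is how this sketch mimics the trajectory diversity that the randomness modeling module is meant to capture.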
  • [1]
    F. Large, D. Vasquez, T. Fraichard, and C. Laugier, “Avoiding cars and pedestrians using velocity obstacles and motion prediction,” in Proc. IEEE Intelligent Vehicles Symp., Parma, Italy: IEEE, Jun. 2004. pp. 375−379.
    [2]
    S. Thompson, T. Horiuchi, and S. Kagami, “A probabilistic model of human motion and navigation intent for mobile robot path planning,” in Proc. 4th IEEE Int. Conf. Autonomous Robots and Agents, Wellington, New Zealand: IEEE, 2009, pp. 663−668.
    [3]
    D. Helbing, I. Farkas, and T. Vicsek, “Simulating dynamical features of escape panic,” Nature, vol. 407, no. 6803, pp. 487–490, Sep. 2000. doi: 10.1038/35035023
    [4]
    T. Fernando, S. Denman, S. Sridharan, and C. Fookes, “Soft + Hardwired attention: An LSTM framework for human trajectory prediction and abnormal event detection,” Neural Netw., vol. 108, pp. 466–478, Dec. 2018. doi: 10.1016/j.neunet.2018.09.002
    [5]
    D. Helbing and P. Molnár, “Social force model for pedestrian dynamics,” Phys. Rev. E, vol. 51, no. 5, pp. 4282–4286, May 1995. doi: 10.1103/PhysRevE.51.4282
    [6]
    B. T. Morris and M. M. Trivedi, “Trajectory learning for activity understanding: Unsupervised, multilevel, and long-term adaptive approach,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 11, pp. 2287–2301, Nov. 2011. doi: 10.1109/TPAMI.2011.64
    [7]
    A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, F. F. Li, and S. Savarese, “Social LSTM: Human trajectory prediction in crowded spaces,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 961−971.
    [8]
    N. Bisagno, B. Zhang, and N. Conci, “Group LSTM: Group trajectory prediction in crowded scenarios,” in Proc. European Conf. Computer Vision, Munich, Germany, 2018, pp. 213−225.
    [9]
    A. Gupta, J. Johnson, F. F. Li, S. Savarese, and A. Alahi, “Social GAN: Socially acceptable trajectories with generative adversarial networks,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 2255−2264.
    [10]
    A. Sadeghian, V. Kosaraju, A. Sadeghian, N. Hirose, H. Rezatofighi, and S. Savarese, “SoPhie: An attentive GAN for predicting paths compliant to social and physical constraints,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 1349−1358.
    [11]
    J. Amirian, J. B. Hayet, and J. Pettré, “Social ways: Learning multi-modal distributions of pedestrian trajectories with GANs,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 2964−2972.
    [12]
    J. W. Liang, L. Jiang, J. C. Niebles, A. G. Hauptmann, and F. F. Li, “Peeking into the future: Predicting future person activities and locations in videos,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 5718−5727.
    [13]
    M. Huynh and G. Alaghband, “Trajectory prediction by coupling scene-LSTM with human movement LSTM,” in Proc. Int. Symp. Visual Computing, Lake Tahoe, USA: Springer, Cham, 2019, pp. 244−259.
    [14]
    H. Minoura, T. Hirakawa, T. Yamashita, and H. Fujiyoshi, “Path predictions using object attributes and semantic environment,” in Proc. 14th Int. Conf. Computer Vision Theory and Applications, Prague, Czech Republic, 2019, pp. 19−26.
    [15]
    S. Pellegrini, A. Ess, K. Schindler, and L. van Gool, “You’ll never walk alone: Modeling social behavior for multi-target tracking,” in Proc. 12th IEEE Int. Conf. Computer Vision, Kyoto, Japan, 2009, pp. 261−268.
    [16]
    A. Lerner, Y. Chrysanthou, and D. Lischinski, “Crowds by example,” Comput. Graph. Forum, vol. 26, no. 3, pp. 655–664, Sep. 2007. doi: 10.1111/j.1467-8659.2007.01089.x
    [17]
    J. Jo, S. Hwang, S. Lee, and Y. Lee, “Multi-mode LSTM network for energy-efficient speech recognition,” in Proc. Int. SoC Design Conf., Daegu, Korea (South), 2018, pp. 133−134.
    [18]
    R. M. Li, C. Y. Jiang, F. H. Zhu, and X. L. Chen, “Traffic flow data forecasting based on interval type-2 fuzzy sets theory,” IEEE/CAA J. Autom. Sinica, vol. 3, no. 2, pp. 141–148, Apr. 2016. doi: 10.1109/JAS.2016.7451101
    [19]
    R. Achkar, F. Elias-Sleiman, H. Ezzidine, and N. Haidar, “Comparison of BPA-MLP and LSTM-RNN for stocks prediction,” in Proc. 6th Int. Symp. Computational and Business Intelligence, Basel, Switzerland, 2018, pp. 48−51.
    [20]
    F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” in Proc. 9th Int. Conf. Artificial Neural Networks, Edinburgh, UK, 1999, pp. 850−855.
    [21]
    J. Chung, C. Gulcehre, K. H. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv: 1412.3555, 2014.
    [22]
    P. Zhang, W. L. Ouyang, P. F. Zhang, J. R. Xue, and N. N. Zheng, “SR-LSTM: State refinement for LSTM towards pedestrian trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 12077−12086.
    [23]
    Y. Y. Xu, Z. X. Piao, and S. H. Gao, “Encoding crowd interaction with deep neural network for pedestrian trajectory prediction,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 5275−5284.
    [24]
    X. H. Wang and H. B. Duan, “Hierarchical visual attention model for saliency detection inspired by avian visual pathways,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 2, pp. 540–552, Mar. 2019. doi: 10.1109/JAS.2017.7510664
    [25]
    F. Zheng, C. Deng, X. Sun, X. Y. Jiang, X. W. Guo, Z. Q. Yu, F. Y. Huang, and R. R. Ji, “Pyramidal person Re-IDentification via multi-loss dynamic training,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 8506−8514.
    [26]
    P. Chen, X. Y. Xu, and C. Deng, “Deep view-aware metric learning for person re-identification,” in Proc. 27th Int. Joint Conf. Artificial Intelligence, Stockholm, Sweden, 2018, pp. 620−626.
    [27]
    C. H. Shan, J. B. Zhang, Y. J. Wang, and L. Xie, “Attention-based end-to-end speech recognition on voice search,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Calgary, Canada, 2018, pp. 4764−4768.
    [28]
    P. Zhou, W. Shi, J. Tian, Z. Y. Qi, B. C. Li, H. W. Hao, and B. Xu, “Attention-based bidirectional long short-term memory networks for relation classification,” in Proc. 54th Annu. Meeting of the Association for Computational Linguistics, Berlin, Germany, 2016, pp. 207−212.
    [29]
    A. Al-Molegi, M. Jabreel, and A. Martínez-Ballesté, “Move, attend and predict: An attention-based neural model for people’s movement prediction,” Pattern Recognit. Lett., vol. 112, pp. 34–40, Sep. 2018. doi: 10.1016/j.patrec.2018.05.015
    [30]
    S. Haddad, M. Q. Wu, H. Wei, and S. K. Lam, “Situation-aware pedestrian trajectory prediction with spatio-temporal attention model,” in Proc. 24th Computer Vision Winter Workshop, Stift Vorau, Austria, 2019.
    [31]
    A. Vemula, K. Muelling, and J. Oh, “Social attention: Modeling attention in human crowds,” in Proc. IEEE Int. Conf. Robotics and Autom., Brisbane, Australia, 2018, pp. 4601−4607.
    [32]
    Y. L. Zhu, D. H. Qian, D. C. Ren, and H. X. Xia, “StarNet: Pedestrian trajectory prediction using deep neural network in star topology,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Macau, China, 2019, pp. 8075−8080.
    [33]
    S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. 32nd Int. Conf. Machine Learning, Lille, France, 2015.




    Highlights

    • Spatio-temporal features are mined to model the attention, interaction, and randomness of motion.
    • The temporal attention module assigns different weights to the target's input sequence in the time domain (a generic form is given below).
    • The spatial pooling module models the social behaviors of neighbors at the current and historical moments.
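
As a generic illustration of the temporal weighting mentioned in the highlights, softmax attention over an encoded history $h_1, \ldots, h_T$ can be written as follows; the exact scoring function used by RAI is not reproduced on this page, so this particular additive form is an assumption.

    % Generic softmax temporal attention (illustrative; not necessarily the paper's scoring function)
    \begin{aligned}
    e_t &= \mathbf{v}^{\top}\tanh(\mathbf{W}\,h_t + \mathbf{b}), &&\text{score for time step } t\\
    \alpha_t &= \frac{\exp(e_t)}{\sum_{k=1}^{T}\exp(e_k)}, &&\text{normalized attention weight}\\
    \mathbf{c} &= \sum_{t=1}^{T}\alpha_t\,h_t, &&\text{attention-weighted summary of the history}
    \end{aligned}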
