
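Graduate student: 李蕓竹 (Yun-Chu Lee)
Thesis title: Multi-modal Transformer Path Prediction Version 2 for Autonomous Vehicle
Advisor: 孫敏德 (Min-Te Sun)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Executive Master of Computer Science & Information Engineering
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 55
Keywords (Chinese): 軌跡預測 (trajectory prediction)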
    In recent years, automated driving technology has advanced markedly, particularly in path prediction research. Trajectory prediction is a key technical challenge in automated driving systems: the future driving path must be forecast accurately from the environmental data the vehicle currently senses. This study introduces a trajectory prediction system named Multi-modal Transformer Path Prediction Version 2 (MTPPV2). Building on its predecessor, MTPP, MTPPV2 incorporates a multi-head self-attention mechanism into the Transformer architecture to better capture complex spatial and temporal dependencies. Moreover, to improve prediction accuracy, we use linear interpolation to fill in missing frames in the nuScenes dataset, so that the processed data remain continuous enough for accurate trajectory prediction. Finally, to speed up model training, the system is redesigned to support multi-GPU training. Experimental results on the nuScenes dataset show that, compared to MTPP, MTPPV2 improves the average displacement error (ADE) and final displacement error (FDE) of long-term trajectory prediction by 50.5% and 52.6%, respectively.
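
    As a rough illustration of two ideas named in the abstract, the Python sketch below fills missing frames by linear interpolation and computes ADE and FDE for a single predicted trajectory. It is not the thesis code: the function names `fill_missing_frames` and `ade_fde`, the use of `None` to mark a missing frame, and the choice to repeat the nearest observation at the ends of a track are all assumptions made for this example.

```python
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_missing_frames(track: List[Optional[Point]]) -> List[Point]:
    """Replace None entries by linear interpolation between observed frames.

    Assumption: gaps at the start or end of the track have a neighbour on
    only one side, so they are filled by repeating that nearest observation.
    """
    known = [i for i, p in enumerate(track) if p is not None]
    if not known:
        raise ValueError("track has no observed frames")

    filled: List[Point] = []
    for i, point in enumerate(track):
        if point is not None:
            filled.append(point)
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled.append(track[nxt])
        elif nxt is None:
            filled.append(track[prev])
        else:
            # Fractional position of frame i between the two observed frames.
            w = (i - prev) / (nxt - prev)
            (x0, y0), (x1, y1) = track[prev], track[nxt]
            filled.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return filled


def ade_fde(pred: List[Point], truth: List[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory.

    ADE is the mean Euclidean error over all time steps;
    FDE is the Euclidean error at the last time step.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    errors = [hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(errors) / len(errors), errors[-1]


if __name__ == "__main__":
    observed = [(0.0, 0.0), None, None, (3.0, 6.0)]
    print(fill_missing_frames(observed))
    print(ade_fde([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)]))
```

    Lower ADE and FDE values indicate better predictions, so an "improvement" in these metrics corresponds to a reduction in error.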

    1 Introduction
    2 Related Work
        2.1 Trajectory Prediction
            2.1.1 Rule-based Models
            2.1.2 Bayesian-based Models
            2.1.3 Learning-based Models
        2.2 Map Representation
    3 Preliminary
        3.1 Dataset
            3.1.1 nuScenes
            3.1.2 Argoverse
            3.1.3 Comparison between nuScenes and Argoverse
        3.2 Attention Mechanism
        3.3 Transformer Model
    4 Design
        4.1 Research Motivation
        4.2 Problem Definition
        4.3 Research Challenges
        4.4 System Overview
        4.5 Data Pre-processing
            4.5.1 Data Smoothing
            4.5.2 Data Augmentation
            4.5.3 Data Upsampling
            4.5.4 Data Imputation
            4.5.5 Lane Processing
        4.6 Model Architecture
            4.6.1 History Encoder
            4.6.2 Map and Lane Encoder
            4.6.3 Fusion
            4.6.4 Trajectory Prediction and Generation
            4.6.5 Loss Function
            4.6.6 Multi-GPU Training
    5 Performance
        5.1 Experimental Environment
        5.2 Evaluation Metrics
        5.3 Model Training Configuration
        5.4 Experimental Results and Comparison
            5.4.1 The Results Without Map Information
            5.4.2 The Results Incorporating Map Information
            5.4.3 The Results Incorporating Lane Information
    6 Conclusion

