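Graduate student: 曾嘉鴻
Chia-Hong Tseng
Thesis title: Multi-modal Transformer Path Prediction for autonomous vehicle
Advisor: 孫敏德
Min-Te Sun
Oral examination committee:
Degree: Master
Department: 資訊電機學院 - 資訊工程學系
Department of Computer Science & Information Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Number of pages: 49
Chinese keywords: 自動駕駛 (autonomous driving)
Foreign-language keywords: Path Prediction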
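  • Vehicle trajectory prediction is a challenging problem in autonomous driving systems, and it bears directly on the safety of self-driving vehicles on the road. Many researchers have studied this problem in recent years, yet much of the existing work uses neither road information nor the Transformer architecture. Using the data collected by the different sensors on a self-driving vehicle, we propose a trajectory prediction system that predicts a vehicle's upcoming trajectory. To achieve more accurate predictions, our model adopts a modified Transformer architecture. To make better use of road information, we remove roads whose direction differs from the vehicle's direction of travel during data pre-processing; in addition, we merge shorter roads so that the road data fits the model's input. Finally, we run many experiments on the nuScenes dataset to verify that the proposed system is effective.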


    Vehicle path prediction is an essential and challenging problem for the safe operation of autonomous driving systems. Many research works address path prediction; however, most of them do not use lane information and are not based on the Transformer architecture. By utilizing different types of data collected from the sensors on self-driving vehicles, we propose a path prediction system named Multi-modal Transformer Path Prediction (MTPP) that aims to predict the long-term future trajectories of target agents. To achieve more accurate path prediction, our model adopts the Transformer architecture. To better utilize the lane information, lanes that run in the opposite direction to the target agent are filtered out, since the agent is unlikely to take them. In addition, consecutive lane chunks are combined to ensure that the lane input is long enough for path prediction. An extensive evaluation on nuScenes, a real-world trajectory forecasting dataset, shows the efficacy of the proposed system.
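
The two lane pre-processing ideas in the abstract (dropping lanes that run against the target agent, and joining consecutive lane chunks until the lane is long enough) can be illustrated with a minimal sketch. This is not the thesis's implementation: the lane representation (a dictionary of 2D polylines), the `successors` mapping, the 90-degree threshold, the function names, and the assumption that a chunk starts at the point where its predecessor ends are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def heading(polyline: List[Point]) -> float:
    """Overall direction (radians) from the first to the last point."""
    (x0, y0), (x1, y1) = polyline[0], polyline[-1]
    return math.atan2(y1 - y0, x1 - x0)


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def filter_lanes(lanes: Dict[str, List[Point]],
                 agent_heading: float,
                 max_diff: float = math.pi / 2) -> Dict[str, List[Point]]:
    """Keep lanes whose direction is within max_diff of the agent heading.

    The 90-degree default is an assumed threshold, not a value from the thesis.
    """
    return {lane_id: pts for lane_id, pts in lanes.items()
            if angle_diff(heading(pts), agent_heading) <= max_diff}


def polyline_length(pts: List[Point]) -> float:
    """Sum of the segment lengths along a polyline."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def extend_lane(start_id: str,
                lanes: Dict[str, List[Point]],
                successors: Dict[str, List[str]],
                min_length: float) -> List[Point]:
    """Append successor chunks until the polyline reaches min_length.

    Follows the first unvisited successor at each step (a simplification)
    and assumes each chunk starts where the previous one ends.
    """
    pts = list(lanes[start_id])
    current, visited = start_id, {start_id}
    while polyline_length(pts) < min_length:
        candidates = [s for s in successors.get(current, [])
                      if s in lanes and s not in visited]
        if not candidates:
            break  # no further chunk available; return what we have
        current = candidates[0]
        visited.add(current)
        pts.extend(lanes[current][1:])  # skip the shared junction point
    return pts
```

Applying `filter_lanes` first and then `extend_lane` to each surviving lane would yield lane inputs that run roughly in the agent's direction and reach a chosen minimum length. A real pipeline would also need to handle branching successors, which this sketch ignores.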

    1 Introduction
    2 Related Work
        2.1 Model Structure
        2.2 Map Representation
    3 Preliminary
        3.1 Dataset
            3.1.1 nuScenes
        3.2 Convolutional Neural Networks
        3.3 Transformer
            3.3.1 Mechanism inside Transformer
    4 Design
        4.1 Data pre-processing
            4.1.1 Data Smoothing
            4.1.2 Data Augmentation
            4.1.3 Lane Processing
        4.2 Model architecture
            4.2.1 Agent history encoding
            4.2.2 Map information encoding
            4.2.3 Features fusion
            4.2.4 Future lane prediction
            4.2.5 Trajectory generation
            4.2.6 Autoregression vs non-autoregression
            4.2.7 Loss function
        4.3 Discussion
            4.3.1 Autoregression vs Non-Autoregression
            4.3.2 Issue with Lane Processing
    5 Performance
        5.1 Environmental Settings
        5.2 Evaluation Metrics
        5.3 Model Training
        5.4 Experimental Results and Analysis
            5.4.1 Auto-regression vs Non Auto-regression
            5.4.2 MTPP vs Trajectron++
        5.5 Ablation Study
    6 Conclusion

