
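Graduate student: 吳柏賢 (Po-Hsien Wu)
Thesis title: Real-time Embedded Human Tracking System in Zynq SoC (基於Zynq SoC的實時嵌入式行人追蹤系统)
Advisor: 蔡宗漢 (Tsung-Han Tsai)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 53
Keywords (translated from Chinese): human tracking, object detection, deep learning
Keywords (English): human following, deep learning, object detection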
  • Human-following robots have long been a popular application. With the recent rise of deep learning and of the hardware it requires, deep-learning-based tracking algorithms are now widely used in following robots. A growing number of image-based tracking methods have been proposed, many of them built on deep learning, and most of these depend on powerful computing resources such as GPU servers.
    This thesis addresses the computational complexity of human tracking and proposes a more efficient tracking method that combines a single-object tracker (KCF), a human detection module (YOLO v3), and a similarity comparison module to resolve the conflict between computational speed and accuracy. To keep the system flexible enough to follow future developments in neural networks, we adopt a HW/SW co-design on a Zynq SoC: the PL (Programmable Logic) side communicates with the PS (Processing System) side over the AXI bus protocol, the PS side handles non-neural-network computation and data transfer, and the PL side handles all neural-network computation. In addition, we use the Vitis-AI accelerator framework and its Deep Learning Processor Unit (DPU) on the Zynq UltraScale+ MPSoC ZCU104 to accelerate the YOLO v3 human detection module. With the single-object tracker added, the system processes frames 1.27 times faster, and the complete system runs at 11.5 FPS. Compared with the same system on an Intel Core i7-7700K CPU at 4.2 GHz, the YOLO v3 human detection module on the ZCU104 runs 1.53 times faster while consuming 87.1% less power. On the ZCU104 it reaches 409 GOPS at 15.57 W, or 0.29 GOPS per DSP.
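The abstract does not say how the tracker, the detector, and the similarity module are scheduled. The sketch below shows one plausible arrangement, which is an assumption and not the thesis's confirmed design: the expensive detector runs only every few frames or after the tracker loses the target, the similarity module picks the detection that best matches the target's appearance, and the cheap tracker handles the frames in between. All names here (`Detection`, `histogram_similarity`, `follow_target`, `detect_every`, `min_similarity`) are hypothetical, and the detector and tracker are passed in as plain callables rather than real YOLO v3 or KCF bindings.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass
class Detection:
    """One person reported by the detector (stand-in for a YOLO v3 output)."""
    box: Box
    histogram: Sequence[float]  # normalized appearance histogram (assumed feature)


def histogram_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Histogram intersection; 1.0 means identical normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def follow_target(
    frames: Iterable,
    detect: Callable[[object], List[Detection]],
    track: Callable[[object, Box], Optional[Box]],
    target_hist: Sequence[float],
    detect_every: int = 5,        # assumed re-detection interval
    min_similarity: float = 0.5,  # assumed acceptance threshold
) -> List[Optional[Box]]:
    """Return one target box per frame, or None when the target is not found."""
    box: Optional[Box] = None
    boxes: List[Optional[Box]] = []
    for index, frame in enumerate(frames):
        if box is None or index % detect_every == 0:
            # Slow path: run the detector, then keep the best-matching person.
            best = max(
                detect(frame),
                key=lambda d: histogram_similarity(d.histogram, target_hist),
                default=None,
            )
            if best is not None and (
                histogram_similarity(best.histogram, target_hist) >= min_similarity
            ):
                box = best.box
            else:
                box = None
        else:
            # Fast path: the tracker updates the previous box (None if lost).
            box = track(frame, box)
        boxes.append(box)
    return boxes


if __name__ == "__main__":
    # Toy stand-ins: frames are integers, the tracker just shifts the box.
    def fake_detect(frame):
        return [
            Detection((10 + frame, 20, 30, 60), [0.7, 0.2, 0.1]),
            Detection((100, 20, 30, 60), [0.1, 0.2, 0.7]),
        ]

    def fake_track(frame, previous):
        x, y, w, h = previous
        return (x + 1, y, w, h)

    print(follow_target(range(4), fake_detect, fake_track, [0.6, 0.3, 0.1], detect_every=3))
```

Under this arrangement, a larger `detect_every` gives the tracker more of the frames and raises throughput, at the cost of slower recovery when the tracker drifts. This is the speed versus accuracy trade-off the abstract refers to.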

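    Abstract (Chinese)
    Abstract (English)
    1. Introduction
    1.1. Research Background and Motivation
    1.2. Thesis Organization
    2. Literature Review
    2.1. Human Following Systems
    2.2. Single-Object Tracking Algorithms
    2.3. Object Detection Neural Networks
    2.4. Deep Learning Accelerators
    3. Hardware Architecture Design
    3.1. Efficient Human Following System
    3.2. Hardware Acceleration Control Module
    3.3. HW/SW Co-design
    4. Software and Hardware Implementation Results
    4.1. Analysis of the Human Tracking Algorithm
    4.2. Hardware Acceleration Module
    4.3. Overall System Performance
    5. Conclusion
    References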

