
Graduate Student: 沈桓慶 (Huan-Ching Shen)
Thesis Title: 基於聯合嵌入之雙手配對與追蹤系統 (A Hands Pairing and Tracking System base on Associative Embedding)
Advisor: 范國清 (Kuo-Chin Fan)
Oral Examination Committee:
Degree: Master (碩士)
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of Publication: 2021
Academic Year of Graduation: 109 (ROC calendar, 2020-2021)
Language: Chinese
Pages: 45
Keywords (Chinese): 深度學習、偵測系統、手勢追蹤、物件偵測、類神經網路、聯合嵌入
Keywords (English): deep learning, detection system, gesture tracking, object detection, neural network, associative embedding
Abstract:
Hand tracking aims to predict the trajectories of multiple hands in an image sequence, which is important for applications such as in-air handwriting, sign language recognition, and gesture recognition. Pairing the two hands of each person further enables these applications to realize more complex functionality.
This thesis proposes a method based on YOLOv3 and associative embedding: a single-stage neural network model, with accompanying algorithms, that integrates multi-object tracking and keypoint detection to achieve real-time multi-person hand tracking.
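The grouping step the abstract describes can be sketched in code. Assuming each detected hand box carries a scalar embedding tag predicted by the network (hands of the same person are trained to have nearby tags), pairing reduces to greedy matching on tag distance. The function name, detection format, and threshold below are illustrative assumptions, not the thesis's actual implementation:

```python
import itertools

def pair_hands(detections, max_tag_distance=1.0):
    """Greedily pair hand detections whose embedding tags are closest.

    Each detection is a (box, tag) tuple, where `tag` is the scalar
    associative embedding predicted alongside the box.  Hands belonging
    to the same person are trained to carry similar tags, so sorting all
    candidate pairs by tag distance and greedily taking the closest
    unused pair groups them by person.  (Illustrative sketch only.)
    """
    # Enumerate all candidate pairs, ordered by tag distance.
    candidates = sorted(
        (abs(a[1] - b[1]), i, j)
        for (i, a), (j, b) in itertools.combinations(enumerate(detections), 2)
    )
    used, pairs = set(), []
    for dist, i, j in candidates:
        if dist > max_tag_distance or i in used or j in used:
            continue
        used.update((i, j))
        pairs.append((i, j))
    return pairs
```

A detection whose tag lies farther than `max_tag_distance` from every other tag is left unpaired, which covers frames where only one of a person's hands is visible.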

Table of Contents
Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
    1.1 Research Motivation
    1.2 Research Objectives and Methods
    1.3 Thesis Organization
Chapter 2: Related Work and Background
    2.1 Object Detection
    2.2 Non-Maximum Suppression
    2.3 Associative Embedding
Chapter 3: Research Framework
    3.1 Research Workflow
    3.2 Model Architecture
    3.3 Two-Hand Detection
    3.4 Hand Pairing and Tracking
Chapter 4: Experimental Results
    4.1 Experimental Environment
    4.2 Test Results
    4.3 Experimental Data
Chapter 5: Conclusion
References
Appendix 1
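The associative embedding covered in Section 2.3 is trained with a grouping loss (in the style of Newell et al.) that pulls the tags of one person's two hands toward their mean and pushes different persons' mean tags apart. A sketch of that loss for the two-hand case, assuming N persons, scalar tags h, and a push-width parameter sigma (the exact form used in the thesis may differ):

```latex
\bar{h}_n = \tfrac{1}{2}\left(h(x_{n,\mathrm{left}}) + h(x_{n,\mathrm{right}})\right)

L_{\mathrm{pull}} = \frac{1}{N}\sum_{n=1}^{N}\;\sum_{k \in \{\mathrm{left},\,\mathrm{right}\}} \left(h(x_{n,k}) - \bar{h}_n\right)^2

L_{\mathrm{push}} = \frac{1}{N^2}\sum_{n=1}^{N}\sum_{n'=1}^{N} \exp\!\left(-\frac{1}{2\sigma^2}\left(\bar{h}_n - \bar{h}_{n'}\right)^2\right)
```

Minimizing L_pull + L_push makes tag distance a reliable pairing signal at inference time: same-person tags collapse together while different-person tags separate.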

