
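Graduate student: Yih-Ru Tsai (蔡易儒)
Thesis title: Design and Implementation of a 3D Hand Gesture Architecture System under Complicated Environment (應用於複雜背景之三維手勢架構設計)
Advisor: Tsung-Han Tsai (蔡宗漢)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2017
Academic year of graduation: 105
Language: Chinese
Number of pages: 67
Keywords (translated from Chinese): hand gesture recognition, hardware architecture, 3D depth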
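  • In this era of exploding digital information, more and more vendors are developing products that improve the quality and convenience of daily life. In recent years, advances in gesture control have gradually moved people from traditional interfaces such as the keyboard and mouse toward remote gesture control, which better matches intuitive operation. However, hand gesture recognition based on visual images still has many challenges to overcome. This thesis proposes a hardware architecture for a 3D hand gesture recognition system. The system uses low-cost dual cameras to compute depth information; it not only provides depth for the entire image but can also segment the object of interest from a harsh, complicated background for the system to recognize.
    Most current hand gesture detection research uses skin color or motion magnitude as the preprocessing step, but skin color and motion information alone cannot keep a system working properly in a complicated environment. This thesis proposes an adaptive skin-color depth filter that effectively separates the part the system is interested in, the hand. Object labeling then removes the remaining noise and computes the coordinates of the hand. Finally, the depth and coordinate information are used to complete the overall gesture recognition. The system includes one static gesture and two dynamic gestures, and its FPGA implementation gives users a more tangible experience of human-computer interaction.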


    In this era of exploding digital information, more and more products are being introduced to improve the quality and convenience of daily life. As gesture control systems have developed, traditional interfaces such as the keyboard and mouse are gradually giving way to remote gesture control, which is more intuitive for users. However, vision-based hand gesture recognition still poses many challenges. This thesis proposes a hardware architecture that uses dual cameras to construct a depth map and segment the object of interest in complicated environments, so that dynamic hand gestures can be recognized.
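    The step from a dual-camera image pair to depth can be sketched as follows. This is a minimal illustration, not the matching method used in the thesis: it assumes rectified scanlines, a sum-of-absolute-differences (SAD) window cost, and the pinhole relation depth = focal length × baseline / disparity. The function names, window size, focal length, and baseline are all hypothetical.

```python
def sad_disparity(left_row, right_row, x, half_window=1, max_disparity=4):
    """Disparity (pixels) at column x of one rectified scanline pair.

    Assumption: plain SAD block matching, chosen only for illustration.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        cost = 0
        for k in range(-half_window, half_window + 1):
            xl = x + k          # column in the left image
            xr = xl - d         # candidate matching column in the right image
            if not (0 <= xl < len(left_row) and 0 <= xr < len(right_row)):
                cost = float("inf")   # window leaves the image: reject this d
                break
            cost += abs(left_row[xl] - right_row[xr])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d; zero disparity maps to infinity."""
    if disparity <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity


# Toy scanlines: the bright feature sits 2 pixels further left in the right view.
left = [10, 10, 10, 80, 90, 80, 10, 10, 10, 10]
right = [10, 80, 90, 80, 10, 10, 10, 10, 10, 10]
d = sad_disparity(left, right, x=4)
z = depth_from_disparity(d, focal_px=500.0, baseline_m=0.06)  # hypothetical camera
```

    Nearer objects produce larger disparities, which is why a disparity threshold can act as a foreground filter.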
    Most hand detection methods adopt a skin filter or a motion filter as the preprocessing step. However, these filters alone still cannot segment the object of interest correctly in some complicated environments. The proposed system adopts an adaptive depth filter that separates the foreground in order to segment the object of interest. We also propose dynamic gesture recognition that uses depth and coordinate information. The system supports one static gesture and two dynamic gestures and is implemented on an FPGA, which gives users a better understanding of HCI systems.
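    The idea of combining a skin filter with a depth filter and then labeling objects can be sketched as follows. This is a sketch under stated assumptions, not the thesis's algorithm: the adaptive rule here (keep skin pixels whose disparity lies within a margin of the nearest skin pixel) and the 4-connected flood-fill labeling are stand-ins chosen for illustration, and all names and parameters are hypothetical.

```python
from collections import deque


def skin_depth_mask(skin, disparity, margin=2):
    """Keep skin pixels whose disparity is within `margin` of the nearest one.

    Assumption: the hand is the skin-colored object closest to the cameras,
    so it has the largest disparity among skin pixels.
    """
    skin_disp = [d for srow, drow in zip(skin, disparity)
                 for s, d in zip(srow, drow) if s]
    if not skin_disp:
        return [[False] * len(row) for row in skin]
    nearest = max(skin_disp)
    return [[bool(s) and d >= nearest - margin for s, d in zip(srow, drow)]
            for srow, drow in zip(skin, disparity)]


def largest_component(mask):
    """Return (area, (row, col) centroid) of the largest 4-connected region."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    best = None
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:                       # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if best is None or len(pixels) > len(best):
                best = pixels
    if best is None:
        return None
    area = len(best)
    return area, (sum(p[0] for p in best) / area, sum(p[1] for p in best) / area)


# Toy frame: a 2x2 hand, one isolated noise pixel, one far skin-colored pixel.
skin = [[1, 0, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 0, 1]]
disparity = [[9, 0, 0, 0, 0],
             [0, 0, 8, 8, 0],
             [0, 0, 8, 9, 0],
             [0, 0, 0, 0, 2]]
mask = skin_depth_mask(skin, disparity)
hand = largest_component(mask)
```

    In this toy frame the depth filter removes the far skin-colored pixel, and keeping only the largest labeled region discards the isolated noise pixel, leaving the hand's area and centroid. Tracking that centroid and its depth across frames is the kind of information a dynamic gesture recognizer could use.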

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1
        1.1 Background
        1.2 Motivation
        1.3 Thesis Organization
    Chapter 2
        2.1 Overview
        2.2 Depth Information Extraction
            2.2.1 Kinect
            2.2.2 Stereo Matching
        2.3 Interesting Part Detection
        2.4 Hardware Design
    Chapter 3
        3.1 Overview
        3.2 Pre-processing
        3.3 Depth Extraction
        3.4 Skin Detection
        3.5 Adaptive Depth Dynamic Threshold
        3.6 Object Labeling
        3.7 Gesture Recognition
    Chapter 4
        4.1 ASIC Design
        4.2 FPGA Implementation
    Chapter 5
    References

