| Graduate student: | 劉乃菀 Nai-yuan Liu |
|---|---|
| Thesis title: | 硬體化動態物件偵測引擎設計與手勢辨識應用 (Design of a Hardware Motion Object Detection Engine and Its Application to Gesture Recognition) |
| Advisor: | 陳慶瀚 Chin-han Chen |
| Committee members: | |
| Degree: | Master |
| Department: | College of Information and Electrical Engineering - Department of Computer Science & Information Engineering |
| Year of publication: | 2013 |
| Graduating academic year: | 101 |
| Language: | Chinese |
| Number of pages: | 69 |
| Keywords: | 物件偵測 (object detection), 手勢辨識 (gesture recognition) |
Vision-based human-machine interaction (HMI) relies on a series of complex image-processing steps and therefore demands a high-speed processor to implement the algorithms; for embedded systems with limited hardware resources, realizing real-time interactive image processing is difficult. This study proposes a real-time embedded hardware architecture for moving-object detection and applies it to gesture recognition. The method adaptively builds a background model from consecutive frames and marks target objects with connected-component labeling. In the gesture recognition application, the target object is the moving hand gesture.
We adopt the MIAT embedded-system design methodology to implement the moving-object detection algorithm as embedded hardware, greatly improving the system's real-time performance. The detected object information is then fed into the gesture recognition system, where features of the dynamic gesture trajectory are analyzed by a fuzzy neural network (FNN) inference system to recognize dynamic gestures; the recognition results provide commands for human-machine interaction. Users can extend the system with custom gesture commands to broaden its applications and tailor its functions. The implemented real-time moving-object detection hardware acceleration engine operates at clock rates up to 107.63 MHz, with an estimated throughput of 350 frames per second at a resolution of 640×480; compared with a software system, this work meets real-time embedded-system requirements with low-cost hardware.
Vision-based human-computer interaction (HCI) requires a series of complex image-processing operations, so a high-speed processor is necessary to implement the algorithms involved. For an embedded system, achieving real-time, image-based human-computer interaction is even harder. For these reasons, we propose a real-time hardware motion object detection engine and apply it to gesture recognition. First, a background model is established adaptively from consecutive images. Then, the target object is located by connected-component labeling. In gesture recognition, the target object is the moving hand.
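As a software sketch of the detection pipeline just described (adaptive background model plus connected-component labeling), the pure-Python fragment below uses an illustrative running-average background update and a 4-connected BFS labeler. The learning rate `ALPHA`, threshold `THRESHOLD`, and all function names are assumptions for illustration, not the thesis's actual hardware design:

```python
from collections import deque

ALPHA = 0.05    # background learning rate (assumed value)
THRESHOLD = 30  # foreground difference threshold (assumed value)

def update_background(bg, frame, alpha=ALPHA):
    """Running-average update: bg <- (1 - alpha) * bg + alpha * frame."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=THRESHOLD):
    """A pixel differing from the background by more than thresh is foreground."""
    return [[1 if abs(f - b) > thresh else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def connected_components(mask):
    """4-connected component labeling via BFS; returns bounding boxes
    (xmin, ymin, xmax, ymax), one per detected object."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                q = deque([(y, x)])
                seen[y][x] = True
                ymin = ymax = y
                xmin = xmax = x
                while q:
                    cy, cx = q.popleft()
                    ymin, ymax = min(ymin, cy), max(ymax, cy)
                    xmin, xmax = min(xmin, cx), max(xmax, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((xmin, ymin, xmax, ymax))
    return boxes
```

For example, a bright 3×2 blob appearing on a static dark background yields a single foreground component whose bounding box covers exactly the blob; in the hardware engine the same per-pixel operations stream through a pipeline rather than nested loops.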
In the implementation, the MIAT embedded-system design methodology is applied to the hardware motion object detection engine to improve system performance. The moving-object information is then sent to the gesture recognition system, where a fuzzy neural network (FNN) analyzes the dynamic gesture-trajectory features. Finally, the recognition results serve as commands for human-computer interaction. Besides, to broaden the scope of application and allow customization, we designed a user interface with which users can add their own gesture commands to the system.
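The thesis performs this step with an FNN inference system; as a much simpler stand-in, the sketch below classifies a centroid trajectory by the fuzzy membership of its mean motion direction. The four swipe labels, the triangular fuzzy sets, and the mathematical (y-up) angle convention are all illustrative assumptions, not the thesis's trained network:

```python
import math

def triangular(x, a, b, c):
    """Triangular fuzzy membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def trajectory_angles(points):
    """Motion direction in degrees (0-360) between consecutive centroids."""
    return [math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

# Hypothetical fuzzy sets over the mean motion angle, one per swipe label.
# Note: mathematical convention (y up); image coordinates would flip up/down.
FUZZY_SETS = {
    "right": (-45, 0, 45),   # wrap-around set near 0/360
    "up":    (45, 90, 135),
    "left":  (135, 180, 225),
    "down":  (225, 270, 315),
}

def classify_swipe(points):
    """Return the swipe label with the highest fuzzy membership degree."""
    mean_angle = sum(trajectory_angles(points)) / (len(points) - 1)
    scores = {}
    for label, (a, b, c) in FUZZY_SETS.items():
        x = mean_angle
        if label == "right" and mean_angle > 180:
            x = mean_angle - 360  # unwrap so 350 deg counts as -10 deg
        scores[label] = triangular(x, a, b, c)
    return max(scores, key=scores.get)
```

A real FNN would additionally learn the membership-function parameters and combine several trajectory features; this sketch only shows where the fuzzy-inference step sits in the pipeline.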
The proposed hardware motion object detection engine operates at a system clock of up to 107.63 MHz, equivalent to approximately 350 fps on 640 × 480 images. Compared with software systems, our design meets the needs of real-time embedded systems with low-cost hardware.
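The two quoted figures are mutually consistent if the engine is assumed to process one pixel per clock cycle, a common assumption for streaming hardware pipelines (the abstract does not state the exact throughput model):

```python
CLOCK_HZ = 107.63e6        # reported maximum system clock of the engine
WIDTH, HEIGHT = 640, 480   # frame resolution

# One pixel per clock cycle (assumed) gives the frame rate directly:
pixels_per_frame = WIDTH * HEIGHT          # 307,200 pixels
fps = CLOCK_HZ / pixels_per_frame
print(round(fps, 1))                       # ~350 frames per second
```

This back-of-the-envelope figure matches the abstract's estimate of roughly 350 fps at 640 × 480.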