Graduate student: Wisnu Aditya (何迪亞)
Thesis title: Hand Tracking and Gesture Recognition for Playing Virtual Gamelan (應用於虛擬甘美朗之手部追蹤辨識系統)
Advisors: Timothy K. Shih (施國琛), Herman Tolle
Oral defense committee: (not listed)
Degree: Master
Department: Executive Master of Computer Science & Information Engineering (資訊電機學院 - 資訊工程學系在職專班)
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar, 2016-2017)
Language: English
Number of pages: 76
Chinese keywords: 手跟踪, 手勢識別, 深度數據, 即時的, DBSCAN, 虛擬Gamelan
English keywords: Hand Tracking, Gesture Recognition, Depth Data, Real-Time, DBSCAN, Virtual Gamelan
  • People living in modern society tend to forget their own culture by degrees, preferring modern things to traditional ones. If this continues, traditional culture will eventually vanish, so it needs to be preserved in creative and innovative ways. Gamelan is a traditional Indonesian performing art that dates back to the 6th century. Combining traditional culture with modern technology is expected to help address this problem, and here the combination takes the form of a virtual gamelan system. The system is controlled in real time with hand gestures so that playing it resembles playing a real gamelan, which gives gamelan players a new experience. Hand gestures offer an optional, intelligent, and natural interface for human-computer communication.

    Hand segmentation and tracking are the biggest issues in any hand-gesture recognition application, and they provide the most vital input to the subsequent gesture recognition algorithm. Depth data can speed up segmentation because it carries information about an object's position, which makes it easy to separate objects from the background. We segment the data with a threshold method; thresholding reduces the amount of data to be processed and therefore speeds up computation.

    In this research, we propose density-based spatial clustering of applications with noise (DBSCAN) as the data clustering algorithm and use it in both hand tracking and hand gesture recognition. The hand tracking process uses DBSCAN to obtain hand classes; DBSCAN is expected to produce two classes, representing the right hand and the left hand. These two classes must be labeled so that they do not swap in the next frame. Other gestures are recognized with a distance measurement method, in which the distance is obtained from the positions of the hand centers of the classes in the current frame and in the previous frame.

    Finally, we ran experiments to find the DBSCAN parameters that give the best results, and then tested the system by playing in various poses. The average accuracy for hand gestures recognized with DBSCAN is 92%. The results show that our method performs well on a virtual gamelan system.
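The pipeline outlined in the abstract (threshold segmentation of depth data, DBSCAN clustering into hand classes, and stable hand labels across frames) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the depth range, and the `eps` and `min_pts` values are all hypothetical, the DBSCAN here is a simple O(n²) version written for readability, and matching each hand to the nearest previous hand center is an assumed labeling rule that the abstract does not spell out.

```python
from collections import deque
from math import dist


def threshold_segment(depth, near, far):
    """Keep (x, y) pixels whose depth value lies in [near, far]."""
    return [(x, y)
            for y, row in enumerate(depth)
            for x, d in enumerate(row)
            if near <= d <= far]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue             # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels


def centroids(points, labels):
    """Mean position of each cluster, ignoring noise."""
    sums = {}
    for (x, y), c in zip(points, labels):
        if c == -1:
            continue
        sx, sy, n = sums.get(c, (0, 0, 0))
        sums[c] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}


def label_hands(prev, centers):
    """Assign 'left'/'right' to two cluster centers (assumed rule).

    prev maps 'left' and 'right' to the hand centers of the previous frame.
    The assignment with the smaller total center-to-center distance wins,
    so labels do not swap when DBSCAN returns clusters in a different order.
    """
    a, b = centers
    keep = dist(prev["left"], a) + dist(prev["right"], b)
    swap = dist(prev["left"], b) + dist(prev["right"], a)
    if keep <= swap:
        return {"left": a, "right": b}
    return {"left": b, "right": a}
```

In use, each depth frame would pass through `threshold_segment`, then `dbscan`, then `centroids`, and the two resulting centers would be handed to `label_hands` together with the previous frame's labeled centers. Suitable values for `eps` and `min_pts` depend on the sensor resolution and any downsampling, which is why the abstract reports tuning them experimentally.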

    Chinese Abstract
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
        1.1 Background
        1.2 Problem definition
        1.3 Scope and Limitation
    Chapter 2. Related Work
        2.1 Virtual Musical applications
            2.1.1 The Edutainment of Virtual Music Instrument for Thai Xylophone (Ranad-ek)
            2.1.2 A Virtual Xylophone for Music Education
        2.2 Hand Segmentation
            2.2.1 Hand Segmentation based on Improved Gaussian Mixture Model
            2.2.2 Depth-Based Hand Pose Segmentation with Hough Random Forest
        2.3 Hand Tracking Method
            2.3.1 Tracking Multiple Rigid Symmetric and Non-Symmetric Objects in Real-Time Using Depth Data
            2.3.2 A Real Time Alphabets Sign Language Recognition System using Hands Tracking
        2.4 Hand Gesture Recognition
            2.4.1 A static hand gesture recognition method based on the depth information
        2.5 Density Based Spatial Clustering of Application with Noise
            2.5.1 Automatic Object Detection using DBSCAN for Counting Intoxicated Flies in the FLORIDA Assay
        2.6 Gamelan
            2.6.1 Gamelan Notation
            2.6.2 Gamelan Gending
            2.6.3 Gamelan Instruments
    Chapter 3. Research Methodology
        3.1 System Architecture
        3.2 System Requirement
        3.3 System Interface
            3.3.1 Scenario
            3.3.2 Creating 3D Object
            3.3.3 Creating 3D Virtual World
            3.3.3 System Interface
            3.3.4 Kinect V2 and Unity 3D Integration
            3.3.5 Kinect Position and Tracking Area
        3.4 Proposed Method
            3.4.1 Get Depth Data
            3.4.2 Down Sampling Data
            3.4.2 Data Segmentation
            3.4.4 Data Clustering
            3.4.5 Get the Hand Area
            3.4.5 Hand Labeling
            3.4.6 Gesture Recognition
            3.4.7 Scoring System
            3.4.8 Playing Guidance
    Chapter 4. Experiment Result and Discussion
        4.1 Hand Tracking
            4.1.1 Hand Cluster
            4.1.2 Hand Pose
        4.2 Hand Gesture
    Chapter 5. Conclusion & Future Work
        5.1 Conclusion
        5.2 Future Work
    References

