
Graduate Student: Cheng-Ting Wang (王振庭)
Thesis Title: The design of virtual cello based on depth image and electromyography using deep learning
Advisor: Timothy K. Shih
Committee Members:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering
Publication Year: 2018
Graduation Academic Year: 106 (2017)
Language: English
Pages: 75
Keywords: Hand tracking, Gesture recognition, Electromyography, Convolutional neural network (CNN), Virtual instrument, MIDI
Views: 17; Downloads: 0


    Nowadays, new Human-Computer Interaction (HCI) technologies are being developed to deliver users' commands to the computer.
    Developing natural and intuitive interaction techniques is an important goal of HCI.
    Typically, users interact with the computer using only a mouse and keyboard.
    Current research is directed towards interaction that provides more natural, direct, and effective communication, letting users interact with computers through hand and head movements, facial expressions, voice, and electromyography signals.
    An interface based on hand interaction is a natural and intuitive way to interact with a computer, and such an interface could be used in AR/VR environments.

    This paper applies hand tracking and recognition in virtual reality (VR) to a real-time musical-instrument application: a virtual cello.
    The proposed application plays realistic instrument sounds.
    The user only needs to sit in front of the table and raise a hand to face the camera; after capturing the user's hand gesture, the system starts playing the virtual cello.
    The program is flexible: the user can adjust parameters such as key notes, chords, pitch up/down, and tone mode.
    We use a RealSense camera and a Myo sensor to capture the user's hand information in the application.
    The RealSense is responsible for hand tracking to trigger the sound.
    The Myo sensor is responsible for hand gesture recognition to control MIDI functions.
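The depth-triggered sound described above can be sketched as a small state machine: a note fires when a fingertip crosses a virtual "string plane" in depth, with hysteresis so a held finger does not retrigger. This is a minimal illustration only; the threshold values, event names, and the `TapDetector` class are assumptions, not the thesis implementation.

```python
# Hypothetical sketch of a depth-based finger-tap trigger: a note fires
# when the fingertip's depth crosses the "string plane", with hysteresis
# so a held finger does not retrigger. Thresholds are assumed values.

STRING_PLANE_MM = 450   # assumed depth of the virtual string plane (mm)
RELEASE_MM = 470        # fingertip must retreat past this before retriggering

class TapDetector:
    def __init__(self):
        self.pressed = False

    def update(self, fingertip_depth_mm):
        """Return 'note_on' on a new tap, 'note_off' on release, else None."""
        if not self.pressed and fingertip_depth_mm < STRING_PLANE_MM:
            self.pressed = True
            return "note_on"
        if self.pressed and fingertip_depth_mm > RELEASE_MM:
            self.pressed = False
            return "note_off"
        return None

detector = TapDetector()
events = [detector.update(d) for d in [500, 480, 440, 445, 460, 480, 430]]
print(events)  # [None, None, 'note_on', None, None, 'note_off', 'note_on']
```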
    We use a convolutional neural network (CNN) to recognize and analyze four static hand gestures from the surface electromyography signal.
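At the core of such a CNN is a 2D convolution applied to the sEMG "image" formed by the Myo's eight channels over a time window. The pure-Python sketch below shows only that shape handling; the window length, kernel size, and all values are illustrative assumptions, and in the thesis the real filters are learned during training.

```python
# Minimal pure-Python sketch of the 2D convolution underlying a CNN,
# applied to a surface-EMG "image": 8 Myo channels (rows) x time samples
# (columns). Window length, kernel, and values are illustrative assumptions;
# a trained CNN learns its filter weights by backpropagation.

def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation over a list-of-lists image."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

# An 8-channel x 6-sample sEMG window (fabricated values for illustration).
semg = [[(ch * 6 + t) % 5 for t in range(6)] for ch in range(8)]
# A 3x3 filter; a trained CNN would learn these weights.
kernel = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

fmap = conv2d_valid(semg, kernel)
print(len(fmap), len(fmap[0]))  # 6 4  -> (8-3+1) x (6-3+1) feature map
```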
    In the system, we use OpenGL to draw the hand model and the user interface on screen, and OpenCV to assist with image processing.
    We use the RtMidi library to generate MIDI messages and transmit them to a digital audio workstation (DAW).
    A Virtual Studio Technology (VST) plugin makes the instrument sound more realistic.
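For reference, the MIDI note messages that a library like RtMidi ultimately transmits to the DAW are three-byte packets. The sketch below constructs such bytes directly; the channel, note, and velocity values are illustrative, and this is not the RtMidi API itself.

```python
# Sketch of the raw bytes behind MIDI note messages sent to a DAW.
# A MIDI note-on is a status byte (0x90 | channel) followed by the note
# number and velocity; note-off uses status 0x80. Values are illustrative.

def note_on(channel, note, velocity):
    # Status byte 0x90 plus channel (0-15), then note and velocity (0-127).
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def note_off(channel, note):
    # Status byte 0x80 plus channel; release velocity 0.
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, 0])

# The open C string on a cello is C2, MIDI note 36.
msg = note_on(0, 36, 100)
print(msg.hex())  # 902464
```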
    Although the system cannot play fast songs, it is stable for slow songs and could be used in a professional music performance.
    In the experiments, we trained the convolutional neural network model on surface electromyography (sEMG) data.
    The experimental results demonstrate an accuracy of 94.3% on the specific-user dataset and 86.1% on the general dataset.
    In addition, we used dynamic time warping (DTW) to evaluate the virtual cello.
    This evaluation directly measures the melodic similarity between the MIDI generated by the virtual instrument and a standard MIDI file.
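The DTW evaluation can be illustrated with a minimal implementation over two MIDI pitch sequences, where the alignment absorbs timing differences between the performance and the reference. The toy melodies below are illustrative, not thesis data.

```python
# Minimal dynamic time warping (DTW) sketch for melodic similarity:
# distance between two MIDI pitch sequences, tolerant of tempo/timing
# differences. The toy sequences are fabricated for illustration.

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW with absolute pitch difference as cost."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

reference = [60, 62, 64, 65, 67]        # standard-MIDI melody (C D E F G)
played = [60, 60, 62, 64, 64, 65, 67]   # same melody with notes held longer
print(dtw_distance(reference, played))  # 0.0 -> identical up to timing
```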

    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Background
      1.2 Motivation
      1.3 Thesis Organization
    2 Related Work
      2.1 Virtual Instrument
      2.2 Hand Tracking
      2.3 Hand Gesture Recognition
        2.3.1 sEMG Hand Gesture Recognition
      2.4 Artificial Neural Networks
        2.4.1 Deep Learning Frameworks
        2.4.2 Convolutional Neural Networks
      2.5 Dynamic Time Warping
    3 Proposed Method
      3.1 Development Equipment
      3.2 Hand Detection
      3.3 Hand Motion
      3.4 Hand Tracking
      3.5 Finger Tapping
      3.6 Hand Gesture Recognition
      3.7 Data Acquisition and Data Preprocessing
      3.8 Deep Learning Framework of the Proposed Method
      3.9 Classification Algorithm
    4 Application
      4.1 Virtual Cello Application System
      4.2 Application System Architecture
      4.3 MIDI
    5 Experiment Results
      5.1 Experimental Environment
      5.2 Dataset Collection
      5.3 Convolutional Neural Network Model Structure Evaluation
      5.4 Testing on Independent and Cross-Correlation sEMG Images
      5.5 Filter Shaping for the Convolutional Neural Network
      5.6 Evaluation of the Virtual Cello System
    6 Conclusion and Future Work
      6.1 Conclusion
      6.2 Future Work
    References

