| Graduate student: | 吳承宗 Cheng-tsung Wu |
|---|---|
| Thesis title: | 基於深度攝影機之混合實境互動桌 (A Depth-camera-based Mixed Reality Interactive Table) |
| Advisor: | 蘇木春 Mu-chun Su |
| Committee members: | |
| Degree: | Master |
| Department: | College of Electrical Engineering & Computer Science - Department of Computer Science & Information Engineering |
| Graduation academic year: | 100 (2011-2012) |
| Language: | Chinese |
| Number of pages: | 98 |
| Keywords (Chinese): | musical instrument digital interface, building blocks, human-computer interaction, depth camera, 3D object recognition, touch screen interface |
| Keywords (English): | building blocks, musical instrument digital interface (MIDI), three-dimensional object recognition, touch screen interface, depth camera, human-computer interaction |
This thesis combines a Microsoft Kinect with a projector to build a mixed reality interactive table. The table provides two application modes: a touch screen mode and a mixed reality interactive music mode. Before either application can run, the Kinect must be calibrated against the projected image to obtain a coordinate-system transformation matrix. This matrix converts the coordinate system whose origin is the Kinect into one whose origin is the upper-left corner of the projected image. At every frame the depth data are transformed by this matrix, yielding a three-dimensional point set whose origin is the upper-left corner of the projected image, from which a top-view depth map is built.
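The calibration-and-reprojection step above can be sketched as follows. This is a minimal illustrative sketch, not the thesis's implementation: the rotation `R`, translation `t`, cell size, and grid dimensions are placeholder assumptions.

```python
# Sketch: map each Kinect-frame 3-D point p into the projection-screen
# frame via the rigid transform p' = R*p + t, then bin the transformed
# points into a top-view height map. All parameter values are placeholders.

def transform_point(R, t, p):
    """Apply p' = R*p + t, where R is a 3x3 matrix, t and p are 3-vectors."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i]
                 for i in range(3))

def build_top_view(points, cols, rows, cell=0.01):
    """Bin transformed points into a top-down height map.
    x/y index the projection-screen plane (origin at its upper-left corner);
    each cell keeps the maximum height observed above the table surface."""
    grid = [[0.0] * cols for _ in range(rows)]
    for (x, y, z) in points:
        c, r = int(x / cell), int(y / cell)
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] = max(grid[r][c], z)
    return grid

# Usage: with an identity rotation and a pure translation, points are
# simply shifted before being binned into the top-view grid.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shifted = [transform_point(identity, (0.0, 0.0, 0.0), p)
           for p in [(0.005, 0.005, 0.05)]]
top_view = build_top_view(shifted, cols=100, rows=50)
```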
In the touch screen mode, the system recognizes eight hand shapes and, according to the hand shape and the hand's distance from the projected image, issues the corresponding mouse command, turning the projected image into a touch screen. In the mixed reality interactive music mode, we propose a method for recording three-dimensional objects: users can creatively assemble building blocks into objects of arbitrary shape, place them on the projection surface, and choose a suitable instrument for each from 120 instrument sounds, adding hearing to blocks that previously offered only touch and sight. The system recognizes the user-designated instrument object and draws the 21 most commonly used virtual keys around it, following the object's orientation. Several instruments and several users can play together at the same time, which not only teaches users about instruments and offers the fun of ensemble playing, but also provides an unlimited creative space that helps stimulate thinking.
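The mapping from hand height to mouse command described above might look like the following minimal sketch; the thresholds and state names are illustrative assumptions, not the thesis's actual values.

```python
# Sketch: the fingertip's height above the projection surface (read from
# the top-view depth map) selects a mouse action. Threshold values below
# are illustrative placeholders, not the thesis's measured parameters.

TOUCH_MM = 10   # closer than this to the surface: treated as a press
HOVER_MM = 40   # closer than this: the cursor follows the hand

def mouse_state(height_mm):
    """Return the mouse action implied by the hand's height (millimetres)."""
    if height_mm < TOUCH_MM:
        return "press"
    if height_mm < HOVER_MM:
        return "hover"
    return "idle"
```

In a full system the returned state would be combined with the recognized hand shape to pick one of the eight gesture-specific mouse commands.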
This thesis combines a Microsoft Kinect with a projector to create a mixed reality interactive table. The table provides two modes: a touch screen mode and a mixed reality interactive music mode. Before either mode can run, the Kinect and the projector must be calibrated to obtain a coordinate transformation matrix. The purpose of this matrix is to move the origin of the coordinate system from the Kinect to the upper-left corner of the projected screen. Multiplying each frame's real-world point set by the matrix yields a new point set whose origin is the upper-left corner of the projected screen, from which a top-view depth map is built.
In the touch screen mode, the system recognizes eight hand gestures and maps each gesture, together with the hand's height above the surface, to a mouse instruction, turning the projected screen into a touch screen. In the mixed reality interactive music mode, we provide three-dimensional object recognition: users can creatively compose blocks into objects of arbitrary shape on the projected screen and select a suitable instrument for each object from 120 kinds of musical instruments, adding hearing to blocks that previously engaged only touch and sight. The system recognizes the user-specified instrument object and draws 21 virtual keys next to it, so several users can play several instruments at the same time. This supports learning about musical instruments, the fun of ensemble playing, and an unlimited creative space that stimulates thinking.
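The placement of the 21 virtual keys next to a recognized instrument object, following the object's angle, can be illustrated with a small geometric sketch; the key spacing, row offset, and coordinate conventions are assumptions for illustration only.

```python
import math

# Sketch: lay out the 21 virtual keys in a row beside an instrument object,
# rotated by the object's detected angle about its centre. The spacing and
# offset values are illustrative placeholders.

def key_positions(cx, cy, angle_rad, n_keys=21, spacing=30.0, offset=60.0):
    """Return screen positions of n_keys keys in a row beside the object
    at (cx, cy), rotated so the row follows the object's orientation."""
    half = (n_keys - 1) / 2.0
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    keys = []
    for i in range(n_keys):
        # Position in the object's local frame: a row at distance `offset`.
        lx, ly = (i - half) * spacing, offset
        # Rotate into screen coordinates and translate to the object centre.
        keys.append((cx + lx * cos_a - ly * sin_a,
                     cy + lx * sin_a + ly * cos_a))
    return keys
```

Each key would then be mapped to a MIDI note of the object's chosen instrument, so that touching a key triggers the corresponding sound.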