| Graduate student: | 黃超群 Chao-Chun Huang |
|---|---|
| Thesis title: | 身心障礙者輔具之研製 The Study of Auxiliary Equipments for the Spinal Cord Injuries and the Blind |
| Advisor: | 蘇木春 Mu-Chun Su |
| Committee members: | |
| Degree: | 碩士 Master |
| Department: | 資訊電機學院 - 資訊工程學系 Department of Computer Science & Information Engineering |
| Graduation academic year: | 90 |
| Language: | Chinese |
| Pages: | 64 |
| Chinese keywords: | 聲控電腦 、語音辨識 、點字樂譜 |
| Foreign-language keywords: | voice control Human-Computer Interface, Braille music scores, speech recognize |
摘要
對一般人士來說,透過滑鼠和鍵盤來操作電腦是相當方便的事情,不幸地,對於身心障礙人士卻沒有辦法享受到這種方便的互動。如何改善這個狀況是個刻不容緩的任務。對於人類來說,語音是一種自然的溝通方式,對於沒辦法使用這些控制介面的人士,使用語音操作來控制電腦是個不錯的取代方式。目前,應用於語音辨識上,主要有下列三種方法,(1) 動態時間校正(DTW)演算法,(2) 隱藏Markov模型(HMM),和(3) 模糊類神經方法(neuro-fuzzy approaches)。當然,這些方法都有個別的優缺點,在本篇論文中,我們提出了一個新的方法,結合了Kohonen的自我組織特徵映射圖網路(SOM)和一個圖樣比對方式,來應用於辨識單詞語音上。我們使用這個新的語音辨識方法來實現了一個聲控滑鼠介面,使得使用者能夠以語音命令來操作滑鼠。
在本論文中我們也發展了一個點字樂譜自動化製作的整合環境。透過此整合環境,使用者可以將所要學習的樂譜,先經掃描器轉至電腦檔案中,然後,藉由辨識程式辨識在被掃描的樂譜上的所有音樂符號,最後再將辨識結果轉換成點字樂譜檔案,並將點字樂譜檔案以語音呈現給使用者。藉由此整合環境,我們可大量減少人工製作的費用以及縮短製作時間,以便為國內視覺障礙人士製作大量的點字樂譜。
Abstract:
For able-bodied people, access to computers can be taken for granted because conventional computer interfaces (e.g., a keyboard and a mouse) are designed with the able-bodied in mind. Unfortunately, people with physical disabilities cannot enjoy the benefits provided by computers on equal terms. Therefore, lowering or even tearing down the barriers between computers and users with disabilities is an urgent task. Since speech is a natural means of communication for human beings, voice operation is an ideal control method for the many disabled people who cannot use conventional computer interfaces. Currently, there are three main approaches to speech recognition: (1) the dynamic time warping (DTW) algorithm, (2) hidden Markov models (HMMs), and (3) neuro-fuzzy approaches. Each has its own merits and disadvantages. In this thesis, a new method that combines Kohonen’s self-organizing feature map (SOM) and a simple pattern matching method is proposed for isolated word recognition. Based on this new recognition method, a voice-controlled mouse is implemented that allows the user to issue voice commands to move the cursor and/or click the buttons.
In this thesis, we also develop an integrated environment for automatically producing Braille music scores. In this environment, users first scan a music score into a computer, then run the recognition software to recognize the musical symbols on the scanned score, and finally transform the recognition results into a Braille music score file, which is presented to the user as speech. With the proposed integrated environment, we can greatly reduce labor costs and shorten the time needed to produce a large number of Braille music scores for the blind in Taiwan.