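The Study of Auxiliary Equipments for the Spinal Cord Injuries and the Blind | National Central University Electronic Theses and Dissertations System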


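Author:	Chao-Chun Huang (黃超群)
Thesis title:	身心障礙者輔具之研製 The Study of Auxiliary Equipments for the Spinal Cord Injuries and the Blind
Advisor:	Mu-Chun Su (蘇木春)
Committee members:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Academic year of graduation:	90 (ROC calendar)
Language:	Chinese
Pages:	64
Keywords (translated from the Chinese):	voice-controlled computer, speech recognition, Braille music scores
Keywords (English):	voice-controlled human-computer interface, Braille music scores, speech recognition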

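Abstract (translated from the Chinese)
For most people, operating a computer with a mouse and keyboard is convenient; unfortunately, people with disabilities cannot enjoy this convenient interaction. Improving this situation is an urgent task. Speech is a natural way for humans to communicate, so for people who cannot use these control interfaces, voice control of a computer is a good alternative. Three methods currently dominate speech recognition: (1) the dynamic time warping (DTW) algorithm, (2) hidden Markov models (HMM), and (3) neuro-fuzzy approaches. Each has its own strengths and weaknesses. In this thesis we propose a new method that combines Kohonen's self-organizing feature map (SOM) network with a pattern matching scheme to recognize isolated spoken words. We use this new speech recognition method to implement a voice-controlled mouse interface that lets the user operate the mouse with spoken commands.
In this thesis we also develop an integrated environment for automated production of Braille music scores. In this environment, the user first scans the score to be learned into a computer file, then a recognition program recognizes all the musical symbols on the scanned score, and finally the recognition results are converted into a Braille music score file, which is presented to the user through speech. With this integrated environment we can greatly reduce the cost of manual production and shorten production time, making it possible to produce Braille music scores in large quantities for visually impaired people in Taiwan.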

Abstract:
For able-bodied people, access to computers can be taken for granted because conventional computer interfaces (e.g. a keyboard and a mouse) are designed with the able-bodied in mind. Unfortunately, people with physical disabilities cannot enjoy the benefits provided by computers on equal terms. Lowering or even tearing down the barriers between computers and users with disabilities is therefore a pressing task. Since speech is a natural means of communication for human beings, voice operation is an ideal control method for the many disabled people who cannot use conventional computer interfaces. Currently, there are three main approaches to speech recognition: (1) the dynamic time warping (DTW) algorithm, (2) hidden Markov models (HMM), and (3) neuro-fuzzy approaches. Each has its own merits and disadvantages. In this thesis, a new method that combines Kohonen’s self-organizing feature map (SOM) and a simple pattern matching method is proposed for isolated word recognition. Based on this new recognition method, a voice-controlled mouse is implemented that allows the user to issue voice commands to move the cursor and/or click the buttons.
In this thesis, we also develop an integrated environment for automatically producing Braille music scores. In this environment, users first scan music scores into a computer, then run the recognition software to recognize the musical symbols on the scanned scores, and finally transform the recognized results into Braille music scores. With the proposed integrated environment, we can greatly reduce labor costs and shorten the time needed to produce large numbers of Braille music scores for the blind in Taiwan.
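The combination of a self-organizing feature map with a simple matching step can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the record does not give the feature extraction, the map size, or the matching rule, so everything here is an assumption made for illustration. The function names (`train_som`, `winner_histogram`, `recognize`), the 4x4 map, the Gaussian neighborhood with linearly decaying learning rate, and the histogram-intersection score (which ignores frame order) are all hypothetical choices.

```python
import numpy as np


def train_som(frames, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on feature frames (one frame per row).

    Assumed setup: random initial weights, Gaussian neighborhood,
    learning rate and neighborhood width shrinking linearly over time.
    """
    rng = np.random.default_rng(seed)
    dim = frames.shape[1]
    weights = rng.uniform(frames.min(), frames.max(), size=(rows * cols, dim))
    # Grid coordinates of each neuron, used to measure neighborhood distance.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        for x in frames[rng.permutation(len(frames))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            # The winner is the neuron whose weight vector is closest to the frame.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            dist2 = np.sum((grid - grid[winner]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            # Pull the winner and its grid neighbors toward the frame.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def winner_histogram(weights, frames):
    """Normalized count of how often each neuron wins over an utterance."""
    dists = np.linalg.norm(frames[:, None, :] - weights[None, :, :], axis=2)
    winners = np.argmin(dists, axis=1)
    hist = np.bincount(winners, minlength=len(weights)).astype(float)
    return hist / hist.sum()


def recognize(weights, templates, frames):
    """Return the template word whose winner histogram overlaps most."""
    query = winner_histogram(weights, frames)
    scores = {word: np.minimum(query, hist).sum() for word, hist in templates.items()}
    return max(scores, key=scores.get)
```

To use the sketch, one would train the map on frames pooled from all training utterances, store one `winner_histogram` per command word as a template, and call `recognize` on the frames of a new utterance. A matching rule that respects the order of winners over time would be a natural refinement.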
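The final step of this pipeline, turning recognized notes into Braille music symbols, can be sketched in Python. The sketch is an illustration only and does not reproduce the thesis's translation program or its database: the input format (a list of note-name and duration pairs) and the function names are hypothetical. The dot patterns follow the standard Braille music code for note names and rhythmic values, and the cells are emitted as Unicode Braille patterns. Octave marks, rests, chords, accidentals, and key or time signatures are deliberately left out.

```python
# Dot numbers for the seven note names at eighth-note value
# (standard Braille music code).
NOTE_DOTS = {
    "C": (1, 4, 5),
    "D": (1, 5),
    "E": (1, 2, 4),
    "F": (1, 2, 4, 5),
    "G": (1, 2, 5),
    "A": (2, 4),
    "B": (2, 4, 5),
}

# Extra dots added to the note cell to encode the rhythmic value.
DURATION_DOTS = {
    "eighth": (),
    "quarter": (6,),
    "half": (3,),
    "whole": (3, 6),
}


def braille_cell(dots):
    """Build a Unicode Braille character: dot n sets bit n-1 above U+2800."""
    code = 0x2800
    for d in dots:
        code |= 1 << (d - 1)
    return chr(code)


def note_to_braille(name, duration):
    """Translate one note (assumed input: name 'C'..'B' plus a duration key)."""
    return braille_cell(NOTE_DOTS[name] + DURATION_DOTS[duration])


def melody_to_braille(melody):
    """Translate a list of (name, duration) pairs into a string of Braille cells."""
    return "".join(note_to_braille(name, duration) for name, duration in melody)
```

For example, `melody_to_braille([("C", "quarter"), ("D", "quarter"), ("E", "half")])` yields three Braille cells, one per note. A complete translator would also need to insert octave marks and handle the notation differences for chords.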

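Table of Contents
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation
1.3 Thesis Organization
Chapter 2 Voice-Controlled Human-Computer Interface
2.1 Speech Processing Methods
2.2 Extraction of Speech Feature Parameters
2.2.1 Feature Parameter Extraction
2.3 Self-Organizing Feature Map Network
2.3.1 Self-Organizing Feature Map Algorithm
2.3.2 Parameter Selection
2.3.3 Fine-Tuning the Self-Organizing Feature Map
2.4 Training the Self-Organizing Feature Map Network
2.4.1 Network Initialization
2.5 Speech Feature Matching
2.6 Comparison of SOM-Based and DTW-Based Speech Recognition
Chapter 3 Production of Braille Music Scores
3.1 Preface
3.2 Music Score Image Input System
3.3 Optical Music Recognition System
3.4 Braille Symbol Database
3.5 Translation Program
3.6 Speech Output
3.7 Results and Discussion
Chapter 4 Conclusions and Future Work
References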
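List of Figures
Figure 2.1 Speech recognition system: training phase and recognition phase.
Figure 2.2 Endpoint detection: removes the silent portions at the beginning and end of an utterance.
Figure 2.3 Array of neurons arranged as a two-dimensional matrix.
Figure 2.4 Forms of the neighborhood function: (a) square neighborhood; (b) hexagonal neighborhood.
Figure 2.5 Gaussian neighborhood function.
Figure 2.6 Initializing the feature map with the grid-point method.
Figure 2.7 Winner chart: winner outputs for repeated utterances of the same word "上" ("up").
Figure 2.8 Similarity results for the first utterance of the first set: A, B, C, and D denote the four training sets, 1..9 denote nine different spoken words, and the bottom row (row 11) gives the maximum within each set.
Figure 2.9 Voice-controlled computer interface: spoken commands move the mouse over an on-screen keypad and select the action to trigger (left click, right click, etc.).
Figure 3.1 The four subsystems of the integrated environment for automated Braille music score production.
Figure 3.2 Differences in Braille notation: chords are represented differently.
Figure 3.3 Braille music score database: stored data types.
Figure 3.4 Notes, Braille symbols, and their corresponding text symbols.
Figure 3.5 MidiConverter converts a MIDI file to a text file. (a) The conversion interface provided by MidiConverter; (b) the text form of a MIDI file.
Figure 3.6 Flowchart of the translation program.
Figure 3.7 Player interface: the arrow keys control playback of the score.
Figure 3.8 Installer screen.
Figure 3.9 ODBC settings, with the data source name "TtoBdata".
Figure 3.10 Experimental results: (a) a melody; (b) the text form of the melody's MIDI file; (c) the output of the conversion program; a Braille font is required to view the actual Braille symbols.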

                                

