| Graduate Student: | 吳柏葦 Po-Wei Wu |
|---|---|
| Thesis Title: | 利用Android系統開發可攜式語音診斷與復健系統 Development of an Android-based speech diagnosis and rehabilitation system |
| Advisor: | 吳炤民 |
| Oral Defense Committee: | |
| Degree: | Master |
| Department: | College of Information and Electrical Engineering - Department of Electrical Engineering |
| Year of Publication: | 2015 |
| Graduation Academic Year: | 104 |
| Language: | Chinese |
| Number of Pages: | 127 |
| Keywords (Chinese): | Android, articulation disorder (構音障礙), smart device (智慧型裝置), rehabilitation (復健), speech analysis (語音分析), spectrogram (聲譜), spectrum (頻譜) |
Abstract (Chinese)
"Language" is one of humanity's most important tools of communication. In recent years the number of people with articulation disorders has risen steadily, and such disorders are a serious obstacle to interpersonal communication. As technology develops, mobile devices are becoming ever more common; if speech diagnosis and rehabilitation could run on the mobile devices used in daily life, it would be of great help to both patients and speech therapists. The purpose of this study was therefore to develop a speech diagnosis and rehabilitation system on the Android platform. It records speech through the device's built-in microphone and, via the system's user interface, analyzes the speech signals of normal speakers and of cases with articulation disorders, presenting the waveform, spectrum, spectrogram, and related information for the speech therapist to inspect and compare, so that the diagnosis and rehabilitation training of speakers with articulation disorders can achieve a greater therapeutic effect.
This study used case recordings made earlier by our laboratory in cooperation with Taipei Veterans General Hospital. Using the acoustic quantification methods of the fast Fourier transform (FFT) and linear predictive coding (LPC), the time-domain speech signal is quantified and the resulting diagnostic information is presented as graphs and numerical data. The recorded Mandarin vowels /ㄚ/, /一/, /ㄨ/, /ㄝ/, /ㄛ/ and consonants /ㄍ/, /ㄎ/, /ㄏ/, /ㄐ/, /ㄑ/, /ㄒ/, /ㄓ/, /ㄔ/, /ㄕ/ were compared to explore the differences between normal and disordered speech. By observing the vowel formants and the energy distribution of the consonants in the spectrogram, one can judge whether a disordered speaker's articulation is correct. The comparison results show that users of this system can identify where the articulation problems of a disordered speaker lie; in addition, patients with articulation disorders can use the system for self-directed speech training.
To evaluate the correctness, functionality, and practicality of the system, we compared its analysis results with the Praat system and with the visible speech diagnosis and rehabilitation system previously developed in our laboratory with Matlab on a PC, in terms of user interface, interface functions, methods, and the overall system. The comparison shows that, with its dual speech-input interface for side-by-side comparison, our system is better suited as a speech diagnosis tool than Praat, and it can also present the effect of speech rehabilitation training before and after treatment. Furthermore, this study made the speech diagnosis and rehabilitation system lightweight and portable so that it runs on smartphones and tablets, which is far more convenient than either the Matlab-based visible speech diagnosis and rehabilitation system or the Praat system. It is therefore of considerable help both to therapists, who can carry it for diagnosis, and to patients, who gain a convenient tool for self-training.
The speech diagnosis and rehabilitation system developed in this thesis can present the differences between normal speech and that of cases with articulation disorders on a mobile device. Speech therapists can use it as a diagnostic and assessment tool in clinical practice to understand the articulation of people with articulation disorders, giving patients a more complete tool for rehabilitation training.
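The FFT step of the quantification above can be sketched as follows. This is a minimal illustration only: the helper `frame_spectrum`, the 32 ms frame length, and the 1 kHz test tone are assumptions for demonstration, not the thesis implementation.

```python
import numpy as np

def frame_spectrum(frame, fs):
    """Magnitude spectrum (in dB) of one Hamming-windowed analysis frame."""
    windowed = frame * np.hamming(len(frame))
    mag = np.abs(np.fft.rfft(windowed))            # one-sided FFT magnitudes
    freqs = np.fft.rfftfreq(len(windowed), 1 / fs)  # bin center frequencies
    db = 20 * np.log10(mag + 1e-12)                 # small offset avoids log(0)
    return freqs, db

# one 32 ms frame of a 1 kHz test tone at a 16 kHz sampling rate
fs = 16000
t = np.arange(int(0.032 * fs)) / fs
frame = np.sin(2 * np.pi * 1000 * t)
freqs, db = frame_spectrum(frame, fs)
peak_hz = freqs[np.argmax(db)]  # strongest spectral component
print(peak_hz)
```

Plotting such per-frame spectra over time gives the spectrogram view the system presents for comparison.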
Abstract (English)
Language is one of the most important communication tools. In recent years, the number of patients with articulation disorders has been increasing year by year, causing problems in their communication with other people. Mobile devices, such as smartphones and tablet PCs, have become more common as technology advances. It would be a great help for patients and speech therapists if mobile devices could be applied to speech diagnosis and rehabilitation. Therefore, the purpose of this study was to develop an Android-based speech diagnosis and rehabilitation system which can record and compare the speech signals of a normal speaker and a patient with an articulation disorder via its user interface. In this user interface, clinical users can compare the two signals in the forms of the speech waveform, spectrum, spectrogram, and fundamental frequency, providing a quantitative analysis for the speech therapist and a greater therapeutic effect for patients with articulation disorders.
In this study, we analyzed and compared speech recordings of Chinese vowels and consonants from a previously cooperating hospital using the fast Fourier transform, linear predictive coding, and other acoustic quantification methods to provide useful speech-related diagnostic information in graphic form. Our results showed that clinical users can observe the difference between normal and disordered speech through our system, via differences in the first three formant frequencies for vowels and in the energy distribution of the spectrograms for consonants. In addition, patients with articulation disorders can also use our system for self-training and self-learning.
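The LPC-based formant observation can be sketched in the same spirit. This is a minimal sketch of the standard autocorrelation (Yule-Walker) approach; the helper `lpc_formants`, the model order of 12, and the synthetic two-resonance test signal are illustrative assumptions, not the system's actual code.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_formants(signal, fs, order=12):
    """Estimate formant-like resonance frequencies from LPC polynomial roots."""
    x = signal * np.hamming(len(signal))
    # autocorrelation method: solve the Yule-Walker equations
    # for the prediction coefficients a[1..order]
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    a = solve_toeplitz(r[:-1], r[1:])
    # roots of A(z) = 1 - a1*z^-1 - ... - ap*z^-p near the unit circle
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[(np.imag(roots) > 0) & (np.abs(roots) > 0.8)]
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

# synthetic "vowel" with two resonances near 700 Hz and 1200 Hz,
# plus a little noise to keep the Toeplitz system well conditioned
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(int(0.05 * fs)) / fs
sig = (np.sin(2 * np.pi * 700 * t)
       + 0.5 * np.sin(2 * np.pi * 1200 * t)
       + 0.001 * rng.standard_normal(len(t)))
formants = lpc_formants(sig, fs)
print(np.round(formants))  # dominant resonances should appear near 700 and 1200 Hz
```

Applied frame by frame to a recorded vowel, the pole angles of the LPC polynomial track the formant frequencies that the system displays for comparison.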
In order to evaluate the validity, functionality, and usefulness of our system, we compared its results with those of the Praat system and of the visible speech diagnosis and rehabilitation system previously developed with Matlab in our lab. In summary, the Android-based speech diagnosis and rehabilitation system can show the differences between normal and disordered speech on mobile devices. Speech therapists can use it as a diagnostic and assessment tool in clinical settings, providing patients with articulation disorders a better training and rehabilitation tool.
References
English references:
Albertini, G., Bonassi, S., Dall'Armi, V., Giachetti, I., Giaquinto, S., and Mignano, M. (2010). "Spectral analysis of the voice in Down Syndrome." Research in Developmental Disabilities, Vol. 31, 995-1001.
Aziz, Mohd Z. A., Abdullah, Syahrul A. C., Adnan, Syed F. S., and Mazalan, Lucyantie. (2014). "Educational App for Children with Autism Spectrum Disorders." Procedia Computer Science, Vol. 42, 70-77.
Chappel, Roger, and Paliwal, Kuldip. (2014). "An educational platform to demonstrate speech processing techniques on Android based smart phones and tablets." Speech Communication, Vol. 57, 13-38.
Howell, Peter, and Vause, Louise. (1985). "Acoustic analysis and perception of vowels in stuttered speech." J. Acoust. Soc. Am., Vol. 79, No. 5, 1571-1579.
Howell, Peter, and Williams, Mark. (1990). "Acoustic analysis and perception of vowels in children's and teenager's stuttered speech." J. Acoust. Soc. Am., Vol. 91, No. 3, 1697-1706.
Liss, Julie M., Spitzer, Stephanie, Caviness, John N., Adler, Charles, and Edwards, Brian. (1998). "Syllabic strength and lexical boundary decisions in the perception of hypokinetic dysarthric speech." J. Acoust. Soc. Am., Vol. 104, No. 4, 2457-2466.
Kent, R. D. and Read, C. (2002). The acoustic analysis of speech, Thomson Learning: Albany, NY, USA.
Ladefoged, Peter, Harshman, Richard, Goldstein, Louis, and Rice, Lloyd. (1978). "Generating vocal tract shapes from formant frequencies." J. Acoust. Soc. Am., Vol. 64, No. 4, 1027-1035.
Niedereder, Martin, and Sanders, Benjamin T. (1962). "The Level Tracer, an Instrument for Time Saving and Better Measurement in the Speech Band." Transactions of the American Institute of Electrical Engineers, Vol. 80, Issue 6, 674-682.
Milenkovic, P. (2004). “TF32 [computer Program].” Madison, WI: University of Wisconsin - Madison, Department of Electrical Engineering.
Park, S. H., Kim, D. J., Lee, J. H., and Yoon, T. S. (1994). "Integrated Speech Training System for Hearing Impaired." IEEE Transactions on Rehabilitation Engineering, Vol. 2, No. 4, 189-196.
Peterson, G. E. and Barney, H. L. (1952). "Control methods used in a study of the vowels." J. Acoust. Soc. Am., Vol. 24, 175-184.
Robert, E. O. Jr., Metz, D. E., and Haas, A. (2000). Introduction to communication disorders, Pearson Education: Needham Heights, MA.
Skodda, Sabine, Visser, Wenke, and Schlegel, Uwe. (2011). "Vowel Articulation in Parkinson's Disease." Journal of Voice, Vol. 25, No. 4, 467-472.
Stevens, Kenneth N. and House, Arthur S. (1955). "Development of a Quantitative Description of Vowel Articulation." J. Acoust. Soc. Am., Vol. 27, No. 3, 484-493.
Subramanian, Anu, Yairi, Ehud, and Amir, Ofer. (2003). "Second formant transitions in fluent speech of persistent and recovered preschool children who stutter." Journal of Communication Disorders, Vol. 36, Issue 1, 59-75.
Syauqy, Dahnial. (2014). “Feature Based Scoring on Visible Speech Diagnostic and Rehabilitation System.” MS Thesis, Department of Electrical Engineering. National Central University: Chungli, Taiwan.
Szameitat, Diana P., Darwin, Chris J., Szameitat, Andre J., Wildgruber, Dirk, and Alter, Kai. (2012). "Formant Characteristics of Human Laughter." Journal of Voice, Vol. 25, No. 1, 32-37.
Tykalova, Tereza, Rusz, Jan, Cmejla, Roman, Ruzickova, Hana, and Ruzicka, Evzen. (2014). "Acoustic Investigation of Stress Patterns in Parkinson's Disease." Journal of Voice, Vol. 28, No. 1, 129.e1-129.e8.
Willis, G. L., and Armstrong, S. M. (1998). "Orphan neurones and amine access: the functional neuropathology of Parkinsonism and neuropsychiatric disease." Brain Res. Rev., Vol. 27, 177-242.
Android. Google, Inc., Mountain View, CA, USA. (2007) (accessed 2015/12/12)
http://www.android.com/
Computerized Speech Lab. KayPENTAX, Montvale, NJ, USA. (2011) (accessed 2015/12/12)
http://www.swordmedical.ie/product/cslcomputerized-speech-lab/
FLAC. Josh Coalson, Xiph.Org Foundation. (2001) (accessed 2015/12/14)
https://xiph.org/flac/
https://github.com/AlyoshaVasilieva/JavaFlacEncoder
Gartner, Inc., Stamford, CT, USA. (1979) (accessed 2015/12/12)
http://www.gartner.com/technology/home.jsp
International Data Corporation (IDC), Massachusetts, USA. (1964) (accessed 2015/12/12)
http://www.idc.com/getdoc.jsp?containerId=prUS24108913
iOS. Apple Computer, Inc., Cupertino, CA, USA. (2007) (accessed 2015/12/12)
http://www.apple.com/tw/
Praat: Doing Phonetics by Computer. (2014) (accessed 2015/12/12)
http://www.fon.hum.uva.nl/praat/
Plotly, Inc., Montreal, Quebec, Canada. (2012) (accessed 2015/12/12)
https://plot.ly/feed/
https://github.com/plotly/plotly.js#bugs-and-feature-requests
ProSpec Lite Spectrum Analyzer. (accessed 2015/12/12)
http://www.9apps.com/android-apps/ProSpec-Lite-Spectrum-Analyzer/
Summer Institute of Linguistics (SIL International), Arkansas, USA. (1934) (accessed 2015/12/12)
http://www.sil.org/
University of Texas at Dallas, Speech Enhancement for Android. (2012) (accessed 2015/12/12)
http://enhancementapp.com/
Windows Phone. Microsoft Corporation, Redmond, WA, USA. (2010) (accessed 2015/12/12)
http://www.windowsphone.com/zh-tw
Chinese references:
葉斐聲 and 徐通鏘 (2001). 語言學綱要 [Outline of Linguistics]. 書林出版有限公司, Taiwan.
衛生福利部社會及家庭署 (Ministry of Health and Welfare, Social and Family Affairs Administration), disability services website: (accessed 2015/12/12)
https://dpws.sfaa.gov.tw/commonch/index.jsp
蘇宗柏, 陳思遠, 王亭貴, 王顏和, and 連倚南 (2010). "復健醫療服務之疾病分類研究 [A Study on Disease Classification in Rehabilitation Medical Services]: recent experience of a domestic medical center." (accessed 2015/12/12)
http://www.ntuh.gov.tw/PMR/Lists/List14/Attachments/188/10253009-201012-201101150006-201101150006-229-236.pdf
林寶貴 (1994). 語言障礙與矯治 [Language Disorders and Remediation]. 五南圖書出版有限公司, Taipei, Taiwan.
賴湘君 (1990). "構音異常的診斷及矯治 [Diagnosis and Remediation of Articulation Disorders]." 語言治療教育專題研討專輯, Taipei City Government Department of Education, 123-133.
曾進興 (2005). 語言病理學基礎, 第一卷 [Foundations of Language Pathology, Vol. 1]. 心理出版社股份有限公司, Taipei, Taiwan.
謝秉寰 (2014). "可見式語音診斷與復健系統 [Visible Speech Diagnosis and Rehabilitation System]." Master's thesis, Institute of Electrical Engineering, National Central University.
王士元 and 彭剛 (2007). 語言、語音與技術 [Language, Speech and Technology]. City University of Hong Kong Press, Hong Kong.
徐筱萍 (2014). "國小低年級構音/音韻障礙兒童語音清晰度、聲韻覺識與識字量之研究 [Speech Intelligibility, Phonological Awareness, and Character Recognition in Lower-Grade Elementary School Children with Articulation/Phonological Disorders]." Master's thesis, Master's Program in Speech-Language Pathology, University of Taipei.
吳威德 (2014). "使用行動裝置輔助中風患者復健 [Using Mobile Devices to Assist the Rehabilitation of Stroke Patients]." Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology.
謝國平 (2011). 語言學概論 [Introduction to Linguistics]. 三民出版社股份有限公司, Taipei, Taiwan.
鄭靜宜 (2011). 語音聲學: 說話聲音的科學 [Speech Acoustics: The Science of Speech Sounds]. 心理出版社股份有限公司, Taipei, Taiwan.
鍾榮富 (2007). 語言學概論 [Introduction to Linguistics]. 五南出版社股份有限公司, Taipei, Taiwan.
王小川 (2009). 語音訊號處理, 修訂二版 [Speech Signal Processing, 2nd revised edition]. 全華圖書股份有限公司, New Taipei City, Taiwan.
鍾玉梅 (2002). "舌根音化異常兒童之音韻處理能力探討 [Phonological Processing Abilities of Children with Backing Errors]." Master's thesis, Graduate Institute of Speech and Hearing Disorders Science, National Taipei College of Nursing.