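Author: Hou-Jyun Chen (陳厚君)
Thesis title: An Empirical Mode Decomposition Method To Speech Recognition (經驗模態分解法之語音辨識)
Advisor: Yau-Tarng Juang (莊堯棠)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Pages: 62
Chinese keywords: 語音辨識 (speech recognition), 經驗模態分解法 (empirical mode decomposition)
Foreign-language keywords: Empirical Mode Decomposition Method, Speech Recognition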
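  • Abstract
    This thesis focuses on the analysis and processing of speech signals. It builds on empirical mode decomposition (EMD), a data-processing method published by N. E. Huang (黃鍔) et al. The method uses the intrinsic time scales of a system's variation to extract energy directly, and it expresses the data as intrinsic mode functions (IMFs). These functions serve as a basis for the original input signal and are complete, nearly orthogonal, and adaptive. Because the basis is adaptive, it can express the physical characteristics of the original function, which makes the method suitable for nonlinear and non-stationary time series. Speech is itself a nonlinear time series, and knowing the characteristics of both the spoken text and the speaker helps speech recognition, so a basis that expresses the physical characteristics of the input signal should further help in training speech models. Improving on conventional signal analysis so that the signal reveals its own characteristics is therefore a major topic of this research.
    The thesis uses EMD to find the input basis functions most closely related to the textual content, trains a set of models on them, and aims to obtain a better recognition rate in the recognition process.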
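    The decomposition the abstract describes can be sketched in a few lines. EMD writes a signal as the sum of its IMFs plus a residue, x(t) = c_1(t) + ... + c_n(t) + r_n(t), and obtains each IMF by repeatedly subtracting the mean of the upper and lower envelopes (the sifting step). The sketch below is illustrative only and is not the thesis's implementation: the function names, the stopping tolerance, and the two-tone test signal are assumptions, and the envelopes use piecewise-linear interpolation where EMD is usually described with cubic splines.

```python
import math

def local_extrema(x):
    """Return index lists of the interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: cubic
    splines are the usual choice). Both endpoints are used as extra knots."""
    knots = [0] + idx + [len(x) - 1]
    env, k = [], 0
    for i in range(len(x)):
        while knots[k + 1] < i:
            k += 1
        a, b = knots[k], knots[k + 1]
        w = (i - a) / (b - a)
        env.append((1 - w) * x[a] + w * x[b])
    return env

def sift(x, max_iter=50, tol=1e-3):
    """Extract one IMF candidate by repeatedly removing the envelope mean."""
    h = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 3:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]
        # Relative energy of the removed mean; a small value means h
        # is already nearly symmetric about zero (hypothetical criterion).
        change = sum(m * m for m in mean) / (sum(v * v for v in h) + 1e-12)
        h = [v - m for v, m in zip(h, mean)]
        if change < tol:
            break
    return h

def emd(x, max_imfs=8):
    """Decompose x into a list of IMFs and a final residue."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 3:
            break  # too few oscillations left to extract another IMF
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue

# Hypothetical test signal: two tones sampled at 8 kHz.
t = [i / 8000 for i in range(800)]
signal = [math.sin(2 * math.pi * 200 * ti) + 0.5 * math.sin(2 * math.pi * 1500 * ti)
          for ti in t]
imfs, residue = emd(signal)
print(len(imfs), "IMFs extracted")
```

    Because each IMF is subtracted from the running residue, summing the IMFs and the final residue reproduces the input up to floating-point error, which is the completeness property mentioned in the abstract. Selecting or combining individual IMFs as recognizer input, as the thesis does, would happen after this step.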


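    Table of Contents
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Research Objectives
      1.3 Chapter Overview
    Chapter 2 Basic Speech Recognition Techniques
      2.1 Feature Parameter Extraction
      2.2 Hidden Markov Models
      2.3 Model Construction and Training
        2.3.1 Viterbi Search Algorithm
        2.3.2 Training Procedure
      2.4 Continuous Speech Recognition Methods
      2.5 Recognition Procedure
    Chapter 3 Hilbert-Huang Transform Theory
      3.1 Instantaneous Frequency
      3.2 Intrinsic Mode Functions
      3.3 Empirical Mode Decomposition
      3.4 Hilbert-Huang Spectrum
      3.5 Applying Empirical Mode Decomposition to Speech Recognition
    Chapter 4 Experiments and Discussion
      4.1 Experimental Environment
      4.2 Experiments and Discussion
        4.2.1 Experiment 1: Comparison of Recognition Rates of Intrinsic Mode Functions
        4.2.2 Experiment 2: Comparison of Recognition Rates of Combined Intrinsic Mode Functions
        4.2.3 Experiment 3: Comparison of Intrinsic Mode Functions in Continuous Digit Recognition
        4.2.4 Experiment 4: Comparison of Recognition Rates of Intrinsic Mode Functions in Noisy Environments
    Chapter 5 Conclusions and Outlook
      5.1 Conclusions
      5.2 Future Work
    References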

    References
    [1] E. Bedrosian, “A product theorem for Hilbert transform,” Proc. IEEE 51, pp. 868-869, 1963.
    [2] L. Cohen, Time-frequency analysis, Englewood Cliffs, NJ: Prentice-Hall, 1995.
    [3] J. R. Deller, Jr., J. G. Proakis, J. H. L. Hansen, Discrete-time processing of speech signals, Wiley-IEEE Press, 2000.
    [4] D. Gabor, “Theory of communication,” Proc. IEE 93, pp. 429-457, 1946.
    [5] N. E. Huang, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis,” NASA (manuscript), pp. 903-995, 1996.
    [6] B. H. Juang, “The past, present, and future of speech processing,” IEEE Trans. Signal Processing, pp. 24-28, May 1998.
    [7] T. Kawahara, C. H. Lee, and B. H. Juang, “Flexible speech understanding based on combined key-phrase detection and verification,” IEEE Trans. Speech and Audio Processing, vol. 6, Nov. 1998.
    [8] M. W. Koo and S. J. Lee, “An utterance verification system based on subword modeling for a vocabulary independent speech,” Eurospeech 1999.
    [9] M. W. Koo, C. H. Lee, and B. H. Juang, “Speech recognition and utterance verification based on a generalized confidence score,” IEEE Trans. on Speech and Audio Processing, vol. 9, Nov. 2001.
    [10] C. M. Liu, C. C. Chiu, and H. Y. Chang, “Design of vocabulary-independent Mandarin keyword spotters,” IEEE Trans. Speech and Audio Processing, vol. 8, July 2000.
    [11] Q. Li, B. H. Juang, Q. Zhou, and C. Lee, “Automatic verbal information verification for user authentication,” IEEE Trans. Speech and Audio Processing, vol. 8, Sep. 2000.
    [12] S. E. Levinson, L. R. Rabiner, and M. M. Sondhi, “An introduction to the application of the theory of probabilistic function of a Markov process to automatic speech recognition,” The Bell System Technical Journal, vol. 62, April 1983.
    [13] C. S. Liu, H. C. Wang, and C. H. Lee, “Speaker verification using normalized log-likelihood score,” IEEE Trans. Speech and Audio Processing, pp. 57-60, Jan. 1996.
    [14] N. Moreau and D. Jouvet, “Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data,” Eurospeech, 1999.
    [15] H. Ney, “The use of a one stage dynamic programming algorithm for connected word recognition,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 32, April 1984.
    [16] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypotheses,” Phil. Trans. R. Soc. A, vol. 231, pp. 289-337, 1933.
    [17] J. Neyman and E. S. Pearson, “On the use and interpretation of certain test criteria for purposes of statistical inference,” Biometrika, pt I, vol. 20A, pp. 175-240, 1928.
    [18] M. G. Rahim, C. H. Lee, and B. H. Juang, “Discriminative utterance verification for connected digits recognition,” IEEE Trans. Speech and Audio Processing, vol. 5, no. 3, May 1997.
    [19] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989.
    [20] L. R. Rabiner and B. H. Juang, Fundamentals of speech recognition, Prentice Hall, New Jersey, 1993.
    [21] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall Co. Ltd, 1978.
    [22] E. Rosenberg, J. Delong, C. H. Lee, B. H. Juang, and F. K. Soong, “The use of cohort normalized scores for speaker recognition,” Proc. ICSLP 92, pp. 599-602, Oct. 1992.
    [23] J. T. Tou, Pattern recognition principles, Addison-Wesley, 1974.
    [24] L. X. and B. X. Wang, “Utterance verification for spontaneous Mandarin speech keyword spotting,” IEEE Proceedings ICII 2001, vol. 3, pp. 397-401.
    [25] Y. Zhang, D. Zhang, and Z. Shu, “A novel text-independent speaker verification method based on the global speaker model,” IEEE Trans. Systems, Man, and Cybernetics, vol. 30, pp. 598-602, 2000.
