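A System for Keyword Spotting and Speaker Recognition (關鍵詞萃取及語者辨識系統之研製) | National Central University Thesis and Dissertation System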


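Graduate student:	Yan-Hsing Tsai (蔡炎興)
Thesis title:	A System for Keyword Spotting and Speaker Recognition (關鍵詞萃取及語者辨識系統之研製)
Advisor:	Yau-Tarng Juang (莊堯棠)
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Academic year of graduation:	91 (ROC calendar; 2002-2003)
Language:	Chinese
Number of pages:	83
Chinese keywords:	語者識別 (speaker identification), 關鍵詞萃取 (keyword spotting), 語音辨識 (speech recognition)
English keywords:	Speaker Recognition, Keyword Spotting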

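This thesis improves on earlier keyword spotting and keyword verification techniques and combines them with speaker recognition to build a complete system. The work has three main parts. For keyword spotting, both the keyword models and the filler (non-keyword) models are built from subsyllable models, so that the resulting system is more portable. In addition to finding the best pairing of mixture counts for the keyword and filler models, we apply the generalized confidence score (GCS) recognition algorithm to keyword spotting, which gives the recognizer partial rejection capability at recognition time, and then use a concept similar to beam search to speed up the system. For keyword verification, we likewise use subsyllable models for hypothesis testing, and we propose a method that does not require training a threshold for each subsyllable, so that future verification systems can be built more quickly. Finally, we briefly describe how we use speaker recognition. Building on earlier speaker identification work [27], we construct a system and wrap our speech recognition core with the MFC and SDK of Visual C++ to provide a windowed user interface, so that our methods can be tested online in real time.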
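As a rough illustration of subsyllable-level hypothesis testing with a single shared threshold, the sketch below scores a keyword hypothesis by a frame-normalized log-likelihood ratio. This is not the method of the thesis, which is not specified in this record. The `Segment` structure, the use of an alternative (anti or filler) model score, the averaging rule, and the default threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One subsyllable segment of a keyword hypothesis (hypothetical structure)."""
    target_loglik: float  # log-likelihood under the subsyllable model (H0)
    anti_loglik: float    # log-likelihood under an alternative model (H1), assumed available
    num_frames: int       # segment duration in frames


def segment_llr(seg: Segment) -> float:
    """Frame-normalized log-likelihood ratio for one segment."""
    if seg.num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (seg.target_loglik - seg.anti_loglik) / seg.num_frames


def verify_keyword(segments: List[Segment], threshold: float = 0.0) -> bool:
    """Accept the keyword if the mean normalized LLR reaches one shared threshold.

    A single threshold is used for every subsyllable (an assumption of this
    sketch), so no per-subsyllable threshold has to be trained.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    score = sum(segment_llr(s) for s in segments) / len(segments)
    return score >= threshold


accepted = verify_keyword([Segment(-50.0, -60.0, 10), Segment(-40.0, -44.0, 8)])
rejected = verify_keyword([Segment(-60.0, -50.0, 10)])
```

Normalizing by duration keeps long segments from dominating the score, which is one common reason a single threshold can be shared across units. Whether the thesis uses this normalization is not stated here.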

Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1 Research Motivation
2 Research Objectives
3 Thesis Outline
Chapter 2 Basic Speech Recognition Techniques
1 Feature Parameter Extraction
2 Hidden Markov Models
3 Acoustic Models
4 Model Training and Parameter Estimation
4.1 Training Algorithm
4.2 Training Flowchart
Chapter 3 Keyword Spotting and Verification
1 Overview
2 Keyword Spotting Architecture
2.1 Keyword Models
2.2 Filler Models
2.3 Arrangement of the Recognition Models
3 Recognition Algorithm
4 Recognition Procedure
5 Generalized Confidence Score
6 Keyword Verification
6.1 Verification Procedure
6.2 Subsyllable Hypothesis Testing
6.3 Error Rate Calculation
7 System Speed-up
Chapter 4 Speaker Recognition and Verification
1 Speaker Recognition
2 Speaker Verification
2.1 Speaker Model
2.2 Global Speaker Model
Chapter 5 Experiments and Results
1 Experimental Environment
2 Keyword Spotting Experiments
2.1 Effect of the Number of Mixtures on the Recognition Rate
2.2 Generalized Confidence Score (GCS)
3 Keyword Verification Experiments
3.2 Keyword Verification
4 System Speed-up
5 System Implementation
Chapter 6 Conclusions and Future Work
1 Conclusions
2 Future Work
References

                                

[1] M.-W. Koo, C.-H. Lee, and B.-H. Juang, “Speech Recognition and Utterance Verification Based on a Generalized Confidence Score,” IEEE Trans. on Speech and Audio Processing, vol. 9, no. 8, Nov. 2001.
[2] Chi-Min Liu, Chin-Chih Chiu, and Hung-Yuan Chang, “Design of Vocabulary-Independent Mandarin Keyword Spotters,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 4, July 2000.
[3] Qi Li, B.-H. Juang, Qiru Zhou, and C.-H. Lee, “Automatic Verbal Information Verification for User Authentication,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 5, Sep. 2000.
[4] T. Kawahara, C.-H. Lee, and B.-H. Juang, “Flexible Speech Understanding Based on Combined Key-Phrase Detection and Verification,” IEEE Trans. on Speech and Audio Processing, vol. 6, no. 6, Nov. 1998.
[5] B. H. Juang, “The past, present, and future of speech processing,” IEEE Trans. on Signal Processing, pp. 24-28, May 1998.
[6] M. G. Rahim, C.-H. Lee, and B.-H. Juang, “Discriminative Utterance Verification for Connected Digits Recognition,” IEEE Trans. on Speech and Audio Processing, vol. 5, no. 3, May 1997.
[7] D. Burshtein, “Robust parametric modeling of duration in hidden Markov models,” IEEE Trans. on Speech and Audio Processing, vol. 4, pp. 240-242, May 1996.
[8] H. Ney, “The use of a one stage dynamic programming algorithm for connected word recognition,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, April 1984.
[9] S. E. Levinson, L. R. Rabiner, and M. M. Sondhi, “An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition,” The Bell System Technical Journal, vol. 62, no. 4, April 1983.
[10] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypotheses,” Phil. Trans. R. Soc. A, vol. 231, pp. 289-337, 1933.
[11] Lin Xin and Bing-Xi Wang, “Utterance Verification for Spontaneous Mandarin Speech Keyword Spotting,” IEEE Proceedings ICII 2001, Beijing, vol. 3, pp. 397-401.
[12] Myoung-Wan Koo and Sun-Jeong Lee, “An Utterance Verification System Based on Subword Modeling for a Vocabulary Independent Speech,” Eurospeech, 1999.
[13] N. Moreau and D. Jouvet, “Use of a Confidence Measure Based on Frame Level Likelihood Ratios for the Rejection of Incorrect Data,” Eurospeech, 1999.
[14] Tatsuya Kawahara, C.-H. Lee, and B.-H. Juang, “Combining Key-Phrase Detection and Subword-Based Verification for Flexible Speech Understanding,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Munich, Germany, May 1997, pp. 1159-1162.
[15] L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989.
[16] L. R. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, New Jersey, 1993.
[17] John R. Deller, Jr., John G. Proakis, and John H. L. Hansen, “Discrete-Time Processing of Speech Signals,” 1987.
[18] L. R. Rabiner and R. W. Schafer, “Digital Processing of Speech Signals,” Prentice-Hall Co. Ltd, 1978.
[19] J. T. Tou, “Pattern Recognition Principles,” Addison-Wesley, 1974.
[20] J. Neyman and E. S. Pearson, “On the use and interpretation of certain test criteria for purposes of statistical inference,” Biometrika, pt. I, vol. 20A, pp. 175-240, 1928.
[21] Y. Zhang, D. Zhang, and Z. Shu, “A novel text-independent speaker verification method based on the global speaker model,” IEEE Trans. on Systems, Man, and Cybernetics, vol. 30, no. 5, pp. 598-602, 2000.
[22] Chi-Shi Liu, Hsiao-Chuan Wang, and Chin-Hui Lee, “Speaker verification using normalized log-likelihood score,” IEEE Trans. on Speech and Audio Processing, pp. 57-60, Jan. 1996.
[23] E. Rosenberg, J. DeLong, C. H. Lee, B. H. Juang, and F. K. Soong, “The Use of Cohort Normalized Scores for Speaker Recognition,” Proc. ICSLP 92, Banff, pp. 599-602, Oct. 1992.
[24] 蔡永琪, “Keyword Recognition Based on Subsyllable Units” (基於次音節單元之關鍵詞辨識), Master's thesis, National Central University, June 1995.
[25] 黃國彰, “A Study of Keyword Spotting and Verification” (關鍵詞萃取與確認之研究), Master's thesis, National Central University, June 1996.
[26] 王維邦, “Research and Development of a Keyword Spotting System for Continuous Mandarin Speech” (連續國語語音關鍵詞萃取系統之研究與發展), Master's thesis, National Central University, June 1997.
[27] 吳金池, “A Study of Speaker Recognition Systems” (語者辨識系統之研究), Master's thesis, National Central University, May 2001.
