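| Graduate student: | 蔡炎興 Yan-Hsing Tsai |
|---|---|
| Thesis title: | 關鍵詞萃取及語者辨識系統之研製 A System for Keyword Spotting and Speaker Recognition |
| Advisor: | 莊堯棠 Yau-Tarng Juang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Academic year of graduation: | 91 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 83 |
| Chinese keywords: | 語者識別 (speaker identification), 關鍵詞萃取 (keyword spotting), 語音辨識 (speech recognition) |
| Foreign-language keywords: | Speaker Recognition, Keyword Spotting |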
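This thesis improves on earlier keyword spotting and keyword verification techniques and combines them with speaker recognition to build a complete system. The work has three main parts. For keyword spotting, both the keyword models and the filler (non-keyword) models are built from subsyllable models, so that the resulting system is more portable. Besides finding the best pairing of mixture counts for the keyword and filler models, we apply the GCS recognition algorithm to keyword spotting, which gives the recognizer a partial rejection capability at recognition time, and we then use a concept similar to beam search to speed up the system. For keyword verification, we likewise use subsyllable models for hypothesis testing, and we propose a method that does not require training a separate threshold for each subsyllable, so that future verification systems can be built more quickly. Finally, we briefly describe how we use speaker recognition. We combine earlier speaker recognition work [27] with the above to build one system, and wrap our speech recognition core with the MFC and SDK facilities of Visual C++ to provide a windowed user interface, so that the theory can be tested online in real time.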
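The abstract does not give the formulas behind the verification and speed-up steps, so the following Python sketch only illustrates the general ideas it names: a likelihood-ratio hypothesis test over subsyllable segments that uses one shared threshold, and beam-style pruning of low-scoring hypotheses. The `Segment` structure, the per-frame normalization, and every function and parameter name here are assumptions made for illustration, not the thesis's actual method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One decoded subsyllable segment of a keyword hypothesis (hypothetical structure)."""
    unit: str             # subsyllable label
    n_frames: int         # segment duration in frames, assumed > 0
    loglik_target: float  # log-likelihood under the subsyllable model
    loglik_anti: float    # log-likelihood under an alternative (anti or filler) model


def segment_llr(seg: Segment) -> float:
    """Per-frame log-likelihood ratio of one segment.

    Dividing by the duration puts every subsyllable on a comparable scale,
    which is one possible way to avoid a separate threshold per unit.
    """
    return (seg.loglik_target - seg.loglik_anti) / seg.n_frames


def verify_keyword(segments: List[Segment], threshold: float) -> bool:
    """Accept the keyword if the mean normalized ratio reaches a single shared threshold."""
    if not segments:
        return False
    score = sum(segment_llr(s) for s in segments) / len(segments)
    return score >= threshold


def beam_prune(scores: Dict[str, float], beam_width: float) -> Dict[str, float]:
    """Keep only hypotheses whose log score lies within beam_width of the best one."""
    if not scores:
        return {}
    best = max(scores.values())
    return {h: s for h, s in scores.items() if s >= best - beam_width}
```

In such a setup the single `threshold` would be tuned once on held-out data to trade false acceptances against false rejections, and `beam_width` would trade recognition speed against the risk of pruning the correct hypothesis.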
[1] M.-W. Koo, C.-H. Lee, and B.-H. Juang, “Speech Recognition and Utterance Verification Based on a Generalized Confidence Score,” IEEE Trans. on Speech and Audio Processing, vol. 9, no. 8, Nov. 2001.
[2] Chi-Min Liu, Chin-Chih Chiu, and Hung-Yuan Chang, “Design of Vocabulary-Independent Mandarin Keyword Spotters,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 4, July 2000.
[3] Qi Li, B.-H. Juang, Qiru Zhou, and C.-H. Lee, “Automatic Verbal Information Verification for User Authentication,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 5, Sep. 2000.
[4] T. Kawahara, C.-H. Lee, and B.-H. Juang, “Flexible Speech Understanding Based on Combined Key-Phrase Detection and Verification,” IEEE Trans. on Speech and Audio Processing, vol. 6, no. 6, Nov. 1998.
[5] B. H. Juang, “The past, present, and future of speech processing,” IEEE Trans. on Signal Processing, pp. 24-28, May 1998.
[6] M. G. Rahim, C.-H. Lee, and B.-H. Juang, “Discriminative Utterance Verification for Connected Digits Recognition,” IEEE Trans. on Speech and Audio Processing, vol. 5, no. 3, May 1997.
[7] D. Burshtein, “Robust parametric modeling of duration in hidden Markov models,” IEEE Trans. on Speech and Audio Processing, vol. 4, pp. 240-242, May 1996.
[8] H. Ney, “The use of a one stage dynamic programming algorithm for connected word recognition,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, April 1984.
[9] S. E. Levinson, L. R. Rabiner, and M. M. Sondhi, “An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition,” The Bell System Technical Journal, vol. 62, no. 4, April 1983.
[10] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypotheses,” Phil. Trans. R. Soc. A, vol. 231, pp. 289-337, 1933.
[11] Lin Xin and Bing-Xi Wang, “Utterance Verification For Spontaneous Mandarin Speech Keyword Spotting,” IEEE Proceedings ICII 2001, Beijing, vol. 3, pp. 397-401.
[12] Myoung-Wan Koo and Sun-Jeong Lee, “An Utterance Verification System Based on Subword Modeling For A Vocabulary Independent Speech,” Eurospeech, 1999.
[13] N. Moreau and D. Jouvet, “Use of A Confidence Measure Based on Frame Level Likelihood Ratios for The Rejection of Incorrect Data,” Eurospeech, 1999.
[14] Tatsuya Kawahara, C.-H. Lee, and B.-H. Juang, “Combining Key-Phrase Detection and Subword-Based Verification For Flexible Speech Understanding,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Munich, Germany, May 1997, pp. 1159-1162.
[15] L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989.
[16] L. R. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, New Jersey, 1993.
[17] John R. Deller, Jr., John G. Proakis, and John H. L. Hansen, “Discrete-Time Processing of Speech Signals,” 1987.
[18] L. R. Rabiner and R. W. Schafer, “Digital Processing of Speech Signals,” Prentice-Hall Co. Ltd, 1978.
[19] J. T. Tou, “Pattern Recognition Principles,” Addison-Wesley, 1974.
[20] J. Neyman and E. S. Pearson, “On the use and interpretation of certain test criteria for purposes of statistical inference,” Biometrika, pt. I, vol. 20A, pp. 175-240, 1928.
[21] Y. Zhang, D. Zhang, and Z. Shu, “A novel text-independent speaker verification method based on the global speaker model,” IEEE Trans. on Systems, Man, and Cybernetics, vol. 30, no. 5, pp. 598-602, 2000.
[22] Chi-Shi Liu, Hsiao-Chuan Wang, and Chin-Hui Lee, “Speaker verification using normalized log-likelihood score,” IEEE Trans. on Speech and Audio Processing, pp. 57-60, Jan. 1996.
[23] E. Rosenberg, J. Delong, C. H. Lee, B. H. Juang, and F. K. Soong, “The Use of Cohort Normalized Scores for Speaker Recognition,” Proc. ICSLP 92, Banff, pp. 599-602, Oct. 1992.
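[24] 蔡永琪, “基於次音節單元之關鍵詞辨識” (Keyword Recognition Based on Subsyllable Units), Master's thesis, National Central University, June 1995.
[25] 黃國彰, “關鍵詞萃取與確認之研究” (A Study of Keyword Spotting and Verification), Master's thesis, National Central University, June 1996.
[26] 王維邦, “連續國語語音關鍵詞萃取系統之研究與發展” (Research and Development of a Keyword Spotting System for Continuous Mandarin Speech), Master's thesis, National Central University, June 1997.
[27] 吳金池, “語者辨識系統之研究” (A Study of Speaker Recognition Systems), Master's thesis, National Central University, May 2001.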