| Graduate student: | 楊景嵐 Ching-Lan Yang |
|---|---|
| Thesis title: | 電話語音應用整合語者辨識與關鍵詞萃取 A Study on Speaker Recognition and Keyword Spotting in Telephony Integration |
| Advisor: | 莊堯棠 Yau-Tarng Juang |
| Committee members: | |
| Degree: | 碩士 Master |
| Department: | 資訊電機學院 - 電機工程學系 Department of Electrical Engineering |
| Academic year of graduation: | 92 (ROC calendar) |
| Language: | Chinese |
| Pages: | 76 |
| Keywords (Chinese): | 信任分數 (confidence score), 無關詞模組 (filler model), 關鍵詞 (keyword) |
| Keywords (English): | Filler Model, Keyword |
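The main goal of this thesis is to combine keyword spotting with speaker recognition. Because our laboratory has already accumulated considerable results in speaker recognition, this work focuses on improving the recognition accuracy of keyword spotting and on speeding up recognition.

In a system that combines keyword models with filler (non-keyword) models, the filler models strongly affect the recognition rate. We therefore varied the form of the filler models to find a better set for recognition. Experiments show that the selected filler models require less memory, and recognition speed improves accordingly.

For keyword spotting we use a two-stage recognition architecture. The first stage applies a one-stage dynamic programming algorithm to find the Top N closest candidates. The second stage checks each Top N candidate against the trained confidence-score threshold for its rank. If the candidate at a given rank is rejected, the next-ranked candidate takes its place, and so on. The best-ranked surviving candidate is taken as the spotted keyword. Experiments confirm that this method improves the overall recognition rate.

For verification, we use a method that does not require training a threshold for each subsyllable unit, so that verification systems can be built more quickly in the future.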
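The second-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `select_keyword`, `candidates`, and `rank_thresholds`, the example keywords, and all numeric scores are hypothetical. It also assumes one reading of the abstract, namely that each candidate is tested against the threshold trained for its own first-stage rank and that the first candidate to pass is returned.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical layout: (keyword, confidence score), ordered best-first
# by the first-stage (one-stage dynamic programming) search.
Candidate = Tuple[str, float]


def select_keyword(
    candidates: Sequence[Candidate],
    rank_thresholds: Sequence[float],
) -> Optional[str]:
    """Return the best-ranked candidate that passes its rank's threshold.

    Assumption: rank_thresholds[k] is the trained confidence threshold
    for the candidate at first-stage rank k. A rejected candidate is
    skipped and the next-ranked one is considered in its place.
    """
    for (keyword, confidence), threshold in zip(candidates, rank_thresholds):
        if confidence >= threshold:
            return keyword
    # Every Top N candidate was rejected: report that no keyword was found.
    return None


# Illustrative values only; they are not taken from the thesis.
top_n = [("taipei", 0.41), ("taichung", 0.77), ("tainan", 0.90)]
thresholds = [0.60, 0.70, 0.80]
print(select_keyword(top_n, thresholds))  # prints "taichung"
```

In this example the top-ranked candidate falls below its threshold and is rejected, so the second-ranked candidate is promoted and accepted. Returning `None` when every candidate fails is a design choice of the sketch; the abstract does not say how that case is handled.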