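Application of a Fast Algorithm to Large-Vocabulary Keyword Spotting | National Central University Electronic Theses and Dissertations System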

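Graduate student:	楊鎮光 Zhen-Guang Yang
Thesis title:	Application of a Fast Algorithm to Large-Vocabulary Keyword Spotting (快速演算法在大字彙關鍵詞萃取上的應用)
Advisor:	莊堯棠 Yau-Tarng Juang
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Graduation academic year:	89 (ROC calendar)
Language:	Chinese
Number of pages:	43
Keywords (Chinese):	CMS, tree structure, keyword spotting, fast algorithm, Cepstrum Weighting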

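In conventional whole-word-based keyword spotting systems, recognition performance often degrades as the keyword vocabulary grows: the recognition rate drops and the recognition time increases. The fast algorithm classifies and structures the keywords according to the correlations in their lexical structure, so that a tree-structured search can greatly reduce recognition time while the recognition rate holds at a steady level as the vocabulary grows. This is the purpose of applying the fast algorithm to large-vocabulary keyword spotting.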
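In practice, each keyword is first divided into several sub-parts (subsets), and the sub-parts of different keywords contain the same common subwords, much like the branches of a tree. Once the N best common subwords have been recognized, the search space can be narrowed: keywords that cannot be selected are discarded, and a final verification is run only on the keywords with higher similarity. This is what makes the search fast.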
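The pruning step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the keyword list, the split into subwords, the scores, and the function names `build_subword_index` and `prune_candidates` are all hypothetical, and the sketch assumes that keywords are grouped by their first subword only.

```python
from collections import defaultdict


def build_subword_index(keywords):
    """Group keywords by their first subword (assumed grouping rule)."""
    index = defaultdict(list)
    for word, subwords in keywords.items():
        index[subwords[0]].append(word)
    return index


def prune_candidates(index, subword_scores, n_best):
    """Keep only keywords whose shared subword is among the N best scores."""
    ranked = sorted(subword_scores, key=subword_scores.get, reverse=True)
    survivors = []
    for subword in ranked[:n_best]:
        survivors.extend(index.get(subword, []))
    return survivors


# Hypothetical vocabulary: each keyword is split into subwords.
keywords = {
    "taipei": ("tai", "pei"),
    "taichung": ("tai", "chung"),
    "tainan": ("tai", "nan"),
    "kaohsiung": ("kao", "hsiung"),
    "hsinchu": ("hsin", "chu"),
}

# Hypothetical log-likelihoods from a first recognition pass over subwords.
subword_scores = {"tai": -10.0, "kao": -25.0, "hsin": -40.0}

index = build_subword_index(keywords)
candidates = prune_candidates(index, subword_scores, n_best=1)
print(candidates)
```

Only the surviving candidates would then go through the more expensive final verification, so the cost of that stage depends on the size of the pruned set rather than on the full vocabulary.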
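Besides the algorithm itself, the thesis also experiments with several schemes for raising the recognition rate. The first applies a reducing weight to the probability that the filler (non-keyword) models assign to the speech features, so that the segmentation of keywords becomes more accurate. The second uses a dynamic weight, so that each test utterance gets its own best reducing weight. In addition, because the test and training corpora were collected in different environments (telephone and microphone recordings, respectively), we process both the training and the test corpora with CMS plus cepstrum weighting and retrain the sub-syllable models. Finally, the probabilities obtained before and after this processing (that is, without and with CMS and cepstrum weighting) are combined, and the best mixing ratio is found experimentally. The results show that combining the dynamic weight with the mixed probabilities gives the best Top-1 recognition rate, 91.32%, whereas using only a single weight performs worst, with a Top-1 rate of 83.67%.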
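The feature processing and score mixing mentioned above can be sketched as follows. The abstract does not give the exact weighting function or the mixing formula, so this sketch makes two assumptions: the cepstrum weighting is a raised-sine lifter, a common textbook choice, and the two scores are mixed by linear interpolation of log-likelihoods with a ratio `alpha`. All function names are hypothetical.

```python
import math


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance."""
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]


def raised_sine_lifter(dim):
    """Raised-sine weights 1 + (L/2) sin(pi k / L) for k = 1..L (assumed form)."""
    return [1.0 + (dim / 2.0) * math.sin(math.pi * (k + 1) / dim)
            for k in range(dim)]


def apply_weighting(frames, weights):
    """Scale each cepstral coefficient by its weight."""
    return [[c * w for c, w in zip(frame, weights)] for frame in frames]


def mix_scores(logp_plain, logp_processed, alpha):
    """Interpolate the scores from the unprocessed and processed features."""
    return alpha * logp_plain + (1.0 - alpha) * logp_processed


# Toy utterance: two frames of two cepstral coefficients each.
frames = [[1.0, 2.0], [3.0, 6.0]]
normalized = cepstral_mean_subtraction(frames)
weighted = apply_weighting(normalized, raised_sine_lifter(2))

# Hypothetical log-likelihoods from the two model sets, mixed with alpha = 0.25.
combined = mix_scores(-100.0, -80.0, alpha=0.25)
```

In this form the mixing ratio `alpha` is a single scalar, so the best value can be found by sweeping it over a held-out set, which matches the experimental search for the mixing ratio that the abstract describes.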
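To make the keyword spotting system more complete, it needs the ability to reject keywords. In the experiments, the accuracy after adding keyword rejection is 81.51%.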
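The table of contents mentions an anti-model and a per-keyword threshold τk for rejection, but the abstract does not give the decision rule. The sketch below assumes a common form of such a test: a frame-normalized log-likelihood ratio between the keyword model and its anti-model, compared with the threshold. The function name and all numbers are hypothetical.

```python
def accept_keyword(logp_keyword, logp_anti, n_frames, threshold):
    """Accept the hypothesis if the normalized log-likelihood ratio
    between the keyword model and its anti-model reaches the threshold."""
    llr = (logp_keyword - logp_anti) / n_frames
    return llr >= threshold


# Hypothetical scores for two 100-frame segments and a threshold of 0.5.
print(accept_keyword(-500.0, -560.0, 100, 0.5))  # ratio 0.6: accepted
print(accept_keyword(-500.0, -520.0, 100, 0.5))  # ratio 0.2: rejected
```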

Chapter 1 Introduction
1 Research motivation
2 Basic definition of keyword spotting
3 Concept of the fast algorithm
4 Technical review
5 Thesis outline
Chapter 2 Basic techniques of speech recognition
1 Overview
2 Feature parameter extraction
3 Hidden Markov models
3.1 Description of hidden Markov models
Chapter 3 System architecture
1 Overview
2 Model parameters
3 Training and recognition algorithms
3.1 Training algorithm
3.2 Recognition module and recognition algorithm
Chapter 4 Fast algorithm
1 Overview
2 Fast algorithm
3 Reducing weight applied by the filler module to feature probabilities
3.1 Static reducing weight
3.2 Dynamically adjusted reducing weight
4 Two feature-processing methods: Cepstrum Mean Subtraction and Cepstrum Weighting
4.1 Cepstrum Mean Subtraction and Cepstrum Weighting
4.2 Weighted combination of the probabilities from Cepstrum+Delta Cepstrum and Cepstrum+Delta Cepstrum+CMS+Cepstrum Weighting
5 Keyword rejection capability
5.1 Principle of keyword rejection
5.2 Method for training the anti-model
5.3 Method for training the threshold τk
5.4 Keyword rejection algorithm
5.5 Error rate calculation
Chapter 5 Experiments and results
1 Overview
2 Experimental environment
3 Large-vocabulary keyword spotting experiments
Chapter 6 Conclusion
1 Conclusion
2 Future work
References

                                

