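| Graduate student: | 林志榮 Zhe-Run Lin |
|---|---|
| Thesis title: | 結合隱藏式馬可夫模型與類神經網路之國語語音辨識 (Mandarin Speech Recognition Combining Hidden Markov Models and Neural Networks) |
| Advisor: | 莊堯棠 Yau-Tarng Juang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Graduation academic year: | 88 |
| Language: | Chinese |
| Number of pages: | 48 |
| Keywords (Chinese): | hidden Markov model, neural network model, speaker-dependent system, speaker-independent system |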
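In the speaker-dependent system, the recognition rates of all three systems are above 90 percent. The HMM-NN-Net and NN-NN-Net state models even reach 100 percent, and after the convergence criterion is suitably adjusted, the recognition rate of the HMM-NN-Net state model exceeds that of the hidden Markov model by a small margin.
In the speaker-independent system, the HMM-NN-Net state model leads the other models with a recognition rate of 94.25 percent, which further demonstrates the feasibility of the new method. In addition, a comparison of the HMM-NN-Net and NN-NN-Net state models is used to give a complete analysis of the neural network convergence problem.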
Hidden Markov models (HMMs) have been widely used for speech recognition and have proved useful in dealing with the statistical and sequential aspects of the speech signal. However, their discriminative power is weak when they are trained with the maximum likelihood criterion. Neural networks (NNs), on the other hand, have powerful classification capability but are not well suited to time-varying input patterns. This study presents a hybrid HMM-NN speech recognition system that combines the advantages of both models. Three neural net state models, HMM-NN-Net, HMM-HMM-Net and NN-NN-Net, are developed for the proposed hybrid HMM-NN system. All experimental results are compared with those obtained from the HMM.
In the speaker-dependent experiment, the recognition rates of all three models are above 90 percent. Furthermore, except for the HMM-HMM-Net model, all error rates approach zero after the convergence criterion is adjusted.
In the speaker-independent case, the HMM-NN-Net model achieves a recognition rate of 94.25 percent, the best performance among the models compared. The NN-NN-Net model requires less training time than the HMM-NN-Net model, although its recognition capability does not match that of the HMM-NN-Net model.
The experimental results indicate that the hybrid HMM-NN recognition system based on the HMM-NN-Net model improves on the performance of the traditional HMM system. The convergence criterion of the neural net state models is also found to be related to recognition capability.