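| Graduate student: | 張文杰 (Wen-Chieh Chang) |
|---|---|
| Thesis title: | 模型調適之語者識別系統 (Model Adaptation Based Speaker Recognition Systems) |
| Advisor: | 莊堯棠 (Yau-Tarng Juang) |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Graduation academic year: | 93 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 63 |
| Chinese keywords: | 特徵語音調適法 (eigenvoice adaptation), 回授式調適法 (feedback adaptation) |
| English keywords: | Eigenvoice, Feedback Speaker Adaptation |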
Abstract
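In this thesis, we propose a text-independent speaker recognition system based on feedback adaptation. First, a well-trained universal background model (UBM) and a small amount of adaptation data from the system user are combined through Bayesian adaptation to obtain a speaker-specific model. The model parameters are then adjusted further through feedback. Because the feedback process makes more prior parameter information available, the adaptation brings out the speaker's individual characteristics more effectively. Bayesian adaptation has a known weakness: with very little adaptation data, the Gaussian components that receive no adaptation lower the recognition performance of the system. Feedback adaptation gives the parameters of the adapted speaker model a more complete description and thereby improves the adaptation.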
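The two-stage idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the thesis's update equations, so the sketch assumes relevance-factor MAP adaptation of the GMM means only, and it models "feedback" simply as reusing the adapted means as the prior for the next pass. The function names, the relevance factor of 16, and the toy data are all hypothetical.

```python
import numpy as np

def posteriors(x, weights, means, variances):
    """Responsibility of each diagonal-covariance Gaussian for each frame."""
    diff = x[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_pdf = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_pdf                          # (T, M)
    log_post -= log_post.max(axis=1, keepdims=True)               # avoid underflow
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def map_adapt_means(x, weights, prior_means, variances, relevance=16.0):
    """One MAP pass: pull each prior mean toward the data it is responsible for."""
    post = posteriors(x, weights, prior_means, variances)
    counts = post.sum(axis=0)                                     # soft frame counts, (M,)
    data_means = (post.T @ x) / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]              # data-dependent mixing
    return alpha * data_means + (1.0 - alpha) * prior_means

def feedback_adapt(x, weights, ubm_means, variances, relevance=16.0, passes=3):
    """Assumed feedback loop: each pass reuses the previous output as its prior."""
    means = ubm_means
    for _ in range(passes):
        means = map_adapt_means(x, weights, means, variances, relevance)
    return means

# Toy data: a 2-component UBM and 20 frames that all sit near the first component.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
frames = rng.normal(loc=1.0, scale=0.3, size=(20, 2))

single_pass = map_adapt_means(frames, weights, ubm_means, variances)
fed_back = feedback_adapt(frames, weights, ubm_means, variances)
```

In this toy setup the second component receives essentially no frames, so a single pass leaves it at its UBM mean, which is the weakness the abstract points out. The repeated passes move the first component closer to the speaker's data than one pass does. How the thesis handles the unadapted components is not stated in the abstract, and this sketch does not attempt to reproduce it.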
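In addition, we use eigenvoice adaptation for speaker adaptation and build a speaker recognition system on its strong performance with small amounts of adaptation data. An eigenspace is constructed from many speaker-dependent models together with a speaker-independent model, and principal component analysis extracts the important acoustic information. The adapted speaker model is then formed as a linear combination of a small number of representative basis vectors, which achieves rapid adaptation. This method performs well not only in speech recognition but also in speaker identification. The speaker adaptation experiments in this thesis use 100 speakers, and the results show that both adaptation methods can adapt a speaker model in real time from a small amount of data and achieve excellent recognition performance.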
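The eigenvoice construction described above can be illustrated with a small sketch. It rests on several assumptions that are not in the abstract: each speaker model is flattened into a "supervector" of stacked Gaussian means, PCA is computed with an SVD, and the weights for a new speaker are fitted by ordinary least squares on the dimensions that were observed. The least-squares step is a stand-in for illustration only and is not claimed to be the estimator used in the thesis. All names and the toy data are hypothetical.

```python
import numpy as np

def build_eigenvoices(supervectors, k):
    """PCA over training-speaker supervectors; returns the mean and the top-k bases."""
    mean = supervectors.mean(axis=0)
    # Rows of vt are orthonormal principal directions, largest variance first.
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:k]

def eigenvoice_adapt(mean, eigenvoices, partial, observed):
    """Fit k weights from the observed dimensions only, then rebuild the full model."""
    a = eigenvoices[:, observed].T                 # (n_observed, k)
    b = partial[observed] - mean[observed]
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return mean + w @ eigenvoices, w

# Toy setup: 30 training speakers whose 12-dim supervectors vary along 2 directions.
rng = np.random.default_rng(1)
true_basis = rng.normal(size=(2, 12))
center = rng.normal(size=12)
train = center + rng.normal(size=(30, 2)) @ true_basis
mean, voices = build_eigenvoices(train, k=2)

# New speaker: only the first 4 dimensions were seen in the adaptation data.
new_speaker = center + np.array([0.7, -1.2]) @ true_basis
observed = np.zeros(12, dtype=bool)
observed[:4] = True
adapted, w = eigenvoice_adapt(mean, voices, new_speaker, observed)
```

Only two weights are estimated here, so a handful of observed dimensions is enough to rebuild the whole supervector, including the dimensions the adaptation data never touched. That is the reason a low-dimensional basis suits the small-data setting the abstract describes. Real speaker models do not lie exactly in a low-dimensional subspace, so the reconstruction would be approximate in practice.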