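Graduate student: 賴易烽
Yi-fong Lai
Thesis title: 粒子群演算法應用於語者確認系統之研究
PSO Algorithm for Speaker Verification
Advisor: 莊堯棠
Yau-tarng Juang
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Academic year of graduation: 100
Language: Chinese
Number of pages: 61
Chinese keywords: 語者確認 (speaker verification), 粒子群最佳化方法 (particle swarm optimization)
Foreign-language keywords: Particle swarm optimization, Speaker verification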
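  • This thesis applies particle swarm optimization (PSO) to two parts of a speaker verification system: vector quantization and the selection of support vector machine (SVM) parameters. PSO is an optimization method that simulates the foraging behavior of bird flocks or fish schools. During the search, the position of each particle represents one candidate solution. Each particle's direction of movement is determined by its own experience and by the experience of the swarm, and positions are updated iteratively to search for the best solution. PSO is easy to implement, keeps a memory of past solutions, and searches in a distributed manner.
    In conventional speaker verification systems, vector quantization mostly uses the LBG algorithm, which often converges to a local optimum during iteration. This thesis therefore exploits the global search ability of PSO. The experimental results show a smaller mean square error than the LBG algorithm and faster convergence, confirming the effectiveness of the proposed method.
    In addition, when the SVM model of a speaker verification system is trained, the choice of kernel function and its parameter values strongly affects the training result. This thesis uses PSO to obtain the best parameter values. The experimental results show that, compared with the conventional grid search for the best parameter values, the proposed method improves the equal error rate and the decision cost function by 2.26% and 0.0275, respectively.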
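
    The particle update described in this abstract can be illustrated with a minimal sketch of a basic inertia-weight PSO. This is not the thesis implementation: the function name `pso_minimize`, the swarm size, the iteration count, the coefficients `w`, `c1`, and `c2`, and the `sphere` test objective are all assumptions chosen only for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside a box using a basic inertia-weight PSO.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Each particle's position is one candidate solution.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests hold each particle's own experience.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The global best holds the experience of the whole swarm.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

    For codebook design, a particle would presumably encode a whole codebook and the objective would be the quantization mean square error. For SVM tuning, a particle would encode the kernel parameters and the objective would be a validation error. Both mappings are assumptions about how such a loop is typically applied, not details taken from the abstract.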


    This thesis uses the particle swarm optimization (PSO) algorithm to develop a vector quantization (VQ) algorithm and to determine the parameters of a support vector machine (SVM). PSO simulates social behavior, such as birds flocking toward a promising position, to achieve precise objectives in a multi-dimensional space. It searches using a population (called a swarm) of individuals (called particles) that are updated from iteration to iteration.
    Vector quantization is a powerful technique in digital speech compression. The widely used traditional method, the Linde-Buzo-Gray (LBG) algorithm, often generates a locally optimal codebook. This thesis uses the PSO algorithm to design the VQ codebook. Experimental results show that the PSO algorithm provides a better codebook, with a smaller mean square error (MSE) and less computation time than the LBG algorithm.
    In an SVM, the classification model is generated by a training process on the training data, and classification is then performed with the trained model. The main problems in setting up an SVM model are how to select the kernel function and its parameter values. This thesis proposes a method that uses the PSO algorithm to determine the SVM parameters. Experimental results show that the proposed system obtains improvements of 2.26% in equal error rate (EER) and 0.0275 in decision cost function (DCF) over the system that uses grid search.
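
    The two reported metrics, EER and DCF, can be computed from verification scores roughly as in the sketch below. It is an illustration rather than the evaluation code used in the thesis. The function names and the toy scores are hypothetical, and the cost parameters `c_miss=10`, `c_fa=1`, and `p_target=0.01` are assumed defaults that the abstract does not state.

```python
def error_rates(target_scores, impostor_scores, threshold):
    """Return (P_miss, P_fa) when a trial is accepted if score >= threshold."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return p_miss, p_fa


def candidate_thresholds(target_scores, impostor_scores):
    """Every observed score, plus +inf so that 'reject everything' is covered."""
    return sorted(set(target_scores) | set(impostor_scores)) + [float("inf")]


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER at the threshold where P_miss and P_fa are closest."""
    best_gap, best_eer = None, None
    for t in candidate_thresholds(target_scores, impostor_scores):
        p_miss, p_fa = error_rates(target_scores, impostor_scores, t)
        gap = abs(p_miss - p_fa)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (p_miss + p_fa) / 2.0
    return best_eer


def min_dcf(target_scores, impostor_scores,
            c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum detection cost over all thresholds (cost values are assumptions)."""
    costs = []
    for t in candidate_thresholds(target_scores, impostor_scores):
        p_miss, p_fa = error_rates(target_scores, impostor_scores, t)
        costs.append(c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target))
    return min(costs)


# Toy scores, invented for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, impostors)
dcf = min_dcf(targets, impostors)
```

    Lower values are better for both metrics, so an improvement over grid search corresponds to a reduction in each.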

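    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Motivation
    1.2 Overview of Speaker Recognition
    1.3 Research Direction
    1.4 Literature Review
    1.5 Chapter Outline
    Chapter 2 Particle Swarm Optimization
    2.1 Introduction
    2.2 Basic Model of the Particle Swarm Algorithm
    2.3 Constant Inertia Weight
    2.4 Linearly Decreasing Inertia Weight
    2.5 Maximum Velocity Method
    Chapter 3 Particle Swarm Optimization Applied to Vector Quantization
    3.1 Vector Quantization
    3.2 Codebook Design with PSO
    Chapter 4 Particle Swarm Optimization Applied to Support Vector Machines
    4.1 Support Vector Machines
    4.1.1 Linear SVM Classifier
    4.1.2 The Non-Separable Case
    4.1.3 Kernel Functions
    4.2 Determining SVM Parameters with PSO
    4.3 Speaker Model Training with Support Vector Machines
    Chapter 5 Experiments and Discussion
    5.1 Speech Database
    5.2 Performance Evaluation of Speaker Verification
    5.3 PSO-VQ Experiments
    5.3.1 Experiment 1: Vector Quantization Using the PSO Algorithm
    5.3.2 Experiment 2: Comparison of Convergence Speed
    5.4 PSO-SVM Experiments
    5.4.1 Experiment 3: Using PSO to Determine SVM Parameters
    Chapter 6 Conclusions and Future Work
    6.1 Conclusions
    6.2 Future Work
    References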

