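| Graduate student: | 邱俞閤 Yu-He Chiou |
|---|---|
| Thesis title: | 前瞻性語音分離與增強系統之硬體設計 Hardware Design of Advanced Voice Separation and Enhancement System |
| Advisor: | 蔡宗漢 |
| Committee members: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Publication year: | 2016 |
| Academic year of graduation: | 104 (ROC calendar) |
| Language: | Chinese |
| Pages: | 57 |
| Keywords (translated from Chinese): | blind source separation, hardware architecture, microphone array |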
Blind source separation (BSS) reconstructs individual source signals from their mixtures, here under the assumption that the mixtures are convolutive. The mixed signals are first transformed to the frequency domain with the short-time Fourier transform (STFT). Because the sources are sparse in the time-frequency domain, the time-frequency points can be clustered according to their spatial features; in general, the phase difference and the level ratio of each source between the two microphones serve as these features. The proposed system is a smart electronic system that, without knowing the source positions, separates the signals with binary time-frequency masking based on the sparsity of their time-frequency distribution, and thereby suppresses noise.
The system targets phone calls made while driving. Holding a phone while driving is dangerous and illegal, and hands-free solutions such as car phone headsets address this problem. However, in-vehicle calls are often corrupted by background noise such as car horns and highway noise, which degrades call quality and distracts the driver. As car phone headsets become more widespread, we develop a dual-microphone noise-suppression system that separates speech from noise.
We provide a complete system-level solution covering both the algorithm and its VLSI implementation. The design is implemented in the TSMC 90 nm process and operates at 10 MHz. Excluding memory, the gate count is about 119.71K; power consumption is about 2.92 mW, and memory usage is 69 Kbits.
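The separation idea described in the abstract (STFT, two-microphone spatial features, clustering, binary masking) can be illustrated with a minimal software sketch. This is a DUET-style illustration under stated assumptions, not the thesis's algorithm or hardware architecture: the frame length (256), hop size (128), plain k-means clustering, its deterministic initialisation, the energy threshold, and all function names are hypothetical choices made for this example. The phase difference is divided by the bin frequency so that it reads as a delay in samples.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def spatial_features(X1, X2, n_fft=256, eps=1e-12):
    """Per time-frequency point: log level ratio and delay (in samples)."""
    level = np.log((np.abs(X2) + eps) / (np.abs(X1) + eps))
    omega = 2 * np.pi * np.arange(X1.shape[1]) / n_fft  # rad/sample for each bin
    omega[0] = np.inf                                    # DC bin: delay forced to 0
    delay = -np.angle(X2 * np.conj(X1)) / omega          # phase difference -> delay
    return np.stack([level.ravel(), delay.ravel()], axis=1)

def nearest(points, centroids):
    """Index of the closest centroid (squared Euclidean distance) per point."""
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(points, k, n_iter=20):
    """Plain k-means. ASSUMPTION: centroids start at evenly spaced ranks
    of the level-ratio feature, which makes the result deterministic."""
    order = np.argsort(points[:, 0])
    centroids = points[order[np.linspace(0, len(points) - 1, k).astype(int)]].copy()
    for _ in range(n_iter):
        labels = nearest(points, centroids)
        for j in range(k):
            if np.any(labels == j):  # keep the old centroid if a cluster is empty
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids

def binary_masks(X1, X2, n_sources=2, n_fft=256, rel_threshold=0.01):
    """One boolean mask per source; each point belongs to exactly one mask."""
    feats = spatial_features(X1, X2, n_fft)
    power = (np.abs(X1) ** 2 + np.abs(X2) ** 2).ravel()
    active = power > rel_threshold * power.max()  # cluster only energetic points
    centroids = kmeans(feats[active], n_sources)
    labels = nearest(feats, centroids).reshape(X1.shape)
    return np.stack([labels == j for j in range(n_sources)])

# Synthetic two-microphone mixture (hypothetical test signal): two tones that
# do not overlap in frequency, each with its own gain and delay at microphone 2.
fs, n = 8000, 4096
t = np.arange(n)
s1 = np.sin(2 * np.pi * 500 * t / fs)                  # falls on STFT bin 16
s2 = np.sin(2 * np.pi * 2000 * t / fs)                 # falls on STFT bin 64
s2_delayed = np.sin(2 * np.pi * 2000 * (t - 1) / fs)   # one-sample delay
x1 = s1 + s2
x2 = 0.5 * s1 + 2.0 * s2_delayed

X1, X2 = stft(x1), stft(x2)
masks = binary_masks(X1, X2)
separated = masks * X1  # masked spectra, shape (sources, frames, bins)
```

In this toy case the two tones occupy different bins, so the sparsity assumption holds exactly and each mask keeps one tone. Real speech overlaps only partially in the time-frequency plane, and a time-domain output would additionally need an inverse STFT, which is omitted here.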