| Graduate student: | 陳禹興 Yu-Shing Chen |
|---|---|
| Thesis title: | 基於廣義交互相關函數之聲源方位偵測系統 A Sound Source Localization System Based on Generalized Cross Correlation |
| Advisor: | 鍾鴻源 Hung-Yuan Chung |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Graduation academic year: | 100 |
| Language: | Chinese |
| Number of pages: | 72 |
| Keywords (Chinese): | Time delay estimation, sound source localization, generalized cross-correlation function |
| Keywords (English): | Time Delay of Arrival, Sound Source Localization, Generalized Cross Correlation |
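This thesis builds a sound source localization system around a single pair of microphones. The computation runs on an embedded system (TMS320VC5509) and comprises voice activity detection (VAD), time delay of arrival (TDOA) estimation, and azimuth estimation. The VAD stage combines log-energy analysis with spectral-entropy analysis to improve decision accuracy and reduce the total computational load. The time delay is estimated with the generalized cross-correlation (GCC) function, and parabolic interpolation is applied to refine the estimate. The azimuth is derived from the hyperbola defined by the microphone pair and the sound source; the asymptote of this hyperbola is taken as the direction of the source.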
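The two-feature VAD decision described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the two features are combined, so the energy-first gating, the function names, and both thresholds (`energy_floor_db`, `entropy_ceiling`) are hypothetical choices, and a naive DFT stands in for the FFT that a DSP implementation would use.

```python
import cmath
import math


def log_energy_db(frame):
    """Mean power of the frame in dB (a small floor avoids log(0))."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalised power spectrum."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):  # real input: non-negative frequencies suffice
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in power if p > 0.0)


def is_speech(frame, energy_floor_db=-40.0, entropy_ceiling=2.0):
    """Assumed decision rule: energy gate first, entropy check second."""
    # Cheap test first: quiet frames are rejected without computing a spectrum.
    if log_energy_db(frame) < energy_floor_db:
        return False
    # Loud frames count as speech only if their spectrum is structured
    # (low entropy) rather than flat like broadband noise.
    return spectral_entropy(frame) < entropy_ceiling
```

Checking the energy first means the spectrum is computed only for frames that are loud enough, which is one plausible way a combined detector could lower the total computation; the thresholds would need tuning on recorded data.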
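The experimental part of the thesis also discusses and compares several time delay estimation methods: the average magnitude difference function (AMDF), the least-squares method (Least Median of Square, LMS), cross-correlation (CC), and generalized cross-correlation.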
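To make two of the compared estimators concrete, the sketch below finds an integer-sample delay with plain cross-correlation (maximise the product sum) and with AMDF (minimise the mean absolute difference). The function names and the `max_lag` search range are assumptions made for illustration; they are not taken from the thesis, and both inputs are assumed to have equal length greater than `2 * max_lag`.

```python
def cross_correlation_delay(x1, x2, max_lag):
    """Lag (in samples) that maximises sum of x1[t] * x2[t + lag]."""
    idx = range(max_lag, len(x1) - max_lag)  # keeps t + lag inside the frame

    def score(lag):
        return sum(x1[t] * x2[t + lag] for t in idx)

    return max(range(-max_lag, max_lag + 1), key=score)


def amdf_delay(x1, x2, max_lag):
    """Lag (in samples) that minimises the average magnitude difference."""
    idx = range(max_lag, len(x1) - max_lag)

    def score(lag):
        return sum(abs(x1[t] - x2[t + lag]) for t in idx) / len(idx)

    return min(range(-max_lag, max_lag + 1), key=score)
```

With this sign convention a positive result means `x2` lags `x1`. AMDF uses only subtractions and absolute values, which can be attractive on fixed-point hardware, whereas cross-correlation needs multiplications.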
In this thesis, a sound source localization system is studied and implemented. On the hardware side, a microphone array composed of two microphones captures the voice signals, and the computation is carried out by an embedded system (TMS320VC5509A). On the software side, voice activity detection (VAD), time delay of arrival (TDOA) estimation, and direction detection are executed in that order. The VAD stage combines log-energy and spectral entropy to distinguish speech frames from non-speech frames. To estimate the TDOA, an approximation algorithm computes a generalized cross-correlation function, and a parabolic-interpolation-based method then increases the accuracy of the estimated TDOA values. Finally, the TDOA and the speed of sound in air are used to find the direction of the sound source.
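The chain from GCC to a direction angle can be sketched as below. Several points are assumptions rather than details from the abstract: the PHAT weighting (the abstract does not name the weighting used), the naive DFT in place of an FFT, all function names, and the default speed of sound of 343 m/s. The angle uses the asymptote of the hyperbola with foci at the two microphones, which gives $\sin\theta = c\tau/d$ for an angle $\theta$ measured from broadside, microphone spacing $d$, and delay $\tau$.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; an FFT would replace this in practice."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(x1, x2):
    """GCC with PHAT weighting; index k is a circular lag of k samples."""
    cross = [b * a.conjugate() for a, b in zip(dft(x1), dft(x2))]
    # PHAT: discard magnitude and keep only the phase of the cross-spectrum.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    return [v.real for v in dft(weighted, inverse=True)]


def parabolic_peak(r):
    """Signed lag of the peak of r, refined by a three-point parabola fit."""
    n = len(r)
    k = max(range(n), key=lambda i: r[i])
    y0, y1, y2 = r[(k - 1) % n], r[k], r[(k + 1) % n]
    denom = y0 - 2.0 * y1 + y2
    offset = 0.0 if denom == 0.0 else 0.5 * (y0 - y2) / denom
    lag = k + offset
    # Circular indices above n/2 represent negative lags.
    return lag - n if lag > n / 2 else lag


def tdoa_to_angle(delay_samples, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Angle from broadside, in degrees, using the hyperbola's asymptote."""
    ratio = speed_of_sound * (delay_samples / sample_rate_hz) / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| > 1
    return math.degrees(math.asin(ratio))
```

A positive lag from `parabolic_peak(gcc_phat(x1, x2))` means the second channel is delayed relative to the first. The asymptote approximation holds best when the source is far from the pair compared with the microphone spacing.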