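| Graduate student: | 黃麗芳 (Li-Fang Huang) |
|---|---|
| Thesis title: | 利用快速碼簿搜尋之AMR至G.729A語音轉碼 (AMR to G.729A speech transcoding with fast codebook search) |
| Advisor: | 張寶基 (Pao-Chi Chang) |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Communication Engineering |
| Academic year of graduation: | 95 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 96 |
| Keywords (Chinese): | G.729A, AMR, 語音轉碼 (speech transcoding) |
| Keywords (English): | AMR, G.729A, Speech transcoding |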
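With the growth of networking, the Internet carries more than data: people can also use mobile communication systems to connect to IP telephony over the Internet. Because mobile communication and VoIP do not use the same speech coding techniques, speech transcoding is an indispensable mechanism in networked voice systems. The technique can also be applied to entertainment uses such as online games and voice chat rooms.
Traditionally, the best-quality speech transcoding method is full decoding, which has to decompress and then recompress the speech, so it suffers from high computational complexity and long delay. To address this, this thesis proposes a partial decoding speech transcoding method built on a fast codebook search that uses pulse replacement. Exploiting the characteristics of the speech signal, the method works frame by frame, analyzes the speech parameters needed to represent each speech segment, and achieves transcoding by converting those parameters. The resulting set of target speech parameters also conforms to the compressed format of the original compression method. The method can be applied to the AMR and G.729A speech compression standards and effectively reduces computational complexity: the clockticks required per frame are about 7.2% of those required by full decoding, while the speech quality remains close to that of full decoding.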
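The frame-by-frame parameter conversion described above has to bridge two different frame lengths. AMR uses 20 ms frames and G.729A uses 10 ms frames, and both split frames into 5 ms subframes at 8 kHz sampling, so one AMR frame covers the same time span as two G.729A frames. The sketch below illustrates only that timing alignment. The `SubframeParams` container and `regroup_amr_frame` function are hypothetical names introduced here, and the actual parameter conversion used in the thesis is not given in the abstract.

```python
from dataclasses import dataclass
from typing import List

SUBFRAME_MS = 5        # both codecs use 5 ms (40-sample) subframes at 8 kHz
AMR_FRAME_MS = 20      # one AMR frame = 4 subframes
G729A_FRAME_MS = 10    # one G.729A frame = 2 subframes


@dataclass
class SubframeParams:
    """Hypothetical container for per-subframe excitation parameters."""
    pitch_lag: int
    pitch_gain: float
    code_gain: float


def regroup_amr_frame(
    amr_subframes: List[SubframeParams],
) -> List[List[SubframeParams]]:
    """Split the subframes of one AMR frame into G.729A-sized frames, in order."""
    expected = AMR_FRAME_MS // SUBFRAME_MS
    if len(amr_subframes) != expected:
        raise ValueError(f"expected {expected} subframes, got {len(amr_subframes)}")
    per_frame = G729A_FRAME_MS // SUBFRAME_MS
    return [
        amr_subframes[i:i + per_frame]
        for i in range(0, len(amr_subframes), per_frame)
    ]
```

Calling `regroup_amr_frame` on the four subframes of one AMR frame returns two lists of two subframes each, one list per outgoing G.729A frame.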
With the development of Internet technology, we can not only transmit data but also connect 3GPP mobile systems with VoIP over the Internet. Because the coding schemes used by 3GPP differ from those used by VoIP, a speech transcoding scheme is needed in Internet voice systems. Speech transcoding allows users on the two kinds of system to connect successfully, and it can also be used in entertainment applications such as audio chat rooms and online games.
Full decoding is the intuitive, traditional speech transcoding method, but it requires high computational complexity and long processing time. In this work, we propose a partial decoding technique for the ACELP coding architecture, with a fast codebook search that uses the pulse replacement method. There is no need to redo the entire decoding and encoding process. The partial decoding method can be applied directly to ACELP-based speech coding, such as the AMR and G.729A speech standards. It achieves voice quality comparable to that of the full decoding method while requiring only 7.2% of its computational load, measured in clockticks per frame.
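The abstract does not spell out the search procedure, so the following is only a minimal sketch of a generic pulse-replacement-style search over an algebraic codebook, not necessarily the procedure used in the thesis. It assumes the G.729 track layout (four pulses in a 40-sample subframe), pulse signs preset to the sign of the correlation `d[n]`, and the usual ACELP criterion of maximizing the squared correlation divided by the energy. The names `criterion`, `pulse_replacement_search`, and `max_iters` are illustrative.

```python
from typing import List, Sequence, Tuple

# G.729-style track layout: one pulse per track in a 40-sample subframe.
TRACKS: List[List[int]] = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    list(range(3, 40, 5)) + list(range(4, 40, 5)),
]


def criterion(positions: Sequence[int],
              d: Sequence[float],
              phi: Sequence[Sequence[float]]) -> float:
    """Return (correlation)^2 / energy for pulses at the given positions.

    d[n] is the correlation between the target and the impulse response,
    phi[i][j] is the impulse-response correlation matrix. Signs are
    assumed to be preset to sign(d[n]).
    """
    signs = [1.0 if d[p] >= 0 else -1.0 for p in positions]
    num = sum(s * d[p] for s, p in zip(signs, positions)) ** 2
    den = sum(si * sj * phi[pi][pj]
              for si, pi in zip(signs, positions)
              for sj, pj in zip(signs, positions))
    return num / den if den > 0 else 0.0


def pulse_replacement_search(d: Sequence[float],
                             phi: Sequence[Sequence[float]],
                             max_iters: int = 4) -> Tuple[List[int], float]:
    """Start from the per-track maxima of |d[n]|, then repeatedly apply the
    single-pulse replacement that most improves the criterion."""
    positions = [max(track, key=lambda n: abs(d[n])) for track in TRACKS]
    best = criterion(positions, d, phi)
    for _ in range(max_iters):
        best_move = None
        for t, track in enumerate(TRACKS):
            for cand in track:
                if cand == positions[t]:
                    continue
                trial = positions[:t] + [cand] + positions[t + 1:]
                q = criterion(trial, d, phi)
                if q > best:
                    best, best_move = q, (t, cand)
        if best_move is None:
            break  # no single replacement improves the criterion
        positions[best_move[0]] = best_move[1]
    return positions, best
```

Each iteration evaluates at most 36 candidate codevectors (the 40 positions minus the 4 currently occupied), compared with 8 x 8 x 8 x 16 = 8192 combinations for an exhaustive search over the same tracks. The trade-off is that the result is a local optimum and is not guaranteed to match the exhaustive-search result.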