| Graduate student: | 莊詠婷 Yung-ting Chuang |
|---|---|
| Thesis title: | 利用AAC壓縮域特徵之古典樂翻奏曲檢索系統 Classical Music Cover Song Retrieval System utilizing AAC Domain Features |
| Advisor: | 張寶基 Pao-chi Chang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Communication Engineering |
| Publication year: | 2013 |
| Academic year of graduation: | 101 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 68 |
| Chinese keywords (translated): | classical music, AAC, compressed domain, music retrieval |
| English keywords: | classical music cover song, AAC, compression domain, content-based music retrieval |
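With Internet and multimedia compression technologies now mature, demand for network services keeps growing, and downloading or sharing audio and video over the network has become part of everyday life. Large music databases are common, so quickly retrieving the items a user wants from such a database is an important problem. Most search engines take text as input, but mislabeled or ambiguous tags lead to wrong retrieval results, and this happens more often with classical music than with pop music.
This thesis targets classical music databases and uses AAC compressed-domain features. It partially decodes the bitstream to obtain the modified discrete cosine transform (MDCT) coefficients, which saves about 70% of the decoding complexity, and preprocesses the coefficient energies to improve accuracy. The coefficients are remapped onto the pitch names of twelve-tone equal temperament, and inner products yield the similarity matrix between two pieces. The system then finds the best accumulated-similarity path and takes the weighted mean of the similarity scores along it to produce the final retrieval result. Experimental results show that the proposed method achieves a retrieval MRR of 0.96 and a precision of up to 97%, and that it saves more than 90% of the matching time compared with conventional retrieval in the original (waveform) domain.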
With the rapid development of the Internet and multimedia compression techniques, people can easily download or share multimedia data over networks. Efficient retrieval from huge multimedia databases has therefore become an important issue. Most search engines rely on textual labels, but labels created by people may be ambiguous or even wrong. This problem occurs more often when retrieving classical music than pop music.
Our proposed system focuses on classical music cover song retrieval in the AAC compressed domain. The modified discrete cosine transform (MDCT) coefficients are used directly to form a 12-dimensional chroma feature without full decoding, which saves about 70% of the decoding complexity. We truncate low-magnitude MDCT coefficients, adjust the frequency boundaries dynamically, and use dot products to build the chroma similarity matrix. We then find the optimal accumulated-similarity path between two songs, compute the weighted arithmetic mean of the similarity along it, and rank the results by this score.
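The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation. All function names are hypothetical. The fixed band limits `f_min`/`f_max` stand in for the dynamically adjusted frequency boundaries, the relative threshold `floor_ratio` stands in for the low-magnitude truncation, and a plain dynamic-time-warping recursion with an unweighted mean stands in for the thesis's path search and weighted mean. The MDCT coefficients are assumed to be already available from a partial AAC decode.

```python
import math

def pitch_class(freq_hz, ref_hz=440.0):
    """Nearest equal-tempered pitch class (0 = C, 9 = A) for a frequency."""
    semitones_from_a4 = 12.0 * math.log2(freq_hz / ref_hz)
    return (round(semitones_from_a4) + 9) % 12

def mdct_frame_to_chroma(coeffs, sample_rate, f_min=55.0, f_max=5000.0,
                         floor_ratio=0.1):
    """Fold one frame of MDCT coefficients into a unit-norm 12-bin chroma.

    f_min, f_max and floor_ratio are illustrative assumptions.
    """
    n = len(coeffs)
    mags = [abs(c) for c in coeffs]
    peak = max(mags, default=0.0)
    chroma = [0.0] * 12
    for k, mag in enumerate(mags):
        # Centre frequency of MDCT bin k for a frame of n coefficients.
        freq = (k + 0.5) * sample_rate / (2.0 * n)
        if freq < f_min or freq > f_max or mag < floor_ratio * peak:
            continue  # outside the band, or truncated as low magnitude
        chroma[pitch_class(freq)] += mag * mag  # accumulate energy
    norm = math.sqrt(sum(v * v for v in chroma))
    return [v / norm for v in chroma] if norm > 0 else chroma

def similarity_matrix(query, ref):
    """Dot product between every query chroma frame and every reference frame."""
    return [[sum(a * b for a, b in zip(q, r)) for r in ref] for q in query]

def path_similarity(query, ref):
    """Mean similarity along a DTW path (cost = 1 - similarity)."""
    if not query or not ref:
        raise ValueError("both chroma sequences must be non-empty")
    sim = similarity_matrix(query, ref)
    n, m = len(query), len(ref)
    cost = [[0.0] * m for _ in range(n)]   # accumulated cost
    steps = [[0] * m for _ in range(n)]    # path length so far
    for i in range(n):
        for j in range(m):
            local = 1.0 - sim[i][j]
            if i == 0 and j == 0:
                cost[i][j], steps[i][j] = local, 1
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], steps[i - 1][j - 1]))
            if i > 0:
                candidates.append((cost[i - 1][j], steps[i - 1][j]))
            if j > 0:
                candidates.append((cost[i][j - 1], steps[i][j - 1]))
            # Lowest accumulated cost wins; ties go to the shorter path.
            best_cost, best_steps = min(candidates)
            cost[i][j] = best_cost + local
            steps[i][j] = best_steps + 1
    return 1.0 - cost[n - 1][m - 1] / steps[n - 1][m - 1]
```

To rank a database under these assumptions, one would compute `path_similarity` between the query's chroma sequence and each stored piece, then sort the pieces by descending score.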
The experimental results show that the proposed method reaches a precision of 97% and saves over 90% of the matching time compared with the traditional approach in the waveform domain.
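The abstract also reports a mean reciprocal rank (MRR) of 0.96. Assuming the standard definition, MRR averages 1/rank of the first correct result over all queries, so a value near 1 means the correct piece is almost always ranked first. The sketch below illustrates that calculation; the function name and data layout are hypothetical and not taken from the thesis.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first relevant item over all queries.

    rankings: dict mapping query id -> list of item ids, best match first
    relevant: dict mapping query id -> set of correct item ids
    A query with no relevant item in its ranking contributes 0.
    """
    if not rankings:
        raise ValueError("at least one query is required")
    total = 0.0
    for query_id, ranking in rankings.items():
        for rank, item in enumerate(ranking, start=1):
            if item in relevant[query_id]:
                total += 1.0 / rank
                break
    return total / len(rankings)
```

For example, if one query returns its cover at rank 1 and another at rank 2, the MRR is (1 + 0.5) / 2 = 0.75.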