
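Graduate student: 楊雅婷 (Ya-ting Yang)
Thesis title: 基於H.264畫面內編碼特徵之影像內容檢索技術 (Content Based Image Retrieval utilizing H.264 Intra Coding Features)
Advisor: 張寶基 (Pao-Chi Chang)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Communication Engineering
Graduation academic year: 100 (ROC calendar)
Language: Chinese
Number of pages: 79
Chinese keywords: 壓縮域 (compressed domain), H.264, 畫面內預測 (intra prediction), 內容檢索 (content-based retrieval)
Foreign-language keywords: H.264, Intra Prediction, Compression domain, Content-based Retrieval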
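  • With the rapid growth of the Internet and mobile communications, the volume of available audiovisual data keeps expanding, and retrieving the images a user needs from a huge database has become an important problem. Content-based image retrieval has been studied widely, but shortcomings remain. In particular, fully decompressing every image before extracting features costs considerable computation time and storage space.
    This thesis addresses content-based retrieval by extracting features in the compressed domain from an H.264 decoder. The prediction modes obtained from intra prediction serve as features, the residual values are used to judge whether each prediction mode is reliable, and retrieval is performed with a local search combined with geometric correspondence. Experimental results show that the proposed method greatly reduces the feature space needed for retrieval while still maintaining adequate retrieval performance, with a MAP of 0.3 and an ANMRR of 0.625.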
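
    The feature described above can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes the decoder exposes, for every 4x4 block, an Intra 4x4 prediction mode (0 to 8) and a residual energy value. The function names, the window layout, the rule of keeping blocks whose residual energy is at least `min_residual`, and the use of histogram intersection for matching are all hypothetical choices made for illustration.

```python
from collections import Counter

NUM_MODES = 9  # H.264 Intra 4x4 defines nine prediction modes (0..8)


def local_mode_histogram(modes, residuals, top, left, size, min_residual):
    """Normalized histogram of intra modes inside a size x size block window.

    modes and residuals are 2D lists indexed [row][col], one entry per block.
    Blocks with residual energy below min_residual are skipped; this
    filtering direction is an assumption, not a rule taken from the thesis.
    """
    counts = Counter()
    for r in range(top, top + size):
        for c in range(left, left + size):
            if residuals[r][c] >= min_residual:
                counts[modes[r][c]] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * NUM_MODES
    return [counts[m] / total for m in range(NUM_MODES)]


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms (hypothetical matching score)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy 3x3 grid of blocks: prediction modes and residual energies.
example_modes = [[0, 0, 1], [2, 0, 1], [1, 1, 8]]
example_residuals = [[5, 0, 7], [9, 6, 1], [3, 8, 4]]
example_hist = local_mode_histogram(
    example_modes, example_residuals, top=0, left=0, size=2, min_residual=4
)
```

    In this toy window, three of the four blocks pass the residual test, so the histogram is built from three modes only. A real system would slide such windows over the query and database images and then check the geometric consistency of the matched windows.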


    With the development of the Internet and mobile communications, efficient multimedia retrieval from huge databases has become an important issue. Although content-based image retrieval has been studied extensively in recent years, shortcomings remain. In particular, extracting features from fully decompressed images consumes considerable computation and memory resources.
    This thesis focuses on content-based image retrieval in the compressed domain. The extracted features are based on the I-frame coding information in H.264, a powerful and widely used coding standard. We propose to employ the local mode histogram as the texture feature for matching images, and to apply the residual coefficients to filter out low-confidence modes. Moreover, the geometrical correspondence between two images is also considered in our method.
    The experimental results show that the proposed method reduces resource consumption and achieves performance similar to that of a method that extracts features from decompressed images, i.e., a MAP of 0.3 and an ANMRR of 0.625 on Oxford 5k.
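
    For readers unfamiliar with the MAP figure quoted above, the following Python sketch shows the textbook definition of average precision and its mean over queries. It assumes each ranked list contains no duplicate items. The function names and the toy data are hypothetical, and the evaluation code actually used in the thesis may differ in detail.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant item appears.

    Relevant items that never appear in ranked_ids contribute zero,
    because the sum is divided by the total number of relevant items.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Average of average_precision over (ranked_ids, relevant_ids) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two toy queries: ranked retrieval results and their ground-truth sets.
example_runs = [
    (["a", "x", "b"], {"a", "b"}),
    (["x", "a"], {"a"}),
]
example_map = mean_average_precision(example_runs)
```

    The first toy query finds its two relevant images at ranks 1 and 3, and the second finds its single relevant image at rank 2. The MAP is the mean of the two per-query scores.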

    Abstract (Chinese)
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Motivation and Objectives
    1.3 Thesis Organization
    Chapter 2 Related Work on Content-Based Retrieval
    2.1 Progress in Multimedia Content Retrieval
    2.2 Overview of Content Retrieval Features
    2.2.1 Scale-Invariant Feature Transform
    2.2.2 Compressed Histogram of Gradients
    2.3 Content-Based Retrieval Systems
    2.3.1 VisualSEEk
    2.3.2 Visual Information Retrieval
    2.3.3 PhotoBook
    2.3.4 Blobworld
    2.3.5 Query by Image Content
    2.3.6 Taiwan Butterfly Appearance Retrieval System
    2.4 Literature on Compressed-Domain Feature Extraction
    2.4.1 Color
    2.4.2 Texture
    2.4.3 DCT Coefficients
    2.4.4 Motion Information
    Chapter 3 Introduction to Video Coding
    3.1 Introduction to the H.264/AVC Video Compression Standard
    3.2 Introduction to the H.264/AVC Coding Architecture
    3.2.1 Picture Types
    3.2.2 Intra Prediction
    3.3 Literature Review of H.264/AVC Compressed-Domain Content Retrieval
    Chapter 4 Proposed Compressed-Domain Content Retrieval Method
    4.1 Local Histogram Analysis of Intra Prediction Modes
    4.2 Analysis of Distinctive Regions Formed Using Residual Values
    4.3 Geometric Correspondence
    4.4 Architecture of the Proposed Algorithm
    Chapter 5 Experimental Results, Analysis, and Discussion
    5.1 Experimental Parameters and Simulation Environment
    5.2 Evaluation Metrics
    5.2.1 Mean Average Precision
    5.2.2 Average Normalized Modified Retrieval Rank
    5.3 Compressed-Domain Content Retrieval Results and Analysis
    5.4 Out-Set: Retrieval Results and Analysis with Actually Captured Photos
    5.4.1 Without Building a Database
    5.4.2 With a Database Built
    5.5 Data Volume Comparison
    Chapter 6 Conclusions and Future Work
    References

