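Graduate student: 羅國軒
Kuo-Hsuan Lo
Thesis title: 利用深度學習以降低HEVC模式決策之運算複雜度的研究
A CNN-Assisted Technique for Computation Reduction of HEVC Intra Prediction
Advisor: 林銀議
Yin-yi Lin
Oral defense committee:
Degree: Master
Department: College of Information and Electrical Engineering - Department of Communication Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Number of pages: 107
Keywords (Chinese): HEVC, 預測單元, 畫面內預測, 深度學習, RDO, RMD
Keywords (English): HEVC, Prediction Unit, Intra Prediction, Deep Learning, RDO, RMD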
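  • High Efficiency Video Coding (HEVC) is a video coding standard that offers better coding efficiency than H.264/AVC. HEVC intra prediction uses 35 modes to increase prediction accuracy, but this also greatly increases encoding complexity. This thesis therefore compares the candidate modes predicted by a convolutional neural network (CNN) and by rough mode decision (RMD) against the best mode found by full mode search. In terms of accuracy, RMD is 11.48% higher than the CNN; in terms of performance, the CNN's running time is 5.109% higher than RMD's, and its BDBR rises by a further 0.59%. These results show that the CNN's predicted candidate modes are not as good as RMD's, but the texture analysis shows that the candidate modes selected by the two methods are correlated. We therefore use the CNN to assist RMD through two mechanisms, a mode mechanism and a probability mechanism. In the mode mechanism, the candidate modes of RMD and the CNN are overlapped, and candidates that do not overlap are deleted. In the probability mechanism, the probability of each candidate mode is compared with a threshold, and candidates whose probability is too small are deleted. Both mechanisms reduce the number of candidate modes in order to save time. With only 8x8 encoding performed, the experimental results show that the mode mechanism saves 9.76% of the time at a BDBR increase of 0.014%, and the probability mechanism saves 10.753% of the time at a BDBR increase of 0.008%.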


    High Efficiency Video Coding (HEVC) is a video coding standard that offers better coding efficiency than H.264/AVC. To predict more accurately, HEVC intra prediction uses 35 prediction modes, but this significantly increases computational complexity. In this thesis, we compare the candidate modes predicted by a CNN and by rough mode decision (RMD) against the best mode found by full mode search, in terms of accuracy, performance, and texture. The CNN's candidate modes are not as good as RMD's, but the candidates selected by the two methods are correlated, so we use the CNN to assist RMD through two mechanisms: one based on the CNN's modes and one based on its probabilities. In the mode mechanism, the RMD and CNN candidate modes are overlapped, and candidates that do not overlap are deleted. In the probability mechanism, the probability of each candidate mode is compared with a threshold, and candidates whose probability is too small are deleted. Both mechanisms reduce the number of candidate modes and thereby save time. With only 8x8 encoding performed, the experimental results show that the mode mechanism saves 9.76% of the time at a BDBR increase of 0.014%, and the probability mechanism saves 10.753% of the time at a BDBR increase of 0.008%.
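The two pruning mechanisms described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the function names, the example mode indices, the threshold value, and the fallback to the first RMD candidate when nothing survives are all assumptions made here, since the abstract does not say what happens when every candidate is deleted.

```python
from typing import Dict, List


def filter_by_mode_overlap(rmd_modes: List[int], cnn_modes: List[int]) -> List[int]:
    """Mode mechanism: keep only RMD candidates that the CNN also proposes."""
    cnn_set = set(cnn_modes)
    kept = [m for m in rmd_modes if m in cnn_set]
    # Assumption (not stated in the abstract): if nothing overlaps,
    # fall back to the first RMD candidate so the list is never empty.
    return kept if kept else rmd_modes[:1]


def filter_by_probability(
    rmd_modes: List[int], cnn_probs: Dict[int, float], threshold: float
) -> List[int]:
    """Probability mechanism: drop RMD candidates whose CNN probability is below the threshold."""
    kept = [m for m in rmd_modes if cnn_probs.get(m, 0.0) >= threshold]
    # Same hypothetical fallback as above.
    return kept if kept else rmd_modes[:1]


# Hypothetical inputs: RMD candidates in RMD order, CNN top modes,
# and CNN class probabilities for some of the 35 intra modes.
rmd_candidates = [26, 10, 1, 0, 25, 27, 2, 18]
cnn_candidates = [26, 25, 0, 11]
cnn_probabilities = {26: 0.55, 25: 0.20, 0: 0.10, 10: 0.02}

overlap_result = filter_by_mode_overlap(rmd_candidates, cnn_candidates)
probability_result = filter_by_probability(rmd_candidates, cnn_probabilities, 0.05)
```

In both cases the surviving list is shorter than the original RMD list, so fewer modes go through the full rate-distortion check, which is where the reported time saving would come from.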

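    Table of Contents
    Chapter 1 Introduction
      1.1. High Efficiency Video Coding
      1.2. HEVC Coding Architecture
        1.2.1. Coding Unit
        1.2.2. Prediction Unit
        1.2.3. Transform Unit
        1.2.4. Rate-Distortion Cost Function (RD cost)
      1.3. Research Motivation and Objectives
      1.4. Thesis Organization
    Chapter 2 Introduction to Intra Coding Mode Prediction and Convolutional Neural Networks, and Literature Review
      2.1. Introduction to HEVC Intra Coding Prediction
      2.2. Introduction to Deep Learning
        2.2.1. Artificial Neural Networks
        2.2.2. Deep Learning
      2.3. Literature on Deep-Learning-Based Mode Prediction and Fast Mode Decision Algorithms
        2.3.1. Reducing the Number of Candidate Modes with Thresholds
        2.3.2. Mode Decision Using Deep Learning
    Chapter 3 Comparison of Rough Mode Decision and Convolutional Neural Networks for Intra Prediction Mode Decision
      3.1. CNN-Based Mode Prediction for HEVC/H.265 Intra Coding
        3.1.1. Experimental Setup
        3.1.2. CNN System Architecture
      3.2. Comparison of Experimental Results for RMD and the CNN
        3.2.1. Combining the CNN with the HEVC Decision Flow
        3.2.2. Experimental Analysis and Discussion
    Chapter 4 CNN-Assisted RMD Algorithms for Reducing Candidate Modes
      4.1. Literature Review of Fast Mode Decision for Intra Prediction
      4.2. RMD Assisted by the CNN Mode Mechanism
        4.2.1. Decision Flow for Overlapping the Best Candidate Modes of the CNN and RMD
        4.2.2. Performance Analysis of Overlapping the Best Candidate Modes of the CNN and RMD
      4.3. RMD Assisted by the CNN Probability Mechanism
        4.3.1. Decision Flow for Filtering RMD's Best Candidate Modes by CNN Mode Probability
        4.3.2. Performance Analysis of Filtering RMD's Best Candidate Modes by CNN Mode Probability
        4.3.3. Comparison of the Mode Mechanism and the Probability Mechanism
    Chapter 5 Conclusions and Future Work
    References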

