| Graduate student: | Kuo-Hsuan Lo (羅國軒) |
|---|---|
| Thesis title: | A CNN-Assisted Technique for Computation Reduction of HEVC Intra Prediction (利用深度學習以降低HEVC模式決策之運算複雜度的研究) |
| Advisor: | Yin-yi Lin (林銀議) |
| Committee members: | |
| Degree: | Master |
| Department: | College of Information and Electrical Engineering - Department of Communication Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 (ROC calendar; 2019-2020) |
| Language: | Chinese |
| Number of pages: | 107 |
| Keywords (Chinese): | HEVC, 預測單元, 畫面內預測, 深度學習, RDO, RMD |
| Keywords (English): | HEVC, Prediction Unit, Intra Prediction, Deep Learning, RDO, RMD |
High Efficiency Video Coding (HEVC) is a video coding standard that achieves better coding efficiency than H.264/AVC. HEVC intra prediction uses 35 modes to improve prediction accuracy, but this also greatly increases encoding complexity. This thesis compares the candidate modes predicted by a convolutional neural network (CNN) and by rough mode decision (RMD) against the best mode found by full mode search, in terms of accuracy, performance, and texture. In accuracy, RMD is 11.48% higher than the CNN. In performance, the CNN's run time is 5.109% higher than that of RMD, and its Bjøntegaard delta bit rate (BDBR) rises by a further 0.59%. The candidate modes predicted by the CNN are therefore not as good as those of RMD, but the texture comparison shows that the candidates selected by the two methods are correlated, so the CNN is used to assist RMD through two mechanisms. In the mode mechanism, the RMD and CNN candidate modes are overlapped, and any candidate mode that does not overlap is deleted. In the probability mechanism, the probability of each candidate mode is compared with a threshold, and candidate modes whose probability is too small are deleted. Both mechanisms reduce the number of candidate modes and thereby save encoding time. When only 8x8 encoding is performed, the experimental results show that the mode mechanism saves 9.76% of the time with a BDBR increase of 0.014%, and the probability mechanism saves 10.753% of the time with a BDBR increase of 0.008%.
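The two pruning mechanisms described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation. The function names, the example mode lists, the threshold value, the use of the CNN's per-mode output as the probability, and the fallback to the first RMD candidate when pruning would leave the list empty are all assumptions made for illustration.

```python
from typing import Dict, List


def prune_by_mode(rmd_candidates: List[int], cnn_candidates: List[int]) -> List[int]:
    """Mode mechanism: keep only RMD candidates that the CNN also proposed."""
    cnn_set = set(cnn_candidates)
    kept = [mode for mode in rmd_candidates if mode in cnn_set]
    # Assumption: if nothing overlaps, keep the first RMD candidate so that
    # the later rate-distortion search still has one mode to evaluate.
    return kept if kept else rmd_candidates[:1]


def prune_by_probability(
    rmd_candidates: List[int],
    cnn_probs: Dict[int, float],
    threshold: float,
) -> List[int]:
    """Probability mechanism: drop candidates whose probability is below the threshold."""
    # Assumption: the probability is the CNN's output for that mode;
    # modes missing from cnn_probs are treated as probability 0.
    kept = [mode for mode in rmd_candidates if cnn_probs.get(mode, 0.0) >= threshold]
    # Same hypothetical fallback as in prune_by_mode.
    return kept if kept else rmd_candidates[:1]


# Hypothetical example values (HEVC intra modes are numbered 0 to 34).
rmd = [26, 10, 1, 0, 25, 27, 11, 9]
cnn = [26, 25, 0, 2]
probs = {26: 0.50, 25: 0.20, 0: 0.04, 10: 0.01}

mode_result = prune_by_mode(rmd, cnn)
prob_result = prune_by_probability(rmd, probs, threshold=0.05)
print(mode_result, prob_result)
```

In both functions the surviving candidates keep their RMD order, so a later full rate-distortion search would simply run over a shorter list, which is where the time saving comes from.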