Author: Sheng-Min Fan (范聖敏)
Thesis title: Study of A Deep Learning Architecture For HEVC Decoder (一種應用於HEVC解碼端之深度學習架構的研究)
Advisor: Yin-Yi Lin (林銀議)
Oral defense committee:
Degree: Master
Department: College of Information and Electrical Engineering - Department of Communication Engineering
Year of publication: 2020
Academic year of graduation: 108
Language: Chinese
Pages: 104
Keywords: HEVC, Improved Coding Performance, SVM, CNN, Distributed Coding
    High Efficiency Video Coding (HEVC) is a video compression standard that achieves higher compression efficiency than H.264 and supports content up to 4K and 8K resolution. HEVC partitions each picture with a quad-tree structure: every coding tree unit (CTU) is encoded at one of four depths. Regions coded at the shallowest depth show less distortion, regions coded at the deepest depth show more, and the proportion of the four depths varies with the quantization parameter (QP). We use a convolutional neural network (CNN) to improve image quality. The network is trained by residual learning: it predicts the image residual, which is then added to the distorted image to enhance it. Because QP affects the CTU depth distribution, it also indirectly affects how much the network can improve image quality. We therefore adopt the support vector machine (SVM) from a fast CTU decision algorithm as our CTU classifier, separate CTUs into simple-texture and complex-texture classes, and then enhance each class separately with a CNN. Comparing CNN enhancement with and without the SVM classifier, we find that adding the classifier brings four main advantages: (1) it reduces the amount of CNN training data; (2) it lowers the Bjøntegaard delta bit rate (BDBR) by about a further 1%; (3) it saves about 16% of the encoding time; and (4) it also works very well on special images.
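    The classify-then-enhance flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: `classify_texture` uses a plain variance threshold as a stand-in for the trained SVM, the entries of `residual_models` are hypothetical placeholders for the two trained CNNs, and all function names, the threshold, and the 8-bit sample range are invented for the example.

```python
from statistics import pvariance

CTU_SIZE = 64  # largest CTU size in HEVC; the demo below uses a smaller block


def split_into_ctus(frame, size):
    """Yield (row, col, block) for each size x size block of a 2-D list of samples."""
    height, width = len(frame), len(frame[0])
    for r in range(0, height, size):
        for c in range(0, width, size):
            yield r, c, [row[c:c + size] for row in frame[r:r + size]]


def classify_texture(block, threshold=100.0):
    """Assumption: a variance threshold stands in for the trained SVM classifier."""
    samples = [s for row in block for s in row]
    return "complex" if pvariance(samples) > threshold else "simple"


def enhance_frame(decoded, residual_models, size=CTU_SIZE):
    """Add a class-specific predicted residual to every block of the decoded frame.

    residual_models maps "simple"/"complex" to a callable that returns a residual
    block of the same shape (a placeholder for a trained CNN).
    """
    out = [row[:] for row in decoded]  # leave the input frame untouched
    for r, c, block in split_into_ctus(decoded, size):
        residual = residual_models[classify_texture(block)](block)
        for i, row in enumerate(block):
            for j, sample in enumerate(row):
                # Residual learning: enhanced = decoded + predicted residual,
                # clipped to the 8-bit sample range.
                out[r + i][c + j] = max(0, min(255, round(sample + residual[i][j])))
    return out


if __name__ == "__main__":
    # Toy 4x8 frame: a flat left block and a checkerboard right block.
    frame = [[100] * 4 + [0 if (i + j) % 2 == 0 else 255 for j in range(4)]
             for i in range(4)]
    models = {
        "simple": lambda b: [[1.0] * len(row) for row in b],
        "complex": lambda b: [[-3.0] * len(row) for row in b],
    }
    for row in enhance_frame(frame, models, size=4):
        print(row)
```

    Routing each block through a class-specific model mirrors the idea that simple and complex textures are enhanced separately; in a real system the classifier features and both networks would come from training rather than the fixed rules used here.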

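    Chapter 1 Introduction
    1.1 High Efficiency Video Coding
    1.2 Introduction to the HEVC Coding Architecture
    1.2.1 Coding Unit
    1.2.2 Prediction Unit
    1.2.3 Transform Unit
    1.2.4 Introduction to Intra Prediction
    1.2.5 Quantization
    1.3 Introduction to Support Vector Machines
    1.4 Introduction to Neural Networks
    1.4.1 Deep Neural Networks (DNN)
    1.4.2 Convolutional Neural Networks (CNN)
    1.5 Research Motivation and Objectives
    1.6 Thesis Organization
    Chapter 2 Literature Review
    2.1 Super-Resolution (SR) Techniques
    2.2 Super-Resolution Applied to HEVC
    2.3 SVM-Based Fast Depth Decision Algorithm for HEVC Coding Units (CU)
    2.3.1 SVM Feature Selection
    2.3.2 Fast Depth Decision Algorithm
    2.3.3 Model Training Types and Selection of Model Quantization Parameters
    2.3.4 Experimental Performance
    2.4 CNN Based Post-Processing to Improve HEVC
    2.4.1 Algorithm Architecture
    2.4.2 CNN Model Construction and Training
    2.4.3 Experimental Performance
    2.5 Enhancing HEVC Compressed Videos with a Partition-Masked Convolutional Neural Network
    2.5.1 Algorithm Architecture
    2.5.2 CNN Model Construction and Training
    2.5.3 Experimental Performance
    Chapter 3 Combining SVM and CNN to Improve HEVC Intra Prediction Performance
    3.1 Algorithm Architecture
    3.2 System Architecture
    3.2.1 Pre-processing Stage
    3.2.3 Testing Stage
    Chapter 4 Further Discussion of SVM/CNN and Performance Analysis in HEVC
    4.1 Accuracy of the Encoder-Side SVM Algorithm When Applied at the Decoder
    4.2 Applicability of the SVM/CNN Model Across Different Quantization Settings
    4.3 SVM/CNN Architecture Performance Analysis and Model Comparison
    4.3.1 Architecture Performance Analysis
    4.3.2 SVM/CNN Model Comparison
    Chapter 5 Conclusions and Future Work
    References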

