| Graduate student: | 范聖敏 Sheng-Min Fan |
|---|---|
| Thesis title: | 一種應用於HEVC解碼端之深度學習架構的研究 Study of A Deep Learning Architecture For HEVC Decoder |
| Advisor: | 林銀議 Yin-Yi Lin |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Information and Electrical Engineering - Department of Communication Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 104 |
| Keywords: | HEVC, Improved Coding Performance, SVM (support vector machine), CNN (convolutional neural network), Distributed Coding |
High Efficiency Video Coding (HEVC) is a video compression standard that achieves higher compression efficiency than H.264 and extends to 4K and 8K video. HEVC partitions each picture with a quad-tree structure: during encoding, every coding tree unit (CTU) is split into coding units at one of four depths. Regions coded at the shallowest depth show less distortion, regions coded at the deepest depth show more, and the proportion of the four depths varies with the quantization parameter (QP). We use a convolutional neural network (CNN) to improve the decoded image quality. The network is trained by residual learning: it predicts a residual, which is then added to the distorted image to obtain the enhanced image. Because the QP affects the CTU depth distribution, it also indirectly affects how much the network can improve image quality. We therefore take a support vector machine (SVM) developed for a fast CTU algorithm and use it as a CTU classifier that divides CTUs into simple-texture and complex-texture classes, and then enhance each class separately with a CNN. Comparing CNN enhancement with and without the SVM classifier, we find that adding the classifier has four main advantages. First, it reduces the amount of CNN training data. Second, it reduces BDBR by about a further 1%. Third, it saves about 16% of the encoding time. Fourth, it also performs very well on special images.
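The pipeline described in the abstract (classify each block by texture, then add a predicted residual to the decoded block) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: a pixel-variance threshold stands in for the SVM classifier, two constant-residual functions stand in for the trained CNNs, and every name here (`texture_class`, `apply_residual`, `enhance`, `simple_model`, `complex_model`, the threshold value) is hypothetical.

```python
from statistics import pvariance


def texture_class(block, threshold=100.0):
    """Label a block 'simple' or 'complex'.

    Assumption: pixel variance is used as a stand-in for the SVM
    classifier; the thesis's actual features are not given here.
    """
    pixels = [p for row in block for p in row]
    return "complex" if pvariance(pixels) > threshold else "simple"


def apply_residual(block, residual):
    """Residual learning step: enhanced = decoded + predicted residual,
    clipped to the 8-bit range [0, 255]."""
    return [
        [min(255, max(0, d + r)) for d, r in zip(d_row, r_row)]
        for d_row, r_row in zip(block, residual)
    ]


def enhance(block, models, threshold=100.0):
    """Route the block to the model for its texture class, then add
    that model's predicted residual to the decoded block."""
    label = texture_class(block, threshold)
    residual = models[label](block)
    return label, apply_residual(block, residual)


# Hypothetical stand-ins for two trained CNNs: each returns a residual
# with the same shape as the input block.
def simple_model(block):
    return [[1 for _ in row] for row in block]


def complex_model(block):
    return [[-2 for _ in row] for row in block]


models = {"simple": simple_model, "complex": complex_model}

flat_block = [[120, 121], [119, 120]]   # low variance
edge_block = [[0, 255], [255, 0]]       # high variance

print(enhance(flat_block, models))
print(enhance(edge_block, models))
```

Keeping the classifier and the per-class models behind a small dictionary makes it easy to swap in a real SVM and real networks without changing the routing or the residual-addition step.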