Graduate student: 李穎 (Ying Lee)
Thesis title: Machine Learning Based Fast Inter Prediction Algorithm of VVC for 360-degree Videos (基於機器學習之360度視訊的 VVC快速畫面間預測演算法)
Advisor: 唐之瑋 (Chih-Wei Tang)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Communication Engineering
Year of publication: 2021
Graduation academic year: 109 (ROC calendar)
Language: Chinese
Number of pages: 108
Keywords: 360-degree videos, EAC (equi-angular cubemap), VVC (versatile video coding), inter frame coding, fast algorithm, LNN (light-weighted neural network)
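  • VVC (Versatile Video Coding) can reduce the transmission bitrate of high-quality video, but its encoding time complexity is too high for it to be realized easily on real-time transmission devices, so fast algorithms for VVC encoding are an important research direction in video coding. EAC (equi-angular cubemap) is one of the 360-degree video formats, and it carries less redundant information than the ERP (equirectangular projection) format. However, none of the existing fast mode decision and depth decision algorithms for VVC inter coding has been designed for the EAC format. This thesis therefore proposes fast partition depth and mode decision algorithms for inter coding that are designed for the EAC format. They take into account the continuity of image content across the EAC faces and the inter-frame correlation between faces to improve the accuracy of fast inter coding decisions, and both the depth decision and the mode decision are designed with the affine merge mode newly added in VVC in mind. In addition, unlike existing fast inter coding schemes, this thesis adopts an LNN (light-weighted neural network) as the classifier, which adapts to the diversity of video content better than rules of thumb and, in contrast to deep learning schemes, can perform classification using only the central processing unit (CPU). Experimental results show that, compared with VTM 7.0, the proposed scheme saves 21% of the encoding time on average with a BDBR increase of only 1.03%, and that it also saves more encoding time than existing rule-of-thumb schemes.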


    VVC (versatile video coding) can reduce the bitrate of high-resolution videos before transmission. However, the encoding complexity of VVC is extremely high, which makes it hard to implement on real-time hardware. Fast algorithms for the VVC encoder are therefore important. The EAC (equi-angular cubemap) format has less redundant information than the ERP (equirectangular projection) format, yet a survey of the existing literature finds no fast inter mode or depth decision algorithm designed for the EAC format. Accordingly, this thesis proposes a fast inter prediction algorithm of VVC for the EAC format to facilitate the mode decision and depth decision processes. The algorithm takes into consideration the inter prediction information of the faces and the connections between face boundaries in the EAC format. Furthermore, both the fast inter mode decision and the depth decision consider the affine merge mode added by VVC. In contrast to the classification models widely used in VVC fast inter coding algorithms, the LNN (light-weighted neural network) adapts to different video content and coding conditions better than rules of thumb, and it runs on the CPU alone, which is difficult for deep learning models. Experimental results show that the proposed method reduces the encoding time of VTM 7.0 by about 21% on average with a 1.03% increase in BDBR (Bjontegaard delta bit rate), and that it saves more encoding time than existing rule-of-thumb methods.
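
To make the BDBR figure in the abstract concrete, the following is a minimal sketch of the standard Bjontegaard delta bit rate calculation: a cubic is fitted to log-bitrate as a function of quality for each encoder, and the average gap between the two curves over the shared quality range is converted to a percentage. The function names, the four-point curves, and the sample numbers are hypothetical and are not taken from the thesis. The quality values stand in for whichever PSNR-type metric is used.

```python
import math

def _fit_cubic(xs, ys):
    """Coefficients [c0, c1, c2, c3] of the cubic through four (x, y) points."""
    n = 4
    # Augmented Vandermonde matrix, solved by Gauss-Jordan elimination.
    a = [[x ** k for k in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def _integral(coeffs, upper):
    """Integral of sum(c_k * x**k) from 0 to upper."""
    return sum(c * upper ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))

def bd_rate(anchor, test):
    """BD-rate in percent; anchor and test are four (bitrate, quality) pairs each.

    A positive result means the test encoder needs more bits than the anchor
    for the same quality.
    """
    # Shared quality interval of the two rate-distortion curves.
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    # Fit log-rate against quality, shifted by lo to keep the system well conditioned.
    fit_a = _fit_cubic([q - lo for _, q in anchor], [math.log(r) for r, _ in anchor])
    fit_t = _fit_cubic([q - lo for _, q in test], [math.log(r) for r, _ in test])
    span = hi - lo
    avg_log_diff = (_integral(fit_t, span) - _integral(fit_a, span)) / span
    return (math.exp(avg_log_diff) - 1.0) * 100.0

# Hypothetical curves: the test encoder spends 1.03% more bits at every quality point.
anchor = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
test = [(r * 1.0103, q) for r, q in anchor]
print(f"{bd_rate(anchor, test):.2f}%")
```

With exactly four rate points per curve, the cubic fit reduces to interpolation, which is why a small linear solve is enough here.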

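    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Foreword
    1.2 Research Motivation
    1.3 Research Method
    1.4 Thesis Organization
    Chapter 2 Overview of 360-degree Video Coding
    2.1 Introduction to Versatile Video Coding (VVC)
    2.2 Partitioning Process and Modes
    2.3 Affine Merge Mode and Regular Merge Mode in Versatile Video Coding
    2.4 360-degree Video Quality Measurement and the EAC Format
    2.5 Summary
    Chapter 3 Overview of Fast Intra and Inter Coding Algorithms for Versatile Video Coding
    3.1 Fast VVC Mode Decision and Motion Estimation Algorithms
    3.2 Fast VVC Coding Tree Unit Depth Prediction Algorithms
    3.3 Overview of Fast VVC Intra and Inter Coding Algorithms for 360-degree Videos
    3.4 Summary
    Chapter 4 The Proposed Fast VVC Inter Coding Algorithm for 360-degree Videos
    4.1 Dataset Selection, VVC and 360-degree Video Settings, and Overall Architecture of the Proposed Scheme
    4.2 Design Based on EAC Face Boundaries and Inter-frame Correlation
    4.3 The Proposed Fast Mode Decision Scheme
    4.4 The Proposed Fast Depth Decision Algorithm
    4.5 Summary
    Chapter 5 Experimental Results and Analysis
    5.1 Coding Environment, Parameter Settings, and Test Videos
    5.2 LNN Accuracy and Performance Analysis of Individual Schemes
    5.3 Experimental Results on VTM7.0 and Comparison with Existing Schemes
    5.4 Summary
    Chapter 6 Conclusion and Future Work
    References
    Publications
    List of Symbols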

