
Graduate student: 粘郁潔
Yu-Chieh Nien
Thesis title: 基於立方體投影的360度視訊編碼之位元分配
Bit Allocation for Cubemap Projection for 360-degree Video Coding
Advisor: 唐之瑋
Chih-Wei Tang
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Communication Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar; 2018-2019)
Language: Chinese
Number of pages: 60
Chinese keywords: 360度視訊, 立方體投影, 位元率控制, 位元分配, 高編碼代價區域偵測
English keywords: 360-degree video, cubemap projection, rate control, bit allocation, detection of high coding cost regions
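  • 360-degree video offers users an immersive three-dimensional visual experience, but most existing video encoders take two-dimensional rectangular images as input, so the three-dimensional sphere-domain data of a 360-degree video must first be projected onto a two-dimensional plane before encoding. Equirectangular projection (ERP) and cubemap projection (CMP) are currently the most commonly used 2D projection formats for 360-degree video. CMP achieves better coding efficiency because of its lower geometric distortion, yet no bit allocation design for rate control has so far been proposed for coding the CMP format. This thesis therefore proposes a bit allocation scheme for CMP-based 360-degree video coding, consisting of two parts. The first part is support vector machine (SVM)-based detection of high coding cost regions in the CMP: for each largest coding unit (LCU), the texture complexity, motion magnitude, motion density, and variability of motion direction along the temporal axis are used as features for an SVM that detects high-cost LCUs. The second part allocates bits to the high-cost LCUs and non-high-cost LCUs of each face using functions obtained by surface fitting. Experimental results show that the proposed algorithm outperforms the R-λ model rate control used in the original HM16.16 reference software, with an average BDBR reduction of 3.24% and an average BDWS-PSNR gain of 0.13 dB.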


    360-degree videos provide users with immersive visual experiences. Since most video encoders take two-dimensional rectangular images as input, the three-dimensional sphere-domain data of a 360-degree video must be projected onto a two-dimensional image plane before video coding. The equirectangular projection (ERP) and the cubemap projection (CMP) are the most commonly used 2D projection formats for 360-degree videos, and the CMP enables better coding performance because it has smaller geometric distortion. Since no bit allocation scheme has yet been proposed for video coding of the CMP, this thesis proposes a two-part bit allocation scheme for CMP video coding. First, high coding cost largest coding units (LCUs) in the six faces of the CMP are detected using a support vector machine (SVM), based on texture complexity, motion vector magnitude, motion density, and temporal variance of motion. Second, bits are allocated between high coding cost LCUs and non-high coding cost LCUs by functions obtained by surface fitting of coding statistics based on HEVC/H.265. Experimental results show that the proposed scheme outperforms HM16.16 with the R-λ model, with a 2.256% BDBR decrease and a 0.13 dB BDPSNR increase.
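The two-part flow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the linear SVM decision rule, the weights and bias, the even split within each group, and the `high_cost_share` parameter (a hypothetical stand-in for the output of the surface-fitted function) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LcuFeatures:
    """The four per-LCU features named in the abstract (scaling is assumed)."""
    texture_complexity: float
    motion_magnitude: float
    motion_density: float
    motion_direction_variance: float

    def as_vector(self) -> List[float]:
        return [self.texture_complexity, self.motion_magnitude,
                self.motion_density, self.motion_direction_variance]


def is_high_cost(features: LcuFeatures, weights: Sequence[float],
                 bias: float) -> bool:
    """Linear SVM decision rule: high cost when w . x + b > 0.

    The linear kernel, weights, and bias are assumptions for illustration.
    """
    score = sum(w * x for w, x in zip(weights, features.as_vector())) + bias
    return score > 0.0


def allocate_face_bits(face_bits: float, flags: Sequence[bool],
                       high_cost_share: float) -> List[float]:
    """Split one face's bit budget between high-cost and other LCUs.

    high_cost_share is a hypothetical stand-in for the value the thesis
    derives from its surface-fitted function; bits are split evenly
    inside each group, which is also an assumption.
    """
    n_high = sum(1 for f in flags if f)
    n_low = len(flags) - n_high
    if n_low == 0:
        high_budget = float(face_bits)      # every LCU is high cost
    elif n_high == 0:
        high_budget = 0.0                   # no high-cost LCU in this face
    else:
        high_budget = face_bits * high_cost_share
    low_budget = face_bits - high_budget
    per_high = high_budget / n_high if n_high else 0.0
    per_low = low_budget / n_low if n_low else 0.0
    return [per_high if f else per_low for f in flags]


if __name__ == "__main__":
    weights, bias = [1.0, 1.0, 1.0, 1.0], -2.0   # made-up model parameters
    lcus = [LcuFeatures(0.9, 0.8, 0.7, 0.6), LcuFeatures(0.1, 0.2, 0.1, 0.0)]
    flags = [is_high_cost(lcu, weights, bias) for lcu in lcus]
    print(flags, allocate_face_bits(1000, flags, high_cost_share=0.6))
```

In a real encoder the per-LCU targets would feed the quantization parameter decision of the rate control; the sketch stops at the bit targets.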

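    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Foreword
      1.2 Research Motivation
      1.3 Research Method
      1.4 Thesis Organization
    Chapter 2 Introduction to 360-degree Video Coding
      2.1 Overview of High Efficiency Video Coding (HEVC)
        2.1.1 Architecture of HEVC
        2.1.2 Basic Coding Units of HEVC
      2.2 HEVC-based Support for 360-degree Video Coding
        2.2.1 Projection Formats for 360-degree Video
        2.2.2 Objective Quality Metrics for 360-degree Video
      2.3 Current Developments in 360-degree Video Coding
      2.4 Summary
    Chapter 3 Introduction to HEVC-based Bit Allocation Schemes for 360-degree Video Coding
      3.1 Basic Principles of Rate Control
        3.1.1 Bit Allocation
        3.1.2 Determination of Quantization Parameter
      3.2 Current Bit Allocation Techniques for 360-degree Video Coding
      3.3 Summary
    Chapter 4 Proposed Bit Allocation Scheme for Coding the Cubemap Projection Format
      4.1 Flow of the Proposed Bit Allocation
      4.2 Proposed Detection of High Coding Cost Regions
      4.3 Proposed Bit Allocation for Largest Coding Units Based on the Cubemap Projection
      4.4 Summary
    Chapter 5 Experimental Results and Analysis
      5.1 Experimental Environment and Parameter Settings
      5.2 Experimental Results and Analysis of the Proposed Scheme
      5.3 Summary
    Chapter 6 Conclusion and Future Work
    References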

