| Graduate student: | 林宏龍 Hung-long Lin |
|---|---|
| Thesis title: | H.264 影像解碼之系統設計及硬體軟體整合平台 A System Level Design of H.264 Video Decoder with Hardware and Software Integration Platform |
| Advisor: | 蔡宗漢 Tsung-Han Tsai |
| Oral defense committee: | |
| Degree: | Master |
| Department: | 資訊電機學院 - 電機工程學系在職專班 Executive Master of Electrical Engineering |
| Graduation academic year: | 97 |
| Language: | Chinese |
| Number of pages: | 92 |
| Chinese keywords: | decoding, embedded system |
| Foreign-language keywords: | H.264, decoder, embedded system |
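In this master's thesis, an H.264/AVC decoder is implemented on the SOCLE CDK2007 platform.
Complexity analysis shows that the deblocking filter is one of the modules that consumes the
most system resources, so we implement it in hardware and handle the remaining parts in
software. A table in Chapter 6 compares the efficiency of several architectures; based on the
comparisons reported in several papers, we selected one of the more efficient deblocking filter
circuits for hardware implementation. This architecture processes vertical and horizontal
filtering in parallel, which the circuit achieves by moving data among several groups of
registers. During implementation we also found that, after modifying the control circuit for
the parallel processing, decoding one picture takes 30% less time than with the original
reference architecture. Chapter 6 presents the results before and after the hardware
modification and compares the decoding time with several papers.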
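
To illustrate why moving pixels between registers lets one filter datapath serve both edge directions, the following Python sketch filters a horizontal edge by transposing the block, reusing the vertical-edge routine, and transposing back. It is a simplified software model, not the thesis's circuit: it applies only the H.264 normal-mode update of the two samples nearest the edge, and the thresholds `alpha` and `beta` and the clipping bound `tc` are assumed to be supplied by the caller (the standard derives them from tables that are not reproduced here). All function names are hypothetical.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return lo if x < lo else hi if x > hi else x


def filter_edge_samples(p1, p0, q0, q1, alpha, beta, tc):
    """Normal-mode update of p0 and q0 across one edge position.

    p1, p0 lie on one side of the edge and q0, q1 on the other.
    Samples are left untouched when the step looks like a real image edge.
    """
    if not (abs(p0 - q0) < alpha and abs(p1 - p0) < beta and abs(q1 - q0) < beta):
        return p0, q0
    delta = clip3(-tc, tc, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)
    return clip3(0, 255, p0 + delta), clip3(0, 255, q0 - delta)


def filter_vertical_edge(block, alpha, beta, tc):
    """Filter the vertical edge between columns 3 and 4 of a rows x 8 block."""
    out = [row[:] for row in block]
    for row in out:
        row[3], row[4] = filter_edge_samples(
            row[2], row[3], row[4], row[5], alpha, beta, tc
        )
    return out


def transpose(block):
    """Swap rows and columns, as a register array with two shift directions can."""
    return [list(col) for col in zip(*block)]


def filter_horizontal_edge(block, alpha, beta, tc):
    """Filter the horizontal edge between rows 3 and 4 of an 8 x cols block.

    The same vertical-edge routine is reused on the transposed data.
    """
    return transpose(filter_vertical_edge(transpose(block), alpha, beta, tc))


if __name__ == "__main__":
    # Two flat 4x4 blocks stacked vertically with a small step between them.
    block = [[60] * 4 for _ in range(4)] + [[70] * 4 for _ in range(4)]
    for row in filter_horizontal_edge(block, alpha=20, beta=10, tc=2):
        print(row)
```

In hardware the transposition costs no extra computation when the pixel registers can shift in two directions, which is one plausible reading of the register movement described above.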
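The color-space conversion (YUV to RGB) relies on floating-point arithmetic and accounted for
30% of the overall processing time in the early stage of optimization, so it is also
implemented in hardware.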
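
The abstract does not give the conversion formula. As a hedged illustration of how the floating-point cost can be avoided, the sketch below assumes the widely used full-range BT.601 (JFIF) coefficients and compares a floating-point conversion with a fixed-point version that needs only integer multiplies, adds, and shifts, the kind of arithmetic that maps naturally onto a hardware datapath. The function names and the choice of 8 fractional bits are assumptions made for illustration, not details taken from the thesis.

```python
def clamp8(x):
    """Clamp an integer to the 8-bit range 0..255."""
    return 0 if x < 0 else 255 if x > 255 else x


def yuv_to_rgb_float(y, cb, cr):
    """Reference conversion using floating-point BT.601 full-range coefficients."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(clamp8(int(round(v))) for v in (r, g, b))


def yuv_to_rgb_fixed(y, cb, cr):
    """Integer-only conversion: coefficients scaled by 256 and rounded.

    359/256, 88/256, 183/256 and 454/256 approximate 1.402, 0.344136,
    0.714136 and 1.772. Adding 128 before the shift rounds to nearest.
    """
    d = cb - 128
    e = cr - 128
    r = y + ((359 * e + 128) >> 8)
    g = y - ((88 * d + 183 * e + 128) >> 8)
    b = y + ((454 * d + 128) >> 8)
    return clamp8(r), clamp8(g), clamp8(b)


if __name__ == "__main__":
    # Compare both versions on a coarse grid of input values.
    values = range(0, 256, 17)
    worst = max(
        abs(a - b)
        for y in values
        for cb in values
        for cr in values
        for a, b in zip(yuv_to_rgb_fixed(y, cb, cr), yuv_to_rgb_float(y, cb, cr))
    )
    print("largest per-channel difference:", worst)
```

With 8 fractional bits the coefficient error stays well below one gray level over the 8-bit input range, so the two versions can differ by at most one level per channel after rounding.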
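The software part is based on source code available on the Internet. Guided by ARM profile
analysis, each module is optimized individually, and the frequency of memory accesses is
reduced by carrying out each block's computation in internal registers.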
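After the two modules are implemented in hardware and integrated onto the system integration
board, the following optimization steps must be considered:
1. System peripheral optimization. Chapter 6 describes how we tune the LCD and the external
SDRAM used for H.264 video playback, together with system PLL control, to optimize the system.
2. Hardware performance measurement. Chapter 6 reports the time required for hardware
execution; the improved results are obtained from RTL simulation.
3. In the future, the profiling software provided by ARM should be used to identify the
slowest parts of software execution so that their algorithms can be optimized.
With the steps above, playback performance on the system improves from 0.5 fps to 10 fps at
QCIF resolution.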
In this master's thesis, we use the SOCLE CDK2007 platform to implement an H.264/AVC
decoder. Complexity analysis from previous studies shows that the de-blocking filter is one
of the modules that occupies the most system resources, so we implement this module in
hardware. Chapter 6 tabulates the efficiency of several de-blocking filter architectures;
after comparing them, we chose one of the more efficient de-blocking filter circuits for
implementation. The circuit moves data among several 4x4 registers to process vertical and
horizontal filtering in parallel. During implementation we also found that modifying the
parallel-processing control circuit reduces the processing time of the de-blocking filter by
30% compared with the original architecture. Chapter 6 describes this modification and
compares the filtering time with other papers.
In the software part, ARM's profile analysis showed that the YUV-to-RGB conversion consumed
30% of CPU time, so we implement the YUV-to-RGB module in hardware as well.
After implementing the two hardware modules, we must consider the following optimization
steps before integrating hardware and software on the system board:
1. System peripheral optimization. As described in Chapter 6, for H.264 video playback we
focus on optimizing the LCD, the external SDRAM, and the system PLL.
2. Hardware performance measurement. Chapter 6 reports the hardware execution time; the
improved result is obtained from RTL simulation.
3. In the future, we should use ARM's profiling software to estimate the software execution
process, find where CPU cycles are wasted, and reduce them through algorithmic
optimization.
After these enhancements, QCIF playback on the H.264 system improves from 0.5 fps to 10 fps.