| Graduate Student: | Tai-wei Huang (黃泰維) |
|---|---|
| Thesis Title: | A Novel Method for 2D-to-3D Video Conversion Based on Superpixels and Edge Information |
| Advisor: | Tsung-Han Tsai (蔡宗漢) |
| Oral Defense Committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of Publication: | 2016 |
| Academic Year: | 104 |
| Language: | English |
| Number of Pages: | 57 |
| Keywords (Chinese): | stereoscopic image (立體影像), depth map (深度圖) |
| Keywords (English): | 2D-to-3D, depth map |
This thesis proposes a superpixel-based 2D-to-3D conversion method that extracts depth automatically. Demand for stereoscopic 3D video has grown in recent years, yet 3D content remains scarce. To make realistic stereoscopic viewing easily accessible, low-cost and efficient methods are needed to convert existing 2D video into 3D quickly.

First, a Gaussian model detects the foreground and separates it from the background. Next, the superpixel algorithm extracts edge information by clustering pixels that are similar in color and adjacent in position. Based on the superpixel clusters, initial depth values are assigned: six candidate initial depth maps are generated, and the Hough transform finds the slope of the vanishing line, which determines the appropriate depth map. After the initial assignment, Sobel edge detection is applied with two different thresholds, producing one result with more noise but more complete edge information and another with less noise but sparser edges. A thinning algorithm then reduces the edge width to a single pixel. The two results are compared and the depth values re-assigned, after which the foreground information is incorporated so that each foreground object receives a uniform depth. To make the depth map more accurate, the entire image is scanned in four directions to correct depth values, yielding the final depth map. Finally, depth-image-based rendering (DIBR) synthesizes the left-view and right-view images, completing the 3D video.
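The foreground-detection step can be sketched with a simple per-pixel Gaussian background model; this is a minimal illustration rather than the exact model used in the thesis, and the learning rate `alpha` and deviation threshold `k` are assumed values:

```python
import numpy as np

def update_gaussian_bg(frame, mean, var, alpha=0.05, k=2.5):
    """One step of a per-pixel single-Gaussian background model.

    A pixel is classified as foreground when it deviates from the
    background mean by more than k standard deviations; the background
    statistics are updated with exponential moving averages.
    """
    diff = frame - mean
    foreground = np.abs(diff) > k * np.sqrt(var)
    # Update only background pixels so foreground objects do not
    # pollute the background model.
    bg = ~foreground
    mean[bg] += alpha * diff[bg]
    var[bg] += alpha * (diff[bg] ** 2 - var[bg])
    return foreground, mean, var
```

A practical system would use a mixture of Gaussians per pixel so that multi-modal backgrounds (e.g. swaying trees) are modeled correctly; the single-Gaussian version above only conveys the thresholding idea.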
This thesis proposes a novel method for 2D-to-3D video conversion that uses boundary information to generate the depth map automatically. First, a Gaussian model detects foreground objects and separates the foreground from the background. Next, the superpixel algorithm extracts edge information, and initial depth values are assigned according to the pixels clustered by the superpixels. Based on this initial assignment, edges are detected by Sobel edge detection with two thresholds to strengthen the edge information, and a thinning algorithm reduces the detected boundaries to single-pixel width. By comparing these results and re-assigning depth values, the depth of the foreground is refined. To make the depth map more accurate, the entire image is scanned along four paths to correct the depth values, which yields the final depth map. Finally, depth-image-based rendering (DIBR) synthesizes the left-view and right-view images. Combining the depth map with the original 2D video produces a vivid 3D video, completing the 2D-to-3D conversion.
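The two-threshold Sobel step can be sketched as follows; this is an illustrative implementation, not the thesis's code, and the threshold values in the usage are assumptions:

```python
import numpy as np

def sobel_edges(gray, t_low, t_high):
    """Sobel gradient magnitude thresholded at two levels.

    Returns two binary edge maps: the low threshold keeps more
    complete edges (with more noise), while the high threshold keeps
    cleaner but sparser edges. The method later compares the two maps
    when re-assigning depth values.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    g = gray.astype(float)
    h, w = g.shape
    mag = np.zeros((h, w))
    # Convolve the 3x3 Sobel kernels over the image interior.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = g[y - 1:y + 2, x - 1:x + 2]
            mag[y, x] = np.hypot((kx * win).sum(), (ky * win).sum())
    return mag > t_low, mag > t_high
```

By construction, every pixel in the high-threshold map is also in the low-threshold map, so the comparison amounts to deciding which weak edges from the noisy map to trust.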