| Graduate student: | 吳宗憲 Tsung-Hsien Wu |
|---|---|
| Thesis title: | 非共平面文件影像透視矯正 The perspective rectification for non-planar documents |
| Advisor: | 范國清 Kuo-Chin Fan |
| Oral defense committee: | |
| Degree: | 碩士 Master |
| Department: | 資訊電機學院 - 資訊工程學系 Department of Computer Science & Information Engineering |
| Academic year of graduation: | 99 |
| Language: | Chinese |
| Number of pages: | 94 |
| Chinese keywords: | 文件矯正 (document rectification)、流水演算法 (water flow algorithm)、相關係數 (correlation coefficient) |
| Foreign-language keywords: | document rectification, water flow, correlation coefficients |
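Digital cameras have become widespread in recent years, so people can capture images at any time. Text images could once be obtained only by scanning paper documents; now a camera can capture text on all kinds of physical objects, anywhere, within seconds. However, as camera images move closer to everyday scenes, the variability of capture conditions raises new research problems such as uneven illumination and non-planar image content. These problems are particularly harmful in optical character recognition (OCR) applications.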
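Current character recognition technology can already recognize a single character very accurately. The open problem is instead segmentation, that is, how to isolate individual characters, because camera capture introduces differences in viewing angle and the text no longer lies on a single plane. This study focuses on rectifying the perspective distortion that results from the viewing angle and from text lying on different planes. It classifies and rectifies the three cases most often encountered in daily life: documents on cylinders, non-coplanar documents on cubes, and perspective-distorted documents that are not perpendicular to the camera's optical axis.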
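This study proposes a document image rectification method that effectively separates the different surfaces in an image. It combines image preprocessing, connected-component labeling to extract image data, a water flow algorithm for text line extraction, and classification of the document surface type. The proposed method needs no information about document borders or page layout, and it rectifies well the distortion that arises in common document images when the underlying surface is not a single plane.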
Digital cameras have recently become ubiquitous as their cost has fallen, and people can capture images at any time. In the past, text images could be acquired only by scanning documents with a scanner. Now a camera can capture images of any kind of object within seconds. Because capture conditions in everyday environments vary widely, new research topics arise, such as uneven illumination and scenes containing more than one plane. These problems drastically degrade the performance of optical character recognition (OCR).
OCR can already recognize a single character with a very high recognition rate, so instead of studying recognition itself, we concentrate on segmenting and extracting characters from images captured under poor conditions, for instance when the viewing angle varies or the text does not lie on a single plane. This thesis focuses on rectifying documents with perspective distortion caused by different viewing angles at capture time and by text lying on multiple planes in the image. In particular, we classify and rectify images of documents on cylinders, non-coplanar documents on cubes, and documents captured with the camera's optical axis not perpendicular to the page.
This study provides an effective way to split an image containing different planes and to rectify the resulting regions. The method is built mainly from image processing techniques: connected-component labeling for extracting image information, a water flow algorithm for text line extraction, and image plane analysis. It rectifies common document images whose perspective distortion is caused by surfaces that are not a single plane, without requiring information about document borders or typesetting. Experimental results verify the feasibility and validity of the proposed method.
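To make the connected-component labeling step concrete, the following is a minimal sketch of the general technique, not the thesis's implementation. It assumes the image has already been binarized into a grid of 0 (background) and 1 (ink), uses 8-connectivity, and labels regions with a breadth-first flood fill. The function name `label_components` and the sample grid are hypothetical, chosen only for illustration.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected foreground regions (value 1) in a binary grid.

    Returns (labels, boxes): labels has the same shape as the input, with 0
    for background and 1..n for component ids; boxes maps each id to its
    bounding box as (top, left, bottom, right).
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    boxes = {}
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have a label.
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            next_id += 1
            labels[r][c] = next_id
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit the 8 neighbours of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
            boxes[next_id] = (top, left, bottom, right)
    return labels, boxes


# Hypothetical 4x5 binarized image with three separate blobs.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, boxes = label_components(image)
```

The bounding boxes are the kind of per-component data that a later text line extraction stage could consume; how the thesis actually passes components to its water flow algorithm is not stated in the abstract.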