| Graduate student: | 張翔珳 Hsiang-Wen Chang |
|---|---|
| Thesis title: | 超解析度方法與系統設計比較研究 Comparative Study for Super-Resolution Methods and System Design |
| Advisor: | 陳慶瀚 Ching-Han Chen |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering |
| Year of publication: | 2017 |
| Academic year of graduation: | 106 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 146 |
| Keywords: | Super resolution, SRCNN, APNN, Bicubic spline, Comparison Study |
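In video surveillance and visual inspection, the limited resolution of conventional cameras lowers monitoring quality and detection rates. Super-resolution methods can go beyond the physical limits of the camera by interpolating the original low-resolution image into a high-resolution one, improving the performance of the applications built on it. This thesis compares three super-resolution methods based on different principles: Bicubic spline, APNN, and SRCNN. Face recognition and barcode recognition are used to evaluate and quantify the improvement in monitoring quality and detection rate. The thesis also evaluates the cost of each algorithm once implemented, comparing perceived visual quality, memory usage, real-time performance, suitability for hardware implementation, relative power consumption, and hardware resource cost, so that embedded system developers have quantitative data to refer to when choosing a super-resolution method. In terms of effectiveness, the experimental results show that super-resolution improves the recognition rate of the system, with SRCNN giving the largest improvement, and that super-resolution also reduces the required data transmission bandwidth. In terms of cost, Bicubic spline needs few resources and computes quickly, which makes it suitable for embedded software and hardware; APNN needs relatively more resources and is suited to embedded hardware; SRCNN needs very large hardware resources and is therefore currently only suitable for GPU platforms. The thesis additionally optimizes the SRCNN hardware, trading logic gate count against accuracy to obtain an approximate computation of the algorithm. Two solutions were found that simplify the hardware of SRCNN's constant-weight multipliers: one reduces the gate count by 0.79% while increasing the average feature difference by 0.000120, and the other reduces the gate count by 12.2% while increasing the average feature difference by 0.264911.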
In video surveillance and vision-based inspection, traditional cameras deliver poor monitoring quality and low detection rates because of their low resolution. Super-resolution methods address this by converting low-resolution images into high-resolution images, which increases the effectiveness of the applications that use them. In this thesis, three super-resolution methods (Bicubic spline, APNN, and SRCNN) are compared and evaluated through two experiments, face recognition and barcode recognition. The properties of each algorithm, including perceived visual quality, memory usage, execution time, hardware complexity, relative power consumption, and hardware resources, are also evaluated, so that system designers can use the experimental results to choose an appropriate super-resolution method for embedded system development. Our results show that SRCNN gives the greatest improvement in recognition rate among the three methods. In terms of cost, the Bicubic spline method is suitable for embedded software and hardware applications due to its low cost and high speed. APNN requires relatively more resources and is suited to embedded hardware applications. Because of its high resource usage, SRCNN is currently only suitable for GPU platforms. We therefore use a genetic algorithm to find a trade-off between hardware cost and accuracy and obtain an approximate computation for the SRCNN hardware. We found two approximate solutions for SRCNN: a 0.79% decrease in hardware cost with an increase of 0.000120 in average feature difference, or a 12.2% decrease in hardware cost with an increase of 0.264011 in average feature difference.
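The trade-off between multiplier hardware cost and accuracy described in the abstract can be illustrated with a small sketch. This is not the thesis's genetic-algorithm search. It assumes a simpler scheme in which each constant weight is quantized to fixed point, written in canonical signed-digit form, and truncated to its most significant terms, so that the multiplier becomes a few shifts and adds. The function names `csd_digits` and `approximate_weight`, the parameters `frac_bits` and `max_terms`, and the cost model (adders = kept terms minus one) are all hypothetical choices made for illustration.

```python
def csd_digits(n: int) -> list[tuple[int, int]]:
    """Return the nonzero digits of n in canonical signed-digit form.

    Each entry is (sign, bit_position) with sign in {+1, -1}, ordered from
    the least significant position upward.
    """
    digits = []
    pos = 0
    while n != 0:
        if n & 1:
            # n mod 4 == 1 gives +1, n mod 4 == 3 gives -1, which
            # guarantees that the next digit is zero.
            d = 2 - (n & 3)
            digits.append((d, pos))
            n -= d
        n >>= 1
        pos += 1
    return digits


def approximate_weight(weight: float, frac_bits: int, max_terms: int) -> tuple[float, int]:
    """Approximate a constant weight with at most max_terms shift-add terms.

    Returns (approximate_value, adder_count). The adder count is an assumed
    stand-in for hardware cost, not a gate-level estimate.
    """
    q = round(weight * (1 << frac_bits))           # fixed-point quantization
    digits = csd_digits(q)
    kept = digits[-max_terms:] if max_terms > 0 else []
    value = sum(s * (1 << p) for s, p in kept) / (1 << frac_bits)
    adders = max(len(kept) - 1, 0)
    return value, adders


if __name__ == "__main__":
    w = 0.7  # illustrative weight, not taken from a trained SRCNN
    for terms in range(1, 6):
        approx, adders = approximate_weight(w, frac_bits=8, max_terms=terms)
        print(f"terms={terms} adders={adders} approx={approx:.6f} error={abs(w - approx):.6f}")
```

Sweeping `max_terms` shows the shape of the trade-off: fewer terms mean fewer adders but a larger deviation from the original weight. A search procedure such as the genetic algorithm mentioned in the abstract would choose such approximations per weight against a network-level accuracy measure, which this sketch does not model.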