| Graduate student: | Hsuan-Hung Liu (劉軒宏) |
|---|---|
| Thesis title: | A Deep-learning-based Photo Aesthetics Assessment System (基於深度學習之相片美學評分系統) |
| Advisor: | Mu-Chun Su (蘇木春) |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 (ROC calendar, i.e. 2019-2020) |
| Language: | Chinese |
| Number of pages: | 82 |
| Keywords: | deep learning, image quality assessment, visual art, photographic composition |
In the past, the quality of a photograph depended not only on the photographer's technique but also, to a large degree, on the camera's hardware. With the advent and steady improvement of smartphones in recent years, hardware is no longer a key obstacle to taking good photos. Photographic skill, however, has not kept pace with hardware and still varies widely from person to person. In addition, in the days of film photography a photo could only be evaluated after the film had been developed, whereas today many post-processing tools are available to make photos more appealing.
This thesis therefore proposes a system that scores photos, enhances them, and analyzes their composition. Beyond helping users improve their photos, the composition analysis is intended to help users understand why the suggested changes improve a photo and how to take better photos themselves.
The system (1) scores the input photo with the NIMA model; (2) uses the NIMA score as the baseline for suggesting cropping adjustments and adjustments to color and other parameters; and (3) analyzes the composition of the adjusted photo.
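Steps (1) and (2) can be illustrated with a minimal sketch. NIMA predicts a probability distribution over the ratings 1 to 10, and the mean of that distribution is commonly used as the photo's score. The selection rule below (keep the candidate adjustment whose mean score exceeds the original's by the most) and the names `mean_score` and `best_adjustment` are assumptions made for illustration; this record does not describe the thesis's actual search procedure.

```python
from typing import Dict, Optional, Sequence


def mean_score(distribution: Sequence[float]) -> float:
    """Expected rating of a 10-bin distribution over the scores 1..10."""
    if len(distribution) != 10:
        raise ValueError("expected 10 probabilities, one per score 1..10")
    total = sum(distribution)
    if total <= 0:
        raise ValueError("distribution must have positive mass")
    # Normalise defensively, then weight each score by its probability.
    return sum(score * p / total for score, p in enumerate(distribution, start=1))


def best_adjustment(original: Sequence[float],
                    candidates: Dict[str, Sequence[float]]) -> Optional[str]:
    """Hypothetical rule: name of the candidate that most improves on the
    original photo's mean score, or None if no candidate improves on it."""
    best_name: Optional[str] = None
    best_value = mean_score(original)  # the original photo is the baseline
    for name, dist in candidates.items():
        value = mean_score(dist)
        if value > best_value:
            best_name, best_value = name, value
    return best_name
```

In a real pipeline the distributions would come from running a trained NIMA model on the original photo and on each cropped or color-adjusted candidate; here they are plain lists so the sketch stays self-contained.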
Experimental results show that the NIMA scores agree with common human aesthetic judgment and that the composition analysis reaches an average Top-5 accuracy of 92.08%. The system is therefore usable to a reasonable degree.
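For reference, Top-5 accuracy counts a prediction as correct when the true class is among the five highest-scoring classes. The sketch below shows that computation for a single set of predictions; the function name `top_k_accuracy` is illustrative, and how the thesis averages the figure (for example across models or test splits) is not stated in this record.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int = 5) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring composition classes for this photo.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```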