A Deep-learning-based Photo Aesthetics Assessment System | National Central University Theses and Dissertations System

Graduate student:	劉軒宏 Hsuan-Hung Liu
Thesis title:	A Deep-learning-based Photo Aesthetics Assessment System
Advisor:	蘇木春 Mu-Chun Su
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication:	2020
Academic year of graduation:	108 (ROC calendar)
Language:	Chinese
Number of pages:	82
Keywords:	deep learning, image quality assessment, visual art, photographic composition

In the past, the quality of a photograph depended not only on the photographer's technique but also heavily on the camera's hardware. With the advent and steady improvement of smartphones in recent years, hardware is no longer a key obstacle to taking a good photo. Individual skill, however, has not kept pace with hardware and still varies widely from person to person. In addition, with traditional film photography a photo could only be judged after the film was developed, whereas many post-production tools are now available to make photos more attractive.
This thesis therefore proposes a system that scores photos, improves them, and analyzes their composition. Beyond helping users improve their photos, the composition analysis is intended to help users understand why the suggested changes are improvements and how to take better photos.
The system (1) uses the NIMA model to score the input photo, (2) suggests cropping adjustments and adjustments to color and other parameters, using the NIMA score as the benchmark, and (3) analyzes the composition of the adjusted photo.
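As a minimal sketch of the scoring step: NIMA, as published, predicts a probability distribution over the ratings 1 to 10, and the mean of that distribution is commonly used as the photo's score. The code below assumes such a distribution is already available for each candidate edit and simply keeps the candidate with the highest mean. The function names (`mean_score`, `best_candidate`) and the example distributions are hypothetical, and this record does not describe how the thesis actually generates or searches candidates, so this is an illustration rather than the author's algorithm.

```python
from typing import Dict, Sequence, Tuple


def mean_score(distribution: Sequence[float]) -> float:
    """Expected rating of a distribution over the ratings 1..N.

    The distribution is normalized first, so unnormalized
    non-negative weights are also accepted.
    """
    total = sum(distribution)
    if total <= 0:
        raise ValueError("distribution must have positive total mass")
    # Rating i+1 is weighted by the probability stored at index i.
    return sum((i + 1) * p for i, p in enumerate(distribution)) / total


def best_candidate(candidates: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the name and mean score of the highest-scoring candidate.

    `candidates` maps a hypothetical edit name (for example "original"
    or "cropped") to the rating distribution predicted for that edit.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    scored = {name: mean_score(dist) for name, dist in candidates.items()}
    winner = max(scored, key=scored.get)
    return winner, scored[winner]


if __name__ == "__main__":
    # Made-up distributions, for illustration only.
    example = {
        "original": [0, 0, 0, 0.1, 0.3, 0.4, 0.2, 0, 0, 0],
        "cropped": [0, 0, 0, 0, 0.2, 0.4, 0.3, 0.1, 0, 0],
    }
    print(best_candidate(example))
```

Using the mean as the single benchmark keeps the comparison between candidates simple; a real system might also look at the spread of the distribution before accepting an edit.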
Experimental results show that the NIMA scores are consistent with common human aesthetic judgment, and that composition analysis reaches an average Top-5 accuracy of 92.08%. The system is therefore usable to a certain degree.
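For readers unfamiliar with the metric, Top-5 accuracy counts a prediction as correct when the true class is among the five classes the model ranks highest. The sketch below computes Top-k accuracy from per-class scores in plain Python. The function name `top_k_accuracy` and the toy data are hypothetical, and how the thesis averages the figure (for example across classes or runs) is not stated in this record.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose true label is among the k top-scored classes.

    `scores[i][c]` is the model's score for class c on sample i,
    and `labels[i]` is the index of the true class for sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Toy example with four classes and k=2, for illustration only.
    toy_scores = [
        [0.10, 0.50, 0.30, 0.10],
        [0.70, 0.15, 0.10, 0.05],
        [0.40, 0.30, 0.20, 0.10],
    ]
    toy_labels = [2, 3, 0]
    print(top_k_accuracy(toy_scores, toy_labels, k=2))
```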

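A Deep-learning-based Photo Aesthetics Assessment System
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1-1 Research Motivation
1-2 Research Objectives
1-3 Thesis Organization
Chapter 2. Related Work
2-1 Photo Aesthetics Scoring
2-2 Photo Composition Classification
2-3 Photo Retouching Methods
2-3-1 Photo Enhancement
2-3-2 Crop Adjustment
2-4 Comparison of Existing Systems
Chapter 3. Methodology
3-1 System Workflow
3-2 Photo Category Classification Model
3-3 Photo Scoring Model
3-4 Photo Optimization Algorithm
3-4-1 Crop Adjustment Algorithm
3-4-2 Parameter Adjustment Algorithm
3-5 Photo Composition Classification
Chapter 4. Experimental Design and Results
4-1 Photo Classification Experiment
4-2 Photo Optimization Method Experiments
4-2-1 Photo Scoring Approach
4-2-2 Results and Analysis of Parameter Selection
4-3 Photo Composition Classification Experiments
4-3-1 Results and Analysis of Dataset Adjustment
4-3-2 Results and Analysis of Model Architecture
4-3-3 Results and Analysis of Image Post-processing
Chapter 5. Conclusions and Future Work
5-1 Conclusions
5-2 Future Work
References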

