| Graduate student: | 江建衛 Jiang Jian Wei |
|---|---|
| Thesis title: | 生成對抗網路在影像填補的應用 Application of Generative Adversarial Networks to Image Inpainting |
| Advisor: | 楊肅煜 Suh-Yuh Yang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Science - Department of Mathematics |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 38 |
| Keywords (Chinese): | 神經網路 、類神經網路 、卷積神經網路 、生成對抗網路 、影像填補 、電腦視覺 、深度學習 、人工智慧 |
| Keywords (English): | neural network, artificial neural network, convolutional neural network, generative adversarial network, image inpainting, computer vision, deep learning, artificial intelligence |
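This thesis builds on the basic ideas of Pathak et al. [13] and Iizuka et al. [6] to construct a generative adversarial network for image inpainting. Given the limited computing power of our hardware, we build a neural network model with fewer layers that handles some simpler inpainting tasks, such as images with a single subject and a missing region that covers only a small proportion of the image. The main goal of this thesis is to fill in a small missing region at the center of an image. To achieve this goal, we adopt the generative adversarial network idea proposed by Goodfellow et al. [4], using the competition between the generative network and the adversarial network to strengthen the inpainting performance. More specifically, we construct the networks from convolutional layers, and the generative network uses the dilated convolution [17] mentioned by Iizuka et al. [6]. We also adopt the idea of Ioffe and Szegedy [7]: in every network, a normalization layer is added after each layer except the last one to improve training. Finally, simulation results show that our generative adversarial network model accomplishes the main inpainting task quite effectively.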
Based on the work of Pathak et al. [13] and Iizuka et al. [6], this thesis introduces a simple generative adversarial network approach to image inpainting. Given limited computational capacity, we build a simplified model that can reconstruct lost or deteriorated parts of images that have a single subject and a small missing region. To generate the image content of the missing region, we mainly employ the generative adversarial network approach proposed by Goodfellow et al. [4]. More specifically, the proposed neural network consists of convolutional layers, with dilated convolution used in the generative network. In addition, every layer except the output layer is followed by a normalization layer [7] to improve the training of the network. Numerical experiments demonstrate that the simplified generative adversarial network performs well on this image inpainting task.
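Two ingredients named in the abstract, a small missing region at the image center and dilated convolution, can be illustrated with a minimal pure-Python sketch. This is not the thesis code: the function names (`center_mask`, `apply_mask`, `dilated_conv2d`), the 8x8 image size, the 2x2 hole, and the all-ones 3x3 kernel are assumptions chosen only for illustration, and a real model would use a deep learning framework with learned kernels.

```python
# Illustrative sketch only; names, sizes, and kernel values are assumptions,
# not taken from the thesis.

def center_mask(height, width, hole):
    """Return a 0/1 mask with a hole x hole block of ones at the image center."""
    top = (height - hole) // 2
    left = (width - hole) // 2
    return [
        [1 if top <= r < top + hole and left <= c < left + hole else 0
         for c in range(width)]
        for r in range(height)
    ]


def apply_mask(image, mask, fill=0.0):
    """Replace the masked pixels of a grayscale image with a constant value."""
    return [
        [fill if m else p for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation with a dilated kernel (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    # The effective kernel extent grows with dilation: k + (k - 1) * (d - 1).
    eh = kh + (kh - 1) * (dilation - 1)
    ew = kw + (kw - 1) * (dilation - 1)
    out_h = len(image) - eh + 1
    out_w = len(image[0]) - ew + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i * dilation][c + j * dilation]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 8x8 grayscale image whose pixel value encodes its position.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
mask = center_mask(8, 8, hole=2)      # 2x2 hole in the middle
masked = apply_mask(image, mask)      # input a generator would receive

ones = [[1.0] * 3 for _ in range(3)]
dense = dilated_conv2d(image, ones, dilation=1)   # sees a 3x3 window
wide = dilated_conv2d(image, ones, dilation=2)    # sees a 5x5 window
print(len(dense), len(wide))
```

With the same nine weights, dilation 2 covers a 5x5 window instead of 3x3, which is why dilated convolution lets a shallow generator draw on context farther from the hole without adding parameters.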
[1] C. Barnes, E. Shechtman, D. B. Goldman, and A. Finkelstein, The generalized PatchMatch correspondence algorithm, European Conference on Computer Vision, (2010), pp. 29-43.
[2] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, 2016.
[3] K. Fukushima, Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biological Cybernetics, 36 (1980), pp. 193-202.
[4] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, Advances in Neural Information Processing Systems, 27 (2014), pp. 2672-2680.
[5] D. H. Hubel and T. N. Wiesel, Receptive fields of single neurones in the cat’s striate cortex, The Journal of Physiology, 148 (1959), pp. 574-591.
[6] S. Iizuka, E. Simo-Serra, and H. Ishikawa, Globally and locally consistent image completion, ACM Transactions on Graphics, 36 (2017), pp. 107:1-107:14.
[7] S. Ioffe and C. Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning, 2015. (arXiv:1502.03167v3)
[8] D. P. Kingma and J. L. Ba, Adam: A method for stochastic optimization, International Conference on Learning Representations, 2015. (arXiv:1412.6980v9)
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 25 (2012), pp. 1097-1105.
[10] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation, 1 (1989), pp. 541-551.
[11] Z. Liu, P. Luo, X. Wang, and X. Tang, Deep learning face attributes in the wild, International Conference on Computer Vision, (2015). (arXiv:1411.7766v3)
[12] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics, 5 (1943), pp. 115-133.
[13] D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A. A. Efros, Context encoders: feature learning by inpainting, Conference on Computer Vision and Pattern Recognition, (2016), pp. 2536-2544.
[14] S. Raschka, Python Machine Learning, Packt Publishing Ltd, 2015.
[15] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Nature, 323 (1986), pp. 533-536.
[16] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. V. D. Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, Mastering the game of Go with deep neural networks and tree search, Nature, 529 (2016), pp. 484-489.
[17] F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, International Conference on Learning Representations, 2016. (arXiv:1511.07122v3)
[18] 齋藤康毅, translated by 吳嘉芳, Deep Learning:用Python進行深度學習的基礎理論實作, O'Reilly (歐萊禮), Taiwan, 2017.