Graduate student: 江建衛 (Jiang Jian Wei)
Thesis title: Application of Generative Adversarial Networks to Image Inpainting (生成對抗網路在影像填補的應用)
Advisor: 楊肅煜 (Suh-Yuh Yang)
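Oral defense committee:
Degree: Master
Department: College of Science, Department of Mathematics
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar, i.e. 2018-2019)
Language: Chinese
Number of pages: 38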
Keywords: neural network, artificial neural network, convolutional neural network, generative adversarial network, image inpainting, computer vision, deep learning, artificial intelligence
  • Building on the basic ideas of Pathak et al. [13] and Iizuka et al. [6], this thesis constructs a generative adversarial network for image inpainting. Because the computing power of the available hardware is limited, we build a model with fewer layers, intended for relatively simple inpainting tasks: images with a single subject and a small proportion of missing area. The main goal is to fill in a small missing region at the center of an image. To this end, we adopt the generative adversarial network idea proposed by Goodfellow et al. [4], in which a generative network and an adversarial network compete with each other, to improve inpainting quality. More specifically, the networks are built from convolutional layers, and the generative network uses the dilated convolution [17] described by Iizuka et al. [6]. Following Ioffe and Szegedy [7], a normalization layer is added after every layer of each network except the last, to make training more effective. Simulation experiments show that the resulting generative adversarial network accomplishes the main inpainting task quite effectively.
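Two ingredients named in the abstract, a missing region at the image center and dilated convolution, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the thesis's model: the function names, the 8x8 toy image, the 2x2 hole, the fixed averaging kernel, and the final compositing step are all hypothetical choices made here. A trained generator would learn its kernels and stack many such layers.

```python
import numpy as np


def center_mask(height, width, hole):
    """Binary mask: 1 inside a centered hole x hole square, 0 elsewhere."""
    mask = np.zeros((height, width), dtype=np.float64)
    top = (height - hole) // 2
    left = (width - hole) // 2
    mask[top:top + hole, left:left + hole] = 1.0
    return mask


def dilated_conv2d(image, kernel, dilation=1):
    """Single-channel dilated cross-correlation (no padding, stride 1).

    Kernel taps are spaced `dilation` pixels apart, so a k x k kernel
    covers a window of size dilation * (k - 1) + 1 with the same weights.
    """
    k_h, k_w = kernel.shape
    span_h = dilation * (k_h - 1) + 1
    span_w = dilation * (k_w - 1) + 1
    out_h = image.shape[0] - span_h + 1
    out_w = image.shape[1] - span_w + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel tap.
    for i in range(k_h):
        for j in range(k_w):
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out


def composite(original, generated, mask):
    """Keep known pixels from `original`; take hole pixels from `generated`."""
    return (1.0 - mask) * original + mask * generated


# Toy data (hypothetical): an 8x8 "image" with a 2x2 hole at its center.
image = np.arange(64, dtype=np.float64).reshape(8, 8)
mask = center_mask(8, 8, hole=2)
corrupted = image * (1.0 - mask)

# The same 3x3 averaging kernel sees a 3x3 window at dilation 1
# and a 5x5 window at dilation 2.
kernel = np.ones((3, 3)) / 9.0
features_d1 = dilated_conv2d(corrupted, kernel, dilation=1)
features_d2 = dilated_conv2d(corrupted, kernel, dilation=2)

# Stand-in for a generator output: a constant image (assumption).
filled = composite(image, np.full_like(image, -1.0), mask)
```

Because nothing is padded, the output shrinks as the dilation grows. That shrinkage shows directly that each output value draws on a wider neighborhood of the corrupted image, which is the property that makes dilation useful when the hole must be filled from surrounding context.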

    1 Introduction
    2 Multilayer neural networks
      2.1 Forward propagation
      2.2 Backpropagation
    3 Convolutional neural networks
      3.1 Forward propagation
      3.2 Backpropagation
      3.3 Dilated convolution
    4 Generative adversarial networks
      4.1 Introduction to the concept
      4.2 Training procedure
      4.3 Loss function
    5 Image inpainting
      5.1 Network architecture
      5.2 Loss function
      5.3 Training the model
      5.4 Model optimization
    6 Simulation experiments
      6.1 Preprocessing of image data
      6.2 Experimental results and discussion
    7 Conclusion
    References

    [1] C. Barnes, E. Shechtman, D. B. Goldman, and A. Finkelstein, The generalized PatchMatch correspondence algorithm, European Conference on Computer Vision, 2010, pp. 29-43.
    [2] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, 2016.
    [3] K. Fukushima, Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biological Cybernetics, 36 (1980), pp. 193-202.
    [4] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, Advances in Neural Information Processing Systems, 27 (2014), pp. 2672-2680.
    [5] D. H. Hubel and T. N. Wiesel, Receptive fields of single neurones in the cat’s striate cortex, The Journal of Physiology, 148 (1959), pp. 574-591.
    [6] S. Iizuka, E. Simo-Serra, and H. Ishikawa, Globally and locally consistent image completion, ACM Transactions on Graphics, 36 (2017), pp. 107:1-107:14.
    [7] S. Ioffe and C. Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning, 2015. (arXiv:1502.03167v3)
    [8] D. P. Kingma and J. L. Ba, Adam: A method for stochastic optimization, International Conference on Learning Representations, 2015. (arXiv:1412.6980v9)
    [9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 25 (2012), pp. 1097-1105.
    [10] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation, 1 (1989), pp. 541-551.
    [11] Z. Liu, P. Luo, X. Wang, and X. Tang, Deep learning face attributes in the wild, International Conference on Computer Vision, 2015. (arXiv:1411.7766v3)
    [12] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics, 5 (1943), pp. 115-133.
    [13] D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. A. Efros, Context encoders: feature learning by inpainting, Conference on Computer Vision and Pattern Recognition, 2016, pp. 2536-2544.
    [14] S. Raschka, Python Machine Learning, Packt Publishing Ltd, 2015.
    [15] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Nature, 323 (1986), pp. 533-536.
    [16] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, Mastering the game of Go with deep neural networks and tree search, Nature, 529 (2016), pp. 484-489.
    [17] F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, International Conference on Learning Representations, 2016. (arXiv:1511.07122v3)
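    [18] 齋藤康毅 (K. Saito), translated by 吳嘉芳, Deep Learning:用Python進行深度學習的基礎理論實作 (Deep Learning: Fundamental Theory and Implementation of Deep Learning with Python), O'Reilly, Taiwan, 2017.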
