
Graduate Student: Ning Yen (顏寧)
Thesis Title: Transfer Learning Based Model for Image Recognition of Architecture (基於遷移學習的建築影像辨識模型)
Advisor: Feng-Nan Hwang (黃楓南)
Oral Examination Committee:
Degree: Master
Department: Department of Mathematics, College of Science
Year of Publication: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: Chinese
Pages: 47
Chinese Keywords: 層轉移, 遷移學習, 影像辨識, MobileNetV2, 建築
Foreign Keywords: Layer transfer, Transfer learning, Image recognition, MobileNetV2, Architecture
    People often rely on features such as signs and distinctive shapes to identify a building. When viewing a facade from which these features are not visible, or when buildings look alike, correct identification can take considerably longer. This study aims to train a building classification model and, further, to support the development of an application that helps users identify buildings quickly. We collected exterior images of four buildings on the National Central University campus as the target data, used MobileNetV2 as the base model, and trained it with the layer transfer technique from transfer learning, reaching 95% accuracy on the multi-class classification task. We compared the effects of different numbers of frozen layers, data augmentation methods, and numbers of training epochs, and found that the choice of the number of frozen layers has a significant impact on model performance. In addition, we used humans and Teachable Machine performing the same classification task as baselines: human accuracy was 62%, which our model clearly exceeds; Teachable Machine reached 90%, over which our model is a slight improvement.


    For human beings, identifying buildings relies on features like signs and distinctive shapes. However, correctly recognizing structures becomes challenging when these features are not visible from a particular facade or when buildings share a similar appearance. This study focuses on developing a deep learning-based model for building recognition that can be integrated into a mobile application, allowing users to identify buildings quickly. The model employs MobileNetV2 as the base model for the layer transfer technique. We collected exterior images of four buildings at National Central University in Taiwan for the training and testing datasets. To optimize the model's performance, we conducted a parametric study exploring the impact of various factors, including the number of frozen layers, data augmentation techniques, and training epochs. Our model achieved an accuracy as high as 95% on the multi-class classification task. In addition, we ran the same experiment with humans and with Teachable Machine, a web-based machine learning tool developed by Google, as benchmarks for comparison. The accuracy of humans was 62%, while Teachable Machine achieved 90%. Our model surpassed both in terms of accuracy.
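The layer transfer setup described in the abstract can be sketched in Keras as follows. This is a minimal illustration, not the thesis's exact configuration: the frozen-layer count `N_FROZEN` and the classifier head are hypothetical choices (the thesis compares several frozen-layer settings), and `weights=None` is used here only to keep the sketch offline-friendly where the thesis would start from pretrained ImageNet weights.

```python
# Sketch of layer transfer with a MobileNetV2 backbone (hypothetical settings).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 4   # the four campus buildings
N_FROZEN = 100    # hypothetical; the thesis compares several values

# MobileNetV2 backbone without its ImageNet classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)

# Layer transfer: freeze the first N_FROZEN layers so their weights are
# kept from the source task; the remaining layers stay trainable.
for layer in base.layers[:N_FROZEN]:
    layer.trainable = False

# New classification head for the four-class target task.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Varying `N_FROZEN` reproduces the parametric study the abstract describes: fewer frozen layers let more of the backbone adapt to the target images at the cost of more trainable parameters.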

    Acknowledgments
    List of Tables
    List of Figures
    1 Introduction
    2 Convolutional Neural Networks
      2.1 Convolutional Layer
      2.2 Pooling Layer
    3 Transfer Learning
      3.1 Types of Transfer Learning
      3.2 MobileNetV2
        3.2.1 Depthwise Separable Convolution
        3.2.2 Linear Bottleneck
        3.2.3 Inverted Residuals
      3.3 Fine-Tuning
        3.3.1 Layer Transfer
        3.3.2 Conservative Training
    4 Experiments
      4.1 Data Collection
      4.2 Data Preprocessing
        4.2.1 Horizontal Flip
        4.2.2 Rotation
        4.2.3 Brightness Adjustment
        4.2.4 Zoom
      4.3 Experimental Design
        4.3.1 Parameter Tuning
        4.3.2 Model Evaluation Metrics
    5 Results and Discussion
      5.1 Parameter Tuning Comparison
        5.1.1 Number of Frozen Layers
        5.1.2 Data Augmentation Methods
        5.1.3 Number of Training Epochs
      5.2 Performance of Humans and Teachable Machine on the Same Task
      5.3 Model Performance with Imbalanced Data
    6 Conclusions and Future Work
    References
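The four augmentation methods listed under Data Preprocessing (horizontal flip, rotation, brightness adjustment, zoom) can be sketched with Pillow as below. The probability and parameter ranges here are illustrative assumptions, not the thesis's actual settings.

```python
# Sketch of the four augmentation methods from Chapter 4.2.
# All ranges (flip probability, rotation angle, brightness factor, zoom
# scale) are illustrative assumptions.
import random
from PIL import Image, ImageEnhance, ImageOps

def augment(img: Image.Image) -> Image.Image:
    # 4.2.1 Horizontal flip, applied with probability 0.5.
    if random.random() < 0.5:
        img = ImageOps.mirror(img)
    # 4.2.2 Rotation by a small random angle (in degrees).
    img = img.rotate(random.uniform(-15, 15))
    # 4.2.3 Brightness adjustment: factor 1.0 leaves the image unchanged.
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.7, 1.3))
    # 4.2.4 Zoom: crop a central region and resize back to the original size.
    w, h = img.size
    scale = random.uniform(0.8, 1.0)
    cw, ch = int(w * scale), int(h * scale)
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch)).resize((w, h))

# Example: augment a synthetic RGB image of the model's input size.
sample = Image.new("RGB", (224, 224), color=(120, 120, 120))
out = augment(sample)
```

Each transform preserves the image size, so augmented samples can be fed to the classifier without further resizing.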

