
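Graduate student: 曾欽緹 (ChinTi Tseng)
Thesis title: A Progressive Genetic-based Optimization for Network Architecture Search (以漸進式基因演算法實現神經網路架構搜尋最佳化)
Advisors: 陳以錚 (Yi-Chen Chen), 周惠文 (Huey-Wen Chou)
Oral defense committee:
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2020
Academic year of graduation: 108
Language: English
Number of pages: 50
Keywords (Chinese): Machine Learning (機器學習), Deep Learning (深度學習), Neural Architecture Search (神經架構搜尋), Genetic Algorithm (基因演算法), Automated Machine Learning (機器學習自動化)
Keywords (English): Machine Learning, Evolutionary Algorithm, Neural Architecture Search, Automated Machine Learning, Deep Learning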
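  • Machine learning is a technique in which a computer learns features from data on its own and then uses those features to make predictions on unseen data. Because the model architecture must be designed for each target dataset, the technique demands considerable domain expertise, research time, and resources, which creates a barrier and a bottleneck to widespread adoption. To speed up the construction of neural networks, we built a modeling algorithm that is based on evolutionary algorithms and combines deep learning techniques with a progressive concept. Paired with a purpose-designed cell structure, it is applied in environments where computing resources are scarce and automatically searches for the best-performing neural network architecture for a specific dataset.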


    Both designing a new neural network (NN) architecture and modifying an existing model require human expertise and intensive computational resources. We propose a progressive strategy for developing models at a “meta” level, an approach that has recently attracted the interest of experts. This meta-modeling algorithm is based on evolutionary algorithms and deep learning techniques and automatically generates NN architectures for a given task. Our work also encodes a model structure into multiple “cells” in a continuous representation. After defining the cell structure and its topology, we search for the structure for the given task cell by cell, brick by brick, and eventually find the structure with the highest accuracy.
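The cell-by-cell idea in the abstract can be illustrated with a toy example. The following is a minimal sketch only, not the thesis's PG-NAS implementation: the operation set `OPS`, the scoring function `toy_fitness` (a stand-in for training a network and measuring its accuracy), and all hyperparameters are hypothetical, and the predictor and generator components listed in the table of contents are omitted. It assumes at least two operations per cell.

```python
import random

# Hypothetical operation set; the thesis's actual search space is not given here.
OPS = ["conv3x3", "conv5x5", "sep_conv3x3", "max_pool", "identity"]


def toy_fitness(arch):
    """Stand-in for validation accuracy (assumption: a real system trains the model)."""
    score = 0.0
    for cell in arch:
        score += sum(1.0 for op in cell if op == "sep_conv3x3")
        score += 0.5 * len(set(cell))  # small bonus for operation diversity
    return score


def random_cell(rng, ops_per_cell):
    return tuple(rng.choice(OPS) for _ in range(ops_per_cell))


def mutate(cell, rng):
    """Replace one randomly chosen operation in the cell."""
    i = rng.randrange(len(cell))
    new = list(cell)
    new[i] = rng.choice(OPS)
    return tuple(new)


def crossover(a, b, rng):
    """Single-point crossover between two cells of equal length (>= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve_cell(prefix, rng, ops_per_cell=3, pop_size=8, generations=10):
    """Genetic search for one cell while the earlier cells (prefix) stay frozen."""
    population = [random_cell(rng, ops_per_cell) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: toy_fitness(prefix + [c]), reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=lambda c: toy_fitness(prefix + [c]))


def progressive_search(num_cells=3, seed=0):
    """Build the architecture one cell at a time, fixing each cell once found."""
    rng = random.Random(seed)
    arch = []
    for _ in range(num_cells):
        arch.append(evolve_cell(arch, rng))
    return arch


if __name__ == "__main__":
    best = progressive_search()
    print(best, toy_fitness(best))
```

Freezing earlier cells keeps each genetic search small, which is one plausible reading of why a progressive strategy suits environments with scarce computing resources.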

    Chinese Abstract
    Abstract
    Table of contents
    1. Introduction
    2. Related Work
    3. Proposed Methodology
    3.1 Predictor Training
    3.2 Generator Training
    3.3 PG-NAS
    3.4 Advantages of PG-NAS
    4. Experiment Details
    5. Experiment Results
    5.1 Performance Analysis
    5.2 Parameter Analysis
    5.3 Predictor Analysis
    5.4 Generator Analysis
    6. Discussion & Future Work
    6.1 Case Study
    6.2 Search Efficiency
    6.3 Future Work
    7. Conclusion
    References

