| Author: | 謝宗翰 (Tsung-Han Hsieh) |
|---|---|
| Title: | A Low-Power Low-Area Reconfigurable AI Accelerator Design for MobileNet and ShuffleNet |
| Advisor: | 周景揚 (Jing-Yang Jou) |
| Committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of publication: | 2022 |
| Academic year of graduation: | 110 |
| Language: | English |
| Pages: | 37 |
| Keywords: | AI Accelerator Design, ShuffleNet, MobileNet |
Convolutional Neural Networks (CNNs) are currently a focal point of development in the field of Deep Neural Networks. Because CNNs can match or even exceed human accuracy in image recognition, they have been applied more and more to object detection and image classification in recent years. However, if all data are sent to a central processing node for computation at the same time, the large data volume and complex calculations create a system bottleneck and degrade overall computing performance. To address this bottleneck effectively, researchers have proposed edge computing: performing operations directly on edge devices instead of sending data to a central processing node. Still, the computing power of an edge device is limited, so a huge CNN cannot be deployed on an edge device for computation. Accordingly, to overcome this challenge, Lightweight Neural Networks have been proposed to reduce computation and network complexity while enabling computation on edge devices.
Recently, two lightweight neural networks, MobileNet and ShuffleNet, have been widely adopted. With proper training and configuration, the accuracy loss of these two networks is limited and acceptable to users. However, most current advanced AI accelerators are not well suited to MobileNet and ShuffleNet. Thus, in this thesis, we propose a low-power low-area reconfigurable AI accelerator design for MobileNet and ShuffleNet. Unlike previous works, our accelerator supports depthwise convolution, pointwise convolution, and pointwise group convolution, as well as the channel shuffle operation, so it is well matched to the characteristics of both networks.
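The operations named above can be sketched in plain NumPy for reference. This is a minimal functional illustration of what depthwise convolution, pointwise (group) convolution, and channel shuffle compute, not the accelerator's actual dataflow; all function names here are hypothetical.

```python
import numpy as np

def depthwise_conv(x, w):
    """Depthwise convolution: one k x k filter per input channel.
    x: (C, H, W), w: (C, k, k). 'Same' zero padding, stride 1."""
    C, H, W = x.shape
    k = w.shape[1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((C, H, W))
    for c in range(C):                      # each channel is filtered independently
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i+k, j:j+k] * w[c])
    return out

def pointwise_group_conv(x, w, groups=1):
    """1x1 (group) convolution; with groups=1 this is plain pointwise convolution.
    x: (C_in, H, W), w: (C_out, C_in // groups)."""
    C_in, H, W = x.shape
    C_out = w.shape[0]
    cig, cog = C_in // groups, C_out // groups
    out = np.zeros((C_out, H, W))
    for g in range(groups):
        xg = x[g*cig:(g+1)*cig]             # this group's input channels
        wg = w[g*cog:(g+1)*cog]             # this group's 1x1 filters
        out[g*cog:(g+1)*cog] = np.tensordot(wg, xg, axes=([1], [0]))
    return out

def channel_shuffle(x, groups):
    """ShuffleNet channel shuffle: interleave channels across groups."""
    C, H, W = x.shape
    return x.reshape(groups, C // groups, H, W).transpose(1, 0, 2, 3).reshape(C, H, W)
```

Group convolution restricts each output channel to a subset of input channels, which is why ShuffleNet needs the channel shuffle step to let information cross group boundaries.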
Experimental results show that our accelerator design can successfully compute MobileNet and ShuffleNet. In addition, compared with previous works, our design increases throughput by 3.7% in FPGA verification and, at the same performance in a 45 nm process, reduces area by up to 56% and power by 58%.