| Graduate Student: | 凃建名 Chien-Ming Tu |
|---|---|
| Thesis Title: | 以卷積神經網路為基礎之新型可解釋性深度學習模型 (A New CNN-Based Interpretable Deep Learning Model) |
| Advisor: | 蘇木春 Mu-Chun Su |
| Degree: | Master |
| Department: | Department of Computer Science & Information Engineering, College of Electrical Engineering & Computer Science |
| Year of Publication: | 2024 |
| Academic Year of Graduation: | 112 (ROC calendar) |
| Language: | Chinese |
| Number of Pages: | 111 |
| Keywords: | Explainable Artificial Intelligence, Deep Learning, Color Perception, Color Images |
Interpretable models are becoming increasingly important as deep learning technologies are deployed across a wide range of sectors.
Despite their excellent accuracy, "black-box" models frequently obscure the decision-making process.
Interpretable models, by contrast, not only increase users' confidence in a model but also offer useful insight when anomalies occur.
This research proposes a new CNN-based interpretable deep learning model.
The model comprises three main components: a color perception block, a contour perception block, and a feature transmission block.
The color perception block extracts color features from the input image by calculating the similarity between the average color of different parts of the input image and 30 basic colors.
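As a rough illustration of this color perception step, the sketch below computes, for each region of an image, the similarity between the region's average color and each entry of a small palette. The six-color palette, the region grid size, and the plain Euclidean RGB distance are illustrative assumptions: the thesis uses 30 basic colors and may use a different similarity metric.

```python
import numpy as np

# Hypothetical palette: the thesis uses 30 basic colors; only a few
# RGB entries are shown here for illustration.
BASIC_COLORS = np.array([
    [255, 0, 0],     # red
    [0, 255, 0],     # green
    [0, 0, 255],     # blue
    [255, 255, 0],   # yellow
    [0, 0, 0],       # black
    [255, 255, 255]  # white
], dtype=float)

def color_similarity(avg_rgb, palette=BASIC_COLORS):
    """Similarity between one region's average color and each palette color.

    Euclidean RGB distance, rescaled to [0, 1], is a stand-in metric;
    1 means identical, 0 means maximally distant.
    """
    dists = np.linalg.norm(palette - avg_rgb, axis=1)
    max_dist = np.linalg.norm([255.0, 255.0, 255.0])
    return 1.0 - dists / max_dist

def color_features(image, grid=4):
    """Split an H x W x 3 image into grid x grid regions and compute, for
    each region, the similarity of its average color to every palette color."""
    h, w, _ = image.shape
    feats = []
    for i in range(grid):
        for j in range(grid):
            region = image[i * h // grid:(i + 1) * h // grid,
                           j * w // grid:(j + 1) * w // grid]
            avg = region.reshape(-1, 3).mean(axis=0)
            feats.append(color_similarity(avg))
    return np.stack(feats)  # shape: (grid * grid, n_palette_colors)
```

A pure-red image, for example, yields similarity 1.0 against the red palette entry in every region and strictly smaller values against the others.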
The contour perception block detects contour features in the image by converting the color image to grayscale through preprocessing and then applying Gaussian convolution and feature enhancement.
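The contour perception step can be sketched as follows. A difference-of-Gaussians is used here as one plausible stand-in for "Gaussian convolution and feature enhancement"; the luminance weights, kernel size, and sigmas are assumptions, not the thesis's actual parameters.

```python
import numpy as np

def to_grayscale(image):
    """Luminance-weighted grayscale conversion (ITU-R BT.601 weights)."""
    return image @ np.array([0.299, 0.587, 0.114])

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def filter2d(img, kernel):
    """Naive 'same'-size 2-D filtering with zero padding (the kernel is
    symmetric, so correlation and convolution coincide)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def contour_features(image, sigma1=1.0, sigma2=2.0):
    """Convert to grayscale, smooth at two scales, and subtract:
    the difference-of-Gaussians responds strongly at contours and is
    near zero in flat regions."""
    gray = to_grayscale(image)
    fine = filter2d(gray, gaussian_kernel(5, sigma1))
    coarse = filter2d(gray, gaussian_kernel(5, sigma2))
    return np.abs(fine - coarse)
```

On an image with a sharp vertical edge, the response is large along the edge and essentially zero in the uniform areas on either side.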
The feature transmission block applies convolution and response filtering modules to the input features and then combines them through a space merging module into more complete features, which are passed on to the next layer until they reach the fully connected layer.
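One plausible reading of the space merging module is a space-to-depth rearrangement that folds each spatial neighborhood into the channel dimension, so that each successive layer sees progressively more complete features. This is an assumption for illustration only; the thesis may define the module differently.

```python
import numpy as np

def space_merge(feature_map, block=2):
    """Merge each block x block spatial neighborhood of an H x W x C
    feature map into the channel dimension, producing an
    (H/block) x (W/block) x (block*block*C) map. No values are lost;
    they are only rearranged."""
    h, w, c = feature_map.shape
    assert h % block == 0 and w % block == 0
    out = feature_map.reshape(h // block, block, w // block, block, c)
    out = out.transpose(0, 2, 1, 3, 4)  # group each neighborhood together
    return out.reshape(h // block, w // block, block * block * c)
```

For a 4 x 4 x 1 map, each output position then holds the four values of one 2 x 2 neighborhood.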
Finally, the output color features and contour features are concatenated and passed into the fully connected layer for classification.
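The final classification step can be sketched as concatenating the two feature vectors and applying a single fully connected layer with softmax. The dimensions below (480 color features, 256 contour features, 10 classes) and the random weights are hypothetical; the thesis's network has more structure than this minimal sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def fully_connected_classify(color_feats, contour_feats, weights, bias):
    """Concatenate color and contour feature vectors, apply one fully
    connected layer, and return softmax class probabilities."""
    x = np.concatenate([color_feats.ravel(), contour_feats.ravel()])
    logits = weights @ x + bias
    exp = np.exp(logits - logits.max())  # shift for numerical stability
    return exp / exp.sum()

# Usage with hypothetical dimensions (10 output classes):
color_feats = rng.random(480)    # e.g. 16 regions x 30 basic colors
contour_feats = rng.random(256)  # flattened contour response map
W = rng.standard_normal((10, 480 + 256)) * 0.01
b = np.zeros(10)
probs = fully_connected_classify(color_feats, contour_feats, W, b)
```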
Three main datasets are used in this study: MNIST, Colored MNIST, and Colored Fashion MNIST,
on which the model achieves test accuracies of 0.9566, 0.9540, and 0.8223, respectively.
The experimental results show that the proposed model performs well in terms of both interpretability and accuracy.
In particular, on the Colored MNIST and Colored Fashion MNIST datasets, the model not only accurately distinguishes images of different colors and shapes but also visualizes its internal decision-making logic, validating its interpretability and practicality.