| Author: | 翁仲毅 Chung-Yi Weng |
|---|---|
| Thesis Title: | 應用Vision Transformer結合圖像描述技術之晶圓瑕疵分類 (Applying Vision Transformer Integrated with Image Captioning Techniques for Wafer Defect Classification) |
| Advisor: | 王啓泰 Chi-Tai Wang |
| Degree: | Master |
| Department: | Graduate Institute of Industrial Management, School of Management |
| Year of Publication: | 2025 |
| Academic Year: | 113 |
| Language: | English |
| Pages: | 78 |
| Keywords (Chinese): | 晶圓瑕疵檢測、Transformer 模型、圖像描述、深度學習 |
| Keywords (English): | Wafer defect detection, Transformer model, image captioning, deep learning |
With the rapid evolution of semiconductor process technology and the continued shrinking of wafer feature sizes, minute defects have an increasingly critical impact on product yield and process stability. Traditional wafer defect inspection relies mainly on manual electron-microscope examination and automated optical inspection. Although many recent studies have applied machine learning and deep learning to defect classification, these systems often cannot describe defects semantically, making it difficult for engineers to analyze root causes and thereby limiting the efficiency of process optimization. To address this challenge, this study proposes a three-stage wafer defect recognition system based on the Vision Transformer (ViT) architecture, integrating anomaly detection, defect classification, and semantic image captioning to improve both detection performance and the readability of inspection results.

Using the publicly available WM-811K wafer defect dataset, the classification and captioning models were trained after data standardization and class-imbalance handling. Experimental results show that in Stage 1 the ViT model achieved 94.97% accuracy and 96.21% recall on the test set, indicating high sensitivity in detecting defective wafers. In Stage 2, the ViT-GPT2 model reached 97.7% accuracy on the defect classification task, with excellent precision, recall, and F1-scores. Stage 3 further evaluated the model's language-generation quality, showing that it can produce syntactically correct and semantically consistent defect descriptions, successfully achieving cross-modal mapping between images and language. Overall, this study verifies the effectiveness and application potential of Transformer architectures for wafer defect recognition: it not only improves defect classification and anomaly detection performance but also endows inspection results with semantic interpretability, offering a new solution for intelligent manufacturing and quality control.
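The abstract mentions class-imbalance handling before training, a standard step for WM-811K, where some defect patterns are far rarer than others. One common technique is random oversampling of minority classes; the sketch below is only an illustrative example of that general idea (function name and data are hypothetical), not the thesis's actual procedure.

```python
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    """Random oversampling: duplicate minority-class samples (with replacement)
    until every class matches the majority-class count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        for _ in range(target - n):
            i = rng.choice(idx)          # pick a minority sample to duplicate
            out_samples.append(samples[i])
            out_labels.append(labels[i])
    return out_samples, out_labels

# Toy example: two "none" wafers vs. one "scratch" wafer.
s, l = oversample(["w1", "w2", "w3"], ["none", "none", "scratch"])
print(Counter(l))  # → Counter({'none': 2, 'scratch': 2})
```

Oversampling is applied only to the training split, so the test-set metrics quoted above still reflect the original class distribution.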
With the rapid advancement of semiconductor manufacturing, even minor wafer defects can critically impact yield and process stability. Traditional inspection methods, such as Automated Optical Inspection (AOI) and Scanning Electron Microscopy (SEM), are time-consuming and lack semantic interpretability, making it difficult for engineers to analyze root causes and effectively optimize processes. To address these limitations, this study proposes a three-stage wafer defect recognition framework based on the Vision Transformer (ViT), incorporating anomaly detection, defect classification, and image captioning.
Using the publicly available WM-811K dataset, the models were trained after data standardization and class imbalance handling. In Stage 1, the ViT model achieved 94.97% accuracy and 96.21% recall for anomaly detection. In Stage 2, the ViT-GPT2 model reached 97.7% accuracy, with high precision, recall, and F1-scores across defect categories. In Stage 3, the model generated syntactically correct and semantically consistent captions, successfully completing cross-modal mapping from images to text. This study demonstrates the effectiveness of Transformer-based models in enhancing both defect detection accuracy and result interpretability, contributing to intelligent and explainable semiconductor quality control.
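The framework above builds on ViT, whose first step tokenizes an image into fixed-size patches ("an image is worth 16x16 words") before feeding them to a Transformer encoder. As a minimal NumPy sketch of that tokenization step (the function name and image size are illustrative, not taken from the thesis):

```python
import numpy as np

def to_patches(img, patch=16):
    """Split an (H, W, C) image into flattened patch tokens, as in ViT."""
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0, "image must tile evenly"
    gh, gw = H // patch, W // patch
    # (gh, patch, gw, patch, C) -> (gh, gw, patch, patch, C): group pixels by patch
    grid = img.reshape(gh, patch, gw, patch, C).transpose(0, 2, 1, 3, 4)
    # one row per patch token, each flattened to patch*patch*C values
    return grid.reshape(gh * gw, patch * patch * C)

img = np.zeros((224, 224, 3), dtype=np.float32)  # a 224x224 RGB wafer-map image
tokens = to_patches(img)
print(tokens.shape)  # → (196, 768)
```

For a 224x224 RGB input, this yields 14x14 = 196 tokens of dimension 768; in a full ViT each token is then linearly projected and combined with position embeddings before entering the encoder.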