| Graduate student: | 涂珮涓 Pei-Chuan Tu |
|---|---|
| Thesis title: | 用於3D物體辨識基於視圖的注意力圖卷積監督式對比學習神經網路 (View-Based Attention Graph Convolutional Neural Network with Supervised Contrastive Learning for 3D Object Recognition) |
| Advisor: | 葉英傑 Yin-Gjie Ye |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Management - Graduate Institute of Industrial Management |
| Year of publication: | 2024 |
| Academic year of graduation: | 112 |
| Language: | Chinese |
| Number of pages: | 39 |
| Keywords: | industrial automation, multi-view 3D object recognition, attention mechanism, contrastive learning |
The Industrial Revolution began in the 18th and 19th centuries, when European and American countries replaced manual production with machines. It has since progressed through four stages, and we are currently in the fourth. This study focuses on automation, the core of the Industrial Revolution, with the aims of improving production efficiency, reducing costs, and raising quality, and it pays particular attention to machine vision systems used in manufacturing. Traditional three-dimensional object recognition methods often rely on two-dimensional multi-view images, but they do not fully exploit the correlations among those views. In addition, real-world shooting conditions can degrade image quality and make recognition harder for the model. This study therefore proposes a system for recognizing three-dimensional products that comprises a view-based graph convolutional neural network, extraction of important image features, and a contrastive learning training method. The specific objectives are to improve recognition performance, to better capture the key regions of each image, and to strengthen robustness under real-world conditions. To achieve these goals, the study adopts a view-based graph convolutional neural network that effectively aggregates information from multi-view images, an attention mechanism that extracts important feature information, and supervised contrastive learning to train the network and improve the model's generalization ability. These methods are discussed in detail in subsequent chapters.
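To make the supervised contrastive training idea mentioned in the abstract more concrete, the following is a minimal sketch of a supervised contrastive loss in its commonly published form, written in plain Python. It is not the thesis's implementation: the function names, the temperature value, and the toy two-dimensional embeddings are all assumptions chosen for illustration. In the setting described above, each embedding would presumably be the feature vector produced for one object after its views are aggregated.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Illustrative supervised contrastive loss over a small batch.

    For each anchor, samples with the same label are positives and all
    other samples in the batch act as the contrast set. Anchors without
    any positive are skipped. The temperature is a hypothetical default.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Scaled cosine similarity between the anchor and every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: four 2-D embeddings forming two tight pairs.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matching_labels = [0, 0, 1, 1]    # labels agree with the geometry
mismatched_labels = [0, 1, 0, 1]  # labels contradict the geometry

print(supervised_contrastive_loss(toy_embeddings, matching_labels))
print(supervised_contrastive_loss(toy_embeddings, mismatched_labels))
```

The loss is small when samples of the same class already lie close together and large when same-class samples are far apart, which is the pressure that pulls embeddings of the same product class together during training.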