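Simulation of deep learning features used in barcode detection | National Central University Electronic Theses and Dissertations System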


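Graduate student:	李冠達 Kuan-Ta Lee
Thesis title:	模擬深度學習特徵進行條碼偵測 Simulation of deep learning features used in barcode detection
Advisor:	吳炤民 Chao-Min Wu
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2021
Academic year of graduation:	109 (ROC calendar; 2020-2021)
Language:	Chinese
Number of pages:	112
Keywords (Chinese):	物件偵測、YOLO、條碼定位、感興趣區域
Keywords (English):	Object detection, YOLO, Barcode localization, Region of Interest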

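Barcodes are everywhere in everyday life. Different fields have designed barcodes to suit their own products and needs, so there are now many barcode types, and finding all of them with a single method is not easy. Because deep learning has advanced considerably in object recognition in recent years, this study aims to use deep learning to locate barcodes.
The system only needs to locate each barcode as a region of interest (ROI) and does not need to classify it, so to keep execution fast we use YOLOv3-tiny, a smaller and computationally cheaper network architecture. The training images were captured with the professional scanner 1504P. A total of 10008 images were collected and split 8:2 into training and validation data, and CipherLab (欣技資訊) provided a further 133 test images. On the validation and test data, recall reached 95% in both cases, while precision was 93% and 75%, respectively.
To run on a personal digital assistant (PDA) with limited processing power, we first tried pruning the network, but the results were poor. We then tried to analyze the network and simulate its behavior with image processing. Visualizing the network revealed only a rough outline of its processing, so we finally tried to simulate the important feature maps and propose three simulation methods. The first method finds the barcode range, uses a 5×5 mask to find the barcode center, and then frames the barcode with the active contour method; the framed barcode is the ROI. The second method uses the same steps to find the barcode range and to frame the barcode as the ROI, but finds the barcode center with a smaller 3×3 mask. The third method runs the second method and then filters the ROIs further to reduce the number of falsely detected regions. Tested on the test set provided by CipherLab, the three methods achieved recall of 83%, 92%, and 91% and precision of 81%, 46%, and 79%, with execution times on the PDA of 118 ms, 84 ms, and 156 ms, respectively. Compared with other methods, our accuracy is roughly average, but our execution speed has a large advantage, so our algorithms are competitive in terms of speed.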
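To compare the three methods on a single accuracy figure, the reported recall and precision can be combined into an F1 score. F1 is not reported in the abstract; it is added here only as an illustrative derived metric, and the dictionary keys are hypothetical labels. A minimal Python sketch:

```python
# Figures as reported in the abstract (CipherLab test set, measured on the PDA).
# The method labels are hypothetical names chosen for this sketch.
methods = {
    "method 1 (5x5 mask)": {"recall": 0.83, "precision": 0.81, "time_ms": 118},
    "method 2 (3x3 mask)": {"recall": 0.92, "precision": 0.46, "time_ms": 84},
    "method 3 (3x3 mask + ROI filtering)": {"recall": 0.91, "precision": 0.79, "time_ms": 156},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 is a derived value, not a number taken from the thesis.
f1_by_method = {
    name: f1_score(m["precision"], m["recall"]) for name, m in methods.items()
}

for name, m in methods.items():
    print(f"{name}: F1 = {f1_by_method[name]:.2f}, time = {m['time_ms']} ms")
```

Under these numbers, the second method is the fastest but has the lowest F1 because of its 46% precision, while the third method recovers most of that precision at the cost of the longest running time.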

Barcodes are ubiquitous in modern life. Different types of barcodes are designed for different applications, so it is not easy to detect all types of barcodes with a single approach. In recent years, deep learning has made significant progress in object detection, so this research aims to locate barcodes using deep learning.
The system only needs to locate each barcode as a region of interest (ROI) without recognizing its type, so the simple and fast YOLOv3-tiny network was chosen to keep execution efficient. The training images were captured with the professional scanner 1504P. A total of 10008 images were collected and divided into training data and validation data at a ratio of 8:2.
A further 133 test images were provided by CipherLab. On the validation data and the test data, recall reached 95% in both cases, and precision was 93% and 75%, respectively.
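Recall and precision here follow the usual detection definitions: recall is the share of real barcodes that were found, and precision is the share of reported regions that are real barcodes. The sketch below shows the arithmetic for the 8:2 split and for both metrics. The abstract does not state the exact split sizes or the detection counts, so the rounding rule and the counts used in the example are illustrative assumptions.

```python
def split_counts(total, train_ratio=0.8):
    """Number of (training, validation) images for a given ratio.

    Rounding to the nearest integer is an assumption; the abstract only
    gives the 8:2 ratio.
    """
    n_train = round(total * train_ratio)
    return n_train, total - n_train


def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


n_train, n_val = split_counts(10008)
print(n_train, n_val)

# Hypothetical counts chosen only to illustrate the formulas:
# 95 barcodes found, 7 false detections, 5 barcodes missed.
precision, recall = precision_recall(tp=95, fp=7, fn=5)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

A drop in precision with unchanged recall, as reported between the validation and test data, corresponds to more false detections (a larger `fp`) while the number of missed barcodes stays proportionally the same.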
To implement the network on a resource-limited personal digital assistant (PDA), we tried to prune the network, but the performance was poor. Hence we analyzed the network structure and used image processing techniques to imitate the network's behavior. Visualizing the network revealed only a coarse picture of its processing. Finally, we tried to imitate some important feature maps with three methods. The first method searched for barcode candidates, located the centers of the barcodes using a 5×5 mask, and then used the active contour technique to frame each barcode as an ROI. The second method found barcode candidates and output ROIs in the same way as the first method, but used a smaller 3×3 mask to search for the barcode centers in the middle step. The third method extended the second method with an additional processing stage that filtered the ROIs to reduce the number of erroneously detected areas. The recall and precision of the three methods were evaluated on the test data provided by CipherLab. The three methods achieved 83%, 92%, and 91% recall and 81%, 46%, and 79% precision. Their execution times on the PDA were 118 ms, 84 ms, and 156 ms, respectively. The three proposed methods reached recall and precision comparable to other studies, with a significant improvement in running time, so our algorithms are competitive in execution speed compared to other approaches.
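The abstract does not say what coefficients the 5×5 and 3×3 masks use or how the center is chosen from the response. The sketch below is therefore only one plausible reading, not the thesis implementation: it assumes a uniform (box) mask applied to a binary candidate map and takes the centroid of the strongest responses as the center. All function names and the toy map are hypothetical.

```python
def window_sums(grid, k):
    """Sum of every k x k neighbourhood of a 2-D list; cells outside the grid count as 0."""
    h, w = len(grid), len(grid[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                grid[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
    return out


def find_center(grid, k):
    """Centroid (row, col) of the cells with the strongest k x k response."""
    resp = window_sums(grid, k)
    peak = max(max(row) for row in resp)
    hits = [(y, x) for y, row in enumerate(resp) for x, v in enumerate(row) if v == peak]
    return (
        sum(y for y, _ in hits) / len(hits),
        sum(x for _, x in hits) / len(hits),
    )


# Hypothetical 9 x 9 binary candidate map with one 5 x 5 barcode-like blob.
candidate_map = [
    [1 if 2 <= y <= 6 and 2 <= x <= 6 else 0 for x in range(9)]
    for y in range(9)
]

center_5x5 = find_center(candidate_map, 5)
center_3x3 = find_center(candidate_map, 3)
print(center_5x5, center_3x3)
```

On this toy map both mask sizes return the same center, (4.0, 4.0). A 3×3 window reads 9 cells per pixel against 25 for a 5×5 window, so it is cheaper per pixel, but it also responds to smaller blobs, which in a real image may produce more candidate centers.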

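Abstract (Chinese)    i
Abstract    ii
Acknowledgments    iv
Table of Contents    v
List of Figures    ix
List of Tables    xiv
Chapter 1 Introduction    1
1    Research motivation    1
2    Literature review    2
2.1    Barcode localization    2
2.2    Object detection    5
2.3    Pruning    7
3    Research objectives and contributions    9
4    Thesis organization    10
Chapter 2 Research background and related theory    11
1    Object detection    11
2    Two-stage detectors    11
2.1    RCNN (Region-based Convolutional Neural Networks)    12
2.2    Fast-RCNN    13
2.3    Faster-RCNN    13
3    One-stage detectors    14
3.1    YOLOv1    14
3.2    YOLOv2    18
3.3    YOLOv3    24
4    Evaluation criteria    25
5    Conclusion    27
Chapter 3 Object detection architecture and simulation    28
1    Object detection architecture and training    28
1.1    YOLOv3-Tiny    28
1.2    Data collection instrument    32
1.3    Data collection    33
1.4    Network training equipment    36
1.5    Training method and results    36
2    Model compression    39
3    Analyzing YOLOv3-tiny    40
3.1    Grad-CAM (Gradient-weighted Class Activation Mapping)    40
4    Simulating YOLOv3-tiny    43
4.1    Identifying important feature maps    43
4.2    Finding the barcode range    46
4.3    Finding the barcode center    52
4.4    Framing barcodes with active contours    57
4.5    Region filtering    59
4.6    Algorithm summary    63
Chapter 4 Experimental results and discussion    66
1    Experimental equipment    66
1.1    Personal digital assistant (PDA)    66
2    Example images    67
3    Simulation results    68
3.1    User interface    68
3.2    Method 1 (5×5 mask) experimental results    70
3.3    Method 2 (3×3 mask) experimental results    73
3.4    Method 3 (3×3 mask with region filtering) experimental results    76
4    Discussion of results    79
4.1    Difference between precision and recall    79
4.2    Accuracy and execution speed    79
4.3    Different test sets    80
Chapter 5 Conclusions and future work    86
1    Conclusions    86
2    Future work    88
References    90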


                                

