| Graduate student: | 李冠達 Kuan-Ta Lee |
|---|---|
| Thesis title: | 模擬深度學習特徵進行條碼偵測 Simulation of deep learning features used in barcode detection |
| Advisor: | 吳炤民 Chao-Min Wu |
| Oral defense committee: | |
| Degree: | Master |
| Department: | 資訊電機學院 - 電機工程學系 Department of Electrical Engineering |
| Year of publication: | 2021 |
| Academic year of graduation: | 109 |
| Language: | Chinese |
| Number of pages: | 112 |
| Chinese keywords: | 物件偵測, YOLO, 條碼定位, 感興趣區域 |
| Foreign-language keywords: | Object detection, YOLO, Barcode localization, Region of Interest |
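Barcodes can be seen everywhere in everyday life. Different fields produce barcodes suited to their own products and needs, so the number of barcode types has become very large, and finding every barcode with a single method is not easy. Because deep learning has made great progress in object recognition in recent years, the goal of this study is to use deep learning to find barcodes.
Because the required system only needs to localize each barcode as a region of interest (ROI) without classifying it, and because the system must run fast, we use the smaller and faster YOLOv3-tiny network architecture. The training images were collected with the professional scanner 1504P. In total, 10,008 images were collected and split 8:2 into training data and validation data, and CipherLab provided a further 133 test images. Results on the validation data and the test data show that recall reaches 95% on both, while precision is 93% and 75%, respectively.
To run on a personal digital assistant (PDA) with limited processing power, we first tried pruning the network, but the results were poor. We then tried analyzing the network and imitating its behavior with image processing. Visualizing the network revealed only a rough picture of its processing, so we finally tried to imitate the important feature maps and propose three imitation methods. The first method finds the barcode extent, locates the barcode center with a 5×5 mask, and then uses the active contour method to frame the barcode; the framed barcode is the ROI. The second method uses the same steps for finding the barcode extent and framing the ROI, but locates the barcode center with a smaller 3×3 mask. The third method runs the second method and then further filters the ROIs to reduce the number of false detections. Tested on the test set provided by CipherLab, the three methods reach recall of 83%, 92%, and 91% and precision of 81%, 46%, and 79%, and their execution times on the PDA are 118 ms, 84 ms, and 156 ms, respectively. Compared with other methods, ours are roughly average in accuracy but have a large advantage in execution speed, so our algorithms are relatively competitive in speed.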
Barcodes are ubiquitous in modern life. Different types of barcodes are designed for different applications, so it is not easy to detect all types of barcodes with a single approach. In recent years, deep-learning-based object detection has made significant progress, so this research aims to locate barcodes using deep learning.
Because the system only needs to locate each barcode as a region of interest (ROI) without recognizing its type, and because it must run efficiently, the simple and fast YOLOv3-tiny network was chosen. The training images were captured with the professional scanner 1504P. A total of 10,008 images were collected and split 8:2 into training data and validation data.
A further 133 test images were provided by CipherLab. The results on the validation data and the test data showed that recall reached 95% on both, while precision was 93% and 75%, respectively.
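The split sizes and the two reported metrics can be illustrated with a minimal Python sketch. The function names `split_counts` and `precision_recall` are hypothetical, and the detection counts in the usage lines are made-up illustrative values, not figures from the thesis; only the 10,008-image total and the 8:2 ratio come from the abstract. The abstract does not say how a fractional image count was rounded, so integer truncation is an assumption here.

```python
def split_counts(total: int, train_parts: int = 8, val_parts: int = 2) -> tuple[int, int]:
    """Return (training, validation) counts for a train_parts:val_parts split.

    Integer arithmetic avoids floating-point rounding; truncating the
    training share is an assumption, not something the abstract states.
    """
    train = total * train_parts // (train_parts + val_parts)
    return train, total - train


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# 10,008 images split 8:2, as described in the abstract.
train_count, val_count = split_counts(10008)
print(train_count, val_count)

# Hypothetical counts chosen only to show the formulas:
# 19 correct detections, 6 false detections, 1 missed barcode.
precision, recall = precision_recall(tp=19, fp=6, fn=1)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case recall stays high because only one barcode is missed, while the false detections pull precision down, which is the same pattern as the gap between 95% recall and 75% precision reported on the test images.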
To run the network on a resource-limited personal digital assistant (PDA), we first tried pruning the network, but the performance was poor. Hence we analyzed the network structure and used image processing techniques to imitate the network's behavior. Visualizing the network revealed only a coarse picture of its processing, so we finally tried to imitate some important feature maps with three methods. The first method searched for barcode candidates, located the barcode centers using a 5×5 mask, and then used the active contour technique to frame each barcode as an ROI. The second method found barcode candidates and output ROIs in the same way as the first, but used a smaller 3×3 mask to locate the barcode centers in the middle step. The third method extended the second with an additional processing stage that filtered the ROIs to reduce the number of erroneously detected areas. The recall and precision of the three methods were evaluated on the test data provided by CipherLab. The three methods achieved recall of 83%, 92%, and 91% and precision of 81%, 46%, and 79%, with execution times on the PDA of 118 ms, 84 ms, and 156 ms, respectively. Compared with other studies, the three proposed methods were roughly average in accuracy but had a significant advantage in running time, so our algorithms were competitive in execution speed.
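The trade-off among the three methods can be made easier to compare with a short Python sketch. The recall, precision, and timing values are the ones reported above. The F1 score and the images-per-second figure are derived quantities added here for illustration and are not reported in the abstract; the `MethodResult` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodResult:
    name: str
    recall: float      # reported recall on the CipherLab test set
    precision: float   # reported precision on the CipherLab test set
    time_ms: float     # reported execution time on the PDA

    @property
    def f1(self) -> float:
        """Harmonic mean of precision and recall (derived, not in the abstract)."""
        denom = self.precision + self.recall
        return 2 * self.precision * self.recall / denom if denom else 0.0

    @property
    def images_per_second(self) -> float:
        """Throughput implied by the per-image time (derived, not in the abstract)."""
        return 1000.0 / self.time_ms


RESULTS = [
    MethodResult("method 1 (5x5 mask)", recall=0.83, precision=0.81, time_ms=118),
    MethodResult("method 2 (3x3 mask)", recall=0.92, precision=0.46, time_ms=84),
    MethodResult("method 3 (3x3 mask + ROI filtering)", recall=0.91, precision=0.79, time_ms=156),
]

for result in RESULTS:
    print(f"{result.name}: F1={result.f1:.3f}, {result.images_per_second:.1f} images/s")

best_f1 = max(RESULTS, key=lambda r: r.f1)
fastest = min(RESULTS, key=lambda r: r.time_ms)
print("highest F1:", best_f1.name)
print("fastest:", fastest.name)
```

Under this derived measure, the second method is the fastest but has the lowest F1 because of its 46% precision, while the third method recovers most of that precision at the cost of the longest running time.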