| Author: | Wan-Jhen Lee (李宛真) |
|---|---|
| Title: | A Visual Following and Obstacle-Free Automobile Using Dual Convolutional Neural Networks (可跟隨與避障的雙重卷積神經網路之視覺自走車) |
| Advisor: | 曾定章 |
| Committee: | |
| Degree: | Master |
| Department: | Department of Computer Science & Information Engineering, College of Electrical Engineering & Computer Science (資訊電機學院資訊工程學系) |
| Year of Publication: | 2018 |
| Academic Year: | 106 |
| Language: | Chinese |
| Pages: | 77 |
| Keywords: | visual autonomous cart, dual convolutional neural networks, following |
We combine a slow-moving autonomous cart with computer vision so that the cart can automatically follow a specific target person. Potential applications include guided tours at sightseeing exhibitions and assisted shopping in malls, replacing human guides or ordinary shopping carts and adding convenience to daily life. To follow a specific target over a long period, the system must cope with changes in the target's appearance and in the background over time. Moreover, a pedestrian's clothing and outline are exactly the cues an ordinary classifier relies on, yet clothing changes easily. We therefore move beyond the conventional classifier approach, reducing both the training time and the number of samples needed to train a classifier while still identifying the target accurately.

The system has two parts. The first part detects pedestrians and obstacles, locating every pedestrian and potential obstacle in the image. The second part confirms the leader: each pedestrian found in the first part is compared with the target image stored in the database, and their similarity determines which detection is the actual target in the image. Traditional methods performed poorly at both pedestrian detection and target similarity discrimination, so this study uses convolutional neural networks for the two tasks. Image features extracted by the networks adapt better to appearance changes of pedestrians and obstacles, maintaining detection accuracy. The similarity network is trained offline in advance to judge whether two images show the same target, with no training on any specific target; it can therefore directly match previously unseen targets online and accommodates the practical need to switch leaders on short notice.

In the experiments, we recorded our own videos of leader following and checked, over consecutive frames, whether the system located the correct leader in each image. The system reaches an accuracy of 95% and is not distracted by other pedestrians or obstacles. When another pedestrian occludes the leader, the system adapts without misjudgment and re-acquires the leader once the occluding pedestrian leaves.
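The leader-confirmation step described above can be sketched as a Siamese-style comparison: two images pass through the same embedding branch, and a similarity score decides whether they show the same person. The sketch below is illustrative only; `embed` is a hypothetical stand-in (a linear projection) for the trained convolutional branch, and the threshold of 0.8 is an assumed value, not the thesis's actual parameter.

```python
import numpy as np

def embed(image, weights):
    """Hypothetical stand-in for the CNN branch that maps an image to a
    feature vector; the real system runs a trained convolutional network."""
    return weights @ image.flatten()

def same_target(img_a, img_b, weights, threshold=0.8):
    """Siamese-style comparison: both images go through the SAME weights
    (shared branch), then cosine similarity of the two feature vectors
    decides whether they depict the same person."""
    fa = embed(img_a, weights)
    fb = embed(img_b, weights)
    cos = float(fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-8))
    return cos >= threshold, cos
```

Because the network is trained only to answer "same target or not," it needs no retraining when the leader changes; a new leader's reference image simply replaces the old one in the database.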