
Graduate Student: Pei-Ying Lee (李佩瑩)
Thesis Title: Collision Warning for Car Door Opening with a Light Convolutional Neural Network
Advisor: Ding-Chang Tseng (曾定章)
Oral Examination Committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of Publication: 2019
Academic Year of Graduation: 107
Language: Chinese
Number of Pages: 63
Chinese Keywords: car-door-opening collision warning system; convolutional neural network
Foreign Keywords: DOW, CNN
    As the number of cars and motorcycles in Taiwan climbs year by year, and given the high population density, narrow roads, and shortage of parking spaces, pedestrians and vehicles frequently compete for road space and cars often double-park. As a result, accidents in which a driver or passenger opens a door at the roadside without noticing a vehicle approaching from behind occur repeatedly, so preventing collisions caused by careless door opening has become an important research topic. In this thesis, we propose a car-door-opening collision warning system based on a lightweight convolutional neural network (CNN). Using a camera as the sensor, the system monitors pedestrians, bicycles, motorcycles, and cars approaching from behind and warns the driver before a possible collision, protecting the driver, the passengers, and other road users.
    This thesis consists of two parts. In the first part, a lightweight convolutional neural network is built: MobileNet V2 width1.6 replaces the original Darknet-53 backbone in YOLOv3 as the feature extractor, reducing the computation and parameter storage required at run time; the feature-pyramid-network (FPN)-like structure in YOLOv3 then detects and recognizes rear moving objects on feature maps of three different scales. In the second part, using the object coordinates and classes output by the first part, a top-view transform maps the original image onto a virtual image plane parallel to the ground, from which the longitudinal and lateral distances and the estimated time to collision (TTC) are computed as the basis for warnings.
    In the experiments, the YOLOv3-MobileNet V2 width1.6 architecture reduces the number of parameters by a factor of about 2.45 and the amount of computation by a factor of about 3.24 compared with the original YOLOv3. Tested on 960×540 video, the average execution speed is 28 frames per second, and the object detection system reaches an mAP of 88.43%.
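    The TTC-based warning step described above can be sketched as follows. This is a minimal illustrative sketch, not the thesis implementation: it assumes the top-view transform has already produced longitudinal distances in metres for an object in two consecutive frames, and the function names and the 2-second threshold are hypothetical.

```python
def estimate_ttc(y_prev, y_curr, frame_interval):
    """Estimate time to collision (seconds) from two longitudinal
    distances (metres) of the same object behind the car.

    y_prev, y_curr: distance at the previous and current frame.
    frame_interval: seconds between the two frames.
    Returns None if the object is not closing in.
    """
    closing_speed = (y_prev - y_curr) / frame_interval  # m/s
    if closing_speed <= 0:  # object receding or stationary
        return None
    return y_curr / closing_speed


def should_warn(ttc, threshold=2.0):
    """Trigger the door-open warning when TTC falls below a threshold."""
    return ttc is not None and ttc < threshold


# Example: an object 10 m behind closes 1 m in 0.1 s -> TTC = 0.9 s.
ttc = estimate_ttc(10.0, 9.0, 0.1)
```

In practice the per-frame distances would be smoothed over several frames before dividing, since single-frame detection jitter makes the closing-speed estimate noisy.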


    In most cities, traffic is crowded and chaotic. Drivers sometimes find it hard to stop their cars clear of the moving traffic stream, and some inconsiderate drivers stop arbitrarily to let passengers out. In these situations, an abruptly opened car door may be struck by a following car or motorcycle.
    To avoid such collisions, we propose a car-door-opening warning system based on a lightweight convolutional neural network combined with a location estimator. First, a lightweight convolutional neural network is constructed to detect and recognize moving objects in the images; the objects include approaching cars, motorcycles, pedestrians, and other moving objects. A modified MobileNet V2 replaces the original Darknet-53 backbone in YOLOv3 to shrink the amount of computation and the number of network parameters. The locations of the detected objects are then transformed from the coordinate system of the captured images into the coordinate system of transformed top-view images to estimate the relative locations of the approaching objects.
    To evaluate the performance of the proposed system, several experiments and comparisons were conducted and reported. On a set of 3164 images, the mAP of the object detection system reaches 88.43%, and the average execution speed on 960×540 images is 28 frames per second. Compared with the original YOLOv3, the parameter count and the amount of computation are reduced by factors of 2.45 and 3.24, respectively.
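    The savings from MobileNet-style depthwise separable convolutions can be illustrated with the standard multiply-add cost formulas. This is a back-of-the-envelope sketch: the layer sizes are arbitrary examples, not taken from the network described here.

```python
def conv_cost(k, c_in, c_out, h, w):
    """Multiply-adds of a standard k x k convolution on an h x w map."""
    return k * k * c_in * c_out * h * w


def separable_cost(k, c_in, c_out, h, w):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise


# Example layer: 3x3 convolution, 256 -> 256 channels on a 32x32 map.
std = conv_cost(3, 256, 256, 32, 32)
sep = separable_cost(3, 256, 256, 32, 32)
ratio = std / sep  # roughly an 8.7x reduction for this layer
```

Savings of this kind at every layer are what allow the backbone swap to cut the network-wide computation while keeping the detection head unchanged.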

    Abstract (Chinese) i
    Abstract (English) ii
    Acknowledgements iii
    Table of Contents iv
    List of Figures vi
    List of Tables viii
    Chapter 1  Introduction 1
      1.1 Motivation 1
      1.2 System Architecture 2
      1.3 Thesis Organization 3
    Chapter 2  Related Work 4
      2.1 Development of CNN-Based Object Detection Systems 4
      2.2 Lightweight Convolutional Neural Networks 8
    Chapter 3  Moving Object Detection and Recognition 12
      3.1 YOLOv3 Architecture 12
      3.2 Architecture Based on YOLOv3 and MobileNet V2 width1.6 17
    Chapter 4  Distance and Time-to-Collision Estimation of Moving Objects 24
      4.1 Camera Calibration 24
      4.2 Top-View Transform 31
      4.3 Distance and Time-to-Collision Estimation 34
    Chapter 5  Experiments and Results 36
      5.1 Experimental Equipment 36
      5.2 Training of the Convolutional Neural Network 36
      5.3 Comparison and Evaluation of Network Architectures 38
      5.4 Time-to-Collision Estimation of Moving Objects 42
      5.5 Demonstration of the Door-Opening Warning System 44
    Chapter 6  Conclusions and Future Work 47
    References 48

