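Graduate student: 林威任 (Wei-Jen Lin)
Thesis title: 應用深度學習於視障人士生活輔助系統
A Deep Learning Approach to Living Assistance Systems for the Visually Impaired
Advisor: 蘇木春 (Mu-Chun Su)
Oral defense committee:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: Chinese
Number of pages: 106
Chinese keywords: blind guidance system (導盲系統), indoor guidance (室內引導), detection system (偵測系統)
Foreign-language keywords: Blind guidance, indoor guidance, detection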
  • According to World Health Organization statistics from 2019, approximately 2.2 billion people worldwide have some form of visual impairment, and Taiwan has more than 56,000 visually impaired people. Moving independently through an unfamiliar environment is very difficult for visually impaired people, and traditional aids such as white canes and guide dogs are either inconvenient or hard to make widely available. This thesis therefore uses a stereo vision camera and deep learning algorithms to help visually impaired people avoid obstacles, detect signs, and walk through unfamiliar environments.
    The thesis comprises: (1) developing an offline indoor guidance device for the blind, (2) using MobileNet to detect road surface conditions, and (3) using three models, YOLO, CRAFT, and CRNN, to parse sign information and help visually impaired people move around indoor public spaces.
    In the road surface detection experiments, DenseNet (94.58%) outperformed MobileNet (93.53%), but given the constraints of the hardware, MobileNet, which has fewer parameters, is the better fit. In the sign detection experiment with YOLO, the mAP at IoU > 0.5 is 90.07%, which is sufficient for sign detection to help visually impaired people get around.
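
    The sign detection result is reported as mAP at IoU > 0.5. As a reminder of what that threshold means, the following is a minimal sketch of the intersection-over-union test that decides whether a predicted box counts as a match for a ground-truth box. The box layout `(x_min, y_min, x_max, y_max)` and the helper names `iou` and `is_match` are assumptions made for illustration; they are not taken from the thesis or from any particular YOLO implementation.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction matches a ground-truth box when IoU exceeds the threshold."""
    return iou(pred, truth) > threshold
```

    Average precision is then computed per class from the precision-recall curve of matched and unmatched predictions, and mAP is the mean of those per-class values.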

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Contents
    1 Introduction
      1.1 Motivation
      1.2 Objectives
      1.3 Thesis Structure
    2 Related Work
      2.1 Background Knowledge
        2.1.1 Hearing for the Visually Impaired
        2.1.2 Ability of Spatial Perception in Visually Impaired Individuals
        2.1.3 Traditional Assistive Devices for the Visually Impaired
      2.2 Review
        2.2.1 Obstacle Avoidance Assistance Device Using Sensors
        2.2.2 Computer Vision-Assisted Devices
        2.2.3 Positioning Assistance Device
    3 Methods
      3.1 Hardware Architecture
      3.2 System Architecture
      3.3 Road Detection
        3.3.1 Preprocessing
        3.3.2 Road Surface Recognition
        3.3.3 Distance Calculation for Obstacles and Stairs
      3.4 Using Metro Station Signs for Wayfinding by the Visually Impaired
        3.4.1 Key Object Detection in Metro Stations
        3.4.2 Ticket Gate Detection
        3.4.3 Analysis of Road Sign
        3.4.4 Distance Calculation
    4 Experiments and Analysis
      4.1 Road Detection
        4.1.1 Data Set
        4.1.2 Assessment Methods
        4.1.3 Road Surface Detection Results
      4.2 Road Sign Detection
        4.2.1 Data Set
        4.2.2 Assessment Methods
        4.2.3 Analysis and Interpretation of Results
      4.3 Arrow Symbol Experiment
        4.3.1 Assessment Methods
        4.3.2 Two-Stage Experimental Method
        4.3.3 Analysis and Interpretation of Results
      4.4 Region Matching Experiment
        4.4.1 Dataset and Evaluation Methods
        4.4.2 Experimental Results and Analysis
    5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References

