| Graduate Student: | 黃彥憬 Yan-Jing Huang |
|---|---|
| Thesis Title: | 利用強化學習於無人機自主巡檢之路徑規劃 Path Planning for Autonomous UAV Inspection Using Reinforcement Learning |
| Advisor: | 王啟泰 Chi-Tai Wang |
| Oral Examination Committee: | |
| Degree: | Master |
| Department: | College of Management — Graduate Institute of Industrial Management |
| Year of Publication: | 2025 |
| Academic Year of Graduation: | 113 |
| Language: | English |
| Number of Pages: | 83 |
| Chinese Keywords: | 覆蓋路徑規劃、強化學習、無人機巡檢、近端策略最佳化、工業監測 |
| English Keywords: | Coverage Path Planning, Reinforcement Learning, UAV Inspection, Proximal Policy Optimization, Industrial Monitoring |
With the rapid advancement of smart manufacturing and Industry 4.0, the demand for automated inspection in factory environments has been steadily increasing. This study proposes an autonomous UAV inspection system that integrates artificial intelligence and reinforcement learning techniques, aiming to enhance inspection efficiency while reducing labor costs and the potential risks associated with manual inspections. The system is designed using a customized GridWorld environment to simulate the factory layout, and employs the Proximal Policy Optimization (PPO) algorithm to train the UAV to learn an optimal inspection strategy. To improve exploration capabilities and achieve policy generalization, the study incorporates various randomly generated map scenarios, obstacle configurations, and visibility constraints, along with reward mechanisms such as first-visit bonuses and return-to-origin incentives. Additionally, guiding information is provided to further accelerate the learning process. Experimental results demonstrate that the trained model exhibits strong adaptability to diverse environments, effectively completing full-area coverage inspections and successfully generalizing to previously unseen maps. This research highlights the potential of integrating UAVs with deep reinforcement learning for smart factory inspections, laying the groundwork for future deployment in real-world applications.
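The reward design described above — a first-visit bonus that drives full-area coverage plus a return-to-origin incentive once coverage is complete — can be sketched in a minimal GridWorld. The 5×5 default, reward magnitudes, and four-action move set below are illustrative assumptions for exposition, not the thesis's actual environment or parameters.

```python
# Minimal GridWorld sketch of the reward shaping in the abstract:
# a first-visit bonus encourages coverage of every free cell, and a
# return-to-origin bonus is paid once coverage is complete. All reward
# values and the grid layout are illustrative assumptions.

class CoverageGridWorld:
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right

    def __init__(self, size=5, obstacles=frozenset(), step_cost=-0.01,
                 first_visit_bonus=1.0, return_bonus=5.0):
        self.size = size
        self.obstacles = set(obstacles)
        self.step_cost = step_cost
        self.first_visit_bonus = first_visit_bonus
        self.return_bonus = return_bonus
        self.reset()

    def reset(self):
        self.pos = (0, 0)            # the agent starts at the origin
        self.visited = {self.pos}    # first-visit tracking for coverage
        self.done = False
        return self.pos

    def coverage_complete(self):
        free_cells = self.size * self.size - len(self.obstacles)
        return len(self.visited) == free_cells

    def step(self, action):
        dr, dc = self.ACTIONS[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        reward = self.step_cost  # small per-step cost favors short routes
        # Moves into walls or obstacles leave the agent in place.
        if 0 <= r < self.size and 0 <= c < self.size and (r, c) not in self.obstacles:
            self.pos = (r, c)
            if self.pos not in self.visited:
                self.visited.add(self.pos)
                reward += self.first_visit_bonus
        # The episode ends with a bonus when every free cell has been
        # visited and the agent is back at the origin.
        if self.coverage_complete() and self.pos == (0, 0):
            reward += self.return_bonus
            self.done = True
        return self.pos, reward, self.done
```

In this shaping, random obstacle sets can be passed in at `reset` time per episode, which is one simple way to realize the randomized map scenarios the abstract credits for policy generalization.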
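For context on the training algorithm, PPO's central idea is to clip the policy probability ratio so that each update stays close to the previous policy. The function below is a per-sample sketch of the standard clipped surrogate objective from Schulman et al. (2017), not code from the thesis; `eps=0.2` is the commonly used default, assumed here.

```python
# Per-sample PPO clipped surrogate (Schulman et al., 2017):
#   L = min(r * A, clip(r, 1 - eps, 1 + eps) * A)
# where r = pi_theta(a|s) / pi_theta_old(a|s) and A is the advantage
# estimate. A policy update maximizes the mean of L over a batch.

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipping removes the incentive to move the ratio outside [1-eps, 1+eps]."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Taking the minimum makes the objective pessimistic: a large ratio cannot inflate a positive advantage beyond the clipped value, and a small ratio cannot hide a negative one, which is what keeps PPO updates conservative.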