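Graduate student: 林士詒 (Shih-Yi Lin)
Thesis title: 融合LLM規劃路徑與視覺定位的自主物流搬運系統 (An Autonomous Logistics Handling System Integrating LLM Path Planning and Visual Localization)
Advisor: 王文俊 (Wen-June Wang)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Publication year: 2025
Graduation academic year: 113 (ROC calendar)
Language: Chinese
Pages: 71
Keywords: Autonomous Mobile Robot, Large Language Model, Cloud-Based Decision Architecture, Low-Cost Design, Task Planning and Navigation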
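  • With the rapid growth of online shopping and changing consumption patterns, more and more spaces are being used as small warehouses for temporary storage of goods. Managing such fragmented spaces with human staff greatly increases cost, while a fully self-service approach may suffer from safety problems and a lack of trust. To address these challenges, this thesis designs a low-cost Autonomous Mobile Robot (AMR) system in which the robot automatically manages the state of the warehouse and helps customers pick up their goods, so that the whole workflow runs unattended.
    The system is designed to reduce cost and computing requirements. On the hardware side, it drops expensive LiDAR and depth cameras and instead localizes with lower-cost odometry, an Inertial Measurement Unit (IMU), and a webcam. On the architecture side, the robot does not need full on-board intelligence; task decisions and path planning are handed to a cloud-hosted Large Language Model (LLM). With this design, the robot only reports its own position, obstacle information, and target positions, and the LLM infers the best task order and the intermediate navigation waypoints from which the motion trajectory is generated.
    Because it carries fewer sensors, the system relies heavily on visual tags such as AprilTag for localization and shelf alignment, so that the robot can accurately determine its position indoors. During navigation, the task order and target positions chosen by the LLM are combined with the LLM-A* and DWB algorithms to plan paths across the map and in the robot's immediate surroundings, respectively. For the handling task itself, the system includes a gripper with lifting and grasping functions for pickup, together with a reset mechanism that prevents offsets caused by mechanical error accumulating over consecutive operations. To correct the odometry error that builds up over long periods of use, the system also introduces an Extended Kalman Filter (EKF) that fuses IMU and odometry data, improving overall motion accuracy.
    To make the robot easy to operate and manage, the system also provides a web-based interface for managing shelves, items, obstacles, and tasks. The interface is cross-platform and integrated with the Robot Operating System (ROS), so the system can be operated from a computer or a phone. Experiments were carried out in a simulated site, and the results show that the system completes handling tasks successfully and is stable and practical.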
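    As a rough illustration of the odometry and IMU fusion mentioned in the abstract, the sketch below runs a scalar Kalman filter on heading only. It is a deliberately reduced stand-in, not the thesis's EKF: the class name `HeadingFilter`, the noise values `q` and `r`, and the choice to track only yaw are all assumptions made for this example.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


class HeadingFilter:
    """Scalar Kalman filter on heading (hypothetical, for illustration only).

    Prediction integrates the yaw rate derived from wheel odometry;
    correction blends in an absolute yaw reading from the IMU.
    """

    def __init__(self, theta=0.0, var=1.0, q=0.01, r=0.05):
        self.theta = theta  # heading estimate [rad]
        self.var = var      # variance of the estimate [rad^2]
        self.q = q          # assumed process noise per second
        self.r = r          # assumed IMU measurement noise

    def predict(self, yaw_rate, dt):
        # Dead-reckon the heading and grow the uncertainty.
        self.theta = wrap_angle(self.theta + yaw_rate * dt)
        self.var += self.q * dt

    def update(self, imu_yaw):
        # Wrap the innovation so that +179 deg and -179 deg count as close.
        innovation = wrap_angle(imu_yaw - self.theta)
        gain = self.var / (self.var + self.r)
        self.theta = wrap_angle(self.theta + gain * innovation)
        self.var *= 1.0 - gain
        return gain


if __name__ == "__main__":
    f = HeadingFilter()
    for _ in range(10):
        f.predict(yaw_rate=0.1, dt=0.1)  # odometry says: turning slowly
        f.update(imu_yaw=0.12)           # IMU reports an absolute yaw
    print(f.theta, f.var)
```

    A full EKF would carry position as well as heading and linearize the motion model at each step; the wrapped innovation shown here is the part that carries over unchanged.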


    With the rapid growth of e-commerce and the evolving demands of last-mile logistics, an increasing number of small indoor spaces are being used as temporary storage facilities. Manual management of such fragmented environments incurs high labor costs, while fully self-service models often suffer from limitations in safety and reliability. To address these challenges, this thesis presents a cost-effective Autonomous Mobile Robot (AMR) system designed for automated object retrieval and storage management in small-scale warehouse scenarios.
    The proposed system emphasizes low hardware and computational overhead. Instead of relying on high-cost sensors such as LiDAR or depth cameras, it utilizes wheel odometry, an Inertial Measurement Unit (IMU), and a monocular webcam for localization. Task scheduling and global path planning are offloaded to a cloud-based Large Language Model (LLM), which computes optimal task sequences and navigation targets based on the robot’s current state and environmental information.
    For precise indoor localization and alignment, the system employs AprilTag visual fiducials, enabling accurate positioning relative to storage racks. Navigation is achieved using a hybrid approach that combines LLM-A* for global planning with DWB, the Nav2 local controller based on the Dynamic Window Approach (DWA), for local obstacle avoidance. A lift-enabled gripper with an automatic reset mechanism is used for object handling, while an Extended Kalman Filter (EKF) fuses odometry and IMU data to reduce long-term drift and improve localization accuracy.
    A platform-independent web-based interface is developed for task assignment, inventory tracking, and real-time robot monitoring. Experimental results in a simulated warehouse site demonstrate the system’s ability to perform pick-and-place tasks reliably with high stability and low operational cost, making it suitable for flexible deployment in autonomous storage applications.
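    The idea of letting an LLM supply intermediate waypoints for a grid planner can be sketched as follows. This is a simplified stand-in rather than LLM-A* itself: it chains ordinary A* searches through waypoints passed in as plain data, whereas the published method folds the waypoints into the search heuristic. The function names, the 4-connected occupancy grid, and the hard-coded waypoint list (standing in for an LLM reply) are all assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """4-connected A* on a grid of 0 (free) / 1 (blocked).

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale heap entry
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < cost.get(nxt, float("inf")):
                    cost[nxt] = ng
                    came_from[nxt] = cur
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


def plan_via_waypoints(grid, start, goal, waypoints):
    """Chain A* legs through the suggested waypoints, skipping blocked ones."""
    usable = [w for w in waypoints if grid[w[0]][w[1]] == 0]
    stops = [start] + usable + [goal]
    path = [start]
    for a, b in zip(stops, stops[1:]):
        leg = astar(grid, a, b)
        if leg is None:
            return None
        path.extend(leg[1:])  # drop the duplicated joint cell
    return path


if __name__ == "__main__":
    grid = [
        [0, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    # The waypoint list is a placeholder for what an LLM might suggest.
    print(plan_via_waypoints(grid, (0, 0), (2, 0), [(1, 3)]))
```

    Filtering out blocked waypoints before planning reflects a general caution with model-generated suggestions: the planner, not the LLM, has the final say on feasibility.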

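    Table of contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Literature Review
      1.3 Thesis Objectives
      1.4 Thesis Organization
    Chapter 2 System Architecture and Hardware/Software Overview
      2.1 System Architecture
      2.2 Hardware
        2.2.1 AMR Hardware Platform
        2.2.2 Computer Hardware Platform
      2.3 Software
        2.3.1 ROS
        2.3.2 Management System
    Chapter 3 System Design
      3.1 Mobile Base
        3.1.1 Mecanum Wheels
        3.1.2 Control Method
        3.1.3 Extended Kalman Filter
      3.2 Gripper and Lifting Slide
        3.2.1 Lead-Screw Lifting Slide (Z Axis)
        3.2.2 Servo Gimbal (Pitch Axis)
        3.2.3 Servo Gripper
        3.2.4 Controller
        3.2.5 Control Method
    Chapter 4 Large Language Models, Localization, and Navigation
      4.1 Large Language Models
        4.1.1 Azure Platform
        4.1.2 OpenAI GPT-4o mini
        4.1.3 OpenAI o3-mini
      4.2 AprilTag and Localization Calibration
        4.2.1 Camera Calibration and Image Capture
        4.2.2 AprilTag Detection and Coordinate Frame Transformation
        4.2.3 AprilTag Alignment
        4.2.4 Localization Calibration
      4.3 Nav2 and Navigation
        4.3.1 Map and TF Coordinate Transforms
        4.3.2 Global Path Planning
        4.3.3 Local Path Planning
        4.3.4 Obstacle Avoidance and Execution
    Chapter 5 Management System
      5.1 Shelf and Item Management
      5.2 Obstacle Management
      5.3 Pickup Task Management
      5.4 Customer Interface
    Chapter 6 Experimental Results and Discussion
      6.1 LLM Decision-Making
      6.2 Real-World Execution Results
      6.3 Discussion of Experimental Results
    Chapter 7 Conclusion and Future Work
      7.1 Conclusion
      7.2 Future Work
    References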

