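Research on Deep Reinforcement Learning for Adaptive Traffic Signal Control | National Central University Theses and Dissertations System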

Author:	I-Fan Wang (王亦凡)
Thesis title:	Research on Deep Reinforcement Learning for Adaptive Traffic Signal Control (深度強化學習於適應性號誌控制之研究)
Advisor:	Huey-Kuo Chen (陳惠國)
Oral defense committee:
Degree:	Master
Department:	College of Engineering - Department of Civil Engineering
Year of publication:	2024
Academic year of graduation:	112 (ROC calendar)
Language:	Chinese
Number of pages:	56
Keywords:	adaptive signal control, deep reinforcement learning, Rainbow DQN, traffic simulation

This study explores the application of deep reinforcement learning to adaptive traffic signal control. Using the microscopic traffic simulation software Vissim, we simulate peak-hour traffic at intersections in Taipei City. Taking into account the equivalency factors of different vehicle types and the two-stage left-turn design for motorcycles, we build an adaptive signal control system based on a deep reinforcement learning algorithm, with the aim of improving peak-hour traffic conditions at urban intersections.
The framework uses the deep reinforcement learning network Rainbow DQN as the decision model of the signal control system. The state comprises movement-based traffic conditions and the current phase state. The action either switches to the next phase in the timing sequence or extends the current green time. The reward is designed to minimize the total intersection pressure. Intersection performance is then compared against a fixed-time signal baseline.
The experimental design splits the morning and evening peaks into three time-period scenarios each for training. Results show that adaptive signal control with deep reinforcement learning reduces queue lengths at the intersection. Training converges within 100 episodes in every experimental scenario, and performance improves by 50% during the busiest period of the morning peak. The model adapts to the traffic volumes of the different urban peak periods in the study design, and the flexible state, action, and reward design allows it to generalize to other scenarios.
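The reward and action design described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the equivalency factors in `PCE`, the `Movement` class, and the definition of pressure as equivalency-weighted incoming vehicles minus outgoing vehicles are all assumptions made here for illustration, since this record does not give the actual values or formulas.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence

# Hypothetical equivalency factors per vehicle type (assumed values,
# not taken from the thesis).
PCE = {"car": 1.0, "motorcycle": 0.5, "bus": 2.0}


@dataclass
class Movement:
    """One traffic movement through the intersection (hypothetical structure)."""
    incoming: Mapping[str, int]  # vehicle counts by type on the approach
    outgoing: Mapping[str, int]  # vehicle counts by type on the receiving link


def weighted_count(counts: Mapping[str, int]) -> float:
    """Convert per-type vehicle counts into equivalency-weighted units."""
    return sum(PCE[vehicle] * n for vehicle, n in counts.items())


def movement_pressure(m: Movement) -> float:
    """Assumed pressure of a movement: weighted incoming minus weighted outgoing."""
    return weighted_count(m.incoming) - weighted_count(m.outgoing)


def reward(movements: Sequence[Movement]) -> float:
    """Negative total intersection pressure, so maximizing reward minimizes pressure."""
    return -abs(sum(movement_pressure(m) for m in movements))


def next_phase(current_phase: int, action: int, num_phases: int) -> int:
    """Action 0 extends the current green; action 1 switches to the next
    phase in the fixed timing sequence (wrapping around at the end)."""
    if action == 0:
        return current_phase
    return (current_phase + 1) % num_phases


movements = [
    Movement({"car": 4, "motorcycle": 6}, {"car": 2}),
    Movement({"bus": 1}, {"car": 1, "motorcycle": 2}),
]
print(reward(movements))      # -5.0
print(next_phase(3, 1, 4))    # 0
```

Restricting the agent to "extend" or "advance in sequence" keeps the phase order identical to a conventional timing plan, which is one plausible reason to prefer it over free phase selection.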

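Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Applications of Reinforcement Learning to Signal Control
2.2 Neural Network Architecture Design
2.3 Reinforcement Learning Mechanisms
Chapter 3 Methodology
3.1 Deep Reinforcement Learning Algorithms
3.2 Value-Based Rainbow DQN
3.2.1 Double DQN
3.2.2 Prioritized Experience Replay
3.2.3 Dueling Network
3.2.4 Distributional DQN
3.2.5 Noisy Net
3.2.6 n-Step Learning
Chapter 4 Model and Experimental Design
4.1 Reinforcement Learning Model Design
4.1.1 Agent Design
4.1.2 Neural Network Architecture
4.1.3 Training Procedure
4.2 Scope of Study
4.2.1 Data Used
4.2.2 Simulation Software
4.3 Experimental Design
4.4 Signal Control in the Simulation Scenario
Chapter 5 Training Results
5.1 Training Performance
5.1.1 Queue Length and Stopped Delay
5.1.2 Vehicle Count Analysis
5.1.3 Loss Analysis
5.2 Comparison of Vehicle-Type Equivalency Settings
Chapter 6 Conclusions and Recommendations
6.1 Conclusions
6.2 Recommendations
Chapter 7 References
Appendix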

