
Graduate student: Hao Hsu (徐昊)
Thesis title: Deep Q Network Learning for Sepsis Treatment in Intensive Care Unit (深度 Q 網絡學習用於加護病房敗血症治療)
Advisor: Sun-Chong Wang (王孫崇)
Oral examination committee:
Degree: Master
Department: Graduate Institute of Systems Biology and Bioinformatics (生醫理工學院 - 系統生物與生物資訊研究所)
Publication year: 2022
Graduation academic year: 110 (ROC calendar, 2021-2022)
Language: English
Pages: 84
Chinese keywords: 機器學習, 強化學習, 深度 Q 網絡, 敗血症, 加護病房
English keywords: Machine Learning, Reinforcement Learning, Deep Q Network, Sepsis, Intensive Care Unit
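  • Sepsis is a severe condition of systemic inflammation caused by infection and is a common cause of death in intensive care units. As the disease progresses, the patient's physiological functions are gradually impaired until normal function can no longer be maintained, eventually leading to death, and different sepsis patients respond differently to medical interventions. There is currently no universally accepted clinical treatment policy or guideline for sepsis, so treating sepsis patients is a challenging problem, and understanding a patient's physiological state at a specific time may be the key to formulating an effective treatment policy. Deep reinforcement learning is now widely applied; it lets a computer carry out judgment processes of human intelligence to assist people with difficult tasks. In our study, we propose a strategy capable of inferring the optimal sepsis treatment, using deep reinforcement learning to formulate a medical policy of reference value for treating sepsis patients. The learned treatment policy can help clinicians in intensive care units make medical decisions and improve the likelihood of patient survival. We found that the model's policy is slightly better than clinicians' decisions and matches the distribution of policy characteristics that clinicians actually carry out. It can provide decision support for sepsis treatment and assist physicians in carrying out medical strategies.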


    Sepsis is a severe, potentially fatal condition of systemic inflammation caused by infection. It is a prevalent cause of death in intensive care units (ICUs) and is costly for hospitals. As the disease progresses, patients' physiological functions are gradually damaged until normal function can no longer be maintained. Sepsis patients respond differently to treatment. Currently, there are no generally accepted treatment guidelines for sepsis, and treating patients with sepsis can be very challenging. Understanding a sepsis patient's condition and physiological state at a specific time may be key to developing an effective treatment policy. Deep reinforcement learning is widely used in the medical field, and such algorithms can carry out judgment processes of human intelligence to assist humans in performing complex tasks. In our study, we propose a strategy capable of inferring the optimal treatment for sepsis patients, using a deep reinforcement learning method to create a reference medical policy. The learned treatment policy can be used to help clinicians in intensive care units make medical decisions and improve the likelihood of patient survival. In our comparison, the learned policy is slightly better than the clinicians' policy. Finally, our policy conforms to the distribution of policy characteristics implemented by clinicians, so it can provide clinicians with additional support for sepsis treatment and assist physicians in medical strategies.
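As a rough illustration of the general technique the abstract names, the sketch below shows the standard Q-learning target and greedy action selection that deep Q networks build on. It is not the thesis's implementation: the function names, the four discrete treatment actions, the reward values, and the discount factor are all assumptions chosen for the example, and the Q-values are hard-coded where a trained network would normally supply them.

```python
# Minimal sketch of the Q-learning target used to train a deep Q network.
# All names, shapes, and numbers here are illustrative assumptions.

def bellman_targets(q_next, rewards, done, gamma=0.99):
    """Return r + gamma * max_a' Q(s', a') for each logged transition.

    Terminal transitions (done=True) do not bootstrap from the next state.
    """
    targets = []
    for q_row, reward, is_terminal in zip(q_next, rewards, done):
        bootstrap = 0.0 if is_terminal else gamma * max(q_row)
        targets.append(reward + bootstrap)
    return targets


def greedy_actions(q_values):
    """Pick the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q_values]


# Toy batch: 3 logged transitions, 4 hypothetical discrete treatment actions.
q_next = [
    [0.1, 0.5, 0.2, 0.0],
    [1.0, 0.3, 0.4, 0.2],
    [0.0, 0.0, 0.0, 0.9],
]
rewards = [0.0, 0.0, 1.0]     # assumed: reward only at the end of an episode
done = [False, False, True]   # the last transition ends the episode

targets = bellman_targets(q_next, rewards, done, gamma=0.9)
actions = greedy_actions(q_next)
```

In a full deep Q network, these targets would serve as regression labels for the network's estimate of Q(s, a), and the greedy actions would form the learned policy that is compared against the logged clinician actions.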

    Chinese Abstract
    English Abstract
    Table of Contents
    List of Figures
    List of Tables
    Explanation of Symbols
    Chapter 1 Introduction
      1-1 Machine Learning
      1-2 Reinforcement Learning
      1-3 Markov Decision Process
      1-4 Deep Q Learning
      1-5 Septicemia
    Chapter 2 Materials and Methods
      2-1 MIMIC-III Clinical Database
      2-2 Process Flow Chart
      2-3 Feature State Preprocessing
      2-4 Feature Action Preprocessing
      2-5 Feature Reward Preprocessing
    Chapter 3 Results
      3-1 Deep Q Network Models
      3-2 Adjustments Appropriately to Increase Model Performance
      3-3 Policies Evaluation of Different Models
      3-4 Actions Learned by the Different Models
    Chapter 4 Discussion and Conclusion
      4-1 Discussion and Conclusion
      4-2 Future Works
    References
    Appendix A
    Appendix B

