Natural Language Processing and Machine Learning Techniques for Disease Classification of Medical Records


Graduate student:	Mei-Chun Chen (陳美君)
Thesis title:	Natural Language Processing and Machine Learning Techniques for Disease Classification of Medical Records (應用自然語言處理與機器學習於疾病分類編碼之探討)
Advisor:	Chih-Fong Tsai (蔡志豐)
Oral defense committee:
Degree:	Master
Department:	College of Management - Department of Information Management
Year of publication:	2020
Academic year of graduation:	108 (ROC calendar)
Language:	Chinese
Number of pages:	129
Chinese keywords:	疾病分類、自然語言處理、機器學習、LightGBM
English keywords:	Disease Classification, Natural Language Processing, Machine Learning, LightGBM

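Taiwan's healthcare quality is internationally recognized. Since the universal National Health Insurance (NHI) program was launched in 1995, the enormous volume of medical visit data has become a cornerstone of Taiwan's influence on medical development worldwide. However, heavy demand for medical care has driven medical costs steadily upward, posing a major challenge to the long-term operation of the NHI system. To make better use of medical resources, the Ministry of Health and Welfare has been actively revising the payment system. As a result, the claims system based on disease classification codes is closely tied to reimbursement, and the appropriateness, accuracy, and completeness of coding have become key to medical payment.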

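Medical providers face a wide and complex variety of clinical disease terminology, and the unstructured text of medical records must be read and understood by people before diagnoses can be classified correctly. To address this difficulty, this study applies the N-Gram and TF-IDF techniques of natural language processing to extract text feature vectors from de-identified real medical records, and uses machine learning to build four predictive classification models: SVM, MLP, GBDT, and LightGBM. Cross-validation is used to reduce model bias. The models are evaluated, compared, and analyzed using the Accuracy, Precision, Recall, and F1 Score derived from the Confusion Matrix, together with AUC. Three experiments are designed to examine problems common in clinical medicine: class imbalance, restoration of abbreviated medical terms, and differences in physicians' personal writing styles.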
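The feature extraction step described above (word N-grams weighted by TF-IDF) can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function names `word_ngrams` and `tfidf_vectors`, the whitespace tokenizer, the unsmoothed `tf * log(N / df)` weighting, and the sample note snippets are all assumptions made for the example.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_max=2):
    """Build one sparse TF-IDF vector (dict: n-gram -> weight) per document."""
    counts = [Counter(word_ngrams(d, n_max)) for d in docs]
    # Document frequency: number of documents that contain each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        # Term frequency times unsmoothed inverse document frequency.
        vectors.append({
            gram: (k / total) * math.log(n_docs / df[gram])
            for gram, k in c.items()
        })
    return vectors


# Hypothetical, made-up note snippets (not data from the study).
notes = [
    "acute myocardial infarction",
    "chronic kidney disease",
    "acute kidney injury",
]
vectors = tfidf_vectors(notes)
```

In this toy corpus, "myocardial" occurs in only one note and therefore receives a higher weight than "acute", which occurs in two. Library implementations typically add smoothing and vector normalization, so their exact weights would differ from this sketch.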

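The results show that LightGBM outperforms the other models, particularly in training time. Class balancing helps improve classifier performance. Medical abbreviations are distinctive, which helps classification. Because diseases are named with specialized medical terms, physicians describe the same disease in the same way even though their styles of expression differ; the writing styles of physicians in different departments do not affect the results of the classification model.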

The quality of Taiwan's medical and health care ranks among the best in the world. The National Health Insurance (NHI) program, founded in 1995, has collected millions of electronic medical records (EMRs) from citizens, and these EMRs have become the basis for the development of medical technology in Taiwan. However, operating the NHI is costly, and rapidly rising costs driven by growing demand for medical care make its situation worse. To address this problem and use resources more efficiently, the NHI Administration has defined a number of systems to ensure that resources are used properly; one of them is ICD-10-CM/PCS. Correct ICD-10-CM/PCS coding is key to NHI reimbursement.

To address the complexity of medical terminology, this research applies the N-gram and TF-IDF techniques of NLP to real, de-identified EMRs. SVM, MLP, GBDT, and LightGBM models are then constructed with cross-validation. The four models are compared and analyzed in terms of Accuracy, Precision, Recall, and F1 Score from the Confusion Matrix, together with AUC. In addition, three experiments are designed to examine the impact of class imbalance, the need for abbreviation restoration, and differences in personal writing style.
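As a rough illustration of the evaluation described above, the sketch below derives Accuracy, Precision, Recall, and F1 Score from the confusion-matrix counts of one class scored against all others. The function name `binary_metrics` and the labels "A", "B", and "C" are hypothetical, and AUC is left out because it requires predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive):
    """Score one class against all others using confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Made-up labels for illustration only (not results from the study).
y_true = ["A", "A", "B", "C", "A", "B"]
y_pred = ["A", "B", "B", "C", "A", "A"]
scores = binary_metrics(y_true, y_pred, positive="A")
```

Under cross-validation, these metrics would be computed on each held-out fold and then averaged across folds.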

The results show that LightGBM performs better than the other models, especially in training time, and that the classification models perform better when the originally imbalanced training set is balanced during preprocessing. Unlike abbreviations in everyday language, medical abbreviations are distinctive, and this uniqueness contributes to the model. Diseases are named with specialized terms, so although different doctors may describe the same disease in different personal writing styles, the features selected for the trained model remain the same; writing style has no influence on the model or its results.
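The abstract does not say which balancing method was used. Purely as an assumption, the sketch below shows random oversampling, one common way to balance a training set by duplicating minority-class samples until every class matches the largest one. The function name `random_oversample` and the sample data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Made-up imbalanced training set: five notes of class "A", two of class "B".
samples = [f"note_{i}" for i in range(7)]
labels = ["A"] * 5 + ["B"] * 2
balanced_samples, balanced_labels = random_oversample(samples, labels)
class_sizes = Counter(balanced_labels)
```

Balancing of this kind is normally applied to the training folds only, so that duplicated samples do not leak into the data used for evaluation.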

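Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Organization
Chapter 2. Literature Review
2-1 Significance of Disease Classification
2-1-1 Medical Vocabularies
2-2 Machine Learning in Clinical Decision Support
2-2-1 Pioneers of Medical AI: Expert Systems
2-2-2 AI Decision Support Systems: Precision Medicine
2-3 Research and Literature Related to Disease Classification
Chapter 3. Research Methods
3-1 Research Framework
3-2 Data Source
3-3 Data Preprocessing
3-3-1 Text Preprocessing
3-3-2 Stemming
3-4 Text Vectorization
3-4-1 Bag of Words Model (BoW Model)
3-4-2 TF-IDF (Term Frequency-Inverse Document Frequency)
3-4-3 N-Gram
3-5 Building Machine Learning Classification Models
3-5-1 Support Vector Machine (SVM)
3-5-2 Multilayer Perceptron (MLP)
3-5-3 Gradient Boosting Decision Tree (GBDT)
3-5-4 LightGBM (Light Gradient Boosting Machine)
3-5-5 One-Versus-All (OVA)
3-5-6 Cross Validation
3-5-7 Handling Overfitting
3-6 Evaluation Methods
3-6-1 Confusion Matrix
3-7 Experimental Design
3-7-1 Design of Experiment 1
3-7-2 Design of Experiment 2
3-7-3 Design of Experiment 3
Chapter 4. Experimental Results and Analysis
4-1 Experimental Data
4-2 System Environment
4-3 Experimental Results
4-3-1 Analysis of Experiment 1
4-3-2 Analysis of Experiment 2
4-3-3 Analysis of Experiment 3
4-4 Summary of Experimental Analysis
Chapter 5. Conclusion
5-1 Research Conclusions
5-2 Research Limitations
5-3 Future Research Directions and Suggestions
References
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5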


                                

