
Graduate student: 蕭琮寶 (Chung-Pao Hsiao)
Thesis title: Application Study of Self-built Risk Control Models in Cost Reduction and Revenue Enhancement (自建風控模型在降低成本和提高收益方面的應用研究)
Advisor: 梁德容 (Deron Liang)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Executive Master of Computer Science & Information Engineering
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: Chinese
Number of pages: 48
Keywords (Chinese): 風控評分卡, 機器學習, 模型解釋性, 成本控制, 收益率
Keywords (English): Risk Scoring System, Machine Learning, Model Interpretability, Cost Control, Profitability
  • This study examines the use of a self-built risk control model to reduce costs and
    increase revenue. Many companies currently rely on external risk control vendors for
    risk assessment, which leads to high costs and opaque models. This study proposes a
    self-built risk control model based on stacking that uses internal data to build an
    accurate and efficient risk scorecard model, with the aim of replacing external vendors
    and improving overall revenue.
    The proposed model uses stacking to combine multiple base models (such as logistic
    regression, decision trees, XGBoost, and LightGBM) and introduces LIME (Local
    Interpretable Model-agnostic Explanations) to improve model interpretability. First,
    the company's internal loan data is collected and the information submitted by users is
    extracted from it. The model then outputs each user's default probability, which is
    mapped to a scorecard score that is used to adjust the loan amount.
    Experimental results show that the self-built risk control model performs well in
    reducing the default rate and improving the rate of return. Compared with the external
    risk control model, it lowers risk control costs and improves model transparency and
    the accuracy of the assessment results. A risk control model built on internal data
    also has clear advantages in responding to changing market demands and protecting data
    security.
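The abstract mentions mapping the model's output probability to a scorecard score but does not give the formula. The sketch below shows one common way to do this, the log-odds scaling used in many credit scorecards, and is not taken from the thesis. The function name and every parameter value (`base_score`, `base_odds`, `pdo`, and the 300 to 850 clamp) are illustrative assumptions.

```python
import math


def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0,
                         min_score=300.0, max_score=850.0):
    """Map a predicted default probability to a scorecard score.

    Assumed (hypothetical) scaling, not the thesis's own parameters:
    - base_score is awarded when the good:bad odds equal base_odds
    - every doubling of the odds adds pdo ("points to double the odds")
    - the result is clamped to [min_score, max_score]
    """
    eps = 1e-9
    # Keep the probability strictly inside (0, 1) so the logarithm is defined.
    p = min(max(p_default, eps), 1.0 - eps)
    odds_good = (1.0 - p) / p              # odds of repaying vs. defaulting
    factor = pdo / math.log(2.0)           # points per unit of log-odds
    offset = base_score - factor * math.log(base_odds)
    score = offset + factor * math.log(odds_good)
    return min(max(score, min_score), max_score)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.50):
        print(f"p_default={p:.2f} -> score={probability_to_score(p):.1f}")
```

A lower default probability yields a higher score, so thresholds on the score can then drive decisions such as the loan amount adjustment described in the abstract.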

    Chinese Abstract
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation and Purpose
      1.2 Research Objectives
      1.3 Thesis Organization
    Chapter 2 Literature Review
      2.1 Risk Control Models
      2.2 Current State of Risk Assessment Techniques
      2.3 Machine Learning Models
        2.3.1 Logistic Regression
        2.3.2 Random Forest
        2.3.3 XGBoost
        2.3.4 LightGBM
        2.3.5 LIME
    Chapter 3 Proposed Solution
      3.1 Introduction
      3.2 System Architecture Design
      3.3 Data Collection and Preprocessing
        3.3.1 Data Sources
        3.3.2 Data Preprocessing
      3.4 Model Selection and Training
        3.4.1 Logistic Regression Model Training
      3.5 Risk Scorecard Design
        3.5.1 FICO Score Conversion
        3.5.2 Scorecard Generation
    Chapter 4 Experimental Design and Results
      4.1 Evaluation Metrics
      4.2 Experiment 1: Model Performance Evaluation
        4.2.1 Experimental Procedure
        4.2.2 Results
      4.3 Experiment 2: Model Interpretability Evaluation
        4.3.1 Experimental Procedure
        4.3.2 Results
      4.4 Experiment 3: Comparison of Default Rates under Internal and External Risk Control
        4.4.1 Experimental Procedure
        4.4.2 Results
      4.5 Experiment 4: Comparison of Rates of Return under Internal and External Risk Control
        4.5.1 Experimental Procedure
        4.5.2 Results
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References

