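A Study of Hybrid Machine Learning Techniques for Bankruptcy Prediction | National Central University Thesis and Dissertation System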


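Graduate student:	王暉元 Hui-Yuan Wang
Thesis title:	A Study of Hybrid Machine Learning Techniques for Bankruptcy Prediction (混合式機器學習技術於破產預測之研究)
Advisor:	蔡志豐 Chih-Fong Tsai
Committee members:
Degree:	Master
Department:	College of Management - Executive Master of Information Management
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar)
Language:	Chinese
Number of pages:	79
Keywords (Chinese):	Data mining, Hybrid architecture, Support vector machine, Affinity propagation, Logistic regression, K-means
Keywords (English):	Data mining, Support vector machine, Affinity propagation, Logistic regression, K-means, Hybrid machine learning model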

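Demand for assessing corporate financial distress keeps growing as cases of financial distress multiply among companies worldwide, and with it the demand for more effective financial distress prediction models. In prediction problems, machine learning research has mainly focused on finding, among different techniques, the single model with the highest accuracy. A single model built with supervised learning alone no longer readily yields breakthroughs in prediction accuracy, which has led to a new trend: integrating multiple algorithms to improve data mining performance. Hybrid data mining combines the strengths of two or more learning methods to improve on the effectiveness or efficiency of a single learning method. As hybrid classifiers have gradually become a research trend, hybrid architectures that integrate multiple techniques have indeed been found to outperform single classification techniques. Few prior studies, however, have examined how different types of hybrid pairings perform in prediction.
Given the importance of making more accurate predictions, this study pairs machine learning and statistical classifiers with machine learning or statistical clustering algorithms and compares which hybrid model gives the most accurate predictions on financial prediction datasets. Four hybrid combinations are tested on multiple bankruptcy prediction datasets: Affinity propagation with Support vector machine; K-means with Logistic regression; Affinity propagation with Logistic regression; and K-means with Support vector machine. In the experiments, Affinity propagation with Support vector machine gave the best predictive performance on more datasets than the other combinations and also had the best average AUC. Affinity propagation also improved the prediction accuracy of Logistic regression, which makes that combination an option when a shorter model-building time is required. The results are intended to serve as a reference for building prediction models in practice.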
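The comparison above is reported in terms of AUC (area under the ROC curve). As a minimal sketch of how that metric can be computed from a classifier's scores, the function below uses the pairwise-ranking definition: the fraction of (bankrupt, non-bankrupt) pairs in which the bankrupt firm receives the higher score, with ties counted as one half. The function name `auc`, the label encoding (1 = bankrupt, 0 = non-bankrupt), and the toy scores are illustrative assumptions and are not taken from the thesis.

```python
def auc(labels, scores):
    """AUC via pairwise ranking (equivalent to the Mann-Whitney U statistic).

    labels: iterable of 0/1 class labels (assumed encoding: 1 = bankrupt).
    scores: iterable of model scores, higher meaning "more likely bankrupt".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    # Count pairs where the positive sample outranks the negative one;
    # a tie contributes half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)  # 0.75
```

An average AUC over several datasets, as reported in the abstract, would then be the mean of the per-dataset values.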

This study investigates the efficacy of applying various hybrid machine learning models to the bankruptcy prediction problem. Although hybrid models are well known to perform well in prediction tasks, the approach has a limitation: finding an appropriate hybrid model structure is something of an art. Few studies have explored how different types of hybrid model combinations perform in prediction. This study compares machine learning and statistical classifiers paired with machine learning or statistical clustering algorithms. Four hybrid combinations were evaluated on multiple bankruptcy prediction datasets: Affinity propagation with Support vector machine; K-means with Logistic regression; Affinity propagation with Logistic regression; and K-means with Support vector machine.
The results show that Affinity propagation with Support vector machine achieves the best predictive performance on many of the datasets.
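The abstract does not spell out how the clustering stage and the classification stage are connected. One common arrangement, assumed here purely for illustration, uses clustering as a filter: the training set is clustered, rows whose class label disagrees with the majority label of their cluster are dropped, and the classifier is trained on the remaining rows. The sketch below shows this for the K-means with Logistic regression pairing using only the Python standard library. All function names, the farthest-first initialization, the hyperparameters, and the toy data are hypothetical and are not the thesis's implementation.

```python
import math
from collections import Counter


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain K-means with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def keep_majority_rows(y, assign):
    """Indices of rows whose label matches the majority label of their cluster."""
    majority = {}
    for c in set(assign):
        labels_in_c = [yi for yi, a in zip(y, assign) if a == c]
        majority[c] = Counter(labels_in_c).most_common(1)[0][0]
    return [i for i, (yi, a) in enumerate(zip(y, assign)) if yi == majority[a]]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Logistic regression fitted by per-sample gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Predicted class (1 = bankrupt under the assumed encoding)."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy data: two well-separated groups plus one row (the last) whose label
# disagrees with its neighbours.
X = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3),
     (4.0, 4.2), (4.1, 3.9), (3.8, 4.0), (4.2, 4.1),
     (1.0, 0.9)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

assign = kmeans(X, k=2)                      # stage 1: clustering
keep = keep_majority_rows(y, assign)         # filter out inconsistent rows
w, b = train_logistic([X[i] for i in keep],  # stage 2: classifier on kept rows
                      [y[i] for i in keep])
print(keep)
print(predict(w, b, (1.0, 1.0)), predict(w, b, (4.0, 4.0)))
```

Under the same assumption, the other three pairings would swap the clustering step for Affinity propagation, the classifier for a Support vector machine, or both, leaving the filtering step unchanged.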

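Abstract (Chinese)
ABSTRACT
Acknowledgements
Chapter 1 Introduction
1.    Research background
2.    Research motivation
3.    Research objectives
4.    Research structure
Chapter 2 Literature Review
1.    Bankruptcy prediction
2.    Definition of data mining
3.    Hybrid data mining models
4.    Support Vector Machine
5.    Logistic Regression
6.    K-means
7.    Affinity propagation
Chapter 3 Research Method
1.    Research design and framework
1.1.    Building and testing single classifiers
1.2.    Building and testing hybrid models
2.    Data sources
3.    Model evaluation
Chapter 4 Experiments
1.    Support vector machine parameter settings
2.    Results and comparison of the prediction models
2.1.    Prediction results of the hybrid classifiers
2.2.    Best hybrid model for the LR classifier
2.3.    Best hybrid model for the SVM classifier
3.    Comparison of time spent by the prediction models
Chapter 5 Conclusions and Suggestions
1.    Research conclusions
2.    Research contributions and suggestions
References
Appendix: List of sample attributes in the TEJ dataset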

                                

