| Graduate student: | 蘇俊儒 Jun-Ru Su |
|---|---|
| Thesis title: | 動態多模型融合分析研究 Dynamic Ensemble Learning Research |
| Advisor: | 陳弘軒 Hung-Hsuan Chen |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 49 |
| Chinese keywords: | 多模型融合, 動態多模型融合, 監督式學習 |
| Foreign keywords: | ensemble learning, dynamic ensemble learning, supervised learning |
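Most current approaches to combining multiple models use a fixed strategy: after training, the base models are fused in a “static” manner, that is, the way the base models are combined does not change with the features of the sample under test. In realistic training scenarios, however, a single model may only be good at predicting samples from a particular feature distribution. Because feature distributions differ from sample to sample, relying solely on a “static” fusion strategy may be too naive.
Mainstream ensemble methods mostly assume that a single base model predicts roughly equally well across different data. This thesis attempts to design “dynamic” ensemble learning to compensate for the shortcomings this assumption may cause. We tried five different methods, based respectively on (1) the class probabilities output by the base models; (2) the base models' predictions converted into losses; (3) the base models' predictive ability within the sample space; (4) the number of correct predictions within the sample space; and (5) an added classifier that determines a correctness attribute. These five methods are used to implement “dynamic” fusion.
This thesis describes the five methods we designed and reports experimental results on a synthetic dataset, an auto insurance dataset, Fashion-MNIST, and Kuzushiji-MNIST. The prediction accuracy of our fusion methods is higher than that of the base models in every case, which shows that dynamic ensemble learning is feasible. Compared with an ideal model, however, the results still fall far short, and there is room for improvement in training the learner for the extra attribute.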
Most current ensemble learning methods apply a static strategy to integrate the base learners. After training, the base learners are merged in a “static” manner; that is, the fusion strategy does not adapt to the feature distribution of the samples to be tested. However, in a realistic training scenario, a single model may only be good at predicting samples from a particular feature distribution. Since the features of each sample are distributed differently, a purely “static” fusion strategy may be overly naive.
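The contrast between static and dynamic fusion can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `static_fusion` and `dynamic_fusion`, the toy probabilities, and the hand-picked per-sample weights are all assumptions made for illustration.

```python
import numpy as np

def static_fusion(probs):
    """Average base-learner probabilities using the same weights for every sample.

    probs has shape (n_models, n_samples, n_classes).
    """
    return probs.mean(axis=0)

def dynamic_fusion(probs, weights):
    """Combine base-learner probabilities with a separate weight per sample.

    weights has shape (n_models, n_samples); each column is normalized to sum to 1.
    """
    w = weights / weights.sum(axis=0, keepdims=True)
    return (w[:, :, None] * probs).sum(axis=0)

# Toy numbers (assumed): two base learners, two samples, two classes.
probs = np.array([
    [[0.6, 0.4], [0.4, 0.6]],  # learner A
    [[0.1, 0.9], [0.1, 0.9]],  # learner B
])
# Hypothetical per-sample trust: learner A on sample 0, learner B on sample 1.
weights = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
])

static_pred = static_fusion(probs).argmax(axis=1)
dynamic_pred = dynamic_fusion(probs, weights).argmax(axis=1)
print(static_pred, dynamic_pred)
```

In this toy case the two strategies disagree on sample 0, because the dynamic weights let learner A decide there while the static average is dominated by learner B.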
Mainstream ensemble models mostly assume that a single base model predicts roughly equally well on different data. This thesis attempts to design a “dynamic” ensemble model to compensate for the shortcomings of this assumption. We have tried five different methods, based on (1) the class probabilities predicted by the base learners; (2) the losses of the base learners; (3) the percentage of nearby samples that the base learners predict correctly; (4) the number of nearby samples that the base learners predict correctly; and (5) extra features indicating which base learners correctly predict the label. These five methods realize the “dynamic” ensemble.
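As an illustration of the idea behind method (3), the sketch below weights each base learner by the fraction of a test sample's nearest validation samples that the learner classified correctly. The abstract does not say how "nearby" is defined or how the weights are applied, so the Euclidean k-nearest-neighbor rule, the held-out validation set, and the names `local_accuracy_weights` and `dynamic_predict` are assumptions rather than the thesis's actual procedure.

```python
import numpy as np

def local_accuracy_weights(X_val, correct_val, X_test, k=3):
    """Per-sample weights from the local accuracy of each base learner.

    X_val:       (n_val, n_features) validation features.
    correct_val: (n_models, n_val) booleans, True where a learner was right.
    X_test:      (n_test, n_features) test features.
    Returns an array of shape (n_models, n_test).
    """
    # Squared Euclidean distance from every test sample to every validation sample.
    dists = ((X_test[:, None, :] - X_val[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # (n_test, k)
    # Fraction of the k neighbors each learner predicted correctly.
    return correct_val[:, nearest].mean(axis=2)

def dynamic_predict(probs_test, weights, eps=1e-9):
    """Fuse probabilities of shape (n_models, n_test, n_classes) with per-sample weights."""
    w = weights + eps                            # avoid division by zero
    w = w / w.sum(axis=0, keepdims=True)
    return (w[:, :, None] * probs_test).sum(axis=0).argmax(axis=1)

# Toy data (assumed): learner A is right near x = 0, learner B near x = 1.
X_val = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
correct_val = np.array([
    [True, True, True, False, False, False],    # learner A
    [False, False, False, True, True, True],    # learner B
])
X_test = np.array([[0.05], [1.15]])
probs_test = np.array([
    [[0.8, 0.2], [0.9, 0.1]],                   # learner A
    [[0.3, 0.7], [0.4, 0.6]],                   # learner B
])

weights = local_accuracy_weights(X_val, correct_val, X_test, k=3)
preds = dynamic_predict(probs_test, weights)
print(weights, preds)
```

Replacing `.mean(axis=2)` with `.sum(axis=2)` would give a count of correct neighbors instead of a percentage, which is closer in spirit to method (4).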
This thesis explains the five methods we designed and reports experimental results on a simulated dataset and three real datasets: the Allstate dataset, the Fashion-MNIST dataset, and the Kuzushiji-MNIST dataset. We found that all five ensemble methods perform better than each of the single base learners. However, compared with an ideal model, the results are not yet good enough. Therefore, it may still be possible to improve our methods by better training the learner with extra features.