| Graduate student: | 謝得人 Ter-Jen Hsieh |
|---|---|
| Thesis title: | 自編碼器於推薦系統之應用分析 Application and Analysis of Autoencoder in Recommender Systems |
| Advisor: | 洪盟凱 |
| Oral examination committee: | |
| Degree: | Master |
| Department: | College of Science - Department of Mathematics |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 57 |
| Chinese keywords: | 自編碼器 、推薦系統 、特徵抽取 、過擬合 、混合模型 |
| English keywords: | autoencoder, recommender system, feature extraction, overfitting, hybrid model |
This study examines the application of autoencoders, a member of the neural network family, to recommender systems. It has two parts. The first observes how model performance changes with hyperparameters such as the hidden-layer dimension, the number of hidden layers, and the strength of regularization and dropout. The second builds a hybrid model in which the features extracted by the autoencoder serve as preprocessing for a content filtering algorithm, and analyzes its performance. The recommendation scenario uses the MovieLens 1M dataset, which contains 1,000,209 ratings given by 6,040 users to 3,706 movies. Models are trained to predict users' ratings of movies and are evaluated by RMSE. The experiments show that increasing the hidden-layer dimension tends to cause overfitting, that increasing the number of hidden layers speeds up convergence and improves performance, and that regularization and dropout are markedly effective at preventing overfitting. The hybrid model, which uses the autoencoder to reduce dimensionality and extract user features, performs slightly better than traditional collaborative filtering.
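The abstract names RMSE as the evaluation metric on a rating matrix in which most user-movie pairs are unrated. The following is a minimal sketch of how such a metric is commonly computed over observed ratings only. It is not taken from the thesis: the function name `masked_rmse`, the use of `None` to mark a missing rating, and the toy data are all assumptions made for illustration. The density figure uses only the counts stated in the abstract.

```python
import math


def masked_rmse(predicted, actual):
    """RMSE over observed ratings only; None marks a missing rating (assumed convention)."""
    squared_errors = [
        (p - a) ** 2
        for p_row, a_row in zip(predicted, actual)
        for p, a in zip(p_row, a_row)
        if a is not None  # skip user-movie pairs with no rating
    ]
    if not squared_errors:
        raise ValueError("no observed ratings to evaluate")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: 2 users x 3 movies, None = not rated.
actual = [[5, None, 3], [None, 4, 1]]
predicted = [[4.5, 2.0, 3.5], [3.0, 4.0, 2.0]]
toy_rmse = masked_rmse(predicted, actual)

# Fraction of the MovieLens 1M user-movie matrix that is rated,
# computed from the counts given in the abstract.
density = 1_000_209 / (6_040 * 3_706)

print(f"toy RMSE: {toy_rmse:.4f}")
print(f"rating matrix density: {density:.2%}")
```

Restricting the error to observed entries matters here because, by the abstract's own counts, only a few percent of the 6,040 by 3,706 matrix is filled in. Averaging over all cells would mostly measure how the missing values were imputed.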