| Graduate student: | 連丞宥 Cheng-You Lien |
|---|---|
| Thesis title: | 透過網頁瀏覽紀錄預測使用者之個人資訊與性格特質 Predicting Users' Demographic Information and Personality Through Browsing History |
| Advisor: | 陳弘軒 Hung-Hsuan Chen |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2018 |
| Academic year of graduation: | 106 (ROC academic year) |
| Language: | Chinese |
| Number of pages: | 44 |
| Chinese keywords: | 監督式學習、分群、大六性格特質分數 |
| English keywords: | Supervised learning, Clustering, Big-six personality |
A user's web browsing history reflects that user's browsing preferences, so it has become one of the best ways to learn about the user. Applications that analyze browsing histories to deliver personalized product and advertisement recommendations have become increasingly common in recent years. The accuracy of such recommendations depends on how well the system understands the user; a recommender system can therefore be more effective if it can infer users' demographic information and personality traits from their browsing histories.
This thesis analyzes the web browsing histories of 600 users to identify representative user features. These features are then used in a method that combines clustering with supervised learning to predict each user's gender, age, relationship status, and Big Six personality scores. The proposed method improves the accuracy of the supervised prediction model and performs well on all of these targets. It also broadens the scope of user behavior analysis: predictions from browsing history need not be limited to demographic information, but can extend to a deeper understanding of users' personalities.
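The abstract does not spell out how clustering and supervised learning are combined, so the following is only a minimal sketch of one plausible arrangement, not the thesis's actual pipeline. It assumes each user is summarized by a small hypothetical feature vector (here, the share of visits to two made-up site categories), clusters the users with plain k-means, appends a one-hot cluster indicator to each feature vector, and then predicts a hypothetical binary label with a k-nearest-neighbor vote. The feature definitions, the labels "A" and "B", the number of clusters, and the choice of k-NN are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def augment(point, centroids):
    """Append a one-hot cluster indicator to the raw feature vector."""
    cluster = min(
        range(len(centroids)), key=lambda c: math.dist(point, centroids[c])
    )
    one_hot = [1.0 if c == cluster else 0.0 for c in range(len(centroids))]
    return list(point) + one_hot


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the n_neighbors closest training vectors."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [share of shopping visits, share of news visits].
users = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
labels = ["A", "A", "A", "B", "B", "B"]  # hypothetical target attribute

centroids, cluster_labels = kmeans(users, k=2)
train_x = [augment(u, centroids) for u in users]
prediction = knn_predict(train_x, labels, augment([0.82, 0.18], centroids))
print(prediction)
```

Appending the cluster indicator lets the supervised step use group-level structure in addition to the raw features; an alternative under the same assumptions would be to train a separate predictor per cluster.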