| Graduate student: | 傅勻垣 Yun-yuan Fu |
|---|---|
| Thesis title: | 基於相似度群集之社群維護 Social Community Maintenance based on Similarity Clustering |
| Advisor: | 蔡孟峰 Meng-feng Tsai |
| Committee members: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Academic year of graduation: | 100 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 42 |
| Chinese keywords: | 社群偵測, 使用者行為分析, K-means群集演算法, 社群網路分析 |
| English keywords: | user behavior analysis, community detection, K-means clustering algorithm, social network analysis |
Social network analysis uses the messages exchanged between users, together with their behavior, to characterize communities and the relationships among them. By discovering hidden communities we aim to support recommender systems, and the analysis results can also support search engines by improving precision for topic-specific searches. However, the structure of a social network changes constantly, so an analysis performed in the past cannot keep providing valid and accurate information. New documents that users post will inevitably alter the structure of each community, and may even change the community to which a user should belong.
In this thesis, we build a specialized hybrid community structure composed of several subject categories, based on the topics of the documents that users post. Using the similarity between categories (which we call the Fuzzy RT relation), we cluster similar categories with the K-means clustering algorithm and assign each user to the cluster containing the subject category they are interested in. To account for the effect of newly arriving documents on community structure, we also employ an update scheme, likewise based on K-means clustering, that adjusts the structure of the communities.
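The pipeline described above (cluster categories, assign users, then update as new documents arrive) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the abstract does not define the Fuzzy RT relation, so each category is represented here by a hypothetical vector of similarity scores against four fixed reference topics; the category names, user names, and numbers are invented; and the update step simply re-runs K-means starting from the previous centroids, which is one plausible reading of "an update scheme based on K-means" and not necessarily the scheme used in the thesis.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, centroids, max_iter=100):
    """Lloyd's K-means iteration from the given starting centroids.

    Returns (labels, centroids); labels[i] is the cluster index of vectors[i].
    """
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean(members)
    return labels, centroids


def assign_users(user_posts, category_cluster):
    """Map each user to the cluster of the category they post in most often."""
    return {
        user: category_cluster[Counter(posts).most_common(1)[0][0]]
        for user, posts in user_posts.items()
    }


# Hypothetical category vectors (stand-ins for the Fuzzy RT relation).
categories = ["technology", "programming", "sports", "football"]
vectors = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]

# Initial clustering with k = 2, seeded from two of the categories.
labels, centroids = kmeans(vectors, [vectors[0], vectors[2]])
category_cluster = dict(zip(categories, labels))

# Hypothetical posting history: the category of each document a user shared.
user_posts = {
    "alice": ["technology", "programming", "technology"],
    "bob": ["football", "sports", "football"],
}
user_cluster = assign_users(user_posts, category_cluster)

# Update step (assumed): new documents introduce a category, and K-means is
# re-run from the previous centroids instead of from scratch.
categories.append("esports")
vectors.append([0.3, 0.2, 0.8, 0.7])
new_labels, new_centroids = kmeans(vectors, centroids)
```

Starting the update from the previous centroids keeps existing communities stable unless the new data actually pulls a category toward a different cluster, which matches the maintenance goal stated in the abstract.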