
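Graduate student: Jia-Zen Fan (范嘉仁)
Thesis title: Using Folksonomy to Improve the Performance of Blog Ranking (利用大眾分類法改善部落格排名效能)
Advisor: Stephen J.H. Yang (楊鎮華)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 84
Chinese keywords: 大眾分類 (folksonomy), 部落格 (blog), 網頁排名 (PageRank)
Foreign-language keywords: Blog, PageRank, Folksonomy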
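  • Using PageRank to rank web search results has been accepted as a reliable method; however, its results are not entirely satisfying. Many studies have found that blog posts link to one another so sparsely that PageRank cannot recommend novel, highly relevant, but rarely linked blog posts to searchers. Furthermore, PageRank lacks topical exploration, so its rankings favor the most valuable blog posts but not necessarily the posts most relevant to the topic.
    This thesis attempts to propose a better ranking method that addresses these shortcomings. We further compare the reliability of existing topic-based page-ranking methods with that of folksonomy when each serves as the basis for topic classification, and we find that the folksonomy results better match searchers' expectations. This thesis describes that finding.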


    Using PageRank to rank search results on the web has been adopted as a reliable method; however, the results are not entirely satisfying. Many studies have found that there are so few links between blog posts that PageRank is unable to recommend novel, highly relevant, but weakly connected blog posts to users. Moreover, PageRank lacks topic discovery, so the ranking favors the most valuable blog posts but does nothing for the most topically relevant ones.
    We attempt to present a better ranking method that addresses these problems. Moreover, we compare the reliability of the latest topic-discovery page-ranking method with that of Folksonomy when each is used to generate the common topic relation. This paper describes the findings of that comparison.

    Contents
    Chinese Abstract (摘要)
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 What is the motivation of this research?
      1.2 What kinds of problems to be solved?
      1.3 Why are the problems significant?
      1.4 Solutions
      1.5 Contributions
    Chapter 2 Related Works
      2.1 General description of PageRank
        2.1.1 Current research status and challenges
        2.1.2 Various approaches of PageRank
        2.1.3 Industry Product of Blog Search
      2.2 Comparison of various approaches with our approach
        2.2.1 Strength, Weakness
        2.2.2 Opportunity, Threat
    Chapter 3 Method and Solutions
      3.1 Definition, axiom, theorem
        3.1.1 Folksonomy
        3.1.2 Topic Importance and Blogpost Importance
      3.2 Problem Model
        3.2.1 Web Surfing Model
        3.2.2 Topic Surfing Model on Folksonomy
      3.3 Algorithm
        3.3.1 Procedure of Blog Search
        3.3.2 Folksonomy BlogRank Calculating
    Chapter 4 System Implementation
      4.1 Implementation environment
        4.1.1 Hardware and software platforms
        4.1.2 Implementation languages and tools
      4.2 System architecture
        4.2.1 High-level system design and analysis
        4.2.2 Low-level system design and analysis
          4.2.2.1 Web Application
          4.2.2.2 Backend Application
          4.2.2.3 Database
      4.3 System demo
    Chapter 5 Experiment and Discussion
      5.1 Experiment design and setup
        5.1.1 Experiment scenario
        5.1.2 Roles, hardware, software, and network requirements setup
      5.2 Quantitative evaluation
        5.2.1 Effectiveness
        5.2.2 Precision
        5.2.3 Results and lesson learned
    Chapter 6 Conclusion and Future Work
    References

