跳到主要內容

簡易檢索 / 詳目顯示

研究生: 鄭豐叡
CHENG FUNG-YU
論文名稱: 在多核心電腦上實作偵測k-clique community之平行演算法
指導教授: 何錦文
口試委員:
學位類別: 碩士
Master
系所名稱: 資訊電機學院 - 資訊工程學系
Department of Computer Science & Information Engineering
論文出版年: 2014
畢業學年度: 102
語文別: 中文
論文頁數: 77
中文關鍵詞: 平行演算法分群問題連通圖問題
外文關鍵詞: parallel algorithms, k-clique communities, connected components
相關次數: 點閱:20下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 中文摘要
    偵測真實網路中的community結構是重要的研究主題,在許多方面皆有所應用,譬如:生物學、社會學、全球資訊網以及行動通信網路等。有許多不同的community偵測方法被提出。其中,我們發現Palla, G., et al提出的k-clique community相當重要。不過,他們的程式CFinder執行時間比起其它的偵測方法都要來得長。我們提出偵測k-clique community的平行演算法Parallel-Community,並實作在多核心電腦上,藉以縮短偵測k-clique community的時間。不過,在我們實作的時候,發現我們的作法和Gregori, E., et al的COS演算法,在某些部分作法相似。我們用五張真實網路圖,來測試比較Parallel-Community、COS和CFinder。使用單處理器來執行時,Parallel-Community與CFinder相比,最快可以到130倍,最慢至少可以達到54倍。使用多處理器來執行時,Parallel-Community與COS相比,最快可以到61倍,最慢至少可以達到24倍。


    Abstract.
    Detecting communities is an important research topic. It can be applied to biology, sociology, World Wide Web, mobile communication networks, etc. Many different detection methods have been published. Among them, k-clique community by Palla, G., et al is an important one. However, their program Cfinder’s execution time compared with other detection methods is time consuming. In order to reduce the running time of detecting communities, we propose a parallel algorithm of k-clique community "Parallel-Community" and implement it on a multi-core parallel computer. When we implement the algorithm, we find out some parts of Parallel-Community are similar to Gregori, E., et al’s COS. We experiment on five real networks to test Parallel-Community, COS, and CFinder. When executing with single processor, Parallel-Community achieve 54 to 130 times faster than CFinder. When executing with multiple processors, Parallel-Community achieve 24 to 61 times faster than COS.

    摘要 i Abstract ii 誌謝 iii 目錄 iv 圖目錄 vi 表目錄 viii 符號與專有名詞 ix 第一章 序論 1 1-1 community背景介紹 1 1-2 研究動機 3 1-3 研究方法簡介 5 1-4 研究成果簡介 7 1-5 論文架構 8 第二章 相關研究 9 2-1 CPM介紹 10 2-2 MCE演算法相關研究 15 2-3 平行connected component相關研究 25 第三章 Parallel-community演算法 28 3-1 Parallel-community作法介紹 28 3-2 計算Gc的端點 32 3-3 計算Gc中端點的邊 34 3-4 在disjoint set中處理Gc的邊以及合併p組disjoint set 38 第四章 實驗結果與討論 42 4-1 基本資料介紹 43 4-2 計算maximal clique的執行結果比較 44 4-3 單處理器下CPM.step B及CPM.step C的執行結果 47 4-4 多處理器下CPM.step B及CPM.step C的執行結果 47 第五章 結論與未來工作 57 參考文獻 59 附錄A 63

    [1] Ahn, Y. Y., Bagrow, J. P., Lehmann, S. (2010). "Link communities reveal multiscale complexity in networks." Nature 466(7307): 761-764.

    [2] Bron, C., Kerbosch, J. (1973). "Algorithm 457: finding all cliques of an undirected graph." Communications of the ACM 16(9): 575-577.

    [3] Chen, J., Yuan, B. (2006). "Detecting functional modules in the yeast protein-protein interaction network." Bioinformatics 22(18): 2283-2290.

    [4] Chiba, N., Nishizeki, T. (1985). "Arboricity and subgraph listing algorithms." SIAM Journal on Computing 14(1): 210-223.

    [5] Chin, C. H., Chen, S. H., Ho, C. W., Ko, M. T., Lin, C. Y. (2010). "A hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles." BMC bioinformatics 11(suppl 1): S25.

    [6] Chin, C. H., Chen, S. H., Chen, C. Y., Hsiung, C. A., Ho, C. W., Ko, M. T., Lin, C. Y. (2013). "Spotlight: Assembly of protein complexes by integrating graph clustering methods." Gene 518(1): 42-51.

    [7] Chin, F. Y., Lam, J., Chen, I. N. (1982). "Efficient parallel algorithms for some graph problems." Communications of the ACM 25(9): 659-665.

    [8] Choudhary, A., Thakur, R. (1994). "Connected component labeling on coarse grain parallel computers: an experimental study." Journal of Parallel and Distributed Computing 20(1): 78-83.

    [9] Clauset, A., Newman, M. E., Moore, C. (2004). "Finding community structure in very large networks." Physical review E 70(6): 066111-1 - 066111-6.

    [10] Clauset, A., Moore, C., Newman, M. E. (2008). "Hierarchical structure and the prediction of missing links in networks." Nature 453(7191): 98-101.

    [11] Copty, N., Ranka, S., Fox, G. C., Shankar, R. V. (1994). "A data parallel algorithm for solving the region growing problem on the connection machine." Journal of Parallel and Distributed Computing 21(1): 160-168.

    [12] Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. (2009). "Introduction to algorithms." MIT press, Cambridge, Massachusetts London, England.

    [13] Eppstein, D., Löffler, M., Strash, D. (2010). "Listing all maximal cliques in sparse graphs in near-optimal time." Algorithms and Computation 6506: 403-414.

    [14] Fortunato, S. (2010). "Community detection in graphs." Physics Reports 486(3): 75-174.

    [15] Gregori, E., Lenzini, L., Mainardi, S. (2013). "Parallel k-clique community detection on large-scale networks." IEEE Transactions on Parallel and Distributed Systems 24(8): 1651-1660.

    [16] Han, Y., Wagner, R. A. (1990). "An efficient and fast parallel-connected component algorithm." Journal of the ACM 37(3): 626-642.

    [17] Hirschberg, D. S., Chandra, A. K., Sarwate, D. V. (1979). "Computing connected components on parallel computers." Communications of the ACM 22(8): 461-464.

    [18] Johnson, D. S., Yannakakis, M., Papadimitriou, C. H. (1988). "On generating all maximal independent sets." Information Processing Letters 27(3): 119-123.

    [19] Karp, R. M. (1972). "Reducibility among combinatorial problems." Complexity of Computer Computations, Miller, R. E., Thatcher, J. W., Bohlinger, J. D., Ed., 85-103, Springer-Verlag, US.

    [20] Kim, W., Kohavi, R., Gehrke, J., DuMouchel, W. (2004). "Mining Scale-free Networks using Geodesic Clustering.", Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 719-724, ACM Press, New York, USA.

    [21] Kose, F., Weckwerth, W., Linke, T., Fiehn, O. (2001). "Visualizing plant metabolomic correlation networks using cliquemetabolite matrices." Bioinformatics 17(12): 1198-1208.

    [22] Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G. (1980). "Generating all maximal independent sets: NP-hardness and polynomial-time algorithms." SIAM Journal on Computing 9(3): 558-565.

    [23] Leskovec, J. "Stanford large network dataset collection." from http://snap.stanford.edu/data/.

    [24] Lick, D. R., White, A. T. (1970). "k-degenerate graphs." Canadian Journal of Mathematics 22: 1082-1096.

    [25] Makino, K., Uno, T. (2004). "New algorithms for enumerating all maximal cliques." 260-272, Proceedings of the 9th Scandinavian Workshop on Algorithm Theory, Springer-Verlag, Heidelberg, Berlin.

    [26] Moon, J. W., Moser, L. (1965). "On cliques in graphs." Israel journal of Mathematics 3(1): 23-28.

    [27] Nepusz, T., Yu, H., Paccanaro, A. (2012). "Detecting overlapping protein complexes in protein-protein interaction networks." Nature methods 9(5): 471-472.

    [28] Palla, G., Derényi, I., Farkas, I., Vicsek, T. (2005). "Uncovering the overlapping community structure of complex networks in nature and society." Nature 435(7043): 814-818.

    [29] Palla, G., Barabási, A. L., Vicsek, T. (2007). "Quantifying social group evolution." Nature 446(7136): 664-667.

    [30] Perkins, C. E. (2008). "Ad hoc networking.", Addison-Wesley, Boston, US.

    [31] Pons, P., Latapy, M. (2005). "Computing communities in large networks using random walks." Computer and Information Sciences-ISCIS 2005: 284-293, Springer Berlin Heidelberg.

    [32] Reddy, K. P., Kitsuregawa, M., Sreekanth, P., Rao, S.S. (2002). "A Graph Based Approach to Extract a Neighborhood Customer Community for Collaborative Filtering.", Proceedings of the Second International Workshop on Databases in Networked Information Systems, 188-200, Springer-Verlag, London, UK.

    [33] Savage, C., Ja'Ja', J. (1981). "Fast, efficient parallel algorithms for some graph problems." SIAM Journal on Computing 10(4): 682-691.

    [34] Schmidt, M. C., Samatova, N. F., Thomas, K., Park, B. H. (2009). "A scalable, parallel algorithm for maximal clique enumeration." Journal of Parallel and Distributed Computing 69(4): 417-428.

    [35] Tomita, E., Tanaka, A., Takahashi, H. (2006). "The worst-case time complexity for generating all maximal cliques and computational experiments." Theoretical Computer Science 363(1): 28-42.

    [36] Tsukiyama, S., Ide, M., Ariyoshi, H., Shirakawa, I. (1977). "A new algorithm for generating all the maximal independent sets." SIAM Journal on Computing 6(3): 505-517.

    [37] Van Dongen, S. M. (2000). "Graph clustering by flow simulation", Ph.D. Thesis, University of Utrecht, The Netherlands.

    QR CODE
    :::