| Graduate student: | 洪菁憶 Ching-Yi Hung |
|---|---|
| Thesis title: | 循序探勘在軟體版本控制上的應用 Using Mining Sequential Pattern to Version Control System for Software Maintenance |
| Advisor: | 林熙禎 Shi-Jen Lin |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Management - Department of Information Management |
| Academic year of graduation: | 96 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 60 |
| Keywords (Chinese): | 資料探勘 (data mining), 循序探勘 (sequential pattern mining), 開放原始碼 (open source), 版本控制系統 (version control system) |
| Keywords (English): | Data mining, Sequential pattern mining, Open source, Version control system |
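With the spread of the Internet, software developers can communicate over long distances and form virtual teams more easily. In addition, countries around the world have spared no effort in promoting open source projects. As a result, open source projects have emerged in large numbers.
Many websites now provide hosting space to help manage open source projects. Most open source projects allow anyone to write, modify, or suggest changes to the code, and team communication takes place over the network (for example, by mail), so communication within the virtual team and the storage and use of the program's historical data become all the more important. In recent years, software engineering research that uses data mining to uncover the latent value of historical data has gradually appeared. However, the approaches most commonly seen are association rules and similar methods. This study therefore proposes using sequential pattern mining, in the hope that approaching the problem from a different technical angle can provide suggestions for managing open source projects and for writing their code.
In the system architecture and workflow designed in this study, a program fetches data from the web interface of a version control system into a database. The data are then cleaned through preprocessing steps that merge, reorganize, or delete the records this study considers should be handled that way. Chapter 3 explains how the input required by the sequential pattern mining algorithm PrefixSpan maps to definitions in the version control system. Chapter 4 then describes the experiments carried out with the planned system workflow and the conclusions drawn from them. The aim is to offer another way of thinking about, and suggestions for, mining the historical data of version control systems in the software engineering field.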
The number of open source projects keeps growing, and version control systems play an important role in them. In recent years, using data mining to find valuable information in historical data has become a research topic in the software maintenance domain. However, association rules are the technique usually used in such maintenance research. In this paper, we propose a model that uses sequential pattern mining to try to find different information in version history data. Such information could help with the management of an open source project and with suggestions for it.
In this paper, we design a model to find sequential pattern rules. First, we collect data from the web interface of a version control system and then preprocess the history data. Next, we apply the sequential pattern algorithm PrefixSpan, whose variables we define in Chapter 3. Chapter 4 presents the experiments, some results, and their analysis. Finally, we discuss directions for further research and give the conclusion.
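To make the mining step concrete, the following is a minimal Python sketch of the PrefixSpan idea (pattern growth over prefix-projected databases). It is not the thesis's implementation: the mapping of version history to sequences is defined in Chapter 3, which is not reproduced here, so this sketch simply assumes that each sequence is the ordered list of files one developer changed, with one file per commit. The function name `prefixspan`, the variable `history`, and the file names are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern (tuple of items): support} for frequent sequential patterns.

    Simplified PrefixSpan: every element of a sequence is a single item
    (assumption: one changed file per commit), and support is the number of
    sequences that contain the pattern as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected: list of (sequence index, start position) pairs, i.e. the
        # suffixes that remain after matching `prefix` in each sequence.
        first_pos = defaultdict(dict)  # item -> {sequence index: first position}
        for seq_id, start in projected:
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                # Record only the first occurrence per sequence, so each
                # sequence contributes at most 1 to an item's support.
                first_pos[seq[pos]].setdefault(seq_id, pos)

        for item, occurrences in sorted(first_pos.items()):
            support = len(occurrences)
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Grow the pattern using the database projected on the new prefix.
            mine(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical change histories: files touched, in commit order.
history = [
    ["parser.c", "lexer.c", "main.c"],
    ["parser.c", "main.c"],
    ["lexer.c", "parser.c", "main.c"],
]
patterns = prefixspan(history, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(" -> ".join(pattern), support)
```

Under these assumptions, a pattern such as `parser.c -> main.c` would suggest that a change to `parser.c` tends to be followed by a change to `main.c`, which is the kind of ordering information that association rules do not capture.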