FP-Tree不同實作方式之效能比較｜國立中央大學博碩士論文系統

簡易檢索 / 詳目顯示

回結果列表

研究生：	趙書榮 Shu-Jun Chao
論文名稱：	FP-Tree不同實作方式之效能比較 FP-Tree in different implement methods to compare the performances
指導教授：	陳彥良 Y.L.Chen
口試委員:
學位類別：	碩士 Master
系所名稱：	管理學院 - 資訊管理學系 Department of Information Management
畢業學年度：	91
語文別：	中文
論文頁數：	31
中文關鍵詞：	資料挖掘、關聯規則、演算法、FP-tree
外文關鍵詞：	data mining, association rule, algorithm, FP-tre
相關次數：	點閱：12 下載：0
分享至:	分享至facebook 分享至twitter

查詢本校圖書館目錄查詢臺灣博碩士論文知識加值系統勘誤回報

摘要
目前挖掘關聯規則的演算法可依需不需產生candidate itemset的作法分為兩類，例如Frequent-Pattern tree與Apriroi-like approach。此兩者最主要的差異在於，FP-tree並不產生candidate itemsets，它將資料庫壓縮在Frequent-Pattern tree的結構中，避免多次的高成本的資料庫掃瞄；後者是需要產生candidate itemset的方法。
而本文的目的是以應用Frequent-Pattern tree之理論，在實作方面以不同資料結構技術作效能比較測試，得到以那一種資料結構應用在Frequent-Pattern tree上執行時間之效能較佳。
在本文中共建立了(一)FP-tree_tail演算法，tail為在head table中增加一個tail欄位，(二)FP-tree_hash演算法，hash為以hash function計算出每個node所在位置方式建立FP-tree，(三)FP-tree_hash+tail演算法，為結合(一)、(二)之優點，所完成之演算法.，並將以上三個演算法與傳統FP-tree演算法一起比較，以找出各演算法之優缺點。經由本文實驗測試資料數據中，發現在各種實驗參數下，傳統FP-tree演算法所需花費之時間，為三個改良FP-tree演算法的數十倍。

now the algorithm in the association rules can be seperated two kinds.first is Apriori-like approach.second is Frequent-Pattern tree.main different between the above is the Frequent-Pattern tree did not to generate the candidate itemsets.its avoid a huge cost to scan database many times.
this paper apply three different data structure(FP-tree_tail,FP-tree_hash,FP-tree_hash+tail) to improve Frequent-Pattern tree algorithm .then to compare the performance about them ,
accroding to the test data we found the performance of the FP-tree alogrithm are worst then the other algorithms many times.

目錄
第一章	序論	1
第二章	文獻探討	3
第三章	研究動機	7
3．1	新增Head table 串列item問題	7
3．2	以樹狀結構建立Frequent Pattern Tree問題	10
第四章	演算法	13
4．1	FP-tree_tail演算法	13
4．1．1	FP-tree_tail之node資料結構	13
4．1．2	Head node 結構	14
4．1．3	建立Head table Link程式	14
4．2	FP-tree_hash演算法	15
4．2．1	Hash table 容量大小	15
4．2．2	Hash function	15
4．2．3	FP-tree_hash 之node資料結構	17
4．2．4	Head table node資料結構	18
4．2．5	Hash 演算法	18
4．3	FP-tree_hash+tail演算法	21
4．3．1	Head table node資料結構	21
4．3．2	FP-tree_hash＋tail演算法	21
第五章	實驗模擬	23
5．1	實驗設計	23
5．2	結果分析	24
5．3	實驗結論	27
第六章	結論與建議	29
參考文獻　　　　	30

                                

1. 陳彥良、凌俊青、許秉瑜，2001『在包裹式資料庫中挖掘數量關連規則』，資訊管理學報，第七卷‧第二期：215~229頁。
2. 陳彥良等，民90，『資料間隱含關係的挖掘與展望』，二十一世紀台灣湧現中的資訊管理議題專家研討會，大溪，鴻禧山莊。
3. Agrawal, R. and Srikant, R. "Fast Algorithms for Mining Association Rules," Proc. of the 20th Int''l Conference on Very Large Databases, Santiago, Chile, Sep. 1994.
4. Chen, M.S., Han, J. and Yu, P.S. "Data Mining: An Overview from a Database Perspective,'' IEEE Transactions on Knowledge and Data Engineering, (8:6) 1996, pp: 866-883.
5. Han, J. and Kamber, M. Data mining: Concepts and Techniques, Academic Press, 2001.
6. Han, J., Pei, J. and Yin, Y. "Mining Frequent Patterns without Candidate Generation," Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD''00), Dallas, TX, May 2000, pp: 1-12.
7. Li, S., Shen, H. and Cheng, L. "New Algorithms For Efficient Mining Of Association Rules," Information Sciences (118:1-4) 1999, pp:251-268.
8. Park, J-S., Chen, M-S. and Yu, P. S. "Using a Hash-Based Method with Transaction Trimming for Mining Association Rules," IEEE Trans. on Knowledge and Data Engineering (19:5) 1997, pp:813-825.
9. Pasquier, N., Bastide, Y., Taouil, R. and Lakhal, L. "Efficient Mining Of Association Rules Using Closed Itemset Lattices," Information Systems (24:1) 1999, pp:25-46.
10. Savasere, A., Omiecinski, E. and Navathe, S. "An Efficient Algorithm for Mining Association Rules in Large Databases," Proc. Int''l Conf. Very Large Data Bases, Zurich, Switzerland, Sep. 1995, pp:432-444.
11. Toivonen, H. "Sampling Large Databases For Association Rules," The 22th International Conference on Very Large Databases (VLDB''96), Mumbay, India, Sep. 1996, pp:134-145.

簡易檢索 / 詳目顯示

相關論文