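A Study on Applying Minimum Spanning Tree and Utility Analysis to Association Rules (應用最小生成樹與效用分析於關聯規則之研究) | National Central University Master's and Doctoral Thesis System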

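Graduate student:	張文鴻 (Wen-Hong Zhang)
Thesis title:	應用最小生成樹與效用分析於關聯規則之研究 (A Study on Applying Minimum Spanning Tree and Utility Analysis to Association Rules)
Advisor:	陳炫碩 (Ken Chen)
Oral defense committee:
Degree:	Master
Department:	College of Management - Department of Business Administration
Year of publication:	2020
Academic year of graduation:	108 (ROC calendar)
Language:	Chinese
Number of pages:	48
Chinese keywords:	association rule, minimum spanning tree
Foreign-language keywords:	Association Rule, Minimum Spanning Tree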

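This study investigates how to find associations among consumers' purchasing behaviors by using a minimum spanning tree algorithm rather than conventional association rule algorithms. Conventional association rule algorithms require the decision maker to define threshold values in order to mine rules. Because there is no precise value for these thresholds, it is easy to obtain so many rules that the decision maker cannot examine them. Association rules also ignore the characteristics of consumer shopping, namely the quantity and monetary amount of the products purchased. This study therefore uses a minimum spanning tree to find the most closely related pairs of product categories in transaction data, applies mutual information to test the significance of each product-to-product relationship and show that it is statistically significant, and uses degree centrality to find the key products that appear most often across customers' shopping baskets. Finally, it uses utility scores to compute the most valuable association rules, which retail managers can use for promotional activities.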

This study explores how to find associations among consumers' purchasing behaviors using a minimum spanning tree algorithm instead of conventional association rule mining. Conventional association rule algorithms require the decision maker to define thresholds in order to mine rules, but choosing these thresholds is difficult, and it is easy to generate an excessive number of useless rules that the decision maker cannot examine. Association rules also ignore characteristics of consumer shopping, e.g., the quantity and monetary amount of each product purchased. Therefore, this study uses a minimum spanning tree to identify the most closely related pairs of product categories in the transaction database, uses mutual information to test the significance of product-to-product relationships and show that they are statistically significant, and uses degree centrality to find the key products in customers' shopping baskets, that is, the products that appear most often across shopping lists. Finally, it uses utility scores to calculate the most valuable association rules, which retail managers can use to plan promotional activities.
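
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the toy baskets, and in particular the distance `1 - MI` (mutual information in bits between binary "product present" indicators, which is at most 1) are assumptions made for illustration, since the record does not state the thesis's distance formula. The significance test and the utility score are omitted.

```python
import heapq
import math
from collections import Counter


def mutual_information(transactions, a, b):
    """Mutual information (bits) between the presence indicators of a and b."""
    n = len(transactions)
    joint = Counter((a in t, b in t) for t in transactions)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (i, _), c in joint.items() if i == x) / n
        p_y = sum(c for (_, j), c in joint.items() if j == y) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def prim_mst(nodes, weight):
    """Prim's algorithm on a complete graph; returns a list of (u, v, w) edges."""
    nodes = list(nodes)
    if not nodes:
        return []
    start = nodes[0]
    visited = {start}
    heap = [(weight(start, v), start, v) for v in nodes[1:]]
    heapq.heapify(heap)
    edges = []
    while heap and len(visited) < len(nodes):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # stale entry: v was reached through a cheaper edge
        visited.add(v)
        edges.append((u, v, w))
        for x in nodes:
            if x not in visited:
                heapq.heappush(heap, (weight(v, x), v, x))
    return edges


def degree_centrality(nodes, edges):
    """Degree of each node divided by (n - 1)."""
    nodes = list(nodes)
    degree = Counter()
    for u, v, _ in edges:
        degree[u] += 1
        degree[v] += 1
    denom = max(len(nodes) - 1, 1)
    return {node: degree[node] / denom for node in nodes}


def product_tree(transactions):
    """Build an MST over products using the assumed distance 1 - MI."""
    products = sorted(set().union(*transactions))

    def distance(a, b):
        # Assumption: strongly dependent products should be close together.
        return 1.0 - mutual_information(transactions, a, b)

    edges = prim_mst(products, distance)
    return products, edges, degree_centrality(products, edges)


if __name__ == "__main__":
    # Hypothetical toy baskets, not data from the thesis.
    baskets = [
        {"bread", "milk"}, {"bread", "milk", "eggs"}, {"beer", "chips"},
        {"beer", "chips", "milk"}, {"eggs", "milk"}, {"beer"},
    ]
    products, edges, centrality = product_tree(baskets)
    for u, v, w in edges:
        print(f"{u} - {v}: distance {w:.3f}")
    print(max(centrality, key=centrality.get), "has the highest degree centrality")
```

In this sketch the product with the highest degree centrality in the tree plays the role of a "key product"; any other monotone decreasing transform of mutual information could be substituted for the distance.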

National Central University Library Thesis Authorization Form
National Central University Master's Program Thesis Advisor Recommendation Letter
National Central University Master's Program Oral Examination Committee Approval Form
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1  Introduction
Chapter 2  Literature review
2.1    Association Rule Mining
    2.1.1  Apriori
    2.1.2  High Utility Itemset Mining
2.2    Minimum Spanning Tree
    2.2.1  Dijkstra Minimum Spanning Tree
    2.2.2  Kruskal Minimum Spanning Tree
    2.2.3  Prim Minimum Spanning Tree
Chapter 3  Research Method and Implementation
3.1    Data Processing
3.2    Prim's Minimum Spanning Tree (Prim's MST)
    3.2.1  Relevance between two products
    3.2.2  Distance between two products
    3.2.3  Identify dependence between two products
    3.2.4  Key Products
3.3    Utility*Lift Score
Chapter 4  Conclusion and Future Research
Chapter 5  REFERENCE
                                

[1] Agrawal R., Imieliński T., Swami A., “Mining association rules between sets of
items in large databases”, Proceedings of the 1993 ACM SIGMOD International
Conference on Management of Data (SIGMOD '93), p. 207, 1993.
[2] Cai, Fu, Cheng, and Kwong (1998); Tao, Murtagh, and Farid (2003); Khan,
Muyeba, and Coenen (2008).
[3] H. Yao, H. J. Hamilton, C. J. Butz, “A foundation approach to mining itemset
utilities from databases”, in: Proceedings of the Third SIAM International
Conference on Data Mining, Orlando, Florida, pp. 482-486, 2004.
[4] Liu Y., Liao W., and Choudhary A., “A Fast High Utility Itemsets Mining
Algorithm”, Proc. Utility-Based Data Mining Workshop, 2005.
[5] Yao H., Hamilton H., and Geng L., “A Unified Framework for Utility-Based
Measures for Mining Itemsets”, in Proc. of the ACM Intel. Conf. on
Utility-Based Data Mining Workshop (UBDM), pp. 28-37, 2006.
[6] H. F. Li, H. Y. Huang, Y. Cheng Chen, Y. Liu, S. Lee, “Fast and memory efficient
mining of high utility itemsets in data streams”, in: Eighth International
Conference of Data Mining, 2008.
[7] Liu M. and Qu J., “Mining High Utility Itemsets without Candidate Generation”,
CIKM'12, Maui, HI, USA, ACM, October 29-November 2, 2012.
[8] Vincent S. Tseng, Bai-En Shie, Cheng-Wei Wu, and Philip S. Yu, “Efficient
Algorithms for Mining High Utility Itemsets from Transactional Databases”,
8 August 2013, IEEE Transactions on Knowledge and Data Engineering, Vol. 25,
pp. 1772-1786, 2013.
[9] Philippe Fournier-Viger, Jerry Chun-Wei Lin, Bay Vo, Tin Truong Chi, Ji
Zhang, Hoai Bac Le, “A Survey of itemset mining”, WIREs Data Mining Knowl
Discov, 2017.
[10] Shankar S., Purusothoman T.P., Jayanthi S., Babu N., “A fast algorithm for mining
high utility itemsets”, in: Proceedings of IEEE International Advance Computing
Conference (IACC 2009), Patiala, India, pp. 1459-1464, 2009.
[11] Lin JCW, Gan W, Fournier-Viger P, Hong TP, Tseng VS, “Weighted frequent
itemset mining over uncertain databases”, Appl Intell, 44:232–250, 2015.
[12] Ju Wang, Fuxian Liu, and Chunjie Jin, “PHUIMUS: A Potential High Utility
Itemsets Mining Algorithm Based on Stream Data with Uncertainty”, Hindawi
Mathematical Problems in Engineering, Article ID 8576829, 13 pages,
2017.
[13] Lin JCW, Gan W, Fournier-Viger P, Hong TP, Tseng VS, “Efficiently mining
uncertain high-utility itemsets”, Springer International Publishing Switzerland
2016, WAIM 2016, Part I, LNCS 9658, pp. 17–30, 2016.
[14] Birch, J. P. A. A., & Soramaki, K. (2016). Analysis of correlation based networks
representing DAX 30 stock price returns. Personnel Psychology, 47(4), 501–525.
[15] Steuer, R., Kurths, J., Daub, C. O., Weise, J., & Selbig, J. (2002). The mutual
information: Detecting and evaluating dependencies between variables.
Bioinformatics, 18(suppl 2), S231–S240.
[16] Good, P. (2005). Permutation, parametric, and bootstrap tests of hypotheses. New
York: Springer-Verlag.
[17] Pethel, S. D., & Hahs, D. W. (2014). Exact test of independence using mutual
information. Entropy, 16(5), 2839–2849.
[18] Freeman, L. C. (1978). Centrality in social networks: Conceptual clarification.
Social Networks, 1(3), 215–239.
