Author: 黃振萬 (Chen-Wan Huang)
Thesis title: Methodologies for Discovering Sets of Critical Products
Advisor: 許秉瑜 (Ping-Yu Hsu)
Oral defense committee:
Degree: Doctor
Department: College of Management, Department of Business Administration
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Pages: 62
Chinese keywords: 關鍵商品, 數據挖掘, 垂直數據庫, 頻繁項目集挖掘, RFM
English keywords: critical products, data mining, vertical database, frequent itemsets mining, RFM
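  • In the era of digital transformation, it is vital for retailers to identify their important customers and the critical products associated with them in order to improve operating performance. Critical products are key purchases for important customers but have little influence among general customers. Although their sales volume may be low, they should remain on store shelves so that the store retains its important customers. Few prior studies have considered critical products or proposed a model for identifying them. This study proposes an improved algorithm that uses a vertical database structure to identify critical products, and applies it to the transaction database of a real retail supermarket to verify its effectiveness. Under three novel filtering criteria, the precision rate reached 80.55%, 82.15%, and 82.35%, respectively. This is the first study to combine multiple filtering methods to discover critical retail products.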


    Identifying critical products and important customers to strengthen company performance is vitally important in the digital transformation era. Critical products are itemsets that are preferred by important customers yet are not popular among ordinary customers. As a result, critical products should be kept on the shelf even if their sales volume is lower than that of popular products. However, few studies have considered identifying critical products or their potentially valuable customer patterns. Therefore, this study designed an innovative algorithm that takes advantage of vertical databases to identify critical products. The proposed algorithm was applied to the transaction database of a midsize supermarket to verify its performance. The results showed that the precision of identifying critical products reached 80.55%, 82.15%, and 82.35% under three different filtering criteria. To the best of our knowledge, this dissertation is the first to use multiple filtering criteria to identify critical products.
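The idea in the abstract (itemsets that important customers buy often but ordinary customers rarely buy, found through a vertical database) can be illustrated with a small sketch. This is not the dissertation's IECT algorithm. It is a brute-force illustration that assumes transactions are already labelled as belonging to important customers, and it uses hypothetical names (`build_vertical`, `support`, `critical_itemsets`) and hypothetical thresholds (`min_important`, `max_ordinary`). The vertical layout maps each item to the set of transaction ids containing it, so the support of an itemset is obtained by intersecting those sets.

```python
from itertools import combinations


def build_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    vertical = {}
    for tid, items in transactions.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def support(itemset, vertical, tids):
    """Fraction of the transactions in `tids` that contain every item of `itemset`."""
    if not tids:
        return 0.0
    covered = set(tids)
    for item in itemset:
        covered &= vertical.get(item, set())  # tidset intersection
    return len(covered) / len(tids)


def critical_itemsets(transactions, important_tids,
                      min_important, max_ordinary, max_size=2):
    """Return itemsets frequent among important transactions and rare elsewhere.

    Hypothetical filter: support among important transactions >= min_important
    and support among ordinary transactions <= max_ordinary.
    """
    vertical = build_vertical(transactions)
    important = set(important_tids) & set(transactions)
    ordinary = set(transactions) - important
    found = []
    for size in range(1, max_size + 1):
        # Brute-force enumeration; a real miner would prune the search space.
        for itemset in combinations(sorted(vertical), size):
            s_imp = support(itemset, vertical, important)
            s_ord = support(itemset, vertical, ordinary)
            if s_imp >= min_important and s_ord <= max_ordinary:
                found.append((itemset, s_imp, s_ord))
    return found


# Made-up toy data: transactions 1 and 2 belong to important customers.
transactions = {
    1: {"truffle_oil", "milk"},
    2: {"truffle_oil", "bread"},
    3: {"milk", "bread"},
    4: {"milk"},
    5: {"bread", "milk"},
}
result = critical_itemsets(transactions, important_tids={1, 2},
                           min_important=0.5, max_ordinary=0.2)
for itemset, s_imp, s_ord in result:
    print(itemset, s_imp, s_ord)
```

In this toy data, `milk` and `bread` are popular with everyone and are filtered out, while every itemset containing `truffle_oil` passes because it appears only in the important transactions. The thresholds and the two-group split are illustrative choices, not the filtering criteria evaluated in the dissertation.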

    Table of Contents
    Chinese Abstract
    English Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter I Introduction
    Chapter II Literature review
      2-1 Retail transaction logs analysis
      2-2 Infrequent pattern mining
      2-3 Vertical database mining algorithm
      2-4 High utility itemsets mining
    Chapter III Methodology development
      3-1 Definitions of data structures and criteria
      3-2 Algorithm of Improved Equivalence Class Transformation
    Chapter IV Experiments with frequency consideration
      4-1 Data collection and cleansing
      4-2 Customers evaluation and segmentation
      4-3 Experiment result of bipartite segmentation
      4-4 Experiment result of multiple segmentation
      4-5 Sensitivity analysis of value N
      4-6 Sensitivity analysis of interval
      4-7 Summary
    Chapter V Enriching methodologies with utility
      5-1 Definitions of utility and criteria
      5-2 Proposed IECTu algorithm
    Chapter VI Experiments with utility consideration
      6-1 Data collection and utility information
      6-2 Experiment results
    Chapter VII Conclusion
    References

