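Author: Mei-Ching Chiang (江美靜)
Thesis title: Sequential Mining with Time Intervals (有時間區間的循序挖掘)
Advisor: Yen-Liang Chen (陳彥良)
Degree: Master
Department: College of Management - Department of Information Management
Graduation academic year: 90 (ROC calendar, 2001-2002)
Language: Chinese
Pages: 34
Chinese keywords: 資料挖掘 (data mining), 循序樣式 (sequential pattern)
English keywords: data mining, sequential pattern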
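  • The sequential patterns studied in conventional sequential mining research reveal only the order in which items occur; they do not tell us how much time elapses between those items. Consider the sequential pattern "with 70% probability, a customer who buys a printer at a store will later return to buy a scanner, and after that will return again to buy a CD burner." From it we learn the order in which the printer, scanner, and burner are purchased, but not how much time passes between the purchases. This thesis therefore studies sequential mining with time intervals, in order to discover time-interval sequential patterns that carry more information, such as "with 70% probability, a customer who buys a printer at a store will return 6 months later to buy a scanner, and 3 months after that will return to buy a CD burner." Taking retail as an example, a business owner can use the mined time-interval sequential patterns to understand customers' habits, preferences, and needs, and to predict what customers will want within a given future period. This makes it possible to offer the right products and services to the right customers at the right time. Time-interval sequential patterns can thus bring a competitive advantage to a business or a benefit to an individual.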
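The printer, scanner, and burner example above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the abstract does not say how time gaps are represented, so the month boundaries in `BOUNDS`, the alternating item/interval encoding of a pattern, and the helper names `interval_index`, `contains`, and `support` are all hypothetical. Each customer sequence is assumed to be a time-sorted list of `(month, item)` purchases with one item per purchase.

```python
from bisect import bisect_right

# Assumed interval boundaries in months (not taken from the thesis):
# index 0 = [0, 3), 1 = [3, 6), 2 = [6, 12), 3 = [12, infinity)
BOUNDS = [3, 6, 12]


def interval_index(gap):
    """Return the index of the assumed interval that contains a non-negative gap."""
    return bisect_right(BOUNDS, gap)


def contains(seq, pattern):
    """Check whether a customer sequence contains a time-interval pattern.

    seq: list of (time, item) pairs sorted by time.
    pattern: [item0, k1, item1, k2, item2, ...], alternating items and
             interval indices (hypothetical encoding).
    """
    items = pattern[0::2]
    gaps = pattern[1::2]
    # Positions in seq where a match of the pattern prefix can end.
    ends = {i for i, (_, item) in enumerate(seq) if item == items[0]}
    for k, item in zip(gaps, items[1:]):
        ends = {
            j
            for i in ends
            for j in range(i + 1, len(seq))
            if seq[j][1] == item
            and interval_index(seq[j][0] - seq[i][0]) == k
        }
        if not ends:
            return False
    return bool(ends)


def support(db, pattern):
    """Fraction of customer sequences in db that contain the pattern."""
    return sum(contains(seq, pattern) for seq in db) / len(db)


# Toy database: one time-sorted purchase list per customer (made-up data).
db = [
    [(0, "printer"), (6, "scanner"), (9, "burner")],
    [(0, "printer"), (7, "scanner"), (11, "burner")],
    [(0, "printer"), (1, "scanner"), (2, "burner")],
    [(0, "scanner"), (2, "printer")],
]

# "printer, then scanner 6 to 12 months later, then burner 3 to 6 months later"
pattern = ["printer", 2, "scanner", 1, "burner"]
print(support(db, pattern))  # 0.5
```

Tracking every position where a partial match can end, rather than only the earliest one, matters here: with time constraints, the first occurrence of an item is not always the one that leads to a valid match.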
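    The goal of this research is to mine time-interval sequential patterns from a sequence database. We first give formal definitions for the time-interval sequential mining problem, and then develop two algorithms, I-Apriori and I-PrefixSpan, to solve it. In the experimental analysis, we implement the algorithms as a system to verify the feasibility of the approach and to test the running performance and scale-up behavior of the two algorithms. The experimental results show that I-PrefixSpan outperforms I-Apriori in both running performance and scale-up ability, making it the better algorithm for time-interval sequential mining.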
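The abstract names I-PrefixSpan but does not describe it, so the following is not the thesis's algorithm. It is a simplified sketch of the general pattern-growth idea (extend a frequent pattern only with items that follow its matches) adapted to time gaps. The function `mine`, the boundaries in `BOUNDS`, the tuple encoding of patterns, and the toy data are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict

# Assumed interval boundaries in months (not taken from the thesis):
# index 0 = [0, 3), 1 = [3, 6), 2 = [6, 12), 3 = [12, infinity)
BOUNDS = [3, 6, 12]


def interval_index(gap):
    """Return the index of the assumed interval that contains a non-negative gap."""
    return bisect_right(BOUNDS, gap)


def mine(db, min_count):
    """Return {pattern: support_count} for all frequent time-interval patterns.

    db: list of sequences, each a time-sorted list of (time, item) pairs.
    A pattern is a tuple (item0, k1, item1, k2, item2, ...), where each k
    is an interval index (hypothetical encoding).
    """
    results = {}

    def grow(pattern, ends):
        # ends: set of (sequence id, position) pairs where a match of
        # `pattern` can end; this plays the role of a projected database.
        extensions = defaultdict(set)
        for sid, pos in ends:
            seq = db[sid]
            t0 = seq[pos][0]
            for j in range(pos + 1, len(seq)):
                t, item = seq[j]
                extensions[(interval_index(t - t0), item)].add((sid, j))
        for (k, item), new_ends in extensions.items():
            count = len({sid for sid, _ in new_ends})
            if count >= min_count:
                new_pattern = pattern + (k, item)
                results[new_pattern] = count
                grow(new_pattern, new_ends)

    # Length-1 patterns: every occurrence of each item.
    first = defaultdict(set)
    for sid, seq in enumerate(db):
        for pos, (_, item) in enumerate(seq):
            first[item].add((sid, pos))
    for item, ends in first.items():
        count = len({sid for sid, _ in ends})
        if count >= min_count:
            results[(item,)] = count
            grow((item,), ends)
    return results


# Toy database: one time-sorted purchase list per customer (made-up data).
db = [
    [(0, "printer"), (6, "scanner"), (9, "burner")],
    [(0, "printer"), (7, "scanner"), (11, "burner")],
    [(0, "printer"), (1, "scanner"), (2, "burner")],
    [(0, "scanner"), (2, "printer")],
]

found = mine(db, min_count=2)
for pattern, count in sorted(found.items(), key=lambda kv: (len(kv[0]), str(kv[0]))):
    print(pattern, count)
```

Because each recursive call only scans what follows existing matches, no candidate patterns are generated and then counted in a separate pass. That is the usual contrast between pattern-growth and Apriori-style approaches, and it is consistent with the performance result reported in the abstract, though this sketch makes no claim about the thesis's actual implementation.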


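    1. Introduction
    2. Related Research and Applications
        2.1. Related Research
        2.2. Related Applications
    3. Problem Definition and Description
    4. Algorithms
        4.1. The I-Apriori Algorithm
            4.1.1. Generating Candidate Time-Interval Sequences
            4.1.2. Counting Candidate Time-Interval Sequences
        4.2. The I-PrefixSpan Algorithm
    5. Performance
        5.1. Generation of Synthetic Data
        5.2. Performance
        5.3. Scale-Up
    6. Conclusion
    References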

