| Graduate student: | 廖原豐 Yuan-Fong Liao |
|---|---|
| Thesis title: | 因果關聯規則挖掘 Causal Association Rule Mining |
| Advisor: | 陳稼興 Jiah-Shing Chen |
| Committee members: | |
| Degree: | 碩士 Master |
| Department: | 管理學院 - 資訊管理學系 Department of Information Management |
| Graduation academic year: | 94 (ROC calendar) |
| Language: | Chinese |
| Pages: | 83 |
| Chinese keywords: | 因果關聯規則 (causal association rule), 資料挖掘 (data mining), 技術指標 (technical indicator) |
| English keywords: | Level Crossing, FP-growth, Technological Indicator, Data Mining, Causal Association Rule |
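This thesis takes causal relationships in stock market investment as its experimental subject and focuses on how to improve investment performance. Improving performance requires understanding the causal relationships between the factors that affect performance and the observed performance values. We use association rule methods from data mining to find rules describing the causal relationships between the technical indicators that affect performance and the performance observations (e.g., stock price reversal points). This study calls them causal association rules (Causal Association Rule), and they can be combined into securities trading strategies.
Researchers have proposed many association rule methods, but these traditional data mining methods all generate a large number of frequent itemsets (Large Itemset). As a result, too many rules are produced, their interestingness is hard to evaluate, and mining is inefficient. This study therefore proposes a CFP algorithm framework, which mainly improves the FP-Growth algorithm to reduce the generation of unnecessary frequent itemsets, so that interesting causal association rules can be produced more efficiently.
Two data discretization methods are in common use today: equal-width partitioning and equal-frequency partitioning. However, the technical indicator values that ordinary investors consult when entering or exiting stock positions are cumulative values. This study proposes the discretization concepts of equal-width cumulative partitioning and equal-frequency cumulative partitioning. Indicators discretized this way suit investors' entry and exit decisions, and the cumulative concept makes it possible to mine level-crossing causal association rules and thus more potentially interesting rules. t-tests on the experimental results show that the proposed algorithm mines causal association rules efficiently and significantly outperforms the traditional FP-growth method. The experiments also show that the efficiency advantage becomes more pronounced as the data volume grows, so the algorithm is suited to mining larger databases.
The mined causal association rules that affect investment performance are ranked and analyzed along the different dimensions that affect performance, to assist investors in planning investment strategies. The association rules discovered between technical indicators and specific stock market investment problems also give investors a reference for hedging risk.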
This thesis studies causal relationships in stock market investment, with a focus on improving investment performance. Improving performance requires understanding the causal relationships between the factors that influence performance and the observed performance values. We use association rule mining, a data mining technique, to find rules that relate the technical indicators influencing performance to performance observations (e.g., the reversal point of the stock price). We call these rules Causal Association Rules; they can be combined into securities trading strategies.
Many association rule mining methods have been proposed, but they generate a large number of large itemsets. The result is too many rules, whose interestingness is hard to assess, and relatively inefficient mining. We therefore propose the CFP algorithm framework, which mainly improves the FP-Growth algorithm to avoid mining unnecessary large itemsets, so that interesting causal association rules are produced efficiently.
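The abstract does not spell out how CFP works internally, so the following Python sketch is not the thesis's algorithm. It only illustrates, by brute-force counting rather than an FP-tree, the idea of restricting mining to rules whose consequent is a designated target item (here a hypothetical `reversal_up` event). The function name `causal_rules`, its parameters, and the indicator item names are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def causal_rules(transactions, target, min_support, min_confidence, max_len=2):
    """Return rules 'antecedent -> target' as (antecedent, support, confidence).

    Brute-force illustration only: every antecedent of up to max_len items is
    counted directly. The target item is never placed in an antecedent.
    """
    n = len(transactions)
    antecedent_counts = Counter()  # transactions containing the antecedent
    joint_counts = Counter()       # transactions containing antecedent and target
    for t in transactions:
        has_target = target in t
        items = sorted(set(t) - {target})
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                antecedent_counts[combo] += 1
                if has_target:
                    joint_counts[combo] += 1

    rules = []
    for combo, joint in joint_counts.items():
        support = joint / n
        confidence = joint / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, support, confidence))
    # Highest confidence first, then highest support, then alphabetical.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


if __name__ == "__main__":
    # Hypothetical daily "transactions": indicator events plus the outcome.
    days = [
        {"RSI_low", "KD_cross_up", "reversal_up"},
        {"RSI_low", "reversal_up"},
        {"RSI_low", "KD_cross_up", "reversal_up"},
        {"KD_cross_up"},
        {"RSI_low"},
    ]
    for antecedent, support, confidence in causal_rules(days, "reversal_up", 0.4, 0.7):
        print(antecedent, "-> reversal_up", support, confidence)
```

Fixing the consequent in advance means that itemsets which can never yield a rule about the target are not turned into rules at all, which is the kind of pruning the paragraph above motivates.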
Common discretization methods are equal-width intervals and equal-frequency intervals. However, when investors decide when to enter or exit the stock market, the technical indicator values they consult are usually cumulative. We therefore propose two discretization methods, equal-width cumulative intervals and equal-frequency cumulative intervals. Both also support mining level-crossing causal association rules, which lets us find more potentially interesting rules. According to t-tests on the experimental results, our algorithm performs significantly better than the FP-growth algorithm. We also find that the CFP algorithm is well suited to mining large-scale databases.
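The difference between plain and cumulative discretization can be sketched in a few lines of Python. The equal-width and equal-frequency cut points below are the standard definitions. The cumulative encoding is an assumption about what the abstract means: a value is taken to activate every threshold it reaches, so one observation yields items at several levels. All function names and the sample RSI values are hypothetical.

```python
def equal_width_edges(values, k):
    """Interior cut points that split [min, max] into k equally wide bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_edges(values, k):
    """Interior cut points chosen so each bin holds about len(values)/k points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def bin_index(x, edges):
    """Plain discretization: index of the single bin that contains x."""
    return sum(x >= e for e in edges)


def cumulative_items(name, x, edges):
    """ASSUMED cumulative encoding: x activates every threshold it reaches."""
    return [f"{name}>={e:g}" for e in edges if x >= e]


if __name__ == "__main__":
    rsi = [0, 10, 20, 30, 80, 100]  # made-up indicator readings
    width_edges = equal_width_edges(rsi, 4)
    freq_edges = equal_frequency_edges(rsi, 3)
    print("equal-width edges:", width_edges)
    print("equal-frequency edges:", freq_edges)
    print("plain bin of 80:", bin_index(80, width_edges))
    print("cumulative items of 80:", cumulative_items("RSI", 80, width_edges))
```

Under this reading, a plain bin gives one item per observation, while the cumulative form gives one item per threshold crossed, so a rule can refer to a coarse level and a fine level of the same indicator at once.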
We rank the mined causal association rules along different dimensions for analysis, to help investors plan investment strategies and to serve as a reference for avoiding losses.