| Graduate student: | 吳昀錚 Yun-Cheng Wu |
|---|---|
| Thesis title: | 利用文字探勘技術預測台股加權指數之漲跌趨勢 Predicting the Trend of Taiwan Weighted Stock Index with Text Mining Techniques |
| Advisor: | 周世傑 Shih-Chieh Chou |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Management - Department of Information Management |
| Academic year of graduation: | 96 |
| Language: | English |
| Number of pages: | 27 |
| Keywords (Chinese): | 股票 、台灣股票加權指數 、分類 、文字探勘 、短期 |
| Keywords (English): | Text Mining, Classification, Taiwan Weighted Stock Index, Stock, Short-Term |
Forecasting stock price trends has long been a topic of interest. Investors who could anticipate a stock's price trend would be able to profit from the market. However, human behavior is difficult to model, so predicting the trend accurately is very hard. Most previous studies have relied on fundamental and technical analysis. Both approaches, however, address only long-term investment strategy and ignore the short-term market movements triggered by financial news.
In this thesis, we use text mining to predict the movement of the Taiwan stock market as a whole. We develop a system that classifies online financial news articles; the classification results determine our trading strategy. We then evaluate the system's performance by investing in the Taiwan Weighted Stock Index (TWSI).
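The abstract does not say which classifier or trading rule the system uses, so the following is only a minimal sketch of the general idea, under stated assumptions: a multinomial naive Bayes classifier with add-one smoothing labels each pre-tokenized article as "up" or "down", and a majority vote over one day's labels picks the action. The function names (`train`, `classify`, `decide`), the two labels, the voting rule, and the toy training data are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect the counts a multinomial naive Bayes model needs.

    docs is a list of (tokens, label) pairs; tokens are already segmented.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def decide(labels):
    """Hypothetical trading rule: majority vote over one day's article labels."""
    ups, downs = labels.count("up"), labels.count("down")
    if ups > downs:
        return "buy"
    if downs > ups:
        return "sell"
    return "hold"


# Toy, made-up training articles (not data from the thesis).
TRAIN_DOCS = [
    (["exports", "rise", "profit"], "up"),
    (["record", "profit", "growth"], "up"),
    (["layoffs", "loss", "decline"], "down"),
    (["exports", "fall", "loss"], "down"),
]
MODEL = train(TRAIN_DOCS)

if __name__ == "__main__":
    todays_news = [["profit", "growth"], ["record", "exports"], ["loss", "decline"]]
    labels = [classify(tokens, MODEL) for tokens in todays_news]
    print(labels, decide(labels))
```

A real system for Chinese news would also need a word segmentation step and feature selection before classification; both are omitted here to keep the sketch short.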
The results show that the system earns an average return of about 5.4% per month. A statistical test further shows that this average return is significantly higher than the certificate of deposit (CD) rate at the α = 0.05 significance level. We therefore argue that the trading strategies provided by our system are valuable to short-term investors.
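The abstract does not name the statistical test. One common choice for comparing a mean return against a fixed benchmark is a one-sided, one-sample t-test, and the sketch below assumes that choice. The monthly returns and the monthly CD rate in it are made-up illustration values, not the thesis data, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics


def one_sample_t(returns, benchmark):
    """t statistic and degrees of freedom for H0: mean(returns) == benchmark.

    The alternative of interest is one-sided: mean(returns) > benchmark.
    """
    n = len(returns)
    mean = statistics.mean(returns)
    sd = statistics.stdev(returns)  # sample standard deviation (divides by n - 1)
    t_stat = (mean - benchmark) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Made-up monthly returns and a made-up monthly CD rate (assumptions).
MONTHLY_RETURNS = [0.06, 0.04, 0.07, 0.05, 0.05, 0.06]
MONTHLY_CD_RATE = 0.002

# One-sided critical value of Student's t for alpha = 0.05 and 5 degrees
# of freedom, taken from a standard t table.
T_CRITICAL_DF5 = 2.015

T_STAT, DF = one_sample_t(MONTHLY_RETURNS, MONTHLY_CD_RATE)
REJECT_H0 = T_STAT > T_CRITICAL_DF5

if __name__ == "__main__":
    print(f"t = {T_STAT:.3f}, df = {DF}, reject H0: {REJECT_H0}")
```

Rejecting the null hypothesis here means the sample mean return is significantly above the benchmark at the 0.05 level. With a different number of months, the critical value must be looked up again for the new degrees of freedom.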