Graduate student: 黃子菱 (Tzu-Ling Huang)
Thesis title: Pairs trading based on statistical learning
Advisor: 孫立憲
Committee members:
Degree: Master
Department: College of Science - Graduate Institute of Statistics
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 55
Keywords (Chinese): 配對交易、統計套利、共整合、機器學習、卡爾曼濾波、風險價值界限、公差極限
Keywords (English): Pairs trading, Statistical arbitrage, Cointegration, Machine learning, Kalman filter, Value at Risk bounds, Tolerance limits
    The main objective of this thesis is to build a resilient portfolio that provides excess returns. Using machine learning techniques and the Kalman filter algorithm, we implement a cointegration-based pairs trading strategy that generates trading signals with five different thresholds. The empirical results suggest that the strategy using tolerance limits as the threshold yields a more conservative portfolio, while the strategy using Value at Risk bounds as the threshold yields a more aggressive one. In addition, we obtained a higher rate of return during the COVID-19 pandemic.
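
    The Kalman filter step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not give the state-space specification, so the random-walk state, the function names `kalman_hedge_ratio` and `threshold_signals`, and the parameters `delta`, `obs_var`, and `k` are all assumptions made for illustration. The thresholding below uses a fixed multiple of the innovation standard deviation as a stand-in and does not reproduce any of the five thresholds studied in the thesis.

```python
import numpy as np

def kalman_hedge_ratio(x, y, delta=1e-4, obs_var=1e-3):
    """Track y_t ~ beta_t * x_t + alpha_t, with [beta, alpha] as a random walk.

    delta and obs_var are illustrative tuning values, not taken from the thesis.
    """
    n = len(x)
    theta = np.zeros(2)                    # state estimate [beta, alpha]
    P = np.eye(2)                          # state covariance
    Q = delta / (1.0 - delta) * np.eye(2)  # assumed process-noise covariance
    betas = np.zeros(n)
    errors = np.zeros(n)
    stds = np.zeros(n)
    for t in range(n):
        H = np.array([x[t], 1.0])          # observation vector
        P = P + Q                          # predict: random-walk state
        e = y[t] - H @ theta               # innovation (forecast error)
        S = H @ P @ H + obs_var            # innovation variance
        K = P @ H / S                      # Kalman gain
        theta = theta + K * e              # update state estimate
        P = P - np.outer(K, H) @ P         # update state covariance
        betas[t] = theta[0]
        errors[t] = e
        stds[t] = np.sqrt(S)
    return betas, errors, stds

def threshold_signals(errors, stds, k=1.0):
    """+1 = long the spread, -1 = short the spread, 0 = no position."""
    return np.where(errors < -k * stds, 1, np.where(errors > k * stds, -1, 0))

# Synthetic example: y follows x with a true hedge ratio of 2 and intercept 1.
rng = np.random.default_rng(0)
x = 50 + 10 * np.sin(np.linspace(0, 20, 500))
y = 2.0 * x + 1.0 + rng.normal(0, 0.05, 500)
betas, errors, stds = kalman_hedge_ratio(x, y)
sig = threshold_signals(errors, stds, k=1.0)
```

    In this sketch the hedge ratio is re-estimated at every observation, so the spread adapts over time rather than relying on a single static regression coefficient; the band width `k` is the only place where a different threshold rule would be substituted.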

    Abstract
    1 Introduction
        1.1 Overview
        1.2 Motivation and Purpose
        1.3 Thesis Structure
    2 Research Methodology
        2.1 Stocks Screener
            2.1.1 Universe Selection
            2.1.2 Principal Component Analysis
            2.1.3 Density-Based Spatial Clustering of Applications with Noise
            2.1.4 T-Distributed Stochastic Neighbor Embedding
        2.2 Mean Reversion
            2.2.1 Cointegration
            2.2.2 Pair Selection
        2.3 Dynamic Hedge Ratio
            2.3.1 Kalman Filter
        2.4 Trade Execution
            2.4.1 Bollinger Bands Width
            2.4.2 Value at Risk Bounds for Normal Distribution
            2.4.3 Tolerance Limits for Normal Distribution
    3 Empirical Results
        3.1 Stocks Screener
            3.1.1 Data Collection
            3.1.2 PCA Decomposition
            3.1.3 DBSCAN Clustering
            3.1.4 Visualization
        3.2 Mean Reversion
            3.2.1 Cointegration
            3.2.2 Pair Selection
        3.3 Dynamic Hedge Ratio
            3.3.1 Kalman Filter
            3.3.2 Residual Diagnostics
        3.4 Trade Execution
            3.4.1 Pair Performance
            3.4.2 Portfolio Performance
    4 Portfolio Analysis
        4.1 Optimization
        4.2 Verification
        4.3 Resilience
    5 Conclusion
    References

    [1] Pearson K. (1901). On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine, vol. 2, pp. 559-572.
    [2] Hotelling H. (1933). Analysis of a Complex of Statistical Variables into Principal Components. Journal of Educational Psychology, vol. 24, pp. 417-441, 498-520.
    [3] Jolliffe, I.T. (2002). Principal Component Analysis, 2nd ed., New York: Springer-Verlag.
    [4] Avellaneda, M., Lee, J.H. (2008). Statistical Arbitrage in the U.S. Equities Market. Quantitative Finance, vol.10, pp. 761-782.
    [5] Kim, D.H., Jeong, H. (2005). Systematic analysis of group identification in stock markets. Physical Review E, vol. 72, 046133.
    [6] Ester, M., Kriegel H.P., Sander, J., Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, pp. 226-231.
    [7] Hinton, G.E., Roweis, S.T. (2003). Stochastic Neighbor Embedding. Advances in Neural Information Processing Systems, vol. 15, pp. 833-840.
    [8] van der Maaten, L.J.P., Hinton, G.E. (2008). Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research, vol.9, pp. 2579-2605.
    [9] Yule, G.U. (1926). Why Do We Sometimes Get Nonsense Correlations Between Time Series? A Study in Sampling and the Nature of Time Series. Journal of the Royal Statistical Society, vol. 89, pp. 1-63.
    [10] Dickey, D.A., Fuller, W.A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, vol. 74, pp. 427-431.
    [11] Engle, R.F., Granger, C.W.J. (1987). Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica, vol. 55, pp. 251-276.
    [12] MacKinnon, J.G. (1991). Critical Values for Cointegration Tests. Long Run Economic Relationships, Oxford University Press, pp. 267-276.
    [13] MacKinnon, J.G. (1994). Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests. Journal of Business and Economic Statistics, vol. 12, pp. 167-176.
    [14] Xu, H. (2017). High Frequency Statistical Arbitrage with Kalman Filter and Markov Chain Monte Carlo.
    [15] Kalman, R.E. (1960). A New Approach to Linear Filtering and Prediction Problems. ASME Journal of Basic Engineering, vol. 82, pp. 35-45.
    [16] O’Mahony, A. (2014). Online Linear Regression using a Kalman Filter. http://www.thealgoengineer.com/2014/online-linear-regression-kalman-filter/.
    [17] Kinlay, J. (2015). Statistical Arbitrage Using the Kalman Filter. http://jonathankinlay.com/2018/09/statistical-arbitrage-using-kalman-filter/.
    [18] Tilley, D.L. (1998). Moving Averages with Resistance and Support. Technical Analysis of Stocks and Commodities, pp. 62-87.
    [19] Bollinger, J. (2002). Bollinger on Bollinger Bands. New York: McGraw Hill.
    [20] Chan, J. (2006). Trading trends with the Bollinger Bands Z-Test. Technical Analysis of Stocks and Commodities, pp. 46-52.
    [21] Kolman, J., Onak, M., Jorion, P., Taleb, N., Derman, E., Putnam, B., Sandor, R., Jonas, S., Dembo, R., Holt, G., Tanenbaum, R., Margrabe, W., Mudge, D., Lam, J., Rozsypal, J. (1998). Roundtable: The Limits of VaR. Derivatives Strategy.
    [22] Wald, A., Wolfowitz, J. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics, vol. 17, pp. 208-215.
    [23] Weissberg, A., Beatty, G.H. (1960). Tables of Tolerance-Limit Factors for Normal Distributions. Technometrics, vol. 2, pp. 483-500.
    [24] Gardiner, D.A., Hull, N.C. (1966). An Approximation to Two-Sided Tolerance Limits for Normal Populations. Technometrics, vol. 8, no. 1, pp. 115-122.
    [25] Howe, W.G. (1969). Two-Sided Tolerance Limits for Normal Populations, Some Improvements. Journal of the American Statistical Association, vol. 64, pp. 610-620.
    [26] Guenther, W.C. (1977). Sampling Inspection in Statistical Quality Control. London: Griffin.
    [27] Lin, Y., Menchero, J., Orr, D.J., Wang, J. (2011). The Barra US Equity Model (USE4): Empirical Notes, San Francisco, CA.
