Online Changepoint Detection under a Copula-based Markov Chain Model for Normal Sequential Data

Graduate student:	Dong-Hua Kuo (郭東華)
Thesis title:	Online Changepoint Detection under a Copula-based Markov Chain Model for Normal Sequential Data
Advisor:	Li-Hsien Sun (孫立憲)
Oral examination committee:
Degree:	Master
Department:	College of Science - Graduate Institute of Statistics
Year of publication:	2022
Academic year of graduation:	110
Language:	English
Number of pages:	78
Keywords:	Bayesian inference, changepoint, Clayton copula, first-order autoregressive model, Markov model, mean absolute error, misspecification

Online changepoint detection is a procedure for identifying whether the structure of
sequential data changes over time. In practice, the dependence structure is an important
issue in time series analysis. To relax the restrictions on dependence, we propose a
copula-based Markov model built on the Clayton copula with normal marginal distributions,
and we compare the proposed model with the independent model and the first-order
autoregressive (AR(1)) model under various scenarios. The simulation results indicate that
the proposed model outperforms the other two models in precision and mean absolute error
(MAE) in every scenario considered. In the empirical study, we use the daily log returns
of the S&P 500 Index, the Nikkei 225 Index, and the FTSE 100 Index to identify changepoints
during the 2008 financial crisis and the 2020 COVID-19 pandemic. The results show that the
proposed model is able to capture structural changes in serially dependent data.
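
As a rough illustration of the kind of data the abstract describes, the following Python sketch simulates a first-order Markov chain whose consecutive observations are linked by a Clayton copula and whose marginal distribution is normal, with a single shift in the mean. It is not the thesis's own code. The function names, the parameter values, and the choice to restart the chain independently at the changepoint are all assumptions made for this example. The sampling step uses the standard conditional-inverse formula for the Clayton copula C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta) with theta > 0.

```python
import random
from statistics import NormalDist, fmean


def clayton_conditional_inverse(u, w, theta):
    """Solve dC(u, v)/du = w for v under the Clayton copula (theta > 0)."""
    return (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)


def simulate_clayton_markov(n, mu, sigma, theta, rng):
    """Simulate n points of a stationary first-order Markov chain with
    N(mu, sigma^2) marginals and Clayton dependence between neighbours."""
    marginal = NormalDist(mu, sigma)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    xs = []
    u_prev = None
    for _ in range(n):
        w = min(max(rng.random(), eps), 1.0 - eps)
        # First point: draw from the marginal; later points: draw from the
        # conditional distribution implied by the copula.
        u = w if u_prev is None else clayton_conditional_inverse(u_prev, w, theta)
        u = min(max(u, eps), 1.0 - eps)
        xs.append(marginal.inv_cdf(u))
        u_prev = u
    return xs


def simulate_with_changepoint(n_before, n_after, mu_before, mu_after,
                              sigma, theta, seed=0):
    """Concatenate two segments with different means.

    Assumption for this sketch: the chain restarts at the changepoint,
    so the two segments are independent of each other.
    """
    rng = random.Random(seed)
    first = simulate_clayton_markov(n_before, mu_before, sigma, theta, rng)
    second = simulate_clayton_markov(n_after, mu_after, sigma, theta, rng)
    return first + second


if __name__ == "__main__":
    # Hypothetical settings: mean shifts from 0 to 3 after 200 observations.
    series = simulate_with_changepoint(200, 200, 0.0, 3.0, 1.0, 2.0, seed=1)
    print("mean before:", fmean(series[:200]))
    print("mean after: ", fmean(series[200:]))
```

A series generated this way could serve as input to an online changepoint detector. Larger values of theta give stronger serial dependence; for the Clayton copula, Kendall's tau equals theta / (theta + 2).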

Contents
1 Introduction
2 Proposed Model
2.1 Copula function
2.2 The tail dependence and Archimedean copula family
2.3 The Clayton copula model based on the first-order Markov chain
3 Bayesian Online Changepoint Detection
3.1 Notations
3.2 The changepoint algorithm using the proposed model
3.3 The EXact Online Bayesian Changepoint Algorithm
4 Simulation Study
5 Empirical Study
6 Concluding Remarks and Future Extensions
7 Code (Empirical study: S&P500 Index)


List of Tables
1 Simulation results for the proposed model versus the independent model.
2 Simulation results for the proposed model versus the AR(1) model.
3 Misspecification results for the proposed model versus the independent model.
4 Misspecification results for the proposed model versus the AR(1) model.
5 Summary of the two sub-periods for the three indices.
6 Summary statistics of the three indices in each of the two periods.
7 Summary of the empirical results.

List of Figures
1 Plot of the relationship between the sequential data and the current run length.
2 Barplot of the performance measures corresponding to Table 1. The numbers above the bars are Miss.
3 Barplot of the performance measures corresponding to Table 2. The numbers above the bars are Miss.
4 Barplot of the performance measures corresponding to Table 3. The numbers above the bars are Miss.
5 Barplot of the performance measures corresponding to Table 4. The numbers above the bars are Miss.
6 The median run length plot of 100 replications corresponding to data generated from (I) in Table 1.
7 The median run length plot of 100 replications corresponding to data generated from (II) in Table 1.
8 The median run length plot of 100 replications corresponding to data generated from (III) in Table 2.
9 The median run length plot of 100 replications corresponding to data generated from (II) in Table 2.
10 The median run length plot of 100 replications corresponding to data generated from (IV) in Table 3.
11 The median run length plot of 100 replications corresponding to data generated from (V) in Table 3.
12 The median run length plot of 100 replications corresponding to data generated from (VI) in Table 4.
13 The median run length plot of 100 replications corresponding to data generated from (V) in Table 4.
14 The median run length plot of 100 replications corresponding to data generated from (I) in Table 1.
15 The median run length plot of 100 replications corresponding to data generated from (II) in Table 1.
16 The median run length plot of 100 replications corresponding to data generated from (III) in Table 2.
17 The median run length plot of 100 replications corresponding to data generated from (II) in Table 2.
18 The median run length plot of 100 replications corresponding to data generated from (IV) in Table 3.
19 The median run length plot of 100 replications corresponding to data generated from (V) in Table 3.
20 The median run length plot of 100 replications corresponding to data generated from (VI) in Table 4.
21 The median run length plot of 100 replications corresponding to data generated from (V) in Table 4.
22 Structure change is detected for the S&P 500 Index in 2008.
23 Structure change is detected for the Nikkei 225 Index in 2008.
24 Structure change is detected for the FTSE 100 Index in 2008.
25 Structure change is detected for the S&P 500 Index in 2020.
26 Structure change is detected for the Nikkei 225 Index in 2020.
27 Structure change is detected for the FTSE 100 Index in 2020.

                                

