Publications

The analysis and forecasting of tennis matches using a high-dimensional dynamic model with Gorgi, P.  and Koopman, S.J.  Journal article

Journal of the Royal Statistical Society, Series A (2019)

We develop a high-dimensional dynamic model where the ability of each tennis player evolves smoothly over time. The ability or strength of a tennis player is also treated for four different court surface types (hard court, carpet, clay and grass) in the model. The proposed statistical model is treated in a likelihood-based analysis and is capable of handling high-dimensional datasets while the parameter dimension remains small. In particular, we analyze 17 years of ATP matches for a panel of over 500 players and hence more than 2000 dynamic strength levels. We find that considering player-specific abilities for different court surfaces is of key importance for modeling tennis matches. We further consider several other extensions including player-specific explanatory variables and accounting for the different configuration of Grand Slam tournaments. In our analysis we illustrate how the statistical results can be used to construct rankings of players for different court surface types. We finally show that our proposed model can also be effective in forecasting. We provide empirical evidence that our model significantly outperforms existing models in the forecasting of match results.

Long Term Forecasting of El Niño Events via Dynamic Factor Simulations with Li, M., Koopman, S.J., Lit, R. and Petrova, D. Journal article

Journal of Econometrics

We propose a new forecasting procedure which particularly explores opportunities for improving the precision of medium- and long-term forecasts of the Niño3.4 time series that is linked with the well-known El Niño phenomenon. This important climatic time series is subject to an intricate dynamic structure and is interrelated with other climatological variables. The procedure consists of three steps. First, a univariate time series model is considered for producing prediction errors. Second, signal paths of the prediction errors are simulated via a dynamic factor model for the errors and explanatory variables. From these simulated errors, ensemble time series for Niño3.4 are constructed. Third, forecasts are generated from the ensemble time series and their sample average is our final forecast. As part of these dynamic factor simulations, we also obtain the forecast of the El Niño event which is a categorical variable. We present empirical evidence that our procedure can be superior in its forecasting performance when compared to other econometric forecasting methods.

Forecasting football match results in national league competitions using score-driven time series models with Koopman, S.J.  Journal article

International Journal of Forecasting

We develop a new dynamic multivariate model for the analysis and the forecasting of football match results in national league competitions. The proposed dynamic model is based on the score of the predictive observation mass function for a high-dimensional panel of weekly match results. Our main interest is to forecast whether the match result is a win, a loss or a draw for each team. To deliver such forecasts, the dynamic model can be based on three different dependent variables: the pairwise count of the number of goals, the difference between the number of goals, or the category of the match result (win, loss, draw). The different dependent variables require different distributional assumptions. Furthermore, different dynamic model specifications can be considered for generating the forecasts. We empirically investigate which dependent variable and which dynamic model specification yield the best forecasting results. In an extensive forecasting study for match results from six large European football competitions, we validate the precision of the forecasts and the success of the forecasts in a betting simulation. We conclude that the dynamic model for pairwise counts delivers the most precise forecasts while the dynamic model for difference between counts is most successful in betting; they both outperform benchmark and other competing models.
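As a rough illustration of how the three dependent variables relate, the sketch below turns a pair of goal-count intensities into win, draw and loss probabilities. It assumes independent Poisson counts with fixed intensities, which is a simplification made here for illustration and not the score-driven specification of the paper; the names `poisson_pmf`, `outcome_probabilities`, `lam_home`, `lam_away` and `max_goals` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k goals under a Poisson(lam) count."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=30):
    """Map pairwise count intensities to (win, draw, loss) for the home team.

    Assumption: the two counts are independent Poisson variables; scores above
    max_goals are ignored because their probability is negligible.
    """
    win = draw = loss = 0.0
    for x in range(max_goals + 1):
        px = poisson_pmf(x, lam_home)
        for y in range(max_goals + 1):
            p = px * poisson_pmf(y, lam_away)
            if x > y:        # positive goal difference
                win += p
            elif x == y:     # zero goal difference
                draw += p
            else:            # negative goal difference
                loss += p
    return win, draw, loss


if __name__ == "__main__":
    # Illustrative intensities only, not estimates from any dataset.
    print(outcome_probabilities(1.6, 1.1))
```

The same double loop also yields the distribution of the goal difference, since each (x, y) cell contributes its probability to the difference x - y.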

Dynamic Discrete Copula Models for High Frequency Stock Price Changes with Koopman, S.J., Lucas, A. and Opschoor, A. Journal article

Journal of Applied Econometrics

We develop a dynamic model for the intraday dependence between discrete stock price changes. The conditional copula mass function for the integer tick-size price changes has time-varying parameters that are driven by the score of the predictive likelihood function. The marginal distributions are Skellam and also have score-driven time-varying parameters. We show that the integration steps in the copula mass function for large dimensions can be accurately approximated via numerical integration. The resulting computational gains lead to a methodology that can treat high-dimensional applications. Its accuracy is shown by an extensive simulation study. In our empirical application of ten U.S. bank stocks, we reveal strong evidence of time-varying intraday dependence patterns: dependence starts at a low level but generally rises during the day. Based on one-step-ahead out-of-sample density forecasting, we find that our new model outperforms benchmarks for intraday dependence such as the cubic spline model, the fixed correlation model, or the rolling average realized correlation.

Modified Efficient Importance Sampling for partially non-Gaussian State Space Models with Koopman, S.J. and Nguyen, T.M. Journal article

Statistica Neerlandica

The construction of an importance density for partially non-Gaussian state space models is crucial when simulation methods are used for likelihood evaluation, signal extraction and forecasting. The method of efficient importance sampling is successful in this respect but we show that it can be implemented in a computationally more efficient manner using standard Kalman filter and smoothing methods. Efficient importance sampling is generally applicable for a wide range of models but it is typically a custom-built procedure. For the class of partially non-Gaussian state space models, we present a general method for efficient importance sampling. Our novel method makes the efficient importance sampling methodology more accessible because it does not require the computation of a (possibly) complicated density kernel that needs to be tracked for each time period. The new method is illustrated for a stochastic volatility model with a Student’s t-distribution.
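To show why the choice of importance density matters for likelihood evaluation, here is a minimal sketch of plain importance sampling (not the modified efficient importance sampling method of the paper) on a toy model assumed purely for illustration: a state x ~ N(0, 1) and an observation y | x ~ N(x, 1), so the exact likelihood is the N(0, 2) density at y. The function names and the Gaussian proposal parameters are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def is_likelihood(y, prop_mean, prop_var, n_draws=10000, seed=0):
    """Importance sampling estimate of p(y) = integral of p(y|x) p(x) dx.

    Toy model (an assumption of this sketch): x ~ N(0, 1), y | x ~ N(x, 1).
    Draws come from the Gaussian importance density N(prop_mean, prop_var).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(prop_mean, math.sqrt(prop_var))
        # Weight = target kernel / importance density.
        weight = normal_pdf(y, x, 1.0) * normal_pdf(x, 0.0, 1.0)
        total += weight / normal_pdf(x, prop_mean, prop_var)
    return total / n_draws


if __name__ == "__main__":
    y = 1.0
    print("exact          :", normal_pdf(y, 0.0, 2.0))
    print("prior proposal :", is_likelihood(y, 0.0, 1.0))
    # The posterior of x given y is N(y/2, 1/2); using it as the importance
    # density makes every weight equal to p(y), so the estimate has no noise.
    print("ideal proposal :", is_likelihood(y, y / 2.0, 0.5, n_draws=100))
```

In this toy case the ideal importance density is known in closed form; for partially non-Gaussian models it is not, which is why a good approximating density has to be constructed.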

Intraday Stochastic Volatility in Discrete Price Changes: the Dynamic Skellam Model with Koopman, S.J. and Lucas, A. Journal article

Journal of the American Statistical Association (2017), 112, 1490-1503

We study intraday stochastic volatility for four liquid stocks traded on the New York Stock Exchange using a new dynamic Skellam model for high-frequency tick-by-tick discrete price changes. Since the likelihood function is analytically intractable, we rely on numerical methods for its evaluation. Given the high number of observations per series per day (1,000 to 10,000), we adopt computationally efficient methods including Monte Carlo integration. The intraday dynamics of volatility and the high number of trades without price impact require non-trivial adjustments to the basic dynamic Skellam model. In-sample residual diagnostics and goodness-of-fit statistics show that the final model provides a good fit to the data. An extensive day-to-day forecasting study of intraday volatility shows that the dynamic modified Skellam model provides accurate forecasts compared to alternative modeling approaches.
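For readers unfamiliar with the Skellam distribution, the following sketch evaluates its probability mass function as the distribution of the difference of two independent Poisson counts, which is a natural model for integer tick-size price changes. It covers only the basic static Skellam distribution with fixed positive intensities; the dynamic volatility and the adjustment for trades without price impact described in the abstract are not implemented. The names `skellam_pmf`, `mu1`, `mu2` and `terms` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson(lam) probability of k, computed on the log scale; lam must be > 0."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def skellam_pmf(k, mu1, mu2, terms=200):
    """P(N1 - N2 = k) for independent N1 ~ Poisson(mu1), N2 ~ Poisson(mu2).

    The infinite convolution sum over N2 = j is truncated after `terms`
    values, which is adequate for small intensities (an assumption here).
    """
    start = max(0, -k)  # ensures N1 = j + k is non-negative
    return sum(
        poisson_pmf(j + k, mu1) * poisson_pmf(j, mu2)
        for j in range(start, start + terms)
    )


if __name__ == "__main__":
    # Symmetric case: zero mean, variance mu1 + mu2 (illustrative values).
    for tick in range(-3, 4):
        print(tick, round(skellam_pmf(tick, 0.8, 0.8), 4))
```

The mean of the distribution is mu1 - mu2 and the variance is mu1 + mu2, so in the symmetric case a single intensity parameter controls the volatility of the price change.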

Model-Based Business Cycle and Financial Cycle Decomposition for Europe and the United States with Koopman, S.J. and Lucas, A. Book chapter

Systemic Risk Tomography: Signals, Measurements and Transmission Channels, ISTE-Elsevier (2016)

We develop a multivariate unobserved components model to extract business cycle and financial cycle indicators from a panel of economic and financial time series of four large developed economies. Our model is flexible and allows for the inclusion of cycle components in different selections of economic variables with different scales and with possible phase shifts. We find clear evidence of the presence of a financial cycle with a length that is approximately twice the length of a regular business cycle. Moreover, cyclical movements in credit-related variables largely depend on the financial cycle, and only marginally on the business cycle. Property prices appear to have their own idiosyncratic dynamics and do not substantially load on business or financial cycle components. Systemic surveillance policies should therefore account for the different dynamic components in typical macro-financial variables.

A dynamic bivariate Poisson model for analysing and forecasting match results in the English Premier League with Koopman, S.J. Journal article

Journal of the Royal Statistical Society, Series A (2015), 178(1), 167-186

We develop a statistical model for the analysis and forecasting of football match results which assumes a bivariate Poisson distribution with intensity coefficients that change stochastically over time. The dynamic model is a novelty in the statistical time series analysis of match results in team sports. Our treatment is based on state space and importance sampling methods which are computationally efficient. The out-of-sample performance of our methodology is verified in a betting strategy that is applied to the match outcomes from the 2010–2011 and 2011–2012 seasons of the English football Premier League.
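The bivariate Poisson distribution mentioned above can be sketched through its standard trivariate-reduction construction: X = W1 + W3 and Y = W2 + W3, with W1, W2, W3 independent Poisson variables. The code below evaluates the joint probability of a score (x, y) for fixed intensities; the stochastic time variation of the intensities and the state space estimation are outside this sketch, and the names `bivariate_poisson_pmf`, `lam1`, `lam2` and `lam3` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of k under Poisson(lam); lam = 0 puts all mass on k = 0."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) with X = W1 + W3 and Y = W2 + W3.

    W1, W2, W3 are independent Poisson counts with intensities lam1, lam2
    and lam3; lam3 is the covariance between the two goal counts.
    """
    total = 0.0
    for k in range(min(x, y) + 1):  # k is the value of the shared count W3
        total += (
            poisson_pmf(x - k, lam1)
            * poisson_pmf(y - k, lam2)
            * poisson_pmf(k, lam3)
        )
    return total


if __name__ == "__main__":
    # Probability of a 2-1 score for illustrative intensities.
    print(bivariate_poisson_pmf(2, 1, 1.4, 1.0, 0.1))
```

Setting lam3 to zero recovers two independent Poisson counts, which makes the role of the dependence parameter easy to check numerically.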