Essays on Statistical Arbitrage

Submitted to the Faculty of Law and Economics / the Department of Business and Economics of Friedrich-Alexander-Universität Erlangen-Nürnberg for the award of the doctoral degree Dr. rer. pol. by Diplom-Wirtschaftsingenieur Univ. Christopher Krauß from Nürnberg
Contents

Abstract
Abstract (German version)
Acknowledgements

1 Introduction

2 Statistical Arbitrage Pairs Trading Strategies: Review and Outlook
  2.1 Introduction
  2.2 Distance approach
    2.2.1 The baseline approach - Gatev, Goetzmann and Rouwenhorst
    2.2.2 Expanding on the GGR sample
    2.2.3 From SSD to Pearson correlation and quasi-multivariate pairs trading
    2.2.4 Explaining pairs trading profitability
    2.2.5 Further out-of-sample testing of GGR's strategy
  2.3 Cointegration approach
    2.3.1 Univariate pairs trading
      2.3.1.1 Development of a theoretical framework
      2.3.1.2 A large-scale empirical application
      2.3.1.3 A deep-dive on the development of optimal trading thresholds
      2.3.1.4 A review of further empirical applications
    2.3.2 Multivariate cointegration approach
      2.3.2.1 Passive index tracking and enhanced indexation strategies
      2.3.2.2 Active statistical arbitrage strategies
    2.3.3 Adjacent developments
  2.4 Time series approach
    2.4.1 Modeling the spread in state space
    2.4.2 Applications of the Ornstein-Uhlenbeck process
    2.4.3 Further concepts from time series analysis
  2.5 Stochastic control approach
    2.5.1 Modeling asset pricing dynamics with the Ornstein-Uhlenbeck process
    2.5.2 Modeling asset pricing dynamics with error correction models
  2.6 Other approaches
    2.6.1 Machine learning and combined forecasts approach
    2.6.2 Copula approach
    2.6.3 Principal components analysis approach
  2.7 Pairs trading in the light of market frictions
  2.8 Conclusion
    2.8.1 Distance approach
    2.8.2 Cointegration approach
    2.8.3 Time series approach
    2.8.4 Stochastic control approach
    2.8.5 Other approaches
    2.8.6 Pairs trading in the light of market frictions

3 On the power and size properties of cointegration tests in the light of high-frequency stylized facts
  3.1 Introduction
  3.2 Data sample and its stylized facts
  3.3 Methodology
    3.3.1 Simulation of stock prices
    3.3.2 Simulation of cointegration processes
      3.3.2.1 Autoregressive model
      3.3.2.2 Generalized autoregressive conditional heteroscedasticity model
      3.3.2.3 Multiple regime smooth transition autoregressive model
      3.3.2.4 Multiple regime smooth transition autoregressive model with reversible jumps
      3.3.2.5 Multiple regime smooth transition autoregressive model with nonreversible jumps
      3.3.2.6 Parameter choices common to all Monte Carlo variants
    3.3.3 The cointegration relation
    3.3.4 Analysis of power and size properties
      3.3.4.1 Cointegration tests
      3.3.4.2 Definition of size and power
      3.3.4.3 Setup of Monte Carlo simulations
  3.4 Results
    3.4.1 Results Type I through Type III
    3.4.2 Results Type IV
    3.4.3 Results Type V
    3.4.4 Results Type VI
  3.5 Economic interpretation
  3.6 Conclusion

4 Pairs trading with partial cointegration
  4.1 Introduction
  4.2 Partial cointegration
    4.2.1 Representation
    4.2.2 Estimation of a partial cointegration model
    4.2.3 Consistency of estimation routine
    4.2.4 Power and size properties of the likelihood ratio test
  4.3 Study design: Comparing partial cointegration with cointegration in the context of pairs trading
    4.3.1 Data
    4.3.2 The backtesting framework
      4.3.2.1 Building blocks
      4.3.2.2 Formation period
      4.3.2.3 Trading period
    4.3.3 Trading on simulated data
  4.4 Results
    4.4.1 Simulated data
    4.4.2 Empirical data
      4.4.2.1 Performance evaluation
      4.4.2.2 Sub-period analysis
  4.5 Conclusions
  4.6 Appendix A. Identifiability
  4.7 Appendix B. Likelihood function
  4.8 Appendix C. Likelihood ratio test

5 Nonlinear dependence modeling with bivariate copulas: Statistical arbitrage pairs trading on the S&P 100
  5.1 Introduction
  5.2 Data and software
    5.2.1 Data
    5.2.2 Software
  5.3 Methodology
    5.3.1 Preliminaries
      5.3.1.1 Copula concept
      5.3.1.2 Goodness-of-fit of copulas
        Cramér-von Mises test
        Information criteria
    5.3.2 Formation period
      5.3.2.1 Estimation period
      5.3.2.2 Pseudo-trading period
        Suitable pairs
        Individualized exit rules
    5.3.3 Trading period
    5.3.4 Return computation
  5.4 Results
    5.4.1 Return characteristics and trading statistics
    5.4.2 Value at risk
    5.4.3 Annualized risk-return characteristics
    5.4.4 Drawdown measures
    5.4.5 Subperiod analysis
    5.4.6 Common risk factors
    5.4.7 Market frictions
  5.5 Conclusion

6 Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500
  6.1 Introduction
  6.2 Literature review
  6.3 Data and software
    6.3.1 Data
    6.3.2 Software
  6.4 Methodology
    6.4.1 Generation of training and trading sets
    6.4.2 Feature generation
    6.4.3 Model training
      6.4.3.1 Deep neural networks
      6.4.3.2 Gradient-boosted trees
      6.4.3.3 Random forests
      6.4.3.4 Equal-weighted ensemble
    6.4.4 Forecasting, ranking, and trading
  6.5 Results
    6.5.1 General results
    6.5.2 Strategy performance
    6.5.3 Sub-period analysis
    6.5.4 Further analyses
      6.5.4.1 Variable importances
      6.5.4.2 Industry breakdown
      6.5.4.3 Robustness checks
  6.6 Conclusion

7 Conclusion

Bibliography