Inference in the Skew-t and Related Distributions

Transcript:

Inference in the Skew-t and Related Distributions
Anna Clara Monti
Pe.Me.Is Department, Statistical Division, University of Sannio
Einaudi Institute for Economics and Finance, Rome, 8 May 2008

Outline
- Brief overview of flexible models
- The skew-t model
- Problems connected to inference
- Reparameterizations
- Applications in a robustness context

The basic lemma
Let g be a continuous density on R^d symmetric about 0, and let π: R^d → [0, 1] be a skewing function satisfying π(−y) = 1 − π(y). Then
  f(y) = 2 g(y) π(y)
is a proper density function. Azzalini (1985) and further literature.
Often π(y) = G{w(y)}, where G is a scalar distribution function such that G(−y) = 1 − G(y) and w: R^d → R is odd, i.e. w(−y) = −w(y).
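A quick numerical illustration of the lemma: the sketch below (with illustrative choices g = standard normal density, G = logistic cdf and w(y) = y + y³, none of which come from the slides) checks that 2 g(y) G{w(y)} integrates to one.

```python
# Minimal numerical check of the basic lemma: f(y) = 2 g(y) G(w(y)) integrates to 1
# whenever g is symmetric about 0, G(-x) = 1 - G(x) and w is odd.
import numpy as np
from scipy import stats
from scipy.integrate import quad

g = stats.norm.pdf            # symmetric base density (illustrative choice)
G = stats.logistic.cdf        # scalar cdf with G(-x) = 1 - G(x) (illustrative choice)
w = lambda y: y + y**3        # odd function, w(-y) = -w(y) (illustrative choice)

f = lambda y: 2 * g(y) * G(w(y))
total, _ = quad(f, -np.inf, np.inf)
print(total)                  # ~1.0, so f is a proper density
```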

SN – The Skew Normal distribution
  f_d(y; ξ, Ω, α) = 2 φ_d(y − ξ; Ω) Φ{α'ω⁻¹(y − ξ)},   Y ~ SN_d(ξ, Ω, α)
Azzalini (1985), Azzalini & Dalla Valle (1996)
  ω = diag(ω₁₁, …, ω_dd)^{1/2},   Ω̄ = ω⁻¹ Ω ω⁻¹
When Y ~ SN_d(ξ, Ω, α):
  E(Y) = ξ + ωμ_z,   Var(Y) = Ω − ωμ_z μ_z'ω
  μ_z = bδ,   δ = Ω̄α / (1 + α'Ω̄α)^{1/2},   b = (2/π)^{1/2}
In the scalar case, with δ = α/(1 + α²)^{1/2}:
  γ₁ = {(4 − π)/2} (bδ)³ / (1 − b²δ²)^{3/2},   −0.995 < γ₁ < 0.995
  γ₂ = 2(π − 3) (bδ)⁴ / (1 − b²δ²)²
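The scalar case of the density and of the moment formulas above can be checked against scipy.stats.skewnorm; the parameter values in this sketch are arbitrary illustrations.

```python
# Check the scalar SN density and moments against scipy.stats.skewnorm.
import numpy as np
from scipy import stats

xi, omega, alpha = 1.0, 2.0, 3.0            # illustrative direct parameters
y = np.linspace(-5, 10, 7)
z = (y - xi) / omega

# f(y) = (2/omega) * phi(z) * Phi(alpha * z)
f_manual = 2.0 / omega * stats.norm.pdf(z) * stats.norm.cdf(alpha * z)
f_scipy = stats.skewnorm.pdf(y, alpha, loc=xi, scale=omega)
print(np.allclose(f_manual, f_scipy))       # True

# Moments: delta = alpha/sqrt(1+alpha^2), b = sqrt(2/pi)
delta = alpha / np.sqrt(1 + alpha**2)
b = np.sqrt(2 / np.pi)
mean = xi + omega * b * delta
var = omega**2 * (1 - b**2 * delta**2)
gamma1 = (4 - np.pi) / 2 * (b * delta)**3 / (1 - b**2 * delta**2)**1.5
print(np.allclose([mean, var, gamma1],
                  stats.skewnorm.stats(alpha, loc=xi, scale=omega, moments="mvs")))
```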

Stochastic representations
Conditioning method: (U₀, U) ~ N_{d+1}(0, Ω*),   Ω* = [[1, δ'], [δ, Ω̄]]
  Z = (U | U₀ > 0) ~ SN_d(0, Ω̄, α),   α = Ω̄⁻¹δ / (1 − δ'Ω̄⁻¹δ)^{1/2}
Transformation method: U₀ ~ N(0, 1) independent of (U₁, …, U_d) ~ N_d(0, Ψ), Ψ a correlation matrix,
  Z_j = δ_j|U₀| + (1 − δ_j²)^{1/2} U_j,   −1 < δ_j < 1
  (Z₁, …, Z_d) ~ SN_d(0, Ω̄, α)
Maximum (d = 2): (U₀, U₁) ~ N₂(0, [[1, ρ], [ρ, 1]])
  max(U₀, U₁) ~ SN(0, 1, α),   α = {(1 − ρ)/(1 + ρ)}^{1/2}
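A small simulation of the scalar transformation method, with δ = α/(1 + α²)^{1/2} and an arbitrary value of α, compared with the SN(0, 1, α) distribution:

```python
# Transformation method in the scalar case: Z = delta*|U0| + sqrt(1-delta^2)*U1 ~ SN(0,1,alpha).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 3.0                                   # illustrative shape parameter
delta = alpha / np.sqrt(1 + alpha**2)

u0 = rng.standard_normal(200_000)
u1 = rng.standard_normal(200_000)
z = delta * np.abs(u0) + np.sqrt(1 - delta**2) * u1

# Kolmogorov-Smirnov comparison with the SN(0,1,alpha) cdf: large p-value expected
print(stats.kstest(z, lambda x: stats.skewnorm.cdf(x, alpha)))
```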

Main properties
Marginal distributions: Y = (Y₁, Y₂) ~ SN(0, Ω̄, α)  ⇒  Y₁ ~ SN(0, Ω̄₁₁, ᾱ₁)
  ᾱ₁ = (α₁ + Ω̄₁₁⁻¹Ω̄₁₂α₂) / (1 + α₂'Ω̄₂₂.₁α₂)^{1/2},   Ω̄₂₂.₁ = Ω̄₂₂ − Ω̄₂₁Ω̄₁₁⁻¹Ω̄₁₂
Linear transformations: Y ~ SN(ξ, Ω, α), A non-singular  ⇒  A'Y ~ SN(A'ξ, A'ΩA, α_A) for a suitably transformed shape vector α_A
Quadratic forms: Y ~ SN(ξ, Ω, α), B symmetric positive semidefinite with BΩB = B and rank(B) = p
  ⇒  (Y − ξ)'B(Y − ξ) ~ χ²_p
Even functions: if Y has density f(y) = 2 g(y) G{w(y)} and Z has density g, then t(Y) =_d t(Z) for every even function t(·)
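The quadratic-form property can be verified by simulation, using the conditioning representation of the previous slide; the sketch below takes d = 2, an illustrative δ vector, Ω̄ = I and the special case B = Ω̄⁻¹ (so p = d = 2).

```python
# Conditioning representation U | U0 > 0 and quadratic-form property Y' Omega^{-1} Y ~ chi^2_2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
delta = np.array([0.6, 0.3])                # illustrative delta
Omega_bar = np.eye(2)                       # correlation matrix of the SN vector
Omega_star = np.block([[np.ones((1, 1)), delta[None, :]],
                       [delta[:, None], Omega_bar]])

U = rng.multivariate_normal(np.zeros(3), Omega_star, size=400_000)
Y = U[U[:, 0] > 0][:, 1:]                   # conditioning: keep U given U0 > 0

Q = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(Omega_bar), Y)
print(stats.kstest(Q, 'chi2', args=(2,)))   # close to chi^2 with d = 2 degrees of freedom
```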

Skew Normal – Inferential issues
- Inflection point of the profile likelihood when α = 0
- The information matrix is singular
- The shape of the likelihood function can be far from quadratic
  (Azzalini & Capitanio, 1999)
These problems can be remedied by the centred reparameterization CP = (μ, Var(Y), γ₁)
  Azzalini (1985), Azzalini & Capitanio (1999), Arellano-Valle & Azzalini (2007)
- The estimate of the skewness parameter is often infinite
  Liseo & Loperfido (2004), Sartori (2005), Greco (2008), Azzalini & Genton (2008)

Skew elliptical distributions
Purpose: to deal with both skewness and heavy tails.
Elliptical base density, constant on ellipsoids:
  g(y; ξ, Ω) = c_d |Ω|^{-1/2} g̃{(y − ξ)'Ω⁻¹(y − ξ)}
Skewed version:
  f(y) = 2 g(y; ξ, Ω) G{w(y − ξ)}
Stochastic representations: conditioning, transformation, maximum (d = 2)
Azzalini & Capitanio (2003), Branco & Dey (2001)

Skew-t
  Z ~ SN_d(0, Ω, α),   V ~ χ²_ν/ν,   Z and V independent
  Y = ξ + V^{-1/2} Z ~ St_d(ξ, Ω, α, ν)
Density:
  f(y; ξ, Ω, α, ν) = 2 t_d(y; ξ, Ω, ν) T₁{ α'ω⁻¹(y − ξ) [(ν + d)/(Q_y + ν)]^{1/2} ; ν + d }
  Q_y = (y − ξ)'Ω⁻¹(y − ξ)
  t_d(y; ξ, Ω, ν) = Γ((ν + d)/2) / { |Ω|^{1/2} (πν)^{d/2} Γ(ν/2) } · (1 + Q_y/ν)^{-(ν+d)/2}
  T₁(y; ν) = ∫_{−∞}^{y} t₁(u; 0, 1, ν) du

Main properties of the Skew-t
Linear transformations: Y ~ St(ξ, Ω, α, ν)  ⇒  AY + b ~ St(ξ', AΩA', α', ν)
Marginal distributions: Y ~ St(ξ, Ω, α, ν)  ⇒  Y_j ~ St(ξ_j, ω_jj, ᾱ_j, ν)
Quadratic forms: Y ~ St(ξ, Ω, α, ν)  ⇒  (Y − ξ)'Ω⁻¹(Y − ξ)/d ~ F_{d,ν}
Azzalini & Capitanio (2003)

Univariate Skew-t distribution
Y ~ St(ξ, ω, α, ν)
  f(y; ξ, ω, α, ν) = 2 t(y; ξ, ω, ν) T(ζ; ν + 1)
  z = (y − ξ)/ω,   τ = {(ν + 1)/(ν + z²)}^{1/2},   ζ = ατz
  t(y; ξ, ω, ν) = Γ((ν + 1)/2) / { ω (πν)^{1/2} Γ(ν/2) } · (1 + z²/ν)^{-(ν+1)/2}
  T(y; ν) = ∫_{−∞}^{y} t(u; 0, 1, ν) du
[Plot of the density for α = 3, ν = 5.]
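A minimal sketch of the univariate density above, checked against the stochastic representation Y = ξ + ω Z₀/√V with Z₀ ~ SN(0, 1, α) and V ~ χ²_ν/ν; the parameter values are illustrative.

```python
# Univariate skew-t density f(y) = (2/omega) t(z; nu) T(alpha*tau*z; nu+1) versus simulation.
import numpy as np
from scipy import stats
from scipy.integrate import quad

xi, omega, alpha, nu = 0.0, 1.0, 3.0, 5.0       # illustrative parameters

def skew_t_pdf(y):
    z = (y - xi) / omega
    tau = np.sqrt((nu + 1) / (nu + z**2))
    return 2.0 / omega * stats.t.pdf(z, nu) * stats.t.cdf(alpha * tau * z, nu + 1)

print(quad(skew_t_pdf, -np.inf, np.inf)[0])     # ~1.0: proper density

rng = np.random.default_rng(2)
n = 400_000
z0 = stats.skewnorm.rvs(alpha, size=n, random_state=rng)
v = rng.chisquare(nu, size=n) / nu
y = xi + omega * z0 / np.sqrt(v)                # sample from the stochastic representation

def skew_t_cdf(t_val):
    return quad(skew_t_pdf, -np.inf, t_val)[0]

# KS test on a subsample: density and sampler describe the same distribution
print(stats.kstest(y[:2000], np.vectorize(skew_t_cdf)))
```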

Moments of the Skew-t distribution
  μ = ξ + ωδb_ν
  σ² = ω² { ν/(ν − 2) − δ²b_ν² }
  b_ν = (ν/π)^{1/2} Γ((ν − 1)/2) / Γ(ν/2),   δ = α/(1 + α²)^{1/2}
  γ₁ = δb_ν { ν(3 − δ²)/(ν − 3) − 3ν/(ν − 2) + 2δ²b_ν² } / { ν/(ν − 2) − δ²b_ν² }^{3/2}
  γ₂ = [ 3ν²/{(ν − 2)(ν − 4)} − 4δ²b_ν² ν(3 − δ²)/(ν − 3) + 6δ²b_ν² ν/(ν − 2) − 3δ⁴b_ν⁴ ] / { ν/(ν − 2) − δ²b_ν² }² − 3
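The moment expressions, as written above, can be checked by numerical integration of the standardized density (ξ = 0, ω = 1); the values α = 2 and ν = 7 are illustrative.

```python
# Check the skew-t moment formulas against numerical integration of the density.
import numpy as np
from scipy import stats
from scipy.special import gammaln
from scipy.integrate import quad

alpha, nu = 2.0, 7.0                             # illustrative parameters (nu > 4 needed for gamma2)
delta = alpha / np.sqrt(1 + alpha**2)
b_nu = np.sqrt(nu / np.pi) * np.exp(gammaln((nu - 1) / 2) - gammaln(nu / 2))

mu = delta * b_nu
sigma2 = nu / (nu - 2) - (delta * b_nu)**2
gamma1 = delta * b_nu * (nu * (3 - delta**2) / (nu - 3)
                         - 3 * nu / (nu - 2) + 2 * (delta * b_nu)**2) / sigma2**1.5
gamma2 = (3 * nu**2 / ((nu - 2) * (nu - 4))
          - 4 * (delta * b_nu)**2 * nu * (3 - delta**2) / (nu - 3)
          + 6 * (delta * b_nu)**2 * nu / (nu - 2)
          - 3 * (delta * b_nu)**4) / sigma2**2 - 3

def pdf(y):
    tau = np.sqrt((nu + 1) / (nu + y**2))
    return 2 * stats.t.pdf(y, nu) * stats.t.cdf(alpha * tau * y, nu + 1)

moment = lambda k: quad(lambda y: y**k * pdf(y), -200, 200)[0]   # tails beyond 200 are negligible
m1, m2, m3, m4 = (moment(k) for k in (1, 2, 3, 4))
v = m2 - m1**2
print([mu, sigma2, gamma1, gamma2])
print([m1, v, (m3 - 3*m1*v - m1**3) / v**1.5,
       (m4 - 4*m1*m3 + 6*m1**2*m2 - 3*m1**4) / v**2 - 3])         # the two rows agree
```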

Skew-t – Main features
A model able to deal with:
- skewness (α)
- heavy tails (ν)
Special cases:
- α = 0: Y ~ t(ξ, ω, ν)
- ν → ∞: Y ~ SN(ξ, ω, α)
- α = 0 and ν → ∞: Y ~ N(ξ, ω²)
Topics of the talk:
- Inferential issues under the special cases
- Applications in robustness contexts

Score functions and information matrix
ϑ = (ξ, ω, α, ν),   z = (y − ξ)/ω,   τ = {(ν + 1)/(ν + z²)}^{1/2},   ζ = ατz,
w = t(ζ; ν + 1)/T(ζ; ν + 1),   Ψ(u) = d ln Γ(u)/du
  S_ξ(y) = ∂ln f(y; ϑ)/∂ξ = zτ²/ω − { ατν / (ω(ν + z²)) } w
  S_ω(y) = ∂ln f(y; ϑ)/∂ω = −1/ω + z²τ²/ω − { ατzν / (ω(ν + z²)) } w
  S_α(y) = ∂ln f(y; ϑ)/∂α = zτ w
  S_ν(y) = ∂ln f(y; ϑ)/∂ν = (1/2){ Ψ((ν + 1)/2) − Ψ(ν/2) − ln(1 + z²/ν) + (z² − 1)/(ν + z²) }
           + { αz(z² − 1) / (2τ(ν + z²)²) } w + γ_ν,
where γ_ν denotes the term ∂ln T(ζ; ν + 1)/∂ν taken at fixed ζ, which has an integral representation.
Maximum likelihood estimators: given Y₁, …, Y_n, ϑ̂ = (ξ̂, ω̂, α̂, ν̂) solves
  Σ_{i=1}^{n} S_{ϑ_j}(y_i) = 0,   j = 1, …, 4
Information matrix:
  I(ϑ) = E{ S_ϑ(Y) S_ϑ(Y)' } = −E{ ∂²ln f(Y; ϑ)/∂ϑ∂ϑ' }
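In practice the MLE can be obtained by numerical optimisation of the log-likelihood rather than by solving the score equations directly; the sketch below does this for simulated skew-t data. The optimiser, starting values and log-scale parameterisation are illustrative choices, not taken from the slides.

```python
# Skew-t maximum likelihood by direct numerical optimisation of the log-likelihood.
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def negloglik(theta, y):
    xi, log_omega, alpha, log_nu = theta          # log scale keeps omega and nu positive
    omega, nu = np.exp(log_omega), np.exp(log_nu)
    z = (y - xi) / omega
    tau = np.sqrt((nu + 1) / (nu + z**2))
    logf = (np.log(2) - np.log(omega) + stats.t.logpdf(z, nu)
            + stats.t.logcdf(alpha * tau * z, nu + 1))
    return -np.sum(logf)

rng = np.random.default_rng(3)
nu_true, alpha_true = 5.0, 3.0
z0 = stats.skewnorm.rvs(alpha_true, size=2000, random_state=rng)
y = z0 / np.sqrt(rng.chisquare(nu_true, size=2000) / nu_true)     # St(0, 1, 3, 5) sample

fit = minimize(negloglik, x0=np.array([np.median(y), 0.0, 0.0, np.log(10.0)]),
               args=(y,), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-6})
xi_hat, omega_hat = fit.x[0], np.exp(fit.x[1])
alpha_hat, nu_hat = fit.x[2], np.exp(fit.x[3])
print(xi_hat, omega_hat, alpha_hat, nu_hat)       # roughly (0, 1, 3, 5) for a sample this large
```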

Score functions
[Plots of S_ξ, S_ω, S_α and S_ν against x, for ξ = 0, ω = 1, α = 3 and ν = 3 (blue), ν = 5 (green), ν = 10 (red), ν = 25 (black).]

Issues related to likelihood inference
Nice behaviour of the likelihood function:
- no stationary point
- quadratic shape
(Azzalini & Capitanio 2003, Azzalini & Genton 2008)
But:
- the information matrix can be singular (at normality or skew normality)
- a parameter can lie on the boundary of the parameter space
- the estimate of α can be infinite
  Sartori (2005), Azzalini & Genton (2008), Greco (2008)

Score functions and information matrix under the t distribution (α = 0)
  S_ξ^t(y) = zτ²/ω
  S_ω^t(y) = −1/ω + z²τ²/ω
  S_α^t(y) = 2zτ t(0; ν + 1)
  S_ν^t(y) = (1/2){ Ψ((ν + 1)/2) − Ψ(ν/2) − ln(1 + z²/ν) + (z² − 1)/(ν + z²) }
Off-diagonal blocks:
  I^t_ξω = I^t_ξν = I^t_ωα = I^t_αν = 0   ⇒   (ξ̂, α̂) and (ω̂, ν̂) are asymptotically uncorrelated
Non-zero entries (Ψ₁ the trigamma function):
  I^t_ξξ = (ν + 1) / { ω²(ν + 3) }
  I^t_ωω = 2ν / { ω²(ν + 3) }
  I^t_ων = −2 / { ω(ν + 1)(ν + 3) }
  I^t_νν = (1/4){ Ψ₁(ν/2) − Ψ₁((ν + 1)/2) } − (ν + 5) / { 2ν(ν + 1)(ν + 3) }
  I^t_ξα = 4 Γ(ν/2 + 3/2) / { (πν)^{1/2} ω (ν + 2) Γ(ν/2) }
  I^t_αα = (4/π) Γ(ν/2 + 1)² / { (ν + 1) Γ(ν/2 + 1/2)² }

Score functions when ν diverges
Skew Normal limit (ν → ∞):
  S_ξ^SN(y) = z/ω − (α/ω) φ(αz)/Φ(αz)
  S_ω^SN(y) = −1/ω + z²/ω − (αz/ω) φ(αz)/Φ(αz)
  S_α^SN(y) = z φ(αz)/Φ(αz)
  S_ν(y) = O(ν⁻²)
Normal limit (α = 0, ν → ∞):
  S_ξ^N(y) = z/ω
  S_ω^N(y) = −1/ω + z²/ω
  S_α^N(y) = (2/π)^{1/2} z
  S_ν^N(y) = 0
rank(I^N) = 2: the information matrix is singular!

Inferential problems when ν diverges
- Singularity of the information matrix
- Parameter ν on the boundary
- Efficiency of the estimators
- Asymptotic distribution of the estimators
- Likelihood ratio test

Root mean square errors
Root mean square errors of the estimators of the direct parameters ξ, ω, α and κ = 1/ν when the underlying distribution is N(ξ, ω²)

n        ξ        ω        α          κ
50       0.836    0.3467   350.7858   0.0667
100      0.7159   0.651    1.505      0.0459
200      0.610    0.054    0.9790     0.038
500      0.540    0.1489   0.758      0.011
1000     0.4584   0.1170   0.6394     0.0157
5000     0.350    0.0705   0.467      0.0073
10000    0.3115   0.0558   0.4083     0.005

Simulations = 10,000

Simulated distribution of the estimators under the Normal model
n = 10,000;  simulations = 10,000
[Histograms of the simulated distributions of ξ̂, ω̂ and α̂.]

Distribution of the likelihood ratio statistic to test normality or skew normality
  H₀: Y ~ N(ξ, ω²)   or   H₀: Y ~ SN(ξ, ω, α)
The likelihood ratio statistic does not converge in distribution to a chi-square!

Reparameterization of the degrees of freedom
The singularity at the SN model can be removed by the reparameterization κ = 1/ν, so that ϑ_κ = (ξ, ω, α, κ).
The score for κ is S_κ(y) = −ν² S_ν(y); at the SN model (κ = 0) it reduces to
  S_κ^SN(y) = (1/4)(z⁴ − 2z² − 1) − (1/4) αz {(α² + 2)z² − 1} φ(αz)/Φ(αz)
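The limiting score S_κ^SN as written above can be checked by a finite difference in κ near zero; the evaluation points z, the value of α and the step size in this sketch are arbitrary.

```python
# Finite-difference check of the score for kappa = 1/nu at the SN model (xi = 0, omega = 1).
import numpy as np
from scipy import stats

def logf(z, alpha, kappa):
    """Skew-t log-density with kappa = 1/nu; kappa = 0 gives the skew-normal limit."""
    if kappa == 0.0:
        return np.log(2) + stats.norm.logpdf(z) + stats.norm.logcdf(alpha * z)
    nu = 1.0 / kappa
    tau = np.sqrt((nu + 1) / (nu + z**2))
    return np.log(2) + stats.t.logpdf(z, nu) + stats.t.logcdf(alpha * tau * z, nu + 1)

def S_kappa_SN(z, alpha):
    r = stats.norm.pdf(alpha * z) / stats.norm.cdf(alpha * z)
    return (z**4 - 2 * z**2 - 1) / 4 - alpha * z * ((alpha**2 + 2) * z**2 - 1) * r / 4

alpha, eps = 3.0, 1e-6
for z in (-1.5, 0.3, 2.0):
    fd = (logf(z, alpha, eps) - logf(z, alpha, 0.0)) / eps   # forward difference in kappa
    print(z, fd, S_kappa_SN(z, alpha))                        # the two columns agree (up to small finite-difference error)
```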

Asymptotic properties of the skew-t MLE under the SN model (κ = 0)
ϑ̂_κ = (ξ̂, ω̂, α̂, κ̂) is consistent for ϑ_κ = (ξ, ω, α, 0).
Asymptotic distribution:
  n^{1/2} (ξ̂ − ξ, ω̂ − ω, α̂ − α, κ̂)'
    →_d (Z₁, Z₂, Z₃, Z₄)' I(Z₄ > 0)
       + (Z₁ − I_ξκ I_κκ⁻¹ Z₄, Z₂ − I_ωκ I_κκ⁻¹ Z₄, Z₃ − I_ακ I_κκ⁻¹ Z₄, 0)' I(Z₄ < 0)
  where (Z₁, Z₂, Z₃, Z₄) ~ N₄(0, I(ϑ_SN)⁻¹)

Asymptotic distribution of the likelihood ratio statistic under the SN distribution
  H₀: Y ~ SN(ξ, ω, α)
  2 ln λ = 2{ ℓ_St(ϑ̂_St) − ℓ_SN(ϑ̂_SN) }
  2 ln λ →_d Z² I(Z > 0),   Z ~ N(0, 1)

Simulated level of the likelihood ratio test for skew-normality with nominal level 0.05

n      α = 0.5   α = 1.5   α = 3    α = 5    α = 10
100    0.0179    0.065     0.0765   0.1676   0.1569
200    0.08      0.090     0.0306   0.0473   0.065
500    0.073     0.039     0.0378   0.0369   0.0331
1000   0.030     0.0338    0.037    0.0416   0.0393

Replications 10,000

Information matrix at the normal model with κ = 1/ν
Scores at the normal model:
  S_ξ^N(y) = z/ω
  S_ω^N(y) = −1/ω + z²/ω
  S_α^N(y) = (2/π)^{1/2} z
  S_κ^N(y) = (1/4)(z⁴ − 2z² − 1)
Information matrix (rows/columns ordered ξ, ω, α, κ):
  I^N = [ 1/ω²             0        (2/π)^{1/2}/ω    0
          0                2/ω²     0                2/ω
          (2/π)^{1/2}/ω    0        2/π              0
          0                2/ω      0                7/2 ]
The information matrix is still singular, because of the link between S_ξ and S_α.
The parameter κ is on the boundary.
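A quick numpy check of the singularity of the matrix as written above (with ω = 1 for illustration):

```python
# The 4x4 information matrix at the normal model (parameters xi, omega, alpha, kappa) is singular.
import numpy as np

omega = 1.0
I_N = np.array([
    [1/omega**2,             0,          np.sqrt(2/np.pi)/omega, 0      ],
    [0,                      2/omega**2, 0,                      2/omega],
    [np.sqrt(2/np.pi)/omega, 0,          2/np.pi,                0      ],
    [0,                      2/omega,    0,                      7/2    ],
])
print(np.linalg.matrix_rank(I_N), np.linalg.det(I_N))   # rank 3, determinant ~0
```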

Pseudo centred parameterization
ϑ = (ξ, ω, α, ν),   ϑ_κ = (ξ, ω, α, κ),   CP = (μ, σ, γ₁, γ₂),   ν > 4
  S_CP = (D')⁻¹ S_DP,   D_ij = ∂CP_i/∂ϑ_{κ,j}
Azzalini (2007)
At the normal model:
  S_μ(y) = z/σ
  S_σ(y) = (z² − 1)/σ
  S_γ₁(y) = (1/6)(z³ − 3z)
  S_γ₂(y) = (1/24)(z⁴ − 6z² + 3)
  I^N_CP = E{ S_CP(Y) S_CP(Y)' } = diag(1/ω², 2/ω², 1/6, 1/24)
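A Monte Carlo check that, at the normal model, the centred-parameter scores above are orthogonal with variances (1/ω², 2/ω², 1/6, 1/24); ω = 1 here for simplicity.

```python
# Orthogonality and variances of the centred-parameter scores under normality.
import numpy as np

rng = np.random.default_rng(4)
z = rng.standard_normal(2_000_000)

S = np.column_stack([z,                          # S_mu     (omega = 1)
                     z**2 - 1,                   # S_sigma
                     (z**3 - 3*z) / 6,           # S_gamma1
                     (z**4 - 6*z**2 + 3) / 24])  # S_gamma2
print(np.round(S.T @ S / len(z), 3))             # ~diag(1, 2, 1/6, 1/24)
```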

Root mean square errors with the centred parameterization
Root mean square errors of the estimators of the direct parameters ξ, ω, α, κ = 1/ν and of the centred parameters μ, σ, γ₁ and γ₂ when the underlying distribution is N(ξ, ω²)

n        ξ        ω        α          κ        μ        σ        γ₁       γ₂
50       0.836    0.3467   350.7858   0.0667   0.1463   0.1171   5.7735   1.367
         (1) (43) (151)
100      0.7159   0.651    1.505      0.0459   0.0994   0.071    0.744    .8703
200      0.610    0.054    0.9790     0.038    0.0714   0.0505   0.1915   0.4495
500      0.540    0.1489   0.758      0.011    0.0436   0.0318   0.1131   0.1874
1000     0.4584   0.1170   0.6394     0.0157   0.0311   0.0      0.0773   0.153
5000     0.350    0.0705   0.467      0.0073   0.014    0.0101   0.0338   0.0507
10000    0.3115   0.0558   0.4083     0.005    0.0099   0.0070   0.034    0.0353

(In parentheses: the number of omitted cases, out of 10,000 simulations, due to estimates ν̂ lower than the value required for the existence of the corresponding moments.) (9) (1)

Skew-t MLEs of the centred parameters at the normal model (κ = 0)
γ₂ is on the boundary.
Consistency: (μ̂, σ̂, γ̂₁, γ̂₂) is consistent for (μ, σ, γ₁, γ₂).
Asymptotic distribution of the estimators:
  n^{1/2} (μ̂ − μ, σ̂ − σ, γ̂₁, γ̂₂)' →_d (Z₁, Z₂, Z₃, Z₄ I(Z₄ > 0))'
  (Z₁, Z₂, Z₃, Z₄) ~ N₄(0, I⁻¹),   I = diag(1/ω², 2/ω², 1/6, 1/24)

Likelihood ratio normality test
  H₀: Y ~ N(ξ, ω²)
  2 ln λ = 2{ ℓ_St(ϑ̂_St) − ℓ_N(ξ̂, ω̂) }
  2 ln λ →_d Z₁² I(Z₁ > 0) + Z₂²,   Z_i ~ N(0, 1) independent, i = 1, 2

Simulated level of the likelihood ratio test for normality with Self and Liang percentiles

Nominal level   n = 100   200      500      1000     5000     10,000
0.10            0.0801    0.0764   0.0796   0.086    0.0881   0.0830
0.05            0.0406    0.0371   0.0371   0.0405   0.0448   0.0436
0.01            0.0095    0.0071   0.0075   0.0084   0.0083   0.0090

Replications 10,000
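Under the limit above, 2 ln λ is a 50:50 mixture of χ²₁ and χ²₂; the sketch below computes the corresponding (Self and Liang type) critical values used for the table.

```python
# Critical values of the 50:50 mixture of chi^2_1 and chi^2_2 for the normality LR test.
import numpy as np
from scipy import stats
from scipy.optimize import brentq

def mixture_sf(c):
    return 0.5 * stats.chi2.sf(c, 1) + 0.5 * stats.chi2.sf(c, 2)

for level in (0.10, 0.05, 0.01):
    c = brentq(lambda x: mixture_sf(x) - level, 1e-6, 50.0)
    print(level, round(c, 4))     # critical value of the mixture distribution
```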

Skew-t and robustness
The model is the skew-t: are the MLEs robust?
The skew-t is assumed in order to deal with deviations from a simpler model.

Influence function of the skew-t MLE
  IF(y; F) = I(F)⁻¹ S_ϑ(y; F)
  Gross error sensitivity: γ_s = sup_y { S_ϑ(y; F)' I(F)⁻² S_ϑ(y; F) }^{1/2} = sup_y ||IF(y; F)||
- If the degrees of freedom are estimated, the influence functions are unbounded.
- If the degrees of freedom are fixed, the influence functions are bounded, and the gross error sensitivity is limited if α is finite.

Robustness versus flexible models
- Robust methods work in a neighbourhood of a model F; only some parameters are of interest (e.g. β in Y = Xβ + ε).
- Flexible models are enlarged models: adaptive robust estimation.
- The skew-t score functions are modified estimating equations designed to deal with deviations from normality:
  - parameters have a clear meaning
  - inference can be based on the likelihood function
  - normality can be tested through hypotheses on the parameters
  - prediction intervals are available

Robustness properties
- Efficiency at the model: what is the loss of efficiency if the simpler model is the correct one?
- Reliability in a neighbourhood of the model: how do the skew-t MLEs compare with robust estimators?

Efficiency at the normal model, X ~ N(ξ, ω²)

Efficiency in the estimation of the location parameter under the normal model
n        Huber    St DP    St CP
50       0.95     0.089    0.945
100      0.9489   0.0193   0.9983
200      0.9591   0.013    0.9997
500      0.956    0.0069   1.0000
1000     0.9499   0.0046   1.0001
5000     0.9475   0.0016   1.0000
10000    0.9547   0.0010   1.0000

Efficiency in the estimation of the scale parameter (standard deviation) under the normal model
n        Huber    St DP    St CP
50       0.164    0.0835   0.7317
100      0.1604   0.0717   0.998
200      0.1648   0.0604   0.9985
500      0.1677   0.0455   0.9999
1000     0.1619   0.0359   1.0000
5000     0.1680   0.006    1.0001
10000    0.1649   0.0156   1.0000

Huber Proposal 2, 95% efficiency at the normal model (k = 1.35)

Efficiency of the skew-t MLEs when the underlying model is Skew Normal
Simulation results, n = 1000, Monte Carlo replications = 10,000

Direct parameters
      α = 0.5   α = 1.5   α = 3    α = 5    α = 10
ξ     1.1310    0.7569    0.8608   0.8996   0.9437
ω     1.0931    0.6141    0.615    0.5913   0.6395
α     1.1417    0.8469    0.9630   1.0019   1.060

Centred parameters
      α = 0.5   α = 1.5   α = 3    α = 5    α = 10
μ     1.0000    1.0004    0.9988   0.9868   0.9846
σ     0.9999    0.9975    0.984    0.97     0.941
γ₁    0.6774    0.8846    0.5387   0.1953   0.0337
γ₂    0.0648    0.145     0.0555   0.0140   0.00

Efficiency of the skew-t MLE when the underlying distribution is t
No loss of efficiency in the estimation of ω and ν.

Efficiency of the skew-t MLE of the location parameter ξ when the underlying distribution is t
ν     2       3        5        7        10       15       20       30       60
DP    0.136   0.0778   0.0363   0.010    0.0114   0.0055   0.0033   0.0015   0.0004
CP    0.980   0.5743   0.8110   0.8958   0.946    0.9751   0.9858   0.9936   0.9984

Robustness in regression models
Y = Xβ + ε. Parameter of interest: the regression coefficients. Estimators compared:
- Skew-t MLE (with and without fixed degrees of freedom)
- Least squares
- t MLE
- M-estimators (Huber's ψ, k = 1.345, scale = MAD):  ψ(u) = u min(1, k/|u|)
- MM-estimators (BDP = 0.5, efficiency = 95%), Tukey's biweight ψ(u) = u {1 − (u/k)²}² I(|u| ≤ k);
  step 1: S-estimates; step 2: M-estimates with fixed scale
Monte Carlo replications = 1000, n = 200.
The estimate of the intercept is adjusted so that the error has zero mean.
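A minimal sketch of regression with skew-t errors, fitted by maximum likelihood and compared with least squares on data contaminated as in Model 1 below (0.9 N(0,1) + 0.1 N(0,9)); the optimiser and starting values are illustrative choices, and the intercept adjustment used in the slides is omitted.

```python
# Linear regression with skew-t errors: MLE versus least squares on contaminated data.
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 200
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
eps = np.where(rng.random(n) < 0.9, rng.normal(0, 1, n), rng.normal(0, 3, n))
y = 5 + 1.5 * x + eps

def negloglik(theta):
    beta, log_omega, alpha, log_nu = theta[:2], theta[2], theta[3], theta[4]
    omega, nu = np.exp(log_omega), np.exp(log_nu)
    z = (y - X @ beta) / omega
    tau = np.sqrt((nu + 1) / (nu + z**2))
    return -np.sum(np.log(2) - np.log(omega) + stats.t.logpdf(z, nu)
                   + stats.t.logcdf(alpha * tau * z, nu + 1))

beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
start = np.concatenate([beta_ls, [np.log(np.std(y - X @ beta_ls)), 0.0, np.log(10.0)]])
fit = minimize(negloglik, start, method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
print("LS:     ", beta_ls)
print("skew-t: ", fit.x[:2])   # both roughly (5, 1.5); the skew-t fit downweights outliers
```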

Model 1
  Y = 5 + 1.5x + ε,   ε ~ 0.9 N(0, 1) + 0.1 N(0, 9)
[Plots of the error density and of a simulated sample with x against y.]

Root mean square errors of the estimators of the regression coefficients, Model 1
      St       t        LS       M        MM       St(ν=3)  St(ν=5)  St(ν=7)  St(ν=10)  St(ν=15)
β₁    0.1806   0.1786   0.038    0.1771   0.1760   0.1837   0.1791   0.1793   0.1810    0.1841
β₂    0.094    0.093    0.0337   0.090    0.089    0.097    0.091    0.091    0.093     0.098

Not substantially different from the robust estimators.

Models 2 to 4
Model 2:  Y = 5 + 1.5x + ε,   ε ~ 0.9 N(0, 1) + 0.1 N(0, 25)
Model 3:  Y = 5 + 1.5x + ε,   ε ~ 0.95 N(0, 1) + 0.05 N(8, 1)
Model 4:  Y = 5 + 2.5x + 0.7x² + ε,   ε ~ N(0, 1), with ε ~ N(8, 1) for the last 20 observations
[Plots of the error densities and of simulated samples for Models 2-4.]

Root mean square errors of the estimators of the regression coefficients

Model 2
      St       t        LS       M        MM       St(ν=3)  St(ν=5)  St(ν=7)  St(ν=10)  St(ν=15)
β₁    0.00     0.1841   0.771    0.187    0.1765   0.191    0.1889   0.194    0.1998    0.117
β₂    0.0303   0.030    0.0460   0.099    0.091    0.097    0.095    0.099    0.0308    0.033

Model 4
      St       t        LS       M        MM       St(ν=3)  St(ν=5)  St(ν=7)  St(ν=10)  St(ν=15)
β₁    1.08     0.9714   .0716    1.464    0.859    0.9088   1.0373   1.1616   1.858     1.3961
β₂    0.505    0.5931   1.4048   0.9877   0.199    0.689    0.3554   0.4186   0.483     0.54
β₃    0.091    0.0703   0.177    0.10     0.019    0.031    0.0419   0.0496   0.0574    0.0647

Model 3
      St       t        LS       M        MM       St(ν=3)  St(ν=5)  St(ν=7)  St(ν=10)  St(ν=15)
β₁    0.3913   0.186    0.5079   0.199    0.1645   0.3388   0.3337   0.3614   0.4066    0.465
β₂    0.097    0.0300   0.0517   0.091    0.071    0.09     0.09     0.099    0.0309    0.031

The skew-t MLEs are definitely better than least squares, but not as efficient as the MM-estimators.
Should the degrees of freedom be held fixed?