Computable error bounds for asymptotic expansions formulas of distributions related to gamma functions Hirofumi Wakaki (Math. of Department, Hiroshima Univ.) 20.7. Hiroshima Statistical Group Meeting at RERF
. Introduction Examples of moment formulas which consist of gamma functions 2. General formulas of asymptotic expansions Two types of asymptotic expansion formulas of distributions 3. Uniform error bounds 3.. Previous results Expansion formulas under a high dimensional framework 3.2. Error bounds for large sample approximation Some lemmas and theorems for deriving error bounds 3.3. Wilks lambda distribution (MANOVA test) An error bound for large sample approximation 3.4. Comparison of error bounds between two types of expansions Numerical examples 2
. Introduction. Testing equality of r covariance matrices X ij, i =,..., n j H 0 : Σ = = Σ r i.i.d. N p (µ j, Σ j ), j =,..., r The modified likelihood ratio test rejects H 0 for small values of r V = (det A j) n j/2, (det A) n/2 A j : the matrix of sums of squares and products formed from the j nth sample A = A + + A r Moment formula (Under H 0 ) E[V h Γ p [n/2] ] = Γ p [n( + h)/2] p Γ p [a] = π p(p )/4 3 r Γ p [n j ( + h)/2], Γ p [n j /2] Γ[a (j )/2]
2. The sphericity test X i, i =,..., n i.i.d. N p (µ, Σ) H 0 : Σ = λi p, where λ is unspecified The likelihood ratio text rejects H 0 for small values of V = A : ( det A ) p, tra p the matrix of sums of squares and products Moment formula (Under H 0 ) E[V h ] = p ph Γ[pn/2]Γ p[(n + 2h)/2] Γ[p(n + 2h)/2]Γ p [n/2] 4
3. MANOVA test Y i = BX i + ε i, i =,..., N B : p r, X i : r, ε i i =,..., N i.i.d. N p (0, Σ) H 0 : BC = O, C is a known matrix of rank q The likelihood ratio text rejects H 0 for small values of V = A : B : det(b) det(a + B) the matrix of sums of squares and products due to the hypothesis the matrix of sums of squares and products due to the error Moment formula (Under H 0 ) E[V h ] = Γ p[(n + 2h)/2]Γ p [(n + q)/2] Γ p [n/2]γ p [(n + q + 2h)/2], n = N r 5
2. General formulas of asymptotic expansion Classical large sample approximation Box (949, Biometrika) E[V h ] = K [ p y j y j q i= x i x i ] h q i= Γ[x i( + h) + ξ i ] p Γ[y j( + h) + η j ] p y j = q x k, x k = a k M, y j = b j M, M (n ) T = 2ρ log V, ψ T (t) = log E[e itt ] = f 2 log( 2it)+ s [ q f = 2 β k = ( ρ)x k = O(), ϵ j = ( ρ)y j = O() ξ k [ q γ k = ( )k+ k(k + ) i= p η j ], 2 (q p) B k+ (β i + ξ i ) (ρx i ) k 6 γ k M k {( 2it) k } +O(M (s+) ) p ] B k+ (ϵ j + η j ), (ρy j ) k
Expansion of the characteristic function ϕ T (t) = exp{ψ T (t)} = ϕ s (t) + O(M (s+) ), ϕ s (t) = ( 2it) f/2 { s + M k k m= m! l + +l m =k l i Asymptotic expansion of the distribution function m i= } γ li [( 2it) l i ] P(T x) = G s (x) + O(M (s+) ) x { } G s (x) = e itx ψ s (t)dt dx (inverting formula) 2π 7
High dimensional Edgeworth expansion The moment generating function of log V ϕ log V (t) := E[exp(it log V )] = E[V it ] [ p = K y ] j y it j q i= Γ[x i( + it) + ξ i ] q i= x i x i p Γ[y j( + it) + η j ] cumulants κ = E[log V ] = log κ k = p y j y j q i= x i x i q x i ψ (k ) (x i + ξ i ) i= where ψ (k) (a) = dk+ log Γ[a] dak+ + q x i ψ (0) (x i + ξ i ) i= p y j ψ (0) (y j + η j ) p y j ψ (k ) (y j + η j ) (k 2) 8
Formal Edgeworth expansion { ϕ log V (t) φ s (t) := e t2 /2 + γ k,j = κ (j+3k)/2 2 s + +s k =j s k! s k j=0 } γ k,j (it) 3k+j, κ s +3 κ sk +3 (s + 3)! (s k + 3)!, ( log V κ P κ2 ) x Q s (x) := Φ(x) ϕ(x) h j : the j th order Hermite polynomial s k! s k γ k,j h 3k+j (x), j=0 9
3. Uniform error bounds If we find a function such that ϕ log V (t) φ s (t) G(t) and G(t) dt is computable t Then ( sup log V P κ x κ2 ) x Q s (x) 2π ϕ log V (t) φ s (t) dt t G(t) dt t 0
3.. Previous results We have found computable error bounds of the following statistics under a framework MANOVA test p, n i, with p n i c i (0, ) The modified likelihood ratio test testing Σ = I p The likelihood ratio test testing the sphericity The modified likelihood ratio test testing Σ = Σ 2 = = Σ r
Numerical examples Error bounds for testing H 0 : Σ = Σ 2 in the case of s = 2 n = n 2 = 8 n = 8, n 2 = 36 p BOUND p BOUND 3 0.0037 (0.55) 3 0.0035 (0.65) 6 0.003 (0.55) 6 0.003 (0.65) 9 0.0009 (0.60) 9 0.000 (0.65) 2 0.000 (0.70) 2 0.002 (0.80) 5 0.0037 (0.95) 5 0.0097 (0.95) 2
Error bounds for testing H 0 : Σ = λi p in the case of s = 2 n = 30 n = 60 p BOUND p BOUND 5 0.47 (0.75) 0 0.0255 (0.55) 0 0.0276 (0.60) 20 0.003 (0.40) 5 0.0093 (0.60) 30 0.0008 (0.40) 20 0.0052 (0.65) 40 0.0004 (0.45) 25 0.0067 (0.90) 50 0.0004 (0.60) 3
Error bounds for testing H 0 : Σ = I p in the case of s = 2 n = 30 n = 60 p BOUND p BOUND 5 0.0886 (0.70) 0 0.086 (0.50) 0 0.096 (0.60) 20 0.0026 (0.40) 5 0.0074 (0.60) 30 0.0008 (0.40) 20 0.004 (0.65) 40 0.0004 (0.40) 25 0.0050 (0.85) 50 0.0004 (0.55) 4
3.2. Error bounds for large sample approximation ψ T (t) = where q p {g k (t) g k (0)} {h j (t) h j (0)}, g k (t) = 2itρx k log x k + log Γ[ρx k ( 2it) + β k + ξ k ], h j (t) = 2itρy j log y j + log Γ[ρy j ( 2it) + ϵ j + η j ], and β k = ( ρ)x k = O(), ϵ j = ( ρ)y j = O() Generalized Stirling s theorem due to Barnes (899): log Γ[x + h] = log 2π + (x + h 2 ) log x x s ( ) k B k+ (h) k(k + )x k + R s+ (x), rigorous representation of R s (x) is required. 5
Lemma (from the infinite product definition due to Euler) log Γ[z + a] = lim n { + (z + a ) log n n } {log k log(z + a + k)} Lemma generalized Euler Maclaurin s formula f(x) : complex valued function of real argument n G n (x) = f(x + l) s k=0 G (k) l= n (0) a k = k! + n n f(x)dx + s k=0 B k+ (a) (k + )! {f (k) (n) f (k) ()} B s+ (a) + ( ) s B s+ (x [x] a) f (s+) (x)dx (s + )! 6
Lemma n log(z + a + l) = l= + + s m= n n ( ) m+ B m+ (a) { m(m + ) C s+ (x; a) s + log(z + x)dx + B (a) log z + n z + (z + n) } m (z + ) m (z + x) s+ dx + R s,n(z, a), where C k (x; a) = B k (x [x] a) B k ( a), n ( a ) k R s,n (z, a) = k z + l l= k=s+ 7
Lemma g k (t) g k (0) = B (β k + ξ k ) log( 2it) s ( ) m+ B m+ (β k + ξ k ) { } + m(m + )(ρx k ) m ( 2it) s + m= + l=0 C s+ (x; β k + ξ k ) 0 s + { } (ρx k + x) dx s+ (ρx k ( 2it) + x) s+ { ( βk ξ ) m ( k β k ξ ) } m k m ρx k + l ρx k ( 2il) + l m=s+ 8
Theorem ψ T (t) = f s 2 log( 2it)+ [ q f = 2 ξ k m= p η j ], 2 (q p) [ q γ m = ( )m+ m(m + ) B m+ (β k + ξ k ) (ρx k ) m γ m M m { ( 2it) m } +Rem s, p ] B m+ (ϵ j + η j ), (ρy j ) m 9
Rem s = q + p q p 0 0 l=0 l=0 { C s+ (x; β k + ξ k ) s + (ρx k + x) s+ (ρx k ( 2it) + x) s+ { C s+ (x; ϵ j + η j ) s + (ρy j + x) s+ (ρy j ( 2it) + x) s+ { ( βk ξ ) m ( k β k ξ ) } m k m ρx k + l ρx k ( 2it) + l m=s+ m=s+ m { ( ϵj η j ρy j + l ) m ( ϵ j η ) } m j ρy j ( 2it) + l } dx } dx 20
3.3. Wilks lambda distribution (MANOVA test) Moment formula Λ = det{b(w + B) } Λ p (q, n) B W p (q, Σ), W W p (n, Σ) (n > p) E[Λ h ] = q Γ[ n p+j + h]γ[ n+j ] 2 2 Γ[ n p+j ]Γ[ n+j + h]. 2 2 2
High dimensional approximation (Tonda and Fujikoshi (2004), Wakaki (2007)) { } log Λ κ Pr x (κ 2 ) /2 { } κ3 = Φ(x) ϕ(x) 6 h 2(x) + + O(m (s+)/2 ), κ i : cumulants, κ i : standardized cumulant, m = n p 2 (κ 2 ) /2 2 n, (n p) 22
Error bound(wakaki(2007)) F s (x) : Edgeworth expansion with using (s + 2)-th cumulants Table : The error bounds for s = 2 and n = 50. p q error-bound 0 5 0.0272 20 5 0.00 30 5 0.0077 40 5 0.000 p q error-bound 0 0 0.0099 20 0 0.0035 30 0 0.0023 40 0 0.0029 23
Error bound for the large sample approximation Theorem where ψ T (t) log E[exp{ Mit log Λ}] = pq s 2 log( 2it) + γ k M { ( k 2it) k } q + Rem c p+j,s+ (it), γ k = ( 2)k q {B k+( c+j ) B 2 k+( c p+j )} 2, k(k + ) c = (p q + ) 2 24
where C s+ (x; p+a Rem a,s (it) = ) C 2 s+(x; a) 2 s + { ( M( 2it) + x ) 2 s+ ( ) k+ + k l= k=s+ {( ) (p + a) k a k {M( 2it) + 2l} k ( M 2 + x )s+ ( (p + a) k a k (M + 2l) k } dx )}, 25
Note k : odd γ k = 0 (Box (949) : γ = γ 3 = γ 5 = 0) ψ T (t) = pq s 2 log( 2it) + γ 2k M { ( 2k 2it) 2k } + q Rem c p+j,2s+2 (it), 26
The residual of the expansion of the characteristic function ψ(t) = ψ 0 (t) + s u= g u (t) = γ 2u { ( 2it) 2u }, q R s (t) = M 2s Rem c p+j,2s, g u (t) M + R s+(t) (s = 0,, ), 2u M 2(s+) ϕ T (t) = E[exp{ Mit log Λ}] = ϕ s (t) + ( 2it) pq/2 R p,q,s(t) M 2(m+) 27
ϕ s (t) = ( 2it) pq/2 { + s M 2j [ ] R p,q,s (t) = {R (t)} s+ R (t) E,s+ + R s+ + s k=2 + R s k+2 R k { k 2 k! j=0 M 2 l i : k j i= l i s j j ( k j i= k! }, E,s (x) = x s { e x l + +l k =j l i k g li }, i= g li )R s+ j li R j s } x k k! 28
Lemma q Rem c p+j,m (it) M K p,q,m(t), m R s (t) K p,q,2s (t) { } 2t K p,q,m (t) = min A p,q,m, B p,q,m + 4t 2 { } 2t + C p,q,m min, 2 + 4t 2 m + 29
A p,q,m = q { ( c + j ) (c + j) m+ L,m 2 M ( c p + j ) } (c p + j) m+ L,m, M q { ( c + j ) B p,q,m = (c + j) m+ L 2,m M ( c p + j ) } (c p + j) m+ L 2,m, M C p,q,m = 2m m max q ( C m+ x; c + j ) ( C m+ x; c p + j ), 0 x 2 2 L,m (x) = { m log( x) x m L 2,m (x) = x m+ } x k, k { m ( x) log( x) + x 30 } x k+. k(k + )
Theorem sup Pr{ M log Λ x} F s (x) x F s (x) = x { } e itx ϕ s (t)dt dx 2π 2πM 2(s+) [ ] G p,q,s (t) = {K p,q,2 (t)} s+ Kp,q,2 (t) E,s+ + K p,q,2s+2 (t) + s k=2 { k 2 k! j=0 l i : k j i= l i s j k j i= + K p,q,2(s k+2) (t)k p,q,2 (t) k }, M 2 G p,q,s (t) dt, t g li K p,q,2(s+ j li )(t)k p,q,2 (t) j 3
case of s = case of s = 2 F (x) = G f (x) + γ 2 M 2 {G f(x) G f+4 (x)}, G p,q, (t) = ( + 4t 2 ) pq/4 { K p,q,4 (t) + {K p,q,2 (t)} 2 E,2 [ Kp,q,2 (t) M 2 ]} F 2 (x) = G f (x) + γ 2 M 2 {G f(x) G f+4 (x)} + M 4 {( c c 2 )G f (x) + c G f+4 (x) + c 2 G f+8 (x)}. c = γ 2 ( + γ 2 ), c 2 = γ2 2 2 γ 4, γ 4 = pq(59 50p2 + 3p 4 50q 2 + 0p 2 q 2 + 3q 4 ), 920 G p,q,2 (t) = ( + 4t 2 ) pq/4 {K p,q,6 (t)+ γ 2 2 ( 2it) 2 K p,q,4 (t) + {K p,q,2 (t)} 3 E,3 [ Kp,q,2 (t) M 2 ]}. 32
3.4. Comparison of error bounds between two types of expansions Table 2: n = 50.HD(High dimension)(s = 2), LS(Large sample)(s = ) p q HD LS 0 5 0.0272 0.000783 20 5 0.00 0.070 30 5 0.0077 0.25 40 5 0.000 > p q HD LS 0 0 0.0099 0.00235 20 0 0.0035 0.0357 30 0 0.0023 0.57 40 0 0.0029 > 33
Table 3: n = 50, p = 20, q = 5 s HD 0 0.0406 0.08 2 0.00 3 0.0078 4 0.006 s LS 0 0.057 0.070 2 0.0087 34
End of this talk. Thank you. 35