Lecture 7: Overdispersion in Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0
Overview Introduction Modeling overdispersion through mixing Score test for detecting overdispersion c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 1
Introduction McCullagh and Nelder (1989) p.198-200 Overdispersion is present if V ar(y i ) > E(Y i ) Overdispersion can be modeled using a mixing approach: Y i Z i P oisson(z i ) Z i 0 ind. rv s with E(Z i ) = µ i c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 2
Distributional assumption for Z i 1) Z i Gamma with mean µ i and index φµ i Recall X Gamma(µ, ν) f X (x) = 1 Γ(ν) µ mean ν index E(X) = µ V ar(x) = µ 2 /ν ( ) ν ( νx µ exp νx µ ) 1 x Here f Zi (z i ) = 1 Γ(φµ i ) ( ) φµi ( ) φµi z i µ i exp φµ iz i µ i 1 z i E(Z i ) = µ i V ar(z i ) = µ2 i φµ i = µ i /φ c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 3
P (Y i = y i ) = 0 e z i(z i ) y i y i! 1 Γ(φµ i ) ( φµi z i µ i ) φµi exp{ φzi } 1 z i dz i = φ φµ i Γ(φµ i )y i! 0 e zi(1+φ) z (y i+φµ i 1) i dz i s i =z i (1+φ) = φ φµ i Γ(φµ i )y i! 0 e s i 1 s (y i+φµ i 1) 1 (1+φ) (y i +φµ i 1) i φ+1 ds i = = φ φµ i Γ(φµ i )y i!(1+φ) (y i +φµ i ) 0 e s is (y i+φµ i 1) i ds i φ φµ iγ(y i +φµ i ) Γ(φµ i )y i!(1+φ) (y i +φµ i ) = Γ(y i+φµ i ) Γ(φµ i )Γ(y i +1) ( ) φµ φ i ( ) 1 1 + φ 1 + φ }{{}}{{} 1 1/φ 1/φ+1 1/φ+1 y i c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 4
Negative binomial distribution X negbin(a, b) P (X = k) = Γ(a+k) Γ(a)Γ(k+1) ( 1 1+b ) a ( b 1+b) k k = 0, 1,... a, b 0 E(X) = ab V ar(x) = ab(1 + b) c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 5
is given by Marginal distribution of Y i Y i negbin(a i, b i ) with a i = φµ i b i = 1/φ E(Y i ) = φµ i φ = µ i und V ar(y i ) = φµ i φ (1 + 1/φ) = µ i(1 + 1/φ) Remarks: - V ar(y i ) > E(Y i ) if 1/φ > 0, i.e. overdispersion - φ = no overdispersion - To estimate β and φ jointly one needs to maximize the negative binomial likelihood. No standard software available but one can use the Splus library negbin() from http : //lib.stat.cmu.edu/s/ - V ar(y i ) = µ i (1 + 1/φ) is linear in µ i c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 6
2) Z i Gamma with mean µ i and index ν (exercise) Y i negbin(ν, µ i /ν) such that E(Y i ) = ν µ i /ν = µ i V ar(y i ) = µ i + µ 2 i /ν quadratic variance function in µ i c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 7
Multiplicative heterogeneity in Poisson regression Another approach for modeling overdispersion is to use Y i Z i P oisson(µ i Z i ) with E(Z i ) = 1 and V ar(z i ) = σ 2 Z, i.e. Z i i.i.d., Z i is called multiplicative random effect (exercise) E(Y i ) = µ i V ar(y i ) = µ i + σ 2 Z µ2 i If Z i Gamma with expectation 1 and index ν Y i is negbin(a i, b i ) a i = ν, b i = µ i ν c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 8
Summary 1) Y i Z i P oisson(z i ) E(Z i ) = µ i a) b) Z i Gamma(µ i, φµ i ) Y i Negbin(φµ i, 1/φ) E(Y i ) = µ i V ar(y i ) = µ i (1 + 1/φ) linear Z i Gamma(µ i, ν) Y i Negbin(ν, µ i /ν) E(Y i ) = µ i V ar(y i ) = µ i + µ 2 i /ν quadratic 2) Y i Z i P oisson(µ i Z i ) E(Z i ) = 1 Z i Gamma(1, ν) Y i Negbin(ν, µ i /ν) E(Y i ) = µ i V ar(y i ) = µ i + µ 2 i /ν quadratic c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 9
Test for overdispersion Dean (1992) Assume Y i P oisson(µ i ) with µ i = e x i t β θ i = ln(µ i ) = x t i β To model overdispersion we assume that the canonical parameters θ i are not fixed but random quantities θ i with E(θ i ) = θ i V ar(θ i ) = τk i(θ i ) > 0 for τ 0 and k i (θ i ) differentiable c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 10
To test for overdispersion, want to test H 0 : τ = 0 versus H 1 : τ > 0 Example: θi = xt i β + Z i with Z i i.i.d. E(Z i ) = 0 V ar(z i ) = τ < E(θi ) = θ i V ar(θi ) = τ i.e. k i(θ i ) = 1 Compare to: Y i Z i P oisson(µ i Z i ) E(Z i ) = 1 θ i = ln(µ iz i ) = ln(µ i ) + ln(z i ) E(ln(Z i )) 0 A score test will now be developed for H 0 : τ = 0 versus H 1 : τ > 0. c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 11
General score test Let Y 1,..., Y n ind. r.v. with densities f i (y, θ) θ Ω and H 0 : θ Ω 0 H 1 : θ Ω 1 Ω 0 + Ω 1 = Ω R d Ω 0 Ω1 = Loglikelihood: I(θ) := E l(θ) = n i=1 ( [ l(θ) θ l i (θ) l i (θ) = ln f i (y i ; θ) ] [ ] ) t l(θ) θ R d d Fisher information = E ( ) l 2 (θ) θ θ t under regularity conditions c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 12
We are interested in the following composite hypothesis for θ = (β t, τ) t : H 0 : τ = τ 0 versus H 1 : τ τ 0 Consider the score statistic T s = [ ] t θ l(θ) θ=ˆθ0 I 1 (ˆθ 0 ) [ ] θ l(θ) θ=ˆθ0 where ˆθ 0 is the MLE of θ in model θ = (β t, τ 0 ) t, i.e. ˆθ 0 = (ˆβ t, ˆτ 0 ) t satisfies θ l(θ) = ( l(β, τ) t, β ) t l(β, τ) τ θ=ˆθ0 = 0 Under regularity conditions we have T S χ 2 dimω dimω 0 under H 0. c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 13
Remark: There are regularity condition under which the standard asymptotic remains valid if τ 0 is at the boundary of Ω, as for example at H 0 : τ = 0 versus H 1 : τ > 0. Here: θ = (β t, τ) t For τ = 0 we have a Poisson GLM and one can show that ˆθ 0 = (ˆβ 0t, 0) t where ˆβ 0 is the MLE of the Poisson GLM. c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 14
Score test for overdispersion in count regression Y i θ i P oisson(eθ i ) with E(θ i ) = xt i β and V ar(θ i ) = τk i(θ i ) Let f(y i, θi ) be the conditional density of Y i given θi, then with Taylor we have f(y i, θ i ) = f(y i, θ i ) + θ i f(y i, θ i ) θ i =θ i (θ i θ i) + 1 2 2 θi 2 R = remainder term f(y i, θ i ) θ i =θ i (θ i θ i) 2 + R c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 15
f M (Y i, θ i ) = marginal likelihood of Y i = f(y i, θ i )f(θ i )dθ i = E θ i (f(y i, θ i )) = f(y i, θ i ) + 0 + 1 2 2 θi 2 [ = f(y i, θ i ) 1 + 1 2 τk i(θ i )f 1 (Y i, θ i ) 2 f(y i, θi ) θi =θ i E[(θ i θ i ) 2 ] +R }{{} V ar(θi )=τk i(θ i ) θi 2 ] f(y i, θi ) θi =θ + Rf 1 (Y i i, θ i ) c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 16
Marginal log likelihood l(θ) = n i=1 l i (θ i ) = n i=1 = n [ln f(y i, θ i ) i=1 + ln l(θ) θ ln f M (Y i, θ i ) { 1 + 1 2 τk i(θ i ) f 1 (Y i, θ i ) 2 θ=( ˆβ0t,0) t = l(β,τ) β l(β,τ) τ θi 2 β= ˆβ0,τ=0 β= ˆβ0,τ=0 } f(y i, θi ) θi =θ + Rf 1 (Y i i, θ i ) ] = 0 l(β,τ) τ β= ˆβ0,τ=0 θ i = x t i β ˆθ 0 i := x i tˆβ0 c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 17
We assume E((θ i θ i) r ) =: d r with d r = o(τ) for r 3 l(β,τ) τ β= ˆβ0,τ=0 n i=1 1 2 k i(θ i ) f 1 (Y i,θ i ) 2 θ i 2 1 + 1 2 τ k i(θ i ) f 1 (Y i,θ i ) 2 f(y i,θ i ) θ i =θ i θ i 2 f(y i,θi ) θ i =θ i β= ˆβ0,τ=0 = n i=1 1 2 k i(ˆθ i 0 ) f 1 (Y i, ˆθ i 0 ) θi 2 f(y i, θi ) θ i =ˆθ i 0 }{{} 2 =:T i (ˆθ 0 i ) (denominator = 1, since τ=0) c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 18
Additionally I(ˆθ 0 ) = T s = ( ) Iββ I βτ I βτ I ττ [ ˆθ 0 = (ˆβ 0t, 0) t θ l(θ) θ=( ˆβ0,0)] t I 1 (ˆβ 0, 0) [ ] θ l(θ) θ=( ˆβ0,0) = ( l(β,τ) τ ) 2 ( ) β= ˆβ0,τ=0 I 1 (ˆβ 0, 0) ττ = [ n i=1 ] 2 ) T i (ˆθ i (I 0) 1 (ˆβ 0, 0) ττ since components corresponding to β are zero. Further I 1 (ˆβ 0, 0) = (I ττ I t βτ I 1 ββ I βτ) 1 (partioned matrixes) T S = [ n i=1 T i (ˆθ 0 i ) ] 2 / V 2, V 2 = (I ττ I t βτ I 1 ββ I βτ) c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 19
Overdispersion test Reject H 0 : τ = 0 versus H 1 : τ > 0 at level α T S > χ 2 1,1 α c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 20
Ex.: Poisson model with random effects θ i = xt i β + Z i E(Z i ) = 0 V ar(z i ) = τ Y i θ i P oisson(eθ i ) θi = ln µ i = x t i β E(Y i ) = E θ i (E(Y i θ i )) = E θ i (eθ i ) Eθ i (1 + θ i ) = 1 + θ i = 1 + ln µ i 1 + µ i 1 = µ i V ar(y i ) = E θ i (V ar(y i θi )) + V ar }{{} θ i (E(Y i θi )) }{{} E(Y i θi ) = E(Y i ) + (e θ i) 2 V ar(e Z i ) }{{} V ar(1+z i )=V ar(z i )=τ µ i + µ 2 i τ can model overdispersion e θ i c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 21
Overdispersion test in Poisson model with random effect Model: Y i θ i P oisson(eθ i ) θ i = x t i β + Z i E(Z i ) = 0 V ar(z i ) = σ 2 Score test: Reject H 0 : τ = 0 versus H 1 : τ > 0 n i=1 T i (ˆθ i 0 ) V = 1 2 n [(Y i ˆµ 0 i )2 ˆµ 0 i] n i=1 1 2 i=1 ˆµ 02 i > Z 1 α/2 c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 22
References Dean, C. (1992). Testing for overdispersion in Poisson and binomial regression models. JASA 87(418), 451 457. McCullagh, P. and J. Nelder (1989). Generalized linear models. Chapman & Hall. c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 23