2012 10 Chinese Journal of Applied Probability and Statistics Vol.28 No.5 Oct. 2012 (,, 675000) Poisson,,, Gibbs, BIC.,. :,, Gibbs, BIC. : O212.8. 1. (count data), Poisson Poisson., (zeroinflation).,.,, ;,,.,, Fahrmeir Echavarría (2006) ; Yip Yau (2005) ; Xie (2009, 2008) Poisson Score ; Ghosh (2006).,.,,,,. [6]-[10],,,. (10BTJ001) (11171105). 2012 4 9, 2012 5 2.
552 Lee (2006),,, BIC,. : ; ; ;. 2., Poisson (Zero-Inflated Poisson Distribution), ZIP. Y ZIP(φ, λ), φ + (1 φ) exp( λ), y = 0; P {Y = y} = (1 φ) exp( λ)λ y /y!, y > 0, 0 < φ < 1,, φ = 0, ZIP Poisson. (2.1),, Y, y, i j, x, z, i = 1, 2,..., m, j = 1, 2,..., n i, N = m n i, Y ZIP(φ, λ ),, i=1 φ λ : ξ = log(λ ) = x T β + u i; ( φ ) η = log = z T 1 φ γ + v i, (2.2) β γ x z, u i v i,, u i v i, 0, σ 2 u, σ 2 v.,,, Lee (2006),, ZIP ξ = log(λ ) = β 0i + x T β 1i + ε (1), (2.3) β 0i = α 00 + wi T α 01 + u 0i, β 1i = α 10 + wi T α 11 + u 1i, (2.4) η = logit(φ ) = γ 0i + z T γ 1i + ε (2), (2.5)
: 553 γ 0i = δ 00 + wi T δ 01 + v 0i, γ 1i = δ 10 + wi T δ 11 + v 1i, (2.6) (2.3) (2.5), β 0i, γ 0i β 1i, γ 1i, ε (1), ε(2), N(0, σ2 1 ) N(0, σ2 2 ), (2.4) (2.6),, wi T = (w 1i,..., w pi ),,, u T i = (u 0i, u 1i ), v T i = (v 0i, v 1i ), u i N(0, Σ u ), v i N(0, Σ v ), Σ u Σ v. δ T 1 α = (α T 0, αt 1 ), αt 0 = (α 00, α 01 ), α T 1 = (α 10, α 11 ); δ = (δ T 0, δt 1 ), δt 0 = (δ 00, δ 01 ), = (δ 10, δ 11 ), (2.3)-(2.6) : ξ = Xβ + ε (1) ; η = Zγ + ε (2), β = W α + u; γ = W δ + v, ξ, η (2.3) (2.5),. 3. (2.7),,.,. 3.1 Ghosh (2006), Y Y = D (1 B ), B φ, D λ Poisson, Y (D, B ), (D, B ). Y = y, (D, B ) : y > 0, P (D = y, B = 0 Y = y ) = 1, y > 0, φ P (D = d ) P (D = d, B = 1 Y = y ) = φ + (1 φ )P (D = 0), (1 φ )P (D = 0) P (D = 0, B = 0 Y = y ) = φ + (1 φ )P (D = 0),
554 {Y = y : i = 1,..., m, j = 1,..., n i }, {D, B }, (D, B). 3.2 θ = (α, δ, σ 2, Σ), α, δ, σ 2 = (σ 2 1, σ2 2 ), Σ = (Σ u, Σ v ),,, π(θ) = π(α)π(δ)π(σ 2 )π(σ), π(α) N(0, Ω α ), π(δ) N(0, Ω δ ), Ω α = diag(σ 2 α), Ω δ = diag(σ 2 δ ), σ2 α σ 2 δ. π(σ2 ) Gamma, π(σ) Wishart, σ 2 1 Gamma(a 1, b 1 ), σ 2 2 Gamma(a 2, b 2 ), Σ 1 u Wishart(ρ, (ρr u ) 1 ), Σ 1 v Wishart(ρ, (ρr v ) 1 ), a 1, b 1, a 2, b 2, ρ, R u, R v.,, σ 2 α = σ 2 δ = 1000, a i = b i = 0.001 (i = 1, 2), R u = R v = I, I, ρ = 2. 3.3 Gibbs M-H, Gibbs : : (θ (0), u (0), v (0), ε (0) ), η (0) (2.5). φ (0) ) = exp(η(0) 1 + exp(η (0) ), : (θ (t), u (t), v (t), ε (t) ), (2.3) (2.5) φ (t) {y : i = 1,..., m, j = 1,..., n i }, (D (0), B(0) ) : D (t) y = 0, B (t) Poisson(λ (t) ); y > 0, B (t) = 0 D (t) = y. Bernoulli(φ (t) ), B(t) = 0, D (t) λ(t),, = 0;,
: 555 : (θ (t), u (t), v (t), ε (t) ) (D (t), B (t) ),, { P ((ε (1) ) (t+1) 1 } rest) exp 2(σ1 2)(t) (ξ(t) Xβ (t) ) T (ξ (t) Xβ (t) ), rest ε (1) θ u, v., { m P (β (t+1) rest) exp n i [D (t) i=1 j=1 ξ(t) exp(ξ(t) )] 1 2 (β(t) W α (t) ) T [Σ (t) u ] 1 (β (t) W α (t) ) P (u (t+1) rest) N(β (t) W α (t), Σ (t) u ), N ( P ((σ1 2 )(t+1) rest) Gamma( 2 + a 1, b 1 1 + 1 ) 1 ) 2 (ε(1) ) (t), ( ( m ) P (Σ (t+1) u rest) Wishart m + ρ, u (t) 1 i u (t)t i + ρr u ). { P ((ε (2) ) (t+1) 1 } rest) exp 2(σ2 2)(t) (η(t) Zγ (t) ) T (η (t) Zγ (t) ), { m P (γ (t+1) n i rest) exp [B (t) η(t) exp(η(t) )] i=1 j=1 i=1 1 2 (γ(t) W δ (t) ) T [Σ (t) v ] 1 (γ (t) W δ (t) ) P (v (t+1) rest) N(γ (t) W δ (t), Σ (t) v ), N ( P ((σ2 2 )(t+1) rest) Gamma( 2 + a 2, b 1 2 + 1 ) 1 ) 2 (ε(2) ) (t), ( ( m ) P (Σ (t+1) v rest) Wishart m + ρ, v (t) 1 i v (t)t i + ρr v )., P (ε (1) rest), P (ε (2) rest), P (β rest), P (γ rest), M-H, Gelman Rubin (1992),, α (t+1) δ (t+1) (2.7),, (θ (t+1), u (t+1), v (t+1), ε (t+1) ). :,. i=1 }, }, 3.4, Gelman & Rubin ( Gelman Rubin (1992)),,, Gibbs Bayes,,
556 {(θ (j), u (j), v (j), ε (j) ) : j = 1,..., J} ( Geyer (1992)), θ = 1 T T t=1 θ (t), Σu = 1 T t = 1,..., T. T t=1 u (t) u (t)t, Σv = 1 T,. 4.1 4. T v (t) v (t)t,, Johnson (2004), G( θ) = K [m k ( θ) Np k ] 2 /Np k, k=1 K, N, p k = 1/K, m k ( θ) k, θ. m k ( θ) : : {y : i = 1,..., m, j = 1,..., n i }, θ F (y θ), y [φ + (1 φ )e λ ] I(l=0)[ (1 φ )e λ λl l! l=0 t=1 ] I(l>0). : {y : i = 1,..., m, j = 1,..., n i }, u : y = 0, u U(0, F (0 θ)); y > 0, u U(F (y 1 θ), F (y θ)). : { k 1 m k ( θ) = K < u < k }, K i = 1,..., m, j = 1,..., n i, k = 1,..., K., K N 0.4., p-, P[χ 2 K 1 G( θ)], p- (K 1) χ 2, p-. 4.2 BIC : BIC = 2 log l( θ Y ) + log N Par,
: 557 log l( θ Y ) θ, Par. BIC,,., ZIP, ZIP, ZIP, Poisson (Hurdle Poisson Model), p- BIC. 5. McCullagh Nelder (1989),. 5.1 Lloyd 34 5,,,,..,, ZIP :, [y φ, λ ] ZIP(φ, λ ) log(λ ) = β 0i + β 1i t + ε (1) logit(φ ) = γ 0i + γ 1i t + ε (2), β mi = α m0 + 5 k=1 γ mi = δ m0 + 5 k=1 i = 1,..., 34, j = 1, 2,..., n i. x (1) ki α(1) mk + 4 x (2) li α (2) ml + 2 x (3) si α(3) ms + u mi l=1 s=1 m = 0, 1. x (1) ki δ(1) mk + 4 x (2) li δ (2) ml + 2 x (3) si δ(3) ms + v mi l=1 s=1 y i 5 j, t log, x (1) ki, x(2) li, x (3) si (k = 1,..., 5, l = 1,..., 4, s = 1, 2) 0 1, i (A, B, C, D, E), (Y 1,..., Y 4 ) (S 1, S 2 ). 5.2 ZIP, ZIP, ZIP, Poisson ( HPM), BIC, 1.
558 1 p- BIC ZIP ZIP ZIP HPM p- 0.673 0.198 0.519 0.028 BIC 12443.17 13209.08 13141.26 14532.83 1, ZIP p- BIC,, ZIP, ZIP HPM. ZIP 2. 2 ZIP Poisson logistic 2.116 0.180-3.472 0.163-1.565 0.214-10112 0.113 A 0.869 0.144 0.817 0.192 0.382 0.057 1.047 0.236 B -0.289 0.125 0.370 0.038-0.218 0.013 0.403 0.019 C -1.360 0.140 1.313 0.096 0.378 0.049 0.951 0.045 D -1.195 0.141-0.525 0.040 0.384 0.092 1.462 0.131 E 0.367 0.121 0.457 0.095-0.457 0.074 1.471 0.097 Y 1 1.504 0.129 1.462 0.269-0.767 0.165 0.645 0.042 Y 2 0.749 0.182 1.741 0.214 1.535 0.103 0.417 0.051 Y 3 1.623 0.142 0.658 0.254 0.292 0.083 0.499 0.156 Y 4-0.292 0.082 1.316 0.169-0.901 0.125 1.242 0.069 S 1 2.793 0.181 0.535 0.013-0.862 0.043 0.939 0.171 S 2-1.676 0.201 0.471 0.025 0.832 0.033 1.432 0.149 5.3 2, 0.2624 (φ 10,3 ) 0.1035 (φ 24,2 ), 0,,,. 2, 2 A, Y 2, S 1,, Poisson
: 559 β 02 = 2.116 + 0.869x (1) 0.749x (2) + 2.793x (3) = 5.029, β 12 = 3.472 + 0.817x (1) + 1.741x (2) + 0.471x (3) = 0.443. logist γ 02 = 1.721, γ 12 = 0.341, [ 1. 1 1 + exp( 1.721 + 0.341 1.2) ] exp(5.029 0.443 1.2) = 19.051. [1] Fahrmeir, L. and Echavarría, L.O., Structured additive regression for overdispersed and zero-inflated count data, Applied Stochastic Models in Business and Industry, 22(4)(2006), 351 369. [2] Yip, K.C.H. and Yau, K.K.W., On modeling claim frequency data in general insurance with extra zeros, Insurance: Mathematics and Economics, 36(2)(2005), 153 163. [3] Xie, F.C., Wei, B.C. and Lin, J.G., Score test for zero-inflated generalized Poisson mixed regression models, Computational Statistics & Data Analysis, 53(9)(2009), 3478 3489. [4] Xie, F.C., Wei, B.C. and Lin, J.G., Assessing influence for pharmaceutical data in zero-inflated generalized Poisson mixed models, Statistics in Medicine, 27(8)(2008), 3656 3673. [5] Ghosh, S.K., Mukhopadhyay, P. and Lu, J.C., Bayesian analysis of zero-inflated regression models, Journal of Statistical Planning and Inference, 136(4)(2006), 1360 1375. [6] Afshartous, D. and Michailidis, G., Distributed multilevel modeling, Journal of Computational and Graphical Statistics, 16(4)(2007), 901 924. [7] Gelman, A., Multilevel (hierarchical) modeling, Technometrics, 48(3)(2006), 432 435. [8] Crainiceanu, C.M., Staicu, A.M. and Di, C.Z., Generalized multilevel functional regression, Journal of the American Statistics Association, 104(488)(2009), 1550 1561. [9] Dunson, D.B., Bayesian nonparametric hierarchical modeling, Biometrical Journal, 51(2)(2009), 273 284. [10] Di, C.Z. and Roche, K.B., Multilevel latent class models with dirichlet mixing distribution, Biometrics, 67(1)(2011), 86-96. [11] Lee, A.H., Wang, K., Scott, J.A., Yau, K.K.W. and Mclachlan, G.J., Multi-level zero-inflated Poisson regression modeling of correlated count data with excess zeros, Statistical Methods in Medical research, 15(1)(2006), 47 61. [12] Gelman, A. and Rubin, G.O., Inference from iterative simulation using multiple sequences, Statistical Science, 7(4)(1992), 457 472. [13] Geyer, C.J., Practical Markov chain Monte Carlo, Statistical Science, 7(4)(1992), 473 483. [14] Johnson, V.E., A Bayesian χ 2 test for goodness-of-fit, The Annals of Statistics, 32(6)(2004), 2361 2384.
560 [15] McCullagh, P. and Nelder, J.A., Generalized Linear Models (2nd Edition), Chapman and Hall, 1989. Bayesian Inference of Hierarchical Regression Model for Zero-Inflated Clustered Count Data Shi Hongxing (School of Primary Education, Chuxiong Normal University, Chuxiong, 675000 ) Zero-inflated Poisson (ZIP) regression model is a popular tool for analyzing count data with excess zeros. In this paper, a flexible hierarchical ZIP regression model is proposed to handle with such data with cluster and Bayesian approach is develop. A Gibbs sampler is employed to produce the Bayesian estimate, a goodness-of-fit and a Bayesian information criterion (BIC) are used for model comparison and selection. Finally, an application of data from a ship damage incident study illustrates the proposed method. Keywords: Zero-inflation, hierarchical regression model, Gibbs sampler, BIC. AMS Subject Classification: 62F15.