The Annals of Statistics
2004, Vol. 32, No. 2, 784–817
© Institute of Mathematical Statistics, 2004

SUFFICIENT BURN-IN FOR GIBBS SAMPLERS FOR A HIERARCHICAL RANDOM EFFECTS MODEL

BY GALIN L. JONES AND JAMES P. HOBERT
University of Minnesota and University of Florida

We consider Gibbs and block Gibbs samplers for a Bayesian hierarchical version of the one-way random effects model. Drift and minorization conditions are established for the underlying Markov chains. The drift and minorization are used in conjunction with results from J. S. Rosenthal [J. Amer. Statist. Assoc. 90 (1995) 558–566] and G. O. Roberts and R. L. Tweedie [Stochastic Process. Appl. 80 (1999) 211–229] to construct analytical upper bounds on the distance to stationarity. These lead to upper bounds on the amount of burn-in that is required to get the chain within a prespecified (total variation) distance of the stationary distribution. The results are illustrated with a numerical example.

1. Introduction. We consider a Bayesian hierarchical version of the standard normal theory one-way random effects model. The posterior density for this model is intractable in the sense that the integrals required for making inferences cannot be computed in closed form. Hobert and Geyer (1998) analyzed a Gibbs sampler and a block Gibbs sampler for this problem and showed that the Markov chains underlying these algorithms converge to the stationary (i.e., posterior) distribution at a geometric rate. However, Hobert and Geyer stopped short of constructing analytical upper bounds on the total variation distance to stationarity. In this article, we construct such upper bounds and this leads to a method for determining a sufficient burn-in. Our results are useful from a practical standpoint because they obviate troublesome, ad hoc convergence diagnostics [Cowles and Carlin (1996) and Cowles, Roberts and Rosenthal (1999)]. More important, however, we believe that this is the first analysis of a practically relevant Gibbs sampler on a continuous state space that provides viable burn-ins. By practically relevant, we mean that the stationary distribution is complex enough that independent and identically distributed (i.i.d.) sampling is not straightforward. We note that the Gibbs samplers analyzed by Hobert (2001) and Rosenthal (1995a, 1996) are not practically relevant since i.i.d. samples can be drawn from the corresponding stationary distributions using simple, sequential sampling schemes [Jones (2001) and Marchev and Hobert (2004)]. Some notation is now introduced that will allow for a more detailed overview.

Received January 2002; revised December 2002.
Supported in part by NSF Grant DMS-00-72827.
AMS 2000 subject classifications. Primary 60J10; secondary 62F15.
Key words and phrases. Block Gibbs sampler, burn-in, convergence rate, drift condition, geometric ergodicity, Markov chain, minorization condition, Monte Carlo, total variation distance.

Let X = {X_i, i = 0, 1, ...} be a discrete time, time homogeneous Markov chain that is irreducible, aperiodic and positive Harris recurrent. Let P^n(x, ·) be the probability measure corresponding to the random variable X_n conditional on starting the chain at X_0 = x; that is, P^n is the n-step Markov transition kernel. Let π(·) be the invariant probability measure of the chain and let ‖·‖ denote the total variation norm. Formally, the issue of burn-in can be described as follows. Given a starting value x_0 and an arbitrary ε > 0, can we find an n* = n*(x_0, ε) such that

(1)  ‖P^{n*}(x_0, ·) − π(·)‖ < ε?

If the answer is yes, then, since the left-hand side of (1) is nonincreasing in the number of iterations, the distribution of X_k is within ε of π for all k ≥ n*. Because we are not demanding that n* be the smallest value for which (1) holds, it is possible that the chain actually gets within ε of stationarity in much fewer than n* iterations. For this reason, we call n* a sufficient burn-in. Several authors [see, e.g., Meyn and Tweedie (1994), Rosenthal (1995a), Cowles and Rosenthal (1998), Roberts and Tweedie (1999) and Douc, Moulines and Rosenthal (2002)] have recently provided results that allow one to calculate n* when X is geometrically ergodic. However, to use these results one must establish both a drift condition and an associated minorization condition for X. [For an accessible treatment of these concepts, see Jones and Hobert (2001).] In this article we establish drift and minorization for the Gibbs samplers analyzed by Hobert and Geyer (1998). These conditions are used in conjunction with the theorems of Rosenthal (1995a) and Roberts and Tweedie (1999) to construct formulas that can be used to calculate n*.

The rest of the article is organized as follows. The model and algorithms are described in Section 2. Section 3 contains important background material on general state space Markov chain theory as well as statements of the theorems of Rosenthal (1995a) and Roberts and Tweedie (1999). This section also contains a new conversion lemma that provides a connection between the two different types of drift used in these theorems. We establish drift and minorization for the block Gibbs sampler in Section 4 and the same is done for the Gibbs sampler in Section 5. In Section 6 the results are illustrated and Rosenthal's theorem is compared with the theorem of Roberts and Tweedie. Section 7 contains some concluding remarks.

2. The model and the Gibbs samplers. Consider the following Bayesian version of the standard normal theory one-way random effects model. First, conditional on θ = (θ_1, ..., θ_K)^T and λ_e the data Y_ij are independent with Y_ij | θ, λ_e ~ N(θ_i, λ_e^{-1}), where i = 1, ..., K and j = 1, ..., m_i. At the second stage, conditional on µ and λ_θ, the θ_1, ..., θ_K and λ_e are independent with θ_i | µ, λ_θ ~ N(µ, λ_θ^{-1}) and λ_e ~ Gamma(a_2, b_2),

where a_2 and b_2 are known positive constants. [We say W ~ Gamma(α, β) if its density is proportional to w^{α−1} e^{−wβ} I(w > 0).] Finally, at the third stage µ and λ_θ are assumed independent with µ ~ N(m_0, s_0^{-1}) and λ_θ ~ Gamma(a_1, b_1), where m_0, s_0, a_1 and b_1 are known constants; all but m_0 are assumed to be positive so that all of the priors are proper. The posterior density of this hierarchical model is characterized by

(2)  π_h(θ, µ, λ | y) ∝ f(y | θ, λ_e) f(θ | µ, λ_θ) f(λ_e) f(µ) f(λ_θ),

where λ = (λ_θ, λ_e)^T, y is a vector containing all of the data, and f denotes a generic density. [We will often abuse notation and use π_h to denote the probability distribution associated with the density in (2).] Expectations with respect to π_h are typically ratios of intractable integrals, the numerators of which can have dimension as high as K + 3 [Jones and Hobert (2001)]. Thus, to make inferences using π_h, we must resort to (possibly) high dimensional numerical integration, analytical approximations or Monte Carlo and Markov chain Monte Carlo techniques. In their seminal article on the Gibbs sampler, Gelfand and Smith (1990) used the balanced version of this model (in which m_i ≡ m) as an example. [See also Gelfand, Hills, Racine-Poon and Smith (1990) and Rosenthal (1995b).]

Each iteration of the standard, fixed-scan Gibbs sampler consists of updating all of the K + 3 variables in the same predetermined order. The full conditionals required for this Gibbs sampler are now reported. Define

v_1(θ, µ) = Σ_{i=1}^K (θ_i − µ)²,  v_2(θ) = Σ_{i=1}^K m_i (θ_i − ȳ_i)²  and  SSE = Σ_{i,j} (y_ij − ȳ_i)²,

where ȳ_i = m_i^{-1} Σ_{j=1}^{m_i} y_ij. The full conditionals for the variance components are

(3)  λ_θ | θ, µ, λ_e, y ~ Gamma( K/2 + a_1, v_1(θ, µ)/2 + b_1 )

and

(4)  λ_e | θ, µ, λ_θ, y ~ Gamma( M/2 + a_2, [v_2(θ) + SSE]/2 + b_2 ),

where M = Σ_i m_i. Letting θ_{−i} = (θ_1, ..., θ_{i−1}, θ_{i+1}, ..., θ_K)^T and θ̄ = K^{-1} Σ_i θ_i, the remaining full conditionals are

θ_i | θ_{−i}, µ, λ_θ, λ_e, y ~ N( (λ_θ µ + m_i λ_e ȳ_i)/(λ_θ + m_i λ_e), 1/(λ_θ + m_i λ_e) )

for i = 1, ..., K and

µ | θ, λ_θ, λ_e, y ~ N( (s_0 m_0 + K λ_θ θ̄)/(s_0 + K λ_θ), 1/(s_0 + K λ_θ) ).

We consider the fixed-scan Gibbs sampler that updates µ, then the θ_i's, then λ_θ and λ_e. Since the θ_i's are conditionally independent given (µ, λ), the order in which they are updated is irrelevant. The same is true of λ_θ and λ_e since these two random variables are conditionally independent given (θ, µ). If we write a one-step transition as (µ′, θ′, λ′) → (µ, θ, λ), then the Markov transition density (MTD) of our Gibbs sampler is given by

k(µ, θ, λ | µ′, θ′, λ′) = f(µ | θ′, λ′_θ, λ′_e, y) [ Π_{i=1}^K f(θ_i | θ_{−i}, µ, λ′_θ, λ′_e, y) ] f(λ_θ | θ, µ, λ′_e, y) f(λ_e | θ, µ, λ_θ, y).

Hobert and Geyer (1998) considered this same update order. We note here that, in general, Gibbs samplers with different update orders correspond to different Markov chains. However, two chains whose update orders are cyclic permutations of one another converge at the same rate.

As an alternative to the standard Gibbs sampler, Hobert and Geyer (1998) introduced the more efficient block Gibbs sampler in which all of the components of ξ = (θ_1, ..., θ_K, µ)^T are updated simultaneously. These authors showed that ξ | λ, y ~ N(ξ*, V) and gave formulas for ξ* = ξ*(λ, y) and V = V(λ, y). Because we will make extensive use of these formulas, they are restated in Appendix A. One iteration of the block Gibbs sampler consists of updating λ_θ, λ_e and ξ in some order. Due to the conditional independence of λ_θ and λ_e, the block Gibbs sampler is effectively a two-variable Gibbs sampler or data augmentation algorithm [Tanner and Wong (1987)], the two components being ξ and λ. We choose to update λ first because, as we will see later, updating the most complicated distribution last typically simplifies the calculations required to establish drift and minorization conditions. If we write a one-step transition as (λ′, ξ′) → (λ, ξ), then the corresponding MTD is given by

(5)  k(λ, ξ | λ′, ξ′) = f(λ | ξ′, y) f(ξ | λ, y) = f(λ_θ | ξ′, y) f(λ_e | ξ′, y) f(ξ | λ_θ, λ_e, y).

Hobert and Geyer (1998) considered the opposite update order because they were not attempting to simultaneously establish drift and minorization. Note, however, that our update order is just a cyclic permutation of the order used by Hobert and Geyer.

A proper formulation of the burn-in problem requires some concepts and notation from Markov chain theory. These are provided in the following section. More general accounts of this material can be found in Nummelin (1984), Meyn and Tweedie (1993) and Tierney (1994).
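Since every full conditional above is a standard distribution, a single fixed scan is only a few lines of code. The following is a minimal sketch of our own (not code from the paper); it assumes the cell means ybar, the cell sizes mvec and SSE have been precomputed, and it uses NumPy's shape/scale parameterization of the gamma distribution.

    import numpy as np

    rng = np.random.default_rng(0)

    def gibbs_scan(mu, theta, lam_th, lam_e, ybar, mvec, hyper):
        """One fixed scan of the (unblocked) Gibbs sampler: mu, then the
        theta_i's, then lam_th and lam_e, each drawn from its full
        conditional. Illustrative sketch only."""
        a1, b1, a2, b2, m0, s0, sse = hyper
        K, M = len(ybar), mvec.sum()
        # mu | theta, lam_th, y ~ N((s0*m0 + K*lam_th*theta_bar)/(s0 + K*lam_th), 1/(s0 + K*lam_th))
        p = s0 + K * lam_th
        mu = rng.normal((s0 * m0 + K * lam_th * theta.mean()) / p, p ** -0.5)
        # theta_i | mu, lam, y ~ N((lam_th*mu + m_i*lam_e*ybar_i)/(lam_th + m_i*lam_e), 1/(lam_th + m_i*lam_e))
        p = lam_th + mvec * lam_e
        theta = rng.normal((lam_th * mu + mvec * lam_e * ybar) / p, p ** -0.5)
        # lam_th | theta, mu, y ~ Gamma(K/2 + a1, v1/2 + b1) as in (3); numpy's scale = 1/rate
        v1 = ((theta - mu) ** 2).sum()
        lam_th = rng.gamma(K / 2 + a1, 1 / (v1 / 2 + b1))
        # lam_e | theta, y ~ Gamma(M/2 + a2, (v2 + SSE)/2 + b2) as in (4)
        v2 = (mvec * (theta - ybar) ** 2).sum()
        lam_e = rng.gamma(M / 2 + a2, 1 / ((v2 + sse) / 2 + b2))
        return mu, theta, lam_th, lam_e

The block Gibbs sampler would replace the µ and θ draws above by a single draw of ξ from the N(ξ*, V) distribution whose mean and covariance are given in Appendix A.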

3. Markov chain background. Let X ⊆ R^p for p ≥ 1 and let B denote the associated Borel σ-algebra. Suppose that X = {X_i, i = 0, 1, ...} is a discrete time, time homogeneous Markov chain with state space X and Markov transition kernel P; that is, for x ∈ X and A ∈ B, P(x, A) = Pr(X_{i+1} ∈ A | X_i = x). Also, for n = 1, 2, 3, ..., let P^n denote the n-step transition kernel, that is, P^n(x, A) = Pr(X_{i+n} ∈ A | X_i = x) so, in particular, P ≡ P^1. Note that P^n(x, ·) is the probability measure of the random variable X_n conditional on starting the chain at X_0 = x. Let ν be a measure on B. We will say that the Markov chain X satisfies assumption (A) if it is ν-irreducible, aperiodic and positive Harris recurrent with invariant probability measure π(·). It is straightforward to show that the Gibbs samplers described in the previous section satisfy assumption (A) with ν equal to Lebesgue measure. Under assumption (A), for every x ∈ X we have ‖P^n(x, ·) − π(·)‖ → 0 as n → ∞, where

‖P^n(x, ·) − π(·)‖ := sup_{A ∈ B} |P^n(x, A) − π(A)|

is the total variation distance between P^n and π. The chain X is called geometrically ergodic if it satisfies assumption (A) and, in addition, there exist a constant 0 < t < 1 and a function g : X → [0, ∞) such that, for any x ∈ X,

(6)  ‖P^n(x, ·) − π(·)‖ ≤ g(x) t^n  for n = 1, 2, ....

It has recently been demonstrated that establishing drift and minorization conditions for X verifies geometric ergodicity (the existence of g and t) and yields an upper bound on the right-hand side of (6). See Jones and Hobert (2001) for an expository look at this theory. In this paper, we will focus on the results due to Rosenthal (1995a) and Roberts and Tweedie (1999). Slightly simplified versions of these results follow.

THEOREM 3.1 [Rosenthal (1995a)]. Let X be a Markov chain satisfying assumption (A). Suppose X satisfies the following drift condition. For some function V : X → [0, ∞), some 0 < γ < 1 and some b < ∞,

(7)  E[V(X_{i+1}) | X_i = x] ≤ γ V(x) + b  for all x ∈ X.

Let C = {x ∈ X : V(x) ≤ d_R}, where d_R > 2b/(1 − γ), and suppose that X satisfies the following minorization condition. For some probability measure Q on B and some ε > 0,

(8)  P(x, ·) ≥ ε Q(·)  for all x ∈ C.

Let X_0 = x_0 and define two constants as follows:

α = (1 + d_R)/(1 + 2b + γ d_R)  and  U = 1 + 2(γ d_R + b).

Then, for any 0 < r < 1,

‖P^n(x_0, ·) − π(·)‖ ≤ (1 − ε)^{rn} + (U^r / α^{1−r})^n ( 1 + b/(1 − γ) + V(x_0) ).
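For a concrete sense of how Theorem 3.1 is used, here is a small sketch of ours (not the authors' code) that evaluates the displayed bound and then searches for a sufficient burn-in n*; the inputs are exactly the drift and minorization quantities of the theorem.

    def rosenthal_bound(n, eps, gamma, b, d, V0, r):
        """RHS of Theorem 3.1 for a chain started at x0 with V(x0) = V0.
        Requires d > 2b/(1 - gamma), 0 < r < 1 and minorization constant eps."""
        alpha = (1 + d) / (1 + 2 * b + gamma * d)
        U = 1 + 2 * (gamma * d + b)
        return (1 - eps) ** (r * n) + (U ** r / alpha ** (1 - r)) ** n * (1 + b / (1 - gamma) + V0)

    def sufficient_burnin(eps, gamma, b, d, V0, r, tol=0.01):
        """Smallest n with bound < tol, by doubling then bisection. The bound
        decreases in n only when both geometric bases are below 1; otherwise
        the doubling phase gives up."""
        n = 1
        while rosenthal_bound(n, eps, gamma, b, d, V0, r) >= tol:
            n *= 2
            if n > 10 ** 14:
                raise RuntimeError("bound never reaches tol for these constants")
        lo, hi = n // 2, n
        while lo + 1 < hi:
            mid = (lo + hi) // 2
            if rosenthal_bound(mid, eps, gamma, b, d, V0, r) < tol:
                hi = mid
            else:
                lo = mid
        return hi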

THEOREM 3.2 [Roberts and Tweedie (1999, 2001)]. Let X be a Markov chain satisfying assumption (A). Suppose X satisfies the following drift condition. For some function W : X → [1, ∞), some 0 < ρ < 1 and some L < ∞,

(9)  E[W(X_{i+1}) | X_i = x] ≤ ρ W(x) + L I_S(x)  for all x ∈ X,

where S = {x ∈ X : W(x) ≤ d_RT} and d_RT ≥ L/(1 − ρ) − 1. Suppose further that X satisfies the following minorization condition. For some probability measure Q on B and some ε > 0,

(10)  P(x, ·) ≥ ε Q(·)  for all x ∈ S.

Let X_0 = x_0 and define some constants as follows:

κ = (ρ + L)/(1 + d_RT),  J = [(κ d_RT − ε)(1 + d_RT) + L d_RT] / [(1 + d_RT) κ],

ζ = log{ (1/2)[L/(1 − ρ) + W(x_0)] } / log(κ^{-1}),  η = log[(1 − ε)^{-1} J] / log(κ^{-1}),

β_RT = exp{ [log κ · log(1 − ε)] / [log J − log(1 − ε)] }.

Then if J ≥ 1 and n* = k − ζ > η(1 − ε)/ε, we have, for any 1 ≤ β < β_RT,

(11)  ‖P^k(x_0, ·) − π(·)‖ < [β(1 − ε) / (1 − (1 + η/n*)^{-1/η})] (1 + n*/η)(1 + η n*)^{-n*/η} β^{-n*}.

REMARK 3.1. The version of Theorem 3.2 in Roberts and Tweedie (1999) relies on their Theorem 5.2, whose proof contains an error. Using Roberts and Tweedie's (1999) notation, suppose V : X → [1, ∞), d > 0, C = {x ∈ X : V(x) ≤ d} and h(x, y) = (V(x) + V(y))/2. Roberts and Tweedie (1999) claim that h(x, y) ≥ (1 + d) I_{[C×C]^c}(x, y), which is false and, in fact, all that we can claim is that

h(x, y) ≥ [(1 + d)/2] I_{[C×C]^c}(x, y).

We have accounted for this error in our statement of Theorem 3.2 and we are grateful to an anonymous referee for bringing the error to our attention.

REMARK 3.2. Roberts and Tweedie (1999) provide a different bound for the case J < 1 but, since we do not use it in our application (see Section 6), it is not stated here.

REMARK 3.3. Roberts and Tweedie (1999) show that the right-hand side of (11) is approximately minimized when β = β_RT/(1 + η/n*)^{1/η}.

REMARK 3.4. It is well known [see, e.g., Meyn and Tweedie (1993), Chapter 15] that (7) and (8) together [or (9) and (10) together] imply that X is geometrically ergodic. See Jones and Hobert (2001) for a heuristic explanation.

In our experience it is often easier to establish a Rosenthal-type drift condition than a Roberts-and-Tweedie-type drift condition. The following new result provides a useful connection between these two versions of drift.

LEMMA 3.1. Let X be a Markov chain satisfying assumption (A). Suppose there exist V : X → [0, ∞), γ ∈ (0, 1) and b < ∞ such that

(12)  E[V(X_{n+1}) | X_n = x] ≤ γ V(x) + b  for all x ∈ X.

Set W(x) = 1 + V(x). Then, for any a > 0,

(13)  E[W(X_{n+1}) | X_n = x] ≤ ρ W(x) + L I_C(x)  for all x ∈ X,

where ρ = (a + γ)/(a + 1), L = b + (1 − γ) and C = {x ∈ X : W(x) ≤ (a + 1)L/[a(1 − ρ)]}.

PROOF. Clearly, (12) implies that

E[W(X_{i+1}) | X_i = x] ≤ γ W(x) + b + (1 − γ) = γ W(x) + L  for all x ∈ X.

Set ΔW(x) = E[W(X_{n+1}) | X_n = x] − W(x) and β = (1 − γ)/(a + 1). Then

E[W(X_{n+1}) | X_n = x] ≤ [1 − (a + 1)β] W(x) + L

or, equivalently,

ΔW(x) ≤ −β W(x) − aβ W(x) + L

for all x ∈ X. If x ∉ C, then

W(x) > (a + 1)L/[a(1 − ρ)] > (a + 1)L/[a(1 − γ)] = L/(aβ).

Now write W(x) = L/(aβ) + s(x), where s(x) > 0. Then

ΔW(x) ≤ −β W(x) − aβ[L/(aβ) + s(x)] + L = −β W(x) − aβ s(x) ≤ −β W(x).

If, on the other hand, x ∈ C, then

ΔW(x) ≤ −β W(x) − aβ W(x) + L ≤ −β W(x) + L.

Now putting these together gives

E[W(X_{n+1}) | X_n = x] ≤ (1 − β) W(x) + L I_C(x) = ρ W(x) + L I_C(x).  □

REMARK 3.5. Since (a + 1)L/[a(1 − ρ)] ≥ L/(1 − ρ) − 1, (13) constitutes a drift condition of the form (9). Therefore, if we can establish (12) as well as a minorization condition on the set C, it will be as straightforward to apply Theorem 3.2 as it is to apply Theorem 3.1. Indeed, this is the approach we take with our Gibbs samplers. Moreover, we use a = 1 in our application since (a + 1)L/[a(1 − ρ)] is minimized at this value.

While the Gibbs sampler is easier to implement than the block Gibbs sampler, it is actually harder to analyze because it is effectively a three-variable Gibbs sampler as opposed to the block Gibbs sampler, which is effectively a two-variable Gibbs sampler. Thus, we begin with block Gibbs.
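Lemma 3.1 is purely arithmetic, so it translates directly into code. A minimal sketch of ours, with the paper's choice a = 1 as the default:

    def convert_drift(gamma, b, a=1.0):
        """Lemma 3.1: turn a Rosenthal-type drift (12) with constants
        (gamma, b), applied to W = 1 + V, into a Roberts-Tweedie-type
        drift (13). Returns (rho, L, d) where (13) holds with
        C = {W <= d}; a = 1 minimizes d = (a+1)L / (a(1-rho))."""
        rho = (a + gamma) / (a + 1)
        L = b + (1 - gamma)
        return rho, L, (a + 1) * L / (a * (1 - rho))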

4. Drift and minorization for the block Gibbs sampler. Drift conditions of the form (7) are established for the unbalanced and balanced cases in Sections 4.1 and 4.2, respectively. A minorization condition that works for both cases is established in Section 4.3. Throughout this section we assume that m = min{m_1, m_2, ..., m_K} ≥ 2 and that K ≥ 3.

4.1. Drift: unbalanced case. Define two constants as follows:

δ_1 = 1/(2a_1 + K − 2)  and  δ_2 = 1/(2a_2 + M − 2).

Also define

δ_3 = (K + 1)δ_2  and  δ_4 = δ_2 Σ_{i=1}^K m_i^{-1}.

Our assumptions about K and m guarantee that 0 < δ_i < 1 for i = 1, 2, 3, 4. Set δ = max{δ_1, δ_3}. Also, let ℓ denote the length of the convex hull of the set {ȳ_1, ȳ_2, ..., ȳ_K, m_0} and define

c_1 = 2b_1/(2a_1 + K − 2)  and  c_2 = (2b_2 + SSE)/(2a_2 + M − 2).

PROPOSITION 4.1. Fix γ ∈ (δ, 1) and let φ_1 and φ_2 be positive numbers such that φ_1δ_4/φ_2 + δ < γ. Define the drift function as V_1(θ, µ) = φ_1 v_1(θ, µ) + φ_2 v_2(θ), where v_1(θ, µ) and v_2(θ) are as defined in Section 2. Then the block Gibbs sampler satisfies (7) with

b = φ_1 [ c_1 + c_2 Σ_{i=1}^K m_i^{-1} + Kℓ² ] + φ_2 [ c_2(K + 1) + Mℓ² ].

PROOF. It suffices to show that

(14)  E[V_1(θ, µ) | λ′, θ′, µ′] ≤ φ_1 δ_1 v_1(θ′, µ′) + (φ_1δ_4/φ_2 + δ_3) φ_2 v_2(θ′) + b

because

φ_1 δ_1 v_1(θ′, µ′) + (φ_1δ_4/φ_2 + δ_3) φ_2 v_2(θ′) + b ≤ φ_1 δ v_1(θ′, µ′) + (φ_1δ_4/φ_2 + δ) φ_2 v_2(θ′) + b ≤ γ φ_1 v_1(θ′, µ′) + γ φ_2 v_2(θ′) + b = γ V_1(θ′, µ′) + b.

In bounding the left-hand side of (14), we will use the following rule:

(15)  E[V_1(θ, µ) | λ′, θ′, µ′] = E[V_1(θ, µ) | θ′, µ′] = E{ E[V_1(θ, µ) | λ] | θ′, µ′ },

which follows from the form of the MTD for the block Gibbs sampler given in (5). We begin with some preliminary calculations. First, note that

(16)  E(λ_θ^{-1} | θ′, µ′) = 2b_1/(2a_1 + K − 2) + v_1(θ′, µ′)/(2a_1 + K − 2) = c_1 + δ_1 v_1(θ′, µ′)

and

(17)  E(λ_e^{-1} | θ′, µ′) = (2b_2 + SSE)/(2a_2 + M − 2) + v_2(θ′)/(2a_2 + M − 2) = c_2 + δ_2 v_2(θ′).

We now begin the main calculation. Using our rule, we have

E[v_1(θ, µ) | θ′, µ′] = Σ_{i=1}^K E[(θ_i − µ)² | θ′, µ′] = Σ_{i=1}^K E{ E[(θ_i − µ)² | λ] | θ′, µ′ }.

Using results from Appendix A, we have

E[(θ_i − µ)² | λ] = Var(θ_i | λ) + Var(µ | λ) − 2 Cov(θ_i, µ | λ) + [E(θ_i | λ) − E(µ | λ)]²
= 1/(λ_θ + m_iλ_e) + [λ_θ² + (λ_θ + m_iλ_e)² − 2λ_θ(λ_θ + m_iλ_e)] / [(s_0 + t)(λ_θ + m_iλ_e)²] + [E(θ_i | λ) − E(µ | λ)]²
= 1/(λ_θ + m_iλ_e) + m_i²λ_e² / [(s_0 + t)(λ_θ + m_iλ_e)²] + [E(θ_i | λ) − E(µ | λ)]²
≤ 1/(m_iλ_e) + m_iλ_e / [t(λ_θ + m_iλ_e)] + ℓ².

Hence,

(18)  Σ_{i=1}^K E[(θ_i − µ)² | λ] ≤ λ_e^{-1} Σ_{i=1}^K m_i^{-1} + λ_θ^{-1} + Kℓ².

Thus, by combining (16)–(18) we obtain

(19)  E[φ_1 v_1(θ, µ) | θ′, µ′] ≤ δ_1 φ_1 v_1(θ′, µ′) + δ_4 φ_1 v_2(θ′) + φ_1 [ c_1 + c_2 Σ_{i=1}^K m_i^{-1} + Kℓ² ].

Now

E[v_2(θ) | θ′, µ′] = Σ_i m_i E[(θ_i − ȳ_i)² | θ′, µ′] = E{ Σ_i m_i E[(θ_i − ȳ_i)² | λ] | θ′, µ′ }.

We can bound the innermost expectation as follows:

E[(θ_i − ȳ_i)² | λ] = Var(θ_i | λ) + [E(θ_i | λ) − ȳ_i]²
= 1/(λ_θ + m_iλ_e) + λ_θ² / [(s_0 + t)(λ_θ + m_iλ_e)²] + [E(θ_i | λ) − ȳ_i]²
≤ 1/(λ_θ + m_iλ_e) + λ_θ / [t(λ_θ + m_iλ_e)] + ℓ².

Hence

(20)  Σ_{i=1}^K m_i E[(θ_i − ȳ_i)² | λ] ≤ (K + 1)λ_e^{-1} + Mℓ²,

and so by combining (17) and (20) we obtain

(21)  E[φ_2 v_2(θ) | θ′, µ′] ≤ δ_3 φ_2 v_2(θ′) + φ_2 [(K + 1)c_2 + Mℓ²].

Combining (19) and (21) yields (14).  □

REMARK 4.1. The upper bound on the total variation distance that is the conclusion of Theorem 3.1 involves the starting value of the Markov chain, x_0, only through V(x_0). Moreover, given the way in which V(x_0) enters the formula, it is clear that the optimal starting value, in terms of minimizing the upper bound, is the starting value that minimizes V(x_0). This starting value is also optimal for the application of Theorem 3.2. In Appendix B we show that the value of (θ, µ) that minimizes V_1(θ, µ) has components

θ̂_i = { φ_1 [ Σ_{j=1}^K (m_j ȳ_j/(φ_1 + φ_2 m_j)) / Σ_{j=1}^K (m_j/(φ_1 + φ_2 m_j)) ] + φ_2 m_i ȳ_i } / (φ_1 + φ_2 m_i)

and µ̂ = K^{-1} Σ_{i=1}^K θ̂_i.

While the conclusion of Proposition 4.1 certainly holds when the data are balanced, it is possible to do better in this case. Specifically, the proof of Proposition 4.1 uses the general bounds on [E(θ_i | λ) − E(µ | λ)]² and [E(θ_i | λ) − ȳ_i]² given in Appendix A. Much sharper bounds are possible by explicitly using the balancedness, and these lead to a better drift condition.

4.2. Drift: balanced case. Now assume that m_i = m ≥ 2 for all i = 1, ..., K and let δ_5 = Kδ_2 ∈ (0, 1).

PROPOSITION 4.2. Fix γ ∈ (δ, 1) and let φ be a positive number such that φδ_5 + δ < γ. Define the drift function as V_2(θ, µ) = φ v_1(θ, µ) + m^{-1} v_2(θ). Then the block Gibbs sampler satisfies (7) with

b = φ c_1 + [(φK + K + 1)/m] c_2 + max{φ, 1} Σ_{i=1}^K max{ (ȳ − ȳ_i)², (m_0 − ȳ_i)² },

where ȳ := K^{-1} Σ_{i=1}^K ȳ_i.

PROOF. When the data are balanced,

t = Mλ_θλ_e / (λ_θ + mλ_e),

so that E(µ | λ) = (tȳ + m_0s_0)/(s_0 + t). Hence for all i = 1, ..., K we have

[E(θ_i | λ) − ȳ_i]² = [ (λ_θ/(λ_θ + mλ_e)) (tȳ + m_0s_0)/(s_0 + t) + (mλ_e/(λ_θ + mλ_e)) ȳ_i − ȳ_i ]²
= (λ_θ/(λ_θ + mλ_e))² [ (t(ȳ − ȳ_i) + s_0(m_0 − ȳ_i)) / (s_0 + t) ]²
≤ (λ_θ/(λ_θ + mλ_e))² [ t(ȳ − ȳ_i)² + s_0(m_0 − ȳ_i)² ] / (s_0 + t),

where the last inequality is Jensen's. A similar argument shows that, for all i = 1, ..., K,

[E(θ_i | λ) − E(µ | λ)]² ≤ (mλ_e/(λ_θ + mλ_e))² [ t(ȳ − ȳ_i)² + s_0(m_0 − ȳ_i)² ] / (s_0 + t).

Therefore,

φ[E(θ_i | λ) − E(µ | λ)]² + [E(θ_i | λ) − ȳ_i]² ≤ max{φ, 1} [ t(ȳ − ȳ_i)² + s_0(m_0 − ȳ_i)² ] / (s_0 + t),

and hence

Σ_{i=1}^K { φ[E(θ_i | λ) − E(µ | λ)]² + [E(θ_i | λ) − ȳ_i]² } ≤ max{φ, 1} Σ_{i=1}^K max{ (ȳ − ȳ_i)², (m_0 − ȳ_i)² }.

To prove the result, it suffices to show that

(22)  E[V_2(θ, µ) | λ′, θ′, µ′] ≤ φ δ_1 v_1(θ′, µ′) + (φδ_5 + δ_3) m^{-1} v_2(θ′) + b

since

φ δ_1 v_1(θ′, µ′) + (φδ_5 + δ_3) m^{-1} v_2(θ′) + b ≤ φ δ v_1(θ′, µ′) + (φδ_5 + δ) m^{-1} v_2(θ′) + b ≤ γ φ v_1(θ′, µ′) + γ m^{-1} v_2(θ′) + b = γ V_2(θ′, µ′) + b.

The remainder of the proof is nearly identical to the proof of Proposition 4.1 and is therefore left to the reader.  □

REMARK 4.2. This result is stated (without proof) in [Jones and Hobert (2001), Appendix A] and the statement contains an error. Specifically, b is stated incorrectly and should appear as above.

4.3. Minorization. We now use a technique based on Rosenthal's (1995a) Lemma 6b to establish a minorization condition of the form (8) on the set

S_B = { (θ, µ) : V_1(θ, µ) ≤ d } = { (θ, µ) : φ_1 v_1(θ, µ) + φ_2 v_2(θ) ≤ d },

for any d > 0. Since V_2 of Proposition 4.2 is a special case of V_1, this minorization will also work for V_2. First note that S_B is contained in C_B := C_B1 ∩ C_B2, where

C_B1 = { (θ, µ) : v_1(θ, µ) ≤ d/φ_1 }  and  C_B2 = { (θ, µ) : v_2(θ) ≤ d/φ_2 }.

Hence, it suffices to establish a minorization condition that holds on C_B. We will accomplish this by finding an ε > 0 and a density q(λ, θ, µ) on R²₊ × R^K × R such that

k(λ, θ, µ | λ′, θ′, µ′) ≥ ε q(λ, θ, µ)  for all (θ′, µ′) ∈ C_B,

where k(λ, θ, µ | λ′, θ′, µ′) is the MTD for the block Gibbs sampler given in (5). We will require the following lemma, whose proof is given in Appendix C.

LEMMA 4.1. Let Gamma(α, β; x) denote the value of the Gamma(α, β) density at the point x > 0. If α > 1, b > 0 and c > 0 are fixed, then, as a function of x,

inf_{0<β<c} Gamma(α, b + β/2; x) = Gamma(α, b; x) if x < x*,  and  Gamma(α, b + c/2; x) if x ≥ x*,

where

x* = (2α/c) log(1 + c/(2b)).

Here is the minorization condition.

PROPOSITION 4.3. Let q(λ, θ, µ) be a density on R²₊ × R^K × R defined as

q(λ, θ, µ) = [ h_1(λ_θ) / ∫_{R₊} h_1(λ_θ) dλ_θ ] [ h_2(λ_e) / ∫_{R₊} h_2(λ_e) dλ_e ] f(ξ | λ, y),

where

h_1(λ_θ) = Gamma( K/2 + a_1, b_1; λ_θ ) for λ_θ < λ_θ*,  and  Gamma( K/2 + a_1, d/(2φ_1) + b_1; λ_θ ) for λ_θ ≥ λ_θ*,

with

λ_θ* = [φ_1(K + 2a_1)/d] log( 1 + d/(2b_1φ_1) ),

and

h_2(λ_e) = Gamma( M/2 + a_2, SSE/2 + b_2; λ_e ) for λ_e < λ_e*,  and  Gamma( M/2 + a_2, (φ_2 SSE + d)/(2φ_2) + b_2; λ_e ) for λ_e ≥ λ_e*,

with

λ_e* = [φ_2(M + 2a_2)/d] log( 1 + d/[φ_2(2b_2 + SSE)] ).

Set ε_B = [∫_{R₊} h_1(λ_θ) dλ_θ][∫_{R₊} h_2(λ_e) dλ_e]. Then the Markov transition density for the block Gibbs sampler satisfies the following minorization condition:

k(λ, θ, µ | λ′, θ′, µ′) ≥ ε_B q(λ, θ, µ)  for all (θ′, µ′) ∈ C_B.

PROOF. We use ξ = (θ, µ) and ξ′ = (θ′, µ′) to simplify notation. If ξ′ ∈ C_B, we have

f(λ_θ | ξ′, y) f(λ_e | ξ′, y) f(ξ | λ, y) ≥ f(ξ | λ, y) inf_{ξ′∈C_B} [ f(λ_θ | ξ′, y) f(λ_e | ξ′, y) ]
≥ f(ξ | λ, y) [ inf_{ξ′∈C_B} f(λ_θ | ξ′, y) ] [ inf_{ξ′∈C_B} f(λ_e | ξ′, y) ]
≥ f(ξ | λ, y) [ inf_{ξ′∈C_B1} f(λ_θ | ξ′, y) ] [ inf_{ξ′∈C_B2} f(λ_e | ξ′, y) ].

Thus we can take

ε_B q(λ, θ, µ) = f(ξ | λ, y) [ inf_{ξ′∈C_B1} f(λ_θ | ξ′, y) ] [ inf_{ξ′∈C_B2} f(λ_e | ξ′, y) ].

Two applications of Lemma 4.1 yield the result.  □

The drift and minorization conditions given in Propositions 4.1–4.3 can be used in conjunction with either Theorem 3.1 or Theorem 3.2 to get a formula giving an upper bound on the total variation distance to stationarity for the block Gibbs sampler. One such formula is stated explicitly at the start of Section 6.
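As noted in Section 6 below, ε_B is a product of two integrals of piecewise gamma densities, each of which needs only two incomplete-gamma evaluations. A numerical sketch of ours using SciPy; the shape and rate parameters follow Proposition 4.3, and scipy's gamma distribution takes a scale (= 1/rate) argument:

    import numpy as np
    from scipy.stats import gamma as gamma_dist

    def piecewise_gamma_mass(alpha, rate_lo, rate_hi, xstar):
        """Integral of h(x) = Gamma(alpha, rate_lo; x) on (0, xstar) plus
        Gamma(alpha, rate_hi; x) on (xstar, inf)."""
        return (gamma_dist.cdf(xstar, alpha, scale=1 / rate_lo)
                + gamma_dist.sf(xstar, alpha, scale=1 / rate_hi))

    def eps_block(d, phi1, phi2, K, M, a1, b1, a2, b2, sse):
        """eps_B of Proposition 4.3 (numerical sketch)."""
        lam_th_star = phi1 * (K + 2 * a1) / d * np.log1p(d / (2 * b1 * phi1))
        lam_e_star = phi2 * (M + 2 * a2) / d * np.log1p(d / (phi2 * (2 * b2 + sse)))
        i1 = piecewise_gamma_mass(K / 2 + a1, b1, d / (2 * phi1) + b1, lam_th_star)
        i2 = piecewise_gamma_mass(M / 2 + a2, sse / 2 + b2,
                                  (phi2 * sse + d) / (2 * phi2) + b2, lam_e_star)
        return i1 * i2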

5. Drift and minorization for the Gibbs sampler. In this section we develop drift and minorization conditions for the Gibbs sampler. We continue to assume that m = min{m_1, m_2, ..., m_K} ≥ 2 and that K ≥ 3. Let m̄ = max{m_1, m_2, ..., m_K}.

5.1. Drift. Recall that δ_1 = 1/(2a_1 + K − 2) and define

δ_6 = (K² + 2Ka_1)/(2s_0b_1 + K² + 2Ka_1)  and  δ_7 = 1/[2(a_1 − 1)].

Clearly δ_6 ∈ (0, 1). It is straightforward to show that if a_1 > 3/2, then δ_7 ∈ (0, 1) and there exists ρ_1 ∈ (0, 1) such that

(23)  (K + δ_6/δ_7) δ_1 < ρ_1.

Define the function v_3(θ, λ) = [Kλ_θ/(s_0 + Kλ_θ)](θ̄ − ȳ)². Also, let s² = Σ_{i=1}^K (ȳ_i − ȳ)². We will require the following lemma, whose proof is given in Appendix D.

LEMMA 5.1. Let a and b be constants such that 5b > a ≥ b > 0. Then if x and y are positive,

(24)  [ax/(ax + y)]² + [y/(bx + y)]² < 1.

Here is the drift condition.

PROPOSITION 5.1. Assume that a_1 > 3/2 and let ρ_1 ∈ (0, 1) satisfy (23). Assume also that 5m > m̄. Fix c_3 ∈ (0, min{b_1, b_2}) and fix γ ∈ (max{ρ_1, δ_6, δ_7}, 1). Define the drift function as

V_3(θ, λ) = e^{c_3λ_θ} + e^{c_3λ_e} + δ_7/(Kδ_1λ_θ) + v_3(θ, λ).

Then the Gibbs sampler satisfies (7) with

b = [b_1/(b_1 − c_3)]^{a_1+K/2} + [b_2/(b_2 − c_3)]^{a_2+M/2} + (δ_6 + δ_7)[ 1/s_0 + (m_0 − ȳ)² + s²/K ] + 2b_1δ_7/K.

PROOF. It suffices to show that

(25)  E[V_3(θ, λ) | µ′, θ′, λ′] ≤ (K + δ_6/δ_7)δ_1 · δ_7/(Kδ_1λ′_θ) + [ δ_7 (m̄λ′_e/(λ′_θ + m̄λ′_e))² + δ_6 (λ′_θ/(λ′_θ + mλ′_e))² ] v_3(θ′, λ′) + b,

because, using Lemma 5.1 and (23), we have

(K + δ_6/δ_7)δ_1 · δ_7/(Kδ_1λ′_θ) + [ δ_7 (m̄λ′_e/(λ′_θ + m̄λ′_e))² + δ_6 (λ′_θ/(λ′_θ + mλ′_e))² ] v_3(θ′, λ′) + b
≤ ρ_1δ_7/(Kδ_1λ′_θ) + max{δ_6, δ_7} v_3(θ′, λ′) + b
≤ γ V_3(θ′, λ′) + b.

Recall that we are considering Hobert and Geyer's (1998) updating scheme for the Gibbs sampler: (µ′, θ′, λ′) → (µ, θ, λ). Establishing (25) requires the calculation of several expectations, and these will be calculated using the following rule:

E[V_3(θ, λ) | µ′, θ′, λ′] = E[V_3(θ, λ) | θ′, λ′]
= E{ E{ E[V_3(θ, λ) | µ, θ, θ′, λ′] | µ, θ′, λ′ } | θ′, λ′ }
= E{ E{ E[V_3(θ, λ) | µ, θ] | µ, λ′ } | θ′, λ′ }.

We now establish (25). First, it is easy to show that

(26)  E[e^{c_3λ_θ} | θ, µ] ≤ [b_1/(b_1 − c_3)]^{a_1+K/2}  and  E[e^{c_3λ_e} | θ, µ] ≤ [b_2/(b_2 − c_3)]^{a_2+M/2}.

Now we evaluate E[δ_7/(Kδ_1λ_θ) | µ′, θ′, λ′]. Note that

(27)  E[λ_θ^{-1} | µ, θ] = δ_1 [ 2b_1 + Σ_{i=1}^K (θ_i − µ)² ]

and

(28)  E[(θ_i − µ)² | µ, λ′] = Var(θ_i | µ, λ′) + [E(θ_i | µ, λ′) − µ]²
= 1/(λ′_θ + m_iλ′_e) + [m_iλ′_e/(λ′_θ + m_iλ′_e)]² (µ − ȳ_i)²
≤ 1/λ′_θ + [m̄λ′_e/(λ′_θ + m̄λ′_e)]² (µ − ȳ_i)².

It follows that

(29)  Σ_{i=1}^K E[(θ_i − µ)² | µ, λ′] ≤ K/λ′_θ + [m̄λ′_e/(λ′_θ + m̄λ′_e)]² [ K(µ − ȳ)² + s² ].

Letting θ̄′ = K^{-1} Σ_i θ′_i, we have

(30)  E[(µ − ȳ)² | θ′, λ′] = Var(µ | θ′, λ′) + [E(µ | θ′, λ′) − ȳ]²
= 1/(s_0 + Kλ′_θ) + { [s_0(m_0 − ȳ) + Kλ′_θ(θ̄′ − ȳ)] / (s_0 + Kλ′_θ) }²
≤ 1/(s_0 + Kλ′_θ) + [s_0/(s_0 + Kλ′_θ)](m_0 − ȳ)² + [Kλ′_θ/(s_0 + Kλ′_θ)](θ̄′ − ȳ)²
≤ 1/s_0 + (m_0 − ȳ)² + v_3(θ′, λ′),

where the first inequality is Jensen's. On combining (27)–(30), we have

(31)  E[δ_7/(Kδ_1λ_θ) | µ′, θ′, λ′] ≤ δ_7/λ′_θ + δ_7 [m̄λ′_e/(λ′_θ + m̄λ′_e)]² v_3(θ′, λ′) + δ_7 [ 1/s_0 + (m_0 − ȳ)² + s²/K ] + 2b_1δ_7/K.

The last thing we need to evaluate is E[v_3(θ, λ) | µ′, θ′, λ′]. As in Hobert and Geyer (1998), Jensen's inequality yields

(32)  E[ Kλ_θ/(s_0 + Kλ_θ) | µ, θ ] ≤ K E(λ_θ | µ, θ) / [s_0 + K E(λ_θ | µ, θ)] ≤ (K² + 2Ka_1)/(2s_0b_1 + K² + 2Ka_1) = δ_6.

These authors also note that the conditional independence of the θ_i's implies that

θ̄ | µ, λ ~ N( (1/K) Σ_i (λ_θµ + m_iλ_eȳ_i)/(λ_θ + m_iλ_e), (1/K²) Σ_i 1/(λ_θ + m_iλ_e) ),

from which it follows that

(33)  E[(θ̄ − ȳ)² | µ, λ′] = Var(θ̄ | µ, λ′) + [E(θ̄ | µ, λ′) − ȳ]²
= (1/K²) Σ_i 1/(λ′_θ + m_iλ′_e) + [ (1/K) Σ_i λ′_θ(µ − ȳ_i)/(λ′_θ + m_iλ′_e) ]²
≤ 1/(Kλ′_θ) + [λ′_θ/(λ′_θ + mλ′_e)]² [ (µ − ȳ)² + s²/K ],

where, again, (part of) the first inequality is Jensen's. On combining (30), (32) and (33), we have

(34)  E[v_3(θ, λ) | µ′, θ′, λ′] ≤ δ_6/(Kλ′_θ) + δ_6 [λ′_θ/(λ′_θ + mλ′_e)]² v_3(θ′, λ′) + δ_6 [ 1/s_0 + (m_0 − ȳ)² + s²/K ].

Combining (26), (31) and (34) yields (25).  □

REMARK 5.1. Note that our drift condition for the block Gibbs sampler (Proposition 4.1) holds for all hyperparameter configurations (corresponding to proper priors) and nearly all values of m and m̄. In contrast, it is assumed in Proposition 5.1 that a_1 > 3/2 and that 5m > m̄. On the other hand, Hobert and Geyer's (1998) drift condition for the Gibbs sampler involves even more restrictive assumptions about a_1 and the relationship between m and m̄. Specifically, Hobert and Geyer (1998) assume that a_1 ≥ (3K − 2)/(2K − 2) and that m > (√5 − 2)m̄. Note that (3K − 2)/(2K − 2) > 3/2 for all K ≥ 2 and that 5 > (√5 − 2)^{-1} ≈ 4.23.

REMARK 5.2. In this case the optimal starting value minimizes

V_3(θ, λ) = e^{c_3λ_θ} + e^{c_3λ_e} + δ_7/(Kδ_1λ_θ) + [Kλ_θ/(s_0 + Kλ_θ)](θ̄ − ȳ)².

The last term will vanish as long as the θ_i's are such that θ̄ = ȳ. The optimal starting value for λ_θ is the minimizer of the function e^{c_3λ_θ} + δ_7/(Kδ_1λ_θ). This cannot be computed in closed form, but is easily found numerically. Finally, since λ_e = 0 is not appropriate, we simply start λ_e at a small positive number.

5.2. Minorization. Fix d > 0 and define S_G = {(θ, λ) : V_3(θ, λ) ≤ d}. Similar to our previous work with the block Gibbs sampler, our goal will be to find a density q(µ, θ, λ) on R × R^K × R²₊ and an ε > 0 such that

k(µ, θ, λ | µ′, θ′, λ′) ≥ ε q(µ, θ, λ)  for all (θ′, λ′) ∈ S_G.

As before, we will actually establish the minorization on a superset of S_G with which it is more convenient to work. Let c_4 = δ_7/(Kδ_1d) and put c_l and c_u equal to ȳ − √((m_0 − ȳ)² + d) and ȳ + √((m_0 − ȳ)² + d), respectively. We show in Appendix E that S_G ⊆ C_G = C_G1 ∩ C_G2 ∩ C_G3, where

C_G1 = { (θ, λ) : c_4 ≤ λ_θ ≤ (log d)/c_3 },  C_G2 = { (θ, λ) : 0 < λ_e ≤ (log d)/c_3 },

C_G3 = { (θ, λ) : c_l ≤ (s_0m_0 + Kλ_θθ̄)/(s_0 + Kλ_θ) ≤ c_u }.

Also, C_G1 ∩ C_G2 is nonempty as long as d log d > (c_3δ_7)/(Kδ_1).

We will require the following obvious lemma.

LEMMA 5.2. Let N(τ, σ²; x) denote the value of the N(τ, σ²) density at the point x. If a ≤ b, then, as a function of x,

inf_{a≤τ≤b} N(τ, σ²; x) = N(b, σ²; x) if x ≤ (a + b)/2,  and  N(a, σ²; x) if x > (a + b)/2.

Here is the minorization condition.

PROPOSITION 5.2. Let q(µ, θ, λ) be a density on R × R^K × R²₊ defined as follows:

q(µ, θ, λ) = [ g_1(µ, θ) g_2(µ) / ∫_R ∫_{R^K} g_1(µ, θ) g_2(µ) dθ dµ ] f(λ | µ, θ, y),

where

g_1(µ, θ) = (c_4/(2π))^{K/2} exp{ −[(log d)/(2c_3)] Σ_i [ (θ_i − µ)² + m_i(θ_i − ȳ_i)² ] }

and

g_2(µ) = N( c_u, [s_0 + K log(d)/c_3]^{-1}; µ ) for µ ≤ ȳ,  and  N( c_l, [s_0 + K log(d)/c_3]^{-1}; µ ) for µ > ȳ.

Set

ε_G = [ (s_0 + Kc_4)/(s_0 + K log(d)/c_3) ]^{1/2} ∫_R ∫_{R^K} g_1(µ, θ) g_2(µ) dθ dµ.

Then the Markov transition density for the Gibbs sampler satisfies the minorization condition

k(µ, θ, λ | µ′, θ′, λ′) ≥ ε_G q(µ, θ, λ)  for all (θ′, λ′) ∈ C_G.

PROOF. Recall that k(µ, θ, λ | µ′, θ′, λ′) = f(µ | θ′, λ′, y) f(θ | µ, λ′, y) f(λ | µ, θ, y). For (θ′, λ′) ∈ C_G, we have

f(µ | θ′, λ′, y) f(θ | µ, λ′, y) ≥ inf_{(θ′,λ′)∈C_G} f(µ | θ′, λ′, y) f(θ | µ, λ′, y)
≥ [ inf_{(θ′,λ′)∈C_G} f(µ | θ′, λ′, y) ] [ inf_{(θ′,λ′)∈C_G} f(θ | µ, λ′, y) ]
≥ [ inf_{(θ′,λ′)∈C_G} f(µ | θ′, λ′, y) ] [ inf_{λ′∈C_G1∩C_G2} f(θ | µ, λ′, y) ].

Using the fact that the θ_i's are conditionally independent, we have

inf_{λ∈C_G1∩C_G2} f(θ | µ, λ, y) = inf_{λ∈C_G1∩C_G2} Π_{i=1}^K f(θ_i | µ, λ, y) ≥ Π_{i=1}^K inf_{λ∈C_G1∩C_G2} f(θ_i | µ, λ, y).

Now, using Jensen's inequality again, we have

f(θ_i | µ, λ, y) = √[(λ_θ + m_iλ_e)/(2π)] exp{ −[(λ_θ + m_iλ_e)/2] [ θ_i − (λ_θµ + m_iλ_eȳ_i)/(λ_θ + m_iλ_e) ]² }
= √[(λ_θ + m_iλ_e)/(2π)] exp{ −[(λ_θ + m_iλ_e)/2] [ (λ_θ/(λ_θ + m_iλ_e))(θ_i − µ) + (m_iλ_e/(λ_θ + m_iλ_e))(θ_i − ȳ_i) ]² }
≥ √[(λ_θ + m_iλ_e)/(2π)] exp{ −[(λ_θ + m_iλ_e)/2] [ (λ_θ/(λ_θ + m_iλ_e))(θ_i − µ)² + (m_iλ_e/(λ_θ + m_iλ_e))(θ_i − ȳ_i)² ] }
= √[(λ_θ + m_iλ_e)/(2π)] exp{ −(1/2) [ λ_θ(θ_i − µ)² + m_iλ_e(θ_i − ȳ_i)² ] }.

Hence,

inf_{λ∈C_G1∩C_G2} f(θ | µ, λ, y) ≥ (c_4/(2π))^{K/2} exp{ −[(log d)/(2c_3)] Σ_i [ (θ_i − µ)² + m_i(θ_i − ȳ_i)² ] } = g_1(µ, θ).

Now, if (θ′, λ′) ∈ C_G, then c_4 ≤ λ′_θ ≤ (log d)/c_3 and hence

f(µ | θ′, λ′, y) = √[(s_0 + Kλ′_θ)/(2π)] exp{ −[(s_0 + Kλ′_θ)/2] [ µ − (s_0m_0 + Kλ′_θθ̄′)/(s_0 + Kλ′_θ) ]² }
≥ √[(s_0 + Kc_4)/(s_0 + (K log d)/c_3)] √[(s_0 + (K log d)/c_3)/(2π)] exp{ −[(s_0 + (K log d)/c_3)/2] [ µ − (s_0m_0 + Kλ′_θθ̄′)/(s_0 + Kλ′_θ) ]² }.

Thus,

inf_{(θ′,λ′)∈C_G} f(µ | θ′, λ′, y)
≥ √[(s_0 + Kc_4)/(s_0 + (K log d)/c_3)] inf_{(θ′,λ′)∈C_G} √[(s_0 + (K log d)/c_3)/(2π)] exp{ −[(s_0 + (K log d)/c_3)/2] [ µ − (s_0m_0 + Kλ′_θθ̄′)/(s_0 + Kλ′_θ) ]² }
= √[(s_0 + Kc_4)/(s_0 + (K log d)/c_3)] inf_{(θ′,λ′)∈C_G3} √[(s_0 + (K log d)/c_3)/(2π)] exp{ −[(s_0 + (K log d)/c_3)/2] [ µ − (s_0m_0 + Kλ′_θθ̄′)/(s_0 + Kλ′_θ) ]² }
≥ √[(s_0 + Kc_4)/(s_0 + (K log d)/c_3)] g_2(µ),

where the last inequality is an application of Lemma 5.2.  □

REMARK 5.3. In Appendix F we give a closed form expression for ε_G involving the standard normal cumulative distribution function.

6. A numerical example. Consider a balanced data situation and let π_h(·) denote the probability measure corresponding to the posterior density in (2). Let P^n((λ_0, ξ_0), ·) denote the n-step Markov transition kernel for the block Gibbs sampler started at (λ_0, ξ_0). [Equation (5) shows that a starting value for λ_0 is actually not required.] We now write down an explicit upper bound for ‖P^n((λ_0, ξ_0), ·) − π_h(·)‖ based on Theorem 3.1 and Propositions 4.2 and 4.3. Although it has been suppressed in the notation, both π_h and P^n depend heavily on the six hyperparameters, a_1, b_1, a_2, b_2, s_0 and m_0. Our upper bound holds for all hyperparameter configurations such that a_1, b_1, a_2, b_2, s_0 are positive, that is, all hyperparameter configurations such that the priors on λ_θ, λ_e and µ are proper. Due to its generality, the bound is complicated to state. First, recall that SSE = Σ_{i,j} (y_ij − ȳ_i)², where ȳ_i = m^{-1} Σ_{j=1}^m y_ij. Recall further that

δ_1 = 1/(2a_1 + K − 2),  δ_2 = 1/(2a_2 + M − 2),

δ_3 = (K + 1)δ_2,  δ_5 = Kδ_2,  δ = max{δ_1, δ_3},  c_1 = 2b_1δ_1  and  c_2 = (2b_2 + SSE)δ_2.

Note that all of these quantities depend only on the data and the hyperparameters. Now choose γ ∈ (δ, 1) and φ > 0 such that φδ_5 + δ < γ. Also, let

b = φ c_1 + [(φK + K + 1)/m] c_2 + max{φ, 1} Σ_{i=1}^K max{ (ȳ − ȳ_i)², (m_0 − ȳ_i)² },

and choose d_R > 2b/(1 − γ). Finally, let

ε_B = [ ∫_{R₊} h_1(λ_θ) dλ_θ ] [ ∫_{R₊} h_2(λ_e) dλ_e ],

where

h_1(λ_θ) = Gamma( K/2 + a_1, b_1; λ_θ ) for λ_θ < λ_θ*,  and  Gamma( K/2 + a_1, d_R/(2φ) + b_1; λ_θ ) for λ_θ ≥ λ_θ*,

for

λ_θ* = [φ(K + 2a_1)/d_R] log( 1 + d_R/(2b_1φ) ),

and

h_2(λ_e) = Gamma( M/2 + a_2, SSE/2 + b_2; λ_e ) for λ_e < λ_e*,  and  Gamma( M/2 + a_2, (SSE + m d_R)/2 + b_2; λ_e ) for λ_e ≥ λ_e*,

for

λ_e* = [(M + 2a_2)/(m d_R)] log( 1 + m d_R/(2b_2 + SSE) ).

Note that ε_B cannot be calculated in closed form, but can be evaluated numerically with four calls to a routine that evaluates the incomplete gamma function. Recall from the statement of Theorem 3.1 that

α = (1 + d_R)/(1 + 2b + γd_R)  and  U = 1 + 2(γd_R + b).

Here is the bound. For any 0 < r < 1 and any n ∈ {1, 2, 3, ...},

‖P^n((λ_0, ξ_0), ·) − π_h(·)‖ ≤ (1 − ε_B)^{rn} + (U^r/α^{1−r})^n ( 1 + b/(1 − γ) + φ v_1(θ_0, µ_0) + m^{-1} v_2(θ_0) ).
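Assembling this bound in code is mechanical. The sketch below is our own illustration, reusing the eps_block and rosenthal_bound sketches given earlier (here φ_1 = φ and φ_2 = 1/m), and starting the chain at the optimal value of Remark 4.1 as in (35) below; the asserts encode the constraints on γ, φ and d_R.

    import numpy as np

    def tv_bound_balanced(n, ybar, m, a1, b1, a2, b2, m0, sse, gamma, phi, d_R, r):
        """RHS of the balanced block Gibbs bound at iteration n (sketch).
        ybar: vector of cell means; m: common cell size."""
        K = len(ybar)
        M = m * K
        delta1, delta2 = 1 / (2 * a1 + K - 2), 1 / (2 * a2 + M - 2)
        delta = max(delta1, (K + 1) * delta2)
        assert delta < gamma < 1 and phi > 0 and phi * K * delta2 + delta < gamma
        c1, c2 = 2 * b1 * delta1, (2 * b2 + sse) * delta2
        yb = ybar.mean()
        b = (phi * c1 + (phi * K + K + 1) / m * c2
             + max(phi, 1) * np.maximum((yb - ybar) ** 2, (m0 - ybar) ** 2).sum())
        assert d_R > 2 * b / (1 - gamma)
        V0 = phi / (1 + phi) * ((ybar - yb) ** 2).sum()   # optimal start, Remark 4.1
        eps = eps_block(d_R, phi, 1 / m, K, M, a1, b1, a2, b2, sse)
        return rosenthal_bound(n, eps, gamma, b, d_R, V0, r)

The grid search described below is then a loop over candidate (γ, φ, d_R, r) tuples, keeping the configuration that yields the smallest value of this function.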

TABLE 1
Simulated data

Cell      1         2         3         4         5
ȳ_i    0.80247   1.0014    0.69090   1.1413    1.0125

M = mK = 50;  ȳ = M^{-1} Σ_{i=1}^5 Σ_{j=1}^{10} y_ij = 0.92973;  SSE = Σ_{i=1}^5 Σ_{j=1}^{10} (y_ij − ȳ_i)² = 32.990

Using the optimal starting values from Remark 4.1, this becomes

(35)  ‖P^n((λ_0, ξ_0^{opt}), ·) − π_h(·)‖ ≤ (1 − ε_B)^{rn} + (U^r/α^{1−r})^n ( 1 + b/(1 − γ) + [φ/(1 + φ)] Σ_{i=1}^K (ȳ_i − ȳ)² ).

Explicit upper bounds can also be written for the block Gibbs sampler in the unbalanced case and for the Gibbs sampler. These are similar and are left to the reader. It is interesting to note that because our drift and minorization conditions for the block Gibbs sampler are free of s_0, so too is the bound in (35).

To evaluate (35), the user must provide values for γ, φ, d_R and r. In our experience, small changes in these quantities can lead to dramatically different results. Unfortunately, the right-hand side of (35) is a very complicated function of γ, φ, d_R and r. Hence, it would be quite difficult to find optimal values. In our applications of (35), we simply define reasonable ranges for these four quantities and then perform a grid search to find the configuration that leads to the smallest upper bound.

We now provide an example of the use of (35) and of the analogous bound based on Theorem 3.2. The data in Table 1 were simulated according to the model defined in Section 2 with K = 5, m = 10, a_1 = 2.5, a_2 = b_1 = b_2 = 1, m_0 = 0 and s_0 = 1. We now pretend that the origin of the data is unknown and consider using the block Gibbs sampler to make approximate draws from four different intractable posterior distributions corresponding to the four hyperparameter settings listed in Table 2.

TABLE 2
Four different prior specifications

Hyperparameter setting   a_1     b_1     a_2     b_2     m_0
1                        2.5     1       1       1       0
2                        2.5     1       1       1       ȳ
3                        0.1     0.1     0.1     0.1     ȳ
4                        0.01    0.01    0.01    0.01    ȳ

The first setting in Table 2 is the "correct" prior in that it is exactly the setting under which the data were simulated. As one moves from setting 2 to setting 4, the prior variances on λ_θ and λ_e become larger; that is, the priors become more diffuse. For reasons discussed below, m_0 is set equal to ȳ in settings 2–4. For each of the hyperparameter settings in Table 2 we used (35) as well as the analogous bound based on Theorem 3.2 to find an n* such that ‖P^{n*}((λ_0, ξ_0^{opt}), ·) − π_h(·)‖ ≤ 0.01. The results are given in Tables 3 and 4. For example, consider hyperparameter setting 2. Theorem 3.1 yields

‖P^{3415}((λ_0, ξ_0^{opt}), ·) − π_h(·)‖ ≤ 0.00999,

while Theorem 3.2 yields

‖P^{6563}((λ_0, ξ_0^{opt}), ·) − π_h(·)‖ ≤ 0.00999.

While examining the n*'s in Tables 3 and 4, keep in mind that it takes about 1.5 minutes to run one million iterations of the block Gibbs sampler on a standard PC. Thus, even the larger n*'s are feasible. Note that the results based on Theorem 3.1 are better across the board than those based on Theorem 3.2. We suspect that our use of Lemma 3.1 in the application of Theorem 3.2 has somewhat (artificially) inflated the n*'s in Table 4.

TABLE 3
Total variation bounds for the block Gibbs sampler via Theorem 3.1

Hyperparameter setting    γ        φ        d_R      r        ε_B         n*            Bound
1                       0.2596   0.9423   15.997   0.0188   3.1×10⁻⁷    7.94×10⁸      0.00999
2                       0.2596   0.5385   3.0079   0.0789   0.0171      3.415×10³     0.00999
3                       0.4183   0.3059   2.8351   0.0512   6.8×10⁻⁴    1.315×10⁵     0.00999
4                       0.4340   0.2965   2.8039   0.0483   8.1×10⁻⁶    1.1796×10⁷    0.00999

TABLE 4
Total variation bounds for the block Gibbs sampler via Theorem 3.2

Hyperparameter setting    ρ        φ        d_RT     ε_B         n*            Bound
1                       0.615    0.84     15.213   4.1×10⁻⁷    1.8835×10⁹    0.00999
2                       0.5975   0.49     2.6564   0.0234      6.563×10³     0.00999
3                       0.7113   0.3181   2.8492   7.2×10⁻⁴    3.3915×10⁵    0.00999
4                       0.7191   0.3084   2.8154   8.6×10⁻⁶    2.966×10⁷     0.00999

A comparison of the n*'s for hyperparameter settings 1 and 2 (in either table) shows that our bound is extremely sensitive to the distance between m_0 and ȳ. This is due to the fact that ε_B decreases rapidly as b increases and b contains the term Σ_{i=1}^K max{ (ȳ − ȳ_i)², (m_0 − ȳ_i)² }, which is minimized when m_0 = ȳ. While there may actually be some difference in the convergence rates of the two Markov chains corresponding to settings 1 and 2, it seems unlikely that the difference is as large as these numbers suggest. (Remember, these are only sufficient burn-ins.) It is probably the case that our results simply produce a better bound under setting 2 than they do under setting 1. This issue is discussed further in Section 7.

Another noteworthy feature of Tables 3 and 4 is that n* increases as the priors become more diffuse. Figure 1 contains two plots describing the relationship between the prior variances on λ_θ and λ_e and n*.

FIG. 1. These two plots show how the diffuseness of the priors on λ_θ and λ_e affects n*. The top plot shows n* against a_2 = b_2, where the hyperparameters associated with λ_θ are held constant at a_1 = b_1 = 1. When a_2 = b_2, the prior variance of λ_e is 1/b_2 and the prior mean is constant at 1. The bottom plot shows log(log(n*)) against a_1 = b_1, where the hyperparameters associated with λ_e are held constant at a_2 = b_2 = 1. When a_1 = b_1, the prior variance of λ_θ is 1/b_1 and the prior mean is constant at 1. In all cases m_0 was set equal to ȳ.

[The n*'s in this plot were calculated using (35).] Note that n* increases quite rapidly with the prior variance on λ_θ. While it is tempting to conclude that the chains associated with diffuse priors are relatively slow to converge, we cannot be sure that this is the case because, again, these are only sufficient burn-ins. However, our findings are entirely consistent with the work of Natarajan and McCulloch (1998), whose empirical results suggest that the mixing rate of the Gibbs sampler (for a probit-normal hierarchical model) becomes much slower as the priors become more diffuse.

7. Discussion. The quality of the upper bounds produced using Theorems 3.1 and 3.2 depends not only on the sharpness of the inequalities used to prove the theorems themselves, but also on the quality of the drift and minorization conditions used in the particular application. Consequently, it is possible, and perhaps even likely, that the chains we have analyzed actually get within 0.01 of stationarity much sooner than the n*'s in Tables 3 and 4 would suggest. For example, we know from Table 3 that a sufficient burn-in for hyperparameter setting 2 is 3415. Thus, the value 6563 from Table 4 is too large by at least a factor of 1.9. The question then becomes how conservative are the results based on Rosenthal's theorem? As we now explain, this question was addressed by van Dyk and Meng (2001) in a different context.

Hobert (2001) used Theorem 3.1 to calculate a sufficient burn-in for a Markov chain Monte Carlo (MCMC) algorithm developed in Meng and van Dyk (1999). In the Rejoinder of van Dyk and Meng (2001) an empirical estimator of the total variation distance to stationarity was developed and used to demonstrate that Hobert's upper bound is probably extremely conservative. Indeed, Hobert's sufficient burn-in was n* = 335 while van Dyk and Meng's simulation results suggested that a burn-in of 2 is sufficient. We have experimented with van Dyk and Meng's empirical techniques in our situation and have come to similar conclusions. It would be interesting to use a Markov chain whose convergence behavior is known exactly to study how the sharpness of the bounds produced by Theorems 3.1 and 3.2 changes when different drift and minorization conditions are used.

In situations where it is possible to rigorously analyze two different MCMC algorithms for the same family of intractable posteriors, it is tempting to compare the algorithms using sufficient burn-in. However, we do not believe that this is an entirely fair method of comparison. Consider using our results in this way to compare Gibbs and block Gibbs. As we mentioned above, our Gibbs sampler is more difficult to analyze than our block Gibbs sampler. This probably results in relatively lower quality drift and minorization conditions for the Gibbs sampler. Indeed, using Propositions 5.1 and 5.2 in conjunction with Theorem 3.1 almost always yields extremely large n*'s. Specifically, unless the priors are extremely informative, it is difficult to find a hyperparameter configuration under which ε_G is not effectively 0.

TABLE 5
Simulated data

Cell      1          2          3
ȳ_i    0.54816   −0.92516    0.19924

M = mK = 12;  ȳ = M^{-1} Σ_{i=1}^3 Σ_{j=1}^4 y_ij = −0.059253;  SSE = Σ_{i=1}^3 Σ_{j=1}^4 (y_ij − ȳ_i)² = 20.285

Here is a comparison. The data in Table 5 were simulated according to the model defined in Section 2 with K = 3, m = 4, a_1 = a_2 = b_1 = b_2 = 2, s_0 = 1 and m_0 = 0. We use the informative hyperparameter setting: a_1 = 5, a_2 = 2, b_1 = 20, b_2 = 20, m_0 = 0 and s_0 = 4. For the block Gibbs sampler (35) yields

‖P^{16631}((λ_0, ξ_0^{opt}), ·) − π_h(·)‖ ≤ 0.00999.

For the Gibbs sampler Propositions 5.1 and 5.2 in conjunction with Theorem 3.1 yield

‖P_G^{4.826×10^{19}}((µ_0, θ_0^{opt}, λ_0^{opt}), ·) − π_h(·)‖ ≤ 0.00999.

As starting values for the Gibbs sampler we used (θ_0^{opt}, λ_0^{opt}) = (ȳ, ȳ, ȳ, 10⁻⁶, 0.2839) (see Remark 5.2). The constants used to construct these bounds are given in Table 6. While it is probably the case that block Gibbs converges faster than Gibbs, it is unlikely that the true difference is anywhere near as large as these numbers suggest. Thus, if we use these results to compare Gibbs and block Gibbs, the former will be penalized by the fact that it is simply more analytically cumbersome.

TABLE 6
Constants used to construct total variation bounds

Sampler        γ         φ        ρ_1       c_3      d_R      r        ε
Block Gibbs   0.3956    0.3589    na        na      28.328   0.0111   0.0246
Gibbs         0.41528   na        0.41527   2.6667  26.010   0.0009   5.6×10⁻¹⁷

APPENDIX A

A.1. The elements of ξ* and V. Hobert and Geyer [(1998), page 418] show that ξ | λ, y ~ N(ξ*, V) and give the specific forms of ξ* = ξ*(λ, y) and V = V(λ, y). We restate their results here. First we let

t = Σ_{i=1}^K m_iλ_θλ_e/(λ_θ + m_iλ_e);

then

Var(θ_i | λ) = [1/(λ_θ + m_iλ_e)] [ 1 + λ_θ²/((λ_θ + m_iλ_e)(s_0 + t)) ],

Cov(θ_i, θ_j | λ) = λ_θ² / [(λ_θ + m_iλ_e)(λ_θ + m_jλ_e)(s_0 + t)],

Cov(θ_i, µ | λ) = λ_θ / [(λ_θ + m_iλ_e)(s_0 + t)],

Var(µ | λ) = 1/(s_0 + t).

Finally,

E(µ | λ) = [1/(s_0 + t)] [ Σ_{i=1}^K m_iλ_θλ_eȳ_i/(λ_θ + m_iλ_e) + m_0s_0 ]

and

E(θ_i | λ) = [λ_θ/(λ_θ + m_iλ_e)] · [1/(s_0 + t)] [ Σ_{j=1}^K m_jλ_θλ_eȳ_j/(λ_θ + m_jλ_e) + m_0s_0 ] + [m_iλ_e/(λ_θ + m_iλ_e)] ȳ_i.

Observe that E(µ | λ) is a convex combination of the ȳ_i and m_0 and, furthermore, E(θ_i | λ) is a convex combination of E(µ | λ) and ȳ_i. If we let ℓ denote the length of the convex hull of the set {ȳ_1, ȳ_2, ..., ȳ_K, m_0}, then for any i = 1, 2, ..., K,

[E(θ_i | λ) − E(µ | λ)]² ≤ ℓ²  and  [E(θ_i | λ) − ȳ_i]² ≤ ℓ².

APPENDIX B

B.1. Optimal starting values. We desire the value of (θ, µ) that minimizes

V_1(θ, µ) = φ_1 v_1(θ, µ) + φ_2 v_2(θ) = φ_1 Σ_{i=1}^K (θ_i − µ)² + φ_2 Σ_{i=1}^K m_i(θ_i − ȳ_i)².

Clearly, no matter what values are chosen for the θ_i's, the minimizing value of µ is θ̄. Thus, we need to find the value of θ that minimizes

φ_1 Σ_{i=1}^K (θ_i − θ̄)² + φ_2 Σ_{i=1}^K m_i(θ_i − ȳ_i)².

Setting the derivative with respect to θ_i equal to 0 yields

(36)  θ_i = (φ_1θ̄ + φ_2m_iȳ_i)/(φ_1 + φ_2m_i).

Summing both sides over i and dividing by K yields an equation in θ̄ whose solution can be plugged back into (36) and this yields the optimal starting value

θ̂_i = { φ_1 [ Σ_{j=1}^K (m_jȳ_j/(φ_1 + φ_2m_j)) / Σ_{j=1}^K (m_j/(φ_1 + φ_2m_j)) ] + φ_2m_iȳ_i } / (φ_1 + φ_2m_i).
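The optimal starting value is a one-liner once the weights m_j/(φ_1 + φ_2m_j) are formed. A small sketch of ours:

    import numpy as np

    def optimal_start(ybar, mvec, phi1, phi2):
        """Minimizer of V1 from Appendix B (sketch): returns (theta_hat, mu_hat)."""
        w = mvec / (phi1 + phi2 * mvec)
        theta_bar = (w * ybar).sum() / w.sum()      # solves the equation in theta_bar
        theta_hat = (phi1 * theta_bar + phi2 * mvec * ybar) / (phi1 + phi2 * mvec)
        return theta_hat, theta_hat.mean()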

APPENDIX C

C.1. Proof of Lemma 4.1. Let

f_c(x) = [(b + c/2)^α / Γ(α)] x^{α−1} e^{−x(b+c/2)},
f_β(x) = [(b + β/2)^α / Γ(α)] x^{α−1} e^{−x(b+β/2)},
f_0(x) = [b^α / Γ(α)] x^{α−1} e^{−xb}.

Note that x* is the only positive solution to f_c(x) = f_0(x). To prove the result it suffices to show that (i) R_0(β) = f_β(x)/f_0(x) > 1 for all x ∈ (0, x*) and all β ∈ (0, c) and that (ii) R_c(β) = f_β(x)/f_c(x) > 1 for all x ∈ (x*, ∞) and all β ∈ (0, c). Fix k > 0 and define a function

h(u) = ku/(1 + ku) − log(1 + ku)

for u ≥ 0. Since h(0) = 0 and h′(u) < 0, we know h(u) < 0 for u > 0. Hence,

(37)  k/[u(1 + ku)] − (1/u²) log(1 + ku) < 0

for u > 0. Define another function, g(u) = (1/u) log(1 + ku) for u > 0. Since the left-hand side of (37) is equal to g′(u), we have established that g(u) is decreasing for u > 0. Thus, if x < x* = (2α/c) log(1 + c/(2b)) and β ∈ (0, c), then

log R_0(β) = α log(1 + β/(2b)) − xβ/2
> α log(1 + β/(2b)) − (αβ/c) log(1 + c/(2b))
= αβ [ (1/β) log(1 + β/(2b)) − (1/c) log(1 + c/(2b)) ] > 0,

and (i) is established. Case (ii) is similar.  □

APPENDIX D

D.1. Proof of Lemma 5.1. First, let g(v) = v + cv^{-1}, where c > 0 and v > 0. It is easy to show that g is minimized at v̂ = √c. Thus,

1 − [ax/(ax + y)]² − [y/(bx + y)]²
= { 2bx²y² [y/x + ab(x/y)] + x²y²(b² + 4ab − a²) } / [(ax + y)²(bx + y)²]
≥ { 2bx²y² [2√(ab)] + x²y²(b² + 4ab − a²) } / [(ax + y)²(bx + y)²]
≥ x²y²(5b² + 4ab − a²) / [(ax + y)²(bx + y)²]
= x²y²(5b − a)(b + a) / [(ax + y)²(bx + y)²] > 0.  □
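Lemma 5.1 is easy to spot check numerically. The following sketch of ours evaluates the gap 1 − (ax/(ax+y))² − (y/(bx+y))² on a grid for an admissible pair (a, b):

    import numpy as np

    def lemma51_gap(a, b, x, y):
        """Gap in (24); Lemma 5.1 says it is positive whenever
        5b > a >= b > 0 and x, y > 0."""
        return 1 - (a * x / (a * x + y)) ** 2 - (y / (b * x + y)) ** 2

    xs = np.linspace(0.01, 100, 1000)
    assert all(lemma51_gap(4.9, 1.0, x, 1.0) > 0 for x in xs)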

APPENDIX E

E.1. S_G ⊆ C_G = C_G1 ∩ C_G2 ∩ C_G3. First,

S_G = { (θ, λ) : V_3(θ, λ) ≤ d } = { (θ, λ) : e^{c_3λ_θ} + e^{c_3λ_e} + δ_7/(Kδ_1λ_θ) + v_3(θ, λ) ≤ d }
⊆ { (θ, λ) : e^{c_3λ_θ} ≤ d, e^{c_3λ_e} ≤ d, δ_7/(Kδ_1λ_θ) ≤ d, v_3(θ, λ) ≤ d }
= { (θ, λ) : δ_7/(Kδ_1d) ≤ λ_θ ≤ (log d)/c_3, 0 < λ_e ≤ (log d)/c_3, v_3(θ, λ) ≤ d }.

As in the proof of Proposition 5.1, Jensen's inequality yields

[ (s_0m_0 + Kλ_θθ̄)/(s_0 + Kλ_θ) − ȳ ]² ≤ [s_0/(s_0 + Kλ_θ)](m_0 − ȳ)² + [Kλ_θ/(s_0 + Kλ_θ)](θ̄ − ȳ)² ≤ (m_0 − ȳ)² + v_3(θ, λ),

and hence S_G is contained in

C_G := { (θ, λ) : δ_7/(Kδ_1d) ≤ λ_θ ≤ (log d)/c_3, 0 < λ_e ≤ (log d)/c_3, [ (s_0m_0 + Kλ_θθ̄)/(s_0 + Kλ_θ) − ȳ ]² ≤ (m_0 − ȳ)² + d }.

Let c_4 = δ_7/(Kδ_1d) and put c_l and c_u equal to ȳ − √((m_0 − ȳ)² + d) and ȳ + √((m_0 − ȳ)² + d), respectively. Note that C_G = C_G1 ∩ C_G2 ∩ C_G3, where

C_G1 = { (θ, λ) : c_4 ≤ λ_θ ≤ (log d)/c_3 },  C_G2 = { (θ, λ) : 0 < λ_e ≤ (log d)/c_3 },

C_G3 = { (θ, λ) : c_l ≤ (s_0m_0 + Kλ_θθ̄)/(s_0 + Kλ_θ) ≤ c_u }.

APPENDIX F

F.1. Closed form expression for ε_G. Recall that

ε_G = [ (s_0 + Kc_4)/(s_0 + K log(d)/c_3) ]^{1/2} ∫_R ∫_{R^K} g_1(µ, θ) g_2(µ) dθ dµ.

A straightforward calculation shows that

(38)  ∫_{R^K} g_1(µ, θ) dθ = [c_4c_3/log d]^{K/2} Π_{i=1}^K (1 + m_i)^{-1/2} exp{ −[(log d)/(2c_3)] Σ_{i=1}^K [m_i/(1 + m_i)](µ − ȳ_i)² }.

Thus,

∫_R ∫_{R^K} g_1(µ, θ) g_2(µ) dθ dµ = [c_4c_3/log d]^{K/2} Π_{i=1}^K (1 + m_i)^{-1/2} ∫_R g_2(µ) exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ.

Now

∫_R g_2(µ) exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ
= √[(s_0 + K log(d)/c_3)/(2π)] [ ∫_{−∞}^{ȳ} exp{ −[(s_0 + K log(d)/c_3)/2](µ − c_u)² } exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ
+ ∫_{ȳ}^{∞} exp{ −[(s_0 + K log(d)/c_3)/2](µ − c_l)² } exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ ].

Define

v = { s_0 + (log d/c_3) [ K + Σ_{i=1}^K m_i/(1 + m_i) ] }^{-1}

and put

m_l = v { c_l s_0 + (log d/c_3) [ Kc_l + Σ_i ȳ_im_i/(1 + m_i) ] }

and

m_u = v { c_u s_0 + (log d/c_3) [ Kc_u + Σ_i ȳ_im_i/(1 + m_i) ] }.

Then

∫_{−∞}^{ȳ} exp{ −[(s_0 + K log(d)/c_3)/2](µ − c_u)² } exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ
= exp{ −c_u²s_0/2 − [(log d)/(2c_3)][ Kc_u² + Σ_i ȳ_i²m_i/(1 + m_i) ] + m_u²/(2v) } √(2πv) Φ((ȳ − m_u)/√v)

and

∫_{ȳ}^{∞} exp{ −[(s_0 + K log(d)/c_3)/2](µ − c_l)² } exp{ −[(log d)/(2c_3)] Σ_i [m_i/(1 + m_i)](µ − ȳ_i)² } dµ
= exp{ −c_l²s_0/2 − [(log d)/(2c_3)][ Kc_l² + Σ_i ȳ_i²m_i/(1 + m_i) ] + m_l²/(2v) } √(2πv) [1 − Φ((ȳ − m_l)/√v)].

Putting all of this together yields

ε_G = √[v(s_0 + Kc_4)] [c_4c_3/log d]^{K/2} Π_{i=1}^K (1 + m_i)^{-1/2} exp{ −[(log d)/(2c_3)] Σ_i ȳ_i²m_i/(1 + m_i) }
× [ exp{ −c_u²s_0/2 − Kc_u²(log d)/(2c_3) + m_u²/(2v) } Φ((ȳ − m_u)/√v)
+ exp{ −c_l²s_0/2 − Kc_l²(log d)/(2c_3) + m_l²/(2v) } (1 − Φ((ȳ − m_l)/√v)) ],

where Φ(·) denotes the standard normal cumulative distribution function.
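For completeness, here is the closed form transcribed into code (our sketch, using scipy's normal CDF for Φ); it is convenient for diagnosing when ε_G is effectively 0, as discussed in Section 7. The caller supplies c_4 = δ_7/(Kδ_1d).

    import numpy as np
    from scipy.stats import norm

    def eps_gibbs(d, c3, c4, ybar, mvec, s0, m0):
        """Closed-form eps_G of Appendix F (sketch). ybar: cell means;
        mvec: cell sizes."""
        K = len(ybar)
        yb = (mvec * ybar).sum() / mvec.sum()       # grand mean of the data
        ld = np.log(d)
        tau = ld / (2 * c3)                          # recurring factor (log d)/(2 c3)
        w = mvec / (1 + mvec)
        half = np.sqrt((m0 - yb) ** 2 + d)
        cl, cu = yb - half, yb + half
        v = 1 / (s0 + (ld / c3) * (K + w.sum()))
        swy = (ybar * w).sum()
        ml = v * (cl * s0 + (ld / c3) * (K * cl + swy))
        mu_ = v * (cu * s0 + (ld / c3) * (K * cu + swy))
        pref = (np.sqrt(v * (s0 + K * c4)) * (c4 * c3 / ld) ** (K / 2)
                * np.prod(1 / np.sqrt(1 + mvec))
                * np.exp(-tau * (ybar ** 2 * w).sum()))
        tu = np.exp(-cu ** 2 * s0 / 2 - K * cu ** 2 * tau + mu_ ** 2 / (2 * v))
        tl = np.exp(-cl ** 2 * s0 / 2 - K * cl ** 2 * tau + ml ** 2 / (2 * v))
        return pref * (tu * norm.cdf((yb - mu_) / np.sqrt(v))
                       + tl * (1 - norm.cdf((yb - ml) / np.sqrt(v))))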

Acknowledgment. The authors are grateful to two anonymous referees whose insightful comments led to substantial improvements in the paper.

REFERENCES

COWLES, M. K. and CARLIN, B. P. (1996). Markov chain Monte Carlo convergence diagnostics: A comparative review. J. Amer. Statist. Assoc. 91 883–904.
COWLES, M. K., ROBERTS, G. O. and ROSENTHAL, J. S. (1999). Possible biases induced by MCMC convergence diagnostics. J. Statist. Comput. Simulation 64 87–104.
COWLES, M. K. and ROSENTHAL, J. S. (1998). A simulation approach to convergence rates for Markov chain Monte Carlo algorithms. Statist. Comput. 8 115–124.
DOUC, R., MOULINES, E. and ROSENTHAL, J. S. (2002). Quantitative bounds for geometric convergence rates of Markov chains. Technical report, Dept. Statistics, Univ. Toronto.
GELFAND, A. E., HILLS, S. E., RACINE-POON, A. and SMITH, A. F. M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc. 85 972–985.
GELFAND, A. E. and SMITH, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85 398–409.
HOBERT, J. P. (2001). Discussion of "The art of data augmentation," by D. A. van Dyk and X.-L. Meng. J. Comput. Graph. Statist. 10 59–68.
HOBERT, J. P. and GEYER, C. J. (1998). Geometric ergodicity of Gibbs and block Gibbs samplers for a hierarchical random effects model. J. Multivariate Anal. 67 414–430.
JONES, G. L. (2001). Convergence rates and Monte Carlo standard errors for Markov chain Monte Carlo algorithms. Ph.D. dissertation, Univ. Florida.
JONES, G. L. and HOBERT, J. P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci. 16 312–334.
MARCHEV, D. and HOBERT, J. P. (2004). Geometric ergodicity of van Dyk and Meng's algorithm for the multivariate Student's t model. J. Amer. Statist. Assoc. 99 228–238.
MENG, X.-L. and VAN DYK, D. A. (1999). Seeking efficient data augmentation schemes via conditional and marginal augmentation. Biometrika 86 301–320.
MEYN, S. P. and TWEEDIE, R. L. (1993). Markov Chains and Stochastic Stability. Springer, London.
MEYN, S. P. and TWEEDIE, R. L. (1994). Computable bounds for geometric convergence rates of Markov chains. Ann. Appl. Probab. 4 981–1011.
NATARAJAN, R. and McCULLOCH, C. E. (1998). Gibbs sampling with diffuse proper priors: A valid approach to data-driven inference? J. Comput. Graph. Statist. 7 267–277.
NUMMELIN, E. (1984). General Irreducible Markov Chains and Non-negative Operators. Cambridge Univ. Press.
ROBERTS, G. O. and TWEEDIE, R. L. (1999). Bounds on regeneration times and convergence rates for Markov chains. Stochastic Process. Appl. 80 211–229.
ROBERTS, G. O. and TWEEDIE, R. L. (2001). Corrigendum to "Bounds on regeneration times and convergence rates for Markov chains." Stochastic Process. Appl. 91 337–338.
ROSENTHAL, J. S. (1995a). Minorization conditions and convergence rates for Markov chain Monte Carlo. J. Amer. Statist. Assoc. 90 558–566.
ROSENTHAL, J. S. (1995b). Rates of convergence for Gibbs sampling for variance component models. Ann. Statist. 23 740–761.
ROSENTHAL, J. S. (1996). Analysis of the Gibbs sampler for a model related to James–Stein estimators. Statist. Comput. 6 269–275.
TANNER, M. A. and WONG, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). J. Amer. Statist. Assoc. 82 528–550.
TIERNEY, L. (1994). Markov chains for exploring posterior distributions (with discussion). Ann. Statist. 22 1701–1762.
VAN DYK, D. A. and MENG, X.-L. (2001). The art of data augmentation (with discussion). J. Comput. Graph. Statist. 10 1–111.

SCHOOL OF STATISTICS
UNIVERSITY OF MINNESOTA, TWIN CITIES CAMPUS
313 FORD HALL, 224 CHURCH STREET SE
MINNEAPOLIS, MINNESOTA 55455, USA
E-MAIL: galin@stat.umn.edu

DEPARTMENT OF STATISTICS
UNIVERSITY OF FLORIDA
GAINESVILLE, FLORIDA 32611, USA
E-MAIL: jhobert@stat.ufl.edu