Description of the PX-HC algorithm

A Description of the PX-HC algorithm Let N = C c= N c and write C Nc K c= i= k= as, the Gibbs sampling algorithm at iteration m for continuous outcomes: Step A: For =,, J, draw θ m in the following steps: A λ m [ N Ω λ y c W T βm ξ m b m ] U m σ m, Ω λ {λ m 0}, where Ω λ = [ U m + ] σ m A β m [ N Ω β where Ω β = σ m prior variance for β y c λ m U m W W T + Σ β ξ m ] b m W σ m, and Σ β = 000 I p, Ω β, is the A3 A4 ξ m [ N Ω ξ y c W T βm K b m where Ω ξ = c,i + σ m λ m U m ] b m σ m, Ω ξ, C Nc ψ m IGA m, A m, where A m c= i= = K A m = c,i A mt H ρm A m +, +, and A m = U m µ m K X α m K gc m Z K a m c A5 α m N µ α, Ω α,

where µ α = c,i XH T ρm U m µ m K Z K ac m gc m K A6 Ω α ψ, Ω α = ψ c,i XT H ρm X + Σ α µ Nµ µ, Ω µ, where µ µ = c,i U m X α m g m c T K Z K a m H c ρm K Ω µ ψ, [ and Ω µ = ψ A7 For =,, J, C N c N µ m b A8 For =,, J, c,i K H K + c= i= τ m b m τ m IG 000 N +, ] N + N, 000 τ m + 000 τ m C N c c= i= b m µ m b + A9 σ g IG C + 0, C c= g m c + 0 A0 A For =,, J, σ m IG σ a C c= IG N + 0, C c= Nc i= K + 0, a m c T a m c + 0 y c W T βm λ m U m ξ m b m + 0 A pρ m { C N c H ρ / exp C N c ψ c= i= c= i= A mt H ρam }

where A m = U m µ X α m g c m K Z T a m c, and H ρ is a K K matrix with r, k th element ρ t r t We use parameter transformation to transform ρ to η via η = log ρ ρ Then the conditional distribution for posterior η is: pη { C N c H ρη / exp C N c ψ c= i= c= i= A mt H ρηam where ρη = expη +expη Random walk Metropolis-Hastings algorithm is used to sample η with proposal Nη old, v, where v is tuned to have a reasonable acceptance rate Step B Sample all latent variables: B For each {c, i}, U m µ U = [ J Ω U = + [ Ω U ψ and Ω U = J = MVNµ U, Ω U with λ m σ m B For each {c, i} draw [ K N Ω b b m where Ω b = y c W β m µ m + X α m + g m c ξ m b m K λ m I σ m K + H ψ ρm t= K ξ m σ m ξ m σ m y c W T β m + τ m ] } expη + expη, ] T K + Z K ac m H ρm λ m U m + β m 0 τ m ], Ω b, B3 For each c, g c Nµ gc, Ω gc with µ gc = Ω g c ψ N c i= U m µ m K X α m Z T a m T c H ρm T, and Ω gc = Nc ψ i= T H ρm T + σ g 3

B4 For c =, C, For each c, a m c MV Nµ ac, Ω ac with µ ac = Ω a c ψ N c i= T Z T H ρm and Ω ac = Nc ψ i= Z T T H ρm Z T + U m µ m K X α m gc m T, σa m I Nc B Description of the PX -HC algorithm The following steps are used to produce the sampling updates: Step C For all parameters that determine the continuous response and latent variable, the Gibbs steps are identical to the ones described in the previous section Spefically, the conditional distributions used in the Gibbs updates are identical for {λ, β, ξ, b, σ : J }, ψ, α, a, g, µ, µ b, σ a, σ g, τ, ρ Step D For = J +,, J the following conditional distributions are used to update the chain D Draw D y b m { T N+ µ,, if yb = T N µ,, if yb = 0, where T N + µ, σ and T N µ, σ are truncated normals with mean µ and variance σ truncated to 0, and, 0, respectively Also µ = W T βm γ m + λ m U m IG + ξ m b m Put ỹ b m c,i K + 0, ỹ b m W T m β = γ m y b m m λ U m ξ m b m + 0 D3 where ˆµ eλ = [ ỹ b m λ m W T m β N µ eλ, Ω eλ, ξ m b m U m ] + U m, and Ω eλ = [ ] U m m + γ 4

D4 β m N µ eβ, Ω e, β where ˆµ e β = [ ỹ b m m λ U m ξ m b m ] W W W T + Σ β, D5 and Ω e β = ξ m [ ] W W T + Σ m γ β N K b m + c,i [ ỹ b m W T m β λ m U m ] b m, Ω ξ, where Ω ξ = D6 Set β m b m = c,i K b m m + γ β m D7 For each {c, i} [ K N Ω b where Ω b = /γ m, λ m = t= ξ m y b m λ m /γ m W T βm K ξ m + τ m, and ξ m = D8 For each {c, i}, U m MV Nµ U, Ω U, where J λ m µ U = Ω U y c W β m + Ω U = J + Ω U ψ [ σ m λ m =J y b W β m µ m + X α m + g m c m ξ /γ m λ m U m + β m 0 ξ m b m ξ m b m K K τ m ], Ω b ] T K + Z K a m c H ρm J λ m and Ω U = = I σ m K + J =J + λ m I K + H ψ ρm, C Additional Simulation Plots C Model M 5

Figure : Comparison of Gelman-Rubin diagnostic plots for two loading factors, λ and λ 3 for models M and M The solid black line shows the evolution of of R for, while the dashed red line shows the evolution PX-HC for M and PX -HC for M M: GR plot for and PX HC sample of λ M: GR plot for and PX HC sample of λ 3 Gelman Rubin shrink factor 0 5 0 5 30 PX HC Gelman Rubin shrink factor 0 5 0 5 30 PX HC 0 000 4000 6000 8000 0000 0 000 4000 6000 8000 0000 last iteration in chain last iteration in chain M: GR plot for and PX HC sample of λ M: GR plot for and PX HC sample of λ 3 Gelman Rubin shrink factor 0 5 0 5 30 PX HC Gelman Rubin shrink factor 0 5 0 5 30 PX HC 0 000 4000 6000 8000 0000 0 000 4000 6000 8000 0000 last iteration in chain last iteration in chain 6

Figure : Comparison of trace plots for simulations under model M using and PX-HC scheme The blue line marks the true value of the parameter, and the red line represents the posterior mean Left side from top to bottom: trace plots for α, λ, and σa using standard Gibbs Right side from top to bottom: trace plots for α, λ, and σa using P X HC M: Trace plot for standard Gibbs sample of α M: Trace plot for PX HC sample of α α 05 00 095 α 05 00 095 5000 6000 7000 8000 9000 0000 iterations 5000 6000 7000 8000 9000 0000 iterations M: Trace plot for standard Gibbs sample of λ M: Trace plot for PX HC sample of λ λ 47 48 49 50 5 λ 47 48 49 50 5 5 53 5000 6000 7000 8000 9000 0000 iterations 5000 6000 7000 8000 9000 0000 iterations M: Trace plot for standard Gibbs sample of σ a M: Trace plot for PX HC sample of σ a σ 0 03 a 04 05 σ 0 03 04 a 05 06 5000 6000 7000 8000 9000 0000 iterations 5000 6000 7000 8000 9000 0000 iterations 7

Figure 3: Comparison of ACF plots for the three loading factors λ, =,, 3 for model M Red line shows the average ACF curve for computed from 00 replicated curves which are shown in purple The blue line shows the average ACF curve for PX-HC computed from 00 replicated curves which are shown in green M: ACF plots for and PX HC sample of λ M: ACF plots for and PX HC sample of λ ACF 00 0 04 06 08 0 PX HC ACF 00 0 04 06 08 0 PX HC 0 0 0 30 40 50 60 0 0 0 30 40 50 60 lag lag M: ACF plots for and PX HC sample of λ 3 ACF 00 0 04 06 08 0 PX HC 0 0 0 30 40 50 60 lag 8

Figure 4: Comparison of highest posterior density interval HpdI plots for the three loading factors λ, =,, 3 for model M The replication number is the order of the lower bound of HpdI s Left side: HpdI plots for λs using Right side: HpdI plots for λs using PX-HC The blue solid vertical line is the true value, which is 0, of α The red dashed vertical line is the mean estimation : 00 HpdI's for λ under H PX HC: 00 HpdI's for λ under H Coverage= 88 % Coverage= 93 % 46 47 48 49 50 5 5 53 47 48 49 50 5 5 53 54 λ λ : 00 HpdI's for λ under H PX HC: 00 HpdI's for λ under H Coverage= 89 % Coverage= 94 % 46 47 48 49 50 5 5 53 48 50 5 54 λ λ : 00 HpdI's for λ 3 under H PX HC: 00 HpdI's for λ 3 under H Coverage= 86 % Coverage= 93 % 46 48 50 5 48 50 5 54 λ 3 λ 3 9

C Model M D Additional results for the real data example 0

Figure 5: Comparison of highest posterior density interval HpdI plots for the three loading factors λ, =,, 4 for model M The replication number is the order of the lower bound of HpdI s Left side: HpdI plots for λs using Right side: HpdI plots for λs using PX-HC The blue solid vertical line is the true value, which is 0, of α The red dashed vertical line is the mean estimation : 00 HpdI's for λ under H PX HC: 00 HpdI's for λ under H Coverage= 94 % Coverage= 96 % 85 90 95 00 05 0 5 85 90 95 00 05 0 5 λ λ : 00 HpdI's for λ under H PX HC: 00 HpdI's for λ under H Coverage= 94 % Coverage= 95 % 8 9 30 3 3 33 8 9 30 3 3 λ λ : 00 HpdI's for λ 3 under H PX HC: 00 HpdI's for λ 3 under H Coverage= 94 % Coverage= 97 % 08 09 0 08 09 0 3 λ 3 λ 3 : 00 HpdI's for λ 4 under H PX HC: 00 HpdI's for λ 4 under H Coverage= 9 % Coverage= 95 % 08 09 0 3 08 09 0 3 λ 4 λ 4

Table : The fitting results for the genetic study of type diabetes TD complications dataset using the model proposed by Roy and Lin 000 Application results SNP rs784868 was previously identified to be assoated with diastolic blood pressure DBP and SNP rs358030 was previously identified to be assoated with HbAc Phenotypes of interest are DBP and systolic blood pressure SBP, two continuous outcomes, and hyperglycemia HPG, defined as HbAc greater or equal to 8, a binary outcome All phenotypes are thought to be related to type diabetes complication severity The coeffient λs assess the assoation between the phenotypes and the latent TD complication status, and αs evaluate the assoation between the latent variable and the genetic marker and the other covariates of interest See Section 5 for more details Analysis of SNP rs784868 Parameter Estimate 95% HpdI logbf SBP λ 66 653, 7077 485 DBP λ 384 3566, 40 98 HPG λ 3 00 89 0 7, 975 0-05 rs784868 α -069-037, -064 006 sex α -07-0866, -0584 67 cohort α 3 0443 099, 0585 05 treatment α 4 08-0004, 063 0366 Analysis of SNP rs358030 Parameter Estimate 95% HpdI logbf SBP λ 6868 6439, 730 83 DBP λ 3706 349, 3933 0 HPG λ 3 000 566 0 7, 740 0 7-034 rs358030 α - 0039-0049, 0-04 sex α -0758-0880, -063 6486 cohort α 3 0393 058, 053 87 treatment α 4 0088-004, 00-08