Generalized Linear Model [GLM]

Email: nkom@kku.ac.th

A Little History
- Multiple linear regression: normal distribution & identity link (Legendre, Gauss: early 19th century).
- ANOVA: normal distribution & identity link (Fisher: 1920s to 1935).
- Likelihood function: a general approach to inference about any statistical model (Fisher, 1922).
- Dilution assays: a binomial distribution with complementary log-log link (Fisher, 1922).
- Exponential family: class of distributions with sufficient statistics for parameters (Fisher, 1934).
- Probit analysis: binomial distribution & probit link (Bliss, 1935).
- Logit for proportions: binomial distribution & logit link (Berkson, 1944; Dyke & Patterson, 1952).

A Little History (continued)
- Item analysis: Bernoulli distribution & logit link (Rasch, 1960).
- Log-linear models for counts: Poisson distribution & log link (Birch, 1963).
- Regressions for survival data: exponential distribution & reciprocal or log link (Feigl & Zelen, 1965; Zippin & Armitage, 1966; Glasser, 1967).
- Inverse polynomials: Gamma distribution & reciprocal link (Nelder, 1966).

Nelder & Wedderburn (1972) provided the unification. They showed:
- All the previously mentioned models are special cases of one general model, the Generalized Linear Model.
- The MLE for all these models can be obtained using the same algorithm.
- All of the models listed have distributions in the Exponential Dispersion Family.

Generalized Linear Model (GLM), Nelder & Wedderburn (1972)
- Continuous data with continuous data: regression
- Continuous data with categorical data: ANOVA

Generalized Linear Model [GLM]
A GLM has 3 components:
- Random component
- Systematic component
- Link function

$E(Y) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$

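- Random component
Identifies the response variable $Y$ and the probability distribution assumed for it (a type of exponential family). It corresponds to the left-hand side of

$E(Y) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$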

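- Systematic component
Specifies the explanatory variables, which enter on the right-hand side of

$E(Y) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$

This linear combination of the explanatory variables is called the linear predictor.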

Some $x$ may be built from other $x$ variables, for example $x_3 = x_1 x_2$ ($x_3$ is the interaction between $x_1$ and $x_2$) or $x_3 = x_1^2$.
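A derived term is simply another column in the design matrix, so the model stays linear in the coefficients. The following is a minimal sketch in Python; the predictor values and the helper names `design_row` and `linear_predictor` are made up for illustration and do not come from the slides.

```python
# Hypothetical predictor values, for illustration only.
x1 = [1.0, 2.0, 3.0]
x2 = [0.0, 1.0, 1.0]

def design_row(a, b):
    """Return [1, x1, x2, x1*x2, x1^2] for one observation."""
    return [1.0, a, b, a * b, a ** 2]

# One row per observation; the last two columns are derived terms.
X = [design_row(a, b) for a, b in zip(x1, x2)]

def linear_predictor(row, coefs):
    """Linear predictor: sum of coefficient * column value."""
    return sum(c * v for c, v in zip(coefs, row))

for row in X:
    print(row)
```

The interaction and squared columns are nonlinear in the raw variables, but the linear predictor is still a plain weighted sum of columns.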

- Link function
Specifies the function that relates the mean $\mu = E(Y)$ to the linear predictor.

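The link function $g(\cdot)$ connects $\mu$ to the linear predictor:

$g(\mu) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$

The simplest link function is $g(\mu) = \mu$, called the identity link:

$\mu = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$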

Log link: the loglinear model uses $g(\mu) = \log(\mu)$:

$\log(\mu) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$

Logit link: the logit model uses $g(\mu) = \log\left(\frac{\mu}{1-\mu}\right)$:

$\log\left(\frac{\mu}{1-\mu}\right) = \alpha + \beta_1 x_1 + \cdots + \beta_k x_k$
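The three links above, together with the inverses that map a linear predictor back to a mean, can be sketched in a few lines of Python. The function names are chosen here for illustration and are not part of any library.

```python
import math

def identity(mu):
    """Identity link: g(mu) = mu."""
    return mu

def log_link(mu):
    """Log link: g(mu) = log(mu), defined for mu > 0."""
    return math.log(mu)

def logit(mu):
    """Logit link: g(mu) = log(mu / (1 - mu)), defined for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))

def inv_log(eta):
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)

def inv_logit(eta):
    """Inverse of the logit link: mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Round trip: applying a link and then its inverse recovers the mean.
print(inv_logit(logit(0.25)))
print(inv_log(log_link(3.0)))
```

The inverse link is what turns fitted coefficients into fitted means, for example fitted probabilities in a logit model.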

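Table 1

Random component   Link                Model
----------------------------------------------------------------
Normal             Identity            Regression
Normal             Identity            Analysis of variance
Normal             Identity            Analysis of covariance
Bernoulli          Logit               Logistic regression
Poisson            Log                 Loglinear
Multinomial        Generalized logit   Multinomial response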

Stata link functions are

Link function          glm option
----------------------------------------
identity               link(identity)
log                    link(log)
logit                  link(logit)
probit                 link(probit)
complementary log-log  link(cloglog)
odds power             link(opower #)
power                  link(power #)
negative binomial      link(nbinomial)
log-log                link(loglog)
log-complement         link(logc)

Stata distribution families are

Family                 glm option
----------------------------------------
Gaussian (normal)      family(gaussian)
Inverse Gaussian       family(igaussian)
Bernoulli/binomial     family(binomial)
Poisson                family(poisson)
Negative binomial      family(nbinomial)
Gamma                  family(gamma)

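Example: snoring and heart disease (HD = heart disease, NHD = no heart disease)

snore     HD    NHD   Total
---------------------------
    0     24   1355    1379
    2     35    603     638
    4     21    192     213
    5     30    224     254

$\log\left(\frac{\mu}{1-\mu}\right) = \alpha + \beta_1 x_1$

glm hd1 snore, family(binomial n) link(logit)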

GLM

. input snore hd1 hd0

         snore        hd1        hd0
  1. 0 24 1355
  2. 2 35 603
  3. 4 21 192
  4. 5 30 224
  5. end

. generate n=hd0+hd1

. glm hd1 snore, family(binomial n) link(logit)

Iteration 0:   log likelihood = -11.539348
Iteration 1:   log likelihood = -11.530734
Iteration 2:   log likelihood = -11.530733

Generalized linear models                          No. of obs      =         4
Optimization     : ML: Newton-Raphson              Residual df     =         2
                                                   Scale param     =         1
Deviance         =  2.808911793                    (1/df) Deviance =  1.404456
Pearson          =  2.874323296                    (1/df) Pearson  =  1.437162

Variance function: V(u) = u*(1-u/n)                [Binomial]
Link function    : g(u) = ln(u/(n-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -11.53073319                    AIC             =  6.765367
BIC              =  .0363230709

------------------------------------------------------------------------------
         hd1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       snore |   .3973366   .0500107     7.95   0.000     .2993175    .4953557
       _cons |  -3.866248   .1662144   -23.26   0.000    -4.192022   -3.540474
------------------------------------------------------------------------------
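The output reports that the fit was obtained by Newton-Raphson. As a rough sketch of what that involves for a grouped-binomial logit model with one predictor, the Python code below iterates the score and information equations on the four snoring rows. It is not Stata's implementation: the starting values (pooled logit for the intercept, zero slope), the fixed iteration count, and all function names are assumptions made here for illustration.

```python
import math

# Snoring data from the slides: score, heart-disease cases, group size.
snore = [0, 2, 4, 5]
hd1 = [24, 35, 21, 30]
n = [1379, 638, 213, 254]

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_grouped_logit(x, y, m, iterations=25):
    """Newton-Raphson for logit(p_i) = a + b*x_i with y_i ~ Binomial(m_i, p_i)."""
    # Assumed starting point: pooled log-odds for the intercept, zero slope.
    p0 = sum(y) / sum(m)
    a, b = math.log(p0 / (1.0 - p0)), 0.0
    for _ in range(iterations):
        p = [inv_logit(a + b * xi) for xi in x]
        # Score vector (gradient of the log likelihood).
        ga = sum(yi - mi * pi for yi, mi, pi in zip(y, m, p))
        gb = sum(xi * (yi - mi * pi) for xi, yi, mi, pi in zip(x, y, m, p))
        # Information matrix entries, weights m_i * p_i * (1 - p_i).
        w = [mi * pi * (1.0 - pi) for mi, pi in zip(m, p)]
        iaa = sum(w)
        iab = sum(wi * xi for wi, xi in zip(w, x))
        ibb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = iaa * ibb - iab * iab
        # Newton step: add inverse(information) times score.
        a += (ibb * ga - iab * gb) / det
        b += (iaa * gb - iab * ga) / det
    # Standard errors from the inverse information of the last iteration;
    # once converged this matches the value at the final estimates.
    return a, b, math.sqrt(ibb / det), math.sqrt(iaa / det)

def deviance(x, y, m, a, b):
    """Binomial deviance: 2 * sum of observed * log(observed / fitted)."""
    d = 0.0
    for xi, yi, mi in zip(x, y, m):
        mu = mi * inv_logit(a + b * xi)
        d += 2.0 * (yi * math.log(yi / mu)
                    + (mi - yi) * math.log((mi - yi) / (mi - mu)))
    return d

a, b, se_a, se_b = fit_grouped_logit(snore, hd1, n)
dev = deviance(snore, hd1, n, a, b)
print(a, b, se_a, se_b, dev)
```

The estimates, standard errors, and deviance can be compared with the `_cons`, `snore`, and `Deviance` entries in the Stata output above.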

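Analysis of Fit - Deviance

Log likelihood
- Random component
- Logit link

For the constant-only model with $n_1$ events and $n_0$ non-events among $n = n_1 + n_0$ observations:

$\text{Log likelihood} = n_1 \ln(n_1) + n_0 \ln(n_0) - n \ln(n)$

$\text{Deviance} = -2\left[n_1 \ln(n_1) + n_0 \ln(n_0) - n \ln(n)\right]$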

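Example: CHD

Age   chd   Phat        l1
------------------------------------
 20     0   0.043479    -0.0444523
 23     0   0.059621    -0.0614728
 24     0   0.066153    -0.0684424
...
 69     1   0.912465    -0.091606
------------------------------------
Sum                     -53.6765477

Log likelihood and deviance for the constant-only model
- The deviance (D) is computed from the log likelihood
- It is used to assess goodness of fit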

$\text{Log likelihood} = n_1 \ln(n_1) + n_0 \ln(n_0) - n \ln(n)$
$= 43\ln(43) + 57\ln(57) - 100\ln(100)$
$= 161.7316 + 230.45392 - 460.51702$
$= -68.331491$

$\text{Deviance} = -2\left[n_1 \ln(n_1) + n_0 \ln(n_0) - n \ln(n)\right]$
$= -2(-68.331491)$
$= 136.66298$
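The arithmetic for the constant-only model is easy to check. Here is a small Python sketch that uses only the counts 43, 57, and 100 given above; the variable names are chosen for illustration.

```python
import math

n1, n0 = 43, 57   # events and non-events from the slides
n = n1 + n0

# Constant-only log likelihood: n1*ln(n1) + n0*ln(n0) - n*ln(n)
ll_null = n1 * math.log(n1) + n0 * math.log(n0) - n * math.log(n)

# Deviance is -2 times the log likelihood.
dev_null = -2.0 * ll_null

print(round(ll_null, 6), round(dev_null, 5))
```

The same expression is the Bernoulli log likelihood evaluated at the fitted proportion 43/100, since $n_1\ln(n_1/n) + n_0\ln(n_0/n)$ expands to the three terms above.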

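$\text{Log likelihood} = \sum_{i=1}^{n}\left[y_i \ln(\hat{\pi}_i) + (1 - y_i)\ln(1 - \hat{\pi}_i)\right]$

For the first observation (age 20):

$\hat{\pi}_1 = \frac{e^{-5.309453 + .1109211(20)}}{1 + e^{-5.309453 + .1109211(20)}} = 0.04347874$

$\text{Log likelihood} = -53.6765477$

$\text{Deviance} = -2\sum_{i=1}^{n}\left[y_i \ln(\hat{\pi}_i) + (1 - y_i)\ln(1 - \hat{\pi}_i)\right] = -2(-53.67654) = 107.3531$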
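To make the per-observation terms concrete, the sketch below recomputes the fitted probability and log-likelihood contribution for the four rows shown in the CHD table, using the rounded coefficients quoted above. Only those four rows appear in the slides, so the full sum of -53.6765477 cannot be reproduced here. The function names are illustrative.

```python
import math

# Rounded coefficients quoted in the slides.
b0, b1 = -5.309453, 0.1109211

def p_hat(age):
    """Fitted probability: exp(eta) / (1 + exp(eta)) with eta = b0 + b1*age."""
    eta = b0 + b1 * age
    return math.exp(eta) / (1.0 + math.exp(eta))

def contribution(y, age):
    """One observation's term: y*ln(p) + (1 - y)*ln(1 - p)."""
    p = p_hat(age)
    return y * math.log(p) + (1 - y) * math.log(1.0 - p)

# (age, chd) pairs listed in the slides' table.
rows = [(20, 0), (23, 0), (24, 0), (69, 1)]
for age, y in rows:
    print(age, y, round(p_hat(age), 6), round(contribution(y, age), 7))
```

Summing `contribution` over all 100 observations would give the model log likelihood, and multiplying that sum by -2 gives the deviance.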

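Model Statistics

Akaike information criterion (AIC), where $L(M_k)$ is the log likelihood of model $M_k$, $p$ is the number of parameters, and $n$ is the number of observations:

$AIC = \frac{-2L(M_k) + 2p}{n}$

$AIC = \frac{-2(-53.676546) + 2(2)}{100} = 1.1135309$

A smaller AIC indicates a better-fitting model.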
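The formula can be checked against the log likelihoods printed in the Stata outputs in these slides. A short Python sketch follows; `aic_per_obs` is a name invented here. Note that this version divides by $n$, as the slides and this `glm` output do, whereas AIC is often reported without that division.

```python
def aic_per_obs(loglik, p, n):
    """AIC as defined on the slide: (-2*loglik + 2*p) / n."""
    return (-2.0 * loglik + 2.0 * p) / n

# CHD model with constant and age: 2 parameters, 100 observations.
print(aic_per_obs(-53.676546, 2, 100))

# CHD model with constant only: 1 parameter, 100 observations.
print(aic_per_obs(-68.33149136, 1, 100))

# Snoring model: 2 parameters, 4 grouped observations.
print(aic_per_obs(-11.53073319, 2, 4))
```

The model with age has the smaller value of the two CHD fits, in line with the smaller-is-better reading.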

. glm chd age, family(binomial) link(logit)

Iteration 0:   log likelihood = -53.710416
Iteration 1:   log likelihood = -53.676576
Iteration 2:   log likelihood = -53.676546
Iteration 3:   log likelihood = -53.676546

Generalized linear models                          No. of obs      =       100
Optimization     : ML: Newton-Raphson              Residual df     =        98
                                                   Scale param     =         1
Deviance         =  107.3530927                    (1/df) Deviance =   1.09544
Pearson          =  101.9429241                    (1/df) Pearson  =  1.040234

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -53.67654635                    AIC             =  1.113531
BIC              =  98.14275232

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531

Log likelihood ratio

Fit the model with the constant only:

. glm chd, f(b) l(l)

Iteration 0:   log likelihood = -68.373484
Iteration 1:   log likelihood = -68.331492
Iteration 2:   log likelihood = -68.331491

Generalized linear models                          No. of obs      =       100
Optimization     : ML: Newton-Raphson              Residual df     =        99
                                                   Scale param     =         1
Deviance         =  136.6629827                    (1/df) Deviance =  1.380434
Pearson          =  99.99999993                    (1/df) Pearson  =  1.010101

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -68.33149136                    AIC             =   1.38663
BIC              =  132.0578125

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -.2818511   .2019893    -1.40   0.163    -.6777429    .1140406
------------------------------------------------------------------------------

The log likelihood is -68.331491, so
Deviance = -2(-68.331491) = 136.6629827
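For a constant-only logit model the estimate has a closed form: the intercept is the log odds of the outcome, and its standard error is $\sqrt{1/n_1 + 1/n_0}$. The Python sketch below applies this to the counts 43 and 57 used earlier in the slides; the variable names are illustrative.

```python
import math

n1, n0 = 43, 57   # chd = 1 and chd = 0 counts from the slides

# Intercept of the constant-only logit model: log odds of the outcome.
cons = math.log(n1 / n0)

# Standard error from the inverse information: sqrt(1/n1 + 1/n0).
se = math.sqrt(1.0 / n1 + 1.0 / n0)

# Wald z statistic for the intercept.
z = cons / se

print(round(cons, 7), round(se, 7), round(z, 2))
```

These values can be compared with the `_cons` row of the output above.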

Fit the model with the constant and age:

. glm chd age, f(b) l(l)

Iteration 0:   log likelihood = -53.710416
Iteration 1:   log likelihood = -53.676576
Iteration 2:   log likelihood = -53.676546
Iteration 3:   log likelihood = -53.676546

Generalized linear models                          No. of obs      =       100
Optimization     : ML: Newton-Raphson              Residual df     =        98
                                                   Scale param     =         1
Deviance         =  107.3530927                    (1/df) Deviance =   1.09544
Pearson          =  101.9429241                    (1/df) Pearson  =  1.040234

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -53.67654635                    AIC             =  1.113531
BIC              =  98.14275232

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531
------------------------------------------------------------------------------

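$G = -2\ln\left[\frac{\text{likelihood without the variable}}{\text{likelihood with the variable}}\right]$

$G = -2\ln\left[\frac{\left(\frac{n_1}{n}\right)^{n_1}\left(\frac{n_0}{n}\right)^{n_0}}{\prod_{i=1}^{n}\hat{\pi}_i^{\,y_i}(1 - \hat{\pi}_i)^{(1 - y_i)}}\right]$

$G = 2\left\{\sum_{i=1}^{n}\left[y_i\ln(\hat{\pi}_i) + (1 - y_i)\ln(1 - \hat{\pi}_i)\right] - \left[n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n)\right]\right\}$

$G = 2\left\{-53.677 - \left[43\ln(43) + 57\ln(57) - 100\ln(100)\right]\right\} = 29.31$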
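The likelihood ratio statistic is twice the difference between the two log likelihoods. The Python sketch below recomputes it from the numbers in the slides and adds a p-value, assuming $G$ is referred to a chi-square distribution with 1 degree of freedom (one added parameter, age). For 1 degree of freedom the upper tail can be written with the complementary error function, so no statistics library is needed.

```python
import math

# Log likelihood of the model with age, from the Stata output.
ll_full = -53.676546

# Constant-only log likelihood from the counts 43, 57, 100.
ll_null = 43 * math.log(43) + 57 * math.log(57) - 100 * math.log(100)

# Likelihood ratio statistic.
G = 2.0 * (ll_full - ll_null)

# Upper-tail probability of chi-square with 1 df: P(Z^2 > G) = erfc(sqrt(G/2)).
p_value = math.erfc(math.sqrt(G / 2.0))

print(round(G, 2), p_value)
```

Equivalently, $G$ is the difference between the two deviances, 136.6629827 - 107.3530927.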

. logit chd age

Iteration 0:   log likelihood = -68.331491
Iteration 1:   log likelihood = -54.170558
Iteration 2:   log likelihood = -53.681645
Iteration 3:   log likelihood = -53.676547
Iteration 4:   log likelihood = -53.676546

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(1)      =      29.31
                                                  Prob > chi2     =     0.0000
Log likelihood = -53.676546                       Pseudo R2       =     0.2145

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531
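Several entries in this table follow directly from the coefficient, its standard error, and the two log likelihoods. The Python sketch below recomputes the Wald z, the 95% confidence interval for `age`, and the pseudo R2. It assumes the interval is the usual normal-based one (coefficient plus or minus 1.959964 standard errors) and that the pseudo R2 is one minus the ratio of the fitted to the constant-only log likelihood. Both assumptions agree with the printed values.

```python
# Values copied from the Stata output for the age coefficient.
coef, se = 0.1109211, 0.0240598

# Log likelihoods: fitted model and constant-only model (iteration 0).
ll_full, ll_null = -53.676546, -68.331491

# Wald z statistic: coefficient divided by its standard error.
z = coef / se

# Normal-based 95% confidence interval (1.959964 is the 97.5% normal quantile).
z975 = 1.959964
ci_low, ci_high = coef - z975 * se, coef + z975 * se

# Pseudo R2 as one minus the ratio of the two log likelihoods.
pseudo_r2 = 1.0 - ll_full / ll_null

print(round(z, 2), round(ci_low, 7), round(ci_high, 7), round(pseudo_r2, 4))
```

The LR chi2(1) value of 29.31 in the same output is twice the difference between the two log likelihoods used here.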