Multilevel models for analyzing people s daily moving behaviour

Σχετικά έγγραφα
MATHACHij = γ00 + u0j + rij

Summary of the model specified

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

Biostatistics for Health Sciences Review Sheet

Γενικευμένα Γραμμικά Μοντέλα (GLM) Επισκόπηση

Statistical Inference I Locally most powerful tests

90 [, ] p Panel nested error structure) : Lagrange-multiple LM) Honda [3] LM ; King Wu, Baltagi, Chang Li [4] Moulton Randolph ANOVA) F p Panel,, p Z

Λογαριθμικά Γραμμικά Μοντέλα Poisson Παλινδρόμηση Παράδειγμα στο SPSS

5.4 The Poisson Distribution.

Chapter 1 Introduction to Observational Studies Part 2 Cross-Sectional Selection Bias Adjustment

Aquinas College. Edexcel Mathematical formulae and statistics tables DO NOT WRITE ON THIS BOOKLET

Supplementary Appendix

Supplementary figures

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

FORMULAS FOR STATISTICS 1

Web-based supplementary materials for Bayesian Quantile Regression for Ordinal Longitudinal Data

Other Test Constructions: Likelihood Ratio & Bayes Tests

APPENDICES APPENDIX A. STATISTICAL TABLES AND CHARTS 651 APPENDIX B. BIBLIOGRAPHY 677 APPENDIX C. ANSWERS TO SELECTED EXERCISES 679

Table 1: Military Service: Models. Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 num unemployed mili mili num unemployed

!"!"!!#" $ "# % #" & #" '##' #!( #")*(+&#!', & - #% '##' #( &2(!%#(345#" 6##7

Queensland University of Technology Transport Data Analysis and Modeling Methodologies


Does anemia contribute to end-organ dysfunction in ICU patients Statistical Analysis

Probability and Random Processes (Part II)

6. MAXIMUM LIKELIHOOD ESTIMATION

ESTIMATION OF SYSTEM RELIABILITY IN A TWO COMPONENT STRESS-STRENGTH MODELS DAVID D. HANAGAL

ST5224: Advanced Statistical Theory II

SECTION II: PROBABILITY MODELS

Lecture 7: Overdispersion in Poisson regression

519.22(07.07) 78 : ( ) /.. ; c (07.07) , , 2008

An Introduction to Spatial Statistics: Data Types, Statistical Tools and Computer Software

Cite as: Pol Antras, course materials for International Economics I, Spring MIT OpenCourseWare ( Massachusetts

Repeated measures Επαναληπτικές μετρήσεις

χ 2 test ανεξαρτησίας

Οι απόψεις και τα συμπεράσματα που περιέχονται σε αυτό το έγγραφο, εκφράζουν τον συγγραφέα και δεν πρέπει να ερμηνευτεί ότι αντιπροσωπεύουν τις

ON NEGATIVE MOMENTS OF CERTAIN DISCRETE DISTRIBUTIONS

Solution Series 9. i=1 x i and i=1 x i.

Statistics & Research methods. Athanasios Papaioannou University of Thessaly Dept. of PE & Sport Science


PENGARUHKEPEMIMPINANINSTRUKSIONAL KEPALASEKOLAHDAN MOTIVASI BERPRESTASI GURU TERHADAP KINERJA MENGAJAR GURU SD NEGERI DI KOTA SUKABUMI

HW 3 Solutions 1. a) I use the auto.arima R function to search over models using AIC and decide on an ARMA(3,1)

Επαναληπτικό μάθημα GLM

Buried Markov Model Pairwise

Αν οι προϋποθέσεις αυτές δεν ισχύουν, τότε ανατρέχουµε σε µη παραµετρικό τεστ.

Exercises to Statistics of Material Fatigue No. 5

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018

255 (log-normal distribution) 83, 106, 239 (malus) 26 - (Belgian BMS, Markovian presentation) 32 (median premium calculation principle) 186 À / Á (goo

LAMPIRAN. Fixed-effects (within) regression Number of obs = 364 Group variable (i): kode Number of groups = 26

Βιογραφικό Σημείωμα. Διεύθυνση επικοινωνίας: Τμήμα Μαθηματικών, Πανεπιστήμιο Πατρών

Last Lecture. Biostatistics Statistical Inference Lecture 19 Likelihood Ratio Test. Example of Hypothesis Testing.

ΧΩΡΙΚΑ ΟΙΚΟΝΟΜΕΤΡΙΚΑ ΥΠΟΔΕΙΓΜΑΤΑ ΣΤΗΝ ΕΚΤΙΜΗΣΗ ΤΩΝ ΤΙΜΩΝ ΤΩΝ ΑΚΙΝΗΤΩΝ SPATIAL ECONOMETRIC MODELS FOR VALUATION OF THE PROPERTY PRICES

Solutions to Exercise Sheet 5

Μηχανική Μάθηση Hypothesis Testing

Generalized additive models in R

Numerical Analysis FMN011


Άσκηση 10, σελ Για τη μεταβλητή x (άτυπος όγκος) έχουμε: x censored_x 1 F 3 F 3 F 4 F 10 F 13 F 13 F 16 F 16 F 24 F 26 F 27 F 28 F

Οικονοµετρική ιερεύνηση των Ελλειµµάτων της Ελληνικής Οικονοµίας

ΣΧΕΔΙΑΣΜΟΣ ΔΙΚΤΥΩΝ ΔΙΑΝΟΜΗΣ. Η εργασία υποβάλλεται για τη μερική κάλυψη των απαιτήσεων με στόχο. την απόκτηση του διπλώματος

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

Άσκηση 11. Δίνονται οι παρακάτω παρατηρήσεις:

5.1 logistic regresssion Chris Parrish July 3, 2016

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

w o = R 1 p. (1) R = p =. = 1

Δεδομένα (data) και Στατιστική (Statistics)

Bayesian modeling of inseparable space-time variation in disease risk

Διδακτορική Διατριβή

( ) Multiple Comparisons on Longitudinal Data Junji Kishimoto SAS Institute Japan / Keio Univ. SFC / Univ. of Tokyo address: jpnjak@jpn.sas.

UMI Number: All rights reserved

1 (forward modeling) 2 (data-driven modeling) e- Quest EnergyPlus DeST 1.1. {X t } ARMA. S.Sp. Pappas [4]

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

CE 530 Molecular Simulation

Reliability analysis Ανάλυση αξιοπιστίας

APPENDIX B NETWORK ADJUSTMENT REPORTS JEFFERSON COUNTY, KENTUCKY JEFFERSON COUNTY, KENTUCKY JUNE 2016

Additional Results for the Pareto/NBD Model

Βιογραφικό Σημείωμα. (τελευταία ενημέρωση 20 Ιουλίου 2015) 14 Ιουλίου 1973 Αθήνα Έγγαμος

Generalized Linear Model [GLM]

ΔΗΜΟΚΡΙΤΕΙΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΡΑΚΗΣ ΣΧΟΛΗ ΕΠΙΣΤΗΜΩΝ ΑΓΩΓΗΣ

1η εργασία για το μάθημα «Αναγνώριση προτύπων»

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ. Παναγιώτης Μερκούρης ΕΚΠΑΙΔΕΥΣΗ

Stabilization of stock price prediction by cross entropy optimization

Tutorial on Multinomial Logistic Regression

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ. Λέκτορας στο Τμήμα Οργάνωσης και Διοίκησης Επιχειρήσεων, Πανεπιστήμιο Πειραιώς, Ιανουάριος 2012-Μάρτιος 2014.

Marginal effects in the probit model with a triple dummy variable interaction term

Démographie spatiale/spatial Demography

The ε-pseudospectrum of a Matrix

Parametrized Surfaces

A Bonus-Malus System as a Markov Set-Chain. Małgorzata Niemiec Warsaw School of Economics Institute of Econometrics

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

«ΑΝΑΠΣΤΞΖ ΓΠ ΚΑΗ ΥΩΡΗΚΖ ΑΝΑΛΤΖ ΜΔΣΔΩΡΟΛΟΓΗΚΩΝ ΓΔΓΟΜΔΝΩΝ ΣΟΝ ΔΛΛΑΓΗΚΟ ΥΩΡΟ»

Outline Analog Communications. Lecture 05 Angle Modulation. Instantaneous Frequency and Frequency Deviation. Angle Modulation. Pierluigi SALVO ROSSI

ΓΕΩΠΟΝΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙO ΑΘΗΝΩΝ ΤΜΗΜΑ ΑΞΙΟΠΟΙΗΣΗΣ ΦΥΣΙΚΩΝ ΠΟΡΩΝ & ΓΕΩΡΓΙΚΗΣ ΜΗΧΑΝΙΚΗΣ

Lecture 21: Properties and robustness of LSE

Εισαγωγή στην Ανάλυση Συνδιακύμανσης (Analysis of Covariance, ANCOVA)

Homework 8 Model Solution Section

ΠΑΝΔΠΗΣΖΜΗΟ ΠΑΣΡΩΝ ΣΜΖΜΑ ΖΛΔΚΣΡΟΛΟΓΩΝ ΜΖΥΑΝΗΚΩΝ ΚΑΗ ΣΔΥΝΟΛΟΓΗΑ ΤΠΟΛΟΓΗΣΩΝ ΣΟΜΔΑ ΤΣΖΜΑΣΩΝ ΖΛΔΚΣΡΗΚΖ ΔΝΔΡΓΔΗΑ

ΤΕΛΙΚΗ ΕΞΕΤΑΣΗ ΔΕΙΓΜΑ ΟΙΚΟΝΟΜΕΤΡΙΑ Ι (3ο Εξάμηνο) Όνομα εξεταζόμενου: Α.Α. Οικονομικό Πανεπιστήμιο Αθήνας -- Τμήμα ΔΕΟΣ Καθηγητής: Γιάννης Μπίλιας

Simplex Crossover for Real-coded Genetic Algolithms

Transcript:

Multilevel models for analyzing people s daily moving behaviour Matteo BOTTAI 1 Nicola SALVATI 2 Nicola ORSINI 3 13th European Colloquium on Theoretical and Quantitative Geography Lucca 5th - 9th September, 2003 1 Institute of Information Science and Technology National Research Council of Italy and Dep. Epidemiology ad Biostatistics, University of South Carolina, Columbia, USA e-mail: m.bottai@isti.cnr.it 2 Department of Statistics G.Parenti University of Florence e-mail: salvati@ds.unifi.it 3 Institute of Information Science and Technology National Research Council of Italy e-mail: n.orsini@isti.cnr.it

OUTLINE Data - hierarchical data structure Statistical model - multilevel generalized linear model Application - A survey on daily moving behaviours of the people residing in the area of Pisa 2

HIERARCHICAL DATA STRUCTURE LEVEL UNIT SUBSCRIPT RANGE TOTAL 1 Individual j = 1,, n i 1 n i 5 N=806 2 Family i = 1,, m k 1 m k 76 M=373 3 Area k = 1,, K K=35 N K = k = 1 m k i= 1 n i = 806 M K = k = 1 m k = 373 3

FEATURES OF THE DATA Dependence - Correlation between individual observations in the same level. Ignoring hierarchical data structure - Misleading statistical inference Traditional statistical models should be adapted - Using random variables and relative assumptions to describe this correlation 4

MULTILEVEL GENERALIZED LINEAR MODEL Probability distribution Y ~ Exponential Family Conditional expectation Linear predictor Link function η = V v= 1 E[ Y x, z, u] = µ x v β v + L l= 2 g ( µ ) =η R l r= 1 u ( l ) r z ( l ) r 5

2-LEVEL RANDOM INTERCEPT MODEL NORMAL L = 2 R l = 1 V = 2 j = 1,, n i i = 1,, m k 2 Distribution Y ~ N(, σ ) µ Expectation Linear predictor Link E ( Y x, z, u ) = µ η = x β + x β + µ = η (2) 1 i < µ < + u (2) (2) 1 1i 2 2 1i 1 z (2) x 1 = 1 z 1 = 1 6

3-LEVEL RANDOM INTERCEPT MODEL NORMAL L = 3 R l = 1 V = 2 j = 1,, n i i = 1,, m k k = 1,, K 2 Distribution Y ~ N(, σ ) k µ k Expectation Linear predictor Link E ( Y x, z, u, u ) = µ k k k k (2) (3) 1 i 1k k < µ k < + η = x β + x β + u z + µ = η k (2) (2) (3) (3) 1 k 1ik 2 k 2 1i 1 k 1k 1 k k u z (2) (3) x 1 k = 1 z 1 k = 1 z 1 k = 1 7

2-LEVEL RANDOM INTERCEPT MODEL POISSON L = 2 R l = 1 V = 2 j = 1,, n i i = 1,, m k Distribution Y ~ µ ) P( Expectation Linear predictor Link E ( Y x, z, u ) = µ η = x β + x β + log( µ ) = η (2) 1 i 0 < µ < + u (2) (2) 1 1i 2 2 1i 1 z (2) x 1 = 1 z 1 = 1 8

RANDOM EFFECTS DISTRIBUTION ( l) 2 u ~ N(0, σ ) l 1,..., L r lr = r = 1,..., Rl Covariance between two random effects at the same level COV = σ with l = l and r r ( l) ( l ) 2 ( ur,ur ) lrr Covariance between two random effects at different level COV ( l) ( l ) ( ur,ur ) = 0 with l l and r r 9

MAXIMUM LIKELIHOOD ESTIMATORS Define the likelihood for the data L ( Y w) = f ( Y u, w) p( u w) du - w denote a vector containing all parameters to be estimated - f ( ) is the probability distribution of the outcome - p ( ) is the probability distribution of the random effects Maximize the likelihood respect to w in order to make inferences about w Optimization Algorithms Differences between linear and non linear multilevel model 10

Fixed effects - Wald test Random effects TESTING HYPOTHESIS H0 : β v = W = ˆ βv ˆ σ ˆ β 2 H 0 : σ lr = - Likelihood Ratio test LRT = 2 abs(l1-l0) where L1 and L0 are the log-likelihoods evaluated in a model with and 2 (l) without the variance component σ lr, that is, the random effect u r. v 0 0 11

STATISTICAL SOFTWARE Several packages are available - STATA (command GLLAMM and XT) - SPLUS or R (command LME and NLME) - SAS (command PROC MIXED) - MLwiN - HLM Differences in computational algorithms - Newton-Raphson - Expectation-Maximization or Fisher Scoring - Iterative Generalized Least Squares 12

MODEL 1. xi: gllamm distance gender i.agec, f(gaussian) l(identity) i(family) number of level 1 units = 806 number of level 2 units = 373 log likelihood = -4516.0369 (some STATA output is omitted) distance Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -18.55207 4.294402-4.32 0.000-26.96894-10.13519 0-14 _Iagec_2 23.31702 9.492722 2.46 0.014 4.711627 41.92241 15-30 _Iagec_3 27.22336 8.56821 3.18 0.001 10.42997 44.01674 31-44 _Iagec_4 20.49728 8.890619 2.31 0.021 3.071982 37.92257 45-59 _Iagec_5 2.963706 10.46916 0.28 0.777-17.55547 23.48288 60-74 _Iagec_6-5.604856 29.3878-0.19 0.849-63.20389 51.99418 >74 _cons 24.56164 8.354864 2.94 0.003 8.186403 40.93687 Variance at level 1 3227.0578 (209.92252) Variances and covariances of random effects level 2 (family) var(1): 1562.0955 (307.4231) INTRA-FAMILY CORRELATION = 1562/(3227+1562) = 0.32 13

MODEL 2. xi: gllamm distance gender i.agec, f(gaussian) l(identity) i(family area) number of level 1 units = 806 number of level 2 units = 373 number of level 3 units = 35 log likelihood = -4513.9259 (some STATA output is omitted) distance Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -18.57917 4.264296-4.36 0.000-26.93704-10.22131 0-14 _Iagec_2 22.05173 9.455582 2.33 0.020 3.519131 40.58433 15-30 _Iagec_3 26.95951 8.51035 3.17 0.002 10.27953 43.63948 31-44 _Iagec_4 20.09742 8.85395 2.27 0.023 2.743992 37.45084 45-59 _Iagec_5 2.625714 10.39971 0.25 0.801-17.75734 23.00877 60-74 _Iagec_6-6.245908 29.18454-0.21 0.831-63.44656 50.95474 >74 _cons 24.89028 8.625134 2.89 0.004 7.985325 41.79523 Variance at level 1 3180.402 (209.47161) Variances and covariances of random effects level 2 (family) var(1): 1479.8355 (300.42951) scalar LRT = 2*abs(-4513.9259 - -4516.0369) level 3 (area).di chiprob(1, LRT) var(1): 135.86588 (95.478454) 0.03990366 14

MODEL 1 (adaptive quadrature). xi: gllamm distance gender i.agec, f(gaussian) l(identity) i(family) adapt number of level 1 units = 806 number of level 2 units = 373 log likelihood = -4508.9392 (some STATA output is omitted) distance Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -18.65784 3.999824-4.66 0.000-26.49735-10.81833 0-14 _Iagec_2 22.63511 9.034982 2.51 0.012 4.926874 40.34335 15-30 _Iagec_3 26.48049 7.971745 3.32 0.001 10.85615 42.10482 31-44 _Iagec_4 20.19226 8.470851 2.38 0.017 3.589694 36.79482 45-59 _Iagec_5 2.842191 10.06534 0.28 0.778-16.88551 22.56989 60-74 _Iagec_6-5.707835 27.53451-0.21 0.836-59.67449 48.25882 >74 _cons 24.60227 8.010341 3.07 0.002 8.902292 40.30225 Variance at level 1 2691.102 (209.34499) Variances and covariances of random effects level 2 (family) var(1): 2233.4213 (359.34996) INTRA-FAMILY CORRELATION = 2233/(2691+2233) = 0.45 15

MODEL 2 (adaptive quadrature). xi: gllamm distance gender i.agec, f(gaussian) l(identity) i(family area) adapt number of level 1 units = 806 number of level 2 units = 373 number of level 3 units = 35 log likelihood = -4507.8158 (some STATA output is omitted) distance Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -18.64834 3.997244-4.67 0.000-26.48279-10.81388 0-14 _Iagec_2 21.86722 9.037881 2.42 0.016 4.153303 39.58114 15-30 _Iagec_3 26.34994 7.970704 3.31 0.001 10.72765 41.97223 31-44 _Iagec_4 19.98335 8.472274 2.36 0.018 3.378 36.5887 45-59 _Iagec_5 2.683085 10.04687 0.27 0.789-17.00843 22.3746 60-74 _Iagec_6-6.018096 27.51967-0.22 0.827-59.95566 47.91947 >74 _cons 24.92912 8.274806 3.01 0.003 8.710801 41.14744 Variance at level 1 2692.1052 (209.67197) Variances and covariances of random effects level 2 (family) var(1): 2125.0703 (359.25033) scalar LRT = 2*abs(-4508.9392 - -4507.8158) level 3 (area).di chiprob(1, LRT) var(1):103.8059(92.49525).13389047 16

MODEL 3. xi: gllamm ntrip gender i.agec, f(poisson) l(log) i(family) number of level 1 units = 806 number of level 2 units = 373 log likelihood = -1628.4103 (some STATA output is omitted) ntrip Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -.0464929.0359264-1.29 0.196 -.1169073.0239215 0-14 _Iagec_2.2817658.0825903 3.41 0.001.1198919.4436398 15-30 _Iagec_3.2987986.0781225 3.82 0.000.1456814.4519158 31-44 _Iagec_4.294807.0778875 3.79 0.000.1421503.4474638 45-59 _Iagec_5.3061211.088164 3.47 0.001.1333227.4789194 60-74 _Iagec_6 -.0988595.2861549-0.35 0.730 -.6597128.4619938 >74 _cons 1.144427.073767 15.51 0.000.9998468 1.289008 Variances and covariances of random effects level 2 (family) var(1):.03422704 (.00978456). display 2*abs(-1628.4103 - -1638.08 ) (-1638.08 from poisson regression) 19.3392. display chiprob(1, 19.3392) 0.00001094 17

MODEL 3 (adaptive quadrature). xi: gllamm ntrip gender i.agec, f(poisson) l(log) i(family) adapt number of level 1 units = 806 number of level 2 units = 373 log likelihood = -1628.4102 ntrip Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- gender -.0464929.0359265-1.29 0.196 -.1169076.0239218 0-14 _Iagec_2.2817775.0825924 3.41 0.001.1198994.4436555 15-30 _Iagec_3.2988007.0781228 3.82 0.000.1456829.4519185 31-44 _Iagec_4.2948099.0778881 3.79 0.000.142152.4474678 45-59 _Iagec_5.3061284.088165 3.47 0.001.1333281.4789287 60-74 _Iagec_6 -.0988579.2861577-0.35 0.730 -.6597167.4620009 >74 _cons 1.144424.0737682 15.51 0.000.9998409 1.289007 Variances and covariances of random effects level 2 (family) var(1):.03422967 (.00978768). display 2*abs(-1628.4102 - -1638.08 ) (-1638.08 from poisson regression) 19.3392. display chiprob(1, 19.3392) 0.00001094 18

SUMMARY Close relation between features of the data and statistical techniques - Multilevel Statistical models take into account a hierarchical data structure Define a multilevel generalized linear model and obtained MLEs using the STATA command GLLAMM - Normal and Poisson probability distributions as special case in the exponential family 19

POTENTIAL FUTURE RESEARCH Testing variance components and confidence intervals - Score test Assessing the adequacy of multilevel models - Assumptions of the random variables Multi-stage sampling design 20

REFERENCES Raudenbush S.W., Bryk A.S. (2002) Hierarchical Linear Models, Sage Publications. McCulloch C.E., Searle S.R. (2001) Generalized, Linear and Mixed Models, Wiley Series in Probability and Statistics. Rabe-Hesketh, S., Pickles, A. and Skrondal, A. (2001b). GLLAMM Manual. Technical Report 2001/01, Department of Biostatistics and Computing, Institute of Psychiatry, King's College, London. 21