Modern Regression HW #8 Solutions

Σχετικά έγγραφα
DirichletReg: Dirichlet Regression for Compositional Data in R

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

DirichletReg: Dirichlet Regression for Compositional Data in R

Ανάλυση Δεδομένων με χρήση του Στατιστικού Πακέτου R

R & R- Studio. Πασχάλης Θρήσκος PhD Λάρισα

Numerical Analysis FMN011

An Introduction to Splines

ES440/ES911: CFD. Chapter 5. Solution of Linear Equation Systems

Partial Differential Equations in Biology The boundary element method. March 26, 2013

5.1 logistic regresssion Chris Parrish July 3, 2016

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions

Section 8.3 Trigonometric Equations

The Simply Typed Lambda Calculus

Finite Field Problems: Solutions

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

TMA4115 Matematikk 3

Areas and Lengths in Polar Coordinates

Γραµµική Παλινδρόµηση

EE512: Error Control Coding

Inverse trigonometric functions & General Solution of Trigonometric Equations

Written Examination. Antennas and Propagation (AA ) April 26, 2017.

Approximation of distance between locations on earth given by latitude and longitude

2. Μηχανικό Μαύρο Κουτί: κύλινδρος με μια μπάλα μέσα σε αυτόν.

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

Areas and Lengths in Polar Coordinates

Section 7.6 Double and Half Angle Formulas

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Math 6 SL Probability Distributions Practice Test Mark Scheme

Supplementary Appendix

Figure A.2: MPC and MPCP Age Profiles (estimating ρ, ρ = 2, φ = 0.03)..

Supplementary figures

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 24/3/2007

Μενύχτα, Πιπερίγκου, Σαββάτης. ΒΙΟΣΤΑΤΙΣΤΙΚΗ Εργαστήριο 6 ο

C.S. 430 Assignment 6, Sample Solutions

F19MC2 Solutions 9 Complex Analysis

Homework 8 Model Solution Section

Example Sheet 3 Solutions

Chapter 6 BLM Answers

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?

waffle Chris Parrish June 18, 2016

Homework 3 Solutions

Matrices and Determinants

6.3 Forecasting ARMA processes

Section 8.2 Graphs of Polar Equations

2. Let H 1 and H 2 be Hilbert spaces and let T : H 1 H 2 be a bounded linear operator. Prove that [T (H 1 )] = N (T ). (6p)

Si + Al Mg Fe + Mn +Ni Ca rim Ca p.f.u

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is

t ts P ALEPlot t t P rt P ts r P ts t r P ts

Srednicki Chapter 55

6.1. Dirac Equation. Hamiltonian. Dirac Eq.


Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

Lampiran 1 Output SPSS MODEL I

2 Composition. Invertible Mappings

Problem Set 3: Solutions

Άσκηση 11. Δίνονται οι παρακάτω παρατηρήσεις:

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

Math221: HW# 1 solutions

derivation of the Laplacian from rectangular to spherical coordinates

Lecture 26: Circular domains

PARTIAL NOTES for 6.1 Trigonometric Identities

Πρόβλημα 1: Αναζήτηση Ελάχιστης/Μέγιστης Τιμής

Other Test Constructions: Likelihood Ratio & Bayes Tests

CHAPTER 101 FOURIER SERIES FOR PERIODIC FUNCTIONS OF PERIOD

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics

1. For each of the following power series, find the interval of convergence and the radius of convergence:

Reminders: linear functions

10.7 Performance of Second-Order System (Unit Step Response)

An Introduction to Signal Detection and Estimation - Second Edition Chapter II: Selected Solutions

Μηχανική Μάθηση Hypothesis Testing

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

Solutions to Exercise Sheet 5

( ) 2 and compare to M.

Queensland University of Technology Transport Data Analysis and Modeling Methodologies


ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

Ηλεκτρονικοί Υπολογιστές IV

Standardized Coefficients t Sig.

Homework for 1/27 Due 2/5

ΚΥΠΡΙΑΚΗ ΜΑΘΗΜΑΤΙΚΗ ΕΤΑΙΡΕΙΑ IΔ ΚΥΠΡΙΑΚΗ ΜΑΘΗΜΑΤΙΚΗ ΟΛΥΜΠΙΑΔΑ ΑΠΡΙΛΙΟΥ 2013 Β & Γ ΛΥΚΕΙΟΥ.

Dynamic types, Lambda calculus machines Section and Practice Problems Apr 21 22, 2016

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Generalized additive models in R

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

Solution Series 9. i=1 x i and i=1 x i.

ΤΕΧΝΟΛΟΓΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΥΠΡΟΥ ΤΜΗΜΑ ΝΟΣΗΛΕΥΤΙΚΗΣ

Overview. Transition Semantics. Configurations and the transition relation. Executions and computation

Wan Nor Arifin under the Creative Commons Attribution-ShareAlike 4.0 International License. 1 Introduction 1

ECE 308 SIGNALS AND SYSTEMS FALL 2017 Answers to selected problems on prior years examinations

Statistical Inference I Locally most powerful tests

Ψηφιακή Οικονομία. Διάλεξη 11η: Markets and Strategic Interaction in Networks Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

HW 3 Solutions 1. a) I use the auto.arima R function to search over models using AIC and decide on an ARMA(3,1)

Ιστορία νεότερων Μαθηματικών

PENGARUHKEPEMIMPINANINSTRUKSIONAL KEPALASEKOLAHDAN MOTIVASI BERPRESTASI GURU TERHADAP KINERJA MENGAJAR GURU SD NEGERI DI KOTA SUKABUMI

[2] T.S.G. Peiris and R.O. Thattil, An Alternative Model to Estimate Solar Radiation

9.09. # 1. Area inside the oval limaçon r = cos θ. To graph, start with θ = 0 so r = 6. Compute dr

Transcript:

36-40 Modern Regression HW #8 Solutions Problem [25 points] (a) DUE: /0/207 at 3PM This is still a linear regression model the simplest possible one. That being the case, the solution we derived before β = ( T ) T Y still holds. The design matrix is so =., β = ( T ) }{{}} T {{ Y } /n = Y i = Y. (b) So H = ( T ) T = (. (n) ) n n n. =... n..... n n n h ii = n for all i =,..., n.

(c) Ŷ = HY n n i= Y i =. = n n i= Y i Y. Y (d) t i = Y i Ŷi( i) s i Y i n j n j i Y i = Var(Y i ( Ŷi( i)) n ) Y i n Y j Y i j= = Var(Y i ) + Var(Ŷi( i)) = Y ( ) i ny n Yi σ 2 + σ2 n = Y ( ) i ny n Yi = nσ 2 n n n (Y i Y ) nσ 2 n (e) lim t i = Y i = lim Y i = n n (Y i Y ) nσ 2 n n n (lim Y i Y i Y ) nσ 2 n 2

Problem 2 [20 points] H = HH h h 2 h n h h 2 h n h h 2 h n h 2 h 22 h 2n..... = h 2 h 22 h 2n h 2 h 22 h 2n.......... h n h nn h n h nn h n h nn So for each diagonal element of H we have, h ii = h 2 ii + h 2 ij. j i We have expressed h ii as a sum of squares, so h ii 0. Furthermore, so h ii h 2 ii, h ii. 3

Problem 3 [20 points] Comments omitted. One of the points in data set four produces NaN s because it has leverage. See equation (2) of Lecture Notes 20. attach(anscombe) names(anscombe) ## [] "x" "x2" "x3" "x4" "y" "y2" "y3" "y4" Y 4 5 6 7 8 9 0 Residuals 2 0 4 6 8 0 2 4 4 6 8 0 2 4 2 0 4 6 8 0 2 4 4 6 8 0 2 4 2 4 6 8 0 Figure : Data Set 4

Y 3 4 5 6 7 8 9 Residuals 2.0.5.0 0.5 0.0 0.5.0 4 6 8 0 2 4 4 6 8 0 2 4 2.0.5.0 0.5 0.0 0.5.0 4 6 8 0 2 4 4 6 8 0 2 4 2 4 6 8 0 Figure 2: Data Set 2 5

Y 6 8 0 2 Residuals 0 2 3 4 6 8 0 2 4 4 6 8 0 2 4 0 200 400 600 800 000 200 4 6 8 0 2 4 4 6 8 0 2 4 2 4 6 8 0 Figure 3: Data Set 3 6

Y 6 8 0 2 Residuals.5 0.5 0.0 0.5.0.5 8 0 2 4 6 8 8 0 2 4 6 8.5.0 0.5 0.0 0.5.0.5 8 0 2 4 6 8 8 0 2 4 6 8 2 4 6 8 0 Figure 4: Data Set 4 7

Problem 4 [25 points] Discussions omitted (i) 20 40 60 80 00 40 80 220 y 0.0 0.2 0.4 0.6 0.8.0 20 40 60 80 00 age tri 50 250 350 450 40 80 220 chol 0.0 0.2 0.4 0.6 0.8.0 50 250 350 450 (ii) model <- lm(y ~ age + tri + chol, data = data) 8

(iii) 0 2 3 4 5 20 40 60 80 00 20 40 60 80 00 Age 5 0 5 20 25 Age 0 2 3 4 5 50 200 250 300 350 400 450 50 200 250 300 350 400 450 Triglycerides 5 0 5 20 25 Triglycerides 0 2 3 4 5 40 60 80 200 220 240 40 60 80 200 220 240 Cholesterol 5 0 5 20 25 Cholesterol 9

(iv) model <- lm(y ~ age + tri + chol + I(age^2) + I(tri^2) + I(chol^2), data = data) 0 0 20 30 40 50 20 40 60 80 00 20 40 60 80 00 Age 5 0 5 20 25 Age 0 0 20 30 40 50 50 200 250 300 350 400 450 50 200 250 300 350 400 450 Triglycerides 5 0 5 20 25 Triglycerides 0 0 20 30 40 50 40 60 80 200 220 240 40 60 80 200 220 240 Cholesterol 5 0 5 20 25 Cholesterol 0

which(rstudent(model) > 30) ## 25 ## 25 (v) data2 <- data[-25,] model <- lm(y ~ age + tri + chol + I(age^2) + I(tri^2) + I(chol^2), data = data2) 3 2 0 2 20 30 40 50 60 70 80 20 30 40 50 60 70 80 Age 5 0 5 20 Age 3 2 0 2 50 200 250 300 350 400 450 50 200 250 300 350 400 450 Triglycerides 5 0 5 20 Triglycerides

3 2 0 2 40 60 80 200 220 240 40 60 80 200 220 240 Cholesterol 5 0 5 20 Cholesterol library(knitr) kable(summary(model)$coefficients) Estimate Std. Error t value Pr(> t ) (Intercept) -0.095302 0.00005-862.43588 0.0000000 age 0.000047 0.000005 28.0644703 0.0000000 tri 0.000036 0.0000003 385.7563999 0.0000000 chol 0.000242 0.000002 03.582206 0.0000000 I(ageˆ2) 0.000032 0.0000000 6932.5827948 0.0000000 I(triˆ2) 0.0000000 0.0000000 -.4373 0.288524 I(cholˆ2) 0.0000000 0.0000000-0.272432 0.78857 kable(confint(model, parm = 2:6, level = - 0.05 / 6)) 0.47 % 99.583 % age 0.0000373 0.0000462 tri 0.000028 0.000044 chol 0.000206 0.000278 I(ageˆ2) 0.000032 0.000033 I(triˆ2) 0.0000000 0.0000000 2

Problem 5 [0 points] d = read.table("http://stat.cmu.edu/~larry/secretdata.txt") y = d[,] x = d[,2] x2 = d[,3] x3 = d[,4] x4 = d[,5] (a) 0.5 0.0 0.5 0.5 0.0 0.5 2.0 0.0 0.5 3 y 0.0 0.5.0 x 0.5 0.0 0.5 0.5 x2 0.5.5 x3 3 2 0 2 0.5 0.0 0.5 Figure 5: Secret Data (b) model5 <- lm(y ~ x + x2 + x3 + x4) Summarize the fitted model. 3.5 0.5.5 0.5 x4 0.5.5

4

5

Figure 6: The above residual plot is a spoof of this scene from The Simpsons Appendix quartz(height = 2, width = 6) par(mar = c(4,4.5,2,2) + 0.) par(oma = c(.5,,2,) + 0.) par(mfrow=c(2,2)) par(bg = "azure") out <- lm(y ~ x+x2+x3+x4) plot(x, residuals(out), col = NA, pch = 9, axes = FALSE, ylab = "Residuals", font.lab = 3, xlab = "", xlim = range(x4), cex.lab=2) axis(side =, at = seq(-2.5,2.5,0.25), labels = as.character(seq(-2.5,2.5,0.25)), font = 5, cex.axis =.25) axis(side = 2, at = seq(-3,2.5,0.5), labels = as.character(seq(-3,2.5,0.5)), font = 5, cex.axis =.25) abline(h = seq(-3,2.5,0.5), col = "gray75", lty = 2) abline(v = seq(-2.5,2.5,0.25), col = "gray80", lty = 2) 6

abline(0,0, lty = 2, col = "gray45") points(x, residuals(out), col = addtrans("orange",20), pch = 9) points(x, residuals(out), col = "orange") panel.smooth(x, residuals(out), col = "orange",cex =, lwd =.2, col.smooth = "seagreen", span = 2/3, iter = 3) plot(x2, residuals(out), col = NA, pch = 9, axes = FALSE, ylab = "Residuals", font.lab = 3, xlab = "", xlim = range(x4), cex.lab=2) axis(side =, at = seq(-2.5,2.5,0.25), labels = as.character(seq(-2.5,2.5,0.25)), font = 5, cex.axis =.25) axis(side = 2, at = seq(-3,2.5,0.5), labels = as.character(seq(-3,2.5,0.5)), font = 5, cex.axis =.25) abline(h = seq(-3,2.5,0.5), col = "gray75", lty = 2) abline(v = seq(-2.5,2.5,0.25), col = "gray80", lty = 2) abline(0,0, lty = 2, col = "gray45") points(x2, residuals(out), col = addtrans("orange",20), pch = 9) points(x2, residuals(out), col = "orange") panel.smooth(x2, residuals(out), col = "orange",cex =, lwd =.2, col.smooth = "seagreen", span = 2/3, iter = 3) plot(x3, residuals(out), col = NA, pch = 9, axes = FALSE, ylab = "Residuals", font.lab = 3, xlab = "", xlim = range(x4), cex.lab=2) axis(side =, at = seq(-2.5,2.5,0.25), labels = as.character(seq(-2.5,2.5,0.25)), font = 5, cex.axis =.25) axis(side = 2, at = seq(-3,2.5,0.5), labels = as.character(seq(-3,2.5,0.5)), font = 5, cex.axis =.25) abline(h = seq(-3,2.5,0.5), col = "gray75", lty = 2) abline(v = seq(-2.5,2.5,0.25), col = "gray80", lty = 2) abline(0,0, lty = 2, col = "gray45") points(x3, residuals(out), col = addtrans("orange",20), pch = 9) points(x3, residuals(out), col = "orange") panel.smooth(x3, residuals(out), col = "orange",cex =, lwd =.2, col.smooth = "seagreen", span = 2/3, iter = 3) plot(x4, residuals(out), col = NA, pch = 9, axes = FALSE, ylab = "Residuals", font.lab = 3, xlab = "", xlim = range(x4), cex.lab=2) axis(side =, at = seq(-2.5,2.5,0.25), labels = as.character(seq(-2.5,2.5,0.25)), font = 5, cex.axis =.25) axis(side = 2, at = seq(-3,2.5,0.5), labels = as.character(seq(-3,2.5,0.5)), font = 5, cex.axis =.25) abline(h = seq(-3,2.5,0.5), col = "gray75", lty = 2) abline(v = seq(-2.5,2.5,0.25), col = "gray80", lty = 2) abline(0,0, lty = 2, col = "gray45") points(x4, residuals(out), col = addtrans("orange",20), pch = 9) points(x4, residuals(out), col = "orange") panel.smooth(x4, residuals(out), col = "orange",cex =, lwd =.2, col.smooth = "seagreen", span = 2/3, iter = 3) mtext("linear Regression Residuals vs. Predictors", side = 3, line = -.2, outer = TRUE, font = 3, cex = 2) quartz.save(file = "residuals.png", type = "png") graphics.off() 7

quartz(height = 2, width = 6) par(mar = c(7,4.5,2,2) + 0.) par(bg = "azure") par(mfrow=c(,)) plot(out, which =, col = NA, pch = 9, axes = FALSE,add.smooth = FALSE, caption = "", font.lab = 3, sub.caption = "", cex.lab = 2, labels.id = NA) axis(side =, at = seq(-3,2.5,0.25), labels = as.character(seq(-3,2.5,0.25)), font = 5, cex.axis =.5) axis(side = 2, at = seq(-3.5,2.5,0.5), labels = as.character(seq(-3.5,2.5,0.5)), font = 5, cex.axis =.5) abline(h = seq(-3,2.5,0.5), col = "gray75", lty = 2) abline(v = seq(-2.5,2.5,0.25), col = "gray80", lty = 2) abline(0,0, lty = 2, col = "gray45") points(fitted(out), residuals(out), col = addtrans("orange",20), pch = 9, cex = 0.8) points(fitted(out), residuals(out), col = "orange", cex = 0.8) quartz.save(file = "homer.png", type = "png") 8