Bayesian Data Analysis, Midterm I


Bugra Gedik (bgedik@cc.gatech.edu), October 3, 4

Q1) I used a Gibbs sampler to solve this problem; 5, iterations with a burn-in of 1, were used. The resulting histograms represent the posterior distributions of µ_1, ..., µ_k, σ²_1, ..., σ²_k, and σ², ψ, τ². Posterior sample means and variances of the parameters are also indicated on the histograms. The corresponding Matlab code is included at the end of this answer.

[Figure: posterior histograms of µ_1 (mean 6.6663, variance 3.6814), µ_2 (mean .575, variance 3.7164), and µ_3 (mean .1171, variance .149).]

[Figure: posterior histograms of σ²_1 (mean 3.431, variance 389.594), σ²_2 (mean 47.91, variance 614.33), and σ²_3 (mean 36.165, variance 6.5651).]

[Figure: posterior histograms of σ² (mean .176, variance 19.73), ψ (mean .695, variance 8.5), and τ² (mean .384, variance 1633.7498).]

Q1 Matlab Code

% data related definitions
data = { [5.8 19.8 8.6 9.4.3 33.8 33.8 7.8 9.6];...
         [16.5 3.5 13.5 34.6 16.9 18.8 6.1 18.4 17. 11.6.];...
         [4. 9.1 16. 4.8 7. 1.9 11.8 3. 17.7 3.9 4.6 4. 7. 3.7] };
k = length(data);
n   = zeros(1, k); % data row lengths
dms = zeros(1, k); % data row means
dvs = zeros(1, k); % data row variances
for i=1:k,
  n(i) = length(data{i}); dms(i) = mean(data{i}); dvs(i) = var(data{i});
end

% hyper parameters
a = 1; c = 1; d = 1; f = 1; g = .1; psi0 = 1; xi = .1;

% initialize parameters
% (rand_gamma and rand_nor are the author's helper samplers, not shown here)
tausq  = 1/rand_gamma(c/2, d/2, 1, 1);
psi    = rand_nor(psi0, tausq/xi, 1, 1);
mus    = rand_nor(psi, tausq, 1, k);
sigsq  = rand_gamma(f/2, g/2, 1, 1);
sigsqs = 1./rand_gamma(a/2, (a*sigsq)/2, 1, k);

% Gibbs parameters
itrcnt = 5; burnin = 1; effcnt = itrcnt - burnin;

% collected parameter values
thetas = zeros(effcnt, 2*k+3);

for iter=1:itrcnt, % perform Gibbs

  %tausq step%
  ga = (c + k + 1) / 2;
  gb = ( d + sum((mus-psi).^2) + xi*(psi-psi0)^2 ) / 2;
  tausq = 1 / rand_gamma(ga, gb, 1, 1);

  %psi step%
  nm = ( sum(mus) + xi*psi0 ) / (k + xi);
  nv = tausq / (k + xi);
  psi = rand_nor(nm, sqrt(nv), 1, 1);

  %mus steps%
  for i=1:k,
    nv = 1 / ( n(i)/sigsqs(i) + 1/tausq );
    nm = ( (n(i)*dms(i))/sigsqs(i) + psi/tausq ) * nv;
    mus(i) = rand_nor(nm, sqrt(nv), 1, 1);
  end

  %sigsq step%
  ga = (f + k*a) / 2;
  gb = ( g + a*sum(1./sigsqs) ) / 2;
  sigsq = rand_gamma(ga, gb, 1, 1);

  %sigsqs steps%
  for i=1:k,
    ga = (a + n(i)) / 2;
    gb = ( a*sigsq + (n(i)-1)*dvs(i) + n(i)*(dms(i)-mus(i))^2 ) / 2;
    sigsqs(i) = 1 / rand_gamma(ga, gb, 1, 1);
  end

  %collect%
  eiter = iter - burnin;
  if (eiter > 0)
    thetas(eiter, :) = [mus, sigsqs, sigsq, psi, tausq];
  end
end

% histograms of the mu_i
figure; B = 75;
for i=1:k,
  subplot(k, 1, i);
  muis = thetas(:,i);
  hist(muis, B);
  title(['Posterior of \mu_{' num2str(i) '}, mean is ' num2str(mean(muis)) ...
         ', variance is ' num2str(var(muis))]);
  xlabel(['\mu_{' num2str(i) '}']); ylabel('');
end

% histograms of the sigma^2_i
figure;
for i=1:k,
  subplot(k, 1, i);
  sigsqis = thetas(:,k+i);
  hist(sigsqis, B);
  title(['Posterior of \sigma^{2}_{' num2str(i) '}, mean is ' num2str(mean(sigsqis)) ...
         ', variance is ' num2str(var(sigsqis))]);
  xlabel(['\sigma^{2}_{' num2str(i) '}']); ylabel('');
end

% histograms of sigma^2, psi, tau^2
figure;
subplot(3, 1, 1);
sigsqs = thetas(:,2*k+1); hist(sigsqs, B);
title(['Posterior of \sigma^{2}, mean is ' num2str(mean(sigsqs)) ...
       ', variance is ' num2str(var(sigsqs))]);
xlabel('\sigma^{2}'); ylabel('');
subplot(3, 1, 2);
psis = thetas(:,2*k+2); hist(psis, B);
title(['Posterior of \psi, mean is ' num2str(mean(psis)) ...
       ', variance is ' num2str(var(psis))]);
xlabel('\psi'); ylabel('');
subplot(3, 1, 3);
tausqs = thetas(:,2*k+3); hist(tausqs, B);
title(['Posterior of \tau^{2}, mean is ' num2str(mean(tausqs)) ...
       ', variance is ' num2str(var(tausqs))]);
xlabel('\tau^{2}'); ylabel('');
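For reference, the full conditionals that the sampler above draws from can be read off the code; the summary below is a reconstruction (assuming rand_gamma(α, β, ·, ·) returns Gamma draws with shape α and rate β), not part of the original hand-in. The model implied by the code is y_ij | µ_i, σ²_i ~ N(µ_i, σ²_i), µ_i | ψ, τ² ~ N(ψ, τ²), σ²_i | σ² ~ Inv-Gamma(a/2, aσ²/2), ψ | τ² ~ N(ψ₀, τ²/ξ), τ² ~ Inv-Gamma(c/2, d/2), and σ² ~ Gamma(f/2, g/2), giving the Gibbs steps

\[
\begin{aligned}
\tau^2 \mid \cdot &\sim \mathrm{Inv\mbox{-}Gamma}\!\left(\tfrac{c+k+1}{2},\ \tfrac{d+\sum_i(\mu_i-\psi)^2+\xi(\psi-\psi_0)^2}{2}\right),\\
\psi \mid \cdot &\sim N\!\left(\tfrac{\sum_i\mu_i+\xi\psi_0}{k+\xi},\ \tfrac{\tau^2}{k+\xi}\right),\\
\mu_i \mid \cdot &\sim N\!\left(v_i\big(\tfrac{n_i\bar y_i}{\sigma_i^2}+\tfrac{\psi}{\tau^2}\big),\ v_i\right),\qquad v_i=\big(\tfrac{n_i}{\sigma_i^2}+\tfrac{1}{\tau^2}\big)^{-1},\\
\sigma^2 \mid \cdot &\sim \mathrm{Gamma}\!\left(\tfrac{f+ka}{2},\ \tfrac{g+a\sum_i 1/\sigma_i^2}{2}\right),\\
\sigma_i^2 \mid \cdot &\sim \mathrm{Inv\mbox{-}Gamma}\!\left(\tfrac{a+n_i}{2},\ \tfrac{a\sigma^2+(n_i-1)s_i^2+n_i(\bar y_i-\mu_i)^2}{2}\right),
\end{aligned}
\]

where n_i, ȳ_i, and s_i² are the size, mean, and variance of data row i (the vectors n, dms, and dvs in the code).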

Q2) a) I first find the joint posterior distribution of µ, σ up to a normalizing constant. The likelihood is the Gumbel density

\[
p(y_i \mid \mu, \sigma) = \frac{1}{\sigma}\, e^{-\frac{y_i-\mu}{\sigma}}\, e^{-e^{-\frac{y_i-\mu}{\sigma}}}.
\]

µ and σ are independent a priori, thus p(µ, σ) = p(µ) p(σ), with p(µ) = N(µ; 0, 1) and p(σ) = LN(σ; 0, 1). The likelihood of the sample is

\[
p(y_1,\ldots,y_n \mid \mu, \sigma)
= \prod_{i=1}^{n} \frac{1}{\sigma}\, e^{-\frac{y_i-\mu}{\sigma}}\, e^{-e^{-\frac{y_i-\mu}{\sigma}}}
= \frac{1}{\sigma^{n}}\, e^{-\frac{n(\bar y-\mu)}{\sigma}}\, e^{-\sum_{i=1}^{n} e^{-\frac{y_i-\mu}{\sigma}}},
\]

and since p(µ, σ | y_1, ..., y_n) ∝ p(y_1, ..., y_n | µ, σ) p(µ, σ),

\[
p(\mu, \sigma \mid y_1,\ldots,y_n)
\;\propto\; \frac{1}{\sigma^{n+1}}\,
e^{-\frac{n(\bar y-\mu)}{\sigma}}\,
e^{-\sum_{i=1}^{n} e^{-\frac{y_i-\mu}{\sigma}}}\,
e^{-\frac{\mu^2 + (\ln\sigma)^2}{2}}.
\]

b) I developed a Metropolis algorithm using a bivariate normal jumping distribution N(·, Σ), with

\[
\Sigma = c \begin{pmatrix} 1 & .5 \\ .5 & 1 \end{pmatrix}.
\]

I also experimented with \(\Sigma = c \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\), which gave the same result. c is set to m · 2.4/√2, where m is taken as 1, in order to achieve an acceptance percentage of around 4% in the Metropolis jumps, which is suggested as a plausible value in the course book.

c) In order to calculate p(y* ≥ 41), I used the µ^(i) and σ^(i) values generated by the i-th step of the Metropolis algorithm to find p(y* ≥ 41 | µ^(i), σ^(i)) using the CDF of the Gumbel distribution, i.e. 1 − F(41 | µ^(i), σ^(i)), and then took the average over the steps. Formally,

\[
p(y^* \ge 41) = \frac{1}{b-a} \sum_{i=a+1}^{b} \Big( 1 - F\big(41 \mid \mu^{(i)}, \sigma^{(i)}\big) \Big),
\]

where a is the burn-in value and b is the total number of iterations. The resulting probability is .75 × 10^−7, when a = 1, and b = 5,. The resulting posterior distributions of µ and σ are given below; the Matlab code is also included.

[Figure: posterior histograms of µ and σ from the Metropolis draws.]
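Evaluating this unnormalized posterior directly can underflow for poor (µ, σ) values, which is why the script below keeps re-drawing starting points until it finds one with non-zero probability. As an illustration only, here is a minimal sketch of a log-scale version of the same density; the function name log_post and its use are hypothetical and not part of the original solution:

% Hypothetical helper (not in the original hand-in): log of the unnormalized
% posterior p(mu, sigma | y) derived in part (a).
function lp = log_post(mu, sig, y)
  n = length(y);
  z = (y - mu) ./ sig;              % standardized residuals
  lp = -(n+1)*log(sig) ...          % sigma^-(n+1) from the likelihood and LN(0,1) prior
       - sum(z) - sum(exp(-z)) ...  % Gumbel log-likelihood kernel
       - (mu^2 + log(sig)^2)/2;     % N(0,1) prior on mu, LN(0,1) prior on sigma
end

With such a helper, the Metropolis ratio r = pn/p in the script below would become exp(log_post(nmu, nsig, y) - log_post(mu, sig, y)).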

Q2 Matlab Code

% data related parameters
y = [154 49.6 46.7 58.3 7.5 9 7.1 15.7 37.4 4.8 34.7 58.9 ...
     7. 3 71.6 1 33.7 49.9 56.1 14.3 8.6 54.8 74.1 6 ...
     5.9 38.6 53.4 13.5 5.7 4.8 84.3 38.8 7.4 67 118.7 3. ...
     55 67.9 87.3 89 98.7 47.1 71.6 83.6 44.3 41. 35.9 44.3];
n = length(y); ym = mean(y);

% metropolis parameters
itrcnt = 5; burnin = 1; effcnt = itrcnt - burnin;
c = 1*2.4/sqrt(2);
s = c * [1 .5; .5 1];

% collected parameter values
thetas = zeros(effcnt, 2);

% find initial values with non-zero probability
p = 0;
while (p==0),
  theta = [rand_nor(0,1,1,1) 1*rand];
  %theta = [rand_nor(0,1,1,1) exp(1*rand_nor(0,1,1,1))];
  mu = theta(1); sig = theta(2);
  p = (1/sig^(n+1)) * exp(- (n*(ym-mu))/sig - (mu^2+log(sig)^2)/2) ...
      * exp(- sum( exp( -(y-mu)./sig ) ) );
end

mcnt = 0;
for iter=1:itrcnt,
  % propose
  ntheta = rand_mvn(1, theta, s);
  % find ratio
  ntheta(2) = abs(ntheta(2));   % reflect sigma to keep it positive
  nmu = ntheta(1); nsig = ntheta(2);
  pn = (1/nsig^(n+1)) * exp(- (n*(ym-nmu))/nsig - (nmu^2+log(nsig)^2)/2) ...
       * exp(- sum( exp( -(y-nmu)./nsig ) ) );
  r = pn / p;
  % decide move
  if (rand < r)
    mcnt = mcnt + 1;
    theta = ntheta; p = pn;
    mu = theta(1); sig = theta(2);
  end
  % collect
  eiter = iter - burnin;
  if (eiter > 0)
    thetas(eiter, :) = theta;
  end
end
fprintf(1, 'Acceptance rate: %f\n', mcnt/itrcnt);

B = 1;
subplot(1,2,1); hist(thetas(:,1), B);
title('Posterior of \mu'); xlabel('\mu'); ylabel('');
subplot(1,2,2); hist(thetas(:,2), B);
title('Posterior of \sigma'); xlabel('\sigma'); ylabel('');

% find probability that y* > 41
target = 41; %% >=41
probs = 1 - exp(-exp(-(target-thetas(:,1))./thetas(:,2)));
eprob = mean(probs);
fprintf(1, 'probability of y >= %f is %d\n', target, eprob);
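The Q1 and Q2 scripts call three helper samplers (rand_gamma, rand_nor, rand_mvn) that are not included in the transcript. The following is a minimal sketch of compatible implementations, assuming the parameterizations implied by how they are called above: rand_gamma(a, b, r, c) returns an r-by-c array of Gamma draws with shape a and rate b, rand_nor(m, s, r, c) returns normals with mean m and standard deviation s, and rand_mvn(n, m, S) returns n rows drawn from N(m, S). These are reconstructions, not the author's code; gamrnd requires the Statistics Toolbox.

function x = rand_gamma(a, b, r, c)
  % Gamma(shape a, rate b) draws; gamrnd takes a scale parameter, hence 1./b.
  x = gamrnd(a, 1./b, r, c);
end

function x = rand_nor(m, s, r, c)
  % Normal draws with mean m and standard deviation s.
  x = m + s .* randn(r, c);
end

function x = rand_mvn(n, m, S)
  % n draws (rows) from a multivariate normal N(m, S) via a Cholesky factor.
  x = repmat(m(:)', n, 1) + randn(n, length(m)) * chol(S);
end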

Q3) The posterior statistics are as follows (Matlab plots are included below):

node    mean     sd       MC error    2.5%      median    97.5%
λ       8.631    .6675    .735        8.        9.        1.
p       .48      .1       1.89E-4     5.19E-4   .145      .757
p 7     .6138    .3417    3.476E-4    .156      .555      .144
p 9     .96      .4171    4.373E-4    .691      .847      .1874
p 1     .136     .4318    4.75E-4     .358      .988      .13
p 13    .8195    .3896    4.37E-4     .316      .7633     .1738
p 17    .43      .5       .79E-4      5.54E-4   .1433     .7447

[Figure: posterior histograms of λ, p, p 7, and p 9.]

[Figure: posterior histograms of p 1, p 13, and p 17.]
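For reference, the columns of the table above are the posterior mean, standard deviation, Monte Carlo error, and the 2.5%, 50% (median), and 97.5% quantiles of the draws. A minimal sketch of computing such a summary from a vector of posterior draws is given below; the function name is hypothetical, and the MC error shown is the naive std(x)/sqrt(N) (batch-means estimates are also common):

% Hypothetical helper: posterior summary in the same column order as the table
% above (mean, sd, MC error, 2.5%, median, 97.5%).
function s = post_summary(x)
  x = sort(x(:));
  N = length(x);
  q = @(p) x(min(N, max(1, round(p*N))));   % simple empirical quantile
  s = [mean(x), std(x), std(x)/sqrt(N), q(0.025), q(0.5), q(0.975)];
end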