Bayesian Data Analysis, Midterm I
Bugra Gedik
October 3, 4

Q1) I used a Gibbs sampler to solve this problem, running 5, iterations with a burn-in value of 1,. The resulting histograms, representing the posterior distributions of $\mu_1,\ldots,\mu_k$, $\sigma^2_1,\ldots,\sigma^2_k$, and $\sigma^2$, $\psi$, $\tau^2$, are given below. Posterior sample means and variances of the parameters are also indicated on the histograms. The corresponding Matlab code is included at the end of this answer.

[Figure: three histograms titled "Posterior of $\mu_1$", "Posterior of $\mu_2$, mean is .575" and "Posterior of $\mu_3$, mean is .1171"; horizontal axes $\mu_1$, $\mu_2$, $\mu_3$.]
[Figure: three histograms titled "Posterior of $\sigma^2_1$, mean is 3.431", "Posterior of $\sigma^2_2$, mean is 47.91" and "Posterior of $\sigma^2_3$"; horizontal axes $\sigma^2_1$, $\sigma^2_2$, $\sigma^2_3$.]
[Figure: three histograms titled "Posterior of $\sigma^2$, mean is .176", "Posterior of $\psi$, mean is .695" and "Posterior of $\tau^2$, mean is .384"; horizontal axes $\sigma^2$, $\psi$, $\tau^2$.]

Q1 Matlab Code

% data related definitions
data = { [ ];...
         [ ];...
         [ ] };
k = length(data);
n = zeros(1, k);   % data row lengths
dms = zeros(1, k); % data row means
dvs = zeros(1, k); % data row variances
for i=1:k,
  n(i) = length(data{i});
  dms(i) = mean(data{i});
  dvs(i) = var(data{i});
end

% hyper parameters
a = 1; c = 1; d = 1; f = 1; g = .1;
psi0 = 1; xi0 = .1;

% initialize parameters
% (rand_nor takes a standard deviation, as in the Gibbs steps below)
tausq = 1/rand_gamma(c/2,d/2,1,1);
psi = rand_nor(psi0,sqrt(tausq/xi0),1,1);
mus = rand_nor(psi,sqrt(tausq),1,k);
sigsq = rand_gamma(f/2,g/2,1,1);
sigsqs = 1./rand_gamma(a/2,(a*sigsq)/2,1,k);

% gibbs parameters
itrcnt = 5; burnin = 1;
effcnt = itrcnt - burnin;

% collected parameter values
thetas = zeros(effcnt, 2*k+3);

for iter=1:itrcnt, % perform gibbs
  %tausq step%
  ga = (c + k + 1) / 2;
  gb = ( d + sum((mus-psi).^2) + xi0*(psi-psi0)^2 ) / 2;
  tausq = 1 / rand_gamma(ga,gb,1,1);
  %psi step%
  nm = ( sum(mus) + xi0*psi0 ) / (k + xi0);
  nv = tausq / (k + xi0);
  psi = rand_nor(nm,sqrt(nv),1,1);
  %mus steps%
  for i=1:k,
    nv = 1 / ( n(i)/sigsqs(i) + 1/tausq );
    nm = ( (n(i)*dms(i))/sigsqs(i) + psi/tausq ) * nv;
    mus(i) = rand_nor(nm,sqrt(nv),1,1);
  end
  %sigsq step%
  ga = (f+k*a) / 2;
  gb = ( g + a*sum(1./sigsqs) ) / 2;
  sigsq = rand_gamma(ga,gb,1,1);
  %sigsqs steps%
  for i=1:k,
    ga = (a+n(i))/2;
    gb = ( a*sigsq + (n(i)-1)*dvs(i) + n(i)*(dms(i)-mus(i))^2 ) / 2;
    sigsqs(i) = 1 / rand_gamma(ga,gb,1,1);
  end
  %collect%
  eiter = iter - burnin;
  if(eiter > 0)
    thetas(eiter, :) = [mus, sigsqs, sigsq, psi, tausq];
  end
end

% histograms of mu_1..mu_k
figure; B = 75;
for i=1:k,
  subplot(k, 1, i);
  muis = thetas(:,i);
  [cnt, pos] = hist(muis, B);
  bar(pos, cnt); % draw the binned counts
  title(['Posterior of \mu_{' num2str(i) '}',...
         ', mean is ' num2str(mean(muis)),...
         ', variance is ' num2str(var(muis))]);
  xlabel(['\mu_{' num2str(i) '}']); ylabel('');
end

% histograms of sigma^2_1..sigma^2_k
figure;
for i=1:k,
  subplot(k, 1, i);
  sigsqis = thetas(:,k+i);
  [cnt, pos] = hist(sigsqis, B);
  bar(pos, cnt);
  title(['Posterior of \sigma^{2}_{' num2str(i) '}',...
         ', mean is ' num2str(mean(sigsqis)),...
         ', variance is ' num2str(var(sigsqis))]);
  xlabel(['\sigma^{2}_{' num2str(i) '}']); ylabel('');
end

% histograms of sigma^2, psi and tau^2
figure;
subplot(3, 1, 1);
sigsqs = thetas(:,2*k+1);
[cnt, pos] = hist(sigsqs, B);
bar(pos, cnt);
title(['Posterior of \sigma^{2}',...
       ', mean is ' num2str(mean(sigsqs)),...
       ', variance is ' num2str(var(sigsqs))]);
xlabel('\sigma^{2}'); ylabel('');
subplot(3, 1, 2);
psis = thetas(:,2*k+2);
[cnt, pos] = hist(psis, B);
bar(pos, cnt);
title(['Posterior of \psi',...
       ', mean is ' num2str(mean(psis)),...
       ', variance is ' num2str(var(psis))]);
xlabel('\psi'); ylabel('');
subplot(3, 1, 3);
tausqs = thetas(:,2*k+3);
[cnt, pos] = hist(tausqs, B);
bar(pos, cnt);
title(['Posterior of \tau^{2}',...
       ', mean is ' num2str(mean(tausqs)),...
       ', variance is ' num2str(var(tausqs))]);
xlabel('\tau^{2}'); ylabel('');
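The listing above calls `rand_nor` and `rand_gamma`, which are not defined in this answer. The following is a minimal sketch of what such helpers could look like, not the original helpers. It assumes that `rand_nor(m, s, r, c)` returns an r-by-c matrix of normal draws with mean m and standard deviation s, and that `rand_gamma(a, b, r, c)` returns gamma draws with shape a and rate b (the reading under which `1/rand_gamma(ga,gb,1,1)` is an inverse-gamma draw). The gamma sampler uses the Marsaglia-Tsang method, chosen here only so that nothing beyond base Matlab is required; `gamma_shape` is a hypothetical name.

```matlab
% --- rand_nor.m (hypothetical helper; signature inferred from the calls) ---
function x = rand_nor(m, s, r, c)
% r-by-c draws from N(m, s^2); s is ASSUMED to be a standard deviation.
x = m + s .* randn(r, c);
end

% --- rand_gamma.m (hypothetical helper; shape/rate form is an assumption) ---
function x = rand_gamma(a, b, r, c)
% r-by-c draws from Gamma(shape a, rate b), so that E[x] = a ./ b.
% a and b may be scalars or r-by-c arrays.
a = a .* ones(r, c);
b = b .* ones(r, c);
x = zeros(r, c);
for j = 1:numel(x)
    x(j) = gamma_shape(a(j)) / b(j);
end
end

function g = gamma_shape(a)
% One Gamma(a, 1) draw by the Marsaglia-Tsang method.
if a < 1
    % Boost: Gamma(a) = Gamma(a + 1) * U^(1/a) for U ~ Uniform(0, 1).
    g = gamma_shape(a + 1) * rand^(1 / a);
    return;
end
d = a - 1/3;
c = 1 / sqrt(9 * d);
while true
    z = randn;
    v = (1 + c * z)^3;
    % Accept when v is positive and the log-uniform falls under the bound.
    if v > 0 && log(rand) < 0.5 * z^2 + d - d * v + d * log(v)
        g = d * v;
        return;
    end
end
end
```

Each function would go in its own file named after it, with `gamma_shape` kept as a local function inside `rand_gamma.m`. If the Statistics Toolbox is available, `gamrnd(a, 1./b, r, c)` draws from the same shape/rate distribution, since `gamrnd` takes a scale rather than a rate.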
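Q2)
a) I first find the joint posterior distribution of $\mu, \sigma$ up to a normalizing constant:

$$p(y_i \mid \mu, \sigma) = \frac{1}{\sigma}\, e^{-\frac{y_i-\mu}{\sigma}}\, e^{-e^{-\frac{y_i-\mu}{\sigma}}}$$

$\mu$ and $\sigma$ are independent, thus $p(\mu, \sigma) = p(\mu)\,p(\sigma)$, with $p(\mu) = N(\mu \mid 0, 1)$ and $p(\sigma) = LN(\sigma \mid 0, 1)$.

$$p(y_1,\ldots,y_n \mid \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sigma}\, e^{-\frac{y_i-\mu}{\sigma}}\, e^{-e^{-\frac{y_i-\mu}{\sigma}}}$$

$$p(y_1,\ldots,y_n \mid \mu, \sigma) = \frac{1}{\sigma^{n}}\, e^{-\frac{n(\bar{y}-\mu)}{\sigma}}\, e^{-\sum_{i=1}^{n} e^{-\frac{y_i-\mu}{\sigma}}}$$

where $\bar{y}$ is the sample mean of $y_1,\ldots,y_n$.

$$p(\mu, \sigma \mid y_1,\ldots,y_n) \propto p(y_1,\ldots,y_n \mid \mu, \sigma)\, p(\mu, \sigma)$$

$$p(\mu, \sigma \mid y_1,\ldots,y_n) \propto \frac{1}{\sigma^{n}}\, e^{-\frac{n(\bar{y}-\mu)}{\sigma}}\, e^{-\sum_{i=1}^{n} e^{-\frac{y_i-\mu}{\sigma}}}\; e^{-\frac{\mu^2}{2}}\, \frac{1}{\sigma}\, e^{-\frac{(\ln \sigma)^2}{2}}$$

$$p(\mu, \sigma \mid y_1,\ldots,y_n) \propto \frac{1}{\sigma^{n+1}}\, e^{-\frac{n(\bar{y}-\mu)}{\sigma} - \frac{\mu^2 + (\ln \sigma)^2}{2}}\, e^{-\sum_{i=1}^{n} e^{-\frac{y_i-\mu}{\sigma}}}$$

b) I developed a Metropolis algorithm using a bivariate normal jumping function $N(\cdot, \Sigma)$. $\Sigma$ is set to $c \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$. I also experimented with $\Sigma = c \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, which gave the same result. $c$ is set to $m \cdot 2.4/\sqrt{2}$, where $m$ is taken as 1, in order to achieve an acceptance percentage of around 4% in Metropolis jumps, which is suggested as a plausible value in the course book.

c) In order to calculate $p(y \ge 41)$, I used the $\mu^{(i)}$ and $\sigma^{(i)}$ values generated by the $i$th step of the Metropolis algorithm to find $p(y \ge 41 \mid \mu^{(i)}, \sigma^{(i)})$ using the CDF of the Gumbel distribution, i.e. $1 - F(41 \mid \mu^{(i)}, \sigma^{(i)})$. Then I took the average over the retained steps. Formally,

$$p(y \ge 41) = \frac{1}{b-a} \sum_{i=a+1}^{b} \left( 1 - F(41 \mid \mu^{(i)}, \sigma^{(i)}) \right),$$

where $a$ is the burn-in value and $b$ is the total number of iterations. The resulting probability is , when a = 1, and b = 5,. The resulting posterior distributions of $\mu$ and $\sigma$ are given below. The Matlab code is also included.

[Figure: two histograms titled "Posterior of $\mu$" and "Posterior of $\sigma$"; horizontal axes $\mu$ and $\sigma$.]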
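Two steps used in parts (a) and (c) can be spelled out. The product over observations collapses because the exponents add, and the exceedance probability follows from integrating the Gumbel density. The short derivation below uses only the density stated in part (a); $\bar{y}$ denotes the sample mean and $t$ is a dummy integration variable introduced here.

```latex
\begin{align*}
\prod_{i=1}^{n} e^{-(y_i-\mu)/\sigma}
  &= \exp\!\Big(-\frac{1}{\sigma}\sum_{i=1}^{n}(y_i-\mu)\Big)
   = e^{-n(\bar{y}-\mu)/\sigma},
  \qquad \bar{y}=\frac{1}{n}\sum_{i=1}^{n} y_i, \\
F(y\mid\mu,\sigma)
  &= \int_{-\infty}^{y} \frac{1}{\sigma}\,
     e^{-(t-\mu)/\sigma}\, e^{-e^{-(t-\mu)/\sigma}}\,dt
   = e^{-e^{-(y-\mu)/\sigma}}, \\
\Pr(y^{*}\ge y\mid\mu,\sigma)
  &= 1 - e^{-e^{-(y-\mu)/\sigma}}.
\end{align*}
```

The second line can be checked by differentiating $e^{-e^{-(y-\mu)/\sigma}}$ with respect to $y$, which returns the density in part (a).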
Q2 Matlab Code

% data related parameters
y = [ ];
n = length(y);
ym = mean(y);

% metropolis parameters
itrcnt = 5; burnin = 1;
effcnt = itrcnt - burnin;
c = 1*2.4/sqrt(2);
s = c * [1 .5; .5 1];

% collected parameter values
thetas = zeros(effcnt, 2);

p = 0;
% find initial values with non-zero probability
while (p==0),
  theta = [rand_nor(0, 1,1,1) 1*rand];
  %theta = [rand_nor(0, 1,1,1) exp(1*rand_nor(0,1,1,1))];
  mu = theta(1); sig = theta(2);
  % unnormalized posterior density from part (a)
  p = (1/sig^(n+1)) * exp(- (n*(ym-mu))/sig - (mu^2+log(sig)^2)/2)...
      * exp(- sum( exp( -(y-mu)./sig ) ) );
end

mcnt = 0; % number of accepted jumps
for iter=1:itrcnt,
  % propose
  ntheta = rand_mvn(1, theta, s);
  % find ratio (sigma is reflected at zero to stay positive)
  ntheta(2) = abs(ntheta(2));
  nmu = ntheta(1); nsig = ntheta(2);
  pn = (1/nsig^(n+1)) * exp(- (n*(ym-nmu))/nsig - (nmu^2+log(nsig)^2)/2)...
       * exp(- sum( exp( -(y-nmu)./nsig ) ) );
  r = pn / p;
  % decide move
  if (rand < r)
    mcnt = mcnt + 1;
    theta = ntheta; p = pn;
  end
  mu = theta(1); sig = theta(2);
  % collect
  eiter = iter - burnin;
  if(eiter > 0)
    thetas(eiter, :) = theta;
  end
end
fprintf(1, 'Acceptance rate: %f\n', mcnt/itrcnt);

B = 1;
subplot(1, 2, 1);
hist(thetas(:,1), B);
title(['Posterior of \mu']);
xlabel(['\mu']); ylabel('');
subplot(1, 2, 2);
hist(thetas(:,2), B);
title(['Posterior of \sigma']);
xlabel(['\sigma']); ylabel('');

% find probability that y* > 41
target = 41; %% >=41
probs = 1-exp(-exp(-(target-thetas(:,1))./thetas(:,2)));
eprob = mean(probs);
fprintf(1, ['probability of y>= %f is %d\n'], target, eprob);
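The `while (p==...)` loop in the listing above guards against starting points where the unnormalized density evaluates to zero, which can happen through floating-point underflow. One common alternative, sketched here as an assumption rather than as part of the submitted solution, is to work with the log of the unnormalized posterior and compare `log(rand)` with the log ratio. The sketch assumes the prior terms enter as $(\mu^2 + (\ln\sigma)^2)/2$; the name `gumbel_logpost`, the toy data and the step size are hypothetical.

```matlab
% Log of the unnormalized posterior for (mu, sig), term by term the log of
% the expression used for p in the listing. gumbel_logpost is a
% hypothetical name introduced for this sketch.
gumbel_logpost = @(mu, sig, y) ...
    ( -(numel(y) + 1) * log(sig) ...
      - numel(y) * (mean(y) - mu) / sig ...
      - (mu^2 + log(sig)^2) / 2 ...
      - sum(exp(-(y - mu) ./ sig)) );

y = [1.2 0.7 2.5 1.9 0.3];          % toy data, NOT the exam data
theta = [0 1];                      % current state [mu, sig]
lp = gumbel_logpost(theta(1), theta(2), y);

% One Metropolis step with a symmetric random-walk proposal.
% The step size 0.5 is an arbitrary illustrative choice.
ntheta = theta + 0.5 * randn(1, 2);
ntheta(2) = abs(ntheta(2));         % reflect sigma at zero, as in the listing
lpn = gumbel_logpost(ntheta(1), ntheta(2), y);
if log(rand) < lpn - lp             % equivalent to rand < pn / p
    theta = ntheta;
    lp = lpn;
end
```

Because only differences of log densities are used, a very small density at the starting point no longer stalls the sampler, and the acceptance decision is unchanged wherever `pn / p` is computable.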
Q3) The posterior statistics are as follows. Matlab plots are included below.

[Table: posterior summaries for the nodes $\lambda$ and $p$, with columns node, mean, sd, MC error, 2.5%, median, 97.5%.]

[Figure: histograms titled "Posterior of $\lambda$" and "Posterior of $p$" for the individual $p$ nodes (panels include $p_7$, $p_9$, $p_{13}$, $p_{17}$); horizontal axes $\lambda$ and $p$.]
