Probabilistic & Unsupervised Learning

1 Probabilistic & Unsupervised Learning
Week 3: The EM algorithm
Yee Whye Teh
Gatsby Computational Neuroscience Unit, University College London
Term 1, Autumn 2009

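2 Mixtures of Gaussians

Data: $X = \{x_1, \dots, x_N\}$
Latent process: $s_i \overset{\text{iid}}{\sim} \mathrm{Disc}[\pi]$
Component distributions: $x_i \mid (s_i = m) \sim P_m[\theta_m] = \mathcal{N}(\mu_m, \Sigma_m)$
Marginal distribution:
$$P(x_i) = \sum_{m=1}^{k} \pi_m P_m(x_i; \theta_m)$$
Log-likelihood:
$$\log p(X \mid \{\mu_m\}, \{\Sigma_m\}, \pi) = \sum_{i=1}^{N} \log \sum_{m=1}^{k} \pi_m \, |2\pi\Sigma_m|^{-1/2} \exp\left[-\tfrac{1}{2}(x_i - \mu_m)^{\mathsf T} \Sigma_m^{-1} (x_i - \mu_m)\right]$$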

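3 EM for MoGs

Evaluate responsibilities:
$$r_{im} = \frac{P_m(x_i)\,\pi_m}{\sum_{m'} P_{m'}(x_i)\,\pi_{m'}}$$
Update parameters:
$$\mu_m \leftarrow \frac{\sum_i r_{im} x_i}{\sum_i r_{im}}, \qquad \Sigma_m \leftarrow \frac{\sum_i r_{im} (x_i - \mu_m)(x_i - \mu_m)^{\mathsf T}}{\sum_i r_{im}}, \qquad \pi_m \leftarrow \frac{\sum_i r_{im}}{N}$$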

4 The Expectation Maximisation (EM) algorithm

The EM algorithm finds a (local) maximum of a latent variable model likelihood. It starts from arbitrary values of the parameters, and iterates two steps:

E step: Fill in values of latent variables according to posterior given data.
M step: Maximise likelihood as if latent variables were not hidden.

Useful in models where learning would be easy if hidden variables were, in fact, observed (e.g. MoGs).
Decomposes difficult problems into series of tractable steps.
No learning rate.
Framework lends itself to principled approximations.

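5 Jensen's Inequality

For two points $x_1, x_2 > 0$ and $0 \le \alpha \le 1$:
$$\log(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha \log(x_1) + (1 - \alpha) \log(x_2)$$
[Figure: plot of log(x) marking $x_1$, $\alpha x_1 + (1 - \alpha) x_2$ and $x_2$]

For $\alpha_i \ge 0$, $\sum_i \alpha_i = 1$ and any $\{x_i > 0\}$:
$$\log\left(\sum_i \alpha_i x_i\right) \ge \sum_i \alpha_i \log(x_i)$$
Equality holds if and only if all the $x_i$ with $\alpha_i > 0$ coincide; for distinct $x_i$ this means $\alpha_i = 1$ for some $i$ (and therefore all others are 0).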
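As a quick numerical sanity check of the inequality, the sketch below evaluates the gap between the two sides for one mixed weighting and for a weighting that puts all mass on a single point. The helper name `jensen_gap` and the particular numbers are illustrative assumptions, not part of the lecture.

```python
import math

def jensen_gap(alphas, xs):
    """Return log(sum_i a_i x_i) - sum_i a_i log(x_i); Jensen says this is >= 0."""
    assert all(a >= 0 for a in alphas) and abs(sum(alphas) - 1.0) < 1e-12
    assert all(x > 0 for x in xs)
    lhs = math.log(sum(a * x for a, x in zip(alphas, xs)))
    rhs = sum(a * math.log(x) for a, x in zip(alphas, xs))
    return lhs - rhs

# Illustrative values only.
gap_mixed = jensen_gap([0.3, 0.7], [1.0, 4.0])  # weights spread over distinct points
gap_point = jensen_gap([0.0, 1.0], [1.0, 4.0])  # all weight on one point
```

With distinct points and spread-out weights the gap is strictly positive; it closes when a single weight equals 1.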

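6 The Free Energy for a Latent Variable Model

Observed data $X = \{x_i\}$; Latent variables $Y = \{y_i\}$; Parameters $\theta$.
Goal: Maximize the log likelihood (i.e. ML learning) wrt $\theta$:
$$\ell(\theta) = \log P(X \mid \theta) = \log \int P(Y, X \mid \theta)\, dY,$$
Any distribution, $q(Y)$, over the hidden variables can be used to obtain a lower bound on the log likelihood using Jensen's inequality:
$$\ell(\theta) = \log \int q(Y) \frac{P(Y, X \mid \theta)}{q(Y)}\, dY \ge \int q(Y) \log \frac{P(Y, X \mid \theta)}{q(Y)}\, dY \overset{\text{def}}{=} \mathcal{F}(q, \theta).$$
Now,
$$\int q(Y) \log \frac{P(Y, X \mid \theta)}{q(Y)}\, dY = \int q(Y) \log P(Y, X \mid \theta)\, dY - \int q(Y) \log q(Y)\, dY = \int q(Y) \log P(Y, X \mid \theta)\, dY + H[q],$$
where $H[q]$ is the entropy of $q(Y)$. So:
$$\mathcal{F}(q, \theta) = \langle \log P(Y, X \mid \theta) \rangle_{q(Y)} + H[q]$$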

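11 The E and M steps of EM

The lower bound on the log likelihood is given by:
$$\mathcal{F}(q, \theta) = \langle \log P(Y, X \mid \theta) \rangle_{q(Y)} + H[q],$$
EM alternates between:

E step: optimize $\mathcal{F}(q, \theta)$ wrt the distribution over hidden variables, holding parameters fixed:
$$q^{(k)}(Y) := \operatorname*{argmax}_{q(Y)} \mathcal{F}\big(q(Y), \theta^{(k-1)}\big).$$
M step: maximize $\mathcal{F}(q, \theta)$ wrt parameters, holding the hidden distribution fixed:
$$\theta^{(k)} := \operatorname*{argmax}_{\theta} \mathcal{F}\big(q^{(k)}(Y), \theta\big) = \operatorname*{argmax}_{\theta} \langle \log P(Y, X \mid \theta) \rangle_{q^{(k)}(Y)}$$
The second equality comes from the fact that the entropy of $q(Y)$ does not depend directly on $\theta$.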

14 EM as Coordinate Ascent in F

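15 The E Step

The free energy can be re-written
$$\begin{aligned}
\mathcal{F}(q, \theta) &= \int q(Y) \log \frac{P(Y, X \mid \theta)}{q(Y)}\, dY \\
&= \int q(Y) \log \frac{P(Y \mid X, \theta)\, P(X \mid \theta)}{q(Y)}\, dY \\
&= \int q(Y) \log P(X \mid \theta)\, dY + \int q(Y) \log \frac{P(Y \mid X, \theta)}{q(Y)}\, dY \\
&= \ell(\theta) - \mathrm{KL}[q(Y) \,\|\, P(Y \mid X, \theta)]
\end{aligned}$$
The second term is the Kullback-Leibler divergence.
This means that, for fixed $\theta$, $\mathcal{F}$ is bounded above by $\ell$, and achieves that bound when $\mathrm{KL}[q(Y) \,\|\, P(Y \mid X, \theta)] = 0$.
But $\mathrm{KL}[q \,\|\, p]$ is zero if and only if $q = p$. So, the E step simply sets
$$q^{(k)}(Y) = P(Y \mid X, \theta^{(k-1)})$$
and, after an E step, the free energy equals the log likelihood.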
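The decomposition of the free energy into log likelihood minus a KL term can be checked numerically on a tiny discrete model. The sketch below assumes a hypothetical binary latent variable and one fixed observation; the joint probabilities in `joint` are made-up illustrative numbers, and the function names are not from the lecture.

```python
import numpy as np

# Hypothetical toy model: joint[y] = P(y, x | theta) for the single observed x.
joint = np.array([0.12, 0.03])

log_lik = float(np.log(joint.sum()))   # l(theta) = log P(x | theta)
posterior = joint / joint.sum()        # P(y | x, theta)

def free_energy(q):
    """F(q, theta) = <log P(y, x | theta)>_q + H[q] for a discrete q with full support."""
    return float(np.sum(q * np.log(joint)) - np.sum(q * np.log(q)))

def kl(q, p):
    """KL[q || p] for discrete distributions with full support."""
    return float(np.sum(q * np.log(q / p)))

q_arbitrary = np.array([0.5, 0.5])
gap = log_lik - free_energy(q_arbitrary)   # equals KL[q || posterior]
f_at_posterior = free_energy(posterior)    # equals log_lik: the bound is tight
```

For any other choice of `q_arbitrary` with full support the gap stays non-negative, and it vanishes only when `q_arbitrary` equals `posterior`.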

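21 KL[q(x) ‖ p(x)] is non-negative and zero iff ∀x: p(x) = q(x)

First let's consider discrete distributions; the Kullback-Leibler divergence is:
$$\mathrm{KL}[q \,\|\, p] = \sum_i q_i \log \frac{q_i}{p_i}.$$
To find the distribution $q$ which minimizes $\mathrm{KL}[q \,\|\, p]$ we add a Lagrange multiplier to enforce the normalization constraint:
$$E \overset{\text{def}}{=} \mathrm{KL}[q \,\|\, p] + \lambda\Big(1 - \sum_i q_i\Big) = \sum_i q_i \log \frac{q_i}{p_i} + \lambda\Big(1 - \sum_i q_i\Big)$$
We then take partial derivatives and set to zero:
$$\frac{\partial E}{\partial q_i} = \log q_i - \log p_i + 1 - \lambda = 0 \;\Rightarrow\; q_i = p_i \exp(\lambda - 1)$$
$$\frac{\partial E}{\partial \lambda} = 1 - \sum_i q_i = 0 \;\Rightarrow\; \sum_i q_i = 1$$
Together these give $q_i = p_i$.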

22 KL[q(x) ‖ p(x)] is non-negative and zero iff ∀x: p(x) = q(x)

Check that the curvature (Hessian) is positive (definite), corresponding to a minimum:
$$\frac{\partial^2 E}{\partial q_i\, \partial q_i} = \frac{1}{q_i} > 0, \qquad \frac{\partial^2 E}{\partial q_i\, \partial q_j} = 0 \quad (i \ne j),$$
showing that $q_i = p_i$ is a genuine minimum.
At the minimum it is easily verified that $\mathrm{KL}[p \,\|\, p] = 0$.
A similar proof holds for $\mathrm{KL}[\cdot \,\|\, \cdot]$ between continuous densities, the derivatives being substituted by functional derivatives.

23 Coordinate Ascent in F (Demo)

One parameter mixture:
$$s \sim \mathrm{Bernoulli}[\pi], \qquad x \mid s = 0 \sim \mathcal{N}[-1, 1], \qquad x \mid s = 1 \sim \mathcal{N}[1, 1]$$
and one data point $x_1 = 0.3$.
$q(s)$ is a distribution on a single binary latent, and so is represented by $r_1 \in [0, 1]$.
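The demo can be reproduced in a few lines. The sketch below is one possible implementation under stated assumptions: it takes the component means to be -1 and +1 with unit variance, writes r for q(s = 1), and starts from an arbitrary π = 0.2. The function names and the starting value are illustrative choices, not taken from the slides.

```python
import math

X_OBS = 0.3  # the single data point

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def log_lik(pi):
    """l(pi) = log[(1 - pi) N(x; -1, 1) + pi N(x; 1, 1)]."""
    return math.log((1.0 - pi) * normal_pdf(X_OBS, -1.0) + pi * normal_pdf(X_OBS, 1.0))

def free_energy(r, pi):
    """F(r, pi) with q(s = 1) = r; requires 0 < r < 1 and 0 < pi < 1."""
    expected_log_joint = (
        r * (math.log(pi) + math.log(normal_pdf(X_OBS, 1.0)))
        + (1.0 - r) * (math.log(1.0 - pi) + math.log(normal_pdf(X_OBS, -1.0)))
    )
    entropy = -r * math.log(r) - (1.0 - r) * math.log(1.0 - r)
    return expected_log_joint + entropy

def e_step(pi):
    """Set r to the posterior P(s = 1 | x, pi)."""
    w1 = pi * normal_pdf(X_OBS, 1.0)
    w0 = (1.0 - pi) * normal_pdf(X_OBS, -1.0)
    return w1 / (w0 + w1)

def m_step(r):
    """Maximise r log(pi) + (1 - r) log(1 - pi) over pi, which gives pi = r."""
    return r

pi = 0.2  # arbitrary starting value (an assumption of this sketch)
trace = []
for _ in range(5):
    ll_before = log_lik(pi)
    r = e_step(pi)                  # move in r with pi fixed
    f_after_e = free_energy(r, pi)
    pi = m_step(r)                  # move in pi with r fixed
    f_after_m = free_energy(r, pi)
    trace.append((ll_before, f_after_e, f_after_m, log_lik(pi), pi))
```

Each tuple in `trace` records the chain l(old π) = F after the E step ≤ F after the M step ≤ l(new π). Because the data point lies closer to +1 than to -1, π drifts upwards at every iteration.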

24 Coordinate Ascent in F (Demo)

[Figure: plot with axes r and π]

25 EM Never Decreases the Likelihood

The E and M steps together never decrease the log likelihood:
$$\ell\big(\theta^{(k-1)}\big) \underset{\text{E step}}{=} \mathcal{F}\big(q^{(k)}, \theta^{(k-1)}\big) \underset{\text{M step}}{\le} \mathcal{F}\big(q^{(k)}, \theta^{(k)}\big) \underset{\text{Jensen}}{\le} \ell\big(\theta^{(k)}\big),$$
The E step brings the free energy to the likelihood.
The M-step maximises the free energy wrt $\theta$.
$\mathcal{F} \le \ell$ by Jensen or, equivalently, from the non-negativity of KL.

If the M-step is executed so that $\theta^{(k)} \ne \theta^{(k-1)}$ iff $\mathcal{F}$ increases, then the overall EM iteration will step to a new value of $\theta$ iff the likelihood increases.

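30 Fixed Points of EM are Stationary Points in l

Let a fixed point of EM occur with parameter $\theta^*$. Then:
$$\frac{\partial}{\partial \theta} \langle \log P(Y, X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} \Big|_{\theta^*} = 0$$
Now,
$$\begin{aligned}
\ell(\theta) = \log P(X \mid \theta) &= \langle \log P(X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} \\
&= \left\langle \log \frac{P(Y, X \mid \theta)}{P(Y \mid X, \theta)} \right\rangle_{P(Y \mid X, \theta^*)} \\
&= \langle \log P(Y, X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} - \langle \log P(Y \mid X, \theta) \rangle_{P(Y \mid X, \theta^*)}
\end{aligned}$$
so,
$$\frac{d}{d\theta} \ell(\theta) = \frac{d}{d\theta} \langle \log P(Y, X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} - \frac{d}{d\theta} \langle \log P(Y \mid X, \theta) \rangle_{P(Y \mid X, \theta^*)}$$
The second term is 0 at $\theta^*$ if the derivative exists (minimum of $\mathrm{KL}[\cdot \,\|\, \cdot]$), and thus:
$$\frac{d}{d\theta} \ell(\theta) \Big|_{\theta^*} = \frac{d}{d\theta} \langle \log P(Y, X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} \Big|_{\theta^*} = 0$$
So, EM converges to a stationary point of $\ell(\theta)$.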

38 Maxima in F correspond to maxima in l

Let $\theta^*$ now be the parameter value at a local maximum of $\mathcal{F}$ (and thus at a fixed point).
Differentiating the previous expression wrt $\theta$ again we find
$$\frac{d^2}{d\theta^2} \ell(\theta) = \frac{d^2}{d\theta^2} \langle \log P(Y, X \mid \theta) \rangle_{P(Y \mid X, \theta^*)} - \frac{d^2}{d\theta^2} \langle \log P(Y \mid X, \theta) \rangle_{P(Y \mid X, \theta^*)}$$
The first term on the right is negative (a maximum) and the second term is positive (a minimum). Thus the curvature of the likelihood is negative and $\theta^*$ is a maximum of $\ell$.
[... as long as the derivatives exist. They sometimes don't (zero-noise ICA)].

43 Partial M steps and Partial E steps

Partial M steps: The proof holds even if we just increase $\mathcal{F}$ wrt $\theta$ rather than maximize it. (Dempster, Laird and Rubin (1977) call this the generalized EM, or GEM, algorithm.)

Partial E steps: We can also just increase $\mathcal{F}$ wrt some of the $q$s. For example, sparse or online versions of the EM algorithm would compute the posterior for a subset of the data points or as the data arrives, respectively. You can also update the posterior over a subset of the hidden variables, while holding others fixed...

44 The Gaussian mixture model (E-step)

In a univariate Gaussian mixture model, the density of a data point $x$ is:
$$p(x \mid \theta) = \sum_{m=1}^{k} p(s = m \mid \theta)\, p(x \mid s = m, \theta) \propto \sum_{m=1}^{k} \frac{\pi_m}{\sigma_m} \exp\left\{-\frac{1}{2\sigma_m^2}(x - \mu_m)^2\right\},$$
where $\theta$ is the collection of parameters: means $\mu_m$, variances $\sigma_m^2$ and mixing proportions $\pi_m = p(s = m \mid \theta)$.
The hidden variable $s_i$ indicates which component observation $x_i$ belongs to.
The E-step computes the posterior for $s_i$ given the current parameters:
$$q(s_i) = p(s_i \mid x_i, \theta) \propto p(x_i \mid s_i, \theta)\, p(s_i \mid \theta)$$
$$r_{im} \overset{\text{def}}{=} q(s_i = m) \propto \frac{\pi_m}{\sigma_m} \exp\left\{-\frac{1}{2\sigma_m^2}(x_i - \mu_m)^2\right\} \quad \text{(responsibilities)}$$
with the normalization such that $\sum_m r_{im} = 1$.

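47 The Gaussian mixture model (M-step)

In the M-step we optimize the sum (since $s$ is discrete):
$$E = \langle \log p(x, s \mid \theta) \rangle_{q(s)} = \sum q(s) \log[p(s \mid \theta)\, p(x \mid s, \theta)] = \sum_{i,m} r_{im} \left[\log \pi_m - \log \sigma_m - \frac{1}{2\sigma_m^2}(x_i - \mu_m)^2\right].$$
Optimization is done by setting the partial derivatives of $E$ to zero:
$$\frac{\partial E}{\partial \mu_m} = \sum_i r_{im} \frac{(x_i - \mu_m)}{\sigma_m^2} = 0 \;\Rightarrow\; \mu_m = \frac{\sum_i r_{im} x_i}{\sum_i r_{im}},$$
$$\frac{\partial E}{\partial \sigma_m} = \sum_i r_{im} \left[-\frac{1}{\sigma_m} + \frac{(x_i - \mu_m)^2}{\sigma_m^3}\right] = 0 \;\Rightarrow\; \sigma_m^2 = \frac{\sum_i r_{im} (x_i - \mu_m)^2}{\sum_i r_{im}},$$
$$\frac{\partial E}{\partial \pi_m} = \sum_i r_{im} \frac{1}{\pi_m}, \qquad \frac{\partial E}{\partial \pi_m} + \lambda = 0 \;\Rightarrow\; \pi_m = \frac{1}{n} \sum_i r_{im},$$
where $\lambda$ is a Lagrange multiplier ensuring that the mixing proportions sum to unity.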
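The E-step and M-step updates for the univariate mixture translate almost line for line into NumPy. The following is a minimal sketch, assuming synthetic two-cluster data and an arbitrary initialisation; the function names, the data, and the number of iterations are illustrative choices rather than anything prescribed by the slides.

```python
import numpy as np

def gmm_log_lik(x, pi, mu, sigma):
    """Log-likelihood sum_i log sum_m pi_m N(x_i; mu_m, sigma_m^2)."""
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return float(np.log(dens.sum(axis=1)).sum())

def em_step(x, pi, mu, sigma):
    """One EM iteration for a univariate Gaussian mixture."""
    # E step: r_im proportional to (pi_m / sigma_m) exp{-(x_i - mu_m)^2 / (2 sigma_m^2)}
    unnorm = (pi / sigma) * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
    r = unnorm / unnorm.sum(axis=1, keepdims=True)
    # M step: responsibility-weighted means, variances and mixing proportions
    n_m = r.sum(axis=0)
    mu_new = (r * x[:, None]).sum(axis=0) / n_m
    var_new = (r * (x[:, None] - mu_new) ** 2).sum(axis=0) / n_m
    pi_new = n_m / len(x)
    return pi_new, mu_new, np.sqrt(var_new)

# Synthetic data (illustrative): two clusters centred near -2 and 3.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(3.0, 1.0, 100)])

# Arbitrary starting parameters (an assumption of this sketch).
pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

log_liks = [gmm_log_lik(x, pi, mu, sigma)]
for _ in range(30):
    pi, mu, sigma = em_step(x, pi, mu, sigma)
    log_liks.append(gmm_log_lik(x, pi, mu, sigma))
```

Tracking `log_liks` is a cheap way to catch implementation bugs: since EM never decreases the log likelihood, any drop beyond rounding error points to a mistake in one of the updates.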

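48 Factor Analysis

[Figure: graphical model with latent factors $y_1, \dots, y_K$ and observations $x_1, x_2, \dots, x_D$]

Linear generative model:
$$x_d = \sum_{k=1}^{K} \Lambda_{dk}\, y_k + \epsilon_d$$
$y_k$ are independent $\mathcal{N}(0, 1)$ Gaussian factors
$\epsilon_d$ are independent $\mathcal{N}(0, \Psi_{dd})$ Gaussian noise
$K < D$

So, $x$ is Gaussian with:
$$p(x) = \int p(y)\, p(x \mid y)\, dy = \mathcal{N}(0, \Lambda\Lambda^{\mathsf T} + \Psi)$$
where $\Lambda$ is a $D \times K$ matrix, and $\Psi$ is diagonal.

Dimensionality Reduction: Finds a low-dimensional projection of high dimensional data that captures the correlation structure of the data.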

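49 EM for Factor Analysis

[Figure: graphical model with latent factors $y_1, \dots, y_K$ and observations $x_1, x_2, \dots, x_D$]

The model for $x$:
$$p(x \mid \theta) = \int p(y \mid \theta)\, p(x \mid y, \theta)\, dy = \mathcal{N}(0, \Lambda\Lambda^{\mathsf T} + \Psi)$$
Model parameters: $\theta = \{\Lambda, \Psi\}$.

E step: For each data point $x_n$, compute the posterior distribution of hidden factors given the observed data: $q_n(y) = p(y \mid x_n, \theta_t)$.
M step: Find the $\theta_{t+1}$ that maximises $\mathcal{F}(q, \theta)$:
$$\begin{aligned}
\mathcal{F}(q, \theta) &= \sum_n \int q_n(y) \left[\log p(y \mid \theta) + \log p(x_n \mid y, \theta) - \log q_n(y)\right] dy \\
&= \sum_n \int q_n(y) \left[\log p(y \mid \theta) + \log p(x_n \mid y, \theta)\right] dy + c.
\end{aligned}$$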

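50 The E step for Factor Analysis

E step: For each data point $x_n$, compute the posterior distribution of hidden factors given the observed data: $q_n(y) = p(y \mid x_n, \theta) = p(y, x_n \mid \theta) / p(x_n \mid \theta)$

Tactic: write $p(y, x_n \mid \theta)$, consider $x_n$ to be fixed. What is this as a function of $y$?
$$\begin{aligned}
p(y, x_n) = p(y)\, p(x_n \mid y) &= (2\pi)^{-K/2} \exp\{-\tfrac{1}{2} y^{\mathsf T} y\}\; |2\pi\Psi|^{-1/2} \exp\{-\tfrac{1}{2}(x_n - \Lambda y)^{\mathsf T} \Psi^{-1} (x_n - \Lambda y)\} \\
&= c \times \exp\{-\tfrac{1}{2}[y^{\mathsf T} y + (x_n - \Lambda y)^{\mathsf T} \Psi^{-1} (x_n - \Lambda y)]\} \\
&= c' \times \exp\{-\tfrac{1}{2}[y^{\mathsf T} (I + \Lambda^{\mathsf T} \Psi^{-1} \Lambda) y - 2 y^{\mathsf T} \Lambda^{\mathsf T} \Psi^{-1} x_n]\} \\
&= c'' \times \exp\{-\tfrac{1}{2}[y^{\mathsf T} \Sigma^{-1} y - 2 y^{\mathsf T} \Sigma^{-1} \mu + \mu^{\mathsf T} \Sigma^{-1} \mu]\}
\end{aligned}$$
So $\Sigma = (I + \Lambda^{\mathsf T} \Psi^{-1} \Lambda)^{-1} = I - \beta\Lambda$ and $\mu = \Sigma \Lambda^{\mathsf T} \Psi^{-1} x_n = \beta x_n$, where $\beta = \Sigma \Lambda^{\mathsf T} \Psi^{-1}$.
Note that $\mu$ is a linear function of $x_n$ and $\Sigma$ does not depend on $x_n$.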

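51 The M step for Factor Analysis

M step: Find $\theta_{t+1}$ maximising $\mathcal{F} = \sum_n \int q_n(y) \left[\log p(y \mid \theta) + \log p(x_n \mid y, \theta)\right] dy + c$
$$\log p(y \mid \theta) + \log p(x_n \mid y, \theta) = c - \tfrac{1}{2} y^{\mathsf T} y - \tfrac{1}{2} \log|\Psi| - \tfrac{1}{2}(x_n - \Lambda y)^{\mathsf T} \Psi^{-1} (x_n - \Lambda y)$$
Taking expectations over $q_n(y)$, and absorbing terms that do not depend on $\theta$ into the constant:
$$\begin{aligned}
&= c' - \tfrac{1}{2} \log|\Psi| - \tfrac{1}{2}\left[x_n^{\mathsf T} \Psi^{-1} x_n - 2 x_n^{\mathsf T} \Psi^{-1} \Lambda \langle y \rangle + \langle y^{\mathsf T} \Lambda^{\mathsf T} \Psi^{-1} \Lambda y \rangle\right] \\
&= c' - \tfrac{1}{2} \log|\Psi| - \tfrac{1}{2}\left[x_n^{\mathsf T} \Psi^{-1} x_n - 2 x_n^{\mathsf T} \Psi^{-1} \Lambda \langle y \rangle + \mathrm{Tr}\left[\Lambda^{\mathsf T} \Psi^{-1} \Lambda \langle y y^{\mathsf T} \rangle\right]\right] \\
&= c' - \tfrac{1}{2} \log|\Psi| - \tfrac{1}{2}\left[x_n^{\mathsf T} \Psi^{-1} x_n - 2 x_n^{\mathsf T} \Psi^{-1} \Lambda \mu_n + \mathrm{Tr}\left[\Lambda^{\mathsf T} \Psi^{-1} \Lambda (\mu_n \mu_n^{\mathsf T} + \Sigma)\right]\right]
\end{aligned}$$
Note that we don't need to know everything about $q$, just the expectations of $y$ and $y y^{\mathsf T}$ under $q$ (i.e. the expected sufficient statistics).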

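52 The M step for Factor Analysis (cont.)

$$\mathcal{F} = c - \frac{N}{2} \log|\Psi| - \frac{1}{2} \sum_n \left[x_n^{\mathsf T} \Psi^{-1} x_n - 2 x_n^{\mathsf T} \Psi^{-1} \Lambda \mu_n + \mathrm{Tr}\left[\Lambda^{\mathsf T} \Psi^{-1} \Lambda (\mu_n \mu_n^{\mathsf T} + \Sigma)\right]\right]$$
Taking derivatives w.r.t. $\Lambda$ and $\Psi^{-1}$, using $\frac{\partial\, \mathrm{Tr}[AB]}{\partial B} = A^{\mathsf T}$ and $\frac{\partial \log|A|}{\partial A} = A^{-\mathsf T}$:
$$\frac{\partial \mathcal{F}}{\partial \Lambda} = \Psi^{-1} \sum_n x_n \mu_n^{\mathsf T} - \Psi^{-1} \Lambda \Big(N\Sigma + \sum_n \mu_n \mu_n^{\mathsf T}\Big) = 0$$
$$\hat{\Lambda} = \Big(\sum_n x_n \mu_n^{\mathsf T}\Big) \Big(N\Sigma + \sum_n \mu_n \mu_n^{\mathsf T}\Big)^{-1}$$
$$\frac{\partial \mathcal{F}}{\partial \Psi^{-1}} = \frac{N}{2} \Psi - \frac{1}{2} \sum_n \left[x_n x_n^{\mathsf T} - \Lambda \mu_n x_n^{\mathsf T} - x_n \mu_n^{\mathsf T} \Lambda^{\mathsf T} + \Lambda (\mu_n \mu_n^{\mathsf T} + \Sigma) \Lambda^{\mathsf T}\right]$$
$$\hat{\Psi} = \frac{1}{N} \sum_n \left[x_n x_n^{\mathsf T} - \Lambda \mu_n x_n^{\mathsf T} - x_n \mu_n^{\mathsf T} \Lambda^{\mathsf T} + \Lambda (\mu_n \mu_n^{\mathsf T} + \Sigma) \Lambda^{\mathsf T}\right]$$
$$\hat{\Psi} = \Lambda \Sigma \Lambda^{\mathsf T} + \frac{1}{N} \sum_n (x_n - \Lambda \mu_n)(x_n - \Lambda \mu_n)^{\mathsf T} \quad \text{(squared residuals)}$$
Note: we should actually only take derivatives w.r.t. $\Psi_{dd}$ since $\Psi$ is diagonal.
When $\Sigma \to 0$ these become the equations for linear regression!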
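Putting the E-step and M-step formulas for factor analysis together gives a compact EM loop. The sketch below is one way to write it, assuming zero-mean data, a diagonal Ψ stored as a vector `psi`, synthetic data from a made-up ground-truth model, and an arbitrary initialisation; all function and variable names are illustrative.

```python
import numpy as np

def fa_e_step(X, Lam, psi):
    """Posterior q_n(y) = N(mu_n, Sigma). X: (N, D), Lam: (D, K), psi: diagonal of Psi, (D,)."""
    K = Lam.shape[1]
    LtPinv = Lam.T / psi                              # Lambda^T Psi^{-1}, shape (K, D)
    Sigma = np.linalg.inv(np.eye(K) + LtPinv @ Lam)   # (I + Lambda^T Psi^{-1} Lambda)^{-1}
    beta = Sigma @ LtPinv                             # beta = Sigma Lambda^T Psi^{-1}
    Mu = X @ beta.T                                   # row n holds mu_n = beta x_n
    return Mu, Sigma

def fa_m_step(X, Mu, Sigma):
    """Closed-form updates for Lambda and the diagonal of Psi."""
    N = X.shape[0]
    Eyy = N * Sigma + Mu.T @ Mu                       # N Sigma + sum_n mu_n mu_n^T
    Lam = (X.T @ Mu) @ np.linalg.inv(Eyy)             # (sum_n x_n mu_n^T)(...)^{-1}
    resid = X - Mu @ Lam.T                            # rows are x_n - Lambda mu_n
    psi = np.diag(Lam @ Sigma @ Lam.T) + (resid ** 2).mean(axis=0)
    return Lam, psi

def fa_log_lik(X, Lam, psi):
    """Log-likelihood under x ~ N(0, Lambda Lambda^T + Psi)."""
    N, D = X.shape
    C = Lam @ Lam.T + np.diag(psi)
    _, logdet = np.linalg.slogdet(C)
    quad = np.einsum("nd,de,ne->", X, np.linalg.inv(C), X)
    return float(-0.5 * (N * D * np.log(2 * np.pi) + N * logdet + quad))

# Synthetic data from a hypothetical ground-truth model (values are illustrative).
rng = np.random.default_rng(0)
N, D, K = 500, 5, 2
Lam_true = rng.normal(size=(D, K))
psi_true = rng.uniform(0.2, 1.0, size=D)
X = rng.normal(size=(N, K)) @ Lam_true.T + rng.normal(size=(N, D)) * np.sqrt(psi_true)

Lam, psi = rng.normal(size=(D, K)), np.ones(D)        # arbitrary initialisation
log_liks = [fa_log_lik(X, Lam, psi)]
for _ in range(50):
    Mu, Sigma = fa_e_step(X, Lam, psi)
    Lam, psi = fa_m_step(X, Mu, Sigma)
    log_liks.append(fa_log_lik(X, Lam, psi))
```

Only the diagonal of the Ψ update is kept, in line with the note that derivatives should be taken with respect to the diagonal entries alone. The recovered `Lam` is identifiable only up to a rotation of the factors, so it should not be compared entry by entry with `Lam_true`.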

53 Mixtures of Factor Analysers

Simultaneous clustering and dimensionality reduction.
$$p(x \mid \theta) = \sum_k \pi_k\, \mathcal{N}(\mu_k, \Lambda_k \Lambda_k^{\mathsf T} + \Psi)$$
where $\pi_k$ is the mixing proportion for FA $k$, $\mu_k$ is its centre, $\Lambda_k$ is its factor loading matrix, and $\Psi$ is a common sensor noise model.
$$\theta = \{\{\pi_k, \mu_k, \Lambda_k\}_{k=1 \dots K}, \Psi\}$$
We can think of this model as having two sets of hidden latent variables:
A discrete indicator variable $s_n \in \{1, \dots, K\}$
For each factor analyzer, a continuous factor vector $y_{n,k} \in \mathbb{R}^{D_k}$
$$p(x \mid \theta) = \sum_{s_n = 1}^{K} p(s_n \mid \theta) \int p(y \mid s_n, \theta)\, p(x_n \mid y, s_n, \theta)\, dy$$
As before, an EM algorithm can be derived for this model:
E step: Infer joint distribution of latent variables, $p(y_n, s_n \mid x_n, \theta)$
M step: Maximize $\mathcal{F}$ with respect to $\theta$.

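54 EM for exponential families

Defn: $p$ is in the exponential family for $z = (y, x)$ if it can be written:
$$p(z \mid \theta) = b(z) \exp\{\theta^{\mathsf T} s(z)\} / \alpha(\theta)$$
where $\alpha(\theta) = \int b(z) \exp\{\theta^{\mathsf T} s(z)\}\, dz$

E step: $q(y) = p(y \mid x, \theta)$
M step: $\theta^{(k)} := \operatorname*{argmax}_{\theta} \mathcal{F}(q, \theta)$
$$\begin{aligned}
\mathcal{F}(q, \theta) &= \int q(y) \log p(y, x \mid \theta)\, dy + H(q) \\
&= \int q(y) \left[\theta^{\mathsf T} s(z) - \log \alpha(\theta)\right] dy + \text{const}
\end{aligned}$$
It is easy to verify that:
$$\frac{\partial \log \alpha(\theta)}{\partial \theta} = E[s(z) \mid \theta]$$
Therefore, the M step solves:
$$\frac{\partial \mathcal{F}}{\partial \theta} = E_{q(y)}[s(z)] - E[s(z) \mid \theta] = 0$$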
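The identity relating the gradient of log α(θ) to the expected sufficient statistic can be checked on the simplest possible case. The sketch below assumes a Bernoulli variable written in exponential-family form with b(z) = 1 and s(z) = z, so that α(θ) = 1 + exp(θ); the choice of family, the test point θ = 0.7 and the target moment 0.25 are illustrative assumptions.

```python
import math

def log_alpha(theta):
    """log alpha(theta) for z in {0, 1}, b(z) = 1, s(z) = z: alpha = 1 + exp(theta)."""
    return math.log1p(math.exp(theta))

def expected_stat(theta):
    """E[s(z) | theta] = P(z = 1 | theta) = exp(theta) / (1 + exp(theta))."""
    return 1.0 / (1.0 + math.exp(-theta))

# Central finite difference of log alpha at an arbitrary point.
theta, h = 0.7, 1e-6
finite_diff = (log_alpha(theta + h) - log_alpha(theta - h)) / (2 * h)

# Moment matching: if the expected statistic under q is m, the M step solves
# E[s(z) | theta] = m, which for this family gives theta = logit(m).
m = 0.25
theta_hat = math.log(m / (1.0 - m))
```

In this family the M step therefore reduces to matching a single moment, which is the general pattern the stationarity condition expresses.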

55 References

A. P. Dempster, N. M. Laird and D. B. Rubin (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1.
R. M. Neal and G. E. Hinton (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan (editor), Learning in Graphical Models, Dordrecht: Kluwer Academic Publishers. radford/ftp/emk.pdf
Z. Ghahramani and G. E. Hinton (1996). The EM Algorithm for Mixtures of Factor Analyzers. University of Toronto Technical Report CRG-TR

56 Failure Modes of EM

EM can fail under a number of degenerate situations:
EM may converge to a bad local maximum.
Likelihood function may not be bounded above. E.g. a cluster responsible for a single data item can give arbitrarily large likelihood if variance $\sigma_m^2 \to 0$.
Free energy may not be well defined (or is infinite).
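The unbounded-likelihood failure is easy to reproduce numerically. The sketch below assumes a two-component univariate mixture in which the second component is centred exactly on one data point while its standard deviation is shrunk; the data values, parameters and function name are made up for illustration.

```python
import numpy as np

def gmm_log_lik(x, pi, mu, sigma):
    """Log-likelihood sum_i log sum_m pi_m N(x_i; mu_m, sigma_m^2)."""
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return float(np.log(dens.sum(axis=1)).sum())

x = np.array([-1.0, 0.0, 0.5, 2.0])   # illustrative data
pi = np.array([0.5, 0.5])
mu = np.array([0.4, 2.0])             # second component sits on the data point 2.0

# Shrink the second component's standard deviation towards zero.
shrinking = [1.0, 1e-1, 1e-2, 1e-3, 1e-4]
log_liks = [gmm_log_lik(x, pi, mu, np.array([1.0, s])) for s in shrinking]
```

The broad first component keeps every other data point at a finite log density, so the total keeps growing roughly like the negative log of the shrinking standard deviation. In practice this is usually handled by bounding the variances from below or by placing a prior on them.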

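57 Proof of the Matrix Inversion Lemma

$$(A + X B X^{\mathsf T})^{-1} = A^{-1} - A^{-1} X (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} A^{-1}$$
Need to prove:
$$\left(A^{-1} - A^{-1} X (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} A^{-1}\right)(A + X B X^{\mathsf T}) = I$$
Expand:
$$I + A^{-1} X B X^{\mathsf T} - A^{-1} X (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} - A^{-1} X (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} A^{-1} X B X^{\mathsf T}$$
Regroup:
$$\begin{aligned}
&= I + A^{-1} X \left(B X^{\mathsf T} - (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} - (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} A^{-1} X B X^{\mathsf T}\right) \\
&= I + A^{-1} X \left(B X^{\mathsf T} - (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} B^{-1} B X^{\mathsf T} - (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} X^{\mathsf T} A^{-1} X B X^{\mathsf T}\right) \\
&= I + A^{-1} X \left(B X^{\mathsf T} - (B^{-1} + X^{\mathsf T} A^{-1} X)^{-1} (B^{-1} + X^{\mathsf T} A^{-1} X) B X^{\mathsf T}\right) \\
&= I + A^{-1} X (B X^{\mathsf T} - B X^{\mathsf T}) = I
\end{aligned}$$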
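The lemma can also be checked numerically on random matrices. The sketch below assumes a diagonal positive A, a symmetric positive definite B and a random X; the sizes and the seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
D, K = 6, 2

A = np.diag(rng.uniform(0.5, 2.0, size=D))   # invertible D x D matrix
M = rng.normal(size=(K, K))
B = M @ M.T + np.eye(K)                      # symmetric positive definite K x K
X = rng.normal(size=(D, K))

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

# Left-hand side: direct inversion of the D x D matrix.
lhs = np.linalg.inv(A + X @ B @ X.T)

# Right-hand side: only a K x K matrix has to be inverted besides A and B.
inner = np.linalg.inv(B_inv + X.T @ A_inv @ X)
rhs = A_inv - A_inv @ X @ inner @ X.T @ A_inv

max_abs_diff = float(np.abs(lhs - rhs).max())
```

Setting A = Ψ, B = I and X = Λ gives an expression for the inverse of ΛΛᵀ + Ψ that needs only a K × K inversion plus the inverse of the diagonal Ψ, which is convenient for factor analysis when K is much smaller than D.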


Διαβάστε περισσότερα

Matrices and Determinants

Matrices and Determinants Matrices and Determinants SUBJECTIVE PROBLEMS: Q 1. For what value of k do the following system of equations possess a non-trivial (i.e., not all zero) solution over the set of rationals Q? x + ky + 3z

Διαβάστε περισσότερα

4.6 Autoregressive Moving Average Model ARMA(1,1)

4.6 Autoregressive Moving Average Model ARMA(1,1) 84 CHAPTER 4. STATIONARY TS MODELS 4.6 Autoregressive Moving Average Model ARMA(,) This section is an introduction to a wide class of models ARMA(p,q) which we will consider in more detail later in this

Διαβάστε περισσότερα

Section 8.3 Trigonometric Equations

Section 8.3 Trigonometric Equations 99 Section 8. Trigonometric Equations Objective 1: Solve Equations Involving One Trigonometric Function. In this section and the next, we will exple how to solving equations involving trigonometric functions.

Διαβάστε περισσότερα

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM Solutions to Question 1 a) The cumulative distribution function of T conditional on N n is Pr (T t N n) Pr (max (X 1,..., X N ) t N n) Pr (max

Διαβάστε περισσότερα

The Simply Typed Lambda Calculus

The Simply Typed Lambda Calculus Type Inference Instead of writing type annotations, can we use an algorithm to infer what the type annotations should be? That depends on the type system. For simple type systems the answer is yes, and

Διαβάστε περισσότερα

Numerical Analysis FMN011

Numerical Analysis FMN011 Numerical Analysis FMN011 Carmen Arévalo Lund University carmen@maths.lth.se Lecture 12 Periodic data A function g has period P if g(x + P ) = g(x) Model: Trigonometric polynomial of order M T M (x) =

Διαβάστε περισσότερα

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3 Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3 1 State vector space and the dual space Space of wavefunctions The space of wavefunctions is the set of all

Διαβάστε περισσότερα

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Chapter 6: Systems of Linear Differential. be continuous functions on the interval Chapter 6: Systems of Linear Differential Equations Let a (t), a 2 (t),..., a nn (t), b (t), b 2 (t),..., b n (t) be continuous functions on the interval I. The system of n first-order differential equations

Διαβάστε περισσότερα

Areas and Lengths in Polar Coordinates

Areas and Lengths in Polar Coordinates Kiryl Tsishchanka Areas and Lengths in Polar Coordinates In this section we develop the formula for the area of a region whose boundary is given by a polar equation. We need to use the formula for the

Διαβάστε περισσότερα

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8  questions or comments to Dan Fetter 1 Eon : Fall 8 Suggested Solutions to Problem Set 8 Email questions or omments to Dan Fetter Problem. Let X be a salar with density f(x, θ) (θx + θ) [ x ] with θ. (a) Find the most powerful level α test

Διαβάστε περισσότερα

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R + Chapter 3. Fuzzy Arithmetic 3- Fuzzy arithmetic: ~Addition(+) and subtraction (-): Let A = [a and B = [b, b in R If x [a and y [b, b than x+y [a +b +b Symbolically,we write A(+)B = [a (+)[b, b = [a +b

Διαβάστε περισσότερα

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit Ting Zhang Stanford May 11, 2001 Stanford, 5/11/2001 1 Outline Ordinal Classification Ordinal Addition Ordinal Multiplication Ordinal

Διαβάστε περισσότερα

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS CHAPTER 5 SOLVING EQUATIONS BY ITERATIVE METHODS EXERCISE 104 Page 8 1. Find the positive root of the equation x + 3x 5 = 0, correct to 3 significant figures, using the method of bisection. Let f(x) =

Διαβάστε περισσότερα

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =? Teko Classes IITJEE/AIEEE Maths by SUHAAG SIR, Bhopal, Ph (0755) 3 00 000 www.tekoclasses.com ANSWERSHEET (TOPIC DIFFERENTIAL CALCULUS) COLLECTION # Question Type A.Single Correct Type Q. (A) Sol least

Διαβάστε περισσότερα

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics Fourier Series MATH 211, Calculus II J. Robert Buchanan Department of Mathematics Spring 2018 Introduction Not all functions can be represented by Taylor series. f (k) (c) A Taylor series f (x) = (x c)

Διαβάστε περισσότερα

Tridiagonal matrices. Gérard MEURANT. October, 2008

Tridiagonal matrices. Gérard MEURANT. October, 2008 Tridiagonal matrices Gérard MEURANT October, 2008 1 Similarity 2 Cholesy factorizations 3 Eigenvalues 4 Inverse Similarity Let α 1 ω 1 β 1 α 2 ω 2 T =......... β 2 α 1 ω 1 β 1 α and β i ω i, i = 1,...,

Διαβάστε περισσότερα

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018 Journal of rogressive Research in Mathematics(JRM) ISSN: 2395-028 SCITECH Volume 3, Issue 2 RESEARCH ORGANISATION ublished online: March 29, 208 Journal of rogressive Research in Mathematics www.scitecresearch.com/journals

Διαβάστε περισσότερα

C.S. 430 Assignment 6, Sample Solutions

C.S. 430 Assignment 6, Sample Solutions C.S. 430 Assignment 6, Sample Solutions Paul Liu November 15, 2007 Note that these are sample solutions only; in many cases there were many acceptable answers. 1 Reynolds Problem 10.1 1.1 Normal-order

Διαβάστε περισσότερα

Areas and Lengths in Polar Coordinates

Areas and Lengths in Polar Coordinates Kiryl Tsishchanka Areas and Lengths in Polar Coordinates In this section we develop the formula for the area of a region whose boundary is given by a polar equation. We need to use the formula for the

Διαβάστε περισσότερα

Fractional Colorings and Zykov Products of graphs

Fractional Colorings and Zykov Products of graphs Fractional Colorings and Zykov Products of graphs Who? Nichole Schimanski When? July 27, 2011 Graphs A graph, G, consists of a vertex set, V (G), and an edge set, E(G). V (G) is any finite set E(G) is

Διαβάστε περισσότερα

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ. Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο The time integral of a force is referred to as impulse, is determined by and is obtained from: Newton s 2 nd Law of motion states that the action

Διαβάστε περισσότερα

Math 6 SL Probability Distributions Practice Test Mark Scheme

Math 6 SL Probability Distributions Practice Test Mark Scheme Math 6 SL Probability Distributions Practice Test Mark Scheme. (a) Note: Award A for vertical line to right of mean, A for shading to right of their vertical line. AA N (b) evidence of recognizing symmetry

Διαβάστε περισσότερα

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits.

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits. EAMCET-. THEORY OF EQUATIONS PREVIOUS EAMCET Bits. Each of the roots of the equation x 6x + 6x 5= are increased by k so that the new transformed equation does not contain term. Then k =... - 4. - Sol.

Διαβάστε περισσότερα

Jesse Maassen and Mark Lundstrom Purdue University November 25, 2013

Jesse Maassen and Mark Lundstrom Purdue University November 25, 2013 Notes on Average Scattering imes and Hall Factors Jesse Maassen and Mar Lundstrom Purdue University November 5, 13 I. Introduction 1 II. Solution of the BE 1 III. Exercises: Woring out average scattering

Διαβάστε περισσότερα

Srednicki Chapter 55

Srednicki Chapter 55 Srednicki Chapter 55 QFT Problems & Solutions A. George August 3, 03 Srednicki 55.. Use equations 55.3-55.0 and A i, A j ] = Π i, Π j ] = 0 (at equal times) to verify equations 55.-55.3. This is our third

Διαβάστε περισσότερα

Probability and Random Processes (Part II)

Probability and Random Processes (Part II) Probability and Random Processes (Part II) 1. If the variance σ x of d(n) = x(n) x(n 1) is one-tenth the variance σ x of a stationary zero-mean discrete-time signal x(n), then the normalized autocorrelation

Διαβάστε περισσότερα

Section 7.6 Double and Half Angle Formulas

Section 7.6 Double and Half Angle Formulas 09 Section 7. Double and Half Angle Fmulas To derive the double-angles fmulas, we will use the sum of two angles fmulas that we developed in the last section. We will let α θ and β θ: cos(θ) cos(θ + θ)

Διαβάστε περισσότερα

Every set of first-order formulas is equivalent to an independent set

Every set of first-order formulas is equivalent to an independent set Every set of first-order formulas is equivalent to an independent set May 6, 2008 Abstract A set of first-order formulas, whatever the cardinality of the set of symbols, is equivalent to an independent

Διαβάστε περισσότερα

forms This gives Remark 1. How to remember the above formulas: Substituting these into the equation we obtain with

forms This gives Remark 1. How to remember the above formulas: Substituting these into the equation we obtain with Week 03: C lassification of S econd- Order L inear Equations In last week s lectures we have illustrated how to obtain the general solutions of first order PDEs using the method of characteristics. We

Διαβάστε περισσότερα

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007 Οδηγίες: Να απαντηθούν όλες οι ερωτήσεις. Αν κάπου κάνετε κάποιες υποθέσεις να αναφερθούν στη σχετική ερώτηση. Όλα τα αρχεία που αναφέρονται στα προβλήματα βρίσκονται στον ίδιο φάκελο με το εκτελέσιμο

Διαβάστε περισσότερα

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review Harvard College Statistics 104: Quantitative Methods for Economics Formula and Theorem Review Tommy MacWilliam, 13 tmacwilliam@college.harvard.edu March 10, 2011 Contents 1 Introduction to Data 5 1.1 Sample

Διαβάστε περισσότερα

Mean-Variance Analysis

Mean-Variance Analysis Mean-Variance Analysis Jan Schneider McCombs School of Business University of Texas at Austin Jan Schneider Mean-Variance Analysis Beta Representation of the Risk Premium risk premium E t [Rt t+τ ] R1

Διαβάστε περισσότερα

Parametrized Surfaces

Parametrized Surfaces Parametrized Surfaces Recall from our unit on vector-valued functions at the beginning of the semester that an R 3 -valued function c(t) in one parameter is a mapping of the form c : I R 3 where I is some

Διαβάστε περισσότερα

Μηχανική Μάθηση Hypothesis Testing

Μηχανική Μάθηση Hypothesis Testing ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ Μηχανική Μάθηση Hypothesis Testing Γιώργος Μπορμπουδάκης Τμήμα Επιστήμης Υπολογιστών Procedure 1. Form the null (H 0 ) and alternative (H 1 ) hypothesis 2. Consider

Διαβάστε περισσότερα

Tutorial on Multinomial Logistic Regression

Tutorial on Multinomial Logistic Regression Tutorial on Multinomial Logistic Regression Javier R Movellan June 19, 2013 1 1 General Model The inputs are n-dimensional vectors the outputs are c-dimensional vectors The training sample consist of m

Διαβάστε περισσότερα

Uniform Convergence of Fourier Series Michael Taylor

Uniform Convergence of Fourier Series Michael Taylor Uniform Convergence of Fourier Series Michael Taylor Given f L 1 T 1 ), we consider the partial sums of the Fourier series of f: N 1) S N fθ) = ˆfk)e ikθ. k= N A calculation gives the Dirichlet formula

Διαβάστε περισσότερα

The challenges of non-stable predicates

The challenges of non-stable predicates The challenges of non-stable predicates Consider a non-stable predicate Φ encoding, say, a safety property. We want to determine whether Φ holds for our program. The challenges of non-stable predicates

Διαβάστε περισσότερα

Reminders: linear functions

Reminders: linear functions Reminders: linear functions Let U and V be vector spaces over the same field F. Definition A function f : U V is linear if for every u 1, u 2 U, f (u 1 + u 2 ) = f (u 1 ) + f (u 2 ), and for every u U

Διαβάστε περισσότερα

6. MAXIMUM LIKELIHOOD ESTIMATION

6. MAXIMUM LIKELIHOOD ESTIMATION 6 MAXIMUM LIKELIHOOD ESIMAION [1] Maximum Likelihood Estimator (1) Cases in which θ (unknown parameter) is scalar Notational Clarification: From now on, we denote the true value of θ as θ o hen, view θ

Διαβάστε περισσότερα

Second Order Partial Differential Equations

Second Order Partial Differential Equations Chapter 7 Second Order Partial Differential Equations 7.1 Introduction A second order linear PDE in two independent variables (x, y Ω can be written as A(x, y u x + B(x, y u xy + C(x, y u u u + D(x, y

Διαβάστε περισσότερα

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM 2008 6 Chinese Journal of Applied Probability and Statistics Vol.24 No.3 Jun. 2008 Monte Carlo EM 1,2 ( 1,, 200241; 2,, 310018) EM, E,,. Monte Carlo EM, EM E Monte Carlo,. EM, Monte Carlo EM,,,,. Newton-Raphson.

Διαβάστε περισσότερα

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

6.1. Dirac Equation. Hamiltonian. Dirac Eq. 6.1. Dirac Equation Ref: M.Kaku, Quantum Field Theory, Oxford Univ Press (1993) η μν = η μν = diag(1, -1, -1, -1) p 0 = p 0 p = p i = -p i p μ p μ = p 0 p 0 + p i p i = E c 2 - p 2 = (m c) 2 H = c p 2

Διαβάστε περισσότερα

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0. DESIGN OF MACHINERY SOLUTION MANUAL -7-1! PROBLEM -7 Statement: Design a double-dwell cam to move a follower from to 25 6, dwell for 12, fall 25 and dwell for the remader The total cycle must take 4 sec

Διαβάστε περισσότερα

Approximation of distance between locations on earth given by latitude and longitude

Approximation of distance between locations on earth given by latitude and longitude Approximation of distance between locations on earth given by latitude and longitude Jan Behrens 2012-12-31 In this paper we shall provide a method to approximate distances between two points on earth

Διαβάστε περισσότερα

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006 Οδηγίες: Να απαντηθούν όλες οι ερωτήσεις. Ολοι οι αριθμοί που αναφέρονται σε όλα τα ερωτήματα είναι μικρότεροι το 1000 εκτός αν ορίζεται διαφορετικά στη διατύπωση του προβλήματος. Διάρκεια: 3,5 ώρες Καλή

Διαβάστε περισσότερα

Inverse trigonometric functions & General Solution of Trigonometric Equations. ------------------ ----------------------------- -----------------

Inverse trigonometric functions & General Solution of Trigonometric Equations. ------------------ ----------------------------- ----------------- Inverse trigonometric functions & General Solution of Trigonometric Equations. 1. Sin ( ) = a) b) c) d) Ans b. Solution : Method 1. Ans a: 17 > 1 a) is rejected. w.k.t Sin ( sin ) = d is rejected. If sin

Διαβάστε περισσότερα

Introduction to the ML Estimation of ARMA processes

Introduction to the ML Estimation of ARMA processes Introduction to the ML Estimation of ARMA processes Eduardo Rossi University of Pavia October 2013 Rossi ARMA Estimation Financial Econometrics - 2013 1 / 1 We consider the AR(p) model: Y t = c + φ 1 Y

Διαβάστε περισσότερα

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required) Phys460.nb 81 ψ n (t) is still the (same) eigenstate of H But for tdependent H. The answer is NO. 5.5.5. Solution for the tdependent Schrodinger s equation If we assume that at time t 0, the electron starts

Διαβάστε περισσότερα

Durbin-Levinson recursive method

Durbin-Levinson recursive method Durbin-Levinson recursive method A recursive method for computing ϕ n is useful because it avoids inverting large matrices; when new data are acquired, one can update predictions, instead of starting again

Διαβάστε περισσότερα

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is Pg. 9. The perimeter is P = The area of a triangle is A = bh where b is the base, h is the height 0 h= btan 60 = b = b In our case b =, then the area is A = = 0. By Pythagorean theorem a + a = d a a =

Διαβάστε περισσότερα

Partial Differential Equations in Biology The boundary element method. March 26, 2013

Partial Differential Equations in Biology The boundary element method. March 26, 2013 The boundary element method March 26, 203 Introduction and notation The problem: u = f in D R d u = ϕ in Γ D u n = g on Γ N, where D = Γ D Γ N, Γ D Γ N = (possibly, Γ D = [Neumann problem] or Γ N = [Dirichlet

Διαβάστε περισσότερα

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1. Exercises 0 More exercises are available in Elementary Differential Equations. If you have a problem to solve any of them, feel free to come to office hour. Problem Find a fundamental matrix of the given

Διαβάστε περισσότερα

Lecture 2. Soundness and completeness of propositional logic

Lecture 2. Soundness and completeness of propositional logic Lecture 2 Soundness and completeness of propositional logic February 9, 2004 1 Overview Review of natural deduction. Soundness and completeness. Semantics of propositional formulas. Soundness proof. Completeness

Διαβάστε περισσότερα

b. Use the parametrization from (a) to compute the area of S a as S a ds. Be sure to substitute for ds!

b. Use the parametrization from (a) to compute the area of S a as S a ds. Be sure to substitute for ds! MTH U341 urface Integrals, tokes theorem, the divergence theorem To be turned in Wed., Dec. 1. 1. Let be the sphere of radius a, x 2 + y 2 + z 2 a 2. a. Use spherical coordinates (with ρ a) to parametrize.

Διαβάστε περισσότερα

= λ 1 1 e. = λ 1 =12. has the properties e 1. e 3,V(Y

= λ 1 1 e. = λ 1 =12. has the properties e 1. e 3,V(Y Stat 50 Homework Solutions Spring 005. (a λ λ λ 44 (b trace( λ + λ + λ 0 (c V (e x e e λ e e λ e (λ e by definition, the eigenvector e has the properties e λ e and e e. (d λ e e + λ e e + λ e e 8 6 4 4

Διαβάστε περισσότερα

Problem Set 9 Solutions. θ + 1. θ 2 + cotθ ( ) sinθ e iφ is an eigenfunction of the ˆ L 2 operator. / θ 2. φ 2. sin 2 θ φ 2. ( ) = e iφ. = e iφ cosθ.

Problem Set 9 Solutions. θ + 1. θ 2 + cotθ ( ) sinθ e iφ is an eigenfunction of the ˆ L 2 operator. / θ 2. φ 2. sin 2 θ φ 2. ( ) = e iφ. = e iφ cosθ. Chemistry 362 Dr Jean M Standard Problem Set 9 Solutions The ˆ L 2 operator is defined as Verify that the angular wavefunction Y θ,φ) Also verify that the eigenvalue is given by 2! 2 & L ˆ 2! 2 2 θ 2 +

Διαβάστε περισσότερα

Exercises to Statistics of Material Fatigue No. 5

Exercises to Statistics of Material Fatigue No. 5 Prof. Dr. Christine Müller Dipl.-Math. Christoph Kustosz Eercises to Statistics of Material Fatigue No. 5 E. 9 (5 a Show, that a Fisher information matri for a two dimensional parameter θ (θ,θ 2 R 2, can

Διαβάστε περισσότερα

Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1

Main source: Discrete-time systems and computer control by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1 Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1 A Brief History of Sampling Research 1915 - Edmund Taylor Whittaker (1873-1956) devised a

Διαβάστε περισσότερα

Theorem 8 Let φ be the most powerful size α test of H

Theorem 8 Let φ be the most powerful size α test of H Testing composite hypotheses Θ = Θ 0 Θ c 0 H 0 : θ Θ 0 H 1 : θ Θ c 0 Definition 16 A test φ is a uniformly most powerful (UMP) level α test for H 0 vs. H 1 if φ has level α and for any other level α test

Διαβάστε περισσότερα

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Chapter 6: Systems of Linear Differential. be continuous functions on the interval Chapter 6: Systems of Linear Differential Equations Let a (t), a 2 (t),..., a nn (t), b (t), b 2 (t),..., b n (t) be continuous functions on the interval I. The system of n first-order differential equations

Διαβάστε περισσότερα

Lecture 21: Properties and robustness of LSE

Lecture 21: Properties and robustness of LSE Lecture 21: Properties and robustness of LSE BLUE: Robustness of LSE against normality We now study properties of l τ β and σ 2 under assumption A2, i.e., without the normality assumption on ε. From Theorem

Διαβάστε περισσότερα

w o = R 1 p. (1) R = p =. = 1

w o = R 1 p. (1) R = p =. = 1 Πανεπιστήµιο Κρήτης - Τµήµα Επιστήµης Υπολογιστών ΗΥ-570: Στατιστική Επεξεργασία Σήµατος 205 ιδάσκων : Α. Μουχτάρης Τριτη Σειρά Ασκήσεων Λύσεις Ασκηση 3. 5.2 (a) From the Wiener-Hopf equation we have:

Διαβάστε περισσότερα

Lecture 12: Pseudo likelihood approach

Lecture 12: Pseudo likelihood approach Lecture 12: Pseudo likelihood approach Pseudo MLE Let X 1,...,X n be a random sample from a pdf in a family indexed by two parameters θ and π with likelihood l(θ,π). The method of pseudo MLE may be viewed

Διαβάστε περισσότερα

PARTIAL NOTES for 6.1 Trigonometric Identities

PARTIAL NOTES for 6.1 Trigonometric Identities PARTIAL NOTES for 6.1 Trigonometric Identities tanθ = sinθ cosθ cotθ = cosθ sinθ BASIC IDENTITIES cscθ = 1 sinθ secθ = 1 cosθ cotθ = 1 tanθ PYTHAGOREAN IDENTITIES sin θ + cos θ =1 tan θ +1= sec θ 1 + cot

Διαβάστε περισσότερα

Concrete Mathematics Exercises from 30 September 2016

Concrete Mathematics Exercises from 30 September 2016 Concrete Mathematics Exercises from 30 September 2016 Silvio Capobianco Exercise 1.7 Let H(n) = J(n + 1) J(n). Equation (1.8) tells us that H(2n) = 2, and H(2n+1) = J(2n+2) J(2n+1) = (2J(n+1) 1) (2J(n)+1)

Διαβάστε περισσότερα

5.4 The Poisson Distribution.

5.4 The Poisson Distribution. The worst thing you can do about a situation is nothing. Sr. O Shea Jackson 5.4 The Poisson Distribution. Description of the Poisson Distribution Discrete probability distribution. The random variable

Διαβάστε περισσότερα

5. Choice under Uncertainty

5. Choice under Uncertainty 5. Choice under Uncertainty Daisuke Oyama Microeconomics I May 23, 2018 Formulations von Neumann-Morgenstern (1944/1947) X: Set of prizes Π: Set of probability distributions on X : Preference relation

Διαβάστε περισσότερα

Lecture 10 - Representation Theory III: Theory of Weights

Lecture 10 - Representation Theory III: Theory of Weights Lecture 10 - Representation Theory III: Theory of Weights February 18, 2012 1 Terminology One assumes a base = {α i } i has been chosen. Then a weight Λ with non-negative integral Dynkin coefficients Λ

Διαβάστε περισσότερα

Appendix S1 1. ( z) α βc. dβ β δ β

Appendix S1 1. ( z) α βc. dβ β δ β Appendix S1 1 Proof of Lemma 1. Taking first and second partial derivatives of the expected profit function, as expressed in Eq. (7), with respect to l: Π Π ( z, λ, l) l θ + s ( s + h ) g ( t) dt λ Ω(

Διαβάστε περισσότερα

TMA4115 Matematikk 3

TMA4115 Matematikk 3 TMA4115 Matematikk 3 Andrew Stacey Norges Teknisk-Naturvitenskapelige Universitet Trondheim Spring 2010 Lecture 12: Mathematics Marvellous Matrices Andrew Stacey Norges Teknisk-Naturvitenskapelige Universitet

Διαβάστε περισσότερα

HW 3 Solutions 1. a) I use the auto.arima R function to search over models using AIC and decide on an ARMA(3,1)

HW 3 Solutions 1. a) I use the auto.arima R function to search over models using AIC and decide on an ARMA(3,1) HW 3 Solutions a) I use the autoarima R function to search over models using AIC and decide on an ARMA3,) b) I compare the ARMA3,) to ARMA,0) ARMA3,) does better in all three criteria c) The plot of the

Διαβάστε περισσότερα

Practice Exam 2. Conceptual Questions. 1. State a Basic identity and then verify it. (a) Identity: Solution: One identity is csc(θ) = 1

Practice Exam 2. Conceptual Questions. 1. State a Basic identity and then verify it. (a) Identity: Solution: One identity is csc(θ) = 1 Conceptual Questions. State a Basic identity and then verify it. a) Identity: Solution: One identity is cscθ) = sinθ) Practice Exam b) Verification: Solution: Given the point of intersection x, y) of the

Διαβάστε περισσότερα

A Bonus-Malus System as a Markov Set-Chain. Małgorzata Niemiec Warsaw School of Economics Institute of Econometrics

A Bonus-Malus System as a Markov Set-Chain. Małgorzata Niemiec Warsaw School of Economics Institute of Econometrics A Bonus-Malus System as a Markov Set-Chain Małgorzata Niemiec Warsaw School of Economics Institute of Econometrics Contents 1. Markov set-chain 2. Model of bonus-malus system 3. Example 4. Conclusions

Διαβάστε περισσότερα

Queensland University of Technology Transport Data Analysis and Modeling Methodologies

Queensland University of Technology Transport Data Analysis and Modeling Methodologies Queensland University of Technology Transport Data Analysis and Modeling Methodologies Lab Session #7 Example 5.2 (with 3SLS Extensions) Seemingly Unrelated Regression Estimation and 3SLS A survey of 206

Διαβάστε περισσότερα

CHAPTER 48 APPLICATIONS OF MATRICES AND DETERMINANTS

CHAPTER 48 APPLICATIONS OF MATRICES AND DETERMINANTS CHAPTER 48 APPLICATIONS OF MATRICES AND DETERMINANTS EXERCISE 01 Page 545 1. Use matrices to solve: 3x + 4y x + 5y + 7 3x + 4y x + 5y 7 Hence, 3 4 x 0 5 y 7 The inverse of 3 4 5 is: 1 5 4 1 5 4 15 8 3

Διαβάστε περισσότερα

The Probabilistic Method - Probabilistic Techniques. Lecture 7: The Janson Inequality

The Probabilistic Method - Probabilistic Techniques. Lecture 7: The Janson Inequality The Probabilistic Method - Probabilistic Techniques Lecture 7: The Janson Inequality Sotiris Nikoletseas Associate Professor Computer Engineering and Informatics Department 2014-2015 Sotiris Nikoletseas,

Διαβάστε περισσότερα

ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016

ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 ECE598: Information-theoretic methods in high-dimensional statistics Spring 06 Lecture 7: Information bound Lecturer: Yihong Wu Scribe: Shiyu Liang, Feb 6, 06 [Ed. Mar 9] Recall the Chi-squared divergence

Διαβάστε περισσότερα

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in : tail in X, head in A nowhere-zero Γ-flow is a Γ-circulation such that

Διαβάστε περισσότερα

Example of the Baum-Welch Algorithm

Example of the Baum-Welch Algorithm Example of the Baum-Welch Algorithm Larry Moss Q520, Spring 2008 1 Our corpus c We start with a very simple corpus. We take the set Y of unanalyzed words to be {ABBA, BAB}, and c to be given by c(abba)

Διαβάστε περισσότερα