ENGR 691/692 Section 66 (Fall 06): Machine Learning Assigned: August 30 Homework 1: Bayesian Decision Theory (solutions) Due: September 13


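Problem 1: (22 pts) Let the conditional densities for a two-category one-dimensional problem be given by the following Cauchy distribution:

$$p(x \mid \omega_i) = \frac{1}{\pi b} \cdot \frac{1}{1 + \left(\frac{x - a_i}{b}\right)^2}, \quad i = 1, 2.$$

1. (6 pts) By explicit integration, check that the distributions are indeed normalized.
2. (9 pts) Assuming $P(\omega_1) = P(\omega_2)$, show that $P(\omega_1 \mid x) = P(\omega_2 \mid x)$ if $x = \frac{a_1 + a_2}{2}$, that is, the minimum error decision boundary is a point midway between the peaks of the two distributions, regardless of $b$.
3. (7 pts) Show that the minimum probability of error is given by

$$P(\text{error}) = \frac{1}{2} - \frac{1}{\pi} \tan^{-1} \left| \frac{a_2 - a_1}{2b} \right|.$$

1. We have

$$\int_{-\infty}^{\infty} p(x \mid \omega_i)\,dx = \frac{1}{\pi b} \int_{-\infty}^{\infty} \frac{1}{1 + \left(\frac{x - a_i}{b}\right)^2}\,dx.$$

We substitute $y = \frac{x - a_i}{b}$ into the above and get

$$\int_{-\infty}^{\infty} p(x \mid \omega_i)\,dx = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{1}{1 + y^2}\,dy = \frac{1}{\pi} \tan^{-1}(y) \Big|_{-\infty}^{\infty} = \frac{1}{\pi} \left( \frac{\pi}{2} + \frac{\pi}{2} \right) = 1.$$

2. By setting $p(x \mid \omega_1) P(\omega_1) = p(x \mid \omega_2) P(\omega_2)$, we have

$$\frac{1}{\pi b} \cdot \frac{1}{1 + \left(\frac{x - a_1}{b}\right)^2} = \frac{1}{\pi b} \cdot \frac{1}{1 + \left(\frac{x - a_2}{b}\right)^2},$$

or, equivalently, $x - a_1 = \pm (x - a_2)$. For $a_1 \neq a_2$, this implies that $x = \frac{a_1 + a_2}{2}$.

3. Without loss of generality, we assume $a_2 > a_1$. The probability of error is defined as

$$P(\text{error}) = \int_{-\infty}^{\infty} P(\text{error}, x)\,dx = \int_{-\infty}^{\infty} P(\text{error} \mid x)\,p(x)\,dx.$$

Note that the decision boundary is at $\frac{a_1 + a_2}{2}$, hence

$$P(\text{error} \mid x) = \begin{cases} P(\omega_2 \mid x) & \text{if } x \le \frac{a_1 + a_2}{2} \\ P(\omega_1 \mid x) & \text{if } x > \frac{a_1 + a_2}{2} \end{cases} = \begin{cases} \frac{p(x \mid \omega_2) P(\omega_2)}{p(x)} & \text{if } x \le \frac{a_1 + a_2}{2} \\ \frac{p(x \mid \omega_1) P(\omega_1)}{p(x)} & \text{if } x > \frac{a_1 + a_2}{2}. \end{cases}$$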

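Therefore, with $P(\omega_1) = P(\omega_2) = \frac{1}{2}$, the probability of error is

$$P(\text{error}) = \int_{-\infty}^{\frac{a_1 + a_2}{2}} p(x \mid \omega_2) P(\omega_2)\,dx + \int_{\frac{a_1 + a_2}{2}}^{\infty} p(x \mid \omega_1) P(\omega_1)\,dx = \frac{1}{2 \pi b} \int_{-\infty}^{\frac{a_1 + a_2}{2}} \frac{1}{1 + \left(\frac{x - a_2}{b}\right)^2}\,dx + \frac{1}{2 \pi b} \int_{\frac{a_1 + a_2}{2}}^{\infty} \frac{1}{1 + \left(\frac{x - a_1}{b}\right)^2}\,dx.$$

We substitute $y = \frac{x - a_2}{b}$ and $z = \frac{x - a_1}{b}$ into the above and get

$$P(\text{error}) = \frac{1}{2\pi} \left[ \int_{-\infty}^{\frac{a_1 - a_2}{2b}} \frac{1}{1 + y^2}\,dy + \int_{\frac{a_2 - a_1}{2b}}^{\infty} \frac{1}{1 + z^2}\,dz \right] = \frac{1}{2\pi} \left[ \tan^{-1}(y) \Big|_{-\infty}^{\frac{a_1 - a_2}{2b}} + \tan^{-1}(z) \Big|_{\frac{a_2 - a_1}{2b}}^{\infty} \right]$$

$$= \frac{1}{2\pi} \left( \tan^{-1} \frac{a_1 - a_2}{2b} + \frac{\pi}{2} + \frac{\pi}{2} - \tan^{-1} \frac{a_2 - a_1}{2b} \right) = \frac{1}{2} - \frac{1}{\pi} \tan^{-1} \frac{a_2 - a_1}{2b}.$$

Similarly, if $a_1 > a_2$, we have $P(\text{error}) = \frac{1}{2} - \frac{1}{\pi} \tan^{-1} \frac{a_1 - a_2}{2b}$. Therefore, we have shown that

$$P(\text{error}) = \frac{1}{2} - \frac{1}{\pi} \tan^{-1} \left| \frac{a_2 - a_1}{2b} \right|.$$

Problem 2: (21 pts) Let $\omega_{\max}(x)$ be the state of nature for which $P(\omega_{\max} \mid x) \ge P(\omega_i \mid x)$ for all $i$, $i = 1, \ldots, c$.

1. (7 pts) Show that $P(\omega_{\max} \mid x) \ge \frac{1}{c}$.
2. (7 pts) Show that for the minimum-error-rate decision rule the average probability of error is given by $P(\text{error}) = 1 - \int P(\omega_{\max} \mid x)\,p(x)\,dx$.
3. (7 pts) Show that $P(\text{error}) \le \frac{c - 1}{c}$.

1. Since $P(\omega_{\max} \mid x) \ge P(\omega_i \mid x)$, we have

$$\sum_{i=1}^{c} P(\omega_{\max} \mid x) \ge \sum_{i=1}^{c} P(\omega_i \mid x) = 1.$$

Hence $c\,P(\omega_{\max} \mid x) \ge 1$, which implies that $P(\omega_{\max} \mid x) \ge \frac{1}{c}$.

2. By definition,

$$P(\text{error}) = \int P(\text{error} \mid x)\,p(x)\,dx = \int \left[ 1 - P(\omega_{\max} \mid x) \right] p(x)\,dx = 1 - \int P(\omega_{\max} \mid x)\,p(x)\,dx.$$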

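3. From parts 1 and 2, it is clear that

$$P(\text{error}) = 1 - \int P(\omega_{\max} \mid x)\,p(x)\,dx \le 1 - \frac{1}{c} \int p(x)\,dx = 1 - \frac{1}{c} = \frac{c - 1}{c}.$$

Problem 3: (22 pts) In many machine learning applications, one has the option either to assign the pattern to one of $c$ classes, or to reject it as being unrecognizable. If the cost for rejects is not too high, rejection may be a desirable action. Let

$$\lambda(\alpha_i \mid \omega_j) = \begin{cases} 0 & i = j, \quad i, j = 1, \ldots, c \\ \lambda_r & i = c + 1 \\ \lambda_s & \text{otherwise,} \end{cases}$$

where $\lambda_r$ is the loss incurred for choosing the $(c+1)$th action, rejection, and $\lambda_s$ is the loss incurred for making any substitution error.

1. (10 pts) Please derive the decision rule with the minimum risk.
2. (6 pts) What happens if $\lambda_r = 0$?
3. (6 pts) What happens if $\lambda_r > \lambda_s$?

1. For $i = 1, \ldots, c$,

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j) P(\omega_j \mid x) = \lambda_s \sum_{j=1, j \ne i}^{c} P(\omega_j \mid x) = \lambda_s \left[ 1 - P(\omega_i \mid x) \right].$$

For $i = c + 1$, $R(\alpha_{c+1} \mid x) = \lambda_r$. Among the first $c$ actions, the risk is smallest for the class with the largest posterior. Therefore, the minimum risk is achieved if we decide $\omega_i$ when $P(\omega_i \mid x) \ge P(\omega_j \mid x)$ for all $j$ and $R(\alpha_i \mid x) \le R(\alpha_{c+1} \mid x)$, i.e., $P(\omega_i \mid x) \ge 1 - \frac{\lambda_r}{\lambda_s}$, and reject otherwise.

2. If $\lambda_r = 0$, the threshold $1 - \frac{\lambda_r}{\lambda_s}$ equals 1, so we always reject.

3. If $\lambda_r > \lambda_s$, the threshold is negative, so we will never reject.

Problem 4: Let the components of the vector $\mathbf{x} = [x_1, \ldots, x_d]^T$ be binary-valued (0 or 1), and let $P(\omega_j)$ be the prior probability for the state of nature $\omega_j$, $j = 1, \ldots, c$. We define

$$p_{ij} = P(x_i = 1 \mid \omega_j), \quad i = 1, \ldots, d, \quad j = 1, \ldots, c,$$

with the components $x_i$ being statistically independent for all $\mathbf{x}$ in $\omega_j$.

1. Show that the minimum probability of error is achieved by the following decision rule: Decide $\omega_k$ if $g_k(\mathbf{x}) \ge g_j(\mathbf{x})$ for all $j$, where

$$g_j(\mathbf{x}) = \sum_{i=1}^{d} x_i \ln \frac{p_{ij}}{1 - p_{ij}} + \sum_{i=1}^{d} \ln(1 - p_{ij}) + \ln P(\omega_j).$$

2. (Extra points) If the components of $\mathbf{x}$ are ternary valued (1, 0, or $-1$), show that a minimum probability of error decision rule can be derived that involves discriminant functions $g_j(\mathbf{x})$ that are quadratic functions of the components $x_i$.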
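The discriminant stated in part 1 of Problem 4 can be checked numerically. The following is a minimal sketch, not part of the original solutions: it compares $g_j(\mathbf{x})$ with $\ln[p(\mathbf{x} \mid \omega_j) P(\omega_j)]$ computed directly from the product of independent binary features. The values in `P` and `PRIORS`, and the function names, are hypothetical and chosen only for illustration.

```python
import math
from itertools import product

def discriminant(x, p_j, prior_j):
    """g_j(x) = sum_i x_i ln(p_ij / (1 - p_ij)) + sum_i ln(1 - p_ij) + ln P(w_j)."""
    return (sum(xi * math.log(p / (1 - p)) for xi, p in zip(x, p_j))
            + sum(math.log(1 - p) for p in p_j)
            + math.log(prior_j))

def log_joint(x, p_j, prior_j):
    """ln[p(x | w_j) P(w_j)] computed directly from the product of feature probabilities."""
    likelihood = 1.0
    for xi, p in zip(x, p_j):
        likelihood *= p if xi == 1 else (1 - p)
    return math.log(likelihood * prior_j)

# Hypothetical parameters (assumption): d = 3 features, c = 2 classes.
P = [[0.8, 0.3, 0.6], [0.2, 0.7, 0.5]]  # P[j][i] plays the role of p_ij
PRIORS = [0.4, 0.6]

# Largest disagreement between the two forms over all 2^3 binary vectors.
max_gap = max(
    abs(discriminant(x, P[j], PRIORS[j]) - log_joint(x, P[j], PRIORS[j]))
    for x in product([0, 1], repeat=3)
    for j in range(2)
)
print(max_gap)
```

Because the two forms agree up to floating-point error, choosing the class with the largest $g_j(\mathbf{x})$ is the same as choosing the class with the largest posterior, which is the minimum-error rule.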

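1. Consider the following discriminant function:

$$g_j(\mathbf{x}) = \ln \left[ p(\mathbf{x} \mid \omega_j) P(\omega_j) \right] = \ln p(\mathbf{x} \mid \omega_j) + \ln P(\omega_j).$$

The components of $\mathbf{x}$ are statistically independent for all $\mathbf{x}$ in $\omega_j$, so we can write the density as a product:

$$p(\mathbf{x} \mid \omega_j) = \prod_{i=1}^{d} p(x_i \mid \omega_j) = \prod_{i=1}^{d} p_{ij}^{x_i} (1 - p_{ij})^{1 - x_i}.$$

Thus we have the discriminant function

$$g_j(\mathbf{x}) = \sum_{i=1}^{d} \left[ x_i \ln p_{ij} + (1 - x_i) \ln(1 - p_{ij}) \right] + \ln P(\omega_j) = \sum_{i=1}^{d} x_i \ln \frac{p_{ij}}{1 - p_{ij}} + \sum_{i=1}^{d} \ln(1 - p_{ij}) + \ln P(\omega_j).$$

2. Consider the following discriminant function:

$$g_j(\mathbf{x}) = \ln \left[ p(\mathbf{x} \mid \omega_j) P(\omega_j) \right] = \ln p(\mathbf{x} \mid \omega_j) + \ln P(\omega_j).$$

The components of $\mathbf{x}$ are statistically independent for all $\mathbf{x}$ in $\omega_j$, therefore

$$p(\mathbf{x} \mid \omega_j) = \prod_{i=1}^{d} p(x_i \mid \omega_j).$$

Let

$$p_{ij} = P(x_i = 1 \mid \omega_j), \quad q_{ij} = P(x_i = 0 \mid \omega_j), \quad r_{ij} = P(x_i = -1 \mid \omega_j).$$

It is not hard to check that

$$p(x_i \mid \omega_j) = p_{ij}^{\frac{1}{2} x_i + \frac{1}{2} x_i^2} \, q_{ij}^{1 - x_i^2} \, r_{ij}^{-\frac{1}{2} x_i + \frac{1}{2} x_i^2}.$$

Thus the discriminant functions can be written as

$$g_j(\mathbf{x}) = \sum_{i=1}^{d} \left[ \left( \frac{1}{2} x_i + \frac{1}{2} x_i^2 \right) \ln p_{ij} + (1 - x_i^2) \ln q_{ij} + \left( -\frac{1}{2} x_i + \frac{1}{2} x_i^2 \right) \ln r_{ij} \right] + \ln P(\omega_j)$$

$$= \sum_{i=1}^{d} x_i^2 \ln \frac{\sqrt{p_{ij} r_{ij}}}{q_{ij}} + \frac{1}{2} \sum_{i=1}^{d} x_i \ln \frac{p_{ij}}{r_{ij}} + \sum_{i=1}^{d} \ln q_{ij} + \ln P(\omega_j),$$

which are quadratic functions of the components $x_i$.

Question 5: (23 pts) Suppose we have three categories with prior probabilities $P(\omega_1) = 0.5$, $P(\omega_2) = P(\omega_3) = 0.25$ and the class conditional probability distributions

$$p(x \mid \omega_1) \sim N(0, 1), \quad p(x \mid \omega_2) \sim N(0.5, 1), \quad p(x \mid \omega_3) \sim N(1, 1),$$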

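where $N(\mu, \sigma^2)$ represents the normal distribution with density function

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}.$$

We sample the following sequence of four points: $x = 0.6, 0.1, 0.9, 1.1$.

1. (9 pts) Calculate explicitly the probability that the sequence actually came from $\omega_1, \omega_3, \omega_3, \omega_2$.
2. (6 pts) Repeat for the sequence $\omega_1, \omega_2, \omega_2, \omega_3$.
3. (8 pts) Find the sequence of states having the maximum probability.

1. It is straightforward to compute that

p(0.6 | ω1) = 0.333225    p(0.6 | ω2) = 0.396953    p(0.6 | ω3) = 0.368270
p(0.1 | ω1) = 0.396953    p(0.1 | ω2) = 0.368270    p(0.1 | ω3) = 0.266085
p(0.9 | ω1) = 0.266085    p(0.9 | ω2) = 0.368270    p(0.9 | ω3) = 0.396953
p(1.1 | ω1) = 0.217852    p(1.1 | ω2) = 0.333225    p(1.1 | ω3) = 0.396953

We denote $X = (x_1, x_2, x_3, x_4)$ and $\boldsymbol{\omega} = (\omega(1), \omega(2), \omega(3), \omega(4))$. Clearly, there are $3^4 = 81$ possible values of $\boldsymbol{\omega}$, such as

$$(\omega_1, \omega_1, \omega_1, \omega_1), \ (\omega_1, \omega_1, \omega_1, \omega_2), \ (\omega_1, \omega_1, \omega_1, \omega_3), \ \ldots, \ (\omega_3, \omega_3, \omega_3, \omega_1), \ (\omega_3, \omega_3, \omega_3, \omega_2), \ (\omega_3, \omega_3, \omega_3, \omega_3).$$

For each possible value of $\boldsymbol{\omega}$, we calculate $P(\boldsymbol{\omega})$ and $p(X \mid \boldsymbol{\omega})$ using the following, which assume the independence of $x_i$ and $\omega(i)$:

$$p(X \mid \boldsymbol{\omega}) = \prod_{i=1}^{4} p(x_i \mid \omega(i)), \qquad P(\boldsymbol{\omega}) = \prod_{i=1}^{4} P(\omega(i)).$$

For example, if $\boldsymbol{\omega} = (\omega_1, \omega_3, \omega_3, \omega_2)$ and $X = (0.6, 0.1, 0.9, 1.1)$, then we have

$$p(X \mid \boldsymbol{\omega}) = p((0.6, 0.1, 0.9, 1.1) \mid (\omega_1, \omega_3, \omega_3, \omega_2)) = p(0.6 \mid \omega_1)\,p(0.1 \mid \omega_3)\,p(0.9 \mid \omega_3)\,p(1.1 \mid \omega_2) = 0.333225 \times 0.266085 \times 0.396953 \times 0.333225 = 0.01173$$

and

$$P(\boldsymbol{\omega}) = P(\omega_1) P(\omega_3) P(\omega_3) P(\omega_2) = \frac{1}{2} \times \frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} = 0.0078125.$$

Given $X = (0.6, 0.1, 0.9, 1.1)$ and $\boldsymbol{\omega} = (\omega_1, \omega_3, \omega_3, \omega_2)$, we have

$$p(X) = p(x_1 = 0.6, x_2 = 0.1, x_3 = 0.9, x_4 = 1.1) = \sum_{\boldsymbol{\omega}} p(x_1 = 0.6, x_2 = 0.1, x_3 = 0.9, x_4 = 1.1 \mid \boldsymbol{\omega})\,P(\boldsymbol{\omega}).$$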

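Expanding the sum over all 81 sequences gives

$$p(X) = p(x_1 = 0.6, x_2 = 0.1, x_3 = 0.9, x_4 = 1.1 \mid \omega_1, \omega_1, \omega_1, \omega_1)\,P(\omega_1, \omega_1, \omega_1, \omega_1) + p(x_1 = 0.6, x_2 = 0.1, x_3 = 0.9, x_4 = 1.1 \mid \omega_1, \omega_1, \omega_1, \omega_2)\,P(\omega_1, \omega_1, \omega_1, \omega_2) + \cdots + p(x_1 = 0.6, x_2 = 0.1, x_3 = 0.9, x_4 = 1.1 \mid \omega_3, \omega_3, \omega_3, \omega_3)\,P(\omega_3, \omega_3, \omega_3, \omega_3)$$

$$= p(0.6 \mid \omega_1)\,p(0.1 \mid \omega_1)\,p(0.9 \mid \omega_1)\,p(1.1 \mid \omega_1)\,P(\omega_1) P(\omega_1) P(\omega_1) P(\omega_1) + p(0.6 \mid \omega_1)\,p(0.1 \mid \omega_1)\,p(0.9 \mid \omega_1)\,p(1.1 \mid \omega_2)\,P(\omega_1) P(\omega_1) P(\omega_1) P(\omega_2) + \cdots + p(0.6 \mid \omega_3)\,p(0.1 \mid \omega_3)\,p(0.9 \mid \omega_3)\,p(1.1 \mid \omega_3)\,P(\omega_3) P(\omega_3) P(\omega_3) P(\omega_3) = 0.012083.$$

Therefore,

$$P(\boldsymbol{\omega} \mid X) = P(\omega_1, \omega_3, \omega_3, \omega_2 \mid 0.6, 0.1, 0.9, 1.1) = \frac{p(0.6, 0.1, 0.9, 1.1 \mid \omega_1, \omega_3, \omega_3, \omega_2)\,P(\omega_1, \omega_3, \omega_3, \omega_2)}{p(X)} = \frac{0.01173 \times 0.0078125}{0.012083} = 0.007584.$$

2. Following the steps in part 1, we have

$$P(\omega_1, \omega_2, \omega_2, \omega_3 \mid 0.6, 0.1, 0.9, 1.1) = \frac{p(0.6, 0.1, 0.9, 1.1 \mid \omega_1, \omega_2, \omega_2, \omega_3)\,P(\omega_1, \omega_2, \omega_2, \omega_3)}{p(X)} = \frac{0.01794 \times 0.0078125}{0.012083} = 0.01160.$$

3. The sequence $\boldsymbol{\omega} = (\omega_1, \omega_1, \omega_1, \omega_1)$ has the maximum probability given the observation $X = (0.6, 0.1, 0.9, 1.1)$. This maximum probability is 0.03966.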
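The three parts of Question 5 can be reproduced by brute-force enumeration of all 81 state sequences. The following is a minimal sketch under the problem's stated assumptions (independent samples, unit variance, priors 0.5, 0.25, 0.25). The names `normal_pdf`, `joint`, and `posterior` are illustrative and not from the original solutions, and classes are indexed from 0, so `(0, 2, 2, 1)` stands for $(\omega_1, \omega_3, \omega_3, \omega_2)$.

```python
import math
from itertools import product

PRIORS = [0.5, 0.25, 0.25]   # P(w1), P(w2), P(w3)
MEANS = [0.0, 0.5, 1.0]      # class means; every class has unit variance
X = [0.6, 0.1, 0.9, 1.1]     # observed sequence

def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def joint(seq):
    """p(X | omega) * P(omega) for a tuple of 0-based class indices."""
    value = 1.0
    for x, k in zip(X, seq):
        value *= normal_pdf(x, MEANS[k]) * PRIORS[k]
    return value

# Enumerate all 3^4 sequences, sum to get p(X), then normalize.
sequences = list(product(range(3), repeat=4))
evidence = sum(joint(s) for s in sequences)
posterior = {s: joint(s) / evidence for s in sequences}
best = max(posterior, key=posterior.get)

print(evidence)                 # p(X)
print(posterior[(0, 2, 2, 1)])  # part 1: (w1, w3, w3, w2)
print(posterior[(0, 1, 1, 2)])  # part 2: (w1, w2, w2, w3)
print(best, posterior[best])    # part 3: most probable sequence
```

Up to rounding, the printed values should agree with the figures quoted above. Because the samples are independent, the most probable sequence can also be found one position at a time by maximizing $p(x_i \mid \omega_j) P(\omega_j)$ over $j$, which avoids the full enumeration.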