Pattern Recognition: Non-Parametric Methods


Transcript:

Γιώργος Γιαννακάκης. Non-Parametric. University of Crete, Computer Science Department

Probability Density Function. If the random variable is denoted by $X$, its probability density function $f$ has the property that

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

Non-Parametric Density Estimation. Problem: the pattern distribution is unknown -> estimate the PDF. When the pattern distribution is unknown, non-parametric techniques are employed. Estimate the density distribution from the data through a generalization of the histogram. [Figure: underlying PDF and data samples]

Histogram. The simplest form of non-parametric density estimation is the histogram. It separates the sampling area into small regions and approximates the density from the number of samples falling within each region.

Non-Parametric Density Estimation. Histogram methods partition the data space into distinct bins with widths $\Delta_i$ and count the number of observations, $n_i$, in each bin. Often, the same width is used for all bins, $\Delta_i = \Delta$. $\Delta$ acts as a smoothing parameter.
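The binning-and-normalizing idea above can be sketched in a few lines (a minimal sketch, assuming 1-D Gaussian data; the variable names and the bin width of 0.5 are hypothetical choices):

```python
import numpy as np

# Histogram density estimation: the density in a bin is
# (count in bin) / (N * Delta), so the estimate integrates to ~1.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)  # hypothetical data

delta = 0.5                                   # bin width (smoothing parameter)
edges = np.arange(-4.0, 4.0 + delta, delta)   # equal-width bin edges
counts, _ = np.histogram(samples, bins=edges)
density = counts / (len(samples) * delta)     # estimated p(x) per bin

# Summing density * delta gives the fraction of samples inside the
# covered range, which should be close to 1 for this data.
print(np.sum(density) * delta)
```

A smaller `delta` gives a more detailed but noisier estimate; a larger `delta` smooths more, which is exactly the tradeoff the slides describe.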

Non-parametric density estimation. The probability $P$ that a vector $x$ drawn from the PDF $p(x)$ will fall in a region $R$ is given by

$$P = P\{x \in R\} = \int_R p(x')\,dx'$$

Suppose we have $N$ samples $x_1, x_2, \dots, x_N$ drawn from the distribution $p(x)$. The probability that exactly $k$ points fall in $R$ is then given by the binomial distribution:

$$P_k = \binom{N}{k} P^k (1-P)^{N-k}$$

As $N \to \infty$ the distribution becomes sharply peaked, so we can take as a good estimate of $P$ the fraction of samples falling in $R$:

$$P \approx \frac{k}{N}$$

Non-parametric density estimation. If we now assume that $p(x)$ is continuous and that the region $R$ is so small that $p$ does not vary appreciably within it, then

$$P = \int_R p(x')\,dx' \approx p(x)\,V,$$

where $V$ is the volume of $R$. Combining this with $P \approx k/N$ gives

$$p(x) \approx \frac{k}{N V}$$

Density estimation becomes more accurate as the sample size $N$ increases.

Non-parametric density estimation. To estimate the density at $x$, we form a sequence of regions $R_1, R_2, \dots, R_n$, where $R_n$ is the region used when $n$ samples are available. Let $V_n$ be the volume of $R_n$, $k_n$ the number of samples falling in $R_n$, and $p_n(x)$ the $n$-th estimate of $p(x)$:

$$p_n(x) = \frac{k_n}{n V_n}$$

$V_n$: volume of region $R_n$; $k_n$: number of samples within region $R_n$; $n$: total number of samples. Density estimation becomes more accurate as the sample size $n$ increases.

Non-parametric density estimation. If the total number of samples $n$ is fixed, then to improve the accuracy of the density estimate we would minimize the volume; but then the region $R_n$ becomes so small that it contains practically no samples. A compromise must therefore be made: $V_n$ must be large enough to contain enough samples, yet small enough to support the hypothesis that $p_n(x)$ remains constant within $R_n$. Three conditions are required in order for $p_n(x) \to p(x)$:

$$1)\ \lim_{n\to\infty} V_n = 0 \qquad 2)\ \lim_{n\to\infty} k_n = \infty \qquad 3)\ \lim_{n\to\infty} k_n/n = 0$$

Example: non-parametric density estimation with $p_n(x) = \frac{k_n}{n V_n}$. [figure]

Non-parametric density estimation, $p_n(x) = \frac{k_n}{n V_n}$. There are two common approaches for obtaining sequences of regions $R_n$ such that $p_n(x) \to p(x)$:
- Fix the volume $V_n$ and count the number of samples falling inside it from the data (Parzen windows).
- Fix the number of samples $k_n$ and compute the corresponding volume $V_n$ from the data (k-nearest neighbours).
Both approaches converge to the actual value of the probability density function as $n \to \infty$, since the volume $V_n$ shrinks and $k_n$ grows with $n$.

In the k-nearest-neighbor approach we fix $k$ and find the volume $V$ that contains the $k$ nearest points. Algorithm (k-nearest neighbors):
- An initial region around $x$ is selected in order to estimate $p(x)$.
- The region is grown until $k$ samples fall within it.
- These $k$ samples are the k nearest neighbors of $x$.
- The density is estimated using the formula

$$p(x) = \frac{k}{N V}$$
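The procedure above can be sketched for 1-D data (a minimal sketch: in one dimension the "volume" of the region is the interval length $2r$, where $r$ is the distance to the k-th nearest neighbor; the Gaussian data and the choice $k \approx \sqrt{n}$ are assumptions for illustration):

```python
import numpy as np

def knn_density(x, samples, k):
    """Estimate p(x) ~ k / (N * V) from 1-D samples."""
    dists = np.sort(np.abs(samples - x))  # distances to all samples
    r = dists[k - 1]                      # distance to the k-th nearest
    volume = 2.0 * r                      # 1-D "volume": interval [x-r, x+r]
    return k / (len(samples) * volume)

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=2000)
# k ~ sqrt(n) rule of thumb; the estimate at 0 should be roughly
# the true N(0,1) density there, 1/sqrt(2*pi) ~ 0.40.
print(knn_density(0.0, samples, k=45))
```

Note how the region adapts to the data: where samples are dense, $r$ (and hence $V$) is small; where they are sparse, the interval grows until it captures $k$ points.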

K-nearest neighbors. If the density is high near $x$, the cell will be relatively small, which leads to good resolution. If the density is low, the cell will grow large, but it will stop soon after it enters regions of higher density.

K-nearest neighbors. Choosing k for density estimation: a good rule of thumb is $k_n = \sqrt{n}$. Convergence can be proven if $k_n$ goes to infinity as $n$ goes to infinity (while $k_n/n \to 0$).

K-nearest neighbors. Choosing k:
1) k should be large so that the error rate is minimized; k too small will lead to noisy decision boundaries.
2) k should be small enough that only nearby samples are included; k too large will lead to over-smoothed boundaries.
Balancing 1 and 2 is not trivial. This is a recurrent issue: we need to smooth the data, but not too much.

K-nearest neighbor classification. The k-nearest-neighbor classification problem. Goal: classify a sample $x$ by assigning it the label most frequently represented among the $k$ nearest samples, using a voting scheme. Idea: to determine the label of an unknown sample $x$, look at $x$'s k nearest neighbors. Compute the distance from the test record to the training records, and choose the $k$ nearest records.

K-nearest neighbor classification.
- Load the data.
- Initialize the value of k.
- For each test point $x_i$:
  - Calculate the distance metric between the test point and each row of training data.
  - Sort the calculated distances and identify the k nearest neighbors.
  - Get the most frequent class among the k nearest neighbors.
  - Classify $x_i$ according to that majority class.
k is chosen odd in order to ensure that the voting yields a single resulting class.
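The steps above can be sketched as follows (a minimal sketch, assuming Euclidean distance; the toy training set and its labels are hypothetical):

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    dists = np.linalg.norm(train_X - x, axis=1)   # distance to each row
    nearest = np.argsort(dists)[:k]               # indices of k nearest
    votes = Counter(train_y[i] for i in nearest)  # count class labels
    return votes.most_common(1)[0][0]             # majority class

train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array(["brown", "brown", "green", "green"])
print(knn_classify(np.array([0.2, 0.1]), train_X, train_y, k=3))  # -> brown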


k-Nearest neighbor method. Majority vote within the k nearest neighbors: the new point is classified brown for K = 1 and green for K = 3. [figure] 4/30/2018

k-Nearest neighbor method. For k = 1, ..., 7 the point x gets classified correctly (red class); for larger k the classification of x is wrong (blue class). [figure]

K-nearest algorithm. [figure]

How many neighbors to consider? Too few neighbors give noisy decision boundaries. [figure]

k-Nearest neighbor method. K acts as a smoother. As $n \to \infty$, the error rate of the 1-nearest-neighbour classifier is never more than twice the optimal error rate (obtained from the true conditional class distributions).

Training error rate is an increasing function of k. As you can see, the error rate at K = 1 is always zero for the training sample: the closest point to any training data point is itself. The validation error rate initially decreases and reaches a minimum; after the minimum point, it increases with increasing K. To get the optimal value of K, you can separate the initial dataset into training and validation sets. The optimal k (at the validation minimum) should be used for all predictions.
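Selecting k by validation error, as described above, can be sketched like this (a minimal sketch on a hypothetical two-blob dataset; the candidate k values and the 150/50 split are arbitrary choices):

```python
import numpy as np
from collections import Counter

def knn_predict(x, X, y, k):
    """Majority vote among the k nearest training samples."""
    nearest = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical data: two well-separated Gaussian blobs.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

idx = rng.permutation(200)
tr, va = idx[:150], idx[150:]           # training / validation split

errors = {}
for k in [1, 3, 5, 7, 9, 15]:
    preds = np.array([knn_predict(X[i], X[tr], y[tr], k) for i in va])
    errors[k] = float(np.mean(preds != y[va]))

best_k = min(errors, key=errors.get)    # k at the validation minimum
print(best_k, errors[best_k])
```

The k minimizing the validation error is then used for all subsequent predictions, exactly as the slide suggests.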

Value of k:
- Larger k increases confidence in the prediction.
- Note that if k is too large, the decision may be skewed.
k-NN variations:
- Weighted evaluation of nearest neighbors: a plain majority may unfairly skew the decision, so revise the algorithm so that closer neighbors have greater vote weight.
- Other distance measures.
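The weighted-vote variant can be sketched as follows (a minimal sketch; the $1/(d+\varepsilon)$ weight is a common but hypothetical choice, since the slides do not fix a weighting function):

```python
import numpy as np
from collections import defaultdict

def weighted_knn(x, train_X, train_y, k=3, eps=1e-9):
    """Classify x with a distance-weighted vote: closer -> heavier."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    scores = defaultdict(float)
    for i in nearest:
        scores[train_y[i]] += 1.0 / (dists[i] + eps)  # inverse-distance weight
    return max(scores, key=scores.get)

train_X = np.array([[0.0, 0.0], [2.0, 0.0], [2.1, 0.0]])
train_y = np.array(["a", "b", "b"])
# Plain majority with k=3 would say "b" (two "b" neighbors);
# the weighted vote picks "a" because the single "a" neighbor is much closer.
print(weighted_knn(np.array([0.1, 0.0]), train_X, train_y, k=3))
```

This addresses exactly the skew mentioned above: distant neighbors can no longer outvote one very close neighbor.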

k-Nearest neighbor method. KNN belongs to the class of lazy algorithms:
- It processes the training data only after a classification request arrives.
- It answers the classification request by combining the stored training data.
- It does not build a model or take account of intermediate results.

Pros and Cons of k-NN.
Pros: simple; good results; easy to add new training examples.
Cons: computationally expensive. To determine the nearest neighbor, every training sample must be visited: $O(nd)$, where $n$ = number of training samples and $d$ = dimensions.