Chinese Journal of Applied Probability and Statistics, Vol. 24, No. 5, Oct. 2008

Unsupervised Classification Based on Penalized Maximum Likelihood of Gaussian Mixture Models

Yu Peng (School of Mathematical Sciences, Peking University, Beijing, 100871; National Geomatics Center of China, Beijing, 100044), Tong Xinwei (School of Mathematical Sciences, Beijing Normal University, Beijing, 100875), Feng Jufu (National Laboratory on Machine Perception, Peking University, Beijing, 100871)

Abstract: We propose an unsupervised classification algorithm based on Gaussian mixture models. Because the EM algorithm can be trapped in local optima when estimating the parameters of a Gaussian mixture, we replace the Jeffreys prior used in earlier work with an inverse Wishart prior. Experiments show that the algorithm raises the correct-classification rate and shortens the estimation time.

Keywords: Gaussian mixture model, unsupervised classification, penalized maximum likelihood, EM algorithm, inverse Wishart distribution.

Subject classification: O212.7

1. Introduction

Finite mixture models date back to Pearson (1894). In 1969 Day [1] studied estimation of the components of a mixture of normal distributions, and in 1977 Dempster, Laird and Rubin [2] introduced the EM algorithm for maximum likelihood estimation from incomplete data; Redner and Walker [3] later gave a systematic account of EM for mixture densities. Compared with MCMC methods, EM is simple to implement and computationally cheap, which is why it remains the standard tool for fitting mixtures. It also has well-known drawbacks: it converges only to a local optimum, it is sensitive to initialization, and the number of components must be fixed in advance [4]. Among the proposed remedies are the split-and-merge SMEM algorithm [5] and penalized or Bayesian estimation of the component parameters [6].

E-mail: xweitong@bnu.edu.cn. Received July 22, 2004.

The rest of the paper is organized as follows: Section 2 reviews the Gaussian mixture model and the EM algorithm; Section 3 introduces the normal-inverse Wishart prior; Section 4 derives the MML criterion for the number of components; Section 5 assembles the complete algorithm on top of CEM²; Section 6 reports simulation experiments; Section 7 concludes.

2. Gaussian Mixture Models and the EM Algorithm

2.1 The Gaussian mixture model

Let $Y = (Y_1, \ldots, Y_d)^{\mathrm{T}}$ be a $d$-dimensional random vector and $y = (y_1, \ldots, y_d)^{\mathrm{T}}$ a realization of $Y$. A $k$-component mixture density has the form

$$p(y \mid \theta) = \sum_{m=1}^{k} \alpha_m \, p(y \mid \theta_m), \qquad (2.1)$$

where $\alpha_1, \ldots, \alpha_k$ are the mixing proportions, $\theta_m$ is the parameter vector of the $m$-th component, and $\theta \equiv \{\theta_1, \ldots, \theta_k, \alpha_1, \ldots, \alpha_k\}$. The mixing proportions satisfy

$$\alpha_m \ge 0, \quad m = 1, \ldots, k, \qquad \sum_{m=1}^{k} \alpha_m = 1. \qquad (2.2)$$

For a Gaussian mixture, each component density is

$$p(y \mid \theta_m) = (2\pi)^{-d/2} |\Sigma_m|^{-1/2} \exp\{-(1/2)(y - \mu_m)^{\mathrm{T}} \Sigma_m^{-1} (y - \mu_m)\},$$

so that in the $d$-dimensional case $\theta_m$ consists of the mean $\mu_m$ and the covariance matrix $\Sigma_m$.

Given $N$ independent samples $\mathcal{Y} = \{y^{(1)}, \ldots, y^{(N)}\}$, the log-likelihood under (2.1), subject to (2.2), is

$$\log p(\mathcal{Y} \mid \theta) = \sum_{n=1}^{N} \log \sum_{m=1}^{k} \alpha_m \, p(y^{(n)} \mid \theta_m). \qquad (2.3)$$

The maximum likelihood estimate is

$$\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \log p(\mathcal{Y} \mid \theta), \qquad (2.4)$$

and, given a prior $p(\theta)$, the maximum a posteriori estimate is

$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \{\log p(\mathcal{Y} \mid \theta) + \log p(\theta)\}. \qquad (2.5)$$

Neither (2.4) nor (2.5) admits a closed-form solution; both are computed with the EM algorithm.
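As a concrete check of (2.1) and (2.3), the mixture density and log-likelihood are direct to compute. The following NumPy sketch is ours, not the paper's; it takes the parameters as plain lists of arrays:

```python
import numpy as np

def gaussian_pdf(y, mu, sigma):
    """Multivariate normal density, the component density of eq. (2.1)."""
    d = len(mu)
    diff = y - mu
    norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(sigma) ** (-0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

def gmm_log_likelihood(Y, alphas, mus, sigmas):
    """log p(Y | theta) = sum_n log sum_m alpha_m p(y^(n) | theta_m), eq. (2.3)."""
    total = 0.0
    for y in Y:
        total += np.log(sum(a * gaussian_pdf(y, mu, s)
                            for a, mu, s in zip(alphas, mus, sigmas)))
    return total
```

For a single standard-normal component in one dimension, the log-likelihood of the point $y = 0$ reduces to $-\frac{1}{2}\log(2\pi)$, which makes a convenient sanity check.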

2.2 The EM algorithm

The EM algorithm alternates an E-step and an M-step: the E-step computes the conditional expectation $Q$ of the complete-data log-likelihood, the M-step maximizes it, and the two steps are iterated until convergence. For the Gaussian mixture the iteration is:

E-step: given the current $\mu_m$, $\Sigma_m$ and $\alpha_m$, compute for each sample $n$ the responsibilities

$$R_{mn} = \alpha_m \, p(y^{(n)} \mid \theta_m) \Big/ \sum_{m'=1}^{k} \alpha_{m'} \, p(y^{(n)} \mid \theta_{m'}). \qquad (2.6)$$

M-step: subject to the constraint (2.2), update $\alpha_m$, $\mu_m$ and $\Sigma_m$ by

$$\alpha_m = \Big(\sum_{n=1}^{N} R_{mn}\Big) \Big/ N, \qquad (2.7)$$

$$\mu_m = \Big(\sum_{n=1}^{N} R_{mn} \, y^{(n)}\Big) \Big/ (N\alpha_m), \qquad (2.8)$$

$$\Sigma_m = \sum_{n=1}^{N} R_{mn} (y^{(n)} - \mu_m)(y^{(n)} - \mu_m)^{\mathrm{T}} \Big/ (N\alpha_m), \qquad (2.9)$$

where $N$ is the sample size and $d$ the dimension.

3. The Inverse Wishart Prior

Maximum likelihood estimation of a Gaussian mixture can break down when a component covariance approaches singularity, and EM guarantees only a local optimum, a known issue for mixture likelihoods [8]. A standard remedy is to maximize the penalized likelihood (2.5) with a suitable prior: Ridolfi and Idier [9] used Gamma priors, while Ormoneit and Tresp [6] and Snoussi and Mohammad-Djafari [10] used inverse Wishart priors. Here we take the prior

$$p(\theta) = D(\alpha \mid \gamma) \prod_{m=1}^{k} p(\mu_m, \Sigma_m) = D(\alpha \mid \gamma) \prod_{m=1}^{k} N(\mu_m \mid \nu_m, \eta_m^{-1}\Sigma_m) \, IW(\Sigma_m^{-1} \mid \delta_m, \beta_m), \qquad (3.1)$$

where $D(\alpha \mid \gamma)$ is the Dirichlet density

$$D(\alpha \mid \gamma) = b(\gamma) \prod_{m=1}^{k} \alpha_m^{\gamma_m - 1}, \qquad \alpha_m \ge 0, \quad \sum_{m=1}^{k} \alpha_m = 1, \qquad (3.2)$$

and

$$N(\mu_m \mid \nu_m, \eta_m^{-1}\Sigma_m) = (2\pi)^{-d/2} |\eta_m^{-1}\Sigma_m|^{-1/2} \exp\Big\{-\frac{\eta_m}{2}(\mu_m - \nu_m)^{\mathrm{T}} \Sigma_m^{-1} (\mu_m - \nu_m)\Big\}, \qquad (3.3)$$

$$IW(\Sigma_m^{-1} \mid \delta_m, \beta_m) = c(\delta_m, \beta_m) \, |\Sigma_m^{-1}|^{\delta_m - (d+1)/2} \exp\{-\mathrm{tr}(\beta_m \Sigma_m^{-1})\}. \qquad (3.4)$$

Here $\delta_m > (d-1)/2$, $b(\gamma)$ and $c(\delta_m, \beta_m)$ are normalizing constants, and $\mathrm{tr}(\cdot)$ denotes the trace.
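One full EM iteration, eqs. (2.6)–(2.9), can be sketched as follows. This is a minimal NumPy illustration under our own naming, with none of the safeguards against singular covariances that motivate Section 3:

```python
import numpy as np

def em_step(Y, alphas, mus, sigmas):
    """One EM iteration for a Gaussian mixture, eqs. (2.6)-(2.9)."""
    N, d = Y.shape
    k = len(alphas)
    # E-step: unnormalized responsibilities alpha_m * p(y^(n) | theta_m)
    R = np.empty((k, N))
    for m in range(k):
        diff = Y - mus[m]
        inv = np.linalg.inv(sigmas[m])
        quad = np.einsum('nd,de,ne->n', diff, inv, diff)
        dens = (2 * np.pi) ** (-d / 2) * np.linalg.det(sigmas[m]) ** -0.5 \
               * np.exp(-0.5 * quad)
        R[m] = alphas[m] * dens
    R /= R.sum(axis=0, keepdims=True)          # normalize over m, eq. (2.6)
    # M-step, eqs. (2.7)-(2.9)
    new_alphas = R.sum(axis=1) / N
    new_mus = (R @ Y) / (N * new_alphas)[:, None]
    new_sigmas = []
    for m in range(k):
        diff = Y - new_mus[m]
        new_sigmas.append((R[m, :, None] * diff).T @ diff / (N * new_alphas[m]))
    return new_alphas, new_mus, new_sigmas
```

Iterating `em_step` until the log-likelihood (2.3) stops increasing yields the ML estimate (2.4).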

Substituting (3.2)–(3.4) into (2.5) gives the objective

$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \Big\{ \sum_{n=1}^{N} \sum_{m=1}^{k} R_{mn} \big[\log\alpha_m + \log N(y^{(n)} \mid \mu_m, \Sigma_m)\big] + \log D(\alpha \mid \gamma) + \sum_{m=1}^{k} \big[\log N(\mu_m \mid \nu_m, \eta_m^{-1}\Sigma_m) + \log IW(\Sigma_m^{-1} \mid \delta_m, \beta_m)\big] \Big\}. \qquad (3.5)$$

The corresponding EM iteration is:

E-step: as in (2.6),

$$R_{mn} = \alpha_m \, p(y^{(n)} \mid \theta_m) \Big/ \sum_{m'=1}^{k} \alpha_{m'} \, p(y^{(n)} \mid \theta_{m'}). \qquad (3.6)$$

M-step: update $\mu_m$, $\Sigma_m$ and $\alpha_m$ by

$$\alpha_m = \Big(\sum_{n=1}^{N} R_{mn} + \gamma_m - 1\Big) \Big/ \Big(N + \sum_{m'=1}^{k} \gamma_{m'} - k\Big), \qquad (3.7)$$

$$\mu_m = \Big(\sum_{n=1}^{N} R_{mn} \, y^{(n)} + \eta_m \nu_m\Big) \Big/ (N\alpha_m + \eta_m), \qquad (3.8)$$

$$\Sigma_m = \frac{\displaystyle\sum_{n=1}^{N} R_{mn} (y^{(n)} - \mu_m)(y^{(n)} - \mu_m)^{\mathrm{T}} + \eta_m(\mu_m - \nu_m)(\mu_m - \nu_m)^{\mathrm{T}} + 2\beta_m}{N\alpha_m + 2\delta_m - d}, \qquad (3.9)$$

where $\gamma$, $\eta$, $\nu$, $\delta$ and $\beta$ are the prior hyperparameters, $N$ is the sample size and $d$ the dimension.

4. The MML Criterion

The minimum message length (MML) criterion was proposed by Wallace and Freeman [11] in 1987. It encodes first the parameters and then the data given the parameters, and selects the model that minimizes the total message length, approximately

$$\mathrm{MessLen} \approx -\log p(\theta) + \frac{1}{2}\log|I(\theta)| - \log p(\mathcal{Y} \mid \theta) + \frac{c}{2}\log\kappa_c + \frac{c}{2}, \qquad (4.1)$$

where $p(\theta)$ is the prior; $I(\theta)$ is the Fisher information matrix, $I(\theta) = -\mathrm{E}\big[\partial^2 \log p(\mathcal{Y} \mid \theta)/\partial\theta^2\big]$; $p(\mathcal{Y} \mid \theta)$ is the likelihood; $c$ is the number of free parameters; and $\kappa_c$ is the optimal quantizing lattice constant in dimension $c$, with $\kappa_1 = 1/12$ and $\kappa_2 = 5/(36\sqrt{3})$ (Conway and Sloane [12]).

Substituting the prior (3.1) for $p(\theta)$ in (4.1) and, as in [4], approximating the Fisher information $I(\theta)$ by its complete-data counterpart $I_c(\theta)$, the criterion for the number of components becomes

$$\hat{c}_{\mathrm{MDL}} = \arg\min_{c} \bigg\{ -\sum_{m=1}^{c} \Big[(\gamma_m - 1)\log\alpha_m - \frac{1}{2}\log|\Sigma_m| - \frac{\eta_m}{2}(\mu_m - \nu_m)^{\mathrm{T}}\Sigma_m^{-1}(\mu_m - \nu_m) + \Big(\delta_m - \frac{d+1}{2}\Big)\log|\Sigma_m^{-1}| - \mathrm{tr}(\beta_m\Sigma_m^{-1})\Big] - \log p(\mathcal{Y} \mid \theta) + \frac{c}{2}\log\kappa_c + \frac{c}{2} + \frac{1}{2}\sum_{m=1}^{c}\log\big|(I_{d^2 \times d^2} + K)(\Sigma_m \otimes \Sigma_m)\big| \bigg\}, \qquad (4.2)$$
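The penalized M-step (3.7)–(3.9) only adds hyperparameter terms to (2.7)–(2.9). A sketch, assuming the responsibilities `R` come from the E-step (3.6) and the hyperparameters $\gamma_m, \eta_m, \nu_m, \delta_m, \beta_m$ are supplied by the user (all names here are ours):

```python
import numpy as np

def map_m_step(Y, R, gammas, etas, nus, deltas, betas):
    """MAP M-step under the Dirichlet / normal / inverse-Wishart prior,
    eqs. (3.7)-(3.9). R[m, n] is the responsibility of component m for
    sample n; betas[m] is a d x d matrix, the rest are per-component scalars
    or d-vectors."""
    N, d = Y.shape
    k = R.shape[0]
    # eq. (3.7): Dirichlet-smoothed mixing proportions
    alphas = (R.sum(axis=1) + gammas - 1) / (N + gammas.sum() - k)
    mus, sigmas = [], []
    for m in range(k):
        Nm = N * alphas[m]
        # eq. (3.8): mean shrunk toward the prior location nu_m
        mu = (R[m] @ Y + etas[m] * nus[m]) / (Nm + etas[m])
        diff = Y - mu
        S = (R[m, :, None] * diff).T @ diff
        # eq. (3.9): covariance regularized by the inverse-Wishart prior
        prior_term = etas[m] * np.outer(mu - nus[m], mu - nus[m]) + 2 * betas[m]
        sigmas.append((S + prior_term) / (Nm + 2 * deltas[m] - d))
        mus.append(mu)
    return alphas, mus, sigmas
```

The `2 * betas[m]` term in the numerator of (3.9) is what keeps each covariance estimate away from singularity even when a component captures very few points.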

where $\gamma$, $\eta$, $\nu$, $\delta$ and $\beta$ are the hyperparameters as before, $N$ is the sample size, $d$ the dimension, $I_{d^2 \times d^2}$ the identity matrix, and $K = \sum_{i,j} H_{ij} \otimes H_{ij}$, with $H_{ij}$ the $d \times d$ matrix whose $(i,j)$ entry is 1 and all other entries are 0.

Moreover, taking the constraint (2.2) into account, the M-step update of the mixing proportions is replaced by

$$\alpha_m(t+1) = \max\Big\{0, \Big(\sum_{i=1}^{n} R_m(i)\Big) - \frac{N}{2}\Big\} \Big/ \sum_{m'=1}^{k} \max\Big\{0, \Big(\sum_{i=1}^{n} R_{m'}(i)\Big) - \frac{N}{2}\Big\}, \qquad (4.3)$$

where $N$ here denotes the number of parameters of a single component. This update can drive a mixing proportion to zero: when $\alpha_m(t+1) = 0$, the $m$-th component is annihilated and removed from the mixture. The updates are carried out component by component, following the CEM² algorithm of Celeux et al. [13]: first $\alpha_1$ and $\theta_1$ are updated and the responsibilities recomputed, then $\alpha_2$ and $\theta_2$, and so on through all $k$ components.

5. The Algorithm

Combining Sections 3 and 4, the complete algorithm is:

1. Input $k_{\min}$, $k_{\max}$, the stopping threshold $\varepsilon$, and initial values of $\alpha_m$, $\mu_m$ and $\Sigma_m$.
2. Set $t = 0$, $k = k_{\max}$, $L_{\min} = +\infty$.
3. While $k > k_{\min}$, repeat (penalized EM with MML selection):
   (1) Set $t = t + 1$ and perform one sweep of CEM² using (3.6)–(3.9) and (4.3): for comp = 1, ..., $k$, update $R_{mn}$, then $\mu_m$, $\Sigma_m$ and $\alpha_m$ of that component. Whenever $\alpha_m(t+1) = 0$, delete the component (its $\mu_m$, $\Sigma_m$ and $\alpha_m$) and set $k = k - 1$.
   (2) Evaluate (4.2) to obtain $L(t)$.
   (3) If $|L(t-1) - L(t)| \ge \varepsilon\,|L(t)|$, return to (1).
4. If $L(t) < L_{\min}$, set $L_{\min} = L(t)$ and $k_{\mathrm{best}} = k$.
5. Set $k = k - 1$ and return to step 3.

In the experiments below we take $\varepsilon = 10^{-5}$.
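Step (1) of the loop above prunes components through the update (4.3). A minimal NumPy sketch, in which the function name and the argument `n_params` (the per-component parameter count) are our naming:

```python
import numpy as np

def annihilate_update(R, n_params):
    """Mixing-proportion update with component annihilation, eq. (4.3).

    R[m, n] is the responsibility of component m for sample n;
    n_params is the number of parameters of a single component
    (d + d*(d+1)/2 for a d-dimensional Gaussian).
    """
    raw = np.maximum(0.0, R.sum(axis=1) - n_params / 2.0)
    alphas = raw / raw.sum()       # renormalize over the surviving components
    keep = alphas > 0              # components with alpha = 0 are annihilated
    return alphas, keep
```

A component whose responsibilities sum to less than half its parameter count is driven to zero and removed, which is how the algorithm walks the model down from $k_{\max}$ toward $k_{\min}$.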

6. Experiments

We compare our algorithm with that of Figueiredo and Jain [4] on simulated data. In the first experiment, 900 samples were drawn from a three-component Gaussian mixture with mixing proportions $\alpha_1 = \alpha_2 = \alpha_3 = 1/3$, means $\mu_1 = (0, -2)^{\mathrm{T}}$, $\mu_2 = (0, 0)^{\mathrm{T}}$, $\mu_3 = (0, +2)^{\mathrm{T}}$, and common covariance $C_1 = C_2 = C_3 = \mathrm{diag}\{2, 0.2\}$. Starting from 25 initial components, the algorithm converged to the correct number, 3, after 80 iterations (Figure 1).

Figure 1. (a) The 25 initial components; (b) the 3 components finally obtained.

Over 50 repeated runs, the algorithm of [4] with the Jeffreys prior selected the correct number of components in 98% of the runs, while our algorithm with the inverse Wishart prior was correct in 100% of the runs.

In the second experiment, 1000 samples were drawn from a four-component mixture with mixing proportions $\alpha_1 = \alpha_2 = \alpha_3 = 0.3$, $\alpha_4 = 0.1$ and means $\mu_1 = \mu_2 = (-4, -4)^{\mathrm{T}}$, $\mu_3 = (2, 2)^{\mathrm{T}}$, $\mu_4 = (-1, -6)^{\mathrm{T}}$.

The covariance matrices are

$$C_1 = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}, \quad C_2 = \begin{pmatrix} 6 & -2 \\ -2 & 6 \end{pmatrix}, \quad C_3 = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad C_4 = \begin{pmatrix} 0.125 & 0 \\ 0 & 0.125 \end{pmatrix}.$$

Starting again from 25 initial components, the algorithm converged to the correct number, 4, after 100 iterations (Figure 3).

Figure 3. (a) The 25 initial components; (b) the 4 components finally obtained.

As in the first experiment, we compare with the method of Figueiredo and Jain [4]; Figure 4 shows the two results.

Figure 4. (a) Result of the algorithm of [4]; (b) result of our algorithm.

With the inverse Wishart prior in place of the Jeffreys prior, the overlapping components are recovered more accurately.
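The synthetic data sets are straightforward to regenerate. For instance, the three-component mixture of the first experiment can be sampled as follows (the seed and function names are ours):

```python
import numpy as np

def sample_gmm(n, alphas, mus, covs, rng):
    """Draw n points from a Gaussian mixture; returns samples and labels."""
    comps = rng.choice(len(alphas), size=n, p=alphas)
    X = np.array([rng.multivariate_normal(mus[c], covs[c]) for c in comps])
    return X, comps

rng = np.random.default_rng(0)
alphas = [1 / 3, 1 / 3, 1 / 3]
mus = [np.array([0.0, -2.0]), np.array([0.0, 0.0]), np.array([0.0, 2.0])]
covs = [np.diag([2.0, 0.2])] * 3          # common covariance diag{2, 0.2}
X, labels = sample_gmm(900, alphas, mus, covs, rng)
```

The returned labels can then be compared against the algorithm's cluster assignments to compute the correct-classification rate reported above.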

7. Conclusion

We have studied unsupervised classification based on penalized maximum likelihood estimation of Gaussian mixture models, combining MML selection of the number of components with component-wise penalized EM. Following the penalized estimators of [4] and [6], we replace the Jeffreys prior with a normal-inverse Wishart prior; in the experiments this raises the correct-classification rate and reduces the estimation time.

References

[1] Day, N.E., Estimating the components of a mixture of normal distributions, Biometrika, 56(3)(1969), 463–474.
[2] Dempster, A., Laird, N. and Rubin, D., Maximum likelihood estimation from incomplete data via the EM algorithm, J. Royal Statistical Soc. B, 39(1977), 1–38.
[3] Redner, R.A. and Walker, H.F., Mixture densities, maximum likelihood and the EM algorithm, SIAM Review, 26(1984), 195–239.
[4] Figueiredo, M.A.T. and Jain, A.K., Unsupervised learning of finite mixture models, IEEE Trans. Pattern Analysis and Machine Intelligence, 24(3)(2002), 381–396.
[5] Ueda, N., Nakano, R., Ghahramani, Z. and Hinton, G., SMEM algorithm for mixture models, Neural Computation, 12(2000), 2109–2128.
[6] Ormoneit, D. and Tresp, V., Averaging, maximum penalized likelihood and Bayesian estimation for improving Gaussian mixture probability density estimates, IEEE Transactions on Neural Networks, 9(4)(1998), 639–649.
[7] Oliver, J., Baxter, R. and Wallace, C., Unsupervised learning using MML, Proc. 13th Int'l Conf. Machine Learning, 364–372, 1996.
[8] Hathaway, R., Another interpretation of the EM algorithm for mixture distributions, Statistics & Probability Letters, 4(1986), 53–56.
[9] Ridolfi, A. and Idier, J., Penalized maximum likelihood estimation for normal mixture distributions, Actes 17e Coll. GRETSI, Vannes, France, 259–262, 1999.
[10] Snoussi, H. and Mohammad-Djafari, A., Penalized maximum likelihood for multivariate Gaussian mixture, Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 21st International Workshop, Baltimore, Maryland, 77–88, 2001.
[11] Wallace, C.S. and Freeman, P.R., Estimation and inference by compact coding, Journal of the Royal Statistical Society B, 49(1987), 240–252.
[12] Conway, J.H. and Sloane, N.J.A., Sphere Packings, Lattices and Groups, Springer-Verlag, London, 1988.
[13] Celeux, G., Chrétien, S., Forbes, F. and Mkhadri, A., A component-wise EM algorithm for mixtures, Technical Report 3746, INRIA Rhône-Alpes, France, 1999. Available at http://www.inria.fr/RRRT/RR-3746.html.

Unsupervised Classification Based on Penalized Maximum Likelihood of Gaussian Mixture Models

Yu Peng (School of Mathematical Sciences, Peking University, Beijing, 100871; National Geomatics Center of China, Beijing, 100044)
Tong Xinwei (School of Mathematical Sciences, Beijing Normal University, Beijing, 100875)
Feng Jufu (National Laboratory on Machine Perception, Center for Information Science, School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871)

Abstract: In this paper we propose an unsupervised classification algorithm based on Gaussian mixture models. Since the EM algorithm can converge to a local optimum when estimating the parameters of a Gaussian mixture model, we substitute an inverse Wishart prior for the Jeffreys prior. Experiments show that the algorithm improves the correct-classification rate and decreases the estimation time.

Keywords: Gaussian mixture models, unsupervised classification, penalized maximum likelihood, EM algorithm, inverse Wishart distribution.

AMS Subject Classification: 62G32.