Structuring Model for Speech Recognition using Buried Markov Model

Takayuki Yamamoto,†1 Tetsuya Takiguchi†2 and Yasuo Ariki†2

Although HMMs recognize clearly articulated speech with high accuracy, their accuracy drops for noisy speech and for continuous utterances. To address this problem, J. Bilmes proposed the buried Markov model (BMM), which represents conditional dependence relations among observation nodes. In this work, to capture dependence relations that distributional assumptions prevent from being detected, we propose replacing mutual information with non-parametric independence tests in the Pairwise algorithm, the algorithm that learns the structure of the BMM. We carried out phone recognition experiments with an HMM, a BMM structured using mutual information, and a BMM structured using the multi-resolution independence test. The phone recognition rate improved by 0.6% for male speakers, 0.6% for female speakers, and 1.3% for mixed (male and female) data.

1. Introduction

The hidden Markov model (HMM), which can be regarded as a dynamic Bayesian network (DBN), is widely used as the acoustic model in speech recognition. To relax the independence assumptions that an HMM places on its observations, J. Bilmes proposed the buried Markov model (BMM) 1). Section 2 describes the BMM, Section 3 describes the independence tests used to structure the BMM, Section 4 reports the experiments, and Section 5 concludes.

†1 Graduate School of Engineering, Kobe University
†2 Organization of Advanced Science and Technology, Kobe University

© 2009 Information Processing Society of Japan
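2. Buried Markov Model

2.1 Buried Markov Model

Fig. 1 shows the structure of the BMM. Whereas the Markov chain of an HMM is hidden behind the observations, in a BMM it is further buried under additional dependencies between observation nodes.

Fig. 1 Structure of Buried Markov Model

The BMM defines the probability of an observation sequence as

$$Pr(x_{1:T}) = \sum_{q_{1:T}} \prod_{t} Pr(x_t \mid z_t(q_t), q_t)\, Pr(q_t \mid q_{t-1}) \tag{1}$$

where $x_{1:T}$ is the observation sequence of length $T$, $q_t$ is the hidden state at time $t$, and $z_t(q_t)$ is the set of dependency variables of $x_t$ at time $t$, which is selected according to the state $q_t$.

2.2 Structuring the BMM with the Pairwise Algorithm

The Pairwise algorithm selects a dependency variable $Z$ for an observation variable $X$ when the following conditions hold:

$$I(X, Z \mid q) > \delta_1 \tag{2}$$

$$I(Z, Z_i) < \delta_2\, I(X, Z) \quad \forall Z_i \in \mathcal{Z} \tag{3}$$

$$I(X, Z) < \delta_3 \tag{4}$$

Here $X$ is an observation variable, $Z$ is a candidate dependency variable, $\mathcal{Z}$ is the set of dependency variables already selected, and $\delta_1$ to $\delta_3$ are thresholds. Condition (2) requires that $Z$ be informative about $X$ when the state $q$ is given. Condition (4) requires that $Z$ carry little information about $X$ when the state $q$ is not given. Condition (3) requires that $Z$ not be redundant with any dependency variable $Z_i$ already selected for $X$. Candidates are evaluated by the utility

$$U(Z) = I(X, Z \mid q) - I(X, Z)$$

and the variables $Z$ with high utility are added as dependencies of $X$ to construct the BMM.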
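The selection rule in conditions (2) to (4) and the utility U(Z) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the authors' implementation: the information values are assumed to be precomputed, the greedy ordering by utility is one plausible reading of the Pairwise algorithm, and the function name `select_dependencies`, its argument names, and the cap `max_deps` are hypothetical.

```python
from typing import Dict, FrozenSet, List


def select_dependencies(
    cond_mi: Dict[str, float],             # I(X, Z | q) for each candidate Z
    mi: Dict[str, float],                  # I(X, Z) for each candidate Z
    pair_mi: Dict[FrozenSet[str], float],  # I(Z, Z') for each pair of candidates
    delta1: float,
    delta2: float,
    delta3: float,
    max_deps: int,                         # hypothetical cap on dependencies per X
) -> List[str]:
    """Greedily pick dependency variables for one observation variable X."""
    # Rank candidates by the utility U(Z) = I(X, Z | q) - I(X, Z), best first.
    ranked = sorted(cond_mi, key=lambda z: cond_mi[z] - mi[z], reverse=True)
    chosen: List[str] = []
    for z in ranked:
        if len(chosen) >= max_deps:
            break
        if cond_mi[z] <= delta1:           # violates condition (2)
            continue
        if mi[z] >= delta3:                # violates condition (4)
            continue
        # Condition (3): I(Z, Z_i) < delta2 * I(X, Z) for every chosen Z_i.
        redundant = any(
            pair_mi[frozenset((z, zi))] >= delta2 * mi[z] for zi in chosen
        )
        if redundant:
            continue
        chosen.append(z)
    return chosen


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    cond_mi = {"a": 0.9, "b": 0.8, "c": 0.2}
    mi = {"a": 0.1, "b": 0.1, "c": 0.05}
    pair_mi = {
        frozenset(("a", "b")): 0.5,
        frozenset(("a", "c")): 0.001,
        frozenset(("b", "c")): 0.001,
    }
    print(select_dependencies(cond_mi, mi, pair_mi, 0.3, 0.5, 0.2, max_deps=2))
```

In the toy data, candidate `b` is rejected by condition (3) because it is strongly related to the already selected `a`, and `c` is rejected by condition (2) unless the threshold `delta1` is lowered.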
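Fig. 2 shows the method used to structure the BMM, which combines HMM training with the Pairwise algorithm.

Fig. 2 Method to structure Buried Markov Model

3. Independence Tests for Structuring the BMM

This section describes the non-parametric independence tests used for structuring the BMM 2) 3) 4).

3.1 Multi-resolution Independence Test

At resolution $R$, the two-dimensional space spanned by the two variables is divided into a grid of $I \times J$ bins (Fig. 3). Let $c_1, \ldots, c_K$ be the numbers of data points falling in each of the $K = IJ$ bins, $N$ the total number of data points, $p_1, \ldots, p_K$ the probabilities of the bins, $B_R$ the bin boundaries at resolution $R$, and $D$ the data. The likelihood of the data is

$$Pr(D \mid p_{1 \ldots K}, B_R, R) = N! \prod_{k=1}^{K} \frac{p_k^{c_k}}{c_k!} \tag{5}$$

Fig. 3 Histogramization

Because the bin probabilities $p_{1 \ldots K}$ in (5) are unknown, a prior distribution is placed over them.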
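$$Pr(p_{1 \ldots K}) = \Gamma(\gamma) \prod_{k=1}^{K} \frac{p_k^{\gamma_k - 1}}{\Gamma(\gamma_k)} \tag{6}$$

where $\gamma = \sum_{k=1}^{K} \gamma_k$, the $\gamma_k$ are the parameters of the prior, and $\Gamma(x)$ is the gamma function. The probability of the data $Pr(D)$ is obtained by integrating out $p_k$ using (5) and (6):

$$Pr(D) = \int Pr(D \mid p_{1 \ldots K})\, Pr(p_{1 \ldots K})\, dp_{1 \ldots K} = \frac{\Gamma(\gamma)}{\Gamma(\gamma + N)} \prod_{k=1}^{K} \frac{\Gamma(\gamma_k + c_k)}{\Gamma(\gamma_k)} \tag{7}$$

Let $M_I$ denote the model in which the two variables are independent and $M_{\neg I}$ the model in which they are not. The probability of the data $D$ is then

$$Pr(D) = Pr(D \mid M_I)\, Pr(M_I) + Pr(D \mid M_{\neg I})\, Pr(M_{\neg I}) \tag{8}$$

With the prior probabilities $Pr(M_I) = \rho$ and $Pr(M_{\neg I}) = 1 - \rho$ in (8), the posterior probability of independence is

$$Pr(M_I \mid D) = 1 \Big/ \left[ 1 + \frac{1 - \rho}{\rho} \cdot \frac{Pr(D \mid M_{\neg I})}{Pr(D \mid M_I)} \right] \tag{9}$$

The likelihoods $Pr(D \mid M_{\neg I})$ and $Pr(D \mid M_I)$ in (9) are

$$Pr(D \mid M_{\neg I}) = \frac{\Gamma(\gamma)}{\Gamma(\gamma + N)} \prod_{k=1}^{K} \frac{\Gamma(\gamma_k + c_k)}{\Gamma(\gamma_k)} \equiv \Upsilon(C_K, \gamma_K) \tag{10}$$

$$Pr(D \mid M_I) = \left( \frac{\Gamma(\alpha)}{\Gamma(\alpha + N)} \prod_{i=1}^{I} \frac{\Gamma(\alpha_i + c_i)}{\Gamma(\alpha_i)} \right) \left( \frac{\Gamma(\beta)}{\Gamma(\beta + N)} \prod_{j=1}^{J} \frac{\Gamma(\beta_j + c_j)}{\Gamma(\beta_j)} \right) \equiv \Upsilon(C_I, \alpha_I)\, \Upsilon(C_J, \beta_J) \tag{11}$$

In (10) and (11), $\gamma = \sum_{k=1}^{K} \gamma_k$, $\alpha = \sum_{i=1}^{I} \alpha_i$, and $\beta = \sum_{j=1}^{J} \beta_j$, with $\alpha_i = \beta_j = \gamma_k = 1$. Substituting (10) and (11) into (9) gives

$$Pr(M_I \mid D) = 1 \Big/ \left[ 1 + \frac{(1 - \rho)\, \Upsilon(C_K, \gamma_K)}{\rho\, \Upsilon(C_I, \alpha_I)\, \Upsilon(C_J, \beta_J)} \right] \tag{12}$$

Equation (12) holds for a fixed set of bin boundaries. For resolution $R$, the posterior is obtained by integrating (12) over the boundaries $B_R$:

$$Pr(M_I \mid R, D) = \int Pr(M_I \mid B_R, R, D)\, Pr(B_R \mid R, D)\, dB_R \tag{13}$$

The test is applied over resolutions $R$ up to a maximum resolution $R_{max}$.

3.2 Spearman's Rank Correlation Coefficient

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{(n - 1)\, n\, (n + 1)} \tag{14}$$

where $n$ is the number of pairs of $X$ and $Y$, $X_i$ and $Y_i$ are the $i$-th values of $X$ and $Y$, and $D_i = \mathrm{rank}(X_i) - \mathrm{rank}(Y_i)$. Here $\rho$ denotes the rank correlation coefficient of (14), not the prior probability of Section 3.1.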
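Equations (12) and (14) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the posterior is computed for a single fixed grid of bin counts (the integration over boundaries in (13) is not performed), all prior parameters are set to 1 as in the text, ties in the rank computation are not handled, and the function names `log_upsilon`, `posterior_independent`, and `spearman_rho` are hypothetical. Log-gamma values are used to avoid overflow.

```python
import math
from typing import List, Sequence


def log_upsilon(counts: Sequence[int], prior: float = 1.0) -> float:
    """Log of Gamma(g)/Gamma(g+N) * prod_k Gamma(g_k+c_k)/Gamma(g_k), g_k = prior."""
    total_prior = prior * len(counts)
    n = sum(counts)
    value = math.lgamma(total_prior) - math.lgamma(total_prior + n)
    for c in counts:
        value += math.lgamma(prior + c) - math.lgamma(prior)
    return value


def posterior_independent(table: Sequence[Sequence[int]],
                          prior_indep: float = 0.5) -> float:
    """Pr(M_I | D) of Eq. (12) for one fixed I x J table of bin counts."""
    cell_counts = [c for row in table for c in row]   # c_k, k = 1..K
    row_counts = [sum(row) for row in table]          # c_i, i = 1..I
    col_counts = [sum(col) for col in zip(*table)]    # c_j, j = 1..J
    log_dep = log_upsilon(cell_counts)                          # Eq. (10)
    log_ind = log_upsilon(row_counts) + log_upsilon(col_counts)  # Eq. (11)
    log_ratio = (math.log(1.0 - prior_indep) - math.log(prior_indep)
                 + log_dep - log_ind)
    # 1 / (1 + exp(log_ratio)), evaluated without overflow.
    if log_ratio > 0:
        e = math.exp(-log_ratio)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_ratio))


def ranks(values: Sequence[float]) -> List[int]:
    """Ranks starting at 1; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation coefficient of Eq. (14)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / ((n - 1) * n * (n + 1))


if __name__ == "__main__":
    print(posterior_independent([[25, 25], [25, 25]]))  # evenly spread counts
    print(posterior_independent([[50, 0], [0, 50]]))    # counts on the diagonal
    print(spearman_rho([1, 2, 3, 4], [40, 30, 20, 10]))
```

For evenly spread counts the posterior probability of independence is above the prior of 0.5, while counts concentrated on the diagonal drive it toward 0.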
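3.3 Kendall Tau Rank Correlation Coefficient

$$\tau = \frac{P - Q}{\frac{1}{2} n (n - 1)} \tag{15}$$

where $n$ is the number of pairs of $X$ and $Y$, and $\frac{1}{2} n (n - 1)$ is the number of ways of choosing two of them, $a$ and $b$. $P$ is the number of concordant combinations, for which $X_a < X_b$ and $Y_a < Y_b$, or $X_a > X_b$ and $Y_a > Y_b$. $Q$ is the number of discordant combinations, for which $X_a < X_b$ and $Y_a > Y_b$, or $X_a > X_b$ and $Y_a < Y_b$. When all combinations are counted in $P$, $\tau = 1$; when all are counted in $Q$, $\tau = -1$; when $P$ and $Q$ are equal, $\tau = 0$.

4. Experiments with BMM

Phone recognition experiments were carried out on speech from the ASJ-PB and ASJ-JNAS corpora (... 100 ... 10 ...). The models were built with GMTK 5). The speech was sampled at 16 kHz and parameterized as MFCC features (12 ... 36 ...), with a frame shift of 10 ms and a window length of 25 ms. The number of phones was 43. The BMM settings include further figures (... 5 ... 5.4 ...).

Fig. 4 shows the phone recognition rates. The BMM improved the phone recognition rate by 0.6% for male speakers, 0.6% for female speakers, and 1.3% for mixed data.

Fig. 4 Phone recognition rate [%]

5. Conclusion

We introduced non-parametric independence tests into the Pairwise algorithm for structuring the BMM and evaluated the resulting models with GMTK in phone recognition experiments on JNAS speech.

References

1) J. A. Bilmes, Buried Markov models: a graphical-modeling approach to automatic speech recognition, Computer Speech and Language, Vol. 17, Issues 2-3, pp. 213-231, 2003.
2) D. Margaritis et al., A Bayesian Multiresolution Independence Test for Continuous Variables, Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence (UAI), pp. 346-353, 2001.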
3) J. L. Myers and A. D. Well, Research Design and Statistical Analysis (2nd ed.), p. 508, 2003.
4) H. Abdi, Kendall rank correlation, in Neil Salkind (Ed.), Encyclopedia of Measurement and Statistics, 2007.
5) J. A. Bilmes and G. Zweig, The Graphical Models Toolkit: An open source software system for speech and time-series processing, ICASSP, Orlando, pp. 3916-3919, 2002.