CSJ. Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity

Σχετικά έγγραφα
[5] F 16.1% MFCC NMF D-CASE 17 [5] NMF NMF 3. [5] 1 NMF Deep Neural Network(DNN) FUSION 3.1 NMF NMF [12] S W H 1 Fig. 1 Our aoustic event detect

Buried Markov Model Pairwise

Estimation, Evaluation and Guarantee of the Reverberant Speech Recognition Performance based on Room Acoustic Parameters

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

MIDI [8] MIDI. [9] Hsu [1], [2] [10] Salamon [11] [5] Song [6] Sony, Minato, Tokyo , Japan a) b)

, -.

Nov Journal of Zhengzhou University Engineering Science Vol. 36 No FCM. A doi /j. issn

3: A convolution-pooling layer in PS-CNN 1: Partially Shared Deep Neural Network 2.2 Partially Shared Convolutional Neural Network 2: A hidden layer o

Schedulability Analysis Algorithm for Timing Constraint Workflow Models

Query by Phrase (QBP) (Music Information Retrieval, MIR) QBH QBP / [1, 2] [3, 4] Query-by-Humming (QBH) QBP MIDI [5, 6] [8 10] [7]

Ψηφιακή Επεξεργασία Φωνής

1 (forward modeling) 2 (data-driven modeling) e- Quest EnergyPlus DeST 1.1. {X t } ARMA. S.Sp. Pappas [4]

Detection and Recognition of Traffic Signal Using Machine Learning

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/6 CS Activity 1,a) CS Computer Science Activity Activity Actvity Activity Dining Eight-He

. i-vector, Total Variability Subspace Adaptation Based Speaker Recognition. Brief Paper ACTA AUTOMATICA SINICA Vol. 40, No. 8 August, 2014.

Feasible Regions Defined by Stability Constraints Based on the Argument Principle

Yoshifumi Moriyama 1,a) Ichiro Iimura 2,b) Tomotsugu Ohno 1,c) Shigeru Nakayama 3,d)

Study on Re-adhesion control by monitoring excessive angular momentum in electric railway traction

Πανεπιστήµιο Πειραιώς Τµήµα Πληροφορικής

Πτυχιακή Εργασι α «Εκτι μήσή τής ποιο τήτας εικο νων με τήν χρή σή τεχνήτων νευρωνικων δικτυ ων»

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

1,a) 1,b) 2 3 Sakriani Sakti 1 Graham Neubig 1 1. A Study on HMM-Based Speech Synthesis Using Rich Context Models

A Method for Creating Shortcut Links by Considering Popularity of Contents in Structured P2P Networks

Μηχανισμοί πρόβλεψης προσήμων σε προσημασμένα μοντέλα κοινωνικών δικτύων ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ

Quick algorithm f or computing core attribute

( ) (Harmonic-Temporal Clustering; HTC) [1], [2] ( ) ( ) [4] HTC. (Non-negative Matrix Factorization; NMF) [3] [5], [6] [7], [8]

Applying Markov Decision Processes to Role-playing Game

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Simplex Crossover for Real-coded Genetic Algolithms

HIV HIV HIV HIV AIDS 3 :.1 /-,**1 +332

EM Baum-Welch. Step by Step the Baum-Welch Algorithm and its Application 2. HMM Baum-Welch. Baum-Welch. Baum-Welch Baum-Welch.

GPU. CUDA GPU GeForce GTX 580 GPU 2.67GHz Intel Core 2 Duo CPU E7300 CUDA. Parallelizing the Number Partitioning Problem for GPUs

An Automatic Modulation Classifier using a Frequency Discriminator for Intelligent Software Defined Radio

Stabilization of stock price prediction by cross entropy optimization

«ΑΝΑΠΣΤΞΖ ΓΠ ΚΑΗ ΥΩΡΗΚΖ ΑΝΑΛΤΖ ΜΔΣΔΩΡΟΛΟΓΗΚΩΝ ΓΔΓΟΜΔΝΩΝ ΣΟΝ ΔΛΛΑΓΗΚΟ ΥΩΡΟ»

Web-based supplementary materials for Bayesian Quantile Regression for Ordinal Longitudinal Data

Study of In-vehicle Sound Field Creation by Simultaneous Equation Method

ER-Tree (Extended R*-Tree)

Η ΠΡΟΣΩΠΙΚΗ ΟΡΙΟΘΕΤΗΣΗ ΤΟΥ ΧΩΡΟΥ Η ΠΕΡΙΠΤΩΣΗ ΤΩΝ CHAT ROOMS

Analysis of prosodic features in native and non-native Japanese using generation process model of fundamental frequency contours

Voice Conversion based on Non-negative Matrix Factorization with Segment Features in Noisy Environments

Research on Economics and Management

Prey-Taxis Holling-Tanner

Bayesian Discriminant Feature Selection

ΖΩΝΟΠΟΙΗΣΗ ΤΗΣ ΚΑΤΟΛΙΣΘΗΤΙΚΗΣ ΕΠΙΚΙΝΔΥΝΟΤΗΤΑΣ ΣΤΟ ΟΡΟΣ ΠΗΛΙΟ ΜΕ ΤΗ ΣΥΜΒΟΛΗ ΔΕΔΟΜΕΝΩΝ ΣΥΜΒΟΛΟΜΕΤΡΙΑΣ ΜΟΝΙΜΩΝ ΣΚΕΔΑΣΤΩΝ

ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ "ΠΟΛΥΚΡΙΤΗΡΙΑ ΣΥΣΤΗΜΑΤΑ ΛΗΨΗΣ ΑΠΟΦΑΣΕΩΝ. Η ΠΕΡΙΠΤΩΣΗ ΤΗΣ ΕΠΙΛΟΓΗΣ ΑΣΦΑΛΙΣΤΗΡΙΟΥ ΣΥΜΒΟΛΑΙΟΥ ΥΓΕΙΑΣ "

DETERMINATION OF DYNAMIC CHARACTERISTICS OF A 2DOF SYSTEM. by Zoran VARGA, Ms.C.E.

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

SNR F0 [2], [3], [4] F0 F0 F0 F0 F0 TUSK F0 TUSK F0 6 TUSK 6 F0 2. F0 F0 [5] [6] [7] p[8] Cepstrum [9], [10] [11] [12] [13] F0 [14] F0 [15] DIO[16] [1

Συνδυασμένη Οπτική-Ακουστική Ανάλυση Ομιλίας

ΙΕΥΘΥΝΤΗΣ: Καθηγητής Γ. ΧΡΥΣΟΛΟΥΡΗΣ Ι ΑΚΤΟΡΙΚΗ ΙΑΤΡΙΒΗ

HW 3 Solutions 1. a) I use the auto.arima R function to search over models using AIC and decide on an ARMA(3,1)

Bundle Adjustment for 3-D Reconstruction: Implementation and Evaluation

Conjoint. The Problems of Price Attribute by Conjoint Analysis. Akihiko SHIMAZAKI * Nobuyuki OTAKE

Echo path identification for stereophonic acoustic echo cancellation without pre-processing

Elements of Information Theory

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ ΣΧΟΛΗ ΗΛΕΚΤΡΟΛΟΓΩΝ ΜΗΧΑΝΙΚΩΝ ΚΑΙ ΜΗΧΑΝΙΚΩΝ ΥΠΟΛΟΓΙΣΤΩΝ

FX10 SIMD SIMD. [3] Dekker [4] IEEE754. a.lo. (SpMV Sparse matrix and vector product) IEEE754 IEEE754 [5] Double-Double Knuth FMA FMA FX10 FMA SIMD

1 n-gram n-gram n-gram [11], [15] n-best [16] n-gram. n-gram. 1,a) Graham Neubig 1,b) Sakriani Sakti 1,c) 1,d) 1,e)

Anomaly Detection with Neighborhood Preservation Principle

Identifying Scenes with the Same Person in Video Content on the Basis of Scene Continuity and Face Similarity Measurement

ΤΕΧΝΟΛΟΓΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΥΠΡΟΥ ΣΧΟΛΗ ΕΠΙΣΤΗΜΩΝ ΥΓΕΙΑΣ

The Algorithm to Extract Characteristic Chord Progression Extended the Sequential Pattern Mining

Numerical Analysis FMN011

(Υπογραϕή) (Υπογραϕή) (Υπογραϕή)

Other Test Constructions: Likelihood Ratio & Bayes Tests

[2] REVERB 8 [3], [4] [5] [20] [6], [7], [8], [9], [10] [11] REVERB 8 *1 [9] LDA *2 MLLT (SAT) [8] (basis fmllr) [12] (DNN) [10] DNN [11] [13] [14] Ka

[4] 1.2 [5] Bayesian Approach min-max min-max [6] UCB(Upper Confidence Bound ) UCT [7] [1] ( ) Amazons[8] Lines of Action(LOA)[4] Winands [4] 1

Vol.4-DCC-8 No.8 Vol.4-MUS-5 No.8 4// 3 3 Hanning (T ) 3 Hanning 3T (y(t)w(t)) dt =.5 T y (t)dt. () STRAIGHT F 3 TANDEM-STRAIGHT[] 3 F F 3 [] F []. :

*,* + -+ on Bedrock Bath. Hideyuki O, Shoichi O, Takao O, Kumiko Y, Yoshinao K and Tsuneaki G

Προσδιορισμός Χαρακτηριστικών των Λεκανών Απορροής

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ

Matrices and vectors. Matrix and vector. a 11 a 12 a 1n a 21 a 22 a 2n A = b 1 b 2. b m. R m n, b = = ( a ij. a m1 a m2 a mn. def

A study of geometric dependency of cepstrum on vocal tract length

ΕΚΤΙΜΗΣΗ ΤΟΥ ΚΟΣΤΟΥΣ ΤΩΝ ΟΔΙΚΩΝ ΑΤΥΧΗΜΑΤΩΝ ΚΑΙ ΔΙΕΡΕΥΝΗΣΗ ΤΩΝ ΠΑΡΑΓΟΝΤΩΝ ΕΠΙΡΡΟΗΣ ΤΟΥ

Extraction of Basic Patterns of Household Energy Consumption

Newman Modularity Newman [4], [5] Newman Q Q Q greedy algorithm[6] Newman Newman Q 1 Tabu Search[7] Newman Newman Newman Q Newman 1 2 Newman 3

Development of the Nursing Program for Rehabilitation of Woman Diagnosed with Breast Cancer

Technical Research Report, Earthquake Research Institute, the University of Tokyo, No. +-, pp. 0 +3,,**1. No ,**1

Δυνατότητα Εργαστηρίου Εκπαιδευτικής Ρομποτικής στα Σχολεία (*)

A Method of Trajectory Tracking Control for Nonminimum Phase Continuous Time Systems

[1] DNA ATM [2] c 2013 Information Processing Society of Japan. Gait motion descriptors. Osaka University 2. Drexel University a)

Εφαρμογές της τεχνολογίας επίγειας σάρωσης Laser στις μεταφορές

n 1 n 3 choice node (shelf) choice node (rough group) choice node (representative candidate)

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ

«ΑΓΡΟΤΟΥΡΙΣΜΟΣ ΚΑΙ ΤΟΠΙΚΗ ΑΝΑΠΤΥΞΗ: Ο ΡΟΛΟΣ ΤΩΝ ΝΕΩΝ ΤΕΧΝΟΛΟΓΙΩΝ ΣΤΗΝ ΠΡΟΩΘΗΣΗ ΤΩΝ ΓΥΝΑΙΚΕΙΩΝ ΣΥΝΕΤΑΙΡΙΣΜΩΝ»

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

HOSVD. Higher Order Data Classification Method with Autocorrelation Matrix Correcting on HOSVD. Junichi MORIGAKI and Kaoru KATAYAMA

Medium Data on Big Data

ICT use and literature courses in secondary Education: possibilities and limitations

IMES DISCUSSION PAPER SERIES

Ψηφιακή Επεξεργασία Φωνής

No. 7 Modular Machine Tool & Automatic Manufacturing Technique. Jul TH166 TG659 A

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΕΙΡΑΙΩΣ ΤΜΗΜΑ ΠΛΗΡΟΦΟΡΙΚΗΣ ΠΜΣ «ΠΡΟΗΓΜΕΝΑ ΣΥΣΤΗΜΑΤΑ ΠΛΗΡΟΦΟΡΙΚΗΣ» ΚΑΤΕΥΘΥΝΣΗ «ΕΥΦΥΕΙΣ ΤΕΧΝΟΛΟΓΙΕΣ ΕΠΙΚΟΙΝΩΝΙΑΣ ΑΝΘΡΩΠΟΥ - ΥΠΟΛΟΓΙΣΤΗ»

ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ (Τ.Ε.Ι.) ΠΕΙΡΑΙΑ ΣΧΟΛΗ ΔΙΟΙΚΗΣΗΣ ΚΑΙ ΟΙΚΟΝΟΜΙΑΣ ΤΜΗΜΑ ΔΙΟΙΚΗΣΗΣ ΕΠΙΧΕΙΡΗΣΕΩΝ ΚΑΤΕΥΘΥΝΣΗ: ΔΙΟΙΚΗΣΗΣ ΕΠΙΧΕΙΡΗΣΕΩΝ

Wavelet based matrix compression for boundary integral equations on complex geometries

ΠΕΡΙΕΧΟΜΕΝΑ. Κεφάλαιο 1: Κεφάλαιο 2: Κεφάλαιο 3:

ΠΑΝΔΠΗΣΖΜΗΟ ΠΑΣΡΩΝ ΓΗΑΣΜΖΜΑΣΗΚΟ ΠΡΟΓΡΑΜΜΑ ΜΔΣΑΠΣΤΥΗΑΚΩΝ ΠΟΤΓΩΝ «ΤΣΖΜΑΣΑ ΔΠΔΞΔΡΓΑΗΑ ΖΜΑΣΩΝ ΚΑΗ ΔΠΗΚΟΗΝΩΝΗΩΝ» ΣΜΖΜΑ ΜΖΥΑΝΗΚΩΝ Ζ/Τ ΚΑΗ ΠΛΖΡΟΦΟΡΗΚΖ

VBA Microsoft Excel. J. Comput. Chem. Jpn., Vol. 5, No. 1, pp (2006)

Αξιολόγηση των Φασματικού Διαχωρισμού στην Διάκριση Διαφορετικών Τύπων Εδάφους ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ. Σπίγγος Γεώργιος

Transcript:

i-vector 1 1 1 1 i-vector CSJ i-vector Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity Fukuchi Yusuke 1 Tawara Naohiro 1 Ogawa Tetsuji 1 Kobayashi Tetsunori 1 Abstract: We have developed a novel speaker clustering method by integrating highly accurate speaker representation and a clustering algorithm. The conventional method caused significant degradation in clustering accuracy when the number of utterances increased. High-accuracy speaker representation and high-performance clustering method are required to realize robust speaker clustering system against such a condition. For this purpose, we used i-vectors for the speaker representation, which contributes to the realization of high-accuracy speaker verification systems, and efficient non-negative matrix factorization for the clustering algorithm. Experimental results show that the proposed method outperforms the conventional methods, irrespective of the amount of data. Keywords: speaker clustering, i-vector, non-negative matrix factorization 1. Web [1] 1 Waseda University, Shinjuku, Tokyo 164 0042, Japan (Gaussian mixture model; GMM) [2], [3], [4], [] [6] c 2012 Information Processing Society of Japan 1

Bayesian information criteria; BIC) (agglomerative hierarchical clustering; AHC) k-means BIC (BIC-AHC) [4] (non-negative matrix factorization; NMF) [] (Gaussian mixture model: GMM) GMM cross likelihood ratio(clr) GMM BIC GMM i-vector k-means [6] i-vector BIC [4] i-vector k-means [7] GMM NMF [] 2 3 4 CSJ 6 2. 2.1 GMM Cross likelihood ratio (CLR) 2.2 i-vector 2.1 GMM X i GMM p(x i λ i ) = K K m i,k N (x µ i,k, Σ i,k ), ( m i,k = 1)(1) k=1 k=1 N ( ) 1 N (X i µ i,k, Σ i,k )= (2π) D 2 Σ i,k 1 2 exp ( 1 2 (x µ i,k) T Σ 1 i,k (x µ i,k) ). (2) λ i = { m i,k, µ i,k, Σ i,k } K k=1 i X i GMM µ i,k Σ i,k m i,k k 2 GMM CLR [] i j CLR Dij CLR = log p(x i λ i ) p(x i λ j ) + log p(x j λ j ) p(x j λ i ) (3) p(x i λ j ) λ j = { m j,k, µ j,k, Σ j,k } K k=1 GMM X i T i p(x i λ j ) = p(x i,t λ j ) (4) t=1 2.2 i-vector 2.2.1 i-vector u GMM M u (Total variability; TV) M u = m + T w u () m universal background model (UBM) GMM T TV c 2012 Information Processing Society of Japan 2

T Eigen voice [6] w u u i-vector u i-vector TV T u w u = (I + T t ΣN(u)T) 1 T t Σ 1 F(u) (6) N(u) F(u) 0 1 N c = F c = L P (c x t, Ω) (7) t=1 L P (c x t, Ω) (x c m c ) (8) t=1 C UBM Ω F c = 1,, C UBM m c c N(u) N c I(I R F F ) CF CF F(u) F c CF 1 Σ TV CF CF [8] i-vector (Fisher discriminant analysis; FDA) (within-class covariance normalization; WCCN) 2.2.2 i-vector FDA WCCN [6], [] FDA w A t w (w R d, A R d d, d < d) A S b S w S b v = λs w v (9) WCCN w B t w (w R d, B d d ) WCCN ( ) B s i w s i s w s S W = 1 S 1 n s n s=1 s i=1 (w s i w s )(w s i w s ) t () W 1 W 1 = BB t 2.2.3 i j i-vector Di,j cos z T i = z j z i z j (11) z u i-vector w u FDA WCCN u 3. BIC NMF 3.1 BIC BIC BIC 0 i,j BIC i,j X i X j BIC BIC 0 i,j = N i + N j 2 + α 2 log Σ 0 d(d + 1) (d + ) log(n i + N j ) (12) 2 BIC i,j = N i 2 log Σ i + N j 2 log Σ j + α 2 d(d + 1) (d + ) log(n i + N j ). (13) 2 Σ 0 2 Σ i Σ j N i N j d α i j BIC i,j = BIC 0 i,j BIC i,j (14) BIC i,j BIC i,j c 2012 Information Processing Society of Japan 3

BIC i,j (14) 3.2 U V R U U (Non-negative Matrix Factorization; NMF) V W R U S H R S U V WH (1) U S W H u s u ŝ u = arg max H s,u (16) s u = 1,, U (16) ŝ u 2.1 GMM CLR i-vector W H V [11] D(V WH) = ( V ) i,j V i,j log V ij + (WH) i,j (WH) i,j i,j (17) W H D(V WH) u W i,a W H a,uv i,u /(WH) i,u i,a u H (18) a,u s H a,j H H s,av s,j /(WH) s,j a,j s H (19) s,a V i,j W i,j H i,j W W H i j 4. 1 Method 1 Speaker representation Clustering algorithm BIC [4] single Gaussian BIC-AHC GMM-NMF [] GMM NMF IV-kM [7] i-vector k-means IV-NMF (proposed) i-vector NMF BIC [4] 3.1 GMM-NMF 2.1 GMM CLR 3.2 NMF BIC [] CLR NMF IV-kM i-vector k-means k-means [7] i-vector NMF i-vector (20) { 0, v < 0 f(v) = v 3, otherwise (20). 1.1.1.1 (CSJ) [12] 0 0 46 4 00 ms s s 0 c 2012 Information Processing Society of Japan 4

.1.2 BIC GMM-NMF 12 MFCC GMM- NMF GMM 8 8 GMM CLR 0 1. IV-kM IV-NMF UBM i-vector T FDA WCCN 183 28,171 UBM MFCC 12 MFCC 12 MFCC 12 36 12 i-vector 30 FDA 0 WCCN.1.3 NMF k-means ( ) BIC (12) (13) BIC α K ( 0 α = 6.8 α = 13.8) NMF H W 0.0001.1.4 (average speaker purity; ASP) (average cluster purity; ACP) K [13] p i q j p i = q j = n 2 ij n 2 j=0 i i=0 n 2 ij n 2 j (21) (22) S U n j j n i i n i j j i V ASP V ACP V ACP = 1 U p i n i (23) i=0 2 K BIC IV-kM GMM-NMF IV-NMF V ASP = 1 U # speakers # utterances K value 0.989 0 0.6 0.993 0 0.92 0.922 0 0.914 0.920 0 0.896 0.977 0 0.931 0.940 0 0.896 0.984 0 0.987 0.966 0 0.944 q j n j (24) j=0 K K = V ACP V ASP (2) K 1.2 2 BIC IV-kM GMM-NMF (IV-NMF) IV-NMF K IV-NMF GMM-NMF IV-kM BIC-AHC BIC-AHC 0 BIC-AHC BIC IV-kM GMM-NMF c 2012 Information Processing Society of Japan

BIC IV-kM IV-kM IV-NMF BIC 3 IV-kM IV- NMF i-vector BIC GMM-NMF i-vector NMF (GMM-NMF IV- NMF) 2 non-negative matrix factorization, Proc. Interspeech, pp.949-92, Aug. 2011. [6] N. Dehak et al., Front-end factor analysis for speaker verification, IEEE trans. Speech Audio Process., vol.19, no.4, pp.788-798, May 2011. [7] S. Shum et al., Exploiting intra-conversation variability for speaker diarization, Proc. Interspeech, pp.94-948, Aug. 2011. [8] P.Kenny, et al., Eigenvice modeling with sparse training data, IEEE trans. Speech Audio Process., vol.13, no. 3, pp.34-34 May 200. [9] P. Kenny et al., A study of interspeaker variability in speaker verification, IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no., pp. 980-988, Jul. 2008. [] A. Hatch et al., Within-class covariance normalization for SVM-based speaker recognition, Proc. ICSLP, pp.1471-1474, Sept. 2006. [11] D. D. Lee and H. S. Seung, Algorithms for non-negative matrix factorization, Proc. NIPS, pp.6-62, Nov. 2000. [12] K. Maekawa, Corpus of spontaneous Japanese: its design and evaluation, Proc. SSPR, pp.7-12, Apr. 2003. [13] A. Solomonoff et al., Clustering speakers by their voices, Proc. ICASSP, vol.2, pp.77-760, May 1998. 6. i-vector NMF [1] T.J. Reynolds, et al., Clustering via the Baysian information criterion with applications in speech recognition, Proc. ICASSP, vol.2, pp.64-648, May 1998. [2] D.A. Reynolds, et al., Blind Clustering of Speech Utterances based on Speaker and Language Characteristics, Proc. ISCLP, vol.7, pp.3193-3196, Nov. 1998. [3] D.A. Reynolds, et al., Speaker Verification Using Adapted Gaussian Mixture Models, Proc. Digital Signal Processing, vol., pp.19-41, Jan. 2000. [4] S. Chen and P. Gopalakrishan, Speaker, environment and channel change detection and clustering via the Bayesian information criterion, Proc. DARPA Broadcast News Transcription and Understanding Workshop, pp.127-132, May 1998. [] M. Nishida et al., Speaker clustering based on c 2012 Information Processing Society of Japan 6