Journal of South China University of Technology (Natural Science Edition), Vol. 40, No. 3, March 2012

Transcript:

Journal of South China University of Technology (Natural Science Edition)
Vol. 40, No. 3, March 2012

Article ID: 1000-565X(2012)03-0106-06
CLC number: TN912.3
doi: 10.3969/j.issn.1000-565X.2012.03.017
Postal code: 510640

Abstract (fragments): MFCC; K-L; 46.61%, 42.25%; 39.68%, 36.36%.

Introduction (fragments): SR PMC 1-2 7 CMN CMN 2 6% 8 3-5 Mel 1 3% 3 6 1 9 MFCC CMN RAS- TIMIT TA WM 2 0 6% 3 8-9

Received: 2011-08-14
* Project numbers: 60972132, 61101160, 9351064101000003, 10451064101004651, 2011ZM0029
Author notes: (1978-), E-mail: hejun_723@126.com; (1980-), E-mail: eeyxli@scut.edu.cn

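Fig. 1  Spectra of abnormal speech /ɑ/ and normal speech /ɑ/ (panels a and b). Frequency bands mentioned with the figure: 0~1.6 kHz, 1.6~4.8 kHz, 4.8 kHz, 6.0 kHz, 6.0~8.0 kHz.

Fragments of the surrounding text: CMN; MFCC; K-L (Kullback-Leibler Divergence) [10]; MFCC; GMM.

1.1 (MFCC, NSFT)

For the i-th speech s_i with n orders of MFCC, the feature is

$$F_i = [F_i^1, \ldots, F_i^k, \ldots, F_i^n]$$

where F_i^k is the k-th order MFCC of speech i:

$$F_i^k = [f^i_{1k}, f^i_{2k}, \ldots, f^i_{rk}]^T$$

and f^i_{rk} is the k-th order MFCC of frame r of speech i. For m speeches, the normal-speech feature template F_NSFT is

$$F_{NSFT} = \begin{bmatrix} F_1^1 & F_1^2 & \cdots & F_1^n \\ F_2^1 & F_2^2 & \cdots & F_2^n \\ \vdots & \vdots & & \vdots \\ F_m^1 & F_m^2 & \cdots & F_m^n \end{bmatrix} \quad (1)$$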
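The template layout in Eq. (1) can be sketched as a nested list. This is only an illustration under stated assumptions: each utterance is taken to be a list of frames, each frame a list of n MFCC values, and the helper names `order_column`, `build_template` and `pooled_order` are hypothetical, not from the paper. Orders are indexed from 0 here, whereas the paper counts from 1.

```python
def order_column(utterance, k):
    """F_i^k: the k-th order coefficient of every frame of one utterance."""
    return [frame[k] for frame in utterance]


def build_template(utterances):
    """Arrange m utterances as an m x n grid of per-order columns, as in Eq. (1)."""
    n = len(utterances[0][0])  # number of MFCC orders per frame
    return [[order_column(u, k) for k in range(n)] for u in utterances]


def pooled_order(template, k):
    """All k-th order values across the m utterances (column k of the grid)."""
    return [value for row in template for value in row[k]]


# Two toy utterances with 2 orders per frame (values are made up).
utterances = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
template = build_template(utterances)
second_order = pooled_order(template, 1)
```

Pooling one column of the grid gives the sample from which a per-order probability distribution can be estimated.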

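Fragments of the surrounding text: 1.6~4.8 kHz; 1.17 s; 826; 0~1.6 kHz; 15~20 s; MFCC; Fig. 2(b); K-L [10]; 12-order MFCC; 3 600; [11-12]; K-L; 3~5 s; 826; MFCC; Fig. 2(a); MFCC; K-L; Table 1; orders 1, 6, 9, 10, 11, 12 and 5, 7, 8; 826; NSFT; K-L.

Table 1  K-L distances from one abnormal speech and 826 abnormal speeches to NSFT

MFCC order              1     2     3     4     5     6     7     8     9     10    11    12
One abnormal speech     2.5   15.0  1.8   5.0   7.5   1.0   12.5  11.0  18.0  12.0  8.0   95.0
826 abnormal speeches   6.5   6.0   6.5   6.4   5.0   11.0  2.5   4.0   12.0  11.0  18.0  13.0

Fig. 2  12-order MFCC feature probability distribution of normal speech and one abnormal speech

2.1 (K-L)

For two probability distributions p(x) and q(x), the K-L distance is

$$D_{K-L}(p \| q) = \int p(x) \lg \frac{p(x)}{q(x)} \, dx \quad (2)$$

With N discrete points, Eq. (2) becomes

$$D_{K-L}(p \| q) = \sum_{i=1}^{N} p(x_i) \lg \frac{p(x_i)}{q(x_i)} \quad (3)$$

Let p_k be the probability distribution of the k-th order MFCC F^k in F_NSFT, and q_k the probability distribution of the k-th order MFCC of the test speech s_c. The K-L distance of the k-th order MFCC is

$$d^k_{K-L}(q_k \| p_k) = \sum_{i=1}^{N} q_k(x_i) \lg \frac{q_k(x_i)}{p_k(x_i)} \quad (4)$$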
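The discrete K-L distance of Eqs. (3) and (4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes "lg" means the base-10 logarithm, estimates each distribution with an equal-width histogram, and adds a small smoothing constant so that empty bins do not cause division by zero. The function names, bin range and smoothing value are hypothetical.

```python
import math


def histogram(values, lo, hi, bins, smooth=1e-6):
    """Estimate a smoothed discrete distribution over equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    total = len(values) + smooth * bins
    return [(c + smooth) / total for c in counts]


def kl_distance(q, p):
    """Discrete K-L distance sum_i q_i * lg(q_i / p_i), with a base-10 log."""
    return sum(qi * math.log10(qi / pi) for qi, pi in zip(q, p) if qi > 0)


# Toy k-th order MFCC samples (made-up values).
p_k = histogram([0.1, 0.2, 0.3, 0.8], 0.0, 1.0, 4)  # template side
q_k = histogram([0.7, 0.8, 0.9, 0.2], 0.0, 1.0, 4)  # test-speech side
d_k = kl_distance(q_k, p_k)
```

The distance is zero when the two distributions coincide and grows as the test distribution moves away from the template, which is the property the weighting step relies on.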

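The K-L distances of the n MFCC orders of the test speech to F_NSFT form the vector

$$D^c_{K-L} = [d^1_{K-L}, \ldots, d^k_{K-L}, \ldots, d^n_{K-L}] \quad (5)$$

where d^k_{K-L} is the K-L distance of the k-th order MFCC. The K-L weighting factor of the k-th order MFCC is

$$W^k_{K-L} = a_{K-L} + \varepsilon \delta(x) / d^k_{K-L} \quad (6)$$

$$\delta(x) = \begin{cases} 1, & x \ge \mathrm{median}(D^c_{K-L}) \\ -1, & x < \mathrm{median}(D^c_{K-L}) \end{cases}$$

where median(D^c_{K-L}) is the median of D^c_{K-L}, and a_{K-L} and ε are weighting parameters. The weighted feature is

$$F_{Nk} = W^k_{K-L} F_{Ok} \quad (7)$$

where F_{Nk} and F_{Ok} are the k-th order MFCC after and before weighting.

2.2 (Euclidean distance)

The Euclidean distance between two M-dimensional vectors x_i and x_j is

$$d_E(x_i, x_j) = \sqrt{\sum_{m'=1}^{M} (x_{im'} - x_{jm'})^2} \quad (8)$$

Let F_c be the MFCC feature of the test speech s_c, A_i the i-th order of F_c and B_i the i-th order of F_NSFT. The distance from element A_i^j to B_i is

$$d_j(A_i^j, B_i) = \min_{k = 1, \ldots, M'} \left( \| A_i^j - B_i^k \| \right) \quad (9)$$

where M' is the number of elements of B_i, and the distance between A_i and B_i is

$$d^i_E(A_i, B_i) = \frac{1}{N'} \sum_{j=1}^{N'} d_j(A_i^j, B_i) \quad (10)$$

where N' is the number of elements of A_i. The distances of all orders to F_NSFT form

$$D^c_E = [d^1_E, \ldots, d^i_E, \ldots, d^n_E] \quad (11)$$

and the Euclidean weighting factor is

$$W^i_E = a_E + \varepsilon \delta(x) / d^i_E \quad (12)$$

Fragments of the surrounding text (Sections 2 and 3): 863; MEEI; SNR 37~55 dB; yep120; 60; PANSD; 3600; 1 27; 0~9; 863; 20; a_{K-L} 8~16; ε; 362; 317; 1253; PANSD 2010-03, 2011-08; 17; 20~35; Toplux TVP208; 22.05 kHz; 16; 3~5; 5~20 cm; 20~25 min; PANSD 700 min; PANSD 400; 3~4 s; 7 557; 826; 15~20 s; 1~2 min; GMM; WAV; Cooledit Pro 2.0; 16 kHz.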
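The weighting step of Eqs. (6) and (7) and the Euclidean distances of Eqs. (8) to (10) can be sketched as below. This is one reading of the extracted formulas, not a confirmed implementation: it assumes δ is evaluated at each order's own distance, that the branch at the median is "greater than or equal", that Eq. (10) is a mean over the test elements, and that all distances are strictly positive. The parameter values a = 8.0 and ε = 0.5 in the example are illustrative only, and all function names are hypothetical.

```python
import math
from statistics import median


def delta(x, med):
    """Sign term of Eq. (6): +1 at or above the median, -1 below (assumed)."""
    return 1.0 if x >= med else -1.0


def weights(distances, a, eps):
    """Per-order factor a + eps * delta(d) / d; requires every d > 0."""
    med = median(distances)
    return [a + eps * delta(d, med) / d for d in distances]


def apply_weights(frames, w):
    """Eq. (7): scale the k-th order coefficient of every frame by w[k]."""
    return [[wk * f for wk, f in zip(w, frame)] for frame in frames]


def euclidean(x, y):
    """Eq. (8): Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def mean_nearest_distance(test_items, template_items):
    """Eqs. (9)-(10): average distance from each test item to its nearest
    template item (the averaging is an assumed reading of Eq. (10))."""
    nearest = [min(euclidean(a, b) for b in template_items) for a in test_items]
    return sum(nearest) / len(nearest)


w = weights([1.0, 2.0, 4.0], a=8.0, eps=0.5)
weighted = apply_weights([[1.0, 2.0, 3.0]], w)
```

Under this reading, orders whose distance falls below the median get a factor slightly under a, and the others slightly above it.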

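Fragments of the feature-extraction settings: 16; 32 ms; 16 ms; 24; MFCC; 12; 826 abnormal speeches (394, 154, 278); 2~3 min.

Table 2  Comparison of speaker recognition rates for abnormal speech (%)

Algorithm           Subset 1   Subset 2   Subset 3   Overall
Algorithm of [9]    67.26      10.38      11.87      39.68
Unweighted          66.24      10.39      9.71       36.36
K-L-W               83.77      17.53      10.07      46.61
E-W                 77.91      10.38      10.43      42.25

Fig. 3  Flowchart of the proposed algorithm

Fig. 4  Influence of weighting parameter on speaker recognition rate for abnormal speech

With ε = 0.5, the recognition rate of K-L-W reaches 46.61%, which is 10.25 percentage points above the unweighted algorithm, and 4.36 and 6.93 percentage points above E-W and the algorithm of [9], respectively.

Normal speech (fragments): K-L-W, E-W; E-W 98.56%; [9] 98.54%; 98.38%; K-L 98.02%.

Fragments of the closing text: K-L; GMM; K-L-W; E-W; K-L-W; E-W; GMM; K-L.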

Conclusion (fragments): 46.61%, 42.25%; 39.68%, 36.36%.

References

[1] Rashid R A, Mahalin N H, Sarijari M A, et al. Security system using biometric technology: design and implementation of voice recognition system [C]. Proceedings of International Conference on Computer and Communication Engineering. Kuala Lumpur: IEEE, 2008: 898-902.
[2] Yang Ji-cheng, He Qian-hua, Pan Wei-qiang. Modified BIC algorithm of speaker change detection [J]. Journal of South China University of Technology (Natural Science Edition), 2009, 37(9): 47-51.
[3] Zhang Lei, Han Ji-qing, Wang Cheng-fa. Research progress of stress speech processing [J]. Acta Electronica Sinica, 2003, 31(3): 411-418.
[4] Alpan A, Maryn Y, Kacha A, et al. Multi-band dysperiodicity analyses of disordered connected speech [J]. Speech Communication, 2011, 53(1): 131-141.
[5] Maciel C D, Pereira J C, Stewart D. Identifying healthy and pathologically affected voice signals [J]. IEEE Signal Processing Magazine, 2010, 27(1): 120-123.
[6] Togneri R, Pullella D. An overview of speaker identification: accuracy and robustness issues [J]. Circuits and Systems Magazine, 2011, 11(2): 23-61.
[7] Garner Philip N. Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition [J]. Speech Communication, 2011, 53(8): 991-1001.
[8] Yang Hong-wu, Liu Ya-li, Huang De-zhi. Speaker recognition based on weighted Mel-cepstrum [C]. Proceedings of the Fourth International Conference on Computer Sciences and Convergence Information Technology. Seoul: IEEE, 2009: 200-203.
[9] Weng Zufeng, Li Lin, Guo Donghui. Speaker recognition using weighted dynamic MFCC based on GMM [C]. Proceedings of International Conference on Anti-Counterfeiting Security and Identification in Communication. Chengdu: IEEE, 2010: 285-288.
[10] Kullback S, Leibler R. On information and sufficiency [J]. Annals of Mathematical Statistics, 1951, 30(3): 79-86.
[11] You Chang Huai, Lee Kong Aik, Li Haizhou. GMM-SVM kernel with a Bhattacharyya-based distance for speaker recognition [J]. IEEE Transactions on Audio, Speech and Language Processing, 2010, 18(6): 1300-1312.
[12] Ferrante A, Ramponi F, Ticozzi F. On the convergence of an efficient algorithm for Kullback-Leibler approximation of spectral densities [J]. IEEE Transactions on Automatic Control, 2011, 56(3): 506-515.

Speaker Recognition Algorithm for Abnormal Speech Based on Abnormal Feature Weighting

He Jun, Li Yan-xiong, He Qian-hua, Li Wei
(School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China)

Abstract: As the commonly used weighting algorithm is inefficient in tracking the abnormal features of abnormal speech, a speaker recognition algorithm for abnormal speech is proposed based on abnormal feature weighting. In this algorithm, first, a feature template of normal speech is established by computing the probability distribution of the MFCC features of each order in a large number of normal speech samples. Then, the K-L distance and the Euclidean distance are used to measure the differences between a given test speech and the normal speech template, and from these to determine the K-L and the Euclidean weighting factors. Finally, the two weighting factors are used to weight the MFCC features of the test speech, and the weighted MFCC features are fed into a Gaussian mixture model for speaker recognition with abnormal speech. Experimental results show that the global recognition rates of the speaker recognition algorithms based on the K-L weighting and the Euclidean weighting are 46.61% and 42.25%, respectively, while those of the algorithms with and without weighting by the speaker recognition contribution of each order feature are only 39.68% and 36.36%, respectively.

Key words: abnormal speech; speaker recognition; abnormal feature weighting; K-L distance; weighting factor