JOURNAL OF HARBIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
Vol. 21 No. 6, Dec. 2016

Area Location and Recognition of Video Text Based on Depth Learning Method

LIU Ming-zhu 1, ZHENG Yun-fei 1, FAN Jin-fei 1, YU Fang 2

(1. School of Measure-control Technology and Communications Engineering, Harbin University of Science and Technology, Harbin 150080, China;
2. Dehui Education Technology Service Center of Jilin Province, Dehui 130300, China)

Abstract: Fast and accurate location and recognition of text areas in video images improves the efficiency and accuracy of video information processing. A Gabor filter is used to extract the texture features of video images in four directions: horizontal, vertical, left-falling and right-falling. A deep belief network is then built with a layer-by-layer RBM deep learning algorithm and used to locate the text regions in the texture feature images. The paper also studies the feasibility and recognition performance of combining morphological processing with an OCR character database to recognize the text in video images. Test results show that the proposed optimized deep learning algorithm, combined with the morphology-based character recognition method, not only locates the text regions of video images accurately but also improves the efficiency and accuracy of character recognition.

Keywords: depth learning algorithm; video image; text area location; morphological denoising; character recognition

DOI: 10.15938/j.jhust.2016.06.012
CLC number: TP391.43    Document code: A    Article ID: 1007-2683(2016)06-0061-06

Received: 2015-06-29
Foundation item: 61401126
E-mail: lmz@hrbust.edu.cn
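Related work on text location and recognition is reported in [1-6], including the 2D-Gabor function of Daugman [5]. The method of this paper extracts texture features with Gabor filters, locates the text regions with a deep belief network (DBN) built from restricted Boltzmann machines (RBM), and recognizes the located text with morphological processing and OCR.

The Gabor function and its Fourier transform are

\[
g(x, y) = K \exp\!\left\{-\pi\left[p^{2}(x - x_0)^{2} + q^{2}(y - y_0)^{2}\right]\right\}
\exp\!\left\{-2\pi j\left[u_0 (x - x_0) + v_0 (y - y_0)\right]\right\}
\tag{1}
\]

\[
F(u, v) = \frac{K}{pq} \exp\!\left\{-\pi\left[\frac{(u - u_0)^{2}}{p^{2}} + \frac{(v - v_0)^{2}}{q^{2}}\right]\right\}
\exp\!\left\{-2\pi j\left[x_0 (u - u_0) + y_0 (v - v_0)\right]\right\}
\tag{2}
\]

where K is the amplitude of the Gaussian envelope, (x_0, y_0) is the center of the Gaussian envelope, (u_0, v_0) is the center frequency, and p and q are the scale parameters of the Gaussian envelope.

The parameters p and q are obtained from [7-8]

\[
\lambda = \left(\frac{U_h}{U_l}\right)^{\frac{1}{M-1}}, \qquad
p = \frac{(\lambda - 1)\,U_h}{(\lambda + 1)\sqrt{2\ln 2}}, \qquad
q = \tan\!\left(\frac{\pi}{2T}\right)
\left[U_h - 2\ln 2\,\frac{p^{2}}{U_h}\right]
\left[2\ln 2 - \frac{(2\ln 2)^{2}\,p^{2}}{U_h^{2}}\right]^{-\frac{1}{2}}
\tag{3}
\]

where U_l and U_h are the lowest and highest center frequencies, T is the number of orientations, M is the number of scales and λ is the scale factor. The values used are U_l = 0.2, U_h = 0.4, T = 4 and M = 4, which gives Gabor filters in 4 directions.

Wang et al. [9] describe an optimized Gabor filter in terms of the parameters λ and η.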
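The parameter formulas of Eq. (3) and the Gabor function of Eq. (1) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The assumptions are: the envelope is centered at (x_0, y_0) = (0, 0); the four orientations are θ_k = kπ/T with center frequency (u_0, v_0) = (U_h cos θ_k, U_h sin θ_k); the kernel size of 51 pixels is arbitrary; filtering is done by circular convolution in the frequency domain. The function names `gabor_params`, `gabor_kernel`, `gabor_bank` and `texture_features` are hypothetical.

```python
import numpy as np

def gabor_params(u_l=0.2, u_h=0.4, T=4, M=4):
    """Scale factor lambda and envelope parameters p, q following Eq. (3)."""
    two_ln2 = 2.0 * np.log(2.0)
    lam = (u_h / u_l) ** (1.0 / (M - 1))
    p = (lam - 1.0) * u_h / ((lam + 1.0) * np.sqrt(two_ln2))
    q = (np.tan(np.pi / (2.0 * T))
         * (u_h - two_ln2 * p ** 2 / u_h)
         / np.sqrt(two_ln2 - two_ln2 ** 2 * p ** 2 / u_h ** 2))
    return lam, p, q

def gabor_kernel(size, p, q, u0, v0, K=1.0):
    """Complex Gabor kernel of Eq. (1), with (x0, y0) assumed to be (0, 0)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = K * np.exp(-np.pi * (p ** 2 * x ** 2 + q ** 2 * y ** 2))
    carrier = np.exp(-2j * np.pi * (u0 * x + v0 * y))
    return envelope * carrier

def gabor_bank(size=51, u_l=0.2, u_h=0.4, T=4, M=4):
    """One kernel per orientation theta_k = k*pi/T (assumed orientation scheme)."""
    _, p, q = gabor_params(u_l, u_h, T, M)
    thetas = [k * np.pi / T for k in range(T)]
    return [gabor_kernel(size, p, q, u_h * np.cos(t), u_h * np.sin(t))
            for t in thetas]

def texture_features(image, bank):
    """Magnitude responses, one map per kernel (circular convolution via FFT)."""
    spectrum = np.fft.fft2(image)
    maps = [np.abs(np.fft.ifft2(spectrum * np.fft.fft2(k, s=image.shape)))
            for k in bank]
    return np.stack(maps)
```

Calling `texture_features(gray_image, gabor_bank())` on a 2-D grayscale array returns an array of shape (4, H, W), one texture map per orientation. A production pipeline would more likely use a linear (non-circular) convolution and a kernel size matched to 1/p.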
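A depth belief network (DBN) [10] is built by stacking restricted Boltzmann machines (RBM). Consider a system S with n layers (S_1, S_2, ..., S_n), input I and output O:

\[
I \Rightarrow S_1 \Rightarrow S_2 \Rightarrow \cdots \Rightarrow S_n \Rightarrow O
\]

If the output O equals the input I, that is O = I, no information is lost in passing through the system, and every layer S_i is another representation of the input I.

An RBM consists of a visible layer v and a hidden layer h whose units take the values 0 or 1. The joint distribution P(v, h) follows the Boltzmann distribution with the energy function

\[
E(v, h; \theta) = -\sum_{ij} W_{ij} v_i h_j - \sum_{i} b_i v_i - \sum_{j} a_j h_j
\tag{4}
\]

where θ = {W, a, b} are the parameters of the RBM. The joint probability of v and h is

\[
P_\theta(v, h) = \frac{1}{Z(\theta)} \exp\!\left(-E(v, h; \theta)\right)
= \frac{1}{Z(\theta)} \prod_{ij} e^{W_{ij} v_i h_j} \prod_{i} e^{b_i v_i} \prod_{j} e^{a_j h_j},
\qquad
Z(\theta) = \sum_{h, v} \exp\!\left(-E(v, h; \theta)\right)
\tag{5}
\]

The conditional distributions factorize over the units:

\[
P(v \mid h) = \prod_{i} P(v_i \mid h), \qquad
P(v_i = 1 \mid h) = \frac{1}{1 + \exp\!\left(-\sum_{j} W_{ij} h_j - b_i\right)}
\tag{6}
\]

\[
P(h \mid v) = \prod_{j} P(h_j \mid v), \qquad
P(h_j = 1 \mid v) = \frac{1}{1 + \exp\!\left(-\sum_{i} W_{ij} v_i - a_j\right)}
\tag{7}
\]

Given a training set D = {v^{(1)}, v^{(2)}, ..., v^{(N)}}, the parameters θ = {W, a, b} are learned by maximizing the regularized log-likelihood

\[
L(\theta) = \frac{1}{N} \sum_{n=1}^{N} \log p_\theta\!\left(v^{(n)}\right) - \frac{\lambda}{N} \lVert W \rVert_F^{2}
\tag{8}
\]

whose gradient with respect to W_{ij} is

\[
\frac{\partial L(\theta)}{\partial W_{ij}} = E_{P_{\mathrm{data}}}\!\left[v_i h_j\right] - E_{P_\theta}\!\left[v_i h_j\right] - \frac{2\lambda}{N} W_{ij}
\tag{9}
\]

Here E_{P_data}[v_i h_j] is the expectation under the data distribution and E_{P_θ}[v_i h_j] is the expectation under the distribution defined by the RBM; for the treatment of the latter see Hinton and Sejnowski [11].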
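The conditionals of Eqs. (6) and (7) and the gradient of Eq. (9) can be illustrated with a small NumPy sketch. The source does not say here how the model expectation is approximated, so the sketch assumes one step of contrastive divergence (CD-1), which is a common choice and not necessarily the authors' procedure. The names `p_h_given_v`, `p_v_given_h` and `cd1_step`, the learning rate `lr` and the default weight-decay value are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v, W, a):
    """Eq. (7): P(h_j = 1 | v) for a batch v of shape (N, n_visible)."""
    return sigmoid(v @ W + a)

def p_v_given_h(h, W, b):
    """Eq. (6): P(v_i = 1 | h) for a batch h of shape (N, n_hidden)."""
    return sigmoid(h @ W.T + b)

def cd1_step(v0, W, a, b, lr=0.1, lam=1e-4, rng=None):
    """One gradient-ascent step on Eq. (8), using CD-1 (assumed) for Eq. (9)."""
    rng = np.random.default_rng() if rng is None else rng
    N = v0.shape[0]

    # Positive phase: statistics under the data distribution.
    ph0 = p_h_given_v(v0, W, a)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one Gibbs step stands in for the model expectation.
    pv1 = p_v_given_h(h0, W, b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = p_h_given_v(v1, W, a)

    # Eq. (9): data term minus model term minus the weight-decay term.
    grad_W = (v0.T @ ph0 - v1.T @ ph1) / N - (2.0 * lam / N) * W
    grad_a = (ph0 - ph1).mean(axis=0)   # hidden biases a_j
    grad_b = (v0 - v1).mean(axis=0)     # visible biases b_i
    return W + lr * grad_W, a + lr * grad_a, b + lr * grad_b
```

Here `W` has shape (n_visible, n_hidden), `a` holds the hidden biases and `b` the visible biases, matching the roles of a_j and b_i in Eq. (4). Repeating `cd1_step` over mini-batches trains one RBM.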
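The DBN is trained layer by layer as a stack of RBMs [12]. Let the DBN have the layers H_0, H_1, H_2, ..., H_n, where H_0 is the input layer, and let the weights between adjacent layers be W_i = {W_0, W_1, W_2, ..., W_{n-1}}. The layers H_0 and H_1 are first treated as an RBM, with H_0 as the visible layer v and H_1 as the hidden layer h, and this RBM is trained with Eqs. (8) and (9) to obtain W_0. With W_0 fixed, H_1 is computed from H_0 by Eq. (7). H_1 then serves as the visible layer of the next RBM, which gives W_1, and the procedure is repeated to obtain W_2, ..., W_{n-1}.

[Figure 3: structure of the DBN]
[Figure 4: text region location by the DBN]

The located text regions are then processed with morphological operations [13]. Let A and C be sets in Z^2, where A is the image and C is the structuring element. The erosion of A by C is

\[
A \,\Theta\, C = \left\{ z \mid (C)_z \subseteq A \right\}
\tag{10}
\]

and the dilation of A by C is

\[
A \oplus C = \left\{ z \mid (\hat{C})_z \cap A \neq \Phi \right\}
\tag{11}
\]

where (C)_z is the translation of C by z, \hat{C} is the reflection of C, and Φ is the empty set.

[Figure 5: results of the morphological processing, panels (a) to (c)]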
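The erosion of Eq. (10) and the dilation of Eq. (11) can be written directly from their set definitions. The sketch below is an illustration under stated assumptions, not the authors' code: the image A and the structuring element C are binary NumPy arrays, C has odd height and width with its origin at the center, and pixels outside the image count as background. Combining the two operations into an opening for denoising is also an assumption made for the example; the function names are hypothetical.

```python
import numpy as np

def _padded(A, C):
    """Pad A with background so every translate of C stays inside the array."""
    ph, pw = C.shape[0] // 2, C.shape[1] // 2
    return np.pad(A.astype(bool), ((ph, ph), (pw, pw)), constant_values=False)

def erode(A, C):
    """Eq. (10): keep z only if C translated to z lies entirely inside A."""
    H, W = A.shape
    padded = _padded(A, C)
    out = np.ones((H, W), dtype=bool)
    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            if C[i, j]:
                out &= padded[i:i + H, j:j + W]
    return out

def dilate(A, C):
    """Eq. (11): keep z if the reflected C translated to z overlaps A."""
    H, W = A.shape
    padded = _padded(A, C)
    reflected = C[::-1, ::-1]
    out = np.zeros((H, W), dtype=bool)
    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            if reflected[i, j]:
                out |= padded[i:i + H, j:j + W]
    return out

def opening(A, C):
    """Erosion followed by dilation; removes specks smaller than C."""
    return dilate(erode(A, C), C)
```

For example, with a 3 x 3 structuring element of ones, `opening` removes an isolated foreground pixel while leaving a solid 3 x 3 block unchanged, which is the behavior wanted when small noise must be removed before character recognition.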
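The text regions located by the DBN are passed to OCR for character recognition.

[Figures 5 to 7: text region location by the DBN and OCR recognition results]

The proposed method is compared with the method of [14], the method of Kim et al. [15] and the SVM method [16]. The results are evaluated with RR, PR and F:

\[
RR = \frac{c}{m}, \qquad PR = \frac{c}{n}, \qquad F = \frac{2 \cdot PR \cdot RR}{PR + RR}
\tag{12}
\]

where c, m and n are the counts listed in the tables.

Table 1 gives the results for DBNs with 2 to 6 layers. The 4-layer DBN (4-DBN) is used in the comparison of Table 2.

Table 1  Results for different numbers of DBN layers

Method    n     m     c     RR/%     PR/%     F
2-DBN     378   302   224   74.17    59.26    65.88
3-DBN     378   334   251   75.15    66.40    70.50
4-DBN     378   364   295   81.04    78.04    79.51
5-DBN     378   369   301   81.57    79.62    80.58
6-DBN     378   372   304   81.72    80.42    81.06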
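Eq. (12) can be checked against the tabulated values with a few lines of Python. The function name `evaluate` is hypothetical. The reading of the counts is an assumption: n is 378 in every row, so it appears to be the total number of text regions in the test set, with m the number of regions reported by a method and c the number of those that are correct.

```python
def evaluate(n, m, c):
    """RR, PR and F of Eq. (12), returned as fractions in [0, 1]."""
    rr = c / m
    pr = c / n
    f = 2.0 * pr * rr / (pr + rr)
    return rr, pr, f

# 4-DBN row of Table 1: n = 378, m = 364, c = 295
rr, pr, f = evaluate(378, 364, 295)
print(round(100 * rr, 2), round(100 * pr, 2), round(100 * f, 2))
# prints 81.04 78.04 79.51
```

The output reproduces the 4-DBN row, which confirms that the two percentage columns are RR and PR in that order and that F is their harmonic mean.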
Table 2  Comparison with other methods

Method            n     m     c     RR/%     PR/%     F
Method of [14]    378   305   229   75.81    60.58    67.34
Kim [15]          378   327   253   77.37    66.93    71.77
SVM [16]          378   342   276   80.70    73.02    76.67
4-DBN             378   364   295   81.04    78.04    79.51

Table 2 shows that the 4-DBN method reaches a higher F value than the method of [14], the method of Kim et al. and the SVM method. Combining Gabor texture features, DBN-based text region location and OCR therefore improves both the location of text regions and the recognition of the characters in video images.

References

[1] ... [J]. ..., 2005, 28(3): 427-432.
[2] EPSHTEIN B, OFEK E, WEXLER Y. Detecting Text in Natural Scenes with Stroke Width Transform [C]// 2010 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, USA: IEEE Computer Society, 2010: 2963-2970.
[3] ... SVM ... [J]. ..., 2010, 31(4): 916-922.
[4] CHEN X R, YUILLE A L. Detecting and Reading Text in Natural Scenes [C]// 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington D C, USA: Institute of Electrical and Electronics Engineers Computer Society, 2004: 366-373.
[5] DAUGMAN J G. Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression [J]. IEEE Transactions on Acoustics, Speech and Signal Processing, 1988, 36(7): 1169-1179.
[6] ... [J]. ..., 2005, 10(1): 122-124.
[7] KAMARAINEN Joni, KYRKI Ville, KÄLVIÄINEN Heikki. Fundamental Frequency Gabor Filters for Object Recognition [J]. International Conference on Pattern Recognition, 2002, 16(1): 628.
[8] FU P, LI M, YIN T. Gabor Filter Based Text Extraction from Digital Document Image [J]. Tien Tzu Hsueh Pao / Acta Electronica Sinica, 2006, 34: 2387-2390.
[9] WANG X W, DING X Q, LIU C S. Optimized Gabor Filter Based Feature Extraction for Character Recognition [C]// 16th International Conference on Pattern Recognition, Quebec City, Canada: Institute of Electrical and Electronics Engineers Inc, 2002: 223-234.
[10] ... [J]. ..., 2015, 45(2): 596-599.
[11] HINTON G E, OSINDERO S, BAO K. Reducing the Dimensionality of Data with Neural Networks [C]// Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics, Barbados: Society for Artificial Intelligence and Statistics, United States, 2005: 128-135.
[12] HINTON G E. A Practical Guide to Training Restricted Boltzmann Machines [J]. Lecture Notes in Computer Science, 2012: 599-619.
[13] DENG C X, CHEN Y, BI H, et al. The Improved Algorithm of Edge Detection Based on Mathematics Morphology [J]. International Journal of Signal Processing, Image Processing and Pattern Recognition, 2014, 7(5): 309-322.
[14] JUNG K. Neural Network-based Text Location in Color Images [J]. Pattern Recognition Letters, 2001, 22(14): 1503-1515.
[15] KIM K C, BYUN H R, SONG Y J, et al. Scene Text Extraction in Natural Scene Images Using Hierarchical Feature Combining and Verification [C]// Proceedings of the 17th International Conference on Pattern Recognition, Cambridge, United Kingdom: Institute of Electrical and Electronics Engineers Inc, 2004: 679-682.
[16] YAN J Q, LI J, GAO X B. Chinese Text Location Under Complex Background Using Gabor Filter and SVM [J]. Neurocomputing, 2011, 74(17): 2998-3008.