Robust Feature Extraction Method Based on Run-Length Compensation for Degraded Character Recognition

Minoru MORI, Minako SAWAKI, Norihiro HAGITA, Hiroshi MURASE, and Naoki MUKAWA

NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato-Wakamiya, Atsugi-shi, 243-0198 Japan

IEICE Trans. D-II, vol.J86-D-II, no.7, pp.1049-1057, July 2003
The rest of this paper is organized as follows. Sect. 2 describes the run-length compensation method, Sect. 3 the detection of the noise type, and Sect. 4 the feature extraction. Sect. 5 reports recognition experiments using the ETL-9 database [16], and Sect. 6 concludes the paper.

2. Run-Length Compensation

2.1 Extraction of parameters

Let G be a binarized character image of width W, and let g_{x,y} denote the value of the pixel at (x, y), where 1 is black and 0 is white. A rectangular window of I pixels is scanned over the image (Fig. 1), and four parameters a, b, c, and e are extracted by counting the patterns of adjacent pixel pairs within the window:

  a = \sum_{k=0}^{I-2} g_{x+k,y} \, g_{x+k+1,y}    (1)

  b = \sum_{k=0}^{I-2} (1 - g_{x+k,y}) \, g_{x+k+1,y}    (2)

  c = \sum_{k=0}^{I-2} g_{x+k,y} \, (1 - g_{x+k+1,y})    (3)

  e = \sum_{k=0}^{I-2} (1 - g_{x+k,y}) \, (1 - g_{x+k+1,y})    (4)

That is, a counts black-black pairs, b white-to-black transitions, c black-to-white transitions, and e white-white pairs of adjacent pixels, as illustrated in Fig. 1.

2.2 Compensation for additive noise

When additive noise overlaps a character stroke, the observed run length r_o is the sum of the true run length r_t and the run length r_n caused by the noise:

  r_o = r_t + r_n    (5)

Fig. 1  Extraction of parameters from the input image. A rectangular window of I pixels is scanned along the row; transitions between black and white pixels within the window are counted.
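As a concrete illustration, the adjacent-pair counts of Eqs. (1)-(4) can be computed as follows. This is a minimal NumPy sketch; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def transition_params(row, x, I):
    """Count the four adjacent-pixel patterns of Eqs. (1)-(4) in a
    window of I pixels starting at column x of a 0/1 pixel row.
    Illustrative sketch; names are not from the paper."""
    w = np.asarray(row[x:x + I], dtype=int)
    g, h = w[:-1], w[1:]                 # g_{x+k,y}, g_{x+k+1,y} for k = 0..I-2
    a = int(np.sum(g * h))               # black-black pairs, Eq. (1)
    b = int(np.sum((1 - g) * h))         # white-to-black transitions, Eq. (2)
    c = int(np.sum(g * (1 - h)))         # black-to-white transitions, Eq. (3)
    e = int(np.sum((1 - g) * (1 - h)))   # white-white pairs, Eq. (4)
    return a, b, c, e
```

Note that the four counts always sum to I - 1, the number of adjacent pixel pairs in the window.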
From (5), the true run length is

  r_t = r_o - r_n = (1 - r_n/r_o) \, r_o    (6)

where the noise ratio r_n/r_o satisfies

  0 \le r_n/r_o \le 1.    (7)

This ratio is estimated from the parameters a, b, c, and e of Sect. 2.1. Let

  \beta_{add} = K_{add} / V_{add}.    (8)

\beta_{add} equals 1 for noise-free patterns and decreases toward 0 as the additive noise increases, so the noise ratio is estimated as

  r_n/r_o \approx 1 - \beta_{add}.    (9)

Substituting (9) into (6) gives the compensated run length

  r'_t = (1 - (1 - \beta_{add})) \, r_o = \beta_{add} \, r_o.    (10)

Here \beta_{add} is computed from a, b, c, e and their averages \bar{a}, \bar{b}, \bar{c}, \bar{e} over noise-free training patterns:

  K_{add} = (a + b) / (\bar{a} + \bar{b})    (11)

  V_{add} = (b + c) / (\bar{b} + \bar{c})    (12)

The observed run length is obtained from the parameters as

  r_o = a + b.    (13)

From (8) and (10)-(13), the compensated run length r'_t is

  r'_t = \frac{(a + b)/(\bar{a} + \bar{b})}{(b + c)/(\bar{b} + \bar{c})} \, (a + b).    (14)

2.3 Compensation for subtractive noise

Subtractive noise deletes stroke pixels, so the observed run length is shorter than the true one:

  r_o = r_t - r_n    (15)

and the true run length is

  r_t = r_o + r_n = (1 + r_n/r_o) \, r_o.    (16)

Analogously to (8), let

  \beta_{sub} = K_{sub} / V_{sub}    (17)

  K_{sub} = (e + c) / (\bar{e} + \bar{c})    (18)

  V_{sub} = (b + c) / (\bar{b} + \bar{c})    (19)

and estimate the noise ratio as

  r_n/r_o \approx 1 - \beta_{sub}.    (20)

The compensated run length is then

  r'_t = \left( 2 - \frac{(e + c)/(\bar{e} + \bar{c})}{(b + c)/(\bar{b} + \bar{c})} \right) (a + b).    (21)
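Assuming the noise-free averages ā, b̄, c̄, ē are available from training data, the compensation formulas (14) and (21) can be sketched as follows; the function names and parameter packaging are illustrative, not from the paper.

```python
def compensate_additive(a, b, c, e, ab_bar, bc_bar):
    """Compensated run length under additive noise, Eq. (14):
    r't = beta_add * r_o with beta_add = K_add / V_add.
    ab_bar = a_bar + b_bar and bc_bar = b_bar + c_bar are the
    noise-free training averages (illustrative names)."""
    k_add = (a + b) / ab_bar          # Eq. (11)
    v_add = (b + c) / bc_bar          # Eq. (12)
    r_o = a + b                       # observed run length, Eq. (13)
    return (k_add / v_add) * r_o      # Eq. (14)

def compensate_subtractive(a, b, c, e, ec_bar, bc_bar):
    """Compensated run length under subtractive noise, Eq. (21):
    r't = (2 - beta_sub) * r_o with beta_sub = K_sub / V_sub."""
    k_sub = (e + c) / ec_bar          # Eq. (18)
    v_sub = (b + c) / bc_bar          # Eq. (19)
    r_o = a + b                       # Eq. (13)
    return (2.0 - k_sub / v_sub) * r_o  # Eq. (21)
```

As a sanity check, when the window parameters equal the noise-free training averages, both formulas return the observed run length unchanged.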
2.4 Example of compensation

Figure 2 shows an example. For the noise-free pattern (Fig. 2(a)), the true run length is r_t = 8. Under additive noise (Fig. 2(b)), the observed run lengths of 2 and 9 are compensated to r'_t = 6.9 and 6.1; under subtractive noise (Fig. 2(c)), the observed run lengths of 1 and 4 are compensated to r'_t = 6.4.

Fig. 2  Extraction of observed run-length and compensated run-length: (a) noise-free, (b) additive noise, (c) subtractive noise. \bar{a} = 7.4, \bar{b} = 0.9, \bar{c} = 0.9, and \bar{e} = 4.8 are calculated from training data.

3. Detection of Noise Type

Since the compensation formula differs between additive noise (Sect. 2.2) and subtractive noise (Sect. 2.3), the type of noise must be detected first. The parameters a, b, c, and e of Sect. 2.1 are computed for each row y (y = 1, ..., W) with the window width set to the image width (I = W), and the following statistic [6] is evaluated:

  p(y) = \frac{a e - b c}{\sqrt{(a + b)(c + e)(a + c)(b + e)}}, \quad -1 \le p(y) \le 1.    (22)

The values of p for all rows and columns form a 2W-dimensional vector T, which is compared with templates M prepared for each noise type to decide whether the noise is additive or subtractive.

4. Feature Extraction

As the feature, the direction contributivity of run lengths [9] is used, as in LSD-based features [10], [11]. Let l_i (i = 1, ..., 4) be the run length through a black pixel in direction i (horizontal, vertical, and the two diagonals, Fig. 3). The direction contributivity is

  d_i = \frac{l_i}{\sqrt{\sum_{j=1}^{4} l_j^2}}, \quad i = 1, ..., 4.    (23)

The feature vector is computed by the following steps.
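A sketch of the statistic of Eq. (22), read here as the correlation-style measure of the 2x2 counts a, b, c, e bounded by [-1, 1]; this reading, and the function name, are assumptions for illustration.

```python
import math

def row_similarity(a, b, c, e):
    """Correlation-style statistic assumed for Eq. (22):
    p = (a*e - b*c) / sqrt((a+b)(c+e)(a+c)(b+e)), in [-1, 1].
    Returns 0.0 for a degenerate (all-equal) row."""
    denom = math.sqrt((a + b) * (c + e) * (a + c) * (b + e))
    return (a * e - b * c) / denom if denom else 0.0
```

A row dominated by long uniform runs (large a and e, small b and c) gives p close to 1, while a row of alternating pixels (large b and c) gives p close to -1.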
Fig. 3  Feature extraction of direction contributivity. The image is partitioned into N x N blocks, and run lengths l_1 to l_4 are extracted along the four scanning directions.

Step 1: At each black pixel, extract the run lengths in the four directions (Fig. 3).
Step 2: Partition the image into N x N blocks.
Step 3: For each run length obtained in Step 1, compute the compensated run length r'_{t,i} (i = 1, ..., 4) as described in Sect. 2.
Step 4: Accumulate r'_{t,i} within each block.
Step 5: Normalize by (23), with l_i replaced by r'_{t,i}, to obtain d_i.

With N = 8, the feature vector consists of 8 x 8 = 64 blocks times 4 directions. Figures 4(a) and (b) compare the feature values based on the compensated and the observed run lengths.

Fig. 4  Examples of the feature values based on compensated run-length (a) and observed run-length (b). Horizontal and vertical feature values are visualized.

5. Experiments

5.1 Experimental conditions

The ETL-9 database [16] (3,036 categories, 200 samples per category) was used: the odd-numbered samples (1, ..., 199; 100 per category) for training and the even-numbered samples (2, ..., 200; 100 per category) for testing. The complementary similarity measure [6] was used as the classifier. Degraded images were generated by adding subtractive noise (\alpha < 0) or additive noise (\alpha \ge 0) at levels -70 \le \alpha \le 70 [%] in steps of 10: the noisy image G^\alpha is obtained from the original image G and a noise image Z^\alpha by a pixelwise AND for subtractive noise and a pixelwise OR for additive noise.
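The normalization of Step 5, i.e., Eq. (23) applied to the four directional (compensated) run lengths, can be sketched as follows; the function name is illustrative.

```python
import math

def direction_contributivity(lengths):
    """Eq. (23): d_i = l_i / sqrt(sum_j l_j^2) for the four
    directional run lengths l_1..l_4 (or their compensated values).
    Returns zeros for an all-zero input."""
    norm = math.sqrt(sum(l * l for l in lengths))
    return [l / norm for l in lengths] if norm else [0.0] * len(lengths)
```

The resulting vector has unit Euclidean norm, so each d_i expresses the relative contribution of direction i independently of stroke thickness.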
  g^{\alpha}_{x,y} = \begin{cases} g_{x,y} \wedge z^{\alpha}_{x,y} & \text{if } \alpha < 0, \\ g_{x,y} \vee z^{\alpha}_{x,y} & \text{otherwise,} \end{cases}    (24)

where g^{\alpha}_{x,y}, g_{x,y}, and z^{\alpha}_{x,y} are the pixel values (0 or 1) of G^{\alpha}, G, and Z^{\alpha} at (x, y). Examples of the noise models and the resulting noisy character images are shown in Fig. 5 [6].

Fig. 5  Additive noise model, subtractive noise model, and each noisy character image.

The image size was 64 x 64 (W = 64), so the noise-type detection vector has 2W = 128 dimensions and the recognition feature has 256 (= 8 x 8 x 4) dimensions. Window lengths of I = 15 (+-7 pixels) and I = 11 (+-5 pixels) were used, and the averages over the noise-free training data were \bar{a} = 7.4, \bar{b} = 0.9, \bar{c} = 0.9, and \bar{e} = 4.8.

5.2 Results of noise type detection

Noise-type detection was tested on the 3,036 categories x 100 test samples at each of the 15 noise levels (-70 \le \alpha \le 70 [%]). Table 1 shows the results. Detection was almost perfect except near the boundary between the two noise types: at \alpha = 0, 4.1% of the patterns were judged subtractive, and at \alpha = -10, 5.9% were judged additive.

Table 1  Noise type detection accuracy [%]. Rows: detected (output) noise type; columns: input noise level \alpha [%].

  Output \ alpha  -70    -60    -50    -40    -30    -20    -10     0     10    20    30    40    50    60    70
  Subtractive     99.9   99.9   99.9   99.9   99.9   99.8   94.1   4.1   0.01    0     0     0     0     0     0
  Additive        0.05   0.07   0.05   0.03   0.06    0.2    5.9  95.9   99.9  100   100   100   100   100   100
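The noise synthesis of Eq. (24), pixelwise AND with the noise image for subtractive noise (alpha < 0) and pixelwise OR for additive noise, can be sketched as follows. Generating the noise image Z^alpha itself (so that about |alpha|% of pixels are affected) is outside this sketch, and the function name is illustrative.

```python
import numpy as np

def apply_noise(g, z, alpha):
    """Eq. (24): combine character image g with noise image z.
    alpha < 0  -> subtractive noise: g AND z (deletes stroke pixels)
    alpha >= 0 -> additive noise:    g OR z  (adds noise pixels)
    g and z are 0/1 arrays of equal shape."""
    g = np.asarray(g, dtype=int)
    z = np.asarray(z, dtype=int)
    return g & z if alpha < 0 else g | z
```

With alpha < 0, black pixels of g survive only where z is also black; with alpha >= 0, every black pixel of z is added to g.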
5.3 Recognition results

Figure 6 shows the recognition rates at each noise level for three methods: the feature based on compensated run-lengths (proposed), the feature based on observed run-lengths after median filtering, and the feature based on observed run-lengths alone. Figure 7 shows the recognition rates for noise elements of size 1x1 to 5x5.

Fig. 6  Recognition rates for patterns with subtractive or additive noise. (Horizontal axis: noise level \alpha [%]; vertical axis: recognition rate [%].)

Fig. 7  Recognition rates for patterns with each size of subtractive/additive noise. (Noise element sizes 1x1 to 5x5; horizontal axis: noise level \alpha [%]; vertical axis: recognition rate [%].)

5.4 Discussion

The recognition rate of the proposed method stays close to 100% for -10 \le \alpha \le 0 and degrades mainly at the heaviest noise levels (\alpha \le -60 and \alpha \ge 60); for heavy subtractive noise the compensation of (21) becomes less reliable.
The noise-type detection errors around the boundary (Sect. 5.2) had little effect on recognition: the resulting loss in recognition rate was only 0.1% at \alpha = -10 and 0% at \alpha = 0 and \alpha = 10. Figure 8 shows examples of degraded images recognized correctly by the proposed method, and Figs. 9(a) and (b) show examples recognized erroneously (cf. [17]-[19]).

Fig. 8  Examples of images recognized correctly by the proposed method.

Fig. 9  Examples of images recognized erroneously by the proposed method (correct category -> erroneous category as the recognition result).

6. Conclusion

A feature extraction method based on run-length compensation has been proposed for degraded character recognition, and its effectiveness has been confirmed through recognition experiments on noisy patterns generated from the ETL-9 database.

References

[1] G.E. Kopec, Supervised template estimation for document image decoding, IEEE Trans. Pattern Anal. Mach. Intell., vol.19, no.12, pp.1313-1324, Dec. 1997.
[2] Y. Xu and G. Nagy, Prototype extraction and adaptive OCR, IEEE Trans. Pattern Anal. Mach. Intell., vol.21, no.12, pp.1280-1296, Dec. 1999.
[3] T.K. Ho, Bootstrapping text recognition from stop words, Proc. 14th ICPR, vol.1, pp.605-609, Brisbane, Australia, Aug. 1998.
[4] M. Sawaki, H. Murase, and N. Hagita, Automatic acquisition of context-based image templates for degraded character recognition in scene images, Proc. 15th ICPR, pp.15-18, Barcelona, Spain, Sept. 2000.
[5] (in Japanese), 1989.
[6] M. Sawaki and N. Hagita, Text-line extraction and
character recognition of document headlines with graphical designs using complementary similarity measure, IEEE Trans. Pattern Anal. Mach. Intell., vol.20, no.10, pp.1103-1109, Oct. 1998.
[7] A. Sato, A learning method for definite canonicalization based on minimum classification error, Proc. 15th ICPR, vol.2, pp.199-202, Barcelona, Spain, Sept. 2000.
[8] M. Umeda, Advances in recognition methods for handwritten Kanji characters, IEICE Trans. Inf. & Syst., vol.E79-D, no.5, pp.401-410, May 1996.
[9] Trans. IEICE (D), vol.J66-D, no.10, pp.1185-1192, Oct. 1983 (in Japanese).
[10] J. Zhu, T. Hong, and J.J. Hull, Image-based keyword recognition in oriental language document images, Pattern Recognit., vol.30, no.8, pp.1293-1300, Aug. 1997.
[11] S.N. Srihari, T. Hong, and G. Srikantan, Machine-printed Japanese document recognition, Pattern Recognit., vol.30, no.8, pp.1301-1313, Aug. 1997.
[12] H. Ozawa and T. Nakagawa, A character image enhancement method from characters with various background image, Proc. 2nd ICDAR, pp.58-61, Oct. 1993.
[13] S. Liang, M. Ahmadi, and M. Shridhar, A morphological approach to text string extraction from regular periodic overlapping text/background images, CVGIP, vol.56, no.5, pp.402-413, Sept. 1994.
[14] M.Y. Yoon, S.W. Lee, and J.S. Kim, Faxed image restoration using Kalman filtering, Proc. 3rd ICDAR, vol.2, pp.677-680, Montreal, Canada, Aug. 1995.
[15] Z. Shi and V. Govindaraju, Character image enhancement by selective region-growing, Pattern Recognit. Lett., vol.17, no.5, pp.523-527, May 1996.
[16] (ETL9 database of handprinted JIS Kanji characters), Trans. IEICE (D), vol.J68-D, no.4, pp.757-764, April 1985 (in Japanese).
[17] Trans. IEICE (D-II), vol.J79-D-II, no.9, pp.1534-1542, Sept. 1996 (in Japanese).
[18] S. Omachi, M. Inoue, and H. Aso, A noise-adaptive discriminant function and its application to blurred machine-printed Kanji recognition, IEEE Trans. Pattern Anal. Mach. Intell., vol.22, no.3, pp.314-319, March 2000.
[19] Trans. IEICE (D-II), vol.J83-D-II, no.7, pp.1658-1666, July 2000 (in Japanese).