THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS
TECHNICAL REPORT OF IEICE

Strategy of Candidate Choice for Interactive Vision

Yasushi MAKIHARA†, Yoshiaki SHIRAI†, and Nobutaka SHIMADA††

† Faculty of Engineering, Osaka University, Yamadaoka 2-1, Suita-shi, 565-0871 Japan
†† College of Information Science and Engineering, Ritsumeikan University, Noji Higashi 1-1-1, Kusatsu-shi, 525-8577 Japan
E-mail: †{makihara,shirai}@cv.mech.eng.osaka-u.ac.jp, ††shimada@ci.ritsumei.ac.jp

Abstract  This paper describes a candidate-choice strategy that eases the user's burden in an interactive object recognition system when the system obtains multiple candidates as a recognition result. The strategy minimizes the time spent on candidate choice by displaying the candidates clearly and by introducing hierarchical choice to reduce the number of candidates per choice. First, we propose several display methods and hierarchical choices. Next, we verify the effectiveness of the methods by subjective tests. Last, we divide the time spent on the choice into the user's recognition time, the making-answer time, and the vocal dialog time, and formulate each of them to evaluate the methods quantitatively.

Key words  Hierarchical Choice, Decision Tree, Object Recognition, Dialog

1.

2.
Fig. 1(a) [1]
Fig. 1 Overview of object recognition: (a) Object model, (b) Original image, (c) Normalized image, (d) Recognition result

(Fig. 1(b)) (Fig. 1(c)) [2] [3] (Fig. 1(d)) [4] [5] [6]

3.

3. 1

3. 1. 1
Fig. 2(a) (L_no), Fig. 2(b) (L_clr), Fig. 2(c) (L_num), Fig. 2(d) (L_nwc)

Fig. 2 Labeling to distinguish candidates: (a) No label (L_no), (b) Coloring (L_clr), (c) Numbering (L_num), (d) Numbering with color (L_nwc)

3. 1. 2
(B_sim) (B_turn) L_num (Fig. 2(c))
Table 1 Display methods and evaluation values for the subjective test

Method   Labeling (L)   B
D1       L_no           B_sim
D2       L_clr          B_sim
D3       L_num          B_sim
D4       L_nwc          B_sim
D5       L_no           B_turn
D6       L_clr          B_turn
D7       L_num          B_turn
D8       L_nwc          B_turn
D9       L_no           B_cnf

Fig. 3 Result of subjective tests for display methods (axis ticks: 0 to 10)

T_blk (B_cnf) (L_no)

3. 1. 3
D4, D7

Fig. 4 Hierarchy by representative candidates: (a) All candidates, (b) Representative candidates

3. 2
(Section 3. 1, Fig. 4(a))

3. 2. 1
Fig. 4(a)

3. 2. 2
Fig. 5(a), Fig. 5(b)
Fig. 5 Segmentation of candidate regions: (a) Segmentation by dichotomy, (b) Segmentation by color

Table 2 Items of subjective test for hierarchical choice

Item
H1
H2
H3

Fig. 6 Result of subjective test for hierarchical display

3. 2. 3
Fig. 4(a) (Fig. 4(b))

3. 2. 4

Fig. 7 Decision tree for hierarchy of candidate choice. Labels in the figure: Tr_i: decision tree; nodes n_0, n_1, n_3, n_j; (shelf), (rough group), (representative candidate), (alternative candidate); end node (each candidate); objects O_1, O_2, ..., O_k

4.
Display methods (5): {B_sim, L_clr}, {B_turn, L_clr}, {B_sim, L_num}, {B_turn, L_num}, {B_cnf, L_no}

4. 1
(Fig. 7) Among the decision trees Tr_i, the tree Tr^* is chosen as

    Tr^{*} = \arg\min_{i} \{ T(Tr_i) \}    (1)

where T(Tr_i) is the choice time for Tr_i.

4. 2
T(Tr_i) is computed from the N_O objects O_k and their probabilities p(O_k).
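With T(Tr_i; O_k) denoting the choice time in Tr_i when the object is O_k, T(Tr_i) is

    T(Tr_i) = \sum_{k=1}^{N_O} p(O_k) \, T(Tr_i; O_k)    (2)

T(Tr_i; O_k) is accumulated over the N_c(Tr_i; O_k) nodes \{ n_1, \ldots, n_{N_c(Tr_i; O_k)} \} of Tr_i for O_k, where T_c(n) is the choice time at node n:

    T(Tr_i; O_k) = \sum_{l=1}^{N_c(Tr_i; O_k)} T_c(n_l(Tr_i; O_k))    (3)

A display method is a pair D = \{B, L\}. With T_c(n; D_s) the choice time at node n under display method D_s,

    T_c(n) = \min_{s} T_c(n; D_s)    (4)

The choice time T_c is divided into three terms:
• T_rcg: user's recognition time
• T_make_ans: making-answer time
• T_dlg: vocal dialog time

    T_c = T_{rcg} + T_{make\_ans} + T_{dlg}    (5)

4. 3
m, T_blk, N_{rcg/blk}; N_{rcg/blk} = 1; N_{rcg/blk} = N^{sim}_{rcg/blk} (> 1)

    T_{rcg,i} = \frac{i}{N_{rcg/blk}} T_{blk}    (6)

Averaging Eq. (6) over i = 1, ..., m gives

    T_{rcg} = \frac{m + 1}{2 N_{rcg/blk}} T_{blk}    (7)

T_rcg against m: Fig. 8(a), Eq. (7)

4. 3. 1
m_node, m
• m = m_end: T_blk, N_{rcg/blk} = N^{sim}_{rcg/blk}
• m = m_node (< m_end): T_blk, N_{rcg/blk} = N^{sim}_{rcg/blk}, N_{rcg/blk} = 1

4. 4
B_cnf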
{B_turn, L_num}, {B_sim, L_num}, L_clr, m_ltd (Fig. 8(b))

4. 5
m, L, T_dlg, T_vr, T_cnf

    T_{dlg}(m, L) = T_{vr} + T_{cnf}    (8)

• p_tp:
• p_fp:
• p_n:

    T_{dlg}(m, L) = \sum_{i=1}^{\infty} p_n^{i-1} p_{tp} (i T_{vr} + T_{cnf})
                  + \sum_{i=1}^{\infty} p_n^{i-1} p_{fp} (i T_{vr} + T_{cnf} + T_{dlg}(m-1, L))
                  = \frac{1}{p_{tp} + p_{fp}} T_{vr} + T_{cnf} + \frac{p_{fp}}{p_{tp} + p_{fp}} T_{dlg}(m-1, L)    (9)

    T_{dlg}(1, L) = T_{cnf}    (10)

m, L; T_vr, T_cnf

Fig. 8 Property of choice time for each display method (horizontal axis: number of choices m, vertical axis: time): (a) T_rcg, (b) T_make_ans, (c) T_dlg. Curve labels in the figure: {B_turn, *}, {B_sim, *}, {B_cnf, L_no}, {*, L_clr}, {*, L_num}, {B_sim, L_num}, {B_turn, L_num}; m_ltd

m, L_no:

    T_{dlg}(m, L) = \frac{m + 1}{2} T_{cnf}    (11)

(Fig. 8(c))

5.

[1] , " ", 19, CD-ROM, 2001.
[2] , " ", , Vol. J87-D-II, No. 2, pp. 629-638, Feb. 2004.
[3] Y. Makihara, M. Takizawa, Y. Shirai, J. Miura, and N. Shimada, "Object Recognition Supported by User Interaction for Service Robots", Proc. of 16th Int. Conf. on Pattern Recognition, Vol. 3, pp. 561-564, Quebec, Canada, Aug. 2002.
[4] Y. Makihara, M. Takizawa, Y. Shirai, J. Miura, and N. Shimada, "Object Recognition Supported by User Interaction for Service Robots", Proc. of 5th Asian Conf. on Computer Vision, Vol. 2, pp. 719-724, Melbourne, Australia, Jan. 2002.
[5] , " ", , Vol. 16, No. 4, pp. 174-182, 2002.
[6] , " ", 20, CD-ROM, 2002.
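As a rough numerical illustration of the time model in Section 4, the Python sketch below evaluates Eq. (7), Eqs. (9) and (10), and the expected tree time of Eqs. (1) to (3). It is not the authors' implementation. The function names, the representation of a tree as per-object lists of node sizes m, the constant used for T_make_ans, and all numeric values are assumptions made for illustration. Eq. (4), the minimum over display methods, is left out by fixing a single display method.

```python
from typing import Callable, Dict, List

# A tree is represented (assumption) as: object name -> list of node sizes m
# visited on the way to that object.
Tree = Dict[str, List[int]]


def recognition_time(m: int, t_blk: float, n_rcg_per_blk: float = 1.0) -> float:
    """Eq. (7): T_rcg = (m + 1) / (2 N_rcg/blk) * T_blk."""
    return (m + 1) / (2.0 * n_rcg_per_blk) * t_blk


def dialog_time(m: int, t_vr: float, t_cnf: float, p_tp: float, p_fp: float) -> float:
    """Closed form of Eq. (9) with the base case of Eq. (10).

    The closed form implies p_n = 1 - p_tp - p_fp, so p_tp + p_fp > 0 is required.
    """
    if m <= 1:
        return t_cnf  # Eq. (10)
    p_sum = p_tp + p_fp
    rest = dialog_time(m - 1, t_vr, t_cnf, p_tp, p_fp)
    return t_vr / p_sum + t_cnf + (p_fp / p_sum) * rest


def expected_tree_time(tree: Tree, prior: Dict[str, float],
                       node_time: Callable[[int], float]) -> float:
    """Eqs. (2)-(3): sum over objects of p(O_k) times the summed node times."""
    return sum(prior[k] * sum(node_time(m) for m in path) for k, path in tree.items())


def best_tree(trees: Dict[str, Tree], prior: Dict[str, float],
              node_time: Callable[[int], float]) -> str:
    """Eq. (1): name of the tree with the minimum expected choice time."""
    return min(trees, key=lambda name: expected_tree_time(trees[name], prior, node_time))


def node_time(m: int) -> float:
    # Eq. (5): T_c = T_rcg + T_make_ans + T_dlg.
    # Every constant here is a made-up illustration value, not a measured one.
    t_make_ans = 0.5
    return (recognition_time(m, t_blk=1.0) + t_make_ans
            + dialog_time(m, t_vr=1.0, t_cnf=0.5, p_tp=0.9, p_fp=0.05))


candidates = [f"O{k}" for k in range(1, 21)]
prior = {k: 1.0 / len(candidates) for k in candidates}  # uniform p(O_k), assumed
trees = {
    "flat": {k: [20] for k in candidates},         # one choice among 20 candidates
    "two_level": {k: [5, 4] for k in candidates},  # 5 groups, then 4 candidates
}
chosen = best_tree(trees, prior, node_time)
print(chosen)
```

With these made-up constants the two-level tree has the smaller expected time. Other values of T_blk, T_vr, T_cnf, or the error probabilities can reverse the ordering, which is the trade-off that Eq. (1) is meant to resolve.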