Control Theory & Applications, Vol. 25, No. 6, Dec. 2008
Article ID: 1000-8152(2008)06-0995-06
CLC number: TP242    Document code: A

Autonomous navigation control for mobile robots based on emotion and environment cognition

ZHANG Hui-di 1,2, LIU Shi-rong 2
(1. Research Institute of Automation, East China University of Science and Technology, Shanghai 200237, China;
2. Institute of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China)

Abstract: A new autonomous navigation control system for mobile robots is presented, in which a learning and decision-making model based on emotion and cognition is introduced into the control architecture of a behavior-based robot system. The dynamic system approach is applied to design the basic behaviors. Continuous environment perceptions are classified into categories by Adaptive Resonance Theory-2 (ART2) networks, and these categories serve as the environmental cognitive states in learning and decision making. A rational behavior-coordination mechanism is developed through on-line learning of emotion and environmental cognition. Simulations demonstrate that emotion and environmental cognition clearly improve the efficiency of the learning and decision-making process and enhance the autonomous navigation capability of the behavior-based robot system in unknown environments.

Key words: mobile robot; emotion; cognition; ART2 networks; behavior coordination; autonomous navigation

Received: 2007-08-25; revised: 2008-01-01.
Funding (grant numbers): 60675043; 2007C21051; KYS09150543.

1 Introduction

… [1] … [2] … [3] … [4-6] … Ahn and Picard of MIT [7,8] … ART2 …

2 Architecture of autonomous navigation control system

… Ahn and Picard [7,8] …
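… Ahn and Picard … The structure of the navigation control system is shown in Fig. 1. … ART2 …

Fig. 1 The structure of navigation control system

At time $t$, the affective state, the environmental cognitive state and the decision are written as

$$a_t = (a_1, \ldots, a_{|A|}), \qquad c_t \in C = \{1, \ldots, |C|\}, \qquad d_t \in D = \{1, \ldots, |D|\}.$$

3 Behavior design based on dynamic system approach

The basic behaviors are designed with the dynamic system approach of Schöner and Dose [9]. Three basic behaviors are used: goal seeking, obstacle avoidance and wall following. The posture of the robot and the arrangement of its sensors are shown in Fig. 2. Following [9, 10], the heading $\varphi$ and the velocity $\nu$ of each of the 3 behaviors are governed by the dynamics below.

Fig. 2 Posture and sensor distribution map of the mobile robot

1) Goal seeking.

$$\dot{\varphi} = -\lambda_{goal,\varphi} \sin(\varphi - \psi_{goal}), \tag{1}$$

$$\dot{\nu} = -\lambda_{goal,\nu} (\nu - \nu_{goal}), \tag{2}$$

where $\lambda_{goal,\varphi} > 0$, $\lambda_{goal,\nu} > 0$, $\psi_{goal}$ is the direction of the goal, and $\nu_{goal}$ is the desired velocity:

$$\nu_{goal} = \min(k_{goal} d_{goal},\ \nu_{goal,max}), \tag{3}$$

where $k_{goal} > 0$, $d_{goal}$ is the distance to the goal and $\nu_{goal,max}$ is the maximum velocity.

2) Obstacle avoidance.

$$\dot{\varphi} = \lambda_{obs,\varphi} (\varphi - \psi_{obs}) \exp\left(-\frac{(\varphi - \psi_{obs})^2}{2\sigma^2}\right), \tag{4}$$

$$\dot{\nu} = \begin{cases} -\lambda_{obs,\nu} (\nu - \nu_{min}), & \nu < \nu_{min}, \\ -\lambda_{obs,\nu} (\nu - k_{obs} d_{obs}), & \text{otherwise}, \end{cases} \tag{5}$$

where $\psi_{obs}$ is the direction of the obstacle, $\lambda_{obs,\varphi} > 0$, $\lambda_{obs,\nu} > 0$, $k_{obs} > 0$, $\nu_{min}$ is the minimum velocity, $d_{obs} > 0$ is the distance to the obstacle, and $\sigma > 0$.

3) Wall following.

$$\dot{\varphi} = -\lambda_{wall,\varphi} \sin(\varphi - \psi_{wall}), \tag{6}$$

$$\dot{\nu} = -\lambda_{wall,\nu} (\nu - k_{wall} d_{wall}), \tag{7}$$

where $\psi_{wall}$ is the wall-following direction, $\lambda_{wall,\varphi} > 0$, $\lambda_{wall,\nu} > 0$, $k_{wall} > 0$, and $d_{wall}$ is the distance to the wall.

… $\psi_{goal}$, $\psi_{wall}$, $\psi_{obs}$ ….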
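The behavior dynamics of Eqs. (1)-(7) can be illustrated with a short numerical sketch. The Python code below is only an illustration under stated assumptions: the gain values, the time step `dt`, the explicit Euler integration and all function names are hypothetical and are not taken from the paper.

```python
import math

# All numeric defaults below are illustrative assumptions, not values from the paper.

def goal_behavior(phi, nu, psi_goal, d_goal,
                  lam_phi=2.0, lam_nu=1.0, k_goal=0.5, nu_goal_max=0.4):
    """Eqs. (1)-(3): heading attractor at psi_goal, velocity relaxes to nu_goal."""
    nu_goal = min(k_goal * d_goal, nu_goal_max)          # Eq. (3)
    dphi = -lam_phi * math.sin(phi - psi_goal)           # Eq. (1)
    dnu = -lam_nu * (nu - nu_goal)                       # Eq. (2)
    return dphi, dnu

def obstacle_behavior(phi, nu, psi_obs, d_obs,
                      lam_phi=2.0, lam_nu=1.0, k_obs=0.5, nu_min=0.05, sigma=0.6):
    """Eqs. (4)-(5): heading repeller at psi_obs, velocity scaled by obstacle distance."""
    diff = phi - psi_obs
    dphi = lam_phi * diff * math.exp(-diff ** 2 / (2.0 * sigma ** 2))  # Eq. (4)
    if nu < nu_min:                                      # first case of Eq. (5)
        dnu = -lam_nu * (nu - nu_min)
    else:                                                # second case of Eq. (5)
        dnu = -lam_nu * (nu - k_obs * d_obs)
    return dphi, dnu

def wall_behavior(phi, nu, psi_wall, d_wall,
                  lam_phi=2.0, lam_nu=1.0, k_wall=0.5):
    """Eqs. (6)-(7): heading attractor at psi_wall, velocity relaxes to k_wall * d_wall."""
    dphi = -lam_phi * math.sin(phi - psi_wall)           # Eq. (6)
    dnu = -lam_nu * (nu - k_wall * d_wall)               # Eq. (7)
    return dphi, dnu

def euler_step(phi, nu, dphi, dnu, dt=0.05):
    """One explicit Euler step (integration scheme is an assumption)."""
    return phi + dt * dphi, nu + dt * dnu

def simulate_goal(phi0, nu0, psi_goal, d_goal, steps=400):
    """Integrate the goal-seeking behavior with a fixed goal direction and distance."""
    phi, nu = phi0, nu0
    for _ in range(steps):
        dphi, dnu = goal_behavior(phi, nu, psi_goal, d_goal)
        phi, nu = euler_step(phi, nu, dphi, dnu)
    return phi, nu

if __name__ == "__main__":
    print(simulate_goal(phi0=1.0, nu0=0.0, psi_goal=0.0, d_goal=2.0))
```

With a fixed goal, the heading relaxes toward $\psi_{goal}$ and the velocity toward $\nu_{goal}$, while the obstacle term pushes the heading away from $\psi_{obs}$ on either side.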
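… $\varphi$, $\psi_i$ ($i$ = goal, obs, or wall) …. Following [11], the parameters are chosen such that

$$k_i \ll \lambda_{i,\varphi}, \qquad k_i \ll \lambda_{i,\nu}, \qquad i = \text{goal, obs, or wall}. \tag{8}$$

4 Perception of local environment based on ART2 networks

… [12]. With the sensor arrangement of Fig. 2, the local environment on the right, on the left and in front of the robot is described by

$$M_R(t) = \sum_{i=1}^{3} \frac{\cos\alpha_i}{ds_i + m_0}, \tag{9}$$

$$M_L(t) = \sum_{i=6}^{8} \frac{\cos\alpha_i}{ds_i + m_0}, \tag{10}$$

$$M_F(t) = \sum_{i=4}^{5} \frac{\cos\alpha_i}{ds_i + m_0}, \tag{11}$$

where R, L and F denote right, left and front; $\alpha_i$ is the angle of the $i$-th sensor; $ds_i$ is the distance measured by the $i$-th sensor; and $m_0 > 0$. The vector $[M_R(t), M_F(t), M_L(t)]$ is the input of the ART2 network [13], whose output category is the environmental cognitive state. … Eqs. (9)-(11) … ART2 … Fig. 1 ….

Let $f(c_t, d_t, c_{t+1})$ be the number of times that decision $d_t$ has led from state $c_t$ to state $c_{t+1}$. After each transition,

$$f(c_t, d_t, c_{t+1}) \leftarrow f(c_t, d_t, c_{t+1}) + 1. \tag{12}$$

The transition probability $P(c_{t+1} \mid c_t, d_t)$ is then estimated as

$$P(c' \mid c_t, d_t) = \frac{f(c_t, d_t, c')}{\sum_{k=1}^{|C|} f(c_t, d_t, k)}, \qquad c' \in C. \tag{13}$$

5 Autonomous navigation control algorithm

5.1 Affective model and intrinsic reward

… The affective state is

$$a_e = (a_{e,1}, a_{e,2}), \qquad e \in A = \{1, \ldots, |A|\}, \tag{14}$$

where $a_{e,1}$ and $a_{e,2}$ are the probabilities of feeling bad and feeling good, with $a_{e,1} + a_{e,2} = 1$. The overall affective state is

$$a = \{a_e\} = (a_1, \ldots, a_{|A|}). \tag{15}$$

Here only one affective state, "want", is considered, so that $a = (a_{1,1}, a_{1,2})$. … The extrinsic-reward error is defined as

$$\delta_E = R_{ext}(c_{t+1} \mid c_t, d_t) - \max_{c'} \big( R_{ext}(c' \mid c_t, d_t)\, P(c' \mid c_t, d_t) \big). \tag{16}$$

If $\delta_E > 0$, the probability of feeling good increases. Following [8], the affective state is updated as

$$a^{t+1}_{1, v_e^{t+1}} = P(v_e^{t+1} \mid c_{t+1}, c_t, a_t, d_t), \tag{17}$$

$$v_e^{t+1} = \begin{cases} 1, & \text{feeling bad}, \\ 2, & \text{feeling good}, \end{cases}$$

$$P(v_e = 1 \mid c_{t+1}, c_t, a_t, d_t) = P(v_e = 1 \mid c_{t+1}, c_t, a_t, d_t) - \varepsilon \delta_E, \tag{18}$$

$$P(v_e = 2 \mid c_{t+1}, c_t, a_t, d_t) = P(v_e = 2 \mid c_{t+1}, c_t, a_t, d_t) + \varepsilon \delta_E. \tag{19}$$

Equations (16)-(19) together determine how the affective state changes after each transition.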
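The bookkeeping behind Eqs. (9)-(13) and (16)-(19) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the class and function names, the 0-based state indices, the uniform probability returned before any transition has been observed, the default value 0 for unseen rewards, and the clipping of the probability to [0, 1] are all assumptions made for illustration. The ART2 network itself is not reproduced here; the category index `c` is assumed to come from it.

```python
import math
from collections import defaultdict

def local_features(angles, distances, m0=0.1):
    """Eqs. (9)-(11) for 8 sensors; list index 0 corresponds to sensor 1."""
    term = [math.cos(a) / (d + m0) for a, d in zip(angles, distances)]
    m_r = sum(term[0:3])   # sensors 1-3, Eq. (9)
    m_f = sum(term[3:5])   # sensors 4-5, Eq. (11)
    m_l = sum(term[5:8])   # sensors 6-8, Eq. (10)
    return [m_r, m_f, m_l]  # ART2 input vector [M_R, M_F, M_L]

class TransitionModel:
    """Transition counts f(c, d, c') and the estimate P(c' | c, d)."""

    def __init__(self, n_states):
        self.n_states = n_states
        self.f = defaultdict(int)

    def observe(self, c, d, c_next):
        self.f[(c, d, c_next)] += 1                      # Eq. (12)

    def prob(self, c_next, c, d):
        total = sum(self.f[(c, d, k)] for k in range(self.n_states))
        if total == 0:
            return 1.0 / self.n_states                   # assumption: uniform prior
        return self.f[(c, d, c_next)] / total            # Eq. (13)

def reward_error(r_ext, trans, c, d, c_next):
    """Eq. (16); r_ext maps (c, d, c') to the averaged extrinsic reward."""
    best = max(r_ext.get((c, d, k), 0.0) * trans.prob(k, c, d)
               for k in range(trans.n_states))
    return r_ext.get((c, d, c_next), 0.0) - best

def update_affect(p_good, delta_e, eps=0.1):
    """Eqs. (18)-(19) for one entry P(v_e = 2 | ...); returns (bad, good)."""
    p_good = min(1.0, max(0.0, p_good + eps * delta_e))  # clipping is an assumption
    return 1.0 - p_good, p_good

if __name__ == "__main__":
    model = TransitionModel(n_states=3)
    for c_next in (2, 2, 2, 1):
        model.observe(0, 1, c_next)
    rewards = {(0, 1, 2): 1.0, (0, 1, 1): 0.0}
    delta = reward_error(rewards, model, c=0, d=1, c_next=2)
    print(delta, update_affect(0.5, delta))
```

Because $a_{e,1} + a_{e,2} = 1$, only the feeling-good probability needs to be stored; Eq. (18) follows from Eq. (19) by complement.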
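… The intrinsic reward is

$$R_e(v_e^{t+1}) = \begin{cases} -1, & v_e^{t+1} = 1 \ (\text{feeling bad}), \\ 1, & v_e^{t+1} = 2 \ (\text{feeling good}). \end{cases} \tag{20}$$

5.2 Extrinsic reward

… The extrinsic reward consists of two parts.

1) A reward related to the proximity of obstacles. Using all 8 sensors,

$$M_c(t) = \sum_{i=1}^{8} \frac{\cos\alpha_i}{ds_i + m_0}. \tag{21}$$

From $M_c(t)$,

$$r_{ext1}(t) = -\big(M_c(t+1) - M_c(t)\big). \tag{22}$$

2) A reward related to progress toward the goal,

$$r_{ext2}(t) = -p_1 \Delta D_g(t) - p_2 \Delta\phi_g(t), \tag{23}$$

where $\Delta D_g$ and $\Delta\phi_g$ are the changes in the distance and in the heading deviation with respect to the goal, and $p_1$, $p_2$ are weighting coefficients.

The total extrinsic reward is

$$r_{ext}(t) = \begin{cases} w_{r2}\, r_{ext2}(t), & M_c(t+1) < M_{cmin}, \\ -C_{ext}, & M_c(t+1) > M_{cmax}, \\ w_{r1}\, r_{ext1}(t) + w_{r2}\, r_{ext2}(t), & \text{otherwise}, \end{cases} \tag{24}$$

where $w_{r1}$, $w_{r2}$ are weights, $M_{cmax}$, $M_{cmin}$ are thresholds, and $C_{ext} > 0$ is a constant. The averaged extrinsic reward is updated as

$$R_{ext}(c_{t+1} \mid c_t, d_t) = (1 - \gamma)\, R_{ext}(c_{t+1} \mid c_t, d_t) + \gamma\, r_{ext}(t), \tag{25}$$

with $\gamma = 1 / f(c_t, d_t, c_{t+1})$. $R_{ext}(c' \mid c, d)$ is thus the average of the rewards $r_{ext}$ received when decision $d$ leads from state $c$ to state $c'$.

5.3 Learning and decision-making algorithm

The algorithm proceeds as follows:

Step 1. Given $c_t$ and $a_t$, select the decision $d_t$ according to the Boltzmann distribution

$$\Pr(d_t \mid c_t, a_t) = \frac{\exp\big(Q_{DM}(c_t, a_t, d_t)/T\big)}{\sum_{d=1}^{|D|} \exp\big(Q_{DM}(c_t, a_t, d)/T\big)}, \tag{26}$$

where $T$ is the temperature and $Q_{DM}$ combines $Q_{ext}$ and $Q_{int}$:

$$Q_{DM}(c_t, a_t, d) = Q_{ext}(c_t, d) + \eta\, Q_{int}(c_t, a_t, d), \tag{27}$$

where $\eta$ is a weighting coefficient.

Step 2. Execute $d_t$, obtain $c_{t+1}$ from the ART2 network (Fig. 1), and compute $r_{ext}$ from Eq. (24).

Step 3. Use the tuple $\langle c_t, d_t, r_{ext}, c_{t+1} \rangle$ to update $P(c_{t+1} \mid c_t, d_t)$ and $R_{ext}(c_{t+1} \mid c_t, d_t)$.

Step 4. Update $P(v_e^{t+1} \mid c_{t+1}, c_t, a_t, d_t)$.

Step 5. For all $c \in C$, $a \in A$, $d \in D$, compute $Q_{ext}$ and $Q_{int}$:

$$Q_{ext}(c, d) = \sum_{c'=1}^{|C|} R_{ext}(c' \mid c, d)\, P(c' \mid c, d) + \alpha \sum_{c'=1}^{|C|} \max_{d'} Q_{ext}(c', d')\, P(c' \mid c, d), \tag{28}$$

$$Q_{int}(c, a, d) = \sum_{c'=1}^{|C|} \sum_{v_e=1}^{2} R_e(v_e)\, P(v_e \mid c', c, a, d)\, P(c' \mid c, d). \tag{29}$$

Step 6. Compute $a_{t+1} = (a^{t+1}_{1,1}, a^{t+1}_{1,2})$ from Eqs. (17)-(19).

Step 7. Set $t \leftarrow t + 1$ and return to Step 1.

… $\eta$ ….

6 Simulation study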
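For simulation purposes, the reward and decision quantities of Eqs. (24)-(29) can be sketched as plain Python functions. This is a hedged illustration rather than the authors' code: the function names, the nested-list layout of `R` and `P`, the default weights, and the max-subtraction used for numerical stability in the softmax are assumptions. State and decision indices are 0-based here, whereas the text counts from 1.

```python
import math
import random

def extrinsic_reward(r1, r2, m_c_next, m_cmin, m_cmax, w_r1=1.0, w_r2=1.0, c_ext=1.0):
    """Eq. (24): combine the obstacle reward r1 and the goal reward r2."""
    if m_c_next < m_cmin:
        return w_r2 * r2
    if m_c_next > m_cmax:
        return -c_ext
    return w_r1 * r1 + w_r2 * r2

def running_average(old, r, count):
    """Eq. (25) with gamma = 1 / count, count = f(c_t, d_t, c_{t+1}) >= 1."""
    gamma = 1.0 / count
    return (1.0 - gamma) * old + gamma * r

def boltzmann_probs(q_values, temperature):
    """Eq. (26); subtracting max(q) leaves the probabilities unchanged."""
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def q_ext_sweep(q_ext, R, P, alpha):
    """One synchronous sweep of Eq. (28); R[c][d][c2] and P[c][d][c2] are nested lists."""
    n_c, n_d = len(P), len(P[0])
    return [[sum(P[c][d][c2] * (R[c][d][c2] + alpha * max(q_ext[c2]))
                 for c2 in range(n_c))
             for d in range(n_d)]
            for c in range(n_c)]

def q_int(p_next, p_good, r_bad=-1.0, r_good=1.0):
    """Eq. (29) for one (c, a, d): p_next[c2] = P(c2 | c, d),
    p_good[c2] = P(v_e = 2 | c2, c, a, d); rewards follow Eq. (20)."""
    return sum(p * (r_good * g + r_bad * (1.0 - g)) for p, g in zip(p_next, p_good))

def select_decision(q_ext_row, q_int_row, eta, temperature, rng=random):
    """Step 1: form Q_DM by Eq. (27) and sample a decision by Eq. (26)."""
    q_dm = [qe + eta * qi for qe, qi in zip(q_ext_row, q_int_row)]
    probs = boltzmann_probs(q_dm, temperature)
    decision = rng.choices(range(len(probs)), weights=probs)[0]
    return decision, probs

if __name__ == "__main__":
    d, probs = select_decision([0.2, 0.0, -0.1], [0.5, -0.5, 0.0], eta=5.0, temperature=1.0)
    print(d, probs)
```

Setting `eta=0.0` in `select_decision` removes the intrinsic term from Eq. (27), so the choice depends on $Q_{ext}$ alone; larger values of `eta` give the affective term more influence.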
… Four cases with different values of $\eta$ are compared:

Case 1: $\eta = 0$, ART2 … (by Eq. (27), the intrinsic value $Q_{int}$ then does not enter the decision).
Case 2: $\eta = 1$, ART2 ….
Case 3: $\eta = 5$, ART2 … Case 2 ….
Case 4: $\eta = 5$ … (Eqs. (9)-(11)) ….

Figs. 3-6 show the learning processes of the 4 cases; … A, B …. … Fig. 3 … $\eta = 1$, Fig. 4 …. Fig. 5, $\eta = 5$ …. Comparing Fig. 5 with Fig. 6: … ART2 ….

Fig. 3 Learning process of autonomous navigation in Case 1
Fig. 4 Learning process of autonomous navigation in Case 2
Fig. 5 Learning process of autonomous navigation in Case 3
Fig. 6 Learning process of autonomous navigation in Case 4

… Figs. 3, 4 and 6 … (C) …. Table 1 … $\eta$ ….
Table 1 Comparison of simulation results of the 4 cases

Case    …      … /m     Rank
1       350    43.21    4
2       294    33.04    3
3       157    20.43    1
4       260    25.71    2

7 Conclusion

…

References:

[1] National Natural Science Foundation of China. Study on Development Strategy of Natural Science: Automation Science and Technology[M]. Beijing: Science Press, 1995.
[2] PICARD R W. Affective Computing[M]. Cambridge, Mass, London, England: MIT Press, 1997.
[3] WANG Zhiliang. Artificial psychology-a most accessible science research to human brain[J]. Journal of University of Science and Technology Beijing, 2000, 22(5): 479-481.
[4] MOREN J, BALKENIUS C. Reflections on emotion[J]. Cybernetics and Systems, 2000, 30(6): 699-703.
[5] VELASQUEZ J D. An emotion-based approach to robotics[C]//Proceedings of the 1999 International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 1999: 235-240.
[6] GADANHO S C, HALLAM J. Emotion-triggered learning in autonomous robot control[J]. Cybernetics and Systems, 2001, 32(5): 531-559.
[7] AHN H, PICARD R W. Affective-cognitive learning and decision making: A motivational reward framework for affective agents[C]//Proceedings of the 1st International Conference on Affective Computing and Intelligent Interaction. Beijing, China: Springer, 2005: 866-873.
[8] AHN H, PICARD R W. Affective-cognitive learning and decision making: The role of emotions[C]//Proceedings of the 18th European Meeting on Cybernetics and Systems Research. Vienna, Austria: Austrian Society for Cybernetics Studies, 2006.
[9] SCHÖNER G, DOSE M. A dynamical systems approach to task-level system integration used to plan and control autonomous vehicle motion[J]. Robotics and Autonomous Systems, 1992, 10(4): 253-267.
[10] BICHO E, SCHÖNER G. The dynamic approach to autonomous robotics demonstrated on a low-level vehicle platform[J]. Robotics and Autonomous Systems, 1997, 21(1): 23-35.
[11] ALTHAUS P. Indoor navigation for mobile robots: control and representations[D]. Stockholm: Royal Institute of Technology, 2003.
[12] TAN Min, WANG Shuo, CAO Zhiqiang. Multi-robot Systems[M]. Beijing: Tsinghua University Press, 2005.
[13] CARPENTER G A, GROSSBERG S. ART-2: self-organization of stable category recognition codes for analog input patterns[J]. Applied Optics, 1987, 26(23): 4919-4930.

About the authors:
ZHANG Hui-di (born 1974), … E-mail: nbzhd@hotmail.com;
LIU Shi-rong (born 1952), … E-mail: liushirong@hdu.edu.cn.