An Interactive-Learning System for Pet Robots

Preprints of the 18th IFAC World Congress Milano (Italy) August 28 - September 2, 2011

Chi-Tai Cheng*, Yu-Ting Yang**, Jacky Baltes*, and Ching-Chang Wong**

*Dept. of Computer Science, University of Manitoba, Canada, R3T 2N2 (Tel: +1-204-898-5885; email: [email protected])
**Dept. of Electrical Engineering, Tamkang University, Taiwan, 25173 (Tel: +886-2-262156565 ext. 2733; email: [email protected])

Abstract: This paper presents a four-legged robotic dog with an interactive-learning system. Four sub-systems (physiology, feeling, learning, and behaviour) are built into the robot to realise the interactive-learning system. Three feelings are designed for the robot: the battery and motor statuses are mapped to the feelings of hunger and fatigue, and curiosity, which makes the pet robot want to play with humans, is the third feeling. Initially the pet robot does not know how to react when it feels something, so a neural-network system is implemented to build the relationship between the behaviours and the feelings. As the frequency of interaction increases, the pet robot is able to choose the right behaviours more often. Experimental results show the efficiency of the proposed method.

Keywords: Pet robot, Interactive-learning, Neural-Network

1. INTRODUCTION

Building an interactive robot that users form an emotional bond with has long been an active research area. In particular, pet robots are one of the important research topics for interactive robots, e.g. Hirose (1984). Many people consider their pet a member of the family. Pet robots have the advantage that they are familiar to users and look cute, which allows them to enter daily life easily. Pet robots are autonomous robots: they need to sense the environment, make decisions, and execute actions to modify the environment, e.g. Blumberg and Galyean (1995); Blumberg et al. (1996); Kubota et al. (2001). Learning new behaviours and the ability to interact meaningfully with the user has always been the most challenging problem, e.g. Terzopoulos et al. (1994); Buehler et al. (1998); Berns et al. (1999); Luo et al. (2002). AIBO, designed and manufactured by Sony in 1999, generated a great amount of interest in robotic pets, e.g. Cordes et al. (1997); Seiji and Tomohiro (2004).

This paper presents BLUE, a dog-like pet robot with interactive and learning abilities. The pet robot architecture includes four sub-systems: physiology, feeling, learning, and behaviour. Initially, the robot does not know what effects a given behaviour will have on the user. In contrast to real pets, the pet robot has to learn that it must beg for food if it is hungry and wants the user to feed it.

This paper is organized as follows. Section 2 describes the hardware and software architecture of the pet robot. Section 3 focuses on the proposed interactive-learning method. Section 4 presents the experimental results and discussion. Section 5 concludes the paper.

2. SYSTEM ARCHITECTURE

BLUE is an interactive-learning pet robot. Table 1 shows the hardware details of BLUE. The main goal of BLUE is to learn how to make decisions by itself by interacting with its environment. BLUE has 17 degrees of freedom (DOFs), as shown in Table 1.

BLUE is a four-legged robotic dog. It has three feelings and four behaviours: the feelings are HUNGER, CURIOSITY, and FATIGUE, and the behaviours are EAT, SLEEP, SKIPPING, and PLAYING. BLUE uses the proposed Neural Network (NN) learning system to build the connection between the feelings and the behaviours. The fuzzy efficiency system provides a flexible interactive system for the users: the executed behaviours override the feelings differently when the robot has the same feeling but to a different degree.


Table 1. System specification of BLUE

  Height:      35 cm
  Weight:      4.17 kg
  Motors:      Head: KONDO KRS 788; Legs: KONDO KRS 2350
  DOFs:        Ears: 2; Mouth: 1; Neck: 2; Legs: 3 x 4
  Sensors:     Voltage sensor x 1; Temperature sensor x 1; Touch sensors x 3
  Processors:  Intel® Pentium® M 1.0 GHz, NIOS, and 8051


2.1 Mechanical Design

The robot can execute various movements and motions to express its emotions, such as swinging its ears, rotating its head, and barking. The head includes a 16x8 LED dot-matrix display. The frame of BLUE is mainly fabricated from aluminium alloy 5052.

2.2 BLUE's Hardware Design

Fig. 1 shows the hardware design of BLUE. There are three processors in BLUE: (a) a notebook, (b) a NIOS FPGA, and (c) an 8051 microcontroller. The notebook is responsible for higher-level reasoning: it reads a USB camera, processes the images, decides which action to perform next, and uses a neural network to learn about its actions and environment. The NIOS board, a 32-bit embedded soft-core MCU (Micro-Control Unit), is responsible for the additional sensors and for motion control; the additional sensor data (e.g., voltage, temperature, and touch) is sampled and processed there. The 8051 microcontroller controls the LED display: it receives an emotional context from the notebook and adapts the LED display to express the emotion.

Fig. 1. BLUE's hardware design (notebook: image processing, environment analysis, behaviour decision, and NN learning; NIOS: sensor reception, data analysis, and motor control for the neck, ear, and leg servos; 8051: data analysis and LED face emotion display).

Three touch sensors, one voltage sensor, and one temperature sensor are mounted on BLUE to capture touch events, battery charge, and motor temperature, respectively. One touch sensor is mounted on BLUE's head to detect pats on its head; the other two are attached to the robot's legs. The power detector measures the voltage of the batteries and can trigger an alarm if BLUE is running out of power. The temperature sensor is mounted on the ankle of a front leg to detect the temperature of the servomotor.

2.3 Software Design

The software system developed for BLUE handles image processing, interactive learning, and system monitoring. The interface of the software program is shown in Fig. 2. To support interaction between BLUE and a human, a face recognition system is implemented. The apparent skin colour of the user changes easily when the lighting changes, so a chromaticity-based constraint for selecting training pixels in a scene to update a dynamic skin colour model under changing illumination conditions is adopted, e.g. Soriano et al. (2000a, b). The processing steps are shown in Fig. 3.

Fig. 2. Human-machine interface of BLUE.

Fig. 3. Face recognition procedure of BLUE.
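To illustrate the chromaticity-based constraint in the spirit of the cited skin-locus approach, the following is a minimal sketch in normalized r-g space. The quadratic bounds and the whitish-pixel test below are illustrative placeholders, not the constants used by BLUE or by the cited papers.

```python
# Rough sketch of chromaticity-based skin-pixel selection (assumed constants).
import numpy as np

def rg_chromaticity(image_rgb):
    """Convert an HxWx3 RGB image to normalized r, g chromaticity channels."""
    rgb = image_rgb.astype(np.float64)
    s = rgb.sum(axis=2) + 1e-6          # avoid division by zero
    r = rgb[..., 0] / s
    g = rgb[..., 1] / s
    return r, g

def skin_mask(image_rgb):
    """Return a boolean mask of pixels inside an assumed skin locus."""
    r, g = rg_chromaticity(image_rgb)
    upper = -1.38 * r**2 + 1.07 * r + 0.15   # illustrative upper bound on g
    lower = -0.78 * r**2 + 0.56 * r + 0.18   # illustrative lower bound on g
    whitish = (r - 0.33) ** 2 + (g - 0.33) ** 2 < 0.001  # exclude near-grey pixels
    return (g > lower) & (g < upper) & ~whitish
```

The resulting mask can then be used to pick training pixels and re-estimate the skin colour model as the illumination changes, corresponding to the model-update step in Fig. 3.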

3. LEARNING MECHANISM


Four sub-systems are built into the robot to implement the learning system, as shown in Fig. 4. The "Physiology Sub-system" is the physical interface to BLUE's hardware. This sub-system registers changes when people play with BLUE, when the battery voltage shifts, or when the temperature of a servomotor changes. Each change produces a different feeling ("Need") for BLUE in the "Feeling Sub-system". There are three kinds of feelings in the feeling sub-system: CURIOSITY, HUNGER, and FATIGUE. In order to override these feelings, BLUE chooses a behaviour that responds to its need. Every behaviour overrides some feelings but also incurs a price. The basic price values for all of the behaviours are shown in Table 2. For example, when the EAT behaviour is executed, HUNGER decreases by 20, but CURIOSITY and FATIGUE each increase by 10.

A Fuzzy Efficiency System (FES) is proposed to tune the efficiency of each behaviour. In the real world, eating when we feel hungry is more effective than eating at random; the Fuzzy Efficiency System is designed to model this kind of behaviour effect. The suitability value E_j for every behaviour can be calculated as shown in Fig. 5, where t_p is the average playing time per day; S_1 is CURIOSITY, S_2 is HUNGER, and S_3 is FATIGUE; d_act(t_p) is the Vitality of BLUE; V_SB(S_i, b_j), i ∈ {1, 2, 3}, j ∈ {1, 2, 3, 4}, is the price of each behaviour; and f(S_i), i ∈ {1, 2, 3}, is the output value of the FES. The same behaviour may yield a different suitability value depending on the fuzzy efficiency value f(S_i).

The best behaviour for BLUE is obtained from (1):

E_j* = min{ E_1, E_2, E_3, E_4 }    (1)

where j* ∈ {1, 2, 3, 4} indexes the four behaviours, and E_j is given by (2):

E_j = f(S_1) × V_SB(S_1, b_j) × d_act(t_p) + f(S_2) × V_SB(S_2, b_j) + f(S_3) × V_SB(S_3, b_j)    (2)

The best behaviour is the one with the smallest sum over the three different "Needs".

Fig. 4. The learning system of BLUE.

Fig. 5. Diagram of calculating the suitability value E_j.

Table 2. The price of four behaviours

                EAT (b1)   SLEEP (b2)   SKIPPING (b3)   PLAYING (b4)
  CURIOSITY S1     10          10           -15             -20
  HUNGER S2       -20          10             5              10
  FATIGUE S3       10         -20             5              10

Vitality is an automatically changing weight variable and is another character trait of the pet dog. The more often humans play with BLUE, the higher this value becomes. A robot with a higher Vitality value is more sensitive to CURIOSITY than a robot with a lower Vitality value.
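To make the selection rule concrete, the following is a minimal sketch of Eqs. (1)-(2) using the prices from Table 2. The fuzzy efficiency output f(S_i) and the vitality term d_act(t_p) are placeholder functions chosen only for illustration, since the paper does not list the exact membership functions.

```python
# Behaviour selection per Eqs. (1)-(2); f() and d_act() are assumed placeholders.
BEHAVIOURS = ["EAT", "SLEEP", "SKIPPING", "PLAYING"]

# V_SB[feeling][behaviour]: price of each behaviour, reconstructed from Table 2
# (rows: S1 CURIOSITY, S2 HUNGER, S3 FATIGUE; columns: b1..b4).
V_SB = {
    "CURIOSITY": [10, 10, -15, -20],
    "HUNGER":    [-20, 10, 5, 10],
    "FATIGUE":   [10, -20, 5, 10],
}

def f(s):
    """Placeholder fuzzy efficiency: stronger feelings weigh their prices more."""
    return s / 100.0                      # feelings range from 0 to 100

def d_act(t_p):
    """Placeholder vitality: more daily play time -> higher curiosity weight."""
    return 1.0 + min(t_p, 2.0)            # t_p: average playing time per day (hours)

def best_behaviour(S1, S2, S3, t_p):
    """Eq. (2) for every behaviour, then Eq. (1): pick the smallest E_j."""
    E = []
    for j in range(len(BEHAVIOURS)):
        E_j = (f(S1) * V_SB["CURIOSITY"][j] * d_act(t_p)
               + f(S2) * V_SB["HUNGER"][j]
               + f(S3) * V_SB["FATIGUE"][j])
        E.append(E_j)
    j_star = min(range(len(E)), key=E.__getitem__)
    return BEHAVIOURS[j_star], E

# Example: a very hungry robot selects EAT, whose weighted price sum is most negative.
print(best_behaviour(S1=20, S2=90, S3=30, t_p=1.0))
```

With these placeholders, a robot with HUNGER = 90 picks EAT (E = -11), while SLEEP, SKIPPING, and PLAYING all receive larger values.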

When BLUE is young, it does not know which behaviour is suitable for its needs, nor which behaviour yields the smallest sum of the three "Needs". For example, BLUE may show that it wants to EAT (charge its battery) when it is actually fatigued, although SLEEP is the right behaviour in this case. After BLUE executes EAT, the feeling of FATIGUE becomes stronger; the EAT behaviour therefore does not override the FATIGUE feeling. The Neural Network then adjusts its weights to improve the decision-making system.



A Back-Propagation Network (BPN) is implemented to learn how to choose the best behaviour. The network used in this paper has one input layer, one hidden layer with 25 nodes, and one output layer. The Levenberg-Marquardt (LM) algorithm is used to minimize the system error, and the log-sigmoid transfer function

f(n) = 1 / (1 + e^(-n))

is used. Fig. 6 shows the structure of the NN system:

a = f_1(IW × p + b_1)    (3)

y = f_2(LW × a + b_2)    (4)

Fig. 6. Structure of the NN learning system.

where p is the input vector containing the feelings of CURIOSITY, HUNGER, and FATIGUE, each in the range 0 to 100; IW is the weight matrix between the input layer and the hidden layer; LW is the weight matrix between the hidden layer and the output layer; b_1 and b_2 are the threshold (bias) values; f_1 and f_2 are the transfer functions of the hidden layer and the output layer; and y is the output value. Output results 1, 2, 3, and 4 mean EATING, SLEEPING, PETTING, and PLAYING, respectively.
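As a rough illustration of this structure, the sketch below implements a 3-25-4 network with log-sigmoid units following Eqs. (3)-(4). It trains with plain gradient descent instead of Levenberg-Marquardt to keep the example short, uses a one-hot four-unit output rather than the paper's scalar 1-4 output, and the training pair, learning rate, and weight initialisation are invented for illustration.

```python
# Minimal sketch of the behaviour-selection network (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

def logsig(n):
    """Log-sigmoid transfer function f(n) = 1 / (1 + e^-n)."""
    return 1.0 / (1.0 + np.exp(-n))

# Layer sizes: 3 feelings (CURIOSITY, HUNGER, FATIGUE), 25 hidden, 4 behaviours.
IW = rng.normal(scale=0.5, size=(25, 3))   # input -> hidden weights
b1 = np.zeros((25, 1))
LW = rng.normal(scale=0.5, size=(4, 25))   # hidden -> output weights
b2 = np.zeros((4, 1))

def forward(p):
    """Equations (3) and (4): a = f1(IW p + b1), y = f2(LW a + b2)."""
    a = logsig(IW @ p + b1)
    y = logsig(LW @ a + b2)
    return a, y

def train_step(p, target, lr=0.5):
    """One back-propagation step on a single (feelings, behaviour) example."""
    global IW, b1, LW, b2
    a, y = forward(p)
    err = y - target                      # output error
    dy = err * y * (1 - y)                # logsig derivative at the output layer
    da = (LW.T @ dy) * a * (1 - a)        # back-propagated hidden-layer error
    LW -= lr * dy @ a.T
    b2 -= lr * dy
    IW -= lr * da @ p.T
    b1 -= lr * da
    return float((err ** 2).mean())

# Hypothetical training pair: feelings scaled from 0-100 to 0-1; the target says
# SLEEP (index 1) is the best behaviour when FATIGUE dominates.
p = np.array([[20.0], [30.0], [90.0]]) / 100.0
target = np.array([[0.0], [1.0], [0.0], [0.0]])

for epoch in range(200):
    mse = train_step(p, target)

_, y = forward(p)
print("chosen behaviour index:", int(np.argmax(y)), "mse:", round(mse, 4))
```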

4. EXPERIMENTS

The best behaviour, which is the optimal solution of the Learning Sub-system, is determined by the Feeling Sub-system. Training stops when the error of the NN output converges to the goal value. Fig. 7 shows the progress of the learning procedure. When BLUE has just started training, it executes the EAT behaviour in situations with high FATIGUE and CURIOSITY values, as shown in Fig. 7(a), and it also wants to play when it is very hungry. The ability to choose the correct behaviour improves after 5 training runs, as shown in Fig. 7(b), although BLUE still chooses PLAYING when it feels hungry and tired. Fig. 7(c) shows that BLUE is able to choose the right behaviour after 100 training runs. The learning system settles down after 15 training runs, as shown in Fig. 8.

Fig. 7. Training results of the NN system: (a) after the 1st training, (b) after the 5th training, and (c) after the 100th training.



Fig. 8. The error rates of the NN output.

Fig. 9. BLUE shows the EAT behaviour face when HUNGER is high.

Fig. 10. BLUE shows the SLEEP behaviour face when FATIGUE is high.

Fig. 11. BLUE shows the PLAYING behaviour face when CURIOSITY is very high.



Fig. 12. The value of CURIOSITY increases when the human is playing with BLUE.

5. CONCLUSION

This paper presents a four-legged robotic dog with an interactive-learning system. BLUE is able to learn how to make decisions by itself by interacting with its environment. The experimental results show that the proposed pet robot is able to build the connection between the feelings and the behaviours. In the future, an embedded system can be used in the robot to replace the notebook, which would reduce the cost of the system. A localization ability can also be implemented so that the robot is able to walk around the house.

REFERENCES

Hirose, S. (1984). A study of design and control of a quadruped walking vehicle. International Journal of Robotics Research, vol. 3, no. 2, pp. 113-133.
Blumberg, B. M. and Galyean, T. A. (1995). Multi-level direction of autonomous creatures for real-time virtual environments. Proceedings of ACM SIGGRAPH, pp. 47-54.
Blumberg, B. M., Todd, P. T., and Maes, P. (1996). No bad dogs: Ethological lessons for learning in Hamsterdam. Proceedings of the International Conference on Simulated Adaptive Behavior, pp. 295-304.
Kubota, N., Kojima, F., and Fukuda, T. (2001). Self-consciousness and emotion for a pet robot with structured intelligence. Proceedings of the IEEE IFSA World Congress, pp. 1786-2791.
Terzopoulos, D., Tu, X., and Grzeszczuk, R. (1994). Artificial fishes: Autonomous locomotion, perception, behavior, and learning in a simulated physical world. Artificial Life, vol. 1, no. 4, pp. 327-351.
Buehler, M., Battaglia, R., Cocosco, A., Hawker, G., Sarkis, J., and Yamazaki, K. (1998). SCOUT: A simple quadruped that walks, climbs, and runs. IEEE Int. Conf. on Robotics and Automation, pp. 1707-1712.
Berns, K., Ilg, W., Deck, M., Albiez, J., and Dillmann, R. (1999). Mechanical construction and computer architecture of the four-legged walking machine BISAM. IEEE/ASME Transactions on Mechatronics, pp. 32-38.
Luo, R. C., Yih, C. C., and Su, K. L. (2002). Multisensor fusion and integration: approaches, applications, and future research directions. IEEE Sensors Journal, vol. 2, pp. 107-119.
Cordes, S., Berns, K., Eberl, M., Ilg, W., and Buhrle, P. (1997). On the design of a four-legged walking machine. IEEE Int. Conf. on Advanced Robotics, pp. 65-70.
Seiji, Y. and Tomohiro, Y. (2004). Training AIBO like a dog. IEEE Int. Workshop on Robot and Human Interactive Communication, pp. 431-436.
Soriano, M., Martinkauppi, B., Huovinen, S., and Laaksonen, M. (2000a). Skin color modeling under varying illumination conditions using the skin locus for selecting training pixels. Proceedings of the Workshop on Real-Time Image Sequence Analysis, pp. 43-49.
Soriano, M., Martinkauppi, B., Huovinen, S., and Laaksonen, M. (2000b). Skin detection in video under changing illumination conditions. 15th Int. Conf. on Pattern Recognition, pp. 839-842.
