Center for Robotics and Intelligent Machines
EvBot Robot Software (Machine Learned Control for Autonomous Robot Colony Research)
Presented by: Dr. Edward Grant, Associate Professor and Director, CRIM, North Carolina State University, Department of Electrical and Computer Engineering
Presentation Outline
• Background
• Design Specifications and Implementation
• Experiments
• Conclusion and Suggestions for Future Work
Background
• Artificial evolution was applied to evolve neural networks to control autonomous mobile robots
• The robot controllers were evolved to play a competitive team game: Capture the Flag
• During artificial evolution, selection of the fittest was based on the results of robot-robot competition
• Evolved controllers were tested in competition against a knowledge-based controller and were able to win a small majority of games in extensive tournaments
Remote Control
• Remote vision processing and control on a PC running MATLAB
[Block diagram: robot side — camera, video transmitter, treaded base, BS2 MCU, digital receiver, H-bridge tread drive; desktop controlling computer — video receiver, video capture card, web cam image acquisition software, digital transmitter, RS232 serial port, MATLAB-based controller]
MATLAB Control
Start: launch the web cam as a separate process before proceeding, then initialize the serial port for sending data wirelessly.
Web cam process: grab image from JE video transmission → save image → wait one second (repeat).
Main control loop: read image saved by web cam → process image → send JE command(s) wirelessly → wait for JE to respond → wait for next image from web cam (repeat).
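The control flow above can be sketched as a sense-process-act loop in Python. This is only a structural sketch: the frame source, image-processing step, and command channel below are hypothetical stand-ins for the web cam process, vision code, and wireless serial link.

```python
def process_image(image):
    """Stand-in for the vision step: map the latest image to a drive command."""
    return "FORWARD" if image else "STOP"

def control_loop(frame_source, send_command, max_steps=100):
    """Main control loop: read the image saved by the web cam process,
    process it, and send the resulting command over the (stubbed) link."""
    for _ in range(max_steps):
        image = next(frame_source, None)   # latest frame saved by the web cam
        if image is None:
            break                          # no more frames available
        command = process_image(image)
        send_command(command)              # stand-in for the wireless serial send
        # real system: wait for the robot to respond, then for the next image

# usage: drive the loop with three dummy frames, collecting the commands sent
sent = []
control_loop(iter(["frame1", "frame2", "frame3"]), sent.append)
```

The real system blocks on the robot's acknowledgement and on the next saved frame; here those waits are elided so the loop structure is visible.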
EvBot Specification
• Small size and low cost
• Ease of software development
• High-bandwidth communications
• Powerful CPU
• Advanced sensing capabilities
MATLAB for Software Development
• Very high-level language
• Extensible
• Frequently used for controller design
• Significant hardware and software requirements
Communications
• Radio frequency wireless communications
  – Shared bandwidth a potential problem
  – Short-range local communication
• Information sharing protocol
• Remote monitoring and control software
• Wireless Ethernet
Pentium-Class CPU
• Run complex controllers
• Run MATLAB
• PC-compatible
• Fast
• Diverse sensors, new sensors, device-driver interface
EvBot Hardware
• Optional USB camera
• PC/104 stack
• Utility board
• Treaded base
EvBot Software
• BasicX controller code and interface protocol
• Custom miniature Linux distribution
• User-supplied EvBot controller
Evolutionary Robotics (ER)
• Roots in Evolutionary Computation, Artificial Life, and Behavioral Robotics
• Population-based artificial evolution
• Automatically synthesize intelligent robot controllers
• Reinforcement Learning
• Synthesis vs. Optimization
• Behavioral vs. Dynamic systems
Population-Based Artificial Evolution in ER
[Diagram: the evolutionary loop, P = (p1, p2, …, pN)]
1. Population initialization P(k=0)
2. Performance of controllers (p in P) instantiated in robots in an environment
3. Fitness evaluation of each p in P based on performance in the environment: F(p1), …, F(pN)
4. Re-order P based on fitness values, e.g. {F(p2) > F(p6) > F(p3) > …}
5. Propagate P(k) to P(k+1) using a Genetic Algorithm (GA) (mutation/crossover), producing offspring such as p2′, p6′, p3′
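The propagation step above (rank by fitness, then form P(k+1) by mutation/crossover) can be sketched in Python. Toy individuals and a toy mutation operator are used here; the actual GA, genetic encoding, and crossover operators used in this work are not shown.

```python
import random

def next_generation(population, fitness, mutate, elite=2, seed=0):
    """Propagate P(k) to P(k+1): evaluate fitness, re-order the population,
    keep the elite unchanged, and fill the rest with mutated copies of
    parents drawn from the fitter half."""
    rng = random.Random(seed)
    ranked = sorted(population, key=fitness, reverse=True)
    offspring = list(ranked[:elite])                      # fittest survive as-is
    while len(offspring) < len(population):
        parent = rng.choice(ranked[: max(1, len(ranked) // 2)])
        offspring.append(mutate(parent, rng))             # e.g. p2 -> p2'
    return offspring

# toy example: individuals are numbers and fitness is the value itself
pop = [3, 1, 4, 1, 5, 9]
new_pop = next_generation(pop, fitness=lambda p: p,
                          mutate=lambda p, rng: p + rng.uniform(-1, 1))
```

Elitism is one common design choice for keeping the best controllers across generations; whether the original experiments used it is not stated in the slides.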
Examples of ER Work: Simple
• 1995: Jakobi, Harvey, Husbands
  – Phototaxis, obstacle avoidance (Khepera robots)
  – Evolution in simulation with transference to reality
Examples of ER Work: Complex
• 1997: Nolfi
  – Garbage collection
  – Khepera robot
  – Evolution in simulation with transference to reality
Examples of ER Work: Co-Competitive
• 1996: Cliff and Miller
  – Co-evolution of predator (fox) and prey (rabbit)
  – Khepera robot
  – Evolution in simulation with transference to reality
CRIM ER Test-Bed: EvBots
[Photos: EvBot with tactile sensors; EvBots with cameras and colored shells; EvBot II]
CRIM ER Test-Bed: Environment
CRIM ER Test-Bed: Video Range Emulation Sensors
[Figure: three processing stages per camera view — Raw Image → Color Identification → Calculated Range Data. Identified objects include the green robot, red robot, green goal, red goal, and walls; calculated range data are plotted against horizontal position (pixels).]
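The pipeline in the figure (raw image → color identification → calculated range data) can be approximated with a per-column scan over a color-labeled image. This is a sketch under an assumption not stated in the slides: for a ground-level camera, lower pixels are nearer, so the lowest labeled pixel in each column gives the nearest object and its emulated range.

```python
def ranges_from_labels(labels, background=0):
    """For each image column, return (label, emulated_range) of the nearest
    labeled object, or None if the column is all background.
    Row 0 is the top of the image; lower rows are assumed nearer the camera,
    so range is counted as rows above the bottom edge."""
    height = len(labels)
    ranges = []
    for col in range(len(labels[0])):
        hit = None
        for row in range(height - 1, -1, -1):   # scan bottom-up: nearest first
            if labels[row][col] != background:
                hit = (labels[row][col], height - 1 - row)
                break
        ranges.append(hit)
    return ranges

# 3x4 color-labeled image: 1 = red robot, 2 = green goal, 0 = background
img = [
    [0, 2, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 1, 0],
]
column_ranges = ranges_from_labels(img)
```

The real sensor emulation works on camera frames after color classification; the integer labels and tiny image here are illustrative only.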
CRIM ER Test-Bed: Real vs. Simulation
[Figure: real sensors vs. simulated sensors]
Evolutionary ANN Controller Architecture
• Weights, connections, and topology can all be mutated during evolution
[Figure: an evolved network topology with 35 inputs, roughly 60 hidden neurons of mixed types, and 2 linear outputs, plus its matrix of connections (initialization type = ffw). The legend lists node types lin, ldl, sig, sdl, stp, spl, rbf, rdl and feed-forward/feed-back connections.]
Current Issues in ER
• Control architecture
• Evolution in simulation vs. evolution in real robots
• Generalization of ER methods to evolve complex behaviors
• Fitness selection in artificial evolution
Fitness Selection Functions
[Diagram: the same evolutionary loop as before — population initialization P(k=0), performance of controllers (p in P) in an environment, fitness evaluation, re-ordering by fitness, GA propagation of P(k) to P(k+1) — with the fitness selection function F(p) highlighted]
Fitness Function Example
• Hand-formulated task-specific fitness functions, e.g. obstacle avoidance:
  F(p) = α ∫ (V_left + V_right) dt − β ∫ (Sensor_act) dt
• Co-competitive fitness functions: introduce changes in the fitness landscape (the Red Queen Effect), e.g. predator and prey:
  F_rabbit(p) = max ∫ Dist(fox, rabbit) dt
  F_fox(p) = min ∫ Dist(fox, rabbit) dt
• Aggregate fitness functions:
  F(p) = 1 if success, 0 if failure
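Two of the fitness styles above can be written out directly in Python. Discrete-time sums stand in for the integrals, and the α, β weights and sample data are illustrative, not values from the original experiments.

```python
def obstacle_avoidance_fitness(v_left, v_right, sensor_act, alpha=1.0, beta=0.5):
    """Hand-formulated, task-specific fitness:
    F(p) = alpha * sum(V_left + V_right) - beta * sum(Sensor_act)
    (reward forward motion, penalize proximity-sensor activation)."""
    drive = sum(vl + vr for vl, vr in zip(v_left, v_right))
    return alpha * drive - beta * sum(sensor_act)

def aggregate_fitness(success):
    """Aggregate fitness: 1 if the task succeeded, 0 if it failed."""
    return 1 if success else 0

# usage: two control steps of wheel speeds and sensor activations
f = obstacle_avoidance_fitness(v_left=[1.0, 1.0], v_right=[1.0, 0.5],
                               sensor_act=[0.2, 0.8])
```

The co-competitive predator/prey functions are not shown because their max/min is taken over whole trials of the opposing population, which needs the full simulation in the loop.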
EvBot Capture the Flag
• Populations of robot controllers evolved to play Capture the Flag
• Selection based on competitive tournaments
• Games played in maze environments
Motivation
• Goal: Apply competitive fitness selection to ER
  – Controllers actively compete during evolution
  – Competition drives the evolution of complex behavior
  – Relative win/lose competition must be used to gain the benefits of competitive selection
• Problem: Initial populations are too unfit
  – Randomly initialized controllers can't win any games at all (the Bootstrap Problem)
  – Need to use human bias to overcome the Bootstrap Problem
  – Bias will disrupt the relative win/lose competition
• Solution: The bimodal fitness function
The Bimodal Fitness Function
• Fitness F(p) of an individual p in an evolving population P (p ∈ P) takes the general form:
  F(p) = F_mode_1(p) ⊕ F_mode_2(p)
• Mode 1: Accommodates sub-minimally competent initial populations (the Bootstrap Problem)
• Mode 2: Allows for competitive win/lose selection in minimally competent populations
The Bimodal Fitness Function
• Mode 1: F_mode_1 = F_dist + s + m, where
  F_dist = α(D − d) if d < D, 0 otherwise
The Bimodal Fitness Function
• Mode 2: F_mode_2(p)
  Game pair outcome → fitness points awarded:
  – win-win: 3
  – win-draw: 1
  – win-lose: 0.5
  – draw-draw: 0 (F_mode_1 dominates)
  – draw-lose: 0 (F_mode_1 dominates)
  – lose-lose: 0 (F_mode_1 dominates)
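The two modes can be combined in a Python sketch. Two caveats: the s and m terms of Mode 1 are kept as placeholder arguments because the slides do not define them, and "use the competitive points whenever the game pair includes a win, Mode 1 otherwise" is an assumed reading of how the modes combine.

```python
GAME_PAIR_POINTS = {            # Mode 2 table from above
    ("win", "win"): 3.0,
    ("win", "draw"): 1.0,
    ("win", "lose"): 0.5,
    ("draw", "draw"): 0.0,      # F_mode_1 dominates
    ("draw", "lose"): 0.0,
    ("lose", "lose"): 0.0,
}

def mode1_fitness(d, D, alpha=1.0, s=0.0, m=0.0):
    """Mode 1: F_dist + s + m, with F_dist = alpha*(D - d) if d < D else 0.
    s and m are undefined in the slides; defaults are placeholders."""
    f_dist = alpha * (D - d) if d < D else 0.0
    return f_dist + s + m

def bimodal_fitness(d, D, outcomes, alpha=1.0):
    """Bimodal fitness: competitive points when the game pair included a win,
    distance-based bootstrap fitness otherwise (assumed combination rule)."""
    order = {"win": 0, "draw": 1, "lose": 2}
    key = tuple(sorted(outcomes, key=order.__getitem__))
    points = GAME_PAIR_POINTS[key]
    return points if points > 0 else mode1_fitness(d, D, alpha)

# a pair of drawn/lost games falls back to the distance-based mode
f = bimodal_fitness(d=2.0, D=5.0, outcomes=("draw", "lose"))
```

This structure lets a sub-minimally competent population climb the distance gradient until wins start occurring, at which point win/lose selection takes over.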
Results 1
[Figure: simulated game board with robots evbot1.crim–evbot4.crim, the red and green goals, and the winning robot's path (red)]
• Game in a simple simulated world
• Robots use evolved ANN controllers
• Winner: Red
Results 2
[Figure: simulated game board with robots evbot1.crim–evbot4.crim, the red and green goals, and the winning robot (green)]
• Game in a large, complicated simulated world
• Robots use evolved controllers
• Winner: Green
Transfer to the Real World
[Figure: real-world game with red and green robots and the red and green goals]
• Game in the real world
• Robots use evolved ANN controllers
• Winner: Red
Acknowledgements
• NC State University: Dr. Edward Grant (PI), John Galeotti, Lt. Stacey Rhody (USMC), Dr. Andrew Nelson, Leonardo Mattos, Greg Barlow, Kyle Luthy, Chris Braly
• San Diego State University: Dr. Gordon Lee (PI), Damion Gastelum, Cmdr. Tom Jones (USN)
• DARPA: MTO Division, Dr. Elana Ethridge, for the original inspiration