Introduction to Genetic Algorithm
By: J. Ivakpour
Introduction to GA
- Introduction
- Overview
- Terminology
- Designing of GA
- GA’s Operators
- Example
- Genetic Programming
- Conclusion
- References
Introduction to GA: Hill Climbing
[Figure: a fitness landscape with a global optimum and a local optimum]
Introduction to GA: Hill Climbing
- Multi-climbers
Introduction to GA: GA
- Genetic algorithm
[Figure: climbers exchanging information: “I am not at the top. My height is better!”, “I am at the top. Height is ...”, “I will continue”]
Introduction to GA: GA
- Genetic algorithm, a few microseconds after
Introduction to GA: OVERVIEW
- A class of probabilistic optimization algorithms
- Inspired by the biological evolution process
- Uses the concepts of “Natural Selection” and “Genetic Inheritance” (Darwin, 1859)
- Originally developed by John Holland (1975)
Introduction to GA: Evolutionary Computation
- Evolutionary computation simulates natural evolution on a computer
- Goal of natural evolution – to generate a population of individuals with increasing fitness (increasing ability to survive and reproduce in a specific environment)
- Goal of evolutionary computation – to generate a set of solutions (to a problem) of increasing quality
Introduction to GA: GA Terminology
- Individual – candidate solution to a problem
- Chromosome – representation of the candidate solution (encoding maps an individual to its chromosome; decoding maps the chromosome back to the individual)
- Gene – constituent entity of the chromosome
- Population – set of individuals/chromosomes
- Fitness function – representation of how good a candidate solution is
- Genetic operators – operators applied on chromosomes in order to create genetic variation (other chromosomes)
Introduction to GA: Metaphor
EVOLUTION         PROBLEM SOLVING
Environment   <-> Problem
Individual    <-> Candidate Solution
Fitness       <-> Quality
Introduction to GA: Flowchart
1. Initialize the population
2. Select individuals for the mating pool
3. Perform crossover
4. Perform mutation
5. Insert offspring into the population
6. Stop? If no, repeat from the selection step; if yes, the end
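The flowchart can be sketched as a minimal generational loop in Python. This is an illustration under stated assumptions, not the slides' own implementation: the function name `run_ga`, the bit-list genome, the binary tournament used to fill the mating pool, the full replacement of the population, the OneMax fitness (count of 1 bits) and all parameter values are choices made for this sketch.

```python
import random

def run_ga(fitness, n_genes=16, pop_size=20, generations=50,
           p_crossover=0.8, p_mutation=0.02, seed=0):
    """Minimal generational GA over bit lists (illustrative sketch).

    pop_size is assumed to be even so the mating pool can be paired up.
    """
    rng = random.Random(seed)
    # Initialize the population with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # "Stop?" is a fixed generation budget here
        # Select individuals for the mating pool (binary tournament, assumed).
        pool = [max(rng.sample(population, 2), key=fitness)
                for _ in range(pop_size)]
        offspring = []
        for i in range(0, pop_size, 2):
            a, b = pool[i][:], pool[i + 1][:]
            # Perform one-point crossover with probability p_crossover.
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_genes - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring += [a, b]
        # Perform mutation: flip each bit with probability p_mutation.
        for child in offspring:
            for j in range(n_genes):
                if rng.random() < p_mutation:
                    child[j] ^= 1
        # Insert offspring into the population (full replacement, assumed).
        population = offspring
        best = max(population + [best], key=fitness)
    return best

best = run_ga(fitness=sum)  # OneMax: fitness is the number of 1 bits
print(best, sum(best))
```

Tracking `best` outside the population keeps the best individual seen so far even though each generation is fully replaced.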
Introduction to GA: Designing GA
- How to represent genomes?
- How to define the crossover operator?
- How to define the mutation operator?
- How to define the fitness function?
- How to generate the next generation?
- How to define the stopping criteria?
Introduction to GA: Representing Genomes
Representation                Example
string                        1 0 1 1 1 0 0 1
array of strings              http, avala, yubc, net, ~apopovic
tree (genetic programming)    expression tree over the nodes or, >, xor, a, b, b, c
Introduction to GA: Crossover
- Crossover is a concept from genetics.
- Crossover combines genetic material from two parents in order to produce superior offspring.
- A few types of crossover:
  - One-point
  - Multiple-point
Introduction to GA: One-point Crossover
Parent #1:     0 1 | 2 3 4 5 6 7
Parent #2:     7 6 | 5 4 3 2 1 0
Offspring #1:  0 1 | 5 4 3 2 1 0
Offspring #2:  7 6 | 2 3 4 5 6 7
(The bar marks the crossover point, after the second gene.)
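A short Python sketch of one-point crossover reproduces this slide's figure, reading it as parents 0..7 and 7..0 with the cut after the second gene. The function name `one_point_crossover` and the explicit `cut` argument are assumptions made for illustration; in a real GA the cut would normally be drawn at random.

```python
def one_point_crossover(parent1, parent2, cut):
    """Swap the tails of two equal-length parents after position `cut`."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

p1 = [0, 1, 2, 3, 4, 5, 6, 7]
p2 = [7, 6, 5, 4, 3, 2, 1, 0]
c1, c2 = one_point_crossover(p1, p2, cut=2)
print(c1)  # [0, 1, 5, 4, 3, 2, 1, 0]
print(c2)  # [7, 6, 2, 3, 4, 5, 6, 7]
```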
Introduction to GA: Mutation
- Mutation introduces randomness into the population.
- The idea of mutation is to reintroduce divergence into a converging population.
- Mutation is applied to only a small part of the population, in order to avoid entering an unstable state.
Introduction to GA: Mutation
Parent:  1 1 0 1 0 0 0 1
Child:   0 1 0 1 0 1 0 1
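Bit-flip mutation on a binary chromosome might look like the sketch below. The names `bit_flip_mutation` and `flip_at` are hypothetical; the second helper flips chosen positions deterministically so that the parent and child of this slide can be reproduced (the first and sixth bits differ between them).

```python
import random

def bit_flip_mutation(chromosome, p_mutation, rng=random):
    """Return a copy in which each bit is flipped with probability p_mutation."""
    return [1 - g if rng.random() < p_mutation else g for g in chromosome]

def flip_at(chromosome, positions):
    """Deterministic variant: flip exactly the given (0-based) positions."""
    return [1 - g if i in positions else g for i, g in enumerate(chromosome)]

parent = [1, 1, 0, 1, 0, 0, 0, 1]
child = flip_at(parent, {0, 5})
print(child)  # [0, 1, 0, 1, 0, 1, 0, 1]
```

With a small `p_mutation`, `bit_flip_mutation` leaves most chromosomes unchanged, which matches the idea of mutating only a small part of the population.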
Introduction to GA: Mutation and Selection
[Figure: three plots of the solution distribution D against phenotype, illustrating the effects of selection and mutation]
Introduction to GA: About Probabilities
- The average probability for an individual to undergo crossover is, in most cases, about 80%.
- The average probability for an individual to mutate is about 1-2%.
- The probabilities of the genetic operators follow the probabilities in natural systems.
- The better solutions reproduce more often.
Introduction to GA: Typical Run
Introduction to GA: Fitness Function
- The fitness function is an evaluation function that determines which solutions are better than others.
- Fitness is computed for each individual.
- The fitness function is application dependent.
Introduction to GA: Selection
- The selection operation copies a single individual, probabilistically selected based on fitness, into the next generation of the population.
- There are a few possible ways to implement selection:
  - “Only the strongest survive”
    - Choose the individuals with the highest fitness for the next generation
  - “Some weak solutions survive”
    - Assign a probability that a particular individual will be selected for the next generation
    - More diversity
    - Some bad solutions might have good parts!
Introduction to GA: Selection - Survival of the Strongest
Previous generation:  0.93  0.51  0.72  0.31  0.12  0.64
Next generation:      0.93  0.72  0.64
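"Survival of the strongest" amounts to keeping the top-k individuals by fitness. The sketch below assumes the figure shows a previous generation of six fitness values (0.93, 0.51, 0.72, 0.31, 0.12, 0.64) of which the best three survive; the function name `truncation_selection` is a label chosen for this example.

```python
def truncation_selection(fitnesses, k):
    """Keep the k highest fitness values ("only the strongest survive")."""
    return sorted(fitnesses, reverse=True)[:k]

previous = [0.93, 0.51, 0.72, 0.31, 0.12, 0.64]
print(truncation_selection(previous, 3))  # [0.93, 0.72, 0.64]
```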
Introduction to GA: Selection - Some Weak Solutions Survive
Previous generation:  0.93  0.51  0.72  0.31  0.12  0.64
Next generation:      0.93  0.72  0.64  0.12
Introduction to GA: Roulette Wheel Selection
[Figure: roulette wheel; source: http://www.softchitech.com/ec_intro_html]
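Roulette wheel selection gives each individual a slice of the wheel proportional to its fitness, so weak solutions can still be picked occasionally. The following is a minimal sketch, assuming non-negative fitness values; the function name `roulette_wheel_select` and the sample fitness list are illustrative.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel_select(fitnesses, rng=random):
    """Return the index of one individual, chosen with probability
    proportional to its (non-negative) fitness."""
    cumulative = list(accumulate(fitnesses))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("total fitness must be positive")
    spin = rng.random() * total  # where the wheel stops
    # The min() guards against floating-point rounding at the upper edge.
    return min(bisect_right(cumulative, spin), len(fitnesses) - 1)

fitnesses = [0.93, 0.51, 0.72, 0.31, 0.12, 0.64]
rng = random.Random(42)
counts = [0] * len(fitnesses)
for _ in range(10_000):
    counts[roulette_wheel_select(fitnesses, rng)] += 1
print(counts)  # roughly proportional to the fitness values
```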
Introduction to GA: Stopping Criteria
- The final problem is to decide when to stop the execution of the algorithm.
- There are two possible solutions to this problem:
  - First approach: stop after a fixed number of generations has been produced
  - Second approach: stop when the improvement in average fitness over two generations is below a threshold
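Both stopping rules can be combined in one small check, sketched below. The function name `should_stop` and its parameters are hypothetical; the history is assumed to hold the average fitness of every generation produced so far.

```python
def should_stop(avg_fitness_history, max_generations, min_improvement):
    """Return True when either stopping rule fires.

    avg_fitness_history: average fitness of each generation so far.
    """
    generations_done = len(avg_fitness_history)
    # First approach: fixed generation budget.
    if generations_done >= max_generations:
        return True
    # Second approach: improvement over the last two generations is too small.
    if generations_done >= 2:
        improvement = avg_fitness_history[-1] - avg_fitness_history[-2]
        return improvement < min_improvement
    return False

print(should_stop([1.0, 2.0], max_generations=100, min_improvement=0.01))    # False
print(should_stop([1.0, 1.001], max_generations=100, min_improvement=0.01))  # True
```

Comparing only two consecutive generations is sensitive to noise; averaging the improvement over a longer window is a common variation.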
Introduction to GA: Genetic Algorithm
- Problem definition
- Encoding of the candidate solution
- Fitness definition
- Run
- Decoding the best fitted chromosome = solution
Introduction to GA: Working of GA: phases
Phases in optimizing on a 1-dimensional fitness landscape:
- Early phase: quasi-random population distribution
- Mid-phase: population arranged around/on hills
- Late phase: population concentrated on high hills
Introduction to GA: Working of GA: typical run
Typical run of an EA shows so-called “anytime behavior”
[Figure: best fitness in population plotted against time (number of generations)]
Introduction to GA: Working of GA: long runs might not pay off
[Figure: best fitness in population plotted against time (number of generations), comparing progress in the 1st half of the run with progress in the 2nd half]
Introduction to GA: Working of GA: smart initialization
[Figure: best fitness in population plotted against time (number of generations), with level F and time T marked]
F: fitness after smart initialization
T: time needed to reach level F after random initialization
Introduction to GA: Performance of EAs
[Figure: performance of methods on problems, plotted over the scale of “all” problems, with curves for a special, problem-tailored method, an evolutionary algorithm, and random search]
Introduction to GA: Performance of EAs
[Figure: performance of methods on problems, plotted over the scale of “all” problems, with curves for EA 1, EA 2, EA 3 and EA 4 and a problem P marked on the scale]
Introduction to GA: An example - Goldberg ’89
- Simple problem: max x^2 over {0, 1, …, 31}
- GA approach:
  - Representation: binary code, e.g. 01101 ↔ 13
  - Population size: 4
  - 1-point xover, bitwise mutation
  - Roulette wheel selection
  - Random initialisation
- We show one generational cycle done by hand
Introduction to GA: An example - Goldberg ’89
x^2 example: selection
Introduction to GA: An example - Goldberg ’89
x^2 example: crossover
Introduction to GA: An example - Goldberg ’89
x^2 example: mutation
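The setup of this example (5-bit binary code, fitness x^2, selection probability proportional to fitness) can be sketched in a few lines of Python. The four initial strings below are an assumption made for illustration, since the slide's tables did not survive; the helper names `decode` and `fitness` are likewise hypothetical.

```python
def decode(bits):
    """Decode a binary string such as '01101' into an integer (13)."""
    return int(bits, 2)

def fitness(bits):
    """Fitness of a chromosome in the max x^2 problem."""
    return decode(bits) ** 2

population = ["01101", "11000", "01000", "10011"]  # assumed initial strings
scores = [fitness(b) for b in population]
total = sum(scores)
for bits, score in zip(population, scores):
    # Roulette wheel selection probability is fitness / total fitness.
    print(bits, decode(bits), score, score / total)
```

With these strings the second individual (x = 24) receives almost half of the wheel, so it is the most likely to be copied into the mating pool.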
Introduction to GA: GENETIC PROGRAMMING
Introduction to GA: PREPARATORY STEPS
Objective: Find a computer program with one input (independent variable X) whose output equals the given data
1. Terminal set: T = {X, Random-Constants}
2. Function set: F = {+, -, *, %}
3. Fitness: The sum of the absolute value of the differences between the candidate program’s output and the given data (computed over numerous values of the independent variable x from –1.0 to +1.0)
4. Parameters: Population size M = 4
5. Termination: An individual emerges whose sum of absolute errors is less than 0.1
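A candidate program built from the terminal set {X, constants} and the function set {+, -, *, %} can be represented as a nested tuple and evaluated recursively, as in the sketch below. The tuple encoding and the names `protected_div`, `FUNCTIONS` and `evaluate` are assumptions for illustration. The `%` symbol is treated here as protected division (returning 1 when the divisor is zero), which is the usual convention in genetic programming; the slide itself does not spell this out.

```python
import operator

def protected_div(a, b):
    """Protected division: return 1.0 when the divisor is zero (assumed rule)."""
    return a / b if b != 0 else 1.0

FUNCTIONS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "%": protected_div}

def evaluate(tree, x):
    """Evaluate a program tree: 'x', a numeric constant, or (op, left, right)."""
    if isinstance(tree, (int, float)):
        return tree
    if isinstance(tree, str):  # the only string terminal is the variable 'x'
        return x
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))

# x*x + x + 1 written as a nested tuple
program = ("+", ("+", ("*", "x", "x"), "x"), 1)
print(evaluate(program, 2.0))  # 7.0
```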
Introduction to GA: SYMBOLIC REGRESSION
POPULATION OF 4 RANDOMLY CREATED INDIVIDUALS FOR GENERATION 0
Introduction to GA: SYMBOLIC REGRESSION x^2 + x + 1
FITNESS OF THE 4 INDIVIDUALS IN GEN 0
Individual   Program    Fitness
(a)          x + 1      0.67
(b)          x^2 + 1    1.00
(c)          2          1.70
(d)          x          2.67
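The four fitness values are consistent with the integral of the absolute error against the target x^2 + x + 1 over [-1, 1]; for example, for x + 1 the error is x^2, whose integral is 2/3 ≈ 0.67. The sketch below checks this numerically with a midpoint rule. Interpreting the "sum over numerous values of x" as this integral is an assumption, and the name `abs_error` and the step count are illustrative.

```python
def abs_error(candidate, target, lo=-1.0, hi=1.0, n=20_000):
    """Midpoint-rule estimate of the integral of |candidate - target| on [lo, hi]."""
    h = (hi - lo) / n
    midpoints = (lo + (i + 0.5) * h for i in range(n))
    return sum(abs(candidate(x) - target(x)) for x in midpoints) * h

target = lambda x: x * x + x + 1
candidates = {
    "x + 1":   lambda x: x + 1,
    "x^2 + 1": lambda x: x * x + 1,
    "2":       lambda x: 2.0,
    "x":       lambda x: x,
}
errors = {name: round(abs_error(f, target), 2) for name, f in candidates.items()}
print(errors)  # {'x + 1': 0.67, 'x^2 + 1': 1.0, '2': 1.7, 'x': 2.67}
```

Lower values are better here, so x + 1 is the fittest individual of generation 0.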
Introduction to GA: SYMBOLIC REGRESSION x^2 + x + 1
GENERATION 1
- Copy of (a)
- Mutant of (c), picking “2” as mutation point
- First offspring of crossover of (a) and (b), picking “+” of parent (a) and left-most “x” of parent (b) as crossover points
- Second offspring of crossover of (a) and (b), picking “+” of parent (a) and left-most “x” of parent (b) as crossover points
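Subtree crossover swaps the subtrees rooted at the chosen crossover points. The sketch below assumes parent (a) is x + 1 and parent (b) is x*x + 1, written as nested tuples with these particular shapes; the path-based helpers `get_subtree`, `replace_subtree` and `subtree_crossover` are hypothetical names. Picking the root “+” of (a) and the left-most “x” of (b) then gives one offspring equal to x and another equal to (x + 1)*x + 1, which expands to x^2 + x + 1.

```python
def get_subtree(tree, path):
    """Follow a path of child indices from the root and return that subtree."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_crossover(a, path_a, b, path_b):
    """Swap the subtrees at path_a in a and path_b in b."""
    return (replace_subtree(a, path_a, get_subtree(b, path_b)),
            replace_subtree(b, path_b, get_subtree(a, path_a)))

a = ("+", "x", 1)              # x + 1 (assumed tree shape)
b = ("+", ("*", "x", "x"), 1)  # x*x + 1 (assumed tree shape)
# () is the root "+" of a; (1, 1) is the left-most "x" of b.
child1, child2 = subtree_crossover(a, (), b, (1, 1))
print(child1)  # x
print(child2)  # ('+', ('*', ('+', 'x', 1), 'x'), 1)
```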
Introduction to GA: Conclusion: advantages and disadvantages
Advantages
- No presumptions w.r.t. problem space
- Widely applicable
- Low development & application costs
- Easy to incorporate other methods
- Solutions are interpretable (unlike NN)
- Can be run interactively, accommodate user proposed solutions
- Provides many alternative solutions
- Intrinsic parallelism
Disadvantages
- No guarantee for optimal solution within finite time
- Weak theoretical basis
- May need parameter tuning
- Often computationally expensive, i.e. slow
Introduction to GA: References
- www.cs.vu.nl/~jabekker/ec0607/
- web.umr.edu/~ercal/387/
- www.genetic-programming.com
- David E. Goldberg (1989), Genetic Algorithms in Search, Optimization, and Machine Learning.