System Identification using Genetic Programming and Gene Expression Programming

Juan J. Flores and Mario Graff
División de Estudios de Posgrado, Facultad de Ingeniería Eléctrica, Universidad Michoacana de San Nicolás de Hidalgo

Abstract. This paper describes a computer program called ECSID that automates the process of system identification using Genetic Programming and Gene Expression Programming. ECSID uses a function set and the observed data to determine an ODE whose behavior is similar to the observed data. ECSID can evolve linear and non-linear models of higher-order systems, and it can also encode a higher-order system as a set of higher-order equations. ECSID has been tested on a linear pendulum, a non-linear pendulum, a mass-spring system, and a linear circuit, among others.

1 Introduction

System identification (SID) is concerned with building a model from input-output observations. The model is represented as a mathematical formula. Linear system identification methods have been widely studied (e.g. [8]). However, these methods involve a complicated process that usually only an expert can follow. Nonlinear system identification remains a difficult task because there is frequently not enough information about the system (i.e. the system's structure is unknown).

This article introduces a system called ECSID (Evolutionary Computation based System Identification). ECSID creates a model from observed data using evolutionary techniques; it uses GP (Genetic Programming [7]) and GEP (Gene Expression Programming [3]). To find a model, ECSID needs only the observed data and a function set. ECSID represents the evolved system as an ordinary differential equation (ODE). It has the following features.

– Evolves higher-order ODEs.
– Evolves linear higher-order ODEs.
– Needs only the maximum order of the system.
– Uses GP or GEP to discover the model.

Section 2 presents related work. Section 3 briefly introduces Genetic Programming and Gene Expression Programming. Section 4 presents the methodology used in ECSID. Section 5 presents the results. Section 7 presents the conclusions and proposes some ideas for future work.

2 Related Work

Bradley et al. [1] built a system called PRET for system identification. PRET automates the system identification process by building a layer of artificial intelligence techniques around a set of traditional formal engineering methods. PRET builds models using meta-domain information about the system and hypotheses given by the user.

Gray et al. [5] used GP to model a system of fluid flow through pipes. They used GP to find the system's structure, and the Nelder-Mead simplex method and Simulated Annealing to optimize the system's parameters. Their program can evolve only first-order systems of ordinary differential equations.

Weinbrenner [9] used genetic programming to model a helicopter engine. Genetic Programming was used to find the system structure, and a search procedure (Nelder-Mead simplex and Simulated Annealing) to optimize the system parameters. He used automatically defined functions (ADFs) [7] to incorporate dynamic behavior into the system. His procedure was not able to produce a system of ordinary differential equations.

Cao et al. [2] modeled systems of ordinary differential equations using genetic programming and genetic algorithms. Genetic Programming was used to discover the system's structure, while genetic algorithms were used to optimize its parameters. Cao et al. evolved higher-order systems expressed as a set of first-order equations (SODE), as well as single higher-order ordinary differential equations (HODE). Their approach cannot evolve a system of higher-order differential equations. It always evolves equations of the same order, and the order has to be provided by the user.

Hinchliffe [6] modeled systems from observed data, but his models were not ODEs; instead, they were based on previous values of their inputs.

The conclusion section enumerates some of the characteristics of ECSID that are not found in the works mentioned in this section.

3 Genetic Programming and Gene Expression Programming

Genetic Programming [7] and Gene Expression Programming [3] are evolutionary tools inspired by the Darwinian principle of natural selection and survival of the fittest. These methods start from a random initial population and apply genetic operators to it until the algorithm finds an individual that satisfies some termination criterion. GP and GEP both evolve computer programs. GP represents a computer program as a tree structure, while GEP uses a string, where each string encodes a tree structure (Figure 1 shows this representation).

Fig. 1. Convert a chromosome to a tree expression

The genetic operators generally used by GP are crossover and mutation. Crossover chooses two individuals from the population and merges them to build two new individuals: it selects a sub-tree from each individual and swaps these sub-trees. Mutation chooses an individual from the population and randomly replaces a node in its tree structure with another. These operators are described in more detail in [7]. The genetic operators used by GEP are mutation, root transposition, gene transposition, one-point recombination, two-point recombination, and gene recombination. These operators are described in depth in [3].
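The string-to-tree conversion used by GEP can be illustrated with a minimal Python sketch. It assumes single-character symbols, a small hypothetical function set (`ARITY`), and the usual GEP convention that the gene is read left to right while the tree is filled level by level; the names `decode` and `to_infix` are illustrative and are not taken from ECSID.

```python
from collections import deque

# Assumed function set: every symbol not listed here is a terminal (arity 0).
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2}

def decode(gene):
    """Build a nested-list tree [symbol, child, ...] from a gene, breadth-first."""
    symbols = iter(gene)
    root = [next(symbols)]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Attach as many children as the node's arity requires.
        for _ in range(ARITY.get(node[0], 0)):
            child = [next(symbols)]
            node.append(child)
            queue.append(child)
    return root  # symbols left over in the gene are simply not expressed

def to_infix(node):
    """Render the tree as a fully parenthesized infix expression."""
    if len(node) == 1:
        return node[0]
    return "(" + f" {node[0]} ".join(to_infix(c) for c in node[1:]) + ")"

print(to_infix(decode("+*cabxx")))
```

In this sketch the trailing symbols `xx` are never reached, which is why a fixed-length string can encode trees of different sizes.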

4 ECSID

ECSID can use either GP or GEP to determine ODEs. It can evolve both a single higher-order differential equation and a system of higher-order differential equations, and it can evolve linear equations with constant or variable coefficients. In GEP the system is represented by a multi-chromosome individual in which each chromosome represents an equation. In GP each individual holds a list of s-expressions, where each s-expression represents an equation. All the equations evolved by ECSID have the following form:

$$y^{(n)} = f(t, y, y', y'', \dots, y^{(n-1)}) \tag{1}$$

ECSID evolves only the right-hand side of Equation 1. The order of the system is determined by the highest-order element. Figure 2 shows the ODE $\frac{d^2x}{dt^2} = 7\frac{dx}{dt} + 10x + 12$ as represented in ECSID; the figure again shows that only the right-hand side of Equation 1 is evolved. In order to integrate Equation 1, ECSID needs to build a system of the form of Equation 2.

$$\begin{aligned} y_1' &= y_2 \\ y_2' &= y_3 \\ &\;\;\vdots \\ y_n' &= f(t, y_1, y_2, \dots, y_n) \end{aligned} \tag{2}$$

Fig. 2. ODE represented in ECSID. (Expression tree with nodes +, +, *, 7, dx/dt, 12, *, 10, x, encoding the right-hand side 7 dx/dt + 10x + 12.)
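As a worked instance of Equation 2, consider the second-order ODE of Figure 2. Taking $y_1 = x$ and $y_2 = \frac{dx}{dt}$ (so $n = 2$), the single equation becomes two first-order equations:

$$\begin{aligned} y_1' &= y_2 \\ y_2' &= 7y_2 + 10y_1 + 12 \end{aligned}$$

Only the last right-hand side comes from the evolved expression; the first equation follows directly from the change of variables.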

Equation 2 is formed by introducing the variables $y_1 = y$, $y_2 = y'$, $y_3 = y''$, $\dots$, $y_n = y^{(n-1)}$.

Some experiments require evolving linear systems. To evolve a linear model we introduced an operator called “coefficient”. This operator receives an s-expression and a constant, and multiplies the s-expression by the constant. Alternatively, we introduce a method that checks whether the equation is linear; if it is not, the individual's fitness is penalized. This is useful when we do not want to use the “coefficient” operator but still want to evolve only linear equations.

ECSID is able to find the order of the system: the user sets the maximum order and ECSID evolves systems up to that order. ECSID uses the standard evolutionary computation procedure: a random initial population, fitness-proportional selection, and elitism. The fitness function is the sum of the absolute errors, $\sum |e|$. Table 1 shows the parameters used in ECSID.

Table 1. Genetic operators' parameters

| Genetic Operator | Probability |
|---|---|
| Mutation | 0.2 |
| Crossover | 0.8 |
| IS-transposition | 0.1 |
| RIS-transposition | 0.1 |
| Gene-transposition | 0.1 |
| One-point recombination | 0.3 |
| Two-point recombination | 0.3 |
| Gene-recombination | 0.1 |

In order to compare experiments from different domains, we use the correlation coefficient (Equation 3). The correlation coefficient gives a number between −1 and 1 where 1 means that the curves are equal.

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \tag{3}$$
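Equation 3 translates directly into code. The following is a minimal Python sketch, assuming `x` holds the observed samples, `y` the samples produced by a candidate model at the same time points, and `n` is the number of samples; the function name `correlation` is illustrative.

```python
import math

def correlation(x, y):
    """Correlation coefficient of two equally long sample lists (Equation 3)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    # The denominator is zero when either series is constant.
    denom = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return (n * sxy - sx * sy) / denom

print(correlation([1, 2, 3], [2, 4, 6]))
```

Note that r is unchanged when one series is scaled by a positive factor or shifted by a constant, so the two series in the usage line reach r = 1 without being identical.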

The evaluation method used by ECSID is shown below (Evaluation). This method receives an individual to be evaluated and a list t, where each element of t is a time value. CreateTrees builds trees from the individual. Order identifies the system's order; since each equation of the system can be of a different order, it returns a list. Lines 3 and 4 punish non-linear individuals when a linear model is required. Eval evaluates the equations using the system order and an integration method (4th-order Runge-Kutta). Evaluation returns the values of the individual at the times in t.

Evaluation(individual, t)
1  trees ← CreateTrees(individual)
2  n ← Order(trees)
3  if EvolveLinear() and IsLinear(trees) = nil
4     then return ∞
5  return Eval(n, trees, t)
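The integration step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not ECSID's implementation: the evolved right-hand side is supplied as an ordinary Python function `rhs(t, y, y', ...)`, a single equation is handled, and the linearity check of lines 3 and 4 is omitted. The names `rk4`, `first_order_system`, `evaluate` and `fitness` are hypothetical.

```python
import math

def rk4(f, y0, ts):
    """Integrate y' = f(t, y) with classical 4th-order Runge-Kutta on the grid ts."""
    ys = [list(y0)]
    for t0, t1 in zip(ts, ts[1:]):
        h, y = t1 - t0, ys[-1]
        k1 = f(t0, y)
        k2 = f(t0 + h / 2, [a + h / 2 * k for a, k in zip(y, k1)])
        k3 = f(t0 + h / 2, [a + h / 2 * k for a, k in zip(y, k2)])
        k4 = f(t1, [a + h * k for a, k in zip(y, k3)])
        ys.append([a + h / 6 * (p + 2 * q + 2 * r + s)
                   for a, p, q, r, s in zip(y, k1, k2, k3, k4)])
    return ys

def first_order_system(rhs):
    """Rewrite y^(n) = rhs(t, y, y', ..., y^(n-1)) in the form of Equation 2."""
    return lambda t, y: y[1:] + [rhs(t, *y)]

def evaluate(rhs, y0, ts):
    """Return the model output y(t) at every time in ts."""
    return [state[0] for state in rk4(first_order_system(rhs), y0, ts)]

def fitness(model, observed):
    """Sum of absolute errors between model output and observations."""
    return sum(abs(m - o) for m, o in zip(model, observed))

# Usage: the linear pendulum of Equation 4 against its closed-form solution.
ts = [0.01 * i for i in range(101)]
theta = evaluate(lambda t, th, dth: -19.6 * th, [1.0, 0.0], ts)
exact = [math.cos(math.sqrt(19.6) * t) for t in ts]
print(fitness(theta, exact))
```

The step size of 0.01 is an arbitrary choice for the illustration; the paper does not state the grid used by ECSID.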

5 Modeling experiments

This section presents the results obtained using ECSID. Each experiment was run 20 times, and the best individual over all runs was selected. Each run used a population of 500 individuals for up to 500 generations, and terminated once the correlation coefficient reached r ≥ 0.99.

5.1 Example 1

This example identifies a model for a linear pendulum. Equation 4 shows the system to identify, with the initial conditions $f(0) = 1$, $f'(0) = 0$.

$$\frac{d^2\theta}{dt^2} = -19.6\,\theta \tag{4}$$

Equations 5 and 6 show the results using GEP and GP respectively. Figure 3 plots Equations 4, 5 and 6; the models obtained by ECSID are good. ECSID found the structure of the system but could not find the exact parameters. Table 2 shows the correlation coefficients.

$$\frac{d^2\theta}{dt^2} = -19.33501\,\theta \tag{5}$$

$$\frac{d^2\theta}{dt^2} = -20\,\theta \tag{6}$$
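The effect of the parameter mismatch can be read off the closed-form solution. With the initial conditions above, an equation of the form $\frac{d^2\theta}{dt^2} = -k\,\theta$ has the solution

$$\theta(t) = \cos(\omega t), \qquad \omega = \sqrt{k}$$

so the three equations differ only in angular frequency:

$$\omega_{\text{actual}} = \sqrt{19.6} \approx 4.427, \qquad \omega_{\text{GEP}} = \sqrt{19.33501} \approx 4.397, \qquad \omega_{\text{GP}} = \sqrt{20} \approx 4.472$$

The models therefore have the right amplitude and drift slowly out of phase with the true system. Assuming an observation window of about 5 time units, the accumulated phase difference is roughly 0.15 rad for the GEP model and 0.22 rad for the GP model.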

5.2 Example 2

This example is a non-linear pendulum with friction. Equation 7 shows the model; the initial conditions are the same as those of the previous example.

$$\frac{d^2\theta}{dt^2} = -2\frac{d\theta}{dt} - 19.6\sin(\theta) \tag{7}$$

Equations 8 and 9 show the results using GEP and GP respectively. Both methods find a good model; the correlation coefficient is r ≥ 0.99 in both cases (Table 2). Equations 8 and 9 do not have the same structure as the system, but they are good models. Figure 4 shows the behavior exhibited by those models.

$$\frac{d^2\theta}{dt^2} = -2\theta\frac{d\theta}{dt} - 2\frac{d\theta}{dt} - 20\,\theta \tag{8}$$

$$\frac{d^2\theta}{dt^2} = -2.3048492976\frac{d\theta}{dt} - 19.3437292976\,\theta \tag{9}$$

5.3 Example 3

In this example we model a coupled mass-spring system. Equation 10 shows the system, with the initial conditions $f(0) = 1$, $f'(0) = 0$, $g(0) = 2$, $g'(0) = 0$. This system differs from the other examples because it has two second-order equations.

$$\begin{aligned} \frac{d^2x}{dt^2} &= -5x + 2y \\ \frac{d^2y}{dt^2} &= 2x - 2y \end{aligned} \tag{10}$$

Fig. 3. Linear pendulum. (Plot of f(t) against time, up to t = 5, for the actual system and the GEP and GP models.)

Fig. 4. Pendulum with friction. (Plot of f(t) against time, up to t = 7, for the actual system and the GEP and GP models.)

GEP and GP both produced Equation 11, with a correlation coefficient of r = 1, although neither its structure nor its parameters are the same as those of Equation 10. Figure 5 shows the behavior exhibited by those models (each model has two equations).

$$\begin{aligned} \frac{d^2x}{dt^2} &= -x \\ \frac{d^2y}{dt^2} &= -y \end{aligned} \tag{11}$$

Fig. 5. Mass springs coupled. (Plot of f(t) against time, up to t = 25, showing curves 1 and 2 of the actual system and of the GEP-GP model.)

6 Example 4

This example models a linear circuit. Equation 12 presents the system, with the initial conditions $f(0) = 0$, $g(0) = 0$. This system differs from the other examples because it is linear; therefore ECSID needs to find a linear model.

$$\begin{aligned} \frac{dx}{dt} &= -20x + 10y + 100 \\ \frac{dy}{dt} &= 10x - 20y \end{aligned} \tag{12}$$

Equations 13 and 14 show the results using GEP and GP respectively. Both methods find a good model, and both models are linear. GP found the structure of the model but did not find exactly the same coefficients. Figure 6 shows the behavior exhibited by those models.

$$\begin{aligned} \frac{dx}{dt} &= \frac{1424}{25} - 8x \\ \frac{dy}{dt} &= \frac{1399}{50} - 4x \end{aligned} \tag{13}$$

$$\begin{aligned} \frac{dx}{dt} &= -19.09362x + 4y + 109.37688 \\ \frac{dy}{dt} &= 6x - 11y \end{aligned} \tag{14}$$

Table 2 shows the results of the experiments. The correlation coefficient is r ≥ 0.99 in three experiments and r = 1 in one experiment. The last column, I(M, i, z), is the minimum number of individuals that need to be processed in order to obtain a satisfactory model [7]. The symbol N/A in the last column means that we do not have enough information to obtain I(M, i, z): in those experiments only one of the twenty independent runs produced a satisfactory model, whereas in the other experiments at least five of the twenty independent runs did.
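For reference, the measure I(M, i, z) defined by Koza [7] can be sketched in Python as follows. Here M is the population size, P(M, i) is the fraction of runs that have succeeded by generation i, R(z) is the number of independent runs needed to succeed with probability z, and I(M, i, z) = M (i + 1) R(z), minimized over i. The paper does not state the value of z or the rounding convention it used, so z = 0.99 and rounding R(z) up are assumptions, and this sketch is not expected to reproduce the figures in Table 2. The function name and its arguments are hypothetical.

```python
import math

def computational_effort(success_generations, runs, population, max_gen, z=0.99):
    """Minimum over i of I(M, i, z) = M * (i + 1) * R(z).

    success_generations lists, for each successful run, the generation at
    which a satisfactory model first appeared. Returns None if no run succeeded.
    """
    best = None
    for i in range(max_gen + 1):
        # Cumulative probability of success by generation i.
        p = sum(1 for g in success_generations if g <= i) / runs
        if p == 0:
            continue
        # Independent runs needed to succeed with probability z (assumed rounded up).
        r = 1 if p == 1 else math.ceil(math.log(1 - z) / math.log(1 - p))
        effort = population * (i + 1) * r
        if best is None or effort < best:
            best = effort
    return best

print(computational_effort([10, 20], runs=4, population=100, max_gen=30))
```

With a single successful run out of twenty, P(M, i) takes only one non-zero value, so the estimate rests on a single sample, which fits the reason given above for reporting N/A.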

Fig. 6. Linear circuit. (Plot of f(t) against time, up to t = 0.5, showing curves 1 and 2 of the actual system, the GEP model and the GP model.)

Table 2. Results using GP and GEP

| Method | Problem | r | I(M, i, z) |
|---|---|---|---|
| GEP | Linear pendulum (Eq. 5) | r = 0.99635 | N/A |
| GP | Linear pendulum (Eq. 6) | r = 0.99201 | N/A |
| GEP | Pendulum with friction (Eq. 8) | r = 0.99822 | 154783 |
| GP | Pendulum with friction (Eq. 9) | r = 0.99697 | 278609 |
| GEP | Coupled mass-springs (Eq. 11) | r1 = 1, r2 = 1 | 41276 |
| GP | Coupled mass-springs (Eq. 11) | r1 = 1, r2 = 1 | 42761 |
| GEP | Circuit (Eq. 13) | r1 = 0.98623, r2 = 0.99753 | N/A |
| GP | Circuit (Eq. 14) | r1 = 0.99579, r2 = 0.99491 | N/A |

7 Conclusions

We presented the results obtained with ECSID. ECSID found good models for a linear pendulum, a non-linear pendulum with friction, a coupled mass-spring system, and a linear circuit. ECSID has the following advantages compared to related work.

– ECSID can model a higher-order system expressed as a set of higher-order equations (Example 5.3).
– ECSID is able to determine the order of the system: the user sets the maximum order and ECSID evolves systems up to that limit. Cao's system can only evolve equations of the same order.
– ECSID can evolve a linear model.

In our work we have not found it necessary to use genetic algorithms or a search procedure to optimize the system's parameters. GP by itself does a good job of finding a good model.

ECSID has proved to be useful in system identification, but it needs to be improved. The following future work is planned.

– Test ECSID with noisy data and with real experiments.
– Explore the differences between GP and GEP and identify which one is better.
– Improve the usability of ECSID.

ECSID can be downloaded from [4].

References

1. E. Bradley and R. Stolle. Automatic construction of accurate models of physical systems. Technical report, University of Colorado, Department of Computer Science.
2. H. Cao, L. Kang, Y. Chen, and J. Yu. Evolutionary modeling of systems of ordinary differential equations with genetic programming. Genetic Programming and Evolvable Machines, 1(4):309–337, Oct. 2000.
3. C. Ferreira. Gene expression programming: A new adaptive algorithm for solving problems. In Complex Systems, number 2, pages 87–129, 2001.
4. M. Graff and J. J. Flores, January 2005. http://sourceforge.net/projects/ecsid.
5. G. J. Gray, D. J. Murray-Smith, Y. Li, and K. C. Sharman. Nonlinear model structure identification using genetic programming. In J. R. Koza, editor, Late Breaking Papers at the Genetic Programming 1996 Conference, Stanford University, July 28–31, 1996, pages 32–37, Stanford University, CA, USA, 1996. Stanford Bookstore.
6. M. Hinchliffe. Dynamic Modelling Using Genetic Programming. PhD thesis, University of Newcastle upon Tyne, 2001.
7. J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection (Complex Adaptive Systems). The MIT Press, 1992.
8. L. S. Ljung. System Identification: Theory for the User. Prentice Hall, 1987.
9. T. Weinbrenner. Genetic programming techniques applied to measurement data. Diploma Thesis, 1997.
