International Journal of Modern Physics C Vol. 26, No. 7 (2015) 1550076 (12 pages) © World Scientific Publishing Company DOI: 10.1142/S012918311550076X
The study of genetic information flux network properties in genetic algorithms
Int. J. Mod. Phys. C 2015.26. Downloaded from www.worldscientific.com by 37.44.207.140 on 01/27/17. For personal use only.
Zhengping Wu* and Qiong Xu
College of Electrical Engineering and New Energy, China Three Gorges University, Hubei Yichang, 443002, P. R. China
*[email protected]

Gaosheng Ni
College of Foreign Languages, China Three Gorges University, Hubei, Yichang 443002, P. R. China

Gaoming Yu
College of Petroleum Engineering, Yangtze University, Hubei, Wuhan 430100, P. R. China

Received 5 August 2014
Accepted 27 October 2014
Published 28 April 2015

In this paper, we empirically analyze the statistical properties of the information flux network (IFN) of genetic algorithms (GA). The results suggest that the node degree distribution of the IFN is scale-free when there is at least some selection pressure, and that it has two branches when the node degree is small. Increasing the crossover rate, decreasing the mutation rate or decreasing the selection pressure increases the average node degree and thus decreases the scaling exponent. These results help in understanding the combination and distribution of excellent gene segments of the population during GA evolution, and are useful in devising an efficient GA.

Keywords: Genetic algorithms; information flux network; scale-free network; maximum likelihood estimation; scaling exponent.

PACS Nos.: 11.25.Hf, 123.1K.
1. Introduction

Recent developments on the topology of real networks have indicated that the apparent randomness of complex networks hides generic mechanisms and orders that are crucial to understanding the function of network systems [1–4]. In particular, discoveries of the small-world [5] and scale-free [6] complex networks have led to a fascinating set of common problems concerning how the network structure facilitates and constrains the behavior of the network system. Small-world networks are often generated by probabilistically rewiring the edges of a regular network. With a small edge-rewiring probability, the small-world network possesses a short average path length and a high clustering coefficient, which distinguishes it from both a regular network (long average path length and high clustering coefficient) and a random network (short average path length and low clustering coefficient). Scale-free networks are often generated with the preferential attachment method, in which the network is built incrementally with new vertices attached to existing vertices in proportion to their degree. They exhibit a power-law distribution of vertex connectivity, in which the probability that a given vertex has $k$ connections is governed by the relationship $p(k) \propto k^{-\gamma}$, where $\gamma$ is referred to as the scaling exponent.

The genetic algorithm (GA) is a heuristic search algorithm that mimics the process of natural evolution [7–9], and it is also a complex, dynamically evolving network in itself. With the development of complex network theory, researchers have started to study the GA from this perspective. As the population structure of a GA directly influences the combination and distribution of excellent gene segments of the population, many studies have focused on the GA's population structure [10,11]. The population structures studied mainly include random networks [12], small-world networks [13,14] and scale-free networks [15], and a series of relationships has been established between the topological properties of a population structure and the performance (for example, the takeover time) of the algorithm it induces on a population. For a GA, the population structure is the topology of potential interaction, which is different from the actual interaction topology.
Distinguishing between the potential population structure and the actual interaction topology is important, since it is ultimately the actual interaction topology that governs the dynamics of a GA. Another way to analyze the influence of population structure on GA performance is therefore to study the topology of the actual interactions between individuals. In Ref. 16, Oner et al. introduced the mating network to describe the reproductive interactions that occur in a steady-state genetic algorithm (SSGA) population. The mating network shows scale-free characteristics if (i) the fitness is actively being improved (the evolutionary process is still active) and the population has not yet converged, and (ii) there is at least some selection pressure. In Ref. 10, to evaluate an individual's impact on population dynamics, Whitacre et al. developed a technique that encodes a subset of an individual's genealogical history as a directed network. Topological analysis of these networks revealed that power-law degree distributions were a common outcome, indicating that very few individuals in a GA population had a significant effect on the evolutionary dynamics of the population. However, when the population structure was a ring topology, deviations from the power-law degree distribution were observed for large impact sizes in the genealogical networks, providing evidence that such regular population structures reduce the selection pressure and increase the takeover time.
Considering that the mating network proposed in Ref. 16 cannot express the mutation behavior of individuals, Jieyu et al. [17] introduced an information flux network (IFN) to describe the information flow among generations during the evolution process. The IFN also showed scale-free characteristics when the selection operator used a preferential strategy rather than a random one. By changing the other operations used in GAs, they found that the scaling exponent of the power-law node degree distribution of the IFN decreased when the crossover rate increased, but increased when the mutation rate increased.

The IFN, which describes the actual interaction topology between the individuals of a GA, offers a new viewpoint from which to study GAs. Through the study of the IFN we can better understand the characteristics of a GA under different operating conditions. Because the causes of changes in the scaling exponent of the power-law node degree distribution of the IFN are complicated, this paper examines the IFN characteristics of the GA under different operators and gives interpretations that differ from those of previous studies.

2. Methods

2.1. Framework of GA

GA is a highly parallel, random and adaptive probabilistic search method based on the mechanics of natural selection and genetics. It was initially proposed by John Holland in the 1960s and developed by Holland and his students and colleagues at the University of Michigan in the 1960s and the 1970s [8]. In general, the basic mechanism of a standard GA consists of chromosome representation and coding, fitness function design, genetic operator design, and parameter analysis and selection. The three main operators of the GA used in this paper, namely selection, crossover (single point) and mutation, are described below.

(i) Selection: This operator selects chromosomes in the population for reproduction. The fitter the chromosome is, the more likely it is to be selected for reproduction. We use the roulette wheel selection method in the succeeding examples. The idea behind roulette wheel selection is that each individual is given a chance to become a parent in proportion to its fitness. It is called roulette wheel selection because the chance of selecting a parent can be seen as spinning a roulette wheel in which the size of the slot for each parent is proportional to its fitness. Individuals with the largest fitness therefore have the best chance of being chosen, and it is possible for one member to dominate all the others and be selected a disproportionate share of the time.

(ii) Crossover: This operator randomly chooses a locus and exchanges the subsequences before and after that locus between two chromosomes to create two offspring. For example, the strings 10000100 and 11111111 could be crossed over after the third locus in each to produce the two offspring 10011111 and 11100100. The crossover operator roughly mimics biological recombination between two single-chromosome (haploid) organisms. This single-point crossover mode can preserve the good characteristics of the previous generation and reduce the disruptive effects of genetic operators.

(iii) Mutation: This operator randomly flips some of the bits in a chromosome. For example, the string 00000100 might be mutated in its second position to yield 01000100. Mutation can occur at each bit position in a string with some probability, usually very small. Mutation in itself is a kind of local random search, which helps to avoid premature convergence by maintaining the diversity of the population.
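The three operators described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: chromosomes are assumed to be bit strings, fitness values are assumed to be non-negative, and the function names (`roulette_select`, `single_point_crossover`, `mutate`) are hypothetical. The crossover call reproduces the worked example given in (ii).

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative (an assumption of this sketch).
    """
    total = sum(fitness)
    if total <= 0:
        return rng.choice(population)  # no usable fitness signal: pick uniformly
    spin = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def single_point_crossover(parent_a, parent_b, locus):
    """Swap the tails of two equal-length bit strings after position `locus`."""
    child_a = parent_a[:locus] + parent_b[locus:]
    child_b = parent_b[:locus] + parent_a[locus:]
    return child_a, child_b


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = []
    for bit in chromosome:
        if rng.random() < rate:
            flipped.append("1" if bit == "0" else "0")
        else:
            flipped.append(bit)
    return "".join(flipped)


rng = random.Random(0)
children = single_point_crossover("10000100", "11111111", 3)
print(children)  # -> ('10011111', '11100100')
parent = roulette_select(["00000100", "11111111"], [1.0, 8.0], rng)
print(mutate(parent, 0.1, rng))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is convenient when results are averaged over repeated simulation runs.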
In this paper, the framework of the GA is shown in Fig. 1.

Fig. 1. Framework of GA.

2.2. Information flux network

Similar to Ref. 17, we define all the individuals of the population over the whole process of GA evolution as a network with $N$ nodes. Individuals are nodes in the network and each node is assigned a number $i$, $i = 1, 2, \ldots, N$. In this network, an edge from node $i$ to node $j$ means that parent $i$ contributes genetic material toward offspring $j$ through crossover or mutation. The edges are added between the individuals after crossover or mutation. The delivery of genetic information between parents and offspring is studied through standard test functions, and the network produced in this process is the IFN. Detailed explanations of the IFN can be found in Ref. 17.

2.3. The maximum likelihood estimation

We study the statistical properties of the IFN with maximum-likelihood fitting methods, so it is necessary to introduce the maximum likelihood estimator (MLE). If the node degree obeys a power-law distribution, the probability density $p(x)$ of the node degree can be expressed as [18]:
$$p(x) = c\,x^{-\gamma}, \tag{1}$$

where $x$ is the observed value, namely the node degree, $c$ is the normalization constant and $\gamma$ is the scaling exponent. In the IFN, the degrees of some nodes, which can only be generated in the population initialization process, are likely to be 0, while the degrees of all other nodes are at least 1. Nodes with degree 0 have almost no effect on the GA, so we set $x_{\min} = 1$. It is then straightforward to calculate the normalizing constant, and we find that

$$p(x) = \frac{\gamma - 1}{x_{\min}} \left(\frac{x}{x_{\min}}\right)^{-\gamma}. \tag{2}$$

To estimate the scaling exponent $\gamma$, the likelihood function for $n$ sample data can be expressed as

$$p(x; \gamma) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \frac{\gamma - 1}{x_{\min}} \left(\frac{x_i}{x_{\min}}\right)^{-\gamma} \tag{3}$$

and the log-likelihood function is

$$L = \ln \prod_{i=1}^{n} p(x_i) = n \ln(\gamma - 1) - n \ln x_{\min} - \gamma \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}}. \tag{4}$$

Setting $\partial L / \partial \gamma = 0$, that is, $n/(\gamma - 1) - \sum_{i=1}^{n} \ln(x_i/x_{\min}) = 0$, we then find

$$\hat{\gamma} = 1 + n \left[\sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}}\right]^{-1} = 1 + n \left[\sum_{i=1}^{n} \ln x_i\right]^{-1}, \tag{5}$$

where $\hat{\gamma}$ is the MLE value of $\gamma$ and the second equality uses $x_{\min} = 1$. From (5), the larger the average node degree, the smaller $\hat{\gamma}$ is and the more heterogeneous the corresponding network.

2.4. Test function

Six classic functions are used to evaluate the structural characteristics of the IFN in the GA (see Table 1).
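The estimator of Eq. (5) in Sec. 2.3 can be evaluated directly from a list of node degrees. The sketch below is illustrative only: the function name `mle_scaling_exponent` and the two small degree samples are hypothetical, and, as in Eq. (5), the continuous power-law form is applied to integer degrees, which is an approximation. Nodes with degree below $x_{\min}$ (here the degree-0 nodes) are discarded before fitting.

```python
import math


def mle_scaling_exponent(degrees, x_min=1):
    """Eq. (5): gamma_hat = 1 + n / sum(ln(x_i / x_min)), using samples x_i >= x_min."""
    sample = [x for x in degrees if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in sample)
    if log_sum == 0:
        # Happens when the sample is empty or every degree equals x_min.
        raise ValueError("estimate undefined: no degree exceeds x_min")
    return 1.0 + len(sample) / log_sum


# Hypothetical degree samples; the second has every degree doubled.
low_degree = [1, 1, 2, 2, 2, 4]
high_degree = [2, 2, 4, 4, 4, 8]

print(mle_scaling_exponent(low_degree))   # about 2.73
print(mle_scaling_exponent(high_degree))  # about 1.79
```

With $x_{\min}$ fixed, raising the degrees raises the sum of logarithms and lowers $\hat{\gamma}$, which is the relationship between average degree and scaling exponent used in Sec. 3.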
Table 1. Six classic functions.

Function | Formula | Domain | Minimum
Sphere | $f(x, y) = x^2 + y^2$ | $-100 \le x, y \le 100$ | $f(0, 0) = 0$
Rosenbrock | $f(x, y) = 100(y - x^2)^2 + (x - 1)^2$ | $-2 \le x, y \le 2$ | $f(1, 1) = 0$
Rastrigin | $f(x, y) = 20 + x^2 + y^2 - 10(\cos 2\pi x + \cos 2\pi y)$ | $-5 \le x, y \le 5$ | $f(0, 0) = 0$
Griewank | $f(x, y) = (1/4000)(x^2 + y^2) - \cos x \cos(y/\sqrt{2}) + 1$ | $-40 \le x, y \le 40$ | $f(0, 0) = 0$
Schaffer F6 | $f(x, y) = \dfrac{\sin^2 \sqrt{x^2 + y^2} - 0.5}{[1 + 0.01(x^2 + y^2)]^2} + 0.5$ | $-6 \le x, y \le 6$ | $f(0, 0) = 0$
Ackley | $f(x, y) = -20 \exp\left(-0.2 \sqrt{0.5(x^2 + y^2)}\right) - \exp\left(0.5(\cos 2\pi x + \cos 2\pi y)\right) + 20 + e$ | $-6 \le x, y \le 6$ | $f(0, 0) = 0$
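For reference, the six functions of Table 1 can be written out in Python as below. This is a sketch under stated assumptions: the signs and the factors of $\pi$ are taken from the standard definitions of these benchmarks, the coefficient 0.01 in Schaffer F6 follows the table as printed, and the dictionary name `TEST_FUNCTIONS` is hypothetical. Each entry stores the function, the half-width of its square domain, and the location of its minimum.

```python
import math


def sphere(x, y):
    return x**2 + y**2


def rosenbrock(x, y):
    return 100 * (y - x**2) ** 2 + (x - 1) ** 2


def rastrigin(x, y):
    return 20 + x**2 + y**2 - 10 * (math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y))


def griewank(x, y):
    return (x**2 + y**2) / 4000 - math.cos(x) * math.cos(y / math.sqrt(2)) + 1


def schaffer_f6(x, y):
    r2 = x**2 + y**2
    return (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1 + 0.01 * r2) ** 2 + 0.5


def ackley(x, y):
    return (
        -20 * math.exp(-0.2 * math.sqrt(0.5 * (x**2 + y**2)))
        - math.exp(0.5 * (math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y)))
        + 20
        + math.e
    )


# name -> (function, domain half-width, location of the minimum)
TEST_FUNCTIONS = {
    "Sphere": (sphere, 100, (0, 0)),
    "Rosenbrock": (rosenbrock, 2, (1, 1)),
    "Rastrigin": (rastrigin, 5, (0, 0)),
    "Griewank": (griewank, 40, (0, 0)),
    "Schaffer F6": (schaffer_f6, 6, (0, 0)),
    "Ackley": (ackley, 6, (0, 0)),
}

for name, (func, bound, argmin) in TEST_FUNCTIONS.items():
    # Each value should be zero up to floating-point round-off.
    print(name, bound, func(*argmin))
```

All six are minimization problems, so a GA that maximizes fitness would need a transformation from function value to fitness; the source does not state which one was used.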
3. Statistical Characteristics of the IFN under Various Operators

3.1. The branches of IFNs

In this section, we design a GA experiment based on the Schaffer function, with fitness proportional selection, a crossover rate of 0.6 and one elite reservation. As in Ref. 17, we find that if no selection pressure is applied in the GA, the corresponding IFNs are unconnected, whereas if there is at least some selection pressure, the corresponding IFNs are scale-free. In addition, we find that the node degree distributions of the IFNs have two branches when the node degrees are small, see Fig. 2(a). The figure shows that, for small $k$, the values of $n(k)$ at even $k$ are larger than those at odd $k$, which may be explained as follows:

(i) In the crossover process of the GA, an odd-degree node may arise only when just one of the two sons produced by a pair of parents is a new individual. In the general case, both sons produced by the couple through crossover are new individuals, so the node degrees in the IFN are more likely to be even.

(ii) The mutation process may produce odd-degree nodes, but the mutation rate is much lower than the selection rate in a general GA.

From the above analysis, we can conclude that as the mutation rate or the number of iterations increases, the branch phenomenon weakens or disappears (see Figs. 2(b), 2(c) and 2(d), respectively). These characteristics of the IFN show that a high mutation rate and a large number of iterations produce more new individuals and thus expand the search scope of the GA, but they also increase the number of invalid runs.

Fig. 2. (Color online) Distribution of node degree of IFN based on Schaffer function, fitness proportional selection and crossover rate 0.6, with different other running parameters: (a) maximum number of iterations: 100, mutation probability: 0.1/Lind; (b) maximum number of iterations: 100, mutation probability: 1.0/Lind; (c) maximum number of iterations: 500, mutation probability: 0.1/Lind; (d) maximum number of iterations: 500, mutation probability: 1.0/Lind.

3.2. IFNs under different crossover rates and mutation rates

The crossover rate of a GA is one of the main operating parameters that influence its search efficiency. In Ref. 17, Jieyu et al. found that the scaling exponent and the crossover rate are negatively correlated in the IFNs of GA. They attributed this to a high crossover rate leading to more edges shared between nodes and a corresponding decrease in single hubs, so that the network topology of the IFN becomes more heterogeneous as the crossover rate increases.

We design an experiment similar to that in Ref. 17: fitness proportional selection, a mutation rate of 0.7/Lind and the six basic functions as the objective functions. Each data point is obtained by averaging 20 simulation runs. The IFNs are built with the crossover probability varied from 0.1 to 0.8 in increments of 0.1. The result is similar to that of Ref. 17, namely the scaling exponent is negatively related to the crossover rate, see Fig. 3(a). Based on this experiment, we design further experiments to test the relation between the crossover rate and the number of edges, the number of individuals, and the average ratio of the node degree $k_i$ to the minimum node degree $k_m$ (in the IFN, $k_m = 1$); the results are plotted in Figs. 3(b)–3(d), respectively.

Fig. 3. (Color online) The characteristics of IFN under different crossover rates from 0.1 to 0.8. (a) The relationship between crossover rate and scaling exponent; (b) The relationship between crossover rate and number of edges; (c) The relationship between crossover rate and number of individuals; (d) The relationship between crossover rate and $k_i/k_m$.

From Figs. 3(b)–3(d), we can see that as the crossover rate increases, the number of edges and the number of individuals both increase, but the number of edges grows faster than the number of individuals. Consequently, the average value of $k_i/k_m$ increases with the crossover rate. From formula (5) we conclude that the scaling exponent is negatively correlated with the crossover rate in the IFN of GA because the average value of $k_i/k_m$ increases with the crossover rate, which decreases the scaling exponent.

The influence of the mutation rate on the IFN structure is also studied. The parameters and protocols are set as in the previous experiment, except that the mutation rate is varied from 0.6/Lind to 2.0/Lind in increments of 0.2/Lind and the crossover rate equals 0.6. The results are shown in Figs. 4(a)–4(d). From Figs. 4(b) and 4(c), it can be seen that as the mutation rate increases, the number of edges and the number of individuals both increase, but the number of individuals grows faster than the number of edges, so the average value of $k_i/k_m$ decreases as the mutation rate increases, see Fig. 4(d). From formula (5) we conclude that the scaling exponent is positively correlated with the mutation rate, see Fig. 4(a).

Fig. 4. (Color online) The characteristics of IFN under different mutation rates from 0.01 to 0.08. (a) The relationship between mutation rate and scaling exponent; (b) The relationship between mutation rate and number of edges; (c) The relationship between mutation rate and number of individuals; (d) The relationship between mutation rate and $k_i/k_m$.

From these studies, the final results are as follows.

(i) When the crossover rate increases, the number of edges and the number of individuals both increase, but the number of edges grows faster than the number of individuals, and the IFN becomes more inhomogeneous. This suggests that increasing the crossover rate cannot effectively enlarge the search scope of the GA.

(ii) When the mutation rate increases, the number of edges and the number of individuals both increase, but the number of individuals grows faster than the number of edges, and the IFN becomes more homogeneous. This indicates that increasing the mutation rate can enlarge the search scope of the GA.
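The edge and node bookkeeping behind these trends can be illustrated with a small sketch. It is not the authors' code, and it rests on labeled assumptions: the class `FluxNetwork` and its methods are hypothetical, each new offspring of a crossover receives one edge from each parent, each mutant receives one edge from its single parent, and node degree is taken as in-degree plus out-degree.

```python
from collections import Counter


class FluxNetwork:
    """Directed parent -> offspring edge list with running node ids (illustrative)."""

    def __init__(self):
        self.edges = []
        self.node_count = 0

    def new_node(self):
        self.node_count += 1
        return self.node_count

    def record_crossover(self, parent_a, parent_b, n_new_offspring=2):
        """Assumption: every new offspring gets one edge from each parent."""
        children = [self.new_node() for _ in range(n_new_offspring)]
        for child in children:
            self.edges.append((parent_a, child))
            self.edges.append((parent_b, child))
        return children

    def record_mutation(self, parent):
        """Assumption: a mutant gets a single edge from its parent."""
        child = self.new_node()
        self.edges.append((parent, child))
        return child

    def degrees(self):
        """Total degree (in + out) per node; isolated nodes get 0."""
        count = Counter()
        for src, dst in self.edges:
            count[src] += 1
            count[dst] += 1
        return {node: count[node] for node in range(1, self.node_count + 1)}


def mean_degree(net):
    """Average degree; equals the mean of k_i / k_m when k_m = 1."""
    return 2 * len(net.edges) / net.node_count


crossover_only = FluxNetwork()
a, b = crossover_only.new_node(), crossover_only.new_node()
for _ in range(50):
    a, b = crossover_only.record_crossover(a, b)

mutation_only = FluxNetwork()
m = mutation_only.new_node()
for _ in range(100):
    m = mutation_only.record_mutation(m)

print(mean_degree(crossover_only), mean_degree(mutation_only))
```

Under these assumptions a crossover event adds four edges and two nodes, whereas a mutation adds one edge and one node, so crossover pushes the mean degree up and mutation pulls it down. The same bookkeeping gives only even degrees when both offspring of every crossover are new, which matches the even branch described in Sec. 3.1.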
3.3. IFN and selection pressure

In addition to the crossover rate and the mutation rate, the selection pressure also has an impact on the scaling exponent, and we design an experiment to study the relation between them. The selection pressure of a GA is defined by the selection and replacement policies. In this section, we increase the selection pressure by increasing the number of elite reservations. The experiment is designed similarly to the previous ones. With fitness proportional selection and the six basic functions as the objective functions, the IFNs are built with the elite reservation number varied from 1 to 9 in increments of 1, while the mutation rate is 0.7/Lind. Each data point is obtained by averaging 20 simulation runs. The results show that the scaling exponent is positively related to the selection pressure, see Fig. 5(a): as the selection pressure increases, the scaling exponent increases and the IFN becomes more homogeneous, contrary to common expectation. Further experiments show that as the selection pressure increases, the number of edges, the number of individuals and $k_i/k_m$ all decrease, which increases the scaling exponent, see Figs. 5(b), 5(c) and 5(d), respectively. These experiments show that increasing the selection pressure greatly reduces the number of individuals in the IFN and thus narrows the search scope of the GA.

Fig. 5. (Color online) The characteristics of IFN under different elite reservation numbers in the range of 1–9. (a) The relationship between elite reservation number and scaling exponent; (b) The relationship between elite reservation number and number of edges; (c) The relationship between elite reservation number and number of nodes in IFNs; (d) The relationship between elite reservation number and $k_i/k_m$.
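One common way to implement elite reservation is sketched below. The source does not give its replacement policy in detail, so this is an assumption: the `n_elite` fittest individuals are copied unchanged into the next generation and the remaining slots are filled with offspring. The function name `next_generation` and the sample data are hypothetical, and fitness is assumed to be maximized.

```python
def next_generation(population, fitness, offspring, n_elite):
    """Keep the n_elite fittest individuals unchanged; fill the rest with offspring."""
    if not 0 <= n_elite <= len(population):
        raise ValueError("n_elite must lie between 0 and the population size")
    n_new = len(population) - n_elite
    if len(offspring) < n_new:
        raise ValueError("not enough offspring to fill the population")
    ranked = sorted(zip(fitness, population), key=lambda pair: pair[0], reverse=True)
    elites = [individual for _, individual in ranked[:n_elite]]
    return elites + offspring[:n_new]


population = ["0001", "0111", "1111", "0011"]
fitness = [1, 3, 4, 2]
offspring = ["1010", "0101", "1100", "0110"]

print(next_generation(population, fitness, offspring, 1))
# -> ['1111', '1010', '0101', '1100']
print(next_generation(population, fitness, offspring, 3))
# -> ['1111', '0111', '0011', '1010']
```

Under this policy each generation admits only `len(population) - n_elite` new individuals, so a larger elite reservation number adds fewer new nodes to the IFN per generation, which is consistent with the decrease in the number of individuals reported in Fig. 5(c).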
4. Conclusions

To sum up, we have further investigated, by empirical analysis, the IFN in GA, which is based on the theory of biological evolution. First, we confirm that the node degree distributions of IFNs are scale-free when there is at least some selection pressure, and we further find that they have two branches when the node degree is small. Second, as the crossover rate increases, the number of edges and the number of individuals both increase, but the number of edges grows faster than the number of individuals; the average value of $k_i/k_m$ therefore increases with the crossover rate and the scaling exponent decreases accordingly. Third, as the mutation rate increases, the number of edges and the number of individuals both increase, but the number of individuals grows faster than the number of edges, and the IFN becomes more homogeneous. Finally, as the selection pressure increases, the number of nodes of the IFN decreases greatly, which decreases the average value of $k_i/k_m$ and increases the scaling exponent. These studies will be helpful in understanding the combination and distribution of excellent gene segments of the population during evolution, and will be useful in devising an efficient GA.

Acknowledgments

The research reported in this article is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61273183, Major national special project of China under Grant No. 2011ZX05024-002-004 and Subject of hall of Hubei province science and technology key project under Grant No. 2013CFA050.

References

1. G. Chen, W. Xiao and Y. Liu, Physica A 291 (2014).
2. C. Cotta, A. C. Rosa and J. L. J. Laredo, in Evolutionary Computation (CEC), 2013 IEEE Congress, pp. 2450–2456.
3. C. M. Fernandes, J. L. J. Laredo et al., in Eur. Conf. Appl. Evolutionary Comput., 2014.
4. J. H. Moore, J. L. Payne and M. Giacobini, Soft Comput. 17, 1109 (2013).
5. S. H. Strogatz and D. J. Watts, Nature 393, 440 (1998).
6. R. Albert and A. L. Barabasi, Science 286, 509 (1999).
7. K. Deb, A. Pratap, S. Agarwal et al., Evol. Comput. IEEE Trans. 6, 182 (2002).
8. D. Lawrence (ed.), Handbook of Genetic Algorithms, Vol. 115 (Van Nostrand Reinhold: New York, 1991).
9. C. R. Houck, J. Joines and M. G. Kay, NCSU-IE TR 95 (1995).
10. J. M. Whitacre, R. A. Sarker and Q. T. Pham, Memetic Comput. 1, 125 (2009).
11. J. M. Whitacre, arXiv preprint arXiv:0907.0516, 2009.
12. M. Giacobini, M. Tomassini and A. Tettamanzi, in Proc. 2005 Conf. Genetic and Evolutionary Computation, ACM, pp. 1333–1340, 2005.
13. M. Giacobini, M. Preuss and M. Tomassini, Evolutionary Computation in Combinatorial Optimization (Springer Berlin Heidelberg, 2006).
14. P. Bouvry and B. Dorronsoro, in Evolutionary Computation (CEC), 2012 IEEE Congress, IEEE, pp. 1–8, 2012.
15. M. J. Eppstein and J. L. Payne, in Proc. 9th Annual Conf. Genetic and Evolutionary Computation, ACM, pp. 308–315, 2007.
16. M. K. Oner, I. I. Garibay and A. S. Wu, in Proc. 8th Annual Conf. Genetic and Evolutionary Computation, ACM, pp. 1423–1424, 2006.
17. J. Wu, X. Shao, J. Li et al., Physica A 391, 1692–1701 (2012).
18. M. E. J. Newman, A. Clauset and C. R. Shalizi, SIAM Rev. 51, 661 (2009).