Parallel Max-Min Ant System Using MapReduce

Qing Tan1,2, Qing He1, and Zhongzhi Shi1

1 The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190 Beijing, China
2 Graduate University of Chinese Academy of Sciences, 100049 Beijing, China
{tanq,heq,shizz}@ics.ict.ac.cn

Abstract. Ant colony optimization algorithms have been successfully applied to solve many problems. However, in some large-scale optimization problems involving large amounts of data, the optimization process may take hours or even days to reach an excellent solution. Developing parallel optimization algorithms is a common way to tackle this issue. In this paper, we present the MapReduce Max-Min Ant System (MRMMAS), an MMAS implementation based on the MapReduce parallel programming model. We describe MapReduce and show how MMAS can be naturally adapted and expressed in this model, without explicitly addressing any of the details of parallelization. We use benchmark travelling salesman problems to evaluate MRMMAS. The experimental results demonstrate that the proposed algorithm can scale well and outperform the traditional MMAS with similar running times.

Keywords: Ant colony optimization, MMAS, Parallel MMAS, Travelling salesman problem, MapReduce, Hadoop.

1 Introduction

Max-Min Ant System [1] is an optimization algorithm that was inspired by the behavior of real ants. This evolutionary algorithm has become popular and has been found effective for solving NP-hard combinatorial problems such as the travelling salesman problem (TSP). However, as the number of cities grows, the algorithm often takes a very long time to obtain the optimal solution. Efficient parallel Ant Colony Optimization (ACO) [2] algorithms and implementation techniques are the key to meeting the scalability and performance requirements entailed in such cases. So far, several parallel implementations of ACO algorithms have been proposed [3,4]. In the PACS [3] method, the artificial ants are first generated and divided into several groups; Ant Colony System (ACS) is then applied to each group, and communication between groups takes place at fixed cycles. [4] proposed two parallel strategies for the ant system: a synchronous parallel algorithm and a partially asynchronous parallel algorithm. All of the above methods require the programmer to explicitly design and implement the details of parallelization on different processors.

Y. Tan, Y. Shi, and Z. Ji (Eds.): ICSI 2012, Part I, LNCS 7331, pp. 182–189, 2012. © Springer-Verlag Berlin Heidelberg 2012


MapReduce [5] is a programming model, and an associated implementation, for the parallel processing of large datasets. Users specify the computation only in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across a cluster of machines. In this paper, we adapt the MMAS algorithm to the MapReduce framework and present MRMMAS, making the method applicable to large-scale problems. MRMMAS is simple, flexible, and scalable because it is designed in the MapReduce model. Since the TSP is the most typical application of MMAS, we present our MRMMAS method for solving the TSP and conduct comprehensive experiments to evaluate its performance on several TSP benchmark problems.

The rest of the paper is organized as follows. In Section 2, we present preliminary knowledge, including an overview of MapReduce and an introduction to the standard MMAS. Section 3 describes how MMAS can be cast in the MapReduce model and presents the map and reduce functions of MRMMAS in detail. Experimental results in Section 4 demonstrate that the proposed algorithm scales well across a computer cluster. Finally, we offer our conclusions in Section 5.

2 Preliminary Knowledge

2.1 MapReduce Overview

MapReduce, whose framework is shown in Fig. 1, is a simplified programming model that is well suited to parallel computation [6]. Under this model, programs are automatically distributed to a cluster of machines. In MapReduce, all data are organized in the form of keys with associated values. For example, in a program that counts the frequency of occurrences of different words, the key could be a word and its value the frequency of that word. As the name indicates, map and reduce are the two basic stages of the model. In the first stage, the map function is called once for each input record. At each call, it may produce intermediate output records in the form of key-value pairs. In the second stage, these intermediate outputs are grouped by key, and the reduce function is called once for each key. Finally, the reduce function outputs the reduced results.

More specifically, the map function takes a single key-value pair and outputs a list of new key-value pairs. For each call, it may produce any number of intermediate key-value pairs. It can be formalized as:

Map: (Key1, Value1) → list((Key2, Value2))

In the second stage, these intermediate pairs are sorted and grouped by key, and the reduce function is called once for each key. The reduce function reads a key and a corresponding list of values and outputs a new list of values for that key. Mathematically, this is written:

Reduce: (Key2, list(Value2)) → Value3

The MapReduce model provides parallelization at a sufficiently high level. Since the map function takes only a single record, all map operations are independent of each other and fully parallelizable. The reduce function can also be executed in parallel on each set of intermediate pairs sharing the same key.
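The word-count example above can be sketched in a few lines of Python. The single-process driver below merely simulates the map, shuffle, and reduce stages to illustrate the signatures; `run_mapreduce` and the record format are our own illustrative choices, not part of Hadoop's API:

```python
from itertools import groupby
from operator import itemgetter

def map_fn(key, value):
    """Map: (document id, text) -> list of (word, 1) pairs."""
    return [(word, 1) for word in value.split()]

def reduce_fn(key, values):
    """Reduce: (word, [1, 1, ...]) -> total count for that word."""
    return sum(values)

def run_mapreduce(records):
    # Map stage: every record is processed independently.
    intermediate = [pair for k, v in records for pair in map_fn(k, v)]
    # Shuffle stage: sort and group the intermediate pairs by key.
    intermediate.sort(key=itemgetter(0))
    grouped = groupby(intermediate, key=itemgetter(0))
    # Reduce stage: one reduce call per distinct key.
    return {k: reduce_fn(k, [v for _, v in pairs]) for k, pairs in grouped}

counts = run_mapreduce([(1, "ant colony ant"), (2, "colony")])
# counts == {"ant": 2, "colony": 2}
```

Because each `map_fn` call sees only its own record, a real runtime can schedule the map calls on different machines; the sort-and-group step is exactly what Hadoop's shuffle phase performs between the two stages.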


Fig. 1. Overview of the MapReduce execution framework

2.2 Max-Min Ant System (MMAS)

The Ant Colony Optimization (ACO) metaheuristic is a population-based approach inspired by the behavior of ant colonies in the real world. In ACO, solutions of the problem are constructed within a stochastic iterative process by adding solution components to partial solutions. This process, together with the pheromone updating rule in ACO, makes the algorithm efficient at solving combinatorial optimization problems. Initially, each ant is randomly positioned on a starting node. Then each ant applies a state transition rule to incrementally build a solution. Finally, once all the ants have built a complete solution, all of the solutions are evaluated and the pheromone updating rule is applied. The framework of the ACO algorithm can be represented as follows:

Procedure: ACO algorithm for static combinatorial problems
1. Initialize parameters and pheromone trails;
Loop /* at this level each loop is called an iteration */
  2. Put each ant in a random starting node;
  Loop /* construct solutions */
    3. Each ant applies a state transition rule to choose a next city to visit;
  Until all ants have built a complete solution
  4. Pheromone updating rule is applied;
Until end condition is satisfied, usually a given iteration number is reached



Max-Min Ant System [1] is one of the best implementations of the ACO algorithm. It combines an improved exploitation of the best solutions with an effective mechanism for avoiding early search stagnation. It differs from Ant System (AS) mainly in the following three aspects: (1) only a single ant adds pheromone after each iteration; (2) the range of possible pheromone trails on each solution component is limited to an interval [τ_min, τ_max]; (3) the initial pheromone trails are set to τ_max.
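The three MMAS features can be made concrete in a short sketch. The function below applies one MMAS pheromone update: global evaporation, a deposit along the single best tour only, and clamping into [τ_min, τ_max]. The parameter values and the 1/L deposit rule are common illustrative choices, not values prescribed by this paper:

```python
def mmas_pheromone_update(tau, best_tour, best_length, rho=0.02,
                          tau_min=0.01, tau_max=5.0):
    """One MMAS pheromone update on the trail matrix tau (feature 1:
    only the best ant deposits; feature 2: trails are clamped into
    [tau_min, tau_max]). Parameter values here are illustrative."""
    n = len(tau)
    deposit = 1.0 / best_length  # a common choice for delta-tau
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)  # evaporation on every edge
    # Deposit only along the edges of the single best (closed) tour.
    for i, j in zip(best_tour, best_tour[1:] + best_tour[:1]):
        tau[i][j] += deposit
    # Feature (2): clamp every trail into [tau_min, tau_max].
    for i in range(n):
        for j in range(n):
            tau[i][j] = min(tau_max, max(tau_min, tau[i][j]))
    return tau
```

Feature (3) corresponds to initializing `tau` with `tau_max` everywhere before the first iteration, which encourages broad exploration early in the search.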

3 MapReduce Based Parallel Max-Min Ant System

In this section, we present the main design of the MapReduce Max-Min Ant System (MRMMAS). First, we point out how MMAS can be naturally adapted to the MapReduce programming model and present the general idea of MRMMAS. Then we explain in detail how the computations can be formalized as map and reduce operations.

3.1 The Analysis of MMAS from Serial to Parallel

The whole procedure of MMAS is an iterative process. In each iteration, the ant colony constructs feasible solutions through two rules: the state transition rule and the pheromone updating rule. In MMAS, the pheromone updating rule is applied only after all ants have built a complete solution; in other words, the pheromone level remains constant during solution construction. The most computationally intensive part of MMAS is solution construction: in each iteration, every ant requires many computations to decide which city to visit next from its current city. Fortunately, the pheromone updating rule in MMAS requires no communication among the ants within the same iteration; information is delivered to the ants of subsequent iterations only through the pheromone update. It follows that constructing a solution for one ant is independent of the construction for any other ant in the same iteration, so the solution construction process can be executed in parallel. After this phase, all the constructed solutions are collected and the pheromone updating rule is carried out. The updated pheromone levels are then sent to each ant and used in the following iteration.

3.2 MMAS Based on MapReduce

In an iteration of MMAS, each ant in the swarm is placed at a starting node, chooses a next city to visit step by step, and evaluates its solution. All of these actions are completed independently of the rest of the swarm. Following the analysis above, the MRMMAS algorithm needs only one kind of MapReduce job: the map function performs the solution construction procedure for one ant, so the map stage realizes solution construction for all the ants in parallel, and the reduce function then performs the pheromone update. In each iteration, one such job is carried out to implement the whole process of MMAS. The procedure of MRMMAS is shown in the following.

Procedure: MapReduce MMAS for static combinatorial problems

1. Initialize parameters and pheromone trails;
Loop /* for each iteration, a MapReduce job is carried out */
  2. /* Map stage */
  3. /* a map function realizes the behavior of an ant */
  4. The ant is randomly put in a starting node;
  5. The ant applies a state transition rule to choose a next city to visit
     until a complete solution has been built; /* solution construction */
  6. Calculate the fitness of the solution; /* solution evaluation */
  7. /* Reduce stage */
     Pheromone updating rule is realized by a reduce function;
Until end condition is satisfied, usually a given iteration number is reached

Map Function: First, the pheromone values, the heuristic information, and all of the parameters used in the state transition rule are passed to the map function from the main function of the MapReduce job. The MRMMAS map function, shown as Function 1, is called once for each ant in the population. The input dataset is stored on HDFS as a sequence file of <key, value> pairs, each of which represents a record in the dataset. The number of records is set to the size of the ant population, so the map function is carried out m times, where m is the size of the ant swarm. The dataset is split and globally broadcast to all mappers. Consequently, the solution construction process for the ants is executed in parallel: in each map task, one ant constructs one solution according to the state transition rule. The solution is then evaluated and emitted as an output <key', value'> pair.

Function 1: MRMMAS Map
def mapper(key, value):
    /* get η[n][n], τ[n][n], α, β from the MapReduce job */

    /* initialize tabuList */
    for i = 1 to n do
        tabuList[i] = false;
    /* randomly put the ant in a starting node */
    currentPosition = randomInit(n);
    solution[1] = currentPosition;
    tabuList[currentPosition] = true;
    /* construct the solution through the state transition rule */
    for i = 2 to n do
        /* calculate the visiting probability of each unvisited city */
        for j = 1 to n do
            if (tabuList[j] == false)
                p[j] = pow(τ[currentPosition][j], α) * pow(η[currentPosition][j], β);
            else
                p[j] = 0;
        /* choose the next city by roulette-wheel selection over p */
        currentPosition = select(p);
        solution[i] = currentPosition;
        tabuList[currentPosition] = true;
    /* evaluate the solution and emit it as a <key', value'> pair */
    fitness = evaluate(solution);
    output <fitness, solution> pair;

Reduce Function: The reduce function collects the solutions constructed in the map stage, selects the iteration-best solution, and applies the pheromone updating rule, as shown in Function 2.

Function 2: MRMMAS Reduce
def reducer(key, values):
    /* find the iteration-best solution */
    for each (fitness, solution) in values do
        if (fitness > iBest)
            iBest = fitness;
            iBestSolution = solution;
    /* update the global best solution */
    if (iBest > gBest)
        gBest = iBest;
        gBestSolution = iBestSolution;
    /* pheromone updating */
    τ = (1 − ρ) * τ;
    for all edges (i, j) in iBestSolution
        τ_ij = τ_ij + Δτ_ij^best;
    /* range pheromone into [τ_min, τ_max] */
    for all edges (i, j)
        if (τ_ij > τ_max) { τ_ij = τ_max; }
        if (τ_ij < τ_min) { τ_ij = τ_min; }
    /* output the results in a <key', value'> pair */
    Take gBest+gBestSolution as key';
    Take τ[n][n] as value';
    output <key', value'> pair;
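Function 1's outline translates directly into executable Python. The sketch below builds one complete tour with a tabu list and the standard τ^α·η^β roulette-wheel transition rule; the helper names and the choice to return the raw tour (rather than an emitted pair) are our own simplifications of the pseudocode above:

```python
import random

def mrmmas_mapper(key, value, tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Python rendering of Function 1: build one complete tour using a
    tabu list and the tau^alpha * eta^beta state transition rule.
    (A real mapper would also evaluate the tour and emit the pair.)"""
    n = len(tau)
    tabu = [False] * n
    current = rng.randrange(n)          # randomly chosen starting node
    solution = [current]
    tabu[current] = True
    for _ in range(n - 1):
        cand = [j for j in range(n) if not tabu[j]]
        w = [tau[current][j] ** alpha * eta[current][j] ** beta for j in cand]
        # Roulette-wheel selection over the unvisited cities.
        r = rng.random() * sum(w)
        acc = 0.0
        nxt = cand[-1]                  # numerical safety net
        for j, wj in zip(cand, w):
            acc += wj
            if acc >= r:
                nxt = j
                break
        solution.append(nxt)
        tabu[nxt] = True
        current = nxt
    return solution
```

Because each call touches only read-only `tau` and `eta` plus its own tabu list, m such calls can run on m mappers with no coordination, which is exactly the property Section 3.1 relies on.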