Performance Based Symmetric Load Balancing Algorithm

Author: Belinda Nash
International Journal of Research in Computer and Communication Technology, Vol 2, Issue 11, November- 2013

ISSN (Online) 2278- 5841 ISSN (Print) 2320- 5156

Remella Suvarna¹, N.V.B. Gangadhara Rao², Dr. S. MaruthuPerumal³

¹ II M.Tech Student, Department of CSE, Godavari Institute of Engineering & Technology, Rajahmundry, A.P., India. [email protected]

² Associate Professor & Head, Department of IT, Godavari Institute of Engineering & Technology, Rajahmundry, A.P., India. [email protected]

³ Professor & Head, Department of CSE, Godavari Institute of Engineering & Technology, Rajahmundry, A.P., India. [email protected]

Abstract: Existing decentralized load balancing algorithms for distributed hash tables (DHTs) that are based on virtual servers require the participating peers to be asymmetric, thereby introducing another load imbalance problem, and they promise no precise performance metrics. In this paper, a novel symmetric load balancing algorithm for DHTs is introduced in which the peers approximate the system state with histograms. Unlike other algorithms, our proposal offers analytical performance guarantees in terms of the load balance factor and the convergence rate. Through rigorous simulations, we show that our proposal performs better in terms of the load balance factor at a comparable cost.

Keywords: Distributed hash tables, load balance, virtual server, symmetric.

I. INTRODUCTION

Distributed hash tables (DHTs) are key building blocks in the design and implementation of successful distributed applications. Examples of DHTs are Chord [1] and Pastry [2]. Applications and infrastructures built on DHTs include storage clouds [3], file-sharing networks [4], and distributed file systems [5], among many others. In a typical DHT, the participating peers (or nodes) cooperatively manage a global hash table. Essentially, DHTs provide the GET operation to retrieve a published object whose key is x, as well as the PUT operation to store the value v of an object with hash key y. As peers participating in a DHT are often heterogeneous, the work in [1] introduces the notion of virtual servers to cope with peer heterogeneity. Denote S as the hash space offered by the DHT. Let N be the set of participating peers in the DHT and V be the set of virtual servers hosted by the peers in N. (Typically, $|N| < |V|$, where $|\cdot|$ denotes the cardinality of a set.)

Designing a load-balanced, heterogeneity-aware DHT with virtual servers is technically challenging. In particular, load balancing algorithms designed for DHTs based on virtual servers need to take the following into consideration:

• Load balance and movement cost. By load balance, we mean that each peer manages a load proportional to its capacity. Previous studies (e.g., [6]) suggest migrating virtual servers among the participating peers in order to balance peer load. However, this comes at the expense of movement cost due to the migration of virtual servers. How to balance peer load while reducing movement cost as much as possible is thus a critical issue.

• System dynamics. Load balancing algorithms need to bear the system dynamics in mind because nodes may dynamically join and leave DHTs. In addition, the load of a virtual server may change from time to time, aggravating the load imbalance problem in the DHT.

• Algorithmic robustness and workload. Load balancing algorithms need to be robust, without introducing a performance bottleneck or a single point of failure. In addition, as load balancing algorithms incur algorithmic workloads, such workloads shall not induce another load imbalance problem. Moreover, a well-designed load balancing algorithm will not generate considerable overheads.

• Performance guarantee. Load balancing algorithms shall work well, with a performance guarantee, given any system instance. Specifically, DHT networks may operate in dynamic and large-scale environments, thus presenting a large number of problem instances for performance investigation.

To tackle the load imbalance problem in DHTs, prior proposals [7], [8] have presented centralized algorithms that rely on a few rendezvous nodes to balance the loads of peers in a DHT. However, in large-scale and dynamic DHT networks, the centralized algorithms may introduce a performance bottleneck and a single point of failure. In contrast to the centralized approach, some studies, e.g., [9], [10], suggest organizing rendezvous nodes in a hierarchical manner. Virtual servers are first matched with peers through the rendezvous nodes in the lower layer of the hierarchy; for unpaired virtual servers, the rendezvous peers relay them to the rendezvous nodes in the upper layer to seek reallocation. This process repeats iteratively until an unpaired virtual server reaches a rendezvous node in the highest layer. For example, in [9], the rendezvous peers self-build and self-maintain an auxiliary tree-shaped network; in [10], the rendezvous nodes are organized in a two-tier fashion. Consequently, the rendezvous nodes may experience skewed workloads, introducing another load imbalance problem [11]. They may also become a performance bottleneck and a single point of failure. Moreover, the hierarchical networks (e.g., the tree-shaped network in [9]) that support the load balancing algorithms are prone to node and communication failures, thus demanding sophisticated maintenance.

State-of-the-art fully distributed load balancing algorithms that do not depend on any rendezvous nodes can be found in [11]. In [11], each peer i samples a small number of nodes to estimate the system state (e.g., the ideal load a peer shall manage) so that i can identify whether it is light or heavy. If i is light, then i seeks to discover heavy nodes and to reallocate loads from them. Although the solution in [11] is assessed through simulations and works well for some empirical system settings, it offers no performance guarantee for arbitrary system instances.

In this paper, we present a fully decentralized load balancing algorithm for DHTs. Our proposal, which is inspired by the study in [11], is essentially different from the previous rendezvous-based solutions [7], [8], [9], [10] in that each peer estimates and represents the system state with histograms. Based on the histograms, each peer reallocates its load independently, without introducing an asymmetric rendezvous for collecting the system state or for matching virtual servers with participating peers. In [11], by contrast, light nodes randomly sample heavy nodes and may thus contend for reallocating loads from the same heavy nodes. On the one hand, the nodes may not balance their loads well due to contention; on the other, the contention may introduce considerable, redundant algorithmic overheads. In contrast to [11], the participating nodes in our proposal approximate and represent the system state with the novel histograms suggested in this paper. The nodes publish their loads and query the loads of interest through a global index, which is intended to prevent contention during load reallocation. In particular, unlike the existing distributed algorithms in [7], [9], [10], [11], our proposal, with its histograms and global index, can be analyzed rigorously and exhibits performance guarantees for any system instance.

Precisely, we analytically show in this paper the following:

• With the message overheads introduced by each participating peer i, the resultant load balance factor of i deviates from the optimum in a constant ratio with a probability no less than.

• Unlike existing solutions in [7], [8], [9], [10], peers participating in our system are symmetric; the protocol overhead of our proposal is uniformly distributed over the participating peers without introducing another load imbalance problem.

• The expected load balance factor of each participating peer in our proposal converges toward the ideal exponentially fast, thus coping with the situation in which peers dynamically join and leave and the loads of virtual servers change from time to time.

II. PROPOSED LOAD BALANCE ALGORITHM

A. Notation and Problem Definition

Let N be the set of participating peers in a DHT and V be the set of virtual servers deployed over N. Each peer $i \in N$ has a maximum capacity of $C_i^{\max}$ and hosts a set of virtual servers $V_i \subseteq V$. Here, $V_i \cap V_j = \emptyset$ for any $i \neq j \in N$. Each virtual server $v \in V_i$ has a load denoted by $L_v$ (where $L_v \geq 0$). Our objective in this study is to develop a load balancing algorithm A to reallocate and balance the loads among the participating peers, such that any peer i manages a total load of virtual servers proportional to its $C_i^{\max}$. That is, A computes a subset $V_i \subseteq V$ for each peer i, such that the following equation is minimized:
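As a hedged illustration, one objective that is consistent with the description above is the deviation of a peer's load per unit capacity from the system-wide load per unit capacity. The symbol $\mu$ and the absolute-deviation form are assumptions introduced here for illustration; they are not taken from the paper's equation (1).

```latex
% Assumed form of the objective; \mu is a hypothetical symbol.
\mu = \frac{\sum_{v \in V} L_v}{\sum_{j \in N} C_j^{\max}},
\qquad
\min_{V_i \subseteq V} \;
\left| \frac{\sum_{v \in V_i} L_v}{C_i^{\max}} - \mu \right|
\quad \text{for each } i \in N
```

Under this reading, a peer is perfectly balanced when the total load of its virtual servers equals $\mu \, C_i^{\max}$, that is, when its load is proportional to its capacity.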

To ease our discussion, we define the following terminologies and notations:

Definition 1: The load per unit capacity that a peer hosts in a load-balanced DHT is defined as

Definition 2: The ideal load that peer $i \in N$ manages in a load-balanced DHT is

Definition 3: The remaining capacity of peer $i$ is

Our load balancing algorithm A intends to balance the loads of participating peers by minimizing (1). It also aims to reduce the movement cost in (5) as much as possible. We note in this paper the following:

• In a typical DHT, participating nodes can join and leave arbitrarily. Thus, the reallocation of a virtual server from a source peer to a destination peer can simply be done by simulating the leave and join operations offered by a typical DHT.

• The load of any virtual server v at a particular time is the sum of the loads of the objects hosted by v at that time; the load of a peer i is the aggregate of the loads of the virtual servers maintained by i. The potential metrics for measuring load include CPU utilization, storage space, etc.

• Similar to prior studies (e.g., [7], [9], [10], [11]), we assume that there is only one bottleneck resource in the system, leaving multiple-resource load balancing to future work.

B. Overview

In this paper, we present our proposed load balancing algorithm based on the Chord DHT [1]. However, our proposal can also be realized in most DHTs, such as Pastry [2] and the $O(1)$-hop DHTs in [3], [14]. We assume in the following discussion that each virtual server in Chord has a unique ID selected uniformly at random from the key space [0, 1].
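Definitions 1 to 3 can be made concrete with a small sketch. It assumes (these formulas are a reading of the definitions, not recovered from the paper) that the load per unit capacity is the total load divided by the total capacity, that a peer's ideal load is that ratio times its capacity, and that the remaining capacity is the ideal load minus the current load. A peer with positive remaining capacity is treated as light, otherwise as heavy. The `Peer` class and all function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Peer:
    name: str
    capacity: float                                        # maximum capacity of the peer
    vs_loads: List[float] = field(default_factory=list)    # loads of hosted virtual servers

    @property
    def load(self) -> float:
        """Aggregate load of the virtual servers hosted by this peer."""
        return sum(self.vs_loads)


def load_per_unit_capacity(peers: List[Peer]) -> float:
    """Assumed Definition 1: total load divided by total capacity."""
    return sum(p.load for p in peers) / sum(p.capacity for p in peers)


def ideal_load(peer: Peer, mu: float) -> float:
    """Assumed Definition 2: load proportional to the peer's capacity."""
    return mu * peer.capacity


def remaining_capacity(peer: Peer, mu: float) -> float:
    """Assumed Definition 3: ideal load minus the load currently hosted."""
    return ideal_load(peer, mu) - peer.load


def classify(peers: List[Peer]) -> Dict[str, str]:
    """Label each peer 'light' (room to take load) or 'heavy' (should shed load)."""
    mu = load_per_unit_capacity(peers)
    return {p.name: "light" if remaining_capacity(p, mu) > 0 else "heavy" for p in peers}


# Toy system: three peers with a total load of 200 and a total capacity of 500.
peers = [
    Peer("a", 100.0, [30.0, 50.0]),
    Peer("b", 300.0, [40.0]),
    Peer("c", 100.0, [80.0]),
]
mu = load_per_unit_capacity(peers)
labels = classify(peers)
print(mu, labels)
```

In the toy system, the large peer `b` is underloaded relative to its capacity, while `a` and `c` carry more than their proportional share.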


In our proposal, as each heavy peer selects its virtual servers with small sizes to migrate, the resultant movement cost is small. Thus, analyzing the load balance factor for each peer suffices. The load balance factor of peer $i$ (denoted by $LBF_i$) is defined as follows:

where A represents our load balancing algorithm. Consequently, due to (1), we have

Fig. 1: Concept of our problem.

C. Performance Analysis

Compared with previous efforts [7], [8], [9], [10], [11], our proposal is unique in that it offers rigorous performance guarantees. As our proposal selects virtual servers with small sizes to reduce the migration cost as much as possible, we offer analytical results for the load balance factor and the algorithmic convergence of our algorithm.
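The selection of small virtual servers by a heavy peer, and its effect on that peer's load balance factor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the load balance factor is assumed to be a peer's total load divided by its capacity, the heavy peer is assumed to shed its smallest virtual servers until its load no longer exceeds its ideal load, and the movement cost is assumed to be the total load migrated. The function names and the parameter `mu` (the system-wide load per unit capacity) are hypothetical.

```python
from typing import List, Tuple


def load_balance_factor(vs_loads: List[float], capacity: float) -> float:
    """Assumed definition: total hosted load divided by the peer's capacity."""
    return sum(vs_loads) / capacity


def select_for_migration(
    vs_loads: List[float], capacity: float, mu: float
) -> Tuple[List[float], List[float]]:
    """Pick the smallest virtual servers until the remaining load fits the ideal load.

    Returns (migrate, keep), both sorted in ascending order of load.
    """
    target = mu * capacity          # ideal load of this peer (assumption)
    keep = sorted(vs_loads)         # smallest virtual servers first
    migrate: List[float] = []
    while keep and sum(keep) > target:
        migrate.append(keep.pop(0))
    return migrate, keep


# A heavy peer with capacity 100 in a system whose load per unit capacity is 0.5.
loads = [5.0, 40.0, 10.0, 25.0]
migrate, keep = select_for_migration(loads, capacity=100.0, mu=0.5)
movement_cost = sum(migrate)        # assumed cost: total load moved
lbf_before = load_balance_factor(loads, 100.0)
lbf_after = load_balance_factor(keep, 100.0)
print(migrate, keep, movement_cost, lbf_before, lbf_after)
```

Shedding the smallest virtual servers first keeps each individual migration cheap, although it may require moving several of them before the peer's load drops to its ideal value.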

III. THEORETICAL PERFORMANCE ANALYSIS

Fig. 2. An example with 15 elements in the population, of which 10 are randomly sampled (here, $R = 0.2$).

A. Load Balance Factor


IV. SIMULATIONS

A. Experimental Setting

We develop a simulator to assess the performance of our proposal. Our simulator implements the Chord DHT protocol [1]. In addition to our proposal presented in this paper, we include in our simulator the many-to-many approach with a single directory [7], the tree-based approach [9], the two-tier solution [10], and Hsiao et al.'s algorithm [11] for comparison. While implementing the details of the Chord routing infrastructure for the directory-based solution, the tree-based approach, the algorithm by Hsiao et al., and our proposal, we also simulate the Cycloid [32] substrate on which the two-tier solution is based.

Similar to prior studies (e.g., [9], [10]), the capacity of a simulated peer in our simulations follows a power-law distribution, i.e., the Pareto distribution (with a shape parameter of 1.2 and a scale parameter of 120), indicating that a small number of capable peers are present in the system. The load value of a simulated virtual server also follows the Pareto distribution (with a shape parameter of 1.2 and a scale parameter of 10), implying that a few virtual servers in the system serve popular (or large) data objects. Moreover, each object stored in a virtual server has a size following the Pareto distribution [33].

The major performance metrics measured include the following:

• Load balance factor. The load balance factor for node $i \in N$ is defined as

Given the Pareto distribution of the capacities of the peers and the loads of the virtual servers, the ideal load balance factor is nearly equal to 0.8.

• Algorithmic message overhead. The algorithmic message overhead denotes the number of messages required to run a load balancing algorithm.

B. Comparative Studies

In the centralized directory approach (Dir), the tree-based solution (Tree), the two-tier-based solution (2-tier), Hsiao's algorithm (Hsiao), and our proposal (Ours),

B. Simulation Results

Convergence:

We presented in Theorem 4 the upper bound of the convergence rate for the expected load balance factor. In this section, we demonstrate the load balance factor against algorithmic rounds for the 1-, 5-, 10-, 90-, 95-, and 99-percentile peers. Fig. 11 depicts the simulation results, where the algorithmic rounds investigated for our proposal are 1, 2, 3, and 10. Here, round 0 denotes the initial situation in which no peer performs our proposal, and the y-axis is in a logarithmic scale. The experimental results clearly demonstrate that the light (e.g., 1-, 5-, and 10-percentile) and heavy (90-, 95-, and 99-percentile) peers in our proposal improve their load balance factors toward the ideal value as the algorithmic rounds proceed. In particular, the load balance factor improves rapidly in an exponential manner.

Effect of Numbers of Samples:

As discussed in Section 4.2, a few samples of the capacities of peers and the loads of virtual servers are sufficient to estimate the system state effectively, i.e., the probability distribution of the capacities of peers, $\Pr(Y)$, and the probability distribution of the loads of virtual servers, $\Pr(X)$. Note that in our implementation, each peer issues a random walker to sample the capacities of peers and the loads of virtual servers simultaneously. That is, when a random walker visits a peer, the walker not only collects the capacity value of the visited peer but also gathers the load values of the virtual servers that the visited peer selects to migrate.

Effect of System Dynamics:

We finally study the impact of system dynamics on our proposal. In the experiment, we first stabilize the system for 20 minutes, during which no peer joins or leaves the system. Then, peers start to join and leave the system, and the expected lifetime of each participating peer is 100 minutes. We investigate the system operating for 3,600 minutes. Given the probability distribution of the capacities of peers, the probability distribution of the loads of virtual servers is

CONCLUSIONS

With the notion of virtual servers, peers participating in a DHT may host different numbers of virtual servers. By migrating virtual servers, peers can balance their loads proportionally to their capacities. Unlike previous rendezvous-based schemes, we have presented in this paper a novel load balancing algorithm for the reallocation of virtual servers in DHTs. Our load balancing algorithm operates in a fully decentralized manner by having each participating peer estimate the probability distribution of the loads of virtual servers selected for migration and the probability distribution of the remaining capacities of underloaded peers. To the best of our knowledge, our proposal is the first among the competitive algorithms [7], [9], [10], [11] to exhibit provable performance metrics, including the load balance factor and the algorithmic convergence rate. Compared with the previous algorithms in simulations, our algorithm performs very well in terms of the load balance factor.

ACKNOWLEDGMENT

The authors are grateful to the anonymous reviewers who have provided us with valuable comments to improve our study. This work was partially supported by Godavari Institute of Engineering and Technology, A.P., India.

REFERENCES

[1] I. Stoica, R. Morris, D. Liben-Nowell, D.R. Karger, M.F. Kaashoek, F. Dabek, and H. Balakrishnan, “Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications,” IEEE/ACM Trans. Networking, vol. 11, no. 1, pp. 17-21, Feb. 2003.

[2] A. Rowstron and P. Druschel, “Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems,” Proc. IFIP/ACM Int’l Conf. Distributed Systems Platforms, pp. 161-172, Nov. 2001.

[3] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, “Dynamo: Amazon’s Highly Available Key-Value Store,” Proc. 21st ACM Symp. Operating Systems Principles (SOSP ’07), pp. 205-220, Oct. 2007.

[4] BitTorrent, http://www.bittorrent.org/index.html, 2012.

[5] J. Stribling, E. Sit, M.F. Kaashoek, J. Li, and R. Morris, “Don’t Give Up on Distributed File Systems,” Proc. Sixth Int’l Workshop Peer-to-Peer Systems (IPTPS ’07), Feb. 2007.

[6] A. Rao, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica, “Load Balancing in Structured P2P Systems,” Proc. Second Int’l Workshop Peer-to-Peer Systems (IPTPS ’02), pp. 68-79, Feb. 2003.

[7] S. Surana, B. Godfrey, K. Lakshminarayanan, R. Karp, and I. Stoica, “Load Balancing in Dynamic Structured P2P Systems,” Performance Evaluation, vol. 63, no. 6, pp. 217-240, Mar. 2006.

[8] C. Chen and K.-C. Tsai, “The Server Reassignment Problem for Load Balancing in Structured P2P Systems,” IEEE Trans. Parallel Distributed Systems, vol. 12, no. 2, pp. 234-246, Feb. 2008.

[9] Y. Zhu and Y. Hu, “Efficient, Proximity-Aware Load Balancing for DHT-Based P2P Systems,” IEEE Trans. Parallel Distributed Systems, vol. 16, no. 4, pp. 349-361, Apr. 2005.

[10] H. Shen and C.-Z. Xu, “Locality-Aware and Churn-Resilient Load Balancing Algorithms in Structured P2P Networks,” IEEE Trans. Parallel Distributed Systems, vol. 18, no. 6, pp. 849-862, June 2007.

[11] H.-C. Hsiao, H. Liao, S.-S. Chen, and K.-C. Huang, “Load Balance with Imperfect Information in Structured Peer-to-Peer Systems,” IEEE Trans. Parallel Distributed Systems, vol. 22, no. 4, pp. 634-649, Apr. 2011.

[12] H.-C. Lin and C.S. Raghavendra, “A Dynamic Load-Balancing Policy with a Central Job Dispatcher (LBC),” IEEE Trans. Software Eng., vol. 18, no. 2, pp. 148-158, Feb. 1992.

[13] F.C.H. Lin and R.M. Keller, “The Gradient Model Load Balancing Method,” IEEE Trans. Software Eng., vol. 13, no. 1, pp. 32-38, Jan. 1987.

[14] L.M. Ni and K. Hwang, “Optimal Load Balancing in a Multiple Processor System with Many Job Classes,” IEEE Trans. Software Eng., vol. 11, no. 5, pp. 491-496, May 1985.

[15] L.M. Ni, C.-W. Xu, and T.B. Gendreau, “A Distributed Drafting Algorithm for Load Balancing,” IEEE Trans. Software Eng., vol. 11, no. 10, pp. 1153-1161, Oct. 1985.

Authors Profile:

Remella Suvarna received her Bachelor’s Degree in Computer Science from SNSNR Degree College, Ravulapalem, affiliated to Andhra University, A.P., and her Master’s Degree in Computer Applications from SD College of IT, Tanuku, affiliated to Andhra University, A.P. At present she is pursuing the second year of her M.Tech (Computer Science & Engineering) at Godavari Institute of Engineering and Technology, Rajahmundry, affiliated to JNTUK.

Mr. N V B Gangadhara Rao received his M.Tech in ACV from IIT Kharagpur, W.B., and his B.Tech in ECE from JNTU College of Engineering, Kakinada, A.P. He has been working as Associate Professor in Information Technology at Godavari Institute of Engineering and Technology, Rajahmundry, A.P., India, since 2010. His research interests include Secured Software Development, Computer Vision, and Data Mining & Object Recognition.

Dr. S. MaruthuPerumal received his B.E. in Computer Science and Engineering from Bharathidasan University and his M.E. in Computer Science and Engineering from Sathyabama University, Chennai. He received his Ph.D. from Dr. MGR University, Chennai. He has fifteen years of teaching experience. At present he is working as Professor and Head of the Department of CSE, Godavari Institute of Engineering and Technology, Rajahmundry. He has published twenty-three research papers in various international and national conferences and journals. His research interests include Image Processing, Digital Watermarking, Steganography, and Security. He is a life member of ISCA and IAENG and an Institutional Member of CSI.
