Load Balancing in Xen Virtual Machine Monitor

Gaurav Somani¹ and Sanjay Chaudhary²

¹ Laxmi Niwas Mittal Institute of Information Technology, Jaipur, India
² Dhirubhai Ambani Institute of Information & Communication Technology, Gandhinagar, India

Abstract. Global load balancing across all the available physical processors is an important characteristic of a virtual machine scheduler. Xen's Simple Earliest Deadline First (SEDF) scheduler serves interactive and low-latency applications well, but it cannot be used effectively in multiprocessor environments because it lacks load balancing. This paper investigates the need for this feature and discusses the algorithmic design and implementation of a user-space load balancing program. Experimental results show a balanced load across the available physical processors and better utilization of resources in multiprocessor systems.

Keywords: Virtualization, Xen, Virtual machine scheduling, Global load balancing.

1 Introduction

Modern data centers host different applications ranging from web servers, database servers and high performance computing nodes to simple user desktops. Virtual machine based data center implementations provide numerous advantages such as resource isolation, better hardware utilization, security and easy management. The concept of virtualizing resources is old [16], but it has gained popularity with the rise of on-demand or cloud computing. A Virtual Machine Monitor (VMM), or hypervisor, is the piece of software which manages these virtual machines. Data centers which host virtual machines on their physical machines follow Service Level Agreements (SLAs), which specify the service requirements, constraints and parameters to be fulfilled by service or cloud providers [4]. Running more virtual machines on a single physical machine results in better hardware utilization. Xen is a popular open source virtual machine monitor. Its Credit scheduler uses weight and cap values to allocate CPU time, while the Simple Earliest Deadline First (SEDF) scheduler gives a time period and a slice to every virtual machine. The Credit scheduler has a global load balancing feature [5] in which every runnable VCPU gets a physical CPU if one is available. The SEDF scheduler provides comparatively fairer scheduling for interactive applications than the Credit scheduler [17], but it cannot dynamically balance the load among the physical processors. This paper investigates the need for this kind of feature and proposes an implementation of a Global Load Balancing


(GLB) algorithm. Section 2 describes the Xen virtual machine monitor architecture and the scheduling internals of its two schedulers. Requirements, design and implementation of the new algorithm are discussed in Section 3. Experiments and results are discussed in Section 4. Related work, conclusions and future work are given in the last sections.

2 Virtual Machine Scheduling

Virtualization is mostly targeted towards the wide-spread x86 architecture. Xen uses a paravirtualization strategy to create virtual machines and run operating systems on top of it [2]. Guest operating systems running on top of Xen are known as domains in Xen's terminology. Xen designates the host operating system (domain 0) as the isolated driver domain (IDD), which provides device driver support to the guest operating systems. In the Xen architecture, the device drivers in the host operating system serve all co-hosted guest operating systems. A guest domain, also known as domU, accesses devices via back-end drivers provided by domain 0.

Scheduling in Xen has ported some concepts from operating systems, and virtual machine scheduling is compared with process scheduling in [6]. An operating system kernel that provides an N:M threading library schedules N kernel threads (typically one per physical context), onto which a user space library multiplexes M user space threads [6]. In Xen, kernel threads are analogous to VCPUs (Virtual Central Processing Units) and the user space threads represent processes within the domain. In a Xen system there can even be another tier, because the guest domains can also run user space threads. This gives three tiers of schedulers [6]:

1. The guest's threading library mapping user space threads to kernel threads.
2. The guest kernel mapping kernel threads to VCPUs.
3. The VMM mapping VCPUs to physical CPUs (PCPUs).

The hypervisor scheduler, sitting at the bottom of this three-tier architecture, needs to be predictable. The layers above it make assumptions about the behavior of the underlying scheduling and base their decisions on them. The Xen scheduler is therefore one of the most important parts for achieving overall performance (Figure 1).

2.1 Simple Earliest Deadline First Scheduler (SEDF)

The SEDF scheduler is an extension of the classical Earliest Deadline First (EDF) scheduler. It provides weighted CPU sharing in an intuitive way and uses real-time algorithms to ensure time guarantees. It is a soft real-time scheduler which operates on the deadlines of the domains: the application with the earliest deadline is scheduled first so that it meets its goal on time. Xen operates the SEDF scheduler with two parameters decided by the system administrator, a time period P_k and a time slice S_k, where k designates one task out of the n tasks in total. Every runnable domain will run for at least S_k time in each period of P_k time, so this kind of scheduler gives soft real-time guarantees to domains.


Fig. 1. Virtual Machine Scheduling in Xen [6]

Soft real-time schedulers are those in which some of the tasks can tolerate lateness or a deadline miss. SEDF maintains a per-CPU queue to schedule domains according to their deadlines [5][15]. The deadline of each domain is the time at which the domain's current period ends. SEDF is a preemptive scheduler whose fairness is decided by the parameters chosen by the user. SEDF can be a good choice for latency-sensitive applications: domains which host I/O intensive applications require very little CPU time, but that time is critical [20]. It is a preemptive policy in which tasks are prioritized in reverse order of their impending deadlines, and the task with the highest priority is the one that runs. We assume that the deadlines of our tasks occur at the ends of their periods, although this is not required by EDF [13]. Given a system of n independent periodic tasks, all the tasks will meet their deadlines if and only if

    U(n) = \sum_{k=1}^{n} \frac{S_k}{P_k} \le 1                (1)

That is, EDF can guarantee that all deadlines are met if the total CPU utilization is not more than 100%. If the above condition is not met, one or more tasks will miss their deadlines. To understand the equation, consider three periodic tasks scheduled using EDF with the following slices and periods; the scheduler should be able to provide at least S_k time to task k in each period P_k.

1. P_1 = 8 and S_1 = 1.
2. P_2 = 5 and S_2 = 2.
3. P_3 = 10 and S_3 = 4.

Here the value of U (utilization) is

    U(n) = \frac{1}{8} + \frac{2}{5} + \frac{4}{10} = 0.925 \le 1                (2)

So for this set of tasks a schedule can be made in which all the deadlines are met.
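As an aside, the admission test of equation (1) is easy to check mechanically. The following small C sketch is our own illustration, not part of the paper's implementation (the function name edf_utilization is ours); it applies the test to the three-task example above.

#include <stdio.h>

/* EDF admission test of equation (1): a task set is schedulable
 * if the sum of S_k / P_k over all tasks does not exceed 1. */
static double edf_utilization(const double slice[], const double period[], int n)
{
    double u = 0.0;
    for (int k = 0; k < n; k++)
        u += slice[k] / period[k];
    return u;
}

int main(void)
{
    double S[] = {1, 2, 4};   /* slices  S_1..S_3 from the example */
    double P[] = {8, 5, 10};  /* periods P_1..P_3 from the example */
    double u = edf_utilization(S, P, 3);
    printf("U = %.3f -> %s\n", u, u <= 1.0 ? "schedulable" : "not schedulable");
    return 0;                 /* prints: U = 0.925 -> schedulable */
}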

Xen's SEDF scheduler can be used in both work-conserving (WC) and non-work-conserving (NWC) modes. In work-conserving mode, shares are guaranteed, and runnable tasks can additionally use idle CPU time.


A CPU does not become idle as long as there is some runnable task in the system; if there is only one runnable task and all other tasks are idle or blocked, that single task can consume the whole CPU. In non-work-conserving mode, on the other hand, a task is not given more than its share even if a CPU is free. Xen provides these two modes in both of its current schedulers: the Credit scheduler uses the "cap" value to choose between them, while the SEDF scheduler has an extra-time option; by enabling this field a domain is able to get CPU time beyond its slice when a physical CPU is idle.

The SEDF scheduler has a serious drawback when it is used on multiprocessor systems: it cannot balance the available VCPUs across processors. It simply maintains per-physical-CPU queues and schedules each queue on its own processor. Xen's Credit scheduler, in contrast, provides global load balancing among the available physical CPUs: any runnable VCPU in the system will get a CPU if one is available, so no CPU stays idle while a runnable VCPU exists. This is a strong reason to choose Credit as the scheduler. With SEDF, manual pinning can be used to fix a VCPU on a physical CPU (PCPU). Multicore and multiprocessor hardware is an important target for virtualization because it offers more CPU capacity to share within a virtualization environment. A global load balancing strategy such as the Credit scheduler's re-assigns VCPUs to dynamically balance the load among the available CPUs. A good example is given in [5]: consider a 2-CPU machine running three VMs, each with a single VCPU and equal weight. One would expect each VM to get around 66.6% of a CPU, but this can only be achieved when the scheduler has a global load balancing feature. With the same configuration on a server running the SEDF scheduler, all the VMs are assigned to the first processor. Since SEDF maintains per-CPU queues, each VM gets only a 33.3% share of the first CPU on average, which results in at most 50% utilization of the whole system; the other CPU is not used at all. If we instead define affinity rules using the pinning mechanism provided by Xen, we get at most 100%, 50% and 50% shares respectively for the three VMs.
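For illustration, the manual pinning mentioned above can be driven from a small C helper that invokes the xm vcpu-pin command of Xen's xm toolstack. This is only a sketch: the domain names vm1, vm2 and vm3 and the CPU assignments are hypothetical and simply reproduce the 100%/50%/50% case discussed above.

#include <stdio.h>
#include <stdlib.h>

/* Pin one VCPU of a domain to a physical CPU by invoking the xm toolstack.
 * Domain names, VCPU numbers and CPU numbers used here are illustrative. */
static int pin_vcpu(const char *domain, int vcpu, int pcpu)
{
    char cmd[128];
    snprintf(cmd, sizeof(cmd), "xm vcpu-pin %s %d %d", domain, vcpu, pcpu);
    return system(cmd);               /* returns the exit status of xm */
}

int main(void)
{
    pin_vcpu("vm1", 0, 0);   /* vm1 alone on CPU0 -> up to 100% of a CPU */
    pin_vcpu("vm2", 0, 1);   /* vm2 and vm3 share                        */
    pin_vcpu("vm3", 0, 1);   /*   CPU1 -> at most 50% each               */
    return 0;
}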

3 Requirement of Global Load Balancing

Several points support the requirement for load balancing.

1. SEDF's present allocation. Consider a dual-CPU system (two CPUs, CPU0 and CPU1) with five VCPUs to run: domain 0 has two VCPUs and domain 1 has three VCPUs. SEDF assigns VCPU0 and VCPU1 of domain 0 to CPU0; for domain 1, VCPU0 and VCPU1 are assigned to CPU0 and VCPU2 is assigned to CPU1. So if a VM has two VCPUs, both are assigned to the 0th processor, and a VM with four VCPUs gets only a single VCPU on the 1st processor. This allocation is quite static and imbalanced.


2. Why VCPU balancing and not domain balancing? VCPUs are the finer abstraction through which the hypervisor scheduler allocates CPU time. A domain can only be placed by defining affinity (mapping) rules between each of its VCPUs and the physical processors (PCPUs). We use the pinning mechanism provided by the Xen hypervisor to achieve global load balancing among the available physical CPUs: pinning a VCPU assigns it to a particular CPU, and Xen allows CPU affinity to be set through the virsh interface of the libvirt virtualization library. Before the algorithm starts working, the utilization of each VCPU is measured using the command line interface xm vcpu-list, which gives the total elapsed CPU time of each VCPU at any instant. This elapsed time E_0 is noted as the initial reading for each VCPU; after time T a second reading E_T is taken. The utilization of each VCPU is then calculated as (a small measurement sketch follows this list)

    U(VCPU) = \frac{E_T - E_0}{T}                (3)
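A minimal C sketch of this measurement is shown below. It is our own illustration, not the original program; read_vcpu_time is a stub standing in for parsing the xm vcpu-list output.

#include <stdio.h>
#include <unistd.h>

/* Stand-in for reading the cumulative CPU time (seconds) of one VCPU.
 * In the real program this value comes from the xm vcpu-list output;
 * here it is a stub so the sketch compiles on its own. */
static double read_vcpu_time(int domain, int vcpu)
{
    (void)domain; (void)vcpu;
    return 0.0;   /* replace with actual parsing of the xm output */
}

/* Utilization of one VCPU over a window of T seconds, equation (3). */
static double vcpu_utilization(int domain, int vcpu, double T)
{
    double e0 = read_vcpu_time(domain, vcpu);   /* initial reading E_0 */
    sleep((unsigned int)T);                     /* measurement window  */
    double et = read_vcpu_time(domain, vcpu);   /* second reading E_T  */
    return (et - e0) / T;                       /* share of one CPU    */
}

int main(void)
{
    printf("U = %.2f\n", vcpu_utilization(1, 0, 5.0));
    return 0;
}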

3.1 Description

Based upon equation (3), we propose an algorithm for the load balancing strategy. Equation (3) gives the utilization of each VCPU over the duration T, and this utilization is the basis for sorting the VCPUs in decreasing order. A modified worst-fit bin packing algorithm is used to achieve the balancing [11][19]. Bin packing is the classical problem of packing n objects of size c_i (a real value between 0 and 1) into the minimum number of unit-size bins. Our aim is slightly different: Global Load Balancing aims to balance the load among the available processors. The number of bins (PCPUs) is N_pcpus, and the sum of the sizes of all the objects is less than the sum of the bin capacities. Balancing can be achieved with the worst-fit rule, which places the largest remaining element into the emptiest bin, i.e. the least loaded processor in the present case [11][19]. Formally, we can define the problem as follows. Let the N_vcpus VCPUs have utilizations U_i, i = 1..N_vcpus, and let there be N_pcpus PCPUs, each with capacity 100, whose present utilizations are UP_j, j = 1..N_pcpus. The aim is to place all the VCPUs on the PCPUs such that the resulting utilizations UP_j are nearly equal across all the available PCPUs. The overall flow can be seen in the example given in Figure 2, where N_pcpus = 2 and N_vcpus = 5. According to the initial mapping and the utilizations U_i, i = 1..5, the PCPU utilizations are UP_1 = 93% and UP_2 = 25%. In the next step, the whole VCPU list is sorted in decreasing order of utilization. In the final stage, when the new mapping produced by the algorithm in Section 3.2 has been applied, the balanced utilizations become UP_1 = 60% and UP_2 = 58%. The algorithm shown in Section 3.2 takes the current VCPU mapping as input in the form of a matrix.


Fig. 2. Global Load Balancing Flow

The input mapping matrix A has one row for each VCPU available in the system. Each row corresponds to a mapping of the form "domain - VCPU - CPU", and the fourth column of each row holds the utilization U_i, i = 1..N_vcpus. This matrix is sorted in order of decreasing utilization, and a new mapping is then built using the worst-fit bin packing rule to fill all the physical processors. The output matrix B contains the final mapping to be applied to balance the load.

3.2 Algorithm

OBJECTIVE: This algorithm assigns the VCPUs among the physical processors to balance the total load.

INPUT:
    No. of PCPUs : N_pcpus
    No. of VCPUs : N_vcpus
    Affinity matrix : A (N_vcpus x 4). The first three columns correspond to the
    mapping of a domain's VCPU to a PCPU; the fourth column is the utilization.

OUTPUT:
    Balanced affinity matrix : B (N_vcpus x 3). Each row is the new mapping of a
    domain's VCPU to a PCPU.

ALGORITHM [GLOBAL-LOAD-BALANCE]

 1) Sort the matrix A by utilization in decreasing order
 2) for each row in A do
 3)     Pick the VCPU in order of utilization
 4)     Create a new mapping by assigning this VCPU to the emptiest PCPU
 5)     Add this mapping to matrix B
 6) done
 7) for each row in B do
 8)     Apply the new mapping
 9) done
10) Return the balanced affinity matrix B

END [GLOBAL-LOAD-BALANCE]
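A compact C rendering of the sorting and worst-fit placement steps is shown below. This is our own sketch, not the paper's program: it assumes utilizations have already been measured, uses illustrative input values, and leaves applying matrix B (steps 7-9) to the caller, for example through VCPU pinning as sketched earlier. The structure vcpu_map and the function names are ours.

#include <stdio.h>
#include <stdlib.h>

struct vcpu_map {
    int domain;      /* domain ID                     */
    int vcpu;        /* VCPU number within the domain */
    int pcpu;        /* assigned physical CPU         */
    double util;     /* measured utilization, 0..100  */
};

/* qsort comparator: decreasing utilization. */
static int by_util_desc(const void *a, const void *b)
{
    double ua = ((const struct vcpu_map *)a)->util;
    double ub = ((const struct vcpu_map *)b)->util;
    return (ua < ub) - (ua > ub);
}

/* GLOBAL-LOAD-BALANCE: rewrite the pcpu field of each entry so that the
 * largest VCPUs land on the currently emptiest physical CPU (worst fit). */
static void global_load_balance(struct vcpu_map a[], int n_vcpus, int n_pcpus)
{
    double load[64] = {0};   /* per-PCPU accumulated load, n_pcpus <= 64 assumed */

    qsort(a, n_vcpus, sizeof(a[0]), by_util_desc);   /* step 1 */

    for (int i = 0; i < n_vcpus; i++) {              /* steps 2-6 */
        int emptiest = 0;
        for (int j = 1; j < n_pcpus; j++)
            if (load[j] < load[emptiest])
                emptiest = j;
        a[i].pcpu = emptiest;                        /* new mapping (row of B) */
        load[emptiest] += a[i].util;
    }
}

int main(void)
{
    /* Five VCPUs with illustrative utilizations, spread over 2 PCPUs. */
    struct vcpu_map a[] = {
        {1, 0, 0, 40}, {1, 1, 0, 33}, {2, 0, 0, 20}, {2, 1, 1, 15}, {3, 0, 1, 10}
    };
    global_load_balance(a, 5, 2);
    for (int i = 0; i < 5; i++)
        printf("domain %d vcpu %d -> cpu %d\n", a[i].domain, a[i].vcpu, a[i].pcpu);
    return 0;
}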


4 Experiments and Results

The previous section gave an elaborated description of the algorithm listed in Section 3.2. The algorithm has been implemented as a user-space program in C, with shell and awk scripts used to process the utilization data. To show the efficiency of the Global Load Balancing algorithm, the experiment shown in Figure 3 was conducted. Four virtual machines with one VCPU each all start running a CPU-intensive program at time t = 0; this program executes integer and floating point instructions in a long loop (a stand-in sketch is shown after Figure 3). The physical machine has two processors. SEDF assigns all these VCPUs to processor 0 by default, and each domain completes the CPU test program in around 237 seconds. The first plot in Figure 3 shows this behavior of SEDF. In the next plot we started with the same conditions and, at time t = 10 seconds, triggered our Global Load Balancing algorithm with the test duration parameter T set to 5 seconds. Calculating the utilizations, computing the final mapping and applying it took around 5 seconds; after that the overall utilization of the whole system rose to 100%. The utilization of each VM also doubled because the load was now balanced over two processors, and the total time taken by the VMs to complete the test was reduced to only 135 seconds.

Fig. 3. Experiment : Global Load Balancing Activity
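The CPU-intensive test program itself is not listed in the paper; a minimal stand-in with the same character (integer and floating point work in a long loop, with an iteration count chosen so that a run lasts a few minutes on the test machine) might look like this:

#include <stdio.h>

/* Hypothetical stand-in for the CPU-intensive test program.
 * ITERATIONS is arbitrary; tune it so the run lasts a few minutes. */
#define ITERATIONS 2000000000LL

int main(void)
{
    long long sum = 0;
    double x = 1.0;
    for (long long i = 1; i <= ITERATIONS; i++) {
        sum += i % 7;                    /* integer work        */
        x = x * 1.000000001 + 0.5 / i;   /* floating point work */
    }
    printf("%lld %f\n", sum, x);         /* keep the loop from being optimized away */
    return 0;
}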

4.1 Load Balancing Parameters

1. Time duration of the measurement: the choice of T depends purely on the utilization data of the VCPUs. Figure 3 shows that the balancer reacts within a few seconds of gathering data; the graph above shows the activity with T = 5 seconds.
2. When to run: the balancer can be run at any time, and as soon as possible after the VMs have been started. Running it as a daemon that checks the current balance at a fixed interval and changes the mapping when necessary is useful for continuous balancing, as sketched below.
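A possible shape for such a daemon is given below. This is our own sketch: the helper functions are the illustrative ones introduced earlier (shown here as stubs so the listing compiles on its own), and the 30-second rebalancing interval and the 2-PCPU machine are assumptions, not values from the paper.

#include <stdio.h>
#include <unistd.h>

#define MAX_VCPUS 64
#define INTERVAL  30      /* seconds between rebalancing passes (assumed value) */

struct vcpu_map { int domain, vcpu, pcpu; double util; };

/* Stubs standing in for the helpers sketched earlier: measuring per-VCPU
 * utilization, computing the worst-fit mapping and applying it (e.g. via
 * xm vcpu-pin). */
static int  collect_vcpu_utilization(struct vcpu_map a[], int max, double T)
{ (void)a; (void)max; (void)T; return 0; }
static void global_load_balance(struct vcpu_map a[], int n, int pcpus)
{ (void)a; (void)n; (void)pcpus; }
static void apply_mapping(const struct vcpu_map a[], int n)
{ (void)a; (void)n; }

int main(void)
{
    struct vcpu_map a[MAX_VCPUS];
    for (;;) {                  /* run as a daemon: rebalance periodically */
        int n = collect_vcpu_utilization(a, MAX_VCPUS, 5.0); /* T = 5 s    */
        global_load_balance(a, n, 2);    /* 2 PCPUs in the test machine    */
        apply_mapping(a, n);
        sleep(INTERVAL);
    }
    return 0;
}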

5 Related Work

Global load balancing among the nodes of a distributed computing environment has been studied in [3][8][10]. Distributed shared memory architectures and load balancing among peers have been analysed in [1][12]. Load balancing of tasks in a multiprocessor environment is done in [14][9]. Load balancing among physical servers in a virtualization environment with migration support is implemented in [7][18]. Virtualization environments differ from native architectures because of the three-tier scheduling discussed in Section 2 and in [6]. Global load balancing of virtual processors among the physical processors of a single machine is done by the Credit scheduler of the Xen virtual machine monitor. Our work automates the affinity rules for the VCPUs among the PCPUs using the proposed Global Load Balancing (GLB) algorithm.

6 Conclusion and Future Work

A novel approach has been proposed and developed to balance VCPUs among the available physical processors according to their utilization. The algorithm adds global load balancing support to Xen's SEDF scheduler. The experimental results show an increase in performance and in the utilization of the physical machine and the running domains. The developed user-space program provides a practical way to implement load balancing among the physical processors, and the dynamic affinity it applies yields higher utilization. The load balancing strategy can be extended by measuring utilization at finer time scales, by analysing multiprocessor scheduling inside the guest operating system, and by using predictive models.

References

1. Ahrens, J.P., Hansen, C.D.: Cost effective data-parallel load balancing. Technical Report 95-04-02, University of Washington (1995)
2. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: SOSP 2003: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pp. 164–177. ACM, New York (2003)
3. Biagioni, E.S., Prins, J.F.: Scan directed load balancing for highly parallel mesh-connected parallel computers. Unstructured Scientific Computation on Scalable Multiprocessors 10, 371–395 (1990)
4. Chen, Y., Iyer, S., Liu, X., Milojicic, D., Sahai, A.: Translating Service Level Objectives to lower level policies for multi-tier services. Cluster Computing 11(3), 299–311 (2008)
5. Cherkasova, L., Gupta, D., Vahdat, A.: Comparison of the three CPU schedulers in Xen. SIGMETRICS Perform. Eval. Rev. 35(2), 42–51 (2007)
6. Chisnall, D.: The Definitive Guide to the Xen Hypervisor. Prentice Hall Open Source Software Development Series. Prentice Hall PTR, Upper Saddle River (2007)


7. Padala, P., et al.: Automated Control of Multiple Virtualized Resources. Technical Report HPL-2008-123R1, HP Laboratories (2008)
8. Gates, K.E., Peterson, W.P.: A technical description of some parallel computers. International Journal of High Speed Computing 6(3), 399–449 (1994)
9. Hajek, B.E.: Performance of global load balancing by local adjustment. IEEE Transactions on Information Theory 36(6), 1398–1414 (1990)
10. Hwang, K.: Advanced Computer Architecture: Parallelism, Scalability, Programmability. MIT Press and McGraw-Hill Inc. (1993)
11. Johnson, D.S.: Fast algorithms for bin packing. Journal of Computer and System Sciences 8, 256–278 (1974)
12. Lenoski, D.E., Weber, W.D.: Scalable Shared-Memory Multiprocessing. Morgan Kaufmann Publishers Inc., San Francisco (1995)
13. Lin, B., Dinda, P.A.: VSched: Mixing Batch and Interactive Virtual Machines Using Periodic Real-time Scheduling. In: SC 2005: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, p. 8. IEEE Computer Society, Washington, DC (2005)
14. Nicol, D.M.: Communication efficient global load balancing. In: Proceedings of the Scalable High Performance Computing Conference, pp. 292–299 (April 1992)
15. Ongaro, D., Cox, A.L., Rixner, S.: Scheduling I/O in virtual machine monitors. In: VEE 2008: Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pp. 1–10. ACM, New York (2008)
16. Popek, G.J., Goldberg, R.P.: Formal requirements for virtualizable third generation architectures. Commun. ACM 17(7), 412–421 (1974)
17. Somani, G., Chaudhary, S.: Application performance isolation in virtualization. In: IEEE International Conference on Cloud Computing, pp. 41–48. IEEE, Los Alamitos (2009)
18. VMware: VMware Infrastructure: Resource Management with VMware DRS. Technical report (2008)
19. Weisstein, E.W.: Bin-Packing Problem. From MathWorld–A Wolfram Web Resource, http://mathworld.wolfram.com/Bin-PackingProblem.html
20. Xen wiki: Scheduling–PrgmrWiki, book.xen.prgmr.com/mediawiki/index.php/Scheduling