Empirical Virtual Machine Models for Performance Guarantees
Andrew Turner, Akkarit Sangpetch, Hyong S. Kim
Carnegie Mellon University
LISA 2010 11th...

• Multiple application tiers on different hosts
• Resource needs: F(resource allocation) = performance?
• Needs change throughout the day
• Over-provisioning wastes energy and resources
• Unhappy users

Our approach
• Observe performance
• Create online model
• Calculate required resources

Our approach
[Figure: control loop. Labels: SLO target, measured error, our system, resource allocation, physical machine, application performance, host and application monitoring, measured performance.]
• The control loop constantly checks performance and recalibrates resource allocation levels
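
A minimal sketch of such a feedback loop is given here for illustration. It swaps in a simple step rule for the model-based calculation described on the slides, and the application is replaced by a made-up formula; `toy_response_time_ms`, `control_step`, `run_loop`, the 5% step, and the 0.8 slack factor are all hypothetical.

```python
def toy_response_time_ms(allocation_pct, contention_pct):
    """Hypothetical stand-in for a monitored application (not from the slides)."""
    effective = max(allocation_pct * (1 - contention_pct / 100.0), 1.0)
    return 6000.0 / effective


def control_step(allocation_pct, measured_ms, target_ms, step_pct=5, slack=0.8):
    """Return the next allocation given the latest measured response time."""
    if measured_ms > target_ms:
        # Target missed: give the VM more of the resource, capped at 100%.
        return min(100, allocation_pct + step_pct)
    if measured_ms < slack * target_ms:
        # Comfortably under target: reclaim some of the resource.
        return max(step_pct, allocation_pct - step_pct)
    return allocation_pct


def run_loop(target_ms, contention_trace, start_pct=50):
    """Measure, record, and recalibrate once per entry in the contention trace."""
    allocation = start_pct
    history = []
    for contention in contention_trace:
        measured = toy_response_time_ms(allocation, contention)
        history.append((allocation, measured))
        allocation = control_step(allocation, measured, target_ms)
    return history


history = run_loop(target_ms=150, contention_trace=[20, 20, 40, 40, 40, 10])
```

Each entry in `history` pairs the allocation in force with the response time measured under it, which is the kind of data an online model could later be fitted to.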

Benefits of our system
• Automatically identifies performance bottlenecks
• Automatically sets resource allocation levels
• Provides more performance per resource allocated
• Reduces energy and hardware usage
• Allows SLOs to be met

Assumptions
• We can monitor application performance
• We can control resource access or scheduling
• Application performance is convex

Data used
T – SLO target
E – Probability T achieved
R – Real performance
C – Contention level
W – Workload level
A – Resource allocation
M – Performance model
Find A and guarantee that:
$$P(R \ge T) \ge E$$
where
$$P(R \ge T) = P(M(W, C, A) \ge T) = \int_{0}^{100\%}\int_{0}^{100\%} P(W = w)\, P(C = c)\, P(M(w, c, A) \ge T)\, dw\, dc$$
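
The guarantee can be approximated numerically by replacing the double integral with a sum over observed workload and contention levels. The sketch below is illustrative only: it assumes a deterministic model, so P(M(w, c, A) ≥ T) is either 0 or 1, and `toy_model`, the sample values, and the 5% candidate grid are hypothetical.

```python
from collections import Counter


def empirical_pmf(samples):
    """Estimate P(X = x) from observed samples."""
    counts = Counter(samples)
    total = len(samples)
    return {value: n / total for value, n in counts.items()}


def prob_target_met(model, workload_pmf, contention_pmf, allocation, target):
    """Discrete form of the integral: sum of P(W=w) P(C=c) over cases where M >= T."""
    return sum(
        pw * pc
        for w, pw in workload_pmf.items()
        for c, pc in contention_pmf.items()
        if model(w, c, allocation) >= target
    )


def smallest_allocation(model, workload_pmf, contention_pmf, target, confidence,
                        candidates=range(0, 101, 5)):
    """Return the first candidate allocation A with P(R >= T) >= E, or None."""
    for allocation in candidates:
        p = prob_target_met(model, workload_pmf, contention_pmf, allocation, target)
        if p >= confidence:
            return allocation
    return None


def toy_model(w, c, a):
    """Hypothetical performance score, higher is better (not from the slides)."""
    return a * (1 - c / 100.0) - w


workload_pmf = empirical_pmf([10, 10, 20, 20])   # hypothetical workload samples
contention_pmf = empirical_pmf([0, 50])          # hypothetical contention samples
chosen = smallest_allocation(toy_model, workload_pmf, contention_pmf,
                             target=20, confidence=0.75)
```

Raising `confidence` (E) forces a larger allocation, because more of the workload and contention combinations must meet the target.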

Creating the model
• Created online
• Uses previously observed data
• Curve fit to fill unobserved areas
[Figure: CPU contention effect on response time. x-axis: CPU contention (0% to 100%); y-axis: response time (ms), 0 to 1400.]
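
As an illustration of filling unobserved contention levels, the sketch below fits a polynomial to a handful of observations and evaluates it at a level that was never measured. The slides do not say which curve family is used; the quadratic, the helper name `fit_contention_curve`, and the synthetic data points are assumptions.

```python
import numpy as np


def fit_contention_curve(contention_pct, response_ms, degree=2):
    """Least-squares polynomial fit of response time against contention."""
    coeffs = np.polyfit(contention_pct, response_ms, deg=degree)
    return np.poly1d(coeffs)


# Synthetic observations for illustration only (not measurements from the talk).
observed_contention = np.array([0.0, 10.0, 20.0, 40.0, 60.0])
observed_response_ms = 100.0 + 0.1 * observed_contention**2

curve = fit_contention_curve(observed_contention, observed_response_ms)

# Estimate response time at a contention level that was never observed.
predicted_at_50 = float(curve(50.0))
```

In an online setting the fit would be repeated as new observations arrive, so the curve tracks the application's current behaviour.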

Deciding resource assignment
• Hyperplane at target performance
• Choose allocation that crosses plane
[Figure: TPC-W response time. x-axis: web server CPU allocation (%), 10 to 100; y-axis: response time (ms), 50 to 400; one curve each for 10%, 20%, 30%, and 40% contention, with the target marked.]
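
In one dimension, choosing the allocation that crosses the target plane amounts to finding where a response-time curve meets the target value. The sketch below does this by linear interpolation between modelled points; the function name and the sample curve are hypothetical, and the code assumes response time falls as allocation grows.

```python
def allocation_at_target(allocations, response_ms, target_ms):
    """Smallest allocation at which the piecewise-linear curve reaches target_ms.

    Returns None if the curve never gets down to the target.
    """
    points = sorted(zip(allocations, response_ms))
    if points[0][1] <= target_ms:
        # Already at or under target with the smallest allocation considered.
        return points[0][0]
    for (a0, r0), (a1, r1) in zip(points, points[1:]):
        if r0 > target_ms >= r1:
            # Interpolate between the two points that bracket the target.
            return a0 + (r0 - target_ms) * (a1 - a0) / (r0 - r1)
    return None


# Hypothetical curve for one contention level (not read off the slide).
alloc_pct = [10, 20, 40, 80]
resp_ms = [400, 250, 150, 100]

needed = allocation_at_target(alloc_pct, resp_ms, target_ms=200)
```

Repeating the lookup with the curve for the current contention level gives the allocation to apply at that moment.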

Deciding resource assignment
• Resource allocation is a vector of allocations
• E.g. (proxy = 65%, web = 55%) or (proxy = 80%, web = 35%)
[Figure: TPC-W response time while changing proxy and web CPU allocation. Axes: proxy server share, web server share, response time (ms); target marked.]

Deciding resource assignment
• Resource allocation is a vector of allocations
• E.g. (proxy = 65%, web = 55%) or (proxy = 80%, web = 35%)
[Figure: TPC-W response time at 40% contention. Axes: proxy server share and web server share (0 to 100); response-time levels from 50 to 250, target marked.]

Deciding resource assignment
• Resource allocation is a vector of allocations
[Figure: TPC-W response time at 30% contention. Axes: proxy server allocation and web server allocation (0 to 100); response-time levels from 50 to 250, target marked.]

Deciding resource assignment
A – the potential resource allocations
X – chosen resource allocations
Q – Priority level of application

             App 1 solutions     App 2 solutions
Hosts        30    50    60      70    20    40
              0     0     0      20    20    80
             60    50    30      50    90    50
Priority     0.8   0.8   0.8     1     1     1
Start X      ?     ?     ?       ?     ?     ?
End X        0     0     1       0     0     1

Minimize: $X^{T} A \, 1^{T} Q$
Subject to: $X^{T} A = 1$, $X \ge 0$
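
The selection on this slide can be illustrated with a brute-force search over the candidate solutions. The sketch reads each column of numbers as one candidate per-host allocation for an application, which is an interpretation of the slide. It also assumes that no host may exceed 100% and that cost is the priority-weighted total allocation; neither is stated on the slide, and exhaustive search stands in for whatever solver the authors use.

```python
from itertools import product

HOST_CAPACITY_PCT = 100  # assumption: a host cannot be allocated beyond 100%

# Candidate solutions as (host 1, host 2, host 3) allocations, taken from the
# slide's columns under the interpretation described above.
app1_candidates = [(30, 0, 60), (50, 0, 50), (60, 0, 30)]
app2_candidates = [(70, 20, 50), (20, 20, 90), (40, 80, 50)]
priorities = [0.8, 1.0]  # Q for App 1 and App 2


def choose_allocations(candidates_per_app, priorities, capacity=HOST_CAPACITY_PCT):
    """Pick one candidate per app, keeping every host within capacity,
    so that the priority-weighted total allocation is smallest."""
    best_choice, best_cost = None, float("inf")
    index_ranges = [range(len(cands)) for cands in candidates_per_app]
    for choice in product(*index_ranges):
        picked = [cands[i] for cands, i in zip(candidates_per_app, choice)]
        host_totals = [sum(per_host) for per_host in zip(*picked)]
        if any(total > capacity for total in host_totals):
            continue  # this combination overloads at least one host
        cost = sum(q * sum(alloc) for q, alloc in zip(priorities, picked))
        if cost < best_cost:
            best_choice, best_cost = choice, cost
    return best_choice


chosen = choose_allocations([app1_candidates, app2_candidates], priorities)
```

Under these assumptions the search selects the third candidate for each application, which agrees with the slide's End X row (0 0 1 0 0 1).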

Reducing model dimensions
• Which resources are important to model?
• Use regression to find impact of each resource

Time   CPU Contention   Disk Contention   Performance
1      10%              10%               130ms
2      40%              12%               180ms
3      14%              90%               135ms
4      12%              50%               132ms
5      30%              75%               160ms
6      10%              40%               130ms

• Disk has no effect
• CPU does have an effect
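
The regression step can be reproduced on the six rows of the table. The sketch below fits an ordinary least-squares model with NumPy and compares the slopes; the slides do not specify the regression method, so ordinary least squares and the helper name `resource_impacts` are assumptions.

```python
import numpy as np

# Values from the table: contention in percent, performance in ms.
cpu = np.array([10, 40, 14, 12, 30, 10], dtype=float)
disk = np.array([10, 12, 90, 50, 75, 40], dtype=float)
perf_ms = np.array([130, 180, 135, 132, 160, 130], dtype=float)


def resource_impacts(predictors, response):
    """Fit response = b0 + b1*x1 + b2*x2 + ... and return [b0, b1, b2, ...]."""
    design = np.column_stack([np.ones(len(response))] + list(predictors))
    coeffs, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coeffs


intercept, cpu_slope, disk_slope = resource_impacts([cpu, disk], perf_ms)

# A resource whose slope is near zero can be dropped from the model.
important = {"cpu": abs(cpu_slope) > 0.5, "disk": abs(disk_slope) > 0.5}
```

The 0.5 ms-per-percent cutoff is arbitrary and only serves to separate the two slopes in this small example.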

Experimental Evaluation
• We test TPC-W and a dynamic web page
• Measure response time
[Figure: test setup. Host 1: Apache Server, TPC-W Proxy. Host 2: Apache Server, TPC-W Web. Host 3: Apache Server, TPC-W DB.]

Experimental Evaluation
[Figure: response time (ms) for four tests: SLO 100ms, SLO 150ms, 50% resource allocation, and 10% resource allocation. Lower panel: CPU contention levels, CPU (%), for SQL, proxy, and web contention.]
• System keeps response time close to target

Experimental Evaluation
• Dynamic resource assignment helps meet SLOs
• Uses fewer resources than static allocation

Test                      RT average   Resource allocation average   Apache VM average
SLO = 100ms               89ms         48%                           125ms
SLO = 150ms               127ms        35%                           107ms
50% resource allocation   150ms        50%                           120ms
10% resource allocation   355ms        10%                           83ms

Experimental Evaluation
[Figure: total TPC-W resource allocation (%), 0 to 100, for SLO 100ms and SLO 150ms. Lower panel: CPU contention levels, CPU (%), for SQL, proxy, and web contention.]
• No increase in resource allocation, as the DB is not the bottleneck

Experimental Evaluation
[Figure: response time (ms) for four tests: SLO 100ms, SLO 150ms, 50% resource allocation, and 10% resource allocation. Lower panel: number of users.]
• Meets target time despite changes in workload

Conclusion
• We automatically calculate required resources
• Works on generic multi-tiered applications
• Helps to meet SLOs
• Better performance per resource assigned
• Simplifies resource management