Progress and Challenges for Real-Time Virtualization*

Chris Gill

Professor of Computer Science and Engineering Washington University, St. Louis, MO, USA [email protected]

VtRES Workshop Keynote at RTCSA 2013

National Taiwan University, Taipei, Taiwan, Wed Aug 21, 2013 *Our research described in this talk is supported in part by the US NSF and the US ONR, and has been driven by numerous contributions from Sisu Xi, Justin Wilson, Chong Li, and Chenyang Lu (Washington University in St. Louis) and from Jaewoo Lee, Sanjian Chen, Linh Phan, Insup Lee, and Oleg Sokolsky (University of Pennsylvania)

Two Key Uses of Virtualization
•  Use fewer computing resources and/or platforms to (consolidate or) integrate systems via virtualization
•  Provide elastic cloud services on-demand and at scale to multiple tenants via virtualization

Challenges for Real-Time Virtualization
•  Real-Time System Integration
   –  How to schedule resources feasibly among competing domains
   –  How to maintain timing guarantees as different components and systems are composed
   –  How to preserve guarantees across multiple shared resources
•  Real-Time Cloud Services
   –  How to analyze timing and provide guarantees in the face of resource elasticity and multi-tenancy

RT Virtualization for System Integration
•  Some key challenges for real-time (especially safety-critical) systems
   –  Temporal isolation as dedicated cores become shared ones
   –  Preserving isolation as components and systems are composed
   –  Maintaining end-to-end timing guarantees as networked communication becomes inter-domain communication spanning both computation and communication resources

[Figure: legacy systems consolidated as domains atop a virtualization platform (hypervisor)]

A Brief Survey of Other Related Work (please see our publications for references)
•  Improving VMM scheduling (Credit, SEDF) and Domain 0 in Xen
   –  Often helps with isolation, predictability, etc., but without real-time guarantees
•  Improving inter-domain communication in Xen
   –  E.g., XWAY, XenLoop, XenSocket: involve modifying the guest OS or applications
•  Approaches targeting other virtualization architectures
   –  Cucinotta et al. [COMPSAC 2009] applied hierarchical real-time scheduling to KVM, e.g., towards supporting Real-Time Service Oriented Architectures
   –  Fiasco and L4 (TU Dresden) offer precise virtualization capabilities for systems ranging from small embedded systems to large complex systems

Traditional Virtualization in Xen
•  Good for system integration, cost reduction, etc.
•  Applications may be real-time aware, but the hypervisor is NOT: domains are scheduled round-robin with NO prioritization of OS instances

[Figure: apps and OS instances in domains atop the Xen hypervisor and hardware, sharing CPU time round-robin]

Problem: some real-time applications CANNOT benefit from this kind of virtualization

RT Virtualization I: Real-Time Scheduling of Domains in RT-Xen
•  Basic solution: incorporate hierarchical scheduling into Xen
   –  Each guest OS scheduler becomes a leaf scheduler above the Xen (root) scheduler
   –  Leaves are implemented as servers, each with a (Period, Budget, Priority)

[Figure: apps over per-domain OS/leaf schedulers, over the Xen root scheduler]
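The two-level design can be sketched as follows. This is an illustrative model only, not RT-Xen's actual code (which lives in C inside Xen's scheduler framework); the dictionary fields and the strict fixed-priority root policy are simplifying assumptions.

```python
def root_pick(servers):
    """Root scheduler: run the highest-priority server (lower number = higher
    priority) that still has budget left in the current period."""
    eligible = [s for s in servers if s["budget"] > 0]
    return min(eligible, key=lambda s: s["priority"]) if eligible else None

def leaf_pick(tasks):
    """Leaf (guest OS) scheduler: rate monotonic, i.e., among pending tasks
    run the one with the shortest period."""
    return min(tasks, key=lambda t: t["period"]) if tasks else None
```

At each scheduling point the hypervisor would call `root_pick` to choose a domain; that domain's own scheduler (modeled by `leaf_pick`) then chooses a task.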

Basic Server Design (Deferrable & Periodic)
•  Servers have 3 parameters (Period, Budget, Priority)

[Figure: timelines of server S1 (5, 3, 1) serving two tasks T1 (10, 3) and T2 (10, 3), showing actual execution and the budget in S1 over time 0–15. The deferrable server preserves unused budget within a period, allowing back-to-back execution across a replenishment; the periodic server idles its budget away (IDLE 3) when no tasks are pending.]
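The difference between the two server types above can be reproduced with a small single-CPU simulation. This is a sketch under simplifying assumptions (integer time, a single server, work in unit chunks), not the RT-Xen implementation:

```python
def simulate(server_type, period, budget, arrivals, horizon):
    """server_type: "deferrable" or "periodic". arrivals: dict mapping a time
    to units of work released then. Returns completion times of work units."""
    work = 0          # pending units of work
    b = budget        # remaining budget in the current period
    done = []
    for t in range(horizon):
        if t % period == 0:
            b = budget            # replenish at each period boundary
        work += arrivals.get(t, 0)
        if b > 0:
            if work > 0:
                work -= 1         # serve one unit of work
                b -= 1
                done.append(t + 1)
            elif server_type == "periodic":
                b -= 1            # periodic server burns budget even when idle
            # a deferrable server preserves its budget when idle
    return done
```

For S1 = (period 5, budget 3) and 6 units of work arriving at time 2 (as for T1 and T2 above), the deferrable server finishes at time 8 by running back-to-back across a replenishment, while the periodic server, having idled away budget before the arrival, finishes at time 12. The same back-to-back behavior is what makes a deferrable server's worst-case interference on lower-priority servers larger.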

Evaluation Setup
•  Scheduling algorithm: one of (Deferrable, Polling, Periodic, Sporadic), with a (Period, Budget, Priority) per domain (Dom1, Dom2, …)
•  Rate Monotonic within each domain; for each task: shorter period → higher priority

[Figure: Dom0 plus Dom1–Dom5, each with a VCPU, scheduled by the RT-Xen schedulers (Deferrable, Polling, Periodic, Sporadic) on Core 0 and Core 1]
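The per-domain policy above, rate monotonic priority assignment plus the classic Liu & Layland utilization bound, is simple enough to sketch (illustrative only; the tuple shapes are assumptions):

```python
def rm_priorities(tasks):
    """tasks: list of (name, period). Rate monotonic: shorter period ->
    higher priority. Returns names from highest to lowest priority."""
    return [name for name, period in sorted(tasks, key=lambda t: t[1])]

def rm_util_bound_ok(utilizations):
    """Sufficient (not necessary) schedulability test for RM on one CPU:
    total utilization <= n * (2**(1/n) - 1)."""
    n = len(utilizations)
    return sum(utilizations) <= n * (2 ** (1 / n) - 1)
```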

Xen Credit vs. Real-Time VM Scheduling

[Figure: deadline miss ratio (0–0.8) vs. total CPU load (50–100%) for the Deferrable, Sporadic, Polling, and Periodic RT-Xen schedulers and Xen's Credit and SEDF schedulers]
•  Credit scheduler → poor real-time performance
•  Real-time VM scheduling helps!

“RT-Xen: Towards Real-Time Hypervisor Scheduling in Xen”, ACM International Conference on Embedded Software (EMSOFT), 2011

RT Virtualization II: Incorporating Compositional Scheduling
•  Compositional Scheduling Framework (CSF)
   –  Provides temporal isolation and real-time guarantees
   –  Computes components' minimum-bandwidth resource model
•  Mind the gap between CSF theory and system implementation
   –  Realizing CSF through virtualization can bridge that gap

[Figure: a parent component's scheduler composes child components via resource models; each child component is a workload of periodic tasks under a Rate Monotonic scheduler, abstracted by a periodic resource model (period, budget)]

Compositional Scheduling in RT-Xen
•  Component → domain
•  Periodic Resource Model (PRM) → Periodic Server (PS)
•  Task model: independent, CPU-intensive, periodic tasks
   –  Scheduling algorithm: rate monotonic

[Figure: the theoretical framework (tasks in components with PRMs under a root component) maps onto the system implementation in RT-Xen (apps in domains with periodic servers atop the hypervisor and hardware)]

First Need to Extend CSF to Deal with Quantum-based Scheduling Platforms
•  Find the minimum-bandwidth resource model for workload W
   –  Quantum-based resource models take integer periods and budgets, unlike the real-number-based models of classical CSF
   –  A necessary condition for schedulability rules out bandwidths below the workload's demand as infeasible
   –  The non-decreasing bandwidth B/P yields an upper bound on the period range over which the minimum-bandwidth resource model must be searched

[Figure: bandwidth of the resource model vs. its period P, contrasting the quantum-based and real-number-based minimum-bandwidth resource models; the region below the necessary schedulability condition is infeasible]
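The quantum-based search can be sketched by combining the periodic resource model's worst-case supply bound function (Shin & Lee's formulation) with a rate-monotonic demand test, iterating over integer periods and budgets. This is an illustrative brute-force version, not the optimized procedure from the paper:

```python
import math

def sbf(P, B, t):
    """Worst-case supply of periodic resource model (P, B) in any interval of
    length t: after a blackout of up to 2*(P - B), B units arrive per period."""
    if t < P - B:
        return 0
    y = (t - (P - B)) // P
    return y * B + max(0, t - 2 * (P - B) - y * P)

def rm_schedulable(tasks, P, B):
    """tasks: list of (period, wcet), rate monotonic. Schedulable under PRM
    (P, B) if each task has some t <= its period with demand <= supply."""
    tasks = sorted(tasks)
    for i, (p_i, e_i) in enumerate(tasks):
        if not any(e_i + sum(math.ceil(t / p_j) * e_j for p_j, e_j in tasks[:i])
                   <= sbf(P, B, t) for t in range(1, p_i + 1)):
            return False
    return True

def min_bw_prm(tasks, P_max):
    """Search integer (quantum-based) periods for a minimum-bandwidth PRM."""
    best = None
    for P in range(1, P_max + 1):
        for B in range(1, P + 1):
            if rm_schedulable(tasks, P, B):
                if best is None or B / P < best[1] / best[0]:
                    best = (P, B)
                break  # smallest feasible budget for this period
    return best
```

For the two-task workload T1 = T2 = (period 10, WCET 3), searching periods up to 10 returns (3, 2), bandwidth ≈ 0.667; note that the server (5, 3) from the earlier example is not guaranteed schedulable under worst-case phasing.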

Then Can Improve Periodic Server Design
•  Purely Time-driven Periodic Server (PTPS)
   –  If the currently scheduled domain is idle, its budget is wasted
   –  Not work-conserving

[Figure: timeline of task releases and completions in a high-priority domain DH and a low-priority domain DL; during an idle gap Δ, DH's budget drains while tasks in DL must wait]

Periodic Server Re-Design I
•  Work-Conserving Periodic Server (WCPS)
   –  If the currently scheduled domain is idle, the hypervisor picks a lower-priority domain that has tasks to execute
   –  Early execution of the lower-priority domain during the idle period does not affect schedulability

[Figure: same timeline, but tasks in DL execute within DH's otherwise idle budget]

Periodic Server Re-Design II
•  Capacity Reclaiming Periodic Server (CRPS)
   –  If the currently scheduled domain is idle, we can re-assign this idled budget to any other domain that has tasks to execute
   –  Early execution of the other domain during the idle period does not affect schedulability

[Figure: same timeline, with another domain's tasks consuming DH's idled budget]

Interface Overhead: Synthetic Workload

[Figure: CDF of response time / deadline (0–3) over 5,000 data points for Dom5 (22, 1), under workload utilization UW = 90.4% and resource-model utilization URM = 114.3%. Deadline miss ratios: CRPS_dom5 0.0622, WCPS_dom5 60.5, PTPS_dom5 100 — i.e., CRPS ≥ WCPS ≥ PTPS]

“Realizing Compositional Scheduling Through Virtualization”, IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2012

RT Virtualization III: Inter-Domain Communication

[Figure: Dom 1 sends a packet to Dom 2 every 10 ms (5,000 data points) while Dom 3–Dom 10 compete and Dom 0 (Linux 3.4.2) has spare CPU; CDF of latency (50–300 microseconds) for RT-Xen vs. Credit, each with the original Dom 0]
•  VMM scheduler: RT-Xen vs. Credit
•  When Domain 0 is not busy, the VMM scheduler dominates the IDC performance for higher-priority domains (i.e., adding real-time scheduling already helps)

But, is Real-Time Scheduling Enough???

[Figure: CDF of IDC latency for RT-Xen with the original Dom 0, now stretching to 15,000 microseconds when Dom 0 is busy at 100% CPU: Dom 1's traffic to Dom 2 competes inside Dom 0 with traffic from Dom 3–Dom 5 (channels C0–C5 under the VMM scheduler)]

A Little Background on Xen's Domain 0

[Figure: guest domains (Domain 1, Domain 2, … Domain m, … Domain n) connect via netfront/netif interfaces to Domain 0, whose netback[0] thread runs rx_action() and tx_action() over the TX and RX paths, feeding softnet_data]
•  Packets are fetched in a round-robin order
•  Packets share one queue in softnet_data

RTCA: Refining Domain 0 for Real-Time IDC

[Figure: same netfront/netback architecture, but softnet_data now holds separate per-priority queues]
•  Packets are fetched by priority, up to a batch size
•  Queues are separated by priority in softnet_data
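The softnet_data change can be sketched as priority-separated queues with batched fetching. This is a user-level model with assumed names; RTCA itself modifies Domain 0's Linux network backend:

```python
from collections import deque

class RTCAQueues:
    """Per-priority packet queues (sketch of RTCA's softnet_data change):
    packets are fetched strictly by priority, at most `batch` at a time
    before rechecking for newly arrived higher-priority packets."""
    def __init__(self, levels, batch):
        self.queues = [deque() for _ in range(levels)]  # 0 = highest priority
        self.batch = batch

    def enqueue(self, prio, pkt):
        self.queues[prio].append(pkt)

    def fetch(self):
        """Return the next batch, drawn from the highest non-empty priority."""
        for q in self.queues:
            if q:
                return [q.popleft() for _ in range(min(self.batch, len(q)))]
        return []
```

Rechecking after each small batch bounds how long a newly arrived high-priority packet can be blocked behind low-priority ones, which is why the small batch sizes in the evaluation reduce priority inversion.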

Effects on IDC Latency

[Table: IDC latency between Domain 1 and Domain 2 in the presence of low-priority IDC (µs)]
•  By reducing priority inversion in Domain 0, RTCA mitigates impacts of low-priority IDC on latency of high-priority IDC

“Prioritizing Local Inter-Domain Communication in Xen”, ACM/IEEE International Symposium on Quality of Service (IWQoS), 2013

Preserving Domain 0 Throughput in RTCA

[Figure: iPerf throughput between Dom 1 and Dom 2 (0–12 Gbits/s) under Base, Light, Medium, and Heavy interfering traffic, for RTCA with batch sizes 1, 64, and 238 vs. the original Dom 0]
•  A small batch size leads to a significant reduction in high-priority IDC latency and improved IDC throughput under interfering traffic

What Next? Time to Shift Gears
•  Real-Time System Integration is clearly important, but Real-Time Cloud Computing may prove even more so

Towards Real-Time Cloud Services
•  Key challenge: how to analyze timing and provide guarantees in the face of resource elasticity and multi-tenancy

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” – Leslie Lamport

A virtualized system is one in which the failure of a computer that doesn't actually exist can render your entire application unusable.

How to Address this Issue?
•  Need to shift our assumptions about system design to give precise real-time semantics within resource elasticity and multi-tenancy

“… the Java platform's promise of ‘Write Once, Run Anywhere’ … offer[s] far greater cost-savings potential in the real-time (and more broadly, the embedded) domain than in the desktop and server domains.” …

“The real-time Java platform's necessarily qualified promise of ‘Write Once Carefully, Run Anywhere Conditionally’ is nevertheless the best prospective opportunity for application re-usability.”
– Foreword to the Real-Time Specification for Java

Clouds are not Real-Time Today
•  Virtualization technology underlying clouds is not real-time
   –  Xen: virtual machine monitor for Amazon EC2
   –  CPU: proportional-share scheduling — applications may be real-time, but the VMM is not

[Figure: apps in VMs atop a virtual machine monitor and hardware]
•  If anything, I/O is worse
   –  Vague “performance indicators”: low/medium/large
   –  Or you can pay a lot to get dedicated physical network resources

Motivation to Make Clouds Real-Time
•  Hard to provide timing guarantees
   –  Simple interface → no timing information
   –  Consolidation ratio keeps increasing → more competition
   –  Live migration without notification → unstable performance
•  Why are timing guarantees important?
   –  If the steal time exceeds a given threshold, Netflix shuts down the virtual machine and restarts it elsewhere [sciencelogic], [scout]
   –  “Xbox One may offload computations to cloud…” [Microsoft Blog]
   –  “Energy efficient GPS sensing with cloud offloading” [SenSys'12]
   –  … also, smart grids, earthquake early warning, etc. in CPS

Towards Improving the Current State of the Art
•  Functions of the cloud management system are an essential focus
   –  Interface to the end users
   –  VM initial placement
   –  VM live migration (load balance, host maintenance, etc.)
•  Commercial management systems are mostly closed-source: Amazon EC2 (Xen), Google Compute Engine (KVM), Microsoft Azure (Hyper-V), VMware vCenter (vSphere), Xen Center (XenServer)
•  Open source alternatives
   –  OpenStack (HP Cloud, RackSpace, etc.), CloudStack, OpenNebula, …
   –  All compatible with XenServer, vSphere, KVM, etc.

Limitations and Opportunities – Interface
•  VMware vCenter
   –  Reservation: minimum guaranteed resources, in MHz
   –  Limitation: upper bound for resources, in MHz
   –  Share: relative importance of the VM
•  OpenStack
   –  # of VCPUs

Limitations and Opportunities – Initial VM Placement
•  Filtering
   –  VM-VM affinity/anti-affinity, VM-Host affinity/anti-affinity, etc.
   –  When is a host ‘full’?
      •  VMware vCenter: based on reservations of VMs
      •  OpenStack: pre-configured ratio (default is 16)
•  Ranking
   –  VMware vCenter: try each host; turn on stand-by hosts
   –  OpenStack: spread and packed

Limitations and Opportunities – VM Load Balancing
•  Open source alternatives: no load balancing by default
•  VMware vCenter
   –  Distributed Resource Scheduler (DRS)
      •  Triggered every 5 min; calculates normalized host utilization
      •  Minimizes cluster-wide imbalance (standard deviation over all hosts)
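The DRS-style imbalance metric can be sketched directly. This is a simplified model: vCenter's actual normalized entitlement calculation also weighs reservations, limits, and shares:

```python
def cluster_imbalance(host_loads, host_capacities):
    """Normalized utilization per host, then the cluster-wide imbalance as
    the standard deviation of those utilizations over all hosts."""
    utils = [load / cap for load, cap in zip(host_loads, host_capacities)]
    mean = sum(utils) / len(utils)
    return (sum((u - mean) ** 2 for u in utils) / len(utils)) ** 0.5
```

A DRS-like balancer would migrate VMs whenever this value exceeds a configured threshold, picking moves that reduce it most.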

Concluding Remarks
•  Much has been accomplished already
   –  RT-Xen, CSF, RTCA support real-time in open-source Xen
   –  Other approaches have focused on other virtualization architectures and platforms (e.g., L4), mechanisms, etc.
•  Much remains to be done
   –  Especially as we move towards larger and more complex real-time systems and systems-of-systems
   –  Gains made in real-time virtualization can be extended to offer (and define) new capabilities for real-time clouds

Thank You!
All source code is available at http://sites.google.com/site/realtimexen/
