WFQ and WF2Q
Patt-Shamir Lecture 8

WFQ and GPS: single link
- When WFQ and GPS finish packets in the same order, WFQ never lags behind GPS (and may sometimes be ahead of GPS).
- When a packet arrives too late, it may have to wait for the current packet to finish transmission, hence an $L_{max}/C$ delay.
- The error term does not accumulate over time.
[Figure: WFQ and GPS service on a single link.]

Delay Bound
- GPS is never ahead of WFQ by more than one packet (in terms of transmitted bits up to time t).
  Bits sent: $S_{GPS}(f,t) - S_{WFQ}(f,t) \le L_{max}$, where $L_{max}$ is the length of the longest packet in bits.
- Packets in WFQ are not delayed more than one packet length relative to GPS.
  Completion time: $D_{WFQ}(f,k) - D_{GPS}(f,k) \le L_{max}/C$, where $C$ is the link speed in bps.
- May accumulate over multiple network hops.

GPS and WFQ: Network
Parekh-Gallager theorem
Suppose a given connection is (σ,ρ)-constrained, has maximal packet size L, and passes through K WFQ schedulers, such that in the i-th scheduler there is total rate r(i), from which the connection gets g(i).
Let g be the minimum over all g(i), and suppose all packets are at most Lmax bits long. Then the end-to-end delay of the connection is bounded by the sum of three terms.

P-G theorem: Interpretation
- GPS term: delay of the last packet of a burst. Incurred only in the first node.
- Store-and-forward penalty: incurred only in non-first nodes.
- GPS-to-WFQ correction: WFQ lag behind GPS. Incurred at each node.
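The formula of the theorem did not survive in this text. As a reconstruction only, the Parekh-Gallager bound is commonly stated in the following form, written with the symbols defined in the theorem statement (σ, g, g(i), r(i), L, Lmax, K); D, the end-to-end delay, is a name introduced here for illustration.

$$
D \;\le\; \frac{\sigma}{g} \;+\; \sum_{i=1}^{K-1} \frac{L}{g(i)} \;+\; \sum_{i=1}^{K} \frac{L_{max}}{r(i)}
$$

The first term is the GPS term for the burst, the second is the store-and-forward penalty at the K-1 non-first nodes, and the third is the per-node WFQ lag behind GPS.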
Significance
- WFQ can provide end-to-end delay bounds.
- So WFQ provides both fairness and performance guarantees.
- The bound holds regardless of cross-traffic behavior.
- Can be generalized to networks where schedulers are variants of WFQ and the link service rate changes over time.

Fine Points
- To get a delay bound, need to pick g
  - the lower the delay bound, the larger g needs to be
  - large g means exclusion of more competitors from the link
- Sources must be leaky-bucket regulated
  - but choosing leaky-bucket parameters is problematic
- WFQ couples delay and bandwidth allocations
  - low delay requires allocating more bandwidth
  - wastes bandwidth for low-bandwidth, low-delay sources
Worst Case WFQ (WF2Q)
- We've just said that WFQ approximates GPS to within a difference of one packet. That was not quite true...
- The bound is one way: WFQ never lags by too much, $S_{GPS}(f,t) - S_{WFQ}(f,t) \le L_{max}$.
- But WFQ might be well ahead of GPS!

WF2Q - Example
R1 = 0.5, R2 = R3 = ... = R11 = 0.05. Packet lengths: 1 sec.
[Figure: timeline for sessions S1 to S11, time axis t with ticks at 0, 1 and 10; the 11th packet is marked.]
GPS Service Order
[Figure: GPS service order for sessions S1 to S11, time axis t from 0 to 20; the 11th packet is marked.]

WFQ Service Order
[Figure: WFQ service order for sessions S1 to S11, time axis t from 0 to 20; the 11th packet is marked.]

WF2Q Service Order
[Figure: WF2Q service order for sessions S1 to S11, time axis t from 0 to 20; the 11th packet is marked.]

WF2Q Idea
- WFQ: scheduler selects the next packet with minimal finish number from all available packets.
- Worst-case fair weighted fair queuing (WF2Q): scheduler considers only packets that have started in the emulated GPS system.
  ⇒ Get closer emulation of GPS: add a lower bound on the time difference with GPS.
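The difference between the two selection rules can be seen on the example rates (R1 = 0.5, R2 = ... = R11 = 0.05, unit packets). The following is a minimal sketch, not the lecture's own code. It assumes a workload that the text does not spell out: at t = 0 session S1 queues 11 packets and S2 to S11 queue one packet each, on a link that sends one packet per second. The GPS start and finish times are written in closed form for that workload, and the names `gps_packets` and `schedule` are hypothetical.

```python
def gps_packets():
    """Return (session, seq, gps_start, gps_finish) for the assumed workload.

    Assumption: at t=0, S1 queues 11 unit packets and S2..S11 queue one each.
    """
    # S1 is served at rate 0.5, so each of its first 10 packets takes 2 s.
    pkts = [(1, k + 1, 2 * k, 2 * k + 2) for k in range(10)]
    # After t=20 only S1 is backlogged, so its 11th packet gets the full link.
    pkts.append((1, 11, 20, 21))
    # S2..S11 are each served at rate 0.05, so one packet takes 20 s.
    pkts += [(s, 1, 0, 20) for s in range(2, 12)]
    return pkts


def schedule(pkts, eligible_only):
    """Send packets one at a time, smallest GPS finish time first.

    eligible_only=False models WFQ. eligible_only=True models WF2Q, which
    only considers packets whose GPS service has already started.
    """
    pending, order, t = list(pkts), [], 0
    while pending:
        cands = [p for p in pending if not eligible_only or p[2] <= t]
        if not cands:  # nothing eligible yet: idle until the next GPS start
            t = min(p[2] for p in pending)
            continue
        nxt = min(cands, key=lambda p: (p[3], p[0]))  # ties: lower session id
        pending.remove(nxt)
        order.append((t, nxt[0], nxt[1]))  # (send time, session, seq)
        t += 1  # each packet occupies the link for 1 second
    return order


wfq = schedule(gps_packets(), eligible_only=False)
wf2q = schedule(gps_packets(), eligible_only=True)
print("WFQ :", [s for _, s, _ in wfq])
print("WF2Q:", [s for _, s, _ in wf2q])
```

Under these assumptions WFQ sends ten S1 packets back to back before any other session is served, while WF2Q interleaves S1 with the other sessions, which is much closer to the GPS service pattern.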
Implementation: Rate Controlled Scheduling
- Regulator holds packets until they are eligible for transmission.
- Scheduler decides which eligible packet should be transmitted next.
[Figure: switch output module. Queues 1 to 4 each pass through a regulator; all regulators feed a single scheduler.]

WF2Q Properties
- Retain upper bound of WFQ
- Add lower bound (best possible in general)

WF2Q Properties (cont.)
- Discipline is work-conserving:
  - Rate controller + GPS scheduler is the same as GPS alone.
  - Replace GPS with WFQ (both work conserving) to get WF2Q.
- Lower bound
  - WF2Q starts sending a packet no earlier than GPS.
  - GPS usually takes more time to transmit the packet (depending on allocated and available bandwidth).
[Figure: transmission of one packet under WF2Q and under GPS; the labels involve $L_{i,max}$, $r$ and $g_i$.]

Internetworking
Internet Protocol (IP)
- Service model provided to transport layer (TCP, UDP)
  - Global name space
  - Host-to-host connectivity (connectionless)
  - Best-effort packet delivery
- Not in IP service model
  - QoS guarantees on bandwidth, delay or loss
- Delivery failure modes
  - Packet delayed for a very long time
  - Packet loss
  - Packet delivered more than once
  - Packets delivered out of order

IP Addresses
IPv4 Address Model
- 32-bit address
- Maps to logically unique network adaptor
  - Exceptions: service request splitting for large web servers
- IP address:
  - high order bits: network
  - low order bits: host

Class A: 0   | Network (7 bits)  | Host (24 bits)
Class B: 10  | Network (14 bits) | Host (16 bits)
Class C: 110 | Network (21 bits) | Host (8 bits)

IP Addressing
- 3-level hierarchy: network, subnet, host
- Basic idea: if the destination is on the same network, use the network. Otherwise, send to a router (on the same network!)
- IP's definition of network: set of devices that can communicate directly (in the datalink layer), without any router in the middle
- Example: system with 3 IP networks (for IP addresses starting with 223, the first 24 bits are the network address)
[Figure: three LANs with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27.]

IPv4 Address Model

Class   Network ID     Host ID   # of Addresses   # of Networks
A       0 + 7 bit      24 bit    2^24 - 2         126
B       10 + 14 bit    16 bit    65,536 - 2       2^14
C       110 + 21 bit   8 bit     256 - 2          2^21
D       1110 + Multicast Address (IP Multicast)
E       "Future Use"

IPv4 Address Model
- Decimal-dot notation
- Host in class A network: 1-127.*.*.*
  - 18.7.22.69 web.mit.edu
- Host in class B network: 128-191.0-255.*.*
  - 132.66.16.6 www.tau.ac.il
- Host in class C network: 192-223.0-255.0-255.*
  - 198.182.196.56 www.linux.org
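Since the class is determined by the leading bits of the address, it can be read off the first octet. A minimal sketch of that rule follows; the function name `address_class` is hypothetical and the code is an illustration, not part of the lecture.

```python
def address_class(dotted):
    """Return the classful address class (A to E) of a dotted-decimal IPv4 address."""
    first = int(dotted.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError("first octet out of range: %r" % dotted)
    if first < 128:   # leading bit  0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110 (multicast)
        return "D"
    return "E"        # leading bits 1111 (future use)


for host in ("18.7.22.69", "132.66.16.6", "198.182.196.56"):
    print(host, "is class", address_class(host))
```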
Problem: Address structure
Address classes are too "rigid".
1. Size. Often, Class C is too small and Class B is too big. Small organizations want Class B to support more than 255 hosts. But there are only 16K Class B network IDs. ⇒ Wastage and shortage of addresses!
2. Organizations using internal routers need to have a separate network ID for each link. Every router must know about every network ID in every organization ⇒ large address tables.

IP Addressing solutions
1. Subnetting: subdivide a network ID hierarchically to allow using routers within the same IP network.
2. Classless Interdomain Routing (CIDR, "supernetting"): forget classes. The network ID can be any prefix of the IP address.
Must know how to extract the network ID from an IP address.
Subnetting
CLASS "B" address: class prefix 10 (2 bits), Net ID (14 bits), Host-ID (16 bits). The Host-ID is split further into subnet bits and a Subnet Host ID.
- e.g. Company: 10 | Net ID (14) | Host-ID (16)
- e.g. Site: Subnet ID (20), subnet bits 0000 or 1111, Subnet Host ID (12). Subnet mask = 255.255.240.0
- e.g. Dept: Subnet ID (22), subnet bits 000000, Subnet Host ID (10). Subnet mask = 255.255.252.0
- e.g. Dept: Subnet ID (26), subnet bits 1111011011, Subnet Host ID (6). Subnet mask = 255.255.255.192

CIDR Addressing
Classless InterDomain Routing
- IP address space is broken into blocks of consecutive addresses.
- Representation: the common prefix. Denoted x/y, meaning the first y bits of x. (Is this the most flexible?)
- Merge consecutive blocks: 132.66/16 + 132.67/16 = 132.66/15
[Figure: address line from 0 to 2^32 - 1 showing the blocks 65/8, 128.9/16 (2^16 addresses, starting at 128.9.0.0) and 142.12/19, with the address 128.9.16.14 marked.]
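Prefix lengths, subnet masks and block merging can be checked with Python's standard `ipaddress` module. This is a small illustrative sketch; the helper name `mask_for` is hypothetical, and the prefixes are written in full dotted form because the module requires it.

```python
import ipaddress


def mask_for(prefix_len):
    """Return the dotted-decimal subnet mask for a given prefix length."""
    return str(ipaddress.ip_network("0.0.0.0/%d" % prefix_len).netmask)


# Masks for 20-, 22- and 26-bit subnet IDs.
for n in (20, 22, 26):
    print(n, mask_for(n))

# Merging two consecutive /16 blocks into one /15 block.
merged = list(ipaddress.collapse_addresses([
    ipaddress.ip_network("132.66.0.0/16"),
    ipaddress.ip_network("132.67.0.0/16"),
]))
print(merged)

# Membership test: is an address inside a block?
addr = ipaddress.ip_address("128.9.16.14")
inside = addr in ipaddress.ip_network("128.9.0.0/16")
print(inside)
```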
CIDR Addressing
- Overlaps are allowed! Resolve in favor of the most specific.
- Most specific route = "longest matching prefix"
[Figure: address line from 0 to 2^32 - 1 with nested blocks 128.9/16, 128.9.16/20, 128.9.176/20, 128.9.19/24 and 128.9.25/24; the address 128.9.16.14 is marked.]

Address Translation support
- IP addresses to LAN physical addresses
  - Problem: IP is not a datalink!
  - NICs can only send to MAC addresses
  - Need a translation mechanism: translate from IP address to physical address
  - Address Resolution Protocol (ARP)
- Internet domain name to IP address
  - Problem: hard for humans to remember IP addresses, even in dotted decimal notation
  - Need a domain-to-IP translation
  - Domain Name Service (DNS)
3 Addressing Schemes
- MAC: data link (LAN) level, e.g. E6-E9-00-17-BB-4B
- IP addresses: network level, e.g. 132.66.16.6
- Domain names: application level, e.g. www.tau.ac.il
[Figure: a router forwarding packets addressed to 132.66.16.6.]
Datagram forwarding with IP
- Hosts and routers maintain forwarding tables
  - List of pairs
  - Simple and static on hosts; often contains a default route
  - Complex and dynamic on routers
- Packet forwarding
  - Compare network portion of address with pairs in table
  - Use LAN to send directly to host on same network
  - Send indirectly (via router on same network) to host on different network
  - Use ARP to get hardware address of host/router

Forwarding: the framework
1. Given a packet: determine the network prefix of the destination in the packet (CIDR!)
2. Is the destination on the same network? Use self address + subnet mask.
3. If yes, immediate destination = final destination. Else, find immediate destination in routing table.
4. Send packet over datalink to immediate destination. Use ARP to find datalink (MAC) address.
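Steps 2 and 3 of the framework can be sketched for the simplest host case, where the table holds only a default route. This is an illustration under stated assumptions: the function name `immediate_destination` is hypothetical, and treating 223.1.1.4 as the gateway of the 223.1.1/24 network is an assumption made only for the example.

```python
import ipaddress


def immediate_destination(dst, own_interface, default_gateway):
    """Pick the next datalink-level destination for a packet.

    own_interface is the sender's address with its prefix length,
    e.g. "223.1.1.1/24" (self address + subnet mask).
    """
    dst = ipaddress.ip_address(dst)
    me = ipaddress.ip_interface(own_interface)
    if dst in me.network:
        # Same network: deliver directly over the LAN.
        return dst
    # Different network: hand the packet to a router on our own network.
    return ipaddress.ip_address(default_gateway)


# Hypothetical host 223.1.1.1/24 whose default gateway is 223.1.1.4.
print(immediate_destination("223.1.1.3", "223.1.1.1/24", "223.1.1.4"))
print(immediate_destination("223.1.2.9", "223.1.1.1/24", "223.1.1.4"))
```

In both cases the returned address is the one that would then be resolved to a MAC address with ARP (step 4).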
Routing Tables at a router
- Input: destination address
- Output: next hop IP address + interface
- e.g. 128.9.16.14 => Port 2

Prefix          Next-hop       Port
65/8            128.17.16.1    3
128.9/16        128.17.14.1    2
128.9.16/20     128.17.14.1    2
128.9.19/24     128.17.10.1    7
128.9.25/24     128.17.14.1    2
128.9.176/20    128.17.20.1    1
142.12/19       128.17.16.1    3

[Figure: routers R1 to R4; ports 1, 2, 3; next-hop addresses 128.17.20.1 and 128.17.16.1.]

IP Message Forwarding
- H1/TCP finds H2's IP address and sends packet to H2
- H1/IP looks up its table and finds that next hop is router (gateway) R
- H1/IP looks up R's Ethernet address and sends packet
- R/IP looks up its table and finds that next hop is H2
- R/IP looks up H2's FDDI address and sends packet
[Figure: protocol stacks. H1: TCP, IP, ETH. R: IP over ETH and FDDI. H2: TCP, IP, FDDI.]
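The lookup in the "Routing Tables at a router" table is a longest-matching-prefix search. A minimal linear-scan sketch follows, using the table entries from that slide with prefixes expanded to full dotted form; the names `TABLE` and `lookup` are hypothetical, and a real router would use a faster data structure than a linear scan.

```python
import ipaddress

# (prefix, next hop, port), copied from the routing table slide.
TABLE = [
    ("65.0.0.0/8", "128.17.16.1", 3),
    ("128.9.0.0/16", "128.17.14.1", 2),
    ("128.9.16.0/20", "128.17.14.1", 2),
    ("128.9.19.0/24", "128.17.10.1", 7),
    ("128.9.25.0/24", "128.17.14.1", 2),
    ("128.9.176.0/20", "128.17.20.1", 1),
    ("142.12.0.0/19", "128.17.16.1", 3),
]


def lookup(dst, table=TABLE):
    """Return (next_hop, port) of the longest matching prefix, or None."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop, port in table:
        net = ipaddress.ip_network(prefix)
        # Keep the matching entry with the most prefix bits.
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop, port)
    return None if best is None else (best[1], best[2])


print(lookup("128.9.16.14"))  # matches 128.9/16 and 128.9.16/20
print(lookup("128.9.19.7"))   # also matches the more specific 128.9.19/24
```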
IP Packet Format
Header layout (each row is 32 bits; field boundaries fall at bits 0, 4, 8, 16, 19 and 31):
  Version | HLen | TOS | Length
  Ident | Flags | Offset
  TTL | Protocol | Checksum
  SourceAddr
  DestinationAddr
  Options (variable) | Pad (variable)
  Data

- 4-bit version
  - IPv4 = 4, IPv6 = 6
- 4-bit header length
  - Counted in 32-bit words (minimum of 5)
- 8-bit type of service field (TOS)
  - An early attempt to support QoS; mostly unused
- 16-bit data length
  - Counted in bytes
- Fragmentation support
  - 16-bit packet ID: all fragments from the same packet have the same ID
  - 3-bit flags: 1 bit to mark last fragment
  - 13-bit fragment offset into packet: counted in longwords (8 bytes)
- 8-bit time-to-live field (TTL)
  - Hop count decremented at each router
  - Packet is discarded if TTL = 0
- 8-bit protocol field
  - TCP = 6, UDP = 17
- 16-bit IP checksum on header
- 32-bit source IP address
- 32-bit destination IP address
- Options
  - Variable size
  - Source-based routing
  - Record route
  - ...
- Padding
  - Fill to 32-bit boundaries
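The fixed 20-byte part of the header can be unpacked field by field with Python's standard `struct` module. The sketch below is illustrative only: `parse_ipv4_header` is a hypothetical name, the sample header is hand-built, and its checksum is left at 0 rather than computed.

```python
import ipaddress
import struct

# Network byte order: ver/hlen, TOS, length, ident, flags+offset,
# TTL, protocol, checksum, source address, destination address.
HEADER_FORMAT = "!BBHHHBBH4s4s"


def parse_ipv4_header(data):
    """Decode the first 20 bytes of an IPv4 header into a dict."""
    (ver_hlen, tos, length, ident, flags_off,
     ttl, proto, checksum, src, dst) = struct.unpack(HEADER_FORMAT, data[:20])
    return {
        "version": ver_hlen >> 4,            # high 4 bits
        "hlen_words": ver_hlen & 0x0F,       # low 4 bits, in 32-bit words
        "tos": tos,
        "length": length,                    # in bytes
        "ident": ident,
        "flags": flags_off >> 13,            # top 3 bits
        "offset_units": flags_off & 0x1FFF,  # low 13 bits, in 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": str(ipaddress.ip_address(src)),
        "dst": str(ipaddress.ip_address(dst)),
    }


# Hand-built sample: version 4, 5-word header, one flag bit set, offset 64.
sample = struct.pack(
    HEADER_FORMAT, 0x45, 0, 532, 7, (1 << 13) | 64, 64, 6, 0,
    ipaddress.ip_address("132.66.16.6").packed,
    ipaddress.ip_address("18.7.22.69").packed,
)
hdr = parse_ipv4_header(sample)
print(hdr)
```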
IP Fragmentation and Reassembly
- Problem:
  - Different datalink layers have different frame lengths: maximum transmission unit (MTU)
  - Source host does not know the minimum value over the whole path, especially when paths change
- Solution:
  - When necessary, split IP packet into small-enough packets and then forward over datalink
- Questions
  - Where should reassembly occur?
  - What happens when a fragment is damaged/lost?
- Fragments are self-contained IP packets
- Reassemble at destination only, to minimize router overhead
- Drop all fragments in packet if one or more fragments are lost
- Avoid fragmentation at source host
  - Transport layer should send packets small enough to fit into one MTU of local physical network
  - Must consider IP header
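The splitting rule can be sketched as follows. This is a minimal illustration, not a full implementation: the function name `fragment` is hypothetical, a 20-byte header without options is assumed, and the 532-byte MTU in the usage line is an assumption chosen so that each fragment carries at most 512 data bytes.

```python
def fragment(data_len, mtu, header_len=20):
    """Split data_len payload bytes into fragments that fit the MTU.

    Returns a list of (offset_in_8_byte_units, data_bytes, more_fragments).
    """
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the offset field counts 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    frags, pos = [], 0
    while pos < data_len:
        size = min(max_data, data_len - pos)
        more = pos + size < data_len  # True for all but the last fragment
        frags.append((pos // 8, size, more))
        pos += size
    return frags


# 1400 data bytes over a link with an assumed 532-byte MTU.
print(fragment(1400, 532))
```

All fragments would carry the same Ident value as the original packet, so the destination can group them for reassembly.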
IP Fragmentation and Reassembly
[Figure: path H1, R1, R2, R3, H2 over ETH, FDDI, PPP and ETH links. Packets per link: ETH IP (1400); FDDI IP (1400); PPP IP (512), IP (512), IP (376); ETH IP (512), IP (512), IP (376).]

Header fields of the packet before and after fragmentation:

Packet       Ident   Flag   Offset   Data bytes
Original     x       0      0        1400
Fragment 1   x       1      0        512
Fragment 2   x       1      64       512
Fragment 3   x       0      128      376

Internet Control Message Protocol (ICMP)
- Formally above IP, but in fact an IP "companion" protocol
- Handles error and control messages
[Figure: protocol stack. FTP, HTTP, NV, TFTP on top; TCP and UDP below them; then IP with ICMP; then Ethernet, FDDI, ATM, Modem.]
ICMP
- Error Messages
  - Host unreachable
  - Reassembly failed
  - IP checksum failed
  - TTL exceeded (packet dropped)
  - Invalid header
- Control Messages
  - Echo/ping request and reply
  - Echo/ping request and reply with timestamps
  - Route redirect

Virtual Private Networks
- Goal
  - Controlled connectivity
- Virtual Private Network
  - A group of connected subnets
  - Connections may be over shared network
  - Similar to LANE, but over IP, allowing the use of heterogeneous internets
[Figure (Virtual Private Networks): nodes A, B, C and K, L, M.]

Tunneling
- IP Tunnel
  - Virtual point-to-point link between an arbitrarily connected pair of nodes
[Figure: Network 1, R1, Internetwork, R2, Network 2, with an IP tunnel between R1 and R2 (address 10.0.0.1). The packet "IP Dest = 2.x | IP Payload" is carried through the tunnel as "IP Dest = 10.0.0.1 | IP Dest = 2.x | IP Payload" and leaves it as "IP Dest = 2.x | IP Payload".]
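Encapsulation at the tunnel entrance and decapsulation at the exit can be sketched with a toy packet type. Everything here is hypothetical (the `Packet` class and both function names); real tunnels wrap complete IP headers, and the destination strings are simply the labels from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Toy packet: a destination plus a payload (data or another Packet)."""
    dest: str
    payload: object


def encapsulate(inner, tunnel_exit):
    """Tunnel entrance: wrap the packet in an outer header for the tunnel exit."""
    return Packet(dest=tunnel_exit, payload=inner)


def decapsulate(outer):
    """Tunnel exit: strip the outer header and recover the original packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not an encapsulated packet")
    return outer.payload


inner = Packet(dest="2.x", payload="IP Payload")
outer = encapsulate(inner, "10.0.0.1")
print(outer)
print(decapsulate(outer))
```

Routers inside the internetwork only look at the outer destination, which is why only the tunnel endpoints need to be aware of the tunnel.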
Tunneling
- Advantages
  - Transparent transmission of packets over a heterogeneous network
  - Only need to change relevant routers
- Disadvantages
  - Increases packet size
  - Processing time needed to encapsulate and decapsulate packets
  - Management at tunnel-aware routers

IPv6
- Initial motivation: 32-bit address space predicted to be completely allocated by 2008.
- Additional motivation:
  - header format helps speed processing/forwarding
  - header changes to facilitate QoS
  - new "anycast" address: route to "best" of several replicated servers
- IPv6 datagram format:
  - fixed-length 40 byte header
  - no fragmentation allowed
IPv6 Header (Cont)
- Priority: identify priority among datagrams in flow
- Flow Label: identify datagrams in same "flow" (concept of "flow" not well defined)
- Next header: identify upper layer protocol for data
- about 6.7×10^23 addresses/square meter on earth

Other Changes from IPv4
- Checksum: removed entirely to reduce processing time at each hop
- Options: pushed to the next layer up, indicated by "Next Header" field
- ICMPv6: new version of ICMP
  - additional message types, e.g. "Packet Too Big"
  - multicast group management functions
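The address-density figure quoted for IPv6 can be checked with a short calculation. It assumes 128-bit addresses and a surface area of the Earth of roughly 5.1×10^14 square meters; the surface area is not given in the slides.

$$
\frac{2^{128}}{5.1 \times 10^{14}\ \mathrm{m}^2}
\approx \frac{3.4 \times 10^{38}}{5.1 \times 10^{14}\ \mathrm{m}^2}
\approx 6.7 \times 10^{23}\ \text{addresses per}\ \mathrm{m}^2
$$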
Forwarding: the framework
Simple case: Class-based addressing
Use ARP to find datalink (MAC) address

Associative Memory or CAM
Idea: parallel lookup. Used in CPU caches.
Advantages:
[Figure labels: Search Data, log2 N]