CS 457 Networking and the Internet


CS 457 – Networking and the Internet Fall 2016

Topics • Principles of congestion control – How to detect congestion? – How to adapt and alleviate congestion?

• TCP congestion control – Additive-increase, multiplicative-decrease – Slow start and slow-start restart

• Related TCP mechanisms – Nagle’s algorithm and delayed acknowledgments

• TCP Throughput and Fairness • Active Queue Management (AQM) – Random Early Detection (RED) – Explicit Congestion Notification (ECN)

Congestion in the Internet is Unavoidable • Two packets arrive at the same time – The router can only transmit one – … and either buffer or drop the other

• If many packets arrive in a short period of time – The router cannot keep up with the arriving traffic – … and the buffer may eventually overflow


Principles of Congestion Control Congestion: • informally: “too many sources sending too much data too fast for network to handle” • different from flow control! • manifestations: – lost packets (buffer overflow at routers) – long delays (queueing in router buffers) • a top-10 problem!

Resource Allocation vs. Congestion Control • Resource allocation (connection-oriented networks) – How routers meet competing demands for resources – Reservations: allocate link bandwidth and buffer space to a flow – Admission control: when to say no, and to whom

• Congestion control (Internet)
– How nodes prevent or respond to overload conditions
– E.g., persuade hosts to stop sending, or slow down
– Typically, much less exact
– Have some notion of fairness (i.e., sharing the pain)

Flow Control vs. Congestion Control • Flow control – Keeping one fast sender from overwhelming a slow receiver

• Congestion control – Keep a set of senders from overloading the network

• Different concepts but similar mechanisms – TCP flow control: receiver window – TCP congestion control: congestion window – TCP actual window: min{congestion window, receiver window}


Causes/costs of congestion: scenario 1 • two senders, two receivers • one router, infinite buffers • no retransmission

[Figure: Host A and Host B each send original data at rate λ_in into one router with unlimited shared output link buffers; λ_out is the delivered rate]

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2 • one router, finite buffers • sender retransmission of lost packet

[Figure: Host A and Host B send λ_in (original data) plus retransmissions, for λ'_in total, into a router with finite shared output link buffers; λ_out is the delivered rate]

Causes/costs of congestion: scenario 2
• always: λ_in = λ_out (goodput)
• “perfect” retransmission only when loss: λ'_in > λ_out
• retransmission of a delayed (not lost) packet makes λ'_in larger (than the perfect case) for the same λ_out

“costs” of congestion: ❒ more work (retransmissions) for a given “goodput” ❒ unneeded retransmissions: link carries multiple copies of a packet


Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λ_in and λ'_in increase?

[Figure: Host A and Host B send λ_in (original data) plus retransmissions (λ'_in) over multihop paths through routers with finite shared output link buffers; λ_out is the delivered rate]

Causes/costs of congestion: scenario 3

[Figure: λ_out for Hosts A and B as offered load increases, showing throughput collapse]

Another “cost” of congestion: ❒ when a packet is dropped, any “upstream” transmission capacity used for that packet was wasted!

Metrics: Throughput vs. Delay • High throughput – Throughput: measured performance of a system – E.g., number of bits/second of data that get through

• Low delay – Delay: time required to deliver a packet or message – E.g., number of ms to deliver a packet

• These two metrics are sometimes at odds – E.g., suppose you drive a link as hard as possible – … then, throughput will be high, but delay will be, too


Load, Delay, and Power Typical behavior of queuing systems with random arrivals:

A simple metric of how well the network is performing:

Power = Throughput / Delay

[Figure: average packet delay vs. load, rising steeply as load grows]

Goal: maximize power

Fairness • Effective utilization is not the only goal – We also want to be fair to the various flows – … but what does that mean?

• Simple definition: equal shares of the bandwidth – N flows that each get 1/N of the bandwidth? – But, what if the flows traverse different paths? – Still a hard and open problem in the Internet

Simple Queuing Mechanism • Simplest approach: FIFO queue and drop-tail • Link bandwidth allocation: first-in first-out queue – Packets transmitted in the order they arrive

• Buffer space allocation: drop-tail queuing – If the queue is full, drop the incoming packet


Priority Queuing • A simple variation on basic FIFO queuing is priority queuing. The idea is to mark each packet with a priority; the mark could be carried, for example, in the IP header. • The routers then implement multiple FIFO queues, one for each priority class. The router always transmits packets out of the highest-priority queue if that queue is nonempty before moving on to the next priority queue. • Within each priority, packets are still managed in a FIFO manner.
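As a sketch of the idea, here is a minimal strict-priority scheduler over per-class FIFO queues (Python; the three-class setup and the function names are illustrative, not from the lecture):

    from collections import deque

    # Three FIFO queues, one per priority class; index 0 is highest priority.
    queues = [deque(), deque(), deque()]

    def enqueue(packet, priority):
        queues[priority].append(packet)   # FIFO within each priority class

    def dequeue():
        for q in queues:                  # always drain higher-priority
            if q:                         # queues before lower ones
                return q.popleft()
        return None                       # all queues empty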

Fair Queuing • The main problem with FIFO queuing is that it does not discriminate between different traffic sources; that is, it does not separate packets according to the flow to which they belong. • Fair queuing (FQ) is an algorithm that has been proposed to address this problem. The idea of FQ is to maintain a separate queue for each flow currently being handled by the router. The router then services these queues in a sort of round-robin fashion.

Fair Queuing

Round-robin service of four flows at a router


Is Fair Queuing Fair? • Fair Queuing – The main complication with Fair Queuing is that the packets being processed at a router are not necessarily the same length. – To truly allocate the bandwidth of the outgoing link in a fair manner, it is necessary to take packet length into consideration. • For example, if a router is managing two flows, one with 1000-byte packets and the other with 500-byte packets (perhaps because of fragmentation upstream from this router), then a simple round-robin servicing of packets from each flow’s queue will give the first flow two-thirds of the link’s bandwidth and the second flow only one-third.

Is Fair Queuing Fair? • Fair Queuing – What we really want is bit-by-bit round-robin; that is, the router transmits a bit from flow 1, then a bit from flow 2, and so on. – Clearly, it is not feasible to interleave the bits from different packets. – The FQ mechanism therefore simulates this behavior by first determining when a given packet would finish being transmitted if it were being sent using bit-by-bit round-robin, and then using this finishing time to sequence the packets for transmission.

Queuing • Fair Queuing – To understand the algorithm for approximating bit-by-bit round-robin, consider the behavior of a single flow – For this flow, let • Pi: the length of packet i • Si: the time when the router starts to transmit packet i • Fi: the time when the router finishes transmitting packet i • Clearly, Fi = Si + Pi


Queuing • Fair Queuing – When do we start transmitting packet i? • Depends on whether packet i arrived before or after the router finishes transmitting packet i-1 for the flow

– Let Ai denote the time that packet i arrives at the router – Then Si = max(Fi-1, Ai) – Fi = max(Fi-1, Ai) + Pi

Queuing • Fair Queuing – Now for every flow, we calculate Fi for each packet that arrives using our formula – We then treat all the Fi as timestamps – Next packet to transmit is always the packet that has the lowest timestamp • The packet that should finish transmission before all others
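A minimal Python sketch of this timestamp bookkeeping (the FairQueue class and its fields are illustrative; a real router would also maintain a virtual clock to handle flows that go idle):

    import heapq

    class FairQueue:
        def __init__(self):
            self.last_finish = {}   # F_{i-1} for each flow
            self.pending = []       # min-heap of (finish_time, seq, packet)
            self.seq = 0            # tie-breaker for equal timestamps

        def enqueue(self, flow, length, arrival):
            # Si = max(F_{i-1}, Ai);  Fi = Si + Pi
            start = max(self.last_finish.get(flow, 0), arrival)
            finish = start + length
            self.last_finish[flow] = finish
            heapq.heappush(self.pending, (finish, self.seq, (flow, length)))
            self.seq += 1

        def dequeue(self):
            # Transmit the packet with the lowest finish timestamp
            return heapq.heappop(self.pending)[2] if self.pending else None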

Queuing • Fair Queuing

Example of fair queuing in action: (a) packets with earlier finishing times are sent first; (b) sending of a packet already in progress is completed


Simple Congestion Detection • Packet loss – Packet gets dropped along the way

• Packet delay – Packet experiences high delay

• How does TCP sender learn these? – Loss • Timeout • Triple-duplicate acknowledgment

– Delay • Round-trip time estimate

TCP Congestion Control Basics • Each source determines available capacity – … and how many packets it is allowed to have in transit

• Congestion window – Maximum # of unack’ed bytes allowed to be in transit (the congestion-control equivalent of receiver window) – MaxWindow = min{congestion window, receiver window} - send at the rate of the slowest component

• How to adapt the congestion window? – Decrease upon losing a packet: back-off – Increase upon success: explore new capacity

Additive Increase, Multiplicative Decrease • How much to increase and decrease? – Increase linearly, decrease multiplicatively – A necessary condition for stability of TCP – Consequences of oversized window are much worse than having an under-sized window • Oversized window: packets dropped, retransmitted, pain for all • Undersized window: lower throughput for one flow

• Multiplicative decrease – On loss of packet, divide congestion window in half

• Additive increase – On success for last window of data, increase linearly, adding one MSS per RTT


TCP “Sawtooth” Behavior

[Figure: window vs. time t — the window grows, is halved on each loss, and grows again]

Practical Details • Congestion window (cwnd) – Represented in bytes, not in packets (Why?) – Packets typically one MSS (Maximum Segment Size)

• Increasing the congestion window – Increase by MSS on success for last window of data – In practice, increase a fraction of MSS per received ACK • # packets per window: CWND / MSS • Increment per ACK: MSS * (MSS / CWND)

• Decreasing the congestion window – Cut in half, but never below 1 MSS
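A toy sketch of these byte-based updates (Python; not a real TCP stack, and the MSS value is just an example):

    MSS = 1460   # example maximum segment size, in bytes

    def on_ack(cwnd):
        # Additive increase: a fraction of an MSS per ACK; since about
        # cwnd/MSS ACKs arrive per RTT, the window grows ~1 MSS per RTT
        return cwnd + MSS * MSS // cwnd

    def on_loss(cwnd):
        # Multiplicative decrease: cut in half, but never below 1 MSS
        return max(cwnd // 2, MSS)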

Getting Started

Need to start with a small CWND to avoid overloading the network. But, it could take a long time to get started!

[Figure: window vs. time t, growing only linearly from a small CWND]


“Slow Start” Phase • Start with a small congestion window – Initially, CWND is 1 MSS – So, initial sending rate is MSS/RTT

• That could be pretty wasteful – Might be much less than the available bandwidth – Linear increase takes a long time to accelerate

• Slow-start phase (but in reality it’s “fast start”) – Sender starts at a slow rate (hence the name) – … but increases the rate exponentially – … until the first loss event

Slow Start in Action

Double CWND per round-trip time

[Figure: Src sends 1, then 2, then 4, then 8 data segments (D) per RTT to Dest, with ACKs (A) returning and triggering the next, larger burst]

Slow Start and the TCP Sawtooth

[Figure: window vs. time t — exponential “slow start” up to the first loss, then the sawtooth]

Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole window’s worth of data.


Two Kinds of Loss in TCP

• Triple duplicate ACK
– Packet n is lost, but packets n+1, n+2, etc. arrive
– Receiver sends duplicate acknowledgments
– … and the sender retransmits packet n quickly
– Do a multiplicative decrease and keep going (no slow-start)

• Timeout
– Packet n is lost and detected via a timeout
– Could be because all packets in flight were lost
– After the timeout, blasting away for the entire CWND…
– … would trigger a very large burst in traffic
– So, better to start over with a very low CWND

Repeating Slow Start After Timeout

[Figure: window vs. time t, marking the timeout and the threshold]

Slow start in operation until the window reaches half of the previous cwnd.

Slow-start restart: Go back to CWND of 1, but take advantage of knowing the previous value of CWND.

Repeating Slow Start After Idle Period • Suppose a TCP connection goes idle for a while – E.g., Telnet session where you don’t type for an hour

• Eventually, the network conditions change – Maybe many more flows are traversing the link – E.g., maybe everybody has come back from lunch!

• Dangerous to start transmitting at the old rate – Previously-idle TCP sender might blast the network – … causing excessive congestion and packet loss

• So, some TCP implementations repeat slow start – Slow-start restart after an idle period


Summary: TCP Congestion Control • When CongWin is below Threshold, sender in slow-start phase, window grows exponentially. • When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly. • When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold. • When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS.

Event: ACK receipt for previously unACKed data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to “Congestion Avoidance”
Commentary: Resulting in a doubling of CongWin every RTT

Event: ACK receipt for previously unACKed data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to “Congestion Avoidance”
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to “Slow Start”
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being ACKed
Commentary: CongWin and Threshold not changed
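The table above maps directly onto a small state machine. Here is a minimal Python sketch (the class name and the 64 KB initial threshold are assumptions for illustration; fast recovery’s duplicate-ACK window inflation is omitted):

    MSS = 1460                                # example segment size, bytes

    class TcpCongestion:
        def __init__(self):
            self.cwnd = MSS                   # start with one segment
            self.ssthresh = 64 * 1024         # assumed initial threshold
            self.state = "slow_start"

        def on_new_ack(self):                 # ACK for previously unACKed data
            if self.state == "slow_start":
                self.cwnd += MSS              # doubles cwnd every RTT
                if self.cwnd > self.ssthresh:
                    self.state = "congestion_avoidance"
            else:
                self.cwnd += MSS * MSS // self.cwnd   # ~1 MSS per RTT

        def on_triple_dup_ack(self):          # fast retransmit / recovery
            self.ssthresh = self.cwnd // 2
            self.cwnd = max(self.ssthresh, MSS)
            self.state = "congestion_avoidance"

        def on_timeout(self):                 # severe loss: start over
            self.ssthresh = self.cwnd // 2
            self.cwnd = MSS
            self.state = "slow_start"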

Other TCP Mechanisms Nagle’s Algorithm and Delayed ACK


Motivation for Nagle’s Algorithm • Interactive applications – SSH/telnet/rlogin generate many small packets (e.g., keystrokes)

• Small packets are wasteful – Mostly header (e.g., 40 bytes of header, 1 of data)

• Appealing to reduce the number of packets – Could force every packet to have some minimum size – … but, what if the person doesn’t type more characters?

• Need to balance competing trade-offs – Send larger packets to increase efficiency – … but not at the expense of delay

Nagle’s Algorithm • Wait if the amount of data is small – Smaller than Maximum Segment Size (MSS)

• …and some other packet is already in flight – i.e., still awaiting the ACKs for previous packets

• That is, send at most one small packet per RTT – … by waiting until all outstanding ACKs have arrived

• Influence on performance – Interactive applications: enables batching of bytes – Bulk transfer: no change: transmits in MSS-sized packets anyway
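In sketch form, Nagle’s decision rule looks like this (Python; the function name and parameters are illustrative):

    def nagle_should_send(buffered_bytes, mss, unacked_bytes):
        if buffered_bytes >= mss:
            return True      # a full segment is always worth sending
        if unacked_bytes == 0:
            return True      # nothing in flight: send the small segment now
        return False         # otherwise, hold the bytes until ACKs arrive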

Delayed ACK - Motivation • TCP traffic is often bidirectional – Data traveling in both directions – ACKs traveling in both directions

• ACK packets have high overhead – 40 bytes for the IP header and TCP header – … and zero data traffic

• Piggybacking is appealing – Host B can send an ACK to host A – … as part of a data packet from B to A


TCP Header Allows Piggybacking

[Figure: TCP header layout — source port, destination port, sequence number, acknowledgment, header length, flags (SYN, FIN, RST, PSH, URG, ACK), advertised window, checksum, urgent pointer, options (variable), data]

Example of Piggybacking

[Figure: timeline between hosts A and B — ACKs are piggybacked when B has data to send, sent alone when B doesn’t, and piggybacked again when A has data to send]

Increasing Likelihood of Piggybacking

• Increase piggybacking – TCP allows the receiver to wait to send the ACK – … in the hope that the host will have data to send

• Example: ssh/rlogin/telnet – Host A types characters at a UNIX prompt – Host B receives the character and executes a command – … and then data are generated – Would be nice if B could send the ACK with the new data

Works when packet from A causes data to be sent from B


Delayed ACK • Delay sending an ACK – Upon receiving a packet, the host B sets a timer – If B’s application generates data, go ahead and send • And piggyback the ACK bit

– If the timer expires, send a (non-piggybacked) ACK

• Limiting the wait – Timer of 200 msec or 500 msec – Results in an ACK for every other full-sized packet
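A sketch of this receiver-side logic (Python; the class and method names are illustrative, and the actual timer plumbing is elided):

    class DelayedAck:
        TIMEOUT = 0.200                        # seconds; some stacks use 500 ms

        def __init__(self, send_segment):
            self.send_segment = send_segment   # callback: send_segment(data, ack)
            self.ack_pending = False

        def on_data_received(self):
            if self.ack_pending:
                self.send_segment(None, True)  # ACK every other full segment
                self.ack_pending = False
            else:
                self.ack_pending = True        # arm the delayed-ACK timer here

        def on_app_write(self, data):
            self.send_segment(data, True)      # piggyback the ACK on the data
            self.ack_pending = False

        def on_timer_expired(self):
            if self.ack_pending:
                self.send_segment(None, True)  # bare, non-piggybacked ACK
                self.ack_pending = False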

TCP Throughput and Fairness

TCP Throughput • What’s the average throughput of TCP as a function of window size and RTT? – Assume long-lived TCP flow – Ignore slow start

• Let W be the window size when loss occurs. • When the window is W, throughput is W/RTT • Just after loss, the window drops to W/2, and throughput to W/(2·RTT). • Average throughput: 0.75 W/RTT
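As a quick check of the 0.75 factor (a sketch of the standard argument): between losses the window ramps linearly from W/2 up to W, so its time average is the midpoint:

\bar{W} = \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3W}{4},
\qquad
\text{Throughput} \approx \frac{\bar{W}}{\text{RTT}} = \frac{0.75\,W}{\text{RTT}}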


Problems with Fast Links An example to illustrate the problems • Consider the impact of high-speed links: – 1500-byte segments – 100 ms RTT – 10 Gb/s throughput

• What is the required window size? – Throughput = .75 W/RTT • (probably a good formula to remember)

– Requires window size W = 83,333 in-flight segments

Example (Cont.) • 10 Gb/s throughput requires window size W = 83,333 in-flight segments • TCP assumes every loss is due to congestion – Generally a safe assumption for reasonable window sizes.

• (Magic) Formula relating loss rate L to throughput:

Throughput = (1.22 × MSS) / (RTT × √L)

Throughput of 10 Gb/s with an MSS of 1500 bytes gives: ➜ L = 2·10⁻¹⁰, i.e., we can only lose one in 5,000,000,000 segments!
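A quick Python check of these numbers, using the slide’s values:

    mss_bits = 1500 * 8          # segment size in bits
    rtt = 0.100                  # round-trip time, seconds
    rate = 10e9                  # target throughput, bits/second

    W = rate * rtt / mss_bits                    # ~83,333 in-flight segments
    L = (1.22 * mss_bits / (rtt * rate)) ** 2    # ~2.1e-10 loss rate
    print(round(W), L)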

• We need new versions of TCP for high-speed nets (topic for later discussion)

TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K

[Figure: TCP connections 1 and 2 sharing a bottleneck router of capacity R]

Simple scenario: assume same MSS and RTT


Is TCP Fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• Multiplicative decrease drops throughput proportionally

[Figure: phase plot of Connection 1 vs. Connection 2 throughput (axes from 0 to R); alternating “loss: decrease window by factor of 2” and “congestion avoidance: additive increase” steps move the allocation toward the equal-bandwidth-share line]
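A tiny simulation makes the convergence argument concrete (Python; the capacity and starting rates are made-up values):

    R = 100.0                      # bottleneck capacity
    x, y = 80.0, 10.0              # deliberately unequal starting rates
    for _ in range(500):
        if x + y > R:              # loss: both halve (multiplicative decrease)
            x, y = x / 2, y / 2
        else:                      # both gain equally (additive increase)
            x, y = x + 1.0, y + 1.0
    print(f"shares: {x:.1f}, {y:.1f}")   # the two rates end up nearly equal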

More on Fairness

Fairness and UDP
• Multimedia apps often do not use TCP – do not want rate throttled by congestion control
• Instead use UDP: pump audio/video at constant rate, tolerate packet loss
• Research area: TCP-friendly unreliable transport (using UDP)

Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts
• Web browsers do this
• Example: link of rate R has 9 connections running; – new app asks for 1 TCP, gets rate R/10 – new app asks for 11 TCPs, gets 11R/20 (over half the bandwidth!)

Queuing Mechanisms Random Early Detection (RED) Explicit Congestion Notification (ECN)


Bursty Loss From Drop-Tail Queuing • TCP depends on packet loss to detect congestion – In fact, TCP drives the network into packet loss – … by continuing to increase the sending rate

• Drop-tail queuing leads to bursty loss
– When a link becomes congested…
– … many arriving packets encounter a full queue
– And, as a result, many flows divide sending rate in half
– … and, many individual flows lose multiple packets

Slow Feedback from Drop Tail • Feedback comes when buffer is completely full – … even though the buffer has been filling for a while

• Plus, the filling buffer is increasing RTT – … and the variance in the RTT

• Might be better to give early feedback – Get one or two flows to slow down, not all of them – Get these flows to slow down before it is too late

Random Early Detection (RED) • Basic idea of RED – Router notices that the queue is getting backlogged – … and randomly drops queued (why?) packets to signal congestion

• Packet drop probability

– Drop probability increases as queue length increases
– If buffer is below some level, don’t drop anything
– … otherwise, set drop probability as a function of the average queue length

[Figure: drop probability vs. average queue length]
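A sketch of RED’s per-packet decision, assuming the usual EWMA queue average and a linear ramp between two thresholds (all parameter values here are made up):

    import random

    W_Q, MIN_TH, MAX_TH, MAX_P = 0.002, 5, 15, 0.10
    avg = 0.0                                  # EWMA of the queue length

    def on_packet_arrival(queue_len):
        global avg
        avg = (1 - W_Q) * avg + W_Q * queue_len
        if avg < MIN_TH:
            return "enqueue"                   # below min threshold: never drop
        if avg >= MAX_TH:
            return "drop"                      # above max threshold: always drop
        p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)   # linear ramp
        return "drop" if random.random() < p else "enqueue"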


Properties of RED • Drops packets before queue is full – In the hope of reducing the rates of some flows

• Drops packet in proportion to each flow’s rate – High-rate flows have more packets – … and, hence, a higher chance of being selected

• Drops are spaced out in time – Which should help desynchronize the TCP senders

• Tolerant of burstiness in the traffic – By basing the decisions on average queue length

Problems With RED • Hard to get the tunable parameters just right – How early to start dropping packets? – What slope for the increase in drop probability? – What time scale for averaging the queue length?

• Sometimes RED helps but sometimes not – If the parameters aren’t set right, RED doesn’t help – And it is hard to know how to set the parameters

• RED is implemented in practice – But, often not used due to the challenges of tuning right

• Many variations – With cute names like “Blue” and “FRED”…

Explicit Congestion Notification • Early dropping of packets – Good: gives early feedback – Bad: has to drop the packet to give the feedback

• Explicit Congestion Notification – Router marks the packet with an ECN bit – … and sending host interprets as a sign of congestion – Can be used in conjunction with RED

• Surmounting the challenges – Must be supported by the end hosts and the routers – Requires two bits in the IP header (one for the ECN mark and one to indicate ECN capability) – Borrows two of the TOS bits in the IPv4 header


Conclusions • Congestion in the Internet is inevitable – Internet does not reserve resources in advance – TCP actively tries to push the envelope

• Congestion can be handled – Additive increase, multiplicative decrease – Slow start, and slow-start restart

• Active Queue Management can help – Random Early Detection (RED) – Explicit Congestion Notification (ECN)
