TCP reliable data transfer. TCP Round Trip Time and Timeout.
TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
• longer than RTT
  - but RTT varies
• too short: premature timeout
  - unnecessary retransmissions
• too long: slow reaction to segment loss

Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
• SampleRTT will vary, want estimated RTT “smoother”
  - average several recent measurements, not just current SampleRTT

TCP Round Trip Time and Timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

• Exponential weighted moving average
• influence of past sample decreases exponentially fast
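The EstimatedRTT update can be sketched in a few lines of Python. This is a minimal illustration, not a TCP implementation: the function names are hypothetical, and the default α = 0.125 is an assumption (a commonly recommended value), since the slide leaves α unspecified.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.125):
    """Fold one SampleRTT into EstimatedRTT (exponential weighted moving average).

    alpha=0.125 is an assumed default; the slide does not fix a value.
    """
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def smooth(samples, initial_estimate, alpha=0.125):
    """Apply the update to each sample in order and return every estimate."""
    estimates = []
    estimate = initial_estimate
    for sample in samples:
        estimate = update_estimated_rtt(estimate, sample, alpha)
        estimates.append(estimate)
    return estimates
```

With a larger α the estimate follows recent samples more closely; with a smaller α it is smoother but reacts more slowly.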

TCP reliable data transfer

• TCP creates reliable data transfer service on top of IP’s unreliable service
• Pipelined segments
• Cumulative acks
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  - timeout events
  - duplicate acks
• Initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control

TCP sender events:

data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval

timeout:
• retransmit segment that caused timeout
• restart timer

Ack rcvd:
• If acknowledges previously unacked segments
  - update what is known to be acked
  - start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack’ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
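The simplified sender can be mimicked with a small Python class. This is only a sketch under stated assumptions: the class and method names are hypothetical, the timer is reduced to a boolean flag, "pass segment to IP" is modeled as appending to a log, and the bookkeeping of unacknowledged segments is an illustrative choice that the pseudocode leaves open.

```python
class SimplifiedSender:
    """Toy model of the simplified TCP sender: no duplicate-ACK handling,
    no flow control, no congestion control."""

    def __init__(self, initial_seq_num):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> data for not-yet-acknowledged segments
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq #, length) passed to IP

    def on_data(self, data):
        """Event: data received from application above."""
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True
        self.sent.append((seq, len(data)))
        self.next_seq_num += len(data)

    def on_timeout(self):
        """Event: timer timeout. Resend the oldest unacknowledged segment."""
        if not self.unacked:
            return
        seq = min(self.unacked)
        self.sent.append((seq, len(self.unacked[seq])))
        self.timer_running = True

    def on_ack(self, y):
        """Event: ACK received, with ACK field value of y."""
        if y > self.send_base:
            self.send_base = y
            # Drop every segment fully covered by the cumulative ACK.
            acked = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in acked:
                del self.unacked[seq]
            # Timer keeps running only while segments are outstanding.
            self.timer_running = bool(self.unacked)
```

Feeding it an 8-byte segment at sequence number 92 followed by a 20-byte segment reproduces the numbers used in the retransmission scenarios: NextSeqNum becomes 120, and an ACK of 100 moves SendBase to 100 while the second segment stays outstanding.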

TCP: retransmission scenarios

[Figure: two timing diagrams between Host A and Host B.
Lost ACK scenario: A sends Seq=92, 8 bytes data; B’s ACK=100 is lost (X); the Seq=92 timeout expires and A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; the Seq=92 timeout expires before the ACKs arrive, so A resends Seq=92, 8 bytes data; B replies ACK=120; SendBase = 100, then SendBase = 120.]

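TCP retransmission scenarios (more)

[Figure: timing diagram between Host A and Host B, cumulative ACK scenario. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; B’s ACK=100 is lost (X), but ACK=120 arrives before the timeout, so nothing is retransmitted; SendBase = 120.]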

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq. # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
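The four receiver rules can be sketched as a small decision procedure. This is a hedged illustration only: the class name, the ("delay" / "send", ack #) return convention, and the way out-of-order segments are buffered are all assumptions made for the example, and the 500ms wait is represented by an explicit `on_delay_timer` call rather than a real timer. Segments below the expected seq # are not covered by the table; the sketch simply re-ACKs them.

```python
class AckPolicy:
    """Toy receiver that decides when to ACK, following the four table rows."""

    def __init__(self, expected_seq):
        self.expected = expected_seq  # seq # of next expected byte
        self.ack_pending = False      # a delayed ACK is waiting to be sent
        self.buffered = {}            # out-of-order segments: seq # -> length

    def on_segment(self, seq, length):
        """Return ("delay", ack#) or ("send", ack#) for an arriving segment."""
        if seq != self.expected:
            if seq > self.expected:
                self.buffered[seq] = length  # gap detected, keep the segment
            # Duplicate ACK carries the next expected byte, so nothing stays pending.
            self.ack_pending = False
            return ("send", self.expected)
        had_gap = bool(self.buffered)
        self.expected += length
        # Absorb buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)
        if had_gap or self.ack_pending:
            self.ack_pending = False
            return ("send", self.expected)   # immediate (cumulative) ACK
        self.ack_pending = True
        return ("delay", self.expected)      # wait up to 500ms for next segment

    def on_delay_timer(self):
        """Delayed-ACK wait expired with no further segment."""
        if self.ack_pending:
            self.ack_pending = False
            return ("send", self.expected)
        return None
```

Starting at expected seq # 100, two consecutive 20-byte in-order segments produce one delayed decision and then a single cumulative ACK of 140, matching the first two rows of the table.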

Fast Retransmit

• Time-out period often relatively long:
  - long delay before resending lost packet
• Detect lost segments via duplicate ACKs.
  - Sender often sends many segments back-to-back
  - If segment is lost, there will likely be many duplicate ACKs.
• If sender receives 3 duplicate ACKs for the same data, it supposes that segment after ACKed data was lost:
  - fast retransmit: resend segment before timer expires

Fast retransmit algorithm:

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y = 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
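The duplicate-ACK counting in the algorithm above can be sketched as follows. The class name and the `retransmitted` log are hypothetical, and the retransmission timer is left out so that only the counting logic is shown.

```python
from collections import Counter


class FastRetransmitSender:
    """Counts duplicate ACKs per ACK value and records fast retransmits."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = Counter()   # ACK value y -> number of duplicate ACKs seen
        self.retransmitted = []     # seq #s resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            # New data is acknowledged (timer handling omitted in this sketch).
            self.send_base = y
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks[y] += 1
            if self.dup_acks[y] == 3:
                self.retransmitted.append(y)  # resend segment with seq # y
```

Because the test is `== 3`, a fourth or fifth duplicate ACK for the same y does not trigger another resend.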

TCP Flow Control

• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app’s drain rate

TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

• spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn’t overflow
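The RcvWindow arithmetic and the sender-side limit can be checked with two small helpers. The function names are hypothetical, and the sender's count of unACKed bytes is written as LastByteSent - LastByteAcked, which is an assumption about how the sender tracks outstanding data.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(n_bytes, last_byte_sent, last_byte_acked, window):
    """True if sending n_bytes more keeps unACKed data within the window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises a window of 2096 bytes, so a sender with 1000 bytes outstanding may send another 1000 but not another 1100.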

TCP Connection Management

Recall: TCP sender, receiver establish “connection” before exchanging data segments
• initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
    Socket clientSocket = new Socket("hostname","port number");
• server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:
Step 1: client host sends TCP SYN segment to server
  - specifies initial seq #
  - no data
Step 2: server host receives SYN, replies with SYNACK segment
  - server allocates buffers
  - specifies server initial seq. #
Step 3: client receives SYNACK, replies with ACK segment, which may contain data

TCP Connection Management (cont.)

Closing a connection:
client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timing diagram. Client: close, sends FIN. Server: replies ACK, close, sends FIN. Client: replies ACK, timed wait. Both ends: closed.]

TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
  - Enters “timed wait”: will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timing diagram. Client: closing, sends FIN. Server: replies ACK, closing, sends FIN. Client: replies ACK, timed wait, closed. Server: closed.]

TCP Connection Management (cont)

[Figure: state diagrams of the TCP client lifecycle and the TCP server lifecycle.]

Principles of Congestion Control

Congestion:
• informally: “too many sources sending too much data too fast for network to handle”
• different from flow control!
• manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queueing in router buffers)
• a top-10 problem!

Causes/costs of congestion: scenario 1

• two senders, two receivers
• one router, infinite buffers
• no retransmission

[Figure: Host A and Host B send λin (original data) through a router with unlimited shared output link buffers; λout is the delivered throughput.]

• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2

• one router, finite buffers
• sender retransmission of lost packet

[Figure: Host A and Host B share a router with finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data; λout: delivered throughput.]

Causes/costs of congestion: scenario 2

• always: λin = λout (goodput)
• “perfect” retransmission only when loss: λ'in > λout
• retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

“costs” of congestion:
• more work (retransmission) for given “goodput”
• unneeded retransmissions: link carries multiple copies of pkt

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as λin and λ'in increase?

[Figure: Host A and Host B among four senders on multihop paths through routers with finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data; λout: delivered throughput.]

Causes/costs of congestion: scenario 3

[Figure: Host A, Host B, and λout.]

Another “cost” of congestion:
• when packet dropped, any “upstream” transmission capacity used for that packet was wasted!

Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at

TCP Congestion Control

• end-end control (no network assistance)
• sender limits transmission:
    LastByteSent - LastByteAcked ≤ CongWin
• Roughly,
    rate = CongWin/RTT Bytes/sec
• CongWin is dynamic, function of perceived network congestion

How does sender perceive congestion?
• loss event = timeout or 3 duplicate acks
• TCP sender reduces rate (CongWin) after loss event

three mechanisms:
  - AIMD
  - slow start
  - conservative after timeout events

TCP AIMD

multiplicative decrease: cut CongWin in half after loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing

[Figure: congestion window (axis marks at 8, 16, and 24 Kbytes) versus time for a long-lived TCP connection.]

TCP Slow Start

• When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
• When connection begins, increase rate exponentially fast until first loss event
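The 20 kbps figure follows from rate = CongWin/RTT with CongWin = 1 MSS. The short sketch below redoes that arithmetic and shows how the rate would grow if the window doubled every RTT with no loss; the function name and the choice of five RTTs are illustrative assumptions.

```python
def window_rate_bps(cong_win_bytes, rtt_seconds):
    """Approximate send rate in bits/sec for one window of data per RTT."""
    return cong_win_bytes * 8 / rtt_seconds


MSS_BYTES = 500
RTT_SECONDS = 0.2

# One MSS per RTT: 500 bytes * 8 bits / 0.2 s = 20,000 bits/sec = 20 kbps.
initial_rate = window_rate_bps(MSS_BYTES, RTT_SECONDS)

# CongWin (in MSS) at the start of each of the first five RTTs, doubling each RTT.
windows = [2 ** n for n in range(5)]
rates = [window_rate_bps(w * MSS_BYTES, RTT_SECONDS) for w in windows]
```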

TCP Slow Start (more)

• When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
• Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A / Host B timing diagram: one segment in the first RTT, then two segments, then four segments.]

Refinement

• After 3 dup ACKs:
  - CongWin is cut in half
  - window then grows linearly
• But after timeout event:
  - CongWin instead set to 1 MSS;
  - window then grows exponentially
  - to a threshold, then grows linearly

Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout before 3 dup ACKs is “more alarming”

Refinement (more)

Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.

Implementation:
• Variable Threshold
• At loss event, Threshold is set to 1/2 of CongWin just before loss event

[Figure: congestion window size (segments, 0 to 14) versus transmission round (1 to 15) for TCP Tahoe and TCP Reno, with the threshold marked.]

Summary: TCP Congestion Control

• When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.
• When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
• When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold.
• When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS.
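The four summary rules can be sketched as a per-round update of (CongWin, Threshold), measured in MSS. Several details are assumptions made for this illustration and are not stated on the slide: windows are whole segments with integer halving, slow-start doubling is capped at Threshold, CongWin equal to Threshold is treated as congestion avoidance, and Threshold never drops below 1. The function name and event labels are hypothetical.

```python
def next_window(cong_win, threshold, event=None):
    """Return (CongWin, Threshold) in MSS after one transmission round.

    event is None (no loss), "3dupack", or "timeout".
    """
    if event == "timeout":
        # Threshold = CongWin/2, CongWin = 1 MSS (floor of 1 is an assumption).
        return 1, max(cong_win // 2, 1)
    if event == "3dupack":
        # Threshold = CongWin/2, CongWin = Threshold.
        threshold = max(cong_win // 2, 1)
        return threshold, threshold
    if cong_win < threshold:
        # Slow start: double, capped at Threshold (assumption).
        return min(cong_win * 2, threshold), threshold
    # Congestion avoidance: grow by 1 MSS per round.
    return cong_win + 1, threshold


def trace(cong_win, threshold, events):
    """Apply a list of per-round events and return the CongWin after each round."""
    history = []
    for event in events:
        cong_win, threshold = next_window(cong_win, threshold, event)
        history.append(cong_win)
    return history
```

Starting from CongWin = 1 and Threshold = 8 with no loss, the window doubles up to 8 and then grows by one segment per round; a timeout at CongWin = 12 restarts from 1 MSS with Threshold = 6, while a triple duplicate ACK at the same point continues from 6.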

TCP Fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?

Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, each axis running to R, with the equal bandwidth share line. The trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”.]
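The convergence argument can be checked numerically. The sketch below assumes an idealized model that the slide only implies: both connections see every loss at the same time, a loss happens whenever their combined throughput exceeds the capacity, and each round adds exactly one unit to both. The function name and the numbers in the example are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two AIMD flows sharing one link; return their final throughputs."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows decrease window by factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Congestion avoidance: additive increase for both flows.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


final_1, final_2 = aimd_two_flows(1.0, 50.0, capacity=60.0, rounds=200)
gap = abs(final_1 - final_2)
```

Additive increase leaves the gap between the two flows unchanged, while each multiplicative decrease halves it, so the flows drift toward the equal bandwidth share.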

Fairness (more)

Fairness and UDP
• Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
• Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss

Fairness and parallel TCP connections
• nothing prevents app from opening parallel connections between 2 hosts.
• Web browsers do this
• Example: link of rate R supporting 9 connections;
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets roughly R/2 (11R/20)!
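The parallel-connection example is simple arithmetic if every TCP connection is assumed to get an equal share of the link. The helper below (a hypothetical name) returns the fraction of R that the new application receives.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of link rate R obtained by an app that opens new_connections,
    assuming each TCP connection on the link gets an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)
```

With 9 existing connections, one new connection yields 1/10 of R, while 11 new connections yield 11/20 of R, a little more than half.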

Delay modeling

Q: How long does it take to receive an object from a Web server after sending a request?

Ignoring congestion, delay is influenced by:
• TCP connection establishment
• data transmission delay
• slow start

Notation, assumptions:
• Assume one link between client and server of rate R
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)

Window size:
• First assume: fixed congestion window, W segments
• Then dynamic window, modeling slow start

Fixed congestion window (1)

First case:
• WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent

delay = 2RTT + O/R

Fixed congestion window (2)

Second case:
• WS/R < RTT + S/R: wait for ACK after sending window’s worth of data

delay = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
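Both fixed-window cases can be combined into one function. This is a sketch under an assumption the slide does not spell out: K is taken to be the number of windows of W segments needed to cover the object, K = ceil(O / (W*S)). The function name is hypothetical; O and S are in bits, R in bits/sec, RTT in seconds.

```python
import math


def fixed_window_delay(O, S, R, RTT, W):
    """Delay to fetch an object of O bits with a fixed window of W segments."""
    # Idle time after each window; positive only in the second case.
    stall = S / R + RTT - W * S / R
    if stall <= 0:
        # First case: ACKs return before the window's worth of data is sent.
        return 2 * RTT + O / R
    # Second case: the server waits for an ACK after every window but the last.
    K = math.ceil(O / (W * S))  # assumed definition of K
    return 2 * RTT + O / R + (K - 1) * stall
```

For O = 16000, S = 1000, R = 1000 and RTT = 2, a window of 4 segments gives 2*2 + 16 = 20 seconds, while a window of 2 segments stalls for 1 second after each of the first 7 windows and takes 27 seconds.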

TCP Delay Modeling: Slow Start (1)

Now suppose window grows according to slow start.
Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K-1\}$$

- where Q is the number of times the server idles if the object were of infinite size.
- and K is the number of windows that cover the object.

TCP Delay Modeling: Slow Start (2)

Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles: P = min{K-1,Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1,Q} = 2
Server idles P=2 times

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; RTT marked.]

TCP Delay Modeling (3)

$$\frac{S}{R} + RTT = \text{time from when server starts to send segment until server receives acknowledgement}$$

$$2^{k-1}\frac{S}{R} = \text{time to transmit the } k\text{th window}$$

$$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+} = \text{idle time after the } k\text{th window}$$

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timeline with time at client and time at server: initiate TCP connection, request object, first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, complete transmission, object delivered; RTT marked.]

TCP Delay Modeling (4)

Recall K = number of windows that cover object.
How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^0 S + 2^1 S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^0 + 2^1 + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^k - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar.
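The slow-start latency formula can be evaluated with a short function. This is a sketch rather than a definitive model: the function name is hypothetical, K is found by an integer search for the smallest k with 2^k - 1 ≥ O/S (equivalent to the ceiling expression, but without floating-point logarithms), and Q is counted directly from its definition as the number of windows after which the idle time S/R + RTT - 2^(k-1) S/R is still positive. Counting only strictly positive idle times is an assumption; a zero idle time would not change the latency.

```python
def slow_start_latency(O, S, R, RTT):
    """Return (latency, K, Q, P) for an object of O bits, MSS of S bits,
    link rate R bits/sec, and round-trip time RTT seconds."""
    # K: number of windows that cover the object.
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: number of times the server would idle for an infinite-size object.
    Q = 0
    while S / R + RTT - (2 ** Q) * S / R > 0:
        Q += 1
    # P: number of times the server actually idles.
    P = min(Q, K - 1)
    latency = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P
```

With O/S = 15 segments, S/R = 1 second and RTT = 2 seconds (the last two values are chosen here only for illustration), this gives K = 4, Q = 2, P = 2, matching the example above, and a latency of 4 + 15 + 6 - 3 = 22 seconds.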