ICOM6012 Transport Layer
Transport-layer Services
Transport Services and Protocols
- Provide logical communication between app processes running on different hosts (the actual delivery is carried out by the network layer)
- logical communication means the two hosts are not physically connected, yet to the processes it appears as if a direct channel links them
- Transport protocols run in end systems
- send side: breaks app messages into segments (if needed), passes to network layer
- the segment header, which includes the port numbers, identifies the socket of the receiver
- receive side: reassembles segments into messages, uses the segment header (the port number) to pass them to the application layer
- More than one transport protocol available to apps
- Majority: TCP and UDP
- TCP: reliable, in-order delivery
- congestion control
- flow control
- connection setup
- UDP: unreliable, unordered delivery
- no-frills extension of "best-effort" IP
- Others: SCTP and DCCP
- These protocols see very limited deployment, so we ignore them here, except in specific environments; for example, SCTP is used in some wireless networks (e.g. cellular networks)
- Services not available
- delay guarantee
- bandwidth guarantee
- throughput guarantee
- Q: Why can the Internet not guarantee throughput, delay, or bandwidth?
- A: Because the layers below cannot: the data link layer gives no such guarantee, so the network layer cannot give one either, and the transport layer has nothing to build one on. For example, we can hold a meeting over Zoom, but the Internet cannot guarantee its quality.
Transport Layer Actions
- Sender:
- is passed an application-layer message
- determines segment header field values, including port numbers
- does not change the payload; it only adds a header carrying the needed information
- creates segment
- passes segment to IP
- Receiver:
- receives segment from IP, ensuring the packet has arrived at the correct destination
- checks header values
- extracts application-layer message
- demultiplexes message up to application via socket
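The sender and receiver actions above can be sketched with a UDP-style 8-byte header (source port, destination port, length, checksum). This is only an illustration: the function names `make_segment` and `read_segment` and the port numbers are hypothetical, and the checksum field is left at 0 rather than computed.

```python
import struct

# UDP-style header: source port, destination port, length, checksum (2 bytes each).
UDP_HEADER = struct.Struct("!HHHH")

def make_segment(src_port: int, dst_port: int, message: bytes) -> bytes:
    """Sender side: prepend a header; the payload bytes are left untouched."""
    length = UDP_HEADER.size + len(message)
    # Checksum is set to 0 here (not computed) to keep the sketch short.
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + message

def read_segment(segment: bytes):
    """Receiver side: check header values, then extract the application message."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(segment)
    if length != len(segment):
        raise ValueError("length field does not match segment size")
    return dst_port, segment[UDP_HEADER.size:length]

segment = make_segment(50000, 53, b"hello")
dst_port, message = read_segment(segment)
print(len(segment), dst_port, message)
```

The destination port returned by `read_segment` is what the receiver would use to pick the socket during demultiplexing.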
Multiplexing and Demultiplexing
Overview
- Multiplexing and demultiplexing happen at all layers. For example,
- application layer & transport Layer -> port number in the TCP and UDP header
- transport layer & network layer -> protocol ID in the IP header
- network layer & data link layer -> frame type in the Ethernet header, for example.
Demultiplexing
- Host receives IP datagrams (network layer)
- each datagram has source & destination IP addresses
- each datagram carries one transport-layer segment
- each segment has source & destination port numbers (the source port tells the receiver where to send a reply)
Host uses IP addresses & port numbers to direct segment to appropriate socket
Connectionless Demux (UDP, using destination port number only)
- Create socket with unique local port number, which is assigned by OS (sender)
- When creating datagram to send into UDP socket, must specify (sender)
- destination IP address
- destination port number
- When host receives UDP segment (receiver)
- checks destination port number in segment
- directs UDP segment to socket with that port number
- Others
- IP datagrams with the same destination IP address and port number, but different source IP addresses and/or source port numbers, will be directed to the same socket at the destination
- Q: Why is a UDP segment also called a "datagram"?
- A: Because UDP does not improve or enhance the service of the network layer.
- all senders share the same socket; UDP does not distinguish between them
- UDP socket is identified by two-tuple: (dest IP address, dest port number)
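A small loopback experiment, as a sketch of the behaviour described above: two senders with different source ports reach the same receiving socket, because only the destination address and port are used for demultiplexing. The variable names are illustrative, and the example assumes the loopback interface `127.0.0.1` is available.

```python
import socket

# One receiving socket; port 0 asks the OS to assign a free local port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
dest = receiver.getsockname()  # (destination IP, destination port)

# Two independent senders; each gets its own source port from the OS on first send.
sender_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_a.sendto(b"from A", dest)
sender_b.sendto(b"from B", dest)

# Both datagrams arrive on the same socket; the source address is only reported
# to the application, it plays no part in choosing the socket.
received = {}
for _ in range(2):
    data, source = receiver.recvfrom(2048)
    received[data] = source

for s in (receiver, sender_a, sender_b):
    s.close()

print(received)
```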
Connection-oriented Demux (TCP, using 4-tuple)
- TCP socket identified by 4-tuple:
- source IP address
- source port number
- dest IP address
- dest port number
- Server host may support many simultaneous TCP sockets
- each socket is identified by its own 4-tuple
- Receiver uses all four values to direct segment to appropriate socket
- Web servers have different sockets for each connecting client
- Non-persistent HTTP will have different sockets for each request
- Others
- UDP serves all senders through one socket, whereas TCP creates an additional socket for each connection
- as a result, TCP consumes extra resources, such as memory
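The following loopback sketch illustrates the 4-tuple idea: two clients connect to the same server IP address and port, and the server ends up with two distinct connection sockets that differ only in the client's source port. The helper `four_tuple` is a hypothetical name, and the example assumes the loopback interface is available.

```python
import socket

# Listening socket on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(2)
listener.settimeout(2.0)
server_addr = listener.getsockname()

# Two clients connect to the same server IP address and port.
client_a = socket.create_connection(server_addr, timeout=2.0)
client_b = socket.create_connection(server_addr, timeout=2.0)

# accept() returns a new socket for each connection.
conn_1, _peer_1 = listener.accept()
conn_2, _peer_2 = listener.accept()

def four_tuple(conn):
    # (source IP, source port, dest IP, dest port) as seen by the server.
    return conn.getpeername() + conn.getsockname()

tuples = {four_tuple(conn_1), four_tuple(conn_2)}
local_ports = {conn_1.getsockname()[1], conn_2.getsockname()[1]}

for s in (conn_1, conn_2, client_a, client_b, listener):
    s.close()

print(tuples)
```

Both connection sockets share the server's local port; the source port in each 4-tuple is what lets the receiver tell the connections apart.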
Connectionless Transport: UDP
- Features
- "no frills", "bare bones" Internet transport protocol (it hands packets between IP and the application and adds almost nothing)
- "best effort" service, UDP segment may be lost or delivered out-of-order to app
- connectionless
- no handshaking between UDP senders and receivers
- each UDP segment handled independently of others
- Why UDP Exists (Advantages)
- no connection establishment
- DNS chooses UDP for this reason: connection setup would add an extra RTT of delay
- simple (no connection)
- small header size
- no congestion control
- UDP can blast away as fast as desired
- can function in the face of congestion
- Usage
- streaming multimedia apps (loss tolerant, rate sensitive)
- DNS
- SNMP
- HTTP/3
- if reliable transfer is needed over UDP (as with HTTP/3)
- add needed reliability at application layer
- add congestion control at application layer
- Segment
- length: in bytes of UDP segment, including header
- in practice the UDP segment is kept small enough for the packet to stay below 1500 bytes, in order to avoid being fragmented by a router
- checksum: for error detection (cannot detect all errors, but detects most)
- procedure
- sender
- treat segment contents, including header (8 bytes), as sequence of 16-bit integers
- checksum: one's complement of the one's complement sum of the segment contents
- sender puts checksum value into UDP checksum field
- receiver
- compute checksum of received segment
- check if computed checksum equals checksum field value:
- no - error detected (two options)
- the transport layer drops the segment
- the segment is passed to the application layer, which decides what to do next
- yes - no error detected (but may still have errors)
- others
- some implementations allow UDP checksum calculation to be disabled in order to speed up the processing of incoming UDP datagrams
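The checksum procedure above can be sketched as follows. This is a simplified illustration: it covers only the UDP header and payload, whereas the real UDP checksum also covers a pseudo-header built from IP fields. The function names are hypothetical. The receiver check is written in the equivalent form "the sum over the whole segment, checksum included, must be all ones".

```python
import struct

def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian integers, wrapping any carry around."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total

def checksum16(data: bytes) -> int:
    """Sender: one's complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF

def verify(segment: bytes) -> bool:
    """Receiver: summing everything, checksum included, gives 0xFFFF if no error is detected."""
    return ones_complement_sum16(segment) == 0xFFFF

payload = b"hello"
length = 8 + len(payload)  # the 8-byte header is included in the length
csum = checksum16(struct.pack("!HHHH", 50000, 53, length, 0) + payload)
segment = struct.pack("!HHHH", 50000, 53, length, csum) + payload

corrupted = bytearray(segment)
corrupted[-1] ^= 0x01  # flip one bit in the payload

ok = verify(segment)
bad = verify(bytes(corrupted))
print(ok, bad)
```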
Connection-oriented Transport: TCP
- Services
- point-to-point
- reliable, in-order byte stream (no message boundaries)
- reliability is achieved with ACKs and sequence numbers
- ACK and sequence numbers count bytes in the byte stream, not packets
- the sequence number of a segment is the byte-stream number of the first byte in the segment
- full duplex data
- bi-directional data flow in same connection
- MSS: maximum segment size (excluding header)
- connection specific
- the two directions may have different MSSs
- in contrast, UDP has no MSS, only a per-packet size (the length field)
- depends on two issues
- overhead management
- IP fragmentation
- absolute limit: \(2^{16}-1-40 = 65495\) bytes
- \(2^{16}-1\) is the maximum IP datagram size
- the IP header costs 20 bytes and the TCP header costs 20 bytes -> 40 bytes
- in practice, it is hard to reach this number
- typically, its value is 1460 bytes (payload only)
- TCP header is 20 bytes (min), IP datagram header is 20 bytes (min), and an Ethernet frame carries at most 1500 bytes (but up to 8000 bytes in data centres), so \(1500-40 = 1460\)
- cumulative ACKs (in the header)
- ACK: sequence number of next byte expected from other side
- duplicate ACKs indicate either packet loss (which triggers a fast retransmit) or out-of-order packets
- the other way to detect packet loss (and trigger a retransmission) is a timer (timeout)
- TCP uses this itself; over UDP, any timeout has to be implemented by the application
- pipelining (the sender keeps transmitting without waiting for each ACK)
- without pipelining, packets are transmitted one at a time; this is still reliable, only slower
- connection-oriented
- flow control
- sender will not overwhelm receive buffer
- congestion control
- concerned with the buffers of routers in the network
- enforced through the window size (the key factor)
- others
- send buffer
- lets TCP decide when to send packets
- keeps sent data available for retransmission
- receive buffer
- holds out-of-order packets until the gaps are filled
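To make byte-stream sequence numbers, the MSS, cumulative ACKs, and the receive buffer concrete, here is a minimal sketch. The names `segment_stream` and `Receiver` are hypothetical. It assumes a receiver that buffers out-of-order segments (one possible implementation choice), segments that never partially overlap, and no sequence-number wraparound.

```python
MTU = 1500          # bytes carried by one Ethernet frame
IP_HEADER = 20      # minimum IP header
TCP_HEADER = 20     # minimum TCP header
MSS = MTU - IP_HEADER - TCP_HEADER  # 1460 bytes of payload

def segment_stream(data: bytes, initial_seq: int, mss: int = MSS):
    """Split a byte stream; each segment's seq is the stream number of its first byte."""
    return [(initial_seq + off, data[off:off + mss])
            for off in range(0, len(data), mss)]

class Receiver:
    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq
        self.out_of_order = {}        # seq -> payload (the receive buffer)
        self.delivered = bytearray()  # in-order bytes handed to the application

    def receive(self, seq: int, payload: bytes) -> int:
        if seq >= self.next_expected:
            self.out_of_order[seq] = payload
        # Deliver every buffered segment that is now contiguous.
        while self.next_expected in self.out_of_order:
            chunk = self.out_of_order.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected  # cumulative ACK: next byte expected

data = bytes(range(256)) * 16          # 4096 bytes
data = data[:4000]
segments = segment_stream(data, initial_seq=1000)

rx = Receiver(initial_seq=1000)
arrival_order = [segments[0], segments[2], segments[1]]  # middle segment arrives late
acks = [rx.receive(seq, payload) for seq, payload in arrival_order]
print([seq for seq, _ in segments], acks)
```

The repeated ACK value in `acks` is the duplicate ACK produced while the middle segment is missing; once it arrives, the ACK jumps past all buffered data.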
- Segment Structure
- source & dest ports -> multiplexing & demultiplexing
- header length (4 bits)
- Q: What is the max header size?
- A: The header length field counts the header in units of 4 bytes (32-bit words). The field has only 4 bits, so its largest value is 15, which gives a maximum header size of \(15 \times 4 = 60\) bytes. The fixed part of the header is 20 bytes; the remaining bytes, at most 40, hold the "options".
- C, E (CWR and ECE, used by the newer explicit congestion notification mechanism; we can ignore them here)
- U (URG, seldom used)
- if U equals 1, the payload contains two types of data
- urgent data
- runs from the first byte of the payload up to the urgent data pointer
- ordinary data
- the urgent data pointer marks the boundary between the two types of data
- P (PSH), means push the data to the application quickly
- normally the receiving transport layer does not pass each packet to the application layer the moment it arrives
- packets are held in the buffer; when a packet with P equal to 1 arrives, the buffered data is passed to the application layer together and the buffer is emptied
- R, S, F
- RST: if the client or server crashes, R is set to 1 to reset the connection
- SYN: connection request
- FIN: the sender has finished sending; the connection is fully closed once both sides have finished
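As a sketch of how the fields above sit in the fixed 20-byte header, the following parses the ports, sequence and ACK numbers, header length, and flag bits. The function name `parse_tcp_header` and the example values are hypothetical; options are not decoded.

```python
import struct

# Flag bits in the flags byte, from most to least significant.
FLAG_BITS = {"CWR": 0x80, "ECE": 0x40, "URG": 0x20, "ACK": 0x10,
             "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

MAX_HEADER_LEN = 15 * 4  # 4-bit field counting 4-byte words -> 60 bytes

def parse_tcp_header(segment: bytes) -> dict:
    (src_port, dst_port, seq, ack, offset_byte, flags,
     window, checksum, urgent_ptr) = struct.unpack_from("!HHIIBBHHH", segment)
    header_len = (offset_byte >> 4) * 4  # upper 4 bits, in units of 4 bytes
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for name, bit in FLAG_BITS.items() if flags & bit],
        "window": window,
        "urgent_ptr": urgent_ptr,
    }

# A hand-built SYN segment header: header length 5 words (20 bytes), no options.
syn = struct.pack("!HHIIBBHHH", 50000, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
parsed = parse_tcp_header(syn)
print(parsed)
```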
- Reliable Data Transfer
- window size
- the best window size N is determined by congestion control
- Q: How does the receiver handle out-of-order segments?
- A: The TCP spec does not say; it is up to the implementor.
- TCP round trip time, timeout
- the timer is associated with the oldest unacknowledged packet
- timeout value
- longer than RTT, but RTT varies
- too short: premature timeout, unnecessary retransmissions
- too long: slow reaction to segment loss
- estimate RTT
- SampleRTT: measured time from segment transmission until ACK receipt
- ignore retransmissions
- SampleRTT will vary, want estimated RTT "smoother"
- average several recent measurements, not just current SampleRTT
- EstimatedRTT = \((1-\alpha)\times\)EstimatedRTT + \(\alpha\times\)SampleRTT
- exponential weighted moving average (EWMA)
- influence of past sample decreases exponentially fast
- typical value: \(\alpha\) = 0.125
- EstimatedRTT acts like the average RTT and SampleRTT like the current RTT; each update blends the two to produce the new EstimatedRTT
- timeout interval: EstimatedRTT plus "safety margin"
- large variation in EstimatedRTT: want a larger safety margin
- TimeoutInterval = EstimatedRTT + 4\(\times\)DevRTT
- DevRTT (safety margin): EWMA of SampleRTT deviation from EstimatedRTT
- DevRTT = (1 - \(\beta\))\(\times\)DevRTT + \(\beta\times\)|SampleRTT - EstimatedRTT|
- \(\beta\) = 0.25
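The three formulas above can be combined into a small estimator. This is a sketch under stated assumptions: the class name `RttEstimator` is hypothetical, the initial DevRTT of half the first sample is an assumption the notes do not specify, and DevRTT is updated before EstimatedRTT (so the deviation is measured against the previous estimate), which is also an assumption about ordering.

```python
ALPHA = 0.125  # weight of the newest sample in EstimatedRTT
BETA = 0.25    # weight of the newest deviation in DevRTT

class RttEstimator:
    def __init__(self, first_sample: float):
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initial value

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT (not from a retransmission); return TimeoutInterval."""
        # Assumed order: deviation first, against the previous EstimatedRTT.
        self.dev_rtt = (1 - BETA) * self.dev_rtt + BETA * abs(sample_rtt - self.estimated_rtt)
        self.estimated_rtt = (1 - ALPHA) * self.estimated_rtt + ALPHA * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt

est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(200.0)              # one much slower sample
print(est.estimated_rtt, est.dev_rtt, timeout)
```

A single slow sample moves EstimatedRTT only slightly but widens the safety margin noticeably, which is the intended effect of the DevRTT term.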
- TCP fast retransmit
- after many experiments, 3 duplicate ACKs turned out to be the best threshold for retransmitting before the timer expires
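A minimal sketch of the duplicate-ACK rule: the sender counts ACKs that repeat the previous ACK number and retransmits once the count reaches three. The function name `fast_retransmit_points` is hypothetical, and the sketch ignores everything else a real sender tracks (window, timer, congestion state).

```python
DUP_ACK_THRESHOLD = 3

def fast_retransmit_points(acks):
    """Return the ACK numbers for which the sender would fast retransmit."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                # Resend the segment starting at this byte, without waiting for the timer.
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits

# Original ACK 2460 followed by three duplicates -> retransmit from byte 2460.
with_loss = fast_retransmit_points([2460, 2460, 2460, 2460, 5000])
# Only two duplicates (e.g. mild reordering) -> no fast retransmit.
reordered = fast_retransmit_points([2460, 2460, 2460, 5000])
print(with_loss, reordered)
```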
- advantages of delayed ACK
- the receiver waits up to 500 ms for the next segment, so that one ACK can cover several segments
- this saves dedicated ACK segments (no payload, just a header)