
Understanding Linux Network I/O: OSI Layers, MTU, Fragmentation, and TCP Flow Control

This article explains the structure of Linux network I/O by detailing the OSI seven‑layer model, the role of each layer, MTU/PMTU concepts, IP fragmentation and reassembly, and key TCP mechanisms such as MSS, flow control, and congestion control, providing a comprehensive foundation for studying zero‑copy networking.

Qunar Tech Salon

In the previous article we discussed the structure of Linux network I/O; this piece clarifies why the network stack is so layered and explains terminology such as MSS and IFG.

Linux network I/O is commonly described in terms of the OSI seven‑layer model. The physical layer is handled by the network hardware, while the kernel and its drivers implement the data link, network, and transport layers. Understanding the OSI model is essential to grasp Linux networking.

1. Physical Layer

Data at the physical layer is represented as signals on various media (copper wire, fiber, air, vacuum). These signals are abstracted to binary 0/1, forming the basis for the link layer.

2. Data Link Layer

2.1 Overview

Because signals are prone to interference, bits are grouped into frames, and a per‑frame checksum lets the receiver detect corrupted frames.
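As a rough illustration of how a frame checksum catches corruption, the sketch below appends a CRC‑32 trailer to a payload and re‑checks it after flipping one bit. It uses Python's zlib.crc32; the helper names append_fcs and fcs_ok and the trailer byte order are assumptions made for this example, not part of any real driver API.

```python
import zlib

def append_fcs(body: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer (byte order chosen for illustration)."""
    return body + zlib.crc32(body).to_bytes(4, "little")

def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over everything except the trailer and compare."""
    body, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == trailer

frame = append_fcs(b"hello, link layer")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip a single bit

print(fcs_ok(frame))      # True
print(fcs_ok(corrupted))  # False
```

A CRC only detects corruption; the damaged frame is discarded, and any recovery is left to higher layers.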

Using Ethernet as an example, each frame on the wire is preceded by a preamble (7 bytes) for clock synchronization and a start‑of‑frame delimiter (1 byte), and is followed by an inter‑frame gap (IFG) of at least 12 byte times before the next frame begins.

Data frames contain three parts: header, payload, and trailer.

Header (14 bytes, or 18 bytes with the optional 802.1Q VLAN tag): destination MAC (6 B), source MAC (6 B), optional 802.1Q tag (4 B), EtherType (2 B).

Payload (46–1500 bytes).

Trailer (4 bytes) for CRC.
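The header layout above can be sketched with Python's struct module. The function name parse_ethernet_header and the sample MAC addresses are hypothetical; the sketch assumes the 802.1Q tag, when present, is signaled by the value 0x8100 in the position where EtherType normally sits.

```python
import struct

def parse_ethernet_header(frame: bytes) -> dict:
    """Decode destination MAC, source MAC, optional VLAN tag, and EtherType."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    vlan_id = None
    header_len = 14
    if ethertype == 0x8100:  # 802.1Q tag present: TCI, then the real EtherType
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID
        header_len = 18
    return {
        "dst": ":".join(f"{b:02x}" for b in dst),
        "src": ":".join(f"{b:02x}" for b in src),
        "vlan_id": vlan_id,
        "ethertype": ethertype,
        "header_len": header_len,
    }

# Hypothetical sample frames: broadcast destination, locally administered source.
macs = bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
untagged = macs + struct.pack("!H", 0x0800) + b"\x00" * 46
tagged = macs + struct.pack("!HHH", 0x8100, 100, 0x0800) + b"\x00" * 46

print(parse_ethernet_header(untagged))
print(parse_ethernet_header(tagged))
```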

2.2 Abstraction

After the link layer abstracts the signal, the data becomes a frame. The portion of the frame that the network layer operates on is the payload, referred to as a datagram.

2.3 MTU/PMTU

MTU (Maximum Transmission Unit) is the largest datagram a link can carry in a single frame. A datagram that exceeds it must be fragmented or, if fragmentation is not allowed, is dropped. The common Ethernet MTU is 1500 bytes.

PMTU (Path MTU) is the smallest MTU along a communication path and may differ in each direction.

2.4 Testing Your PMTU

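When ping is run with fragmentation prohibited (DF set, for example ping -M do -s 1473 on Linux), an ICMP payload larger than 1472 bytes fails on a standard Ethernet path because the total size (payload + 8 ICMP + 20 IP) exceeds the Ethernet MTU of 1500 bytes.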
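The arithmetic behind the 1472‑byte limit, together with the definition of PMTU as the smallest MTU on the path, can be written out as a small sketch. The function names and the example hop MTUs (including the 1492‑byte hop) are assumptions for illustration, and the IPv4 header is assumed to carry no options.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes

def path_mtu(hop_mtus):
    """PMTU is the smallest MTU of any link along the path."""
    return min(hop_mtus)

def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented datagram."""
    return mtu - IPV4_HEADER - ICMP_HEADER

print(max_ping_payload(1500))                          # 1472
print(max_ping_payload(path_mtu([1500, 1492, 1500])))  # 1464
```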

3. Network Layer (IPv4 Example)

3.1 Overview

The Internet Protocol (IP) routes packets based on source and destination addresses, providing an unreliable, best‑effort delivery service.

3.2 IPv4 Header

An IPv4 datagram has a variable‑length header, typically 20 bytes.
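To make the header layout concrete, here is a minimal sketch that decodes the fixed 20‑byte part of an IPv4 header with struct. The function name parse_ipv4_header and the sample addresses (taken from documentation address ranges) are assumptions; options beyond the first 20 bytes are not parsed.

```python
import socket
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte portion of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,        # IHL counts 32-bit words
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),           # Don't Fragment
        "mf": bool(flags_frag & 0x2000),           # More Fragments
        "frag_offset": (flags_frag & 0x1FFF) * 8,  # offset counts 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hypothetical sample: version 4, IHL 5, DF set, TTL 64, protocol 6 (TCP).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
print(parse_ipv4_header(sample))
```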

3.3 Fragmentation

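IP fragments datagrams that exceed the MTU of the outgoing link. Each fragment carries its own IP header, so it holds at most MTU − IP‑header size bytes of data, and every fragment except the last carries a multiple of 8 bytes because the fragment offset is counted in 8‑byte units. Fragments may be further fragmented on subsequent hops.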
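A minimal sketch of how fragment sizes fall out of the MTU, assuming a 20‑byte IP header with no options. The function name fragment and the tuple layout are hypothetical; a real stack also copies header fields and sets the Identification value.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return (offset_in_8_byte_units, data_len, more_fragments) per fragment."""
    max_data = (mtu - header_len) // 8 * 8  # round down to a multiple of 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len   # MF flag: set on all but the last
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments

# A 4000-byte payload over a 1500-byte MTU link:
print(fragment(4000, 1500))
# [(0, 1480, True), (185, 1480, True), (370, 1040, False)]
```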

3.4 Reassembly

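The receiver collects the fragments that share the same Identification value, orders them by fragment offset, and uses the MF (More Fragments) flag to recognize the last one. It then reassembles them and passes the complete datagram to the upper layer. The DF flag plays no part in reassembly: when set, it prevents the datagram from being fragmented in the first place.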
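Reassembly can be sketched as sorting by offset and checking for holes. This is a simplification under stated assumptions: all fragments passed in belong to one datagram, fragments do not overlap, and there is no timeout. The function name reassemble and the tuple layout are hypothetical.

```python
def reassemble(fragments):
    """fragments: iterable of (offset_in_8_byte_units, data, more_fragments).

    Returns the complete payload, or None if fragments are still missing.
    """
    ordered = sorted(fragments, key=lambda f: f[0])
    if not ordered or ordered[-1][2]:
        return None  # the final fragment (MF = 0) has not arrived yet
    buf = bytearray()
    for offset_units, data, _more in ordered:
        if offset_units * 8 != len(buf):
            return None  # a hole: some earlier fragment is missing
        buf += data
    return bytes(buf)

payload = bytes(i % 256 for i in range(40))
frags = [(0, payload[:16], True), (2, payload[16:32], True), (4, payload[32:], False)]

print(reassemble([frags[2], frags[0], frags[1]]) == payload)  # True
print(reassemble([frags[0], frags[2]]))                       # None
```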

3.5 Problems Caused by IP Fragmentation

CPU and memory overhead on both ends.

Loss of a single fragment forces retransmission of the entire original packet.

Maliciously crafted fragments can exhaust receiver memory.

Firewalls cannot easily filter non‑first fragments because they lack transport‑layer headers.

3.6 Abstraction

At the IP layer, data is abstracted as a datagram, and the pieces of a split datagram are called fragments. After reassembly, the transport layer receives a logically complete payload (for TCP, a segment).

4. Transport Layer (TCP Example)

4.1 Overview

TCP is a connection‑oriented, reliable, byte‑stream protocol.

4.2 Transmission Process

Application sends a data stream to TCP.

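TCP splits the stream into segments; IP forwards them.

Every byte has a sequence number, and each segment carries the number of its first byte; the receiver acknowledges received data with ACKs.

If an ACK is not received before the retransmission timeout (RTO, derived from the measured RTT) expires, the sender retransmits.

Checksums detect corrupted segments, which the receiver discards so that they are also retransmitted.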

Because IP does not guarantee order, TCP reorders packets using sequence numbers.
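The receiver side of the steps above can be sketched as a buffer keyed by sequence number with cumulative ACKs. This is a simplified model, not the kernel's implementation: it assumes segments never partially overlap, and the function name receive is hypothetical.

```python
def receive(segments, initial_seq=0):
    """segments: iterable of (seq, data) in arrival order.

    Returns the in-order byte stream and the cumulative ACK sent per arrival.
    """
    expected = initial_seq      # next byte the receiver is waiting for
    pending = {}                # out-of-order segments, keyed by seq
    delivered = bytearray()
    acks = []
    for seq, data in segments:
        if seq >= expected:
            pending.setdefault(seq, data)  # old duplicates are ignored
        while expected in pending:         # deliver any contiguous run
            chunk = pending.pop(expected)
            delivered += chunk
            expected += len(chunk)
        acks.append(expected)              # ACK = next byte expected
    return bytes(delivered), acks

arrivals = [(0, b"AAAA"), (8, b"CCCC"), (4, b"BBBB"), (4, b"BBBB")]
stream, acks = receive(arrivals)
print(stream)  # b'AAAABBBBCCCC'
print(acks)    # [4, 4, 12, 12]
```

The repeated ACK of 4 while byte 4 is missing is what lets a sender notice a gap before its timer expires.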

4.3 MSS

Although an IP datagram may be as large as 65,535 bytes, TCP limits segment size to avoid IP fragmentation. The Maximum Segment Size (MSS), derived from the MTU, is the largest amount of application data that can be carried in a TCP segment (excluding the TCP header and options).

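During the three‑way handshake, each side advertises its MSS in its SYN segment, and a sender never exceeds the value its peer advertised, so the smaller value effectively limits the connection. Typical Ethernet result: MSS = 1500 − 20 (IP) − 20 (TCP) = 1460 bytes.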
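The MSS arithmetic can be sketched as follows, assuming IPv4 and TCP headers without options (20 bytes each). The function names and the 1492‑byte peer MTU are assumptions for illustration.

```python
def mss_for_mtu(mtu, ip_header=20, tcp_header=20):
    """MSS a host would advertise for a given MTU (headers without options)."""
    return mtu - ip_header - tcp_header

def effective_mss(local_mtu, peer_mtu):
    """The smaller of the two advertised values limits the connection."""
    return min(mss_for_mtu(local_mtu), mss_for_mtu(peer_mtu))

print(mss_for_mtu(1500))          # 1460
print(effective_mss(1500, 1492))  # 1452
```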

4.4 Flow Control

TCP uses a sliding‑window mechanism. The receiver advertises a receive window size, limiting how many bytes the sender may transmit without further ACKs.
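A minimal sketch of the sender's side of the sliding window: given the oldest unacknowledged byte, the next byte to send, and the advertised window, it lists the segments that may go out right now. The names snd_una and snd_nxt follow common TCP terminology; the function itself is hypothetical and ignores congestion control.

```python
def segments_to_send(snd_una, snd_nxt, rwnd, total, mss=1460):
    """Return (seq, length) pairs the sender may transmit immediately.

    snd_una: oldest unacknowledged byte; snd_nxt: next byte to send;
    rwnd: receiver's advertised window; total: bytes in the stream.
    """
    out = []
    window_end = snd_una + rwnd  # sender may not go past this byte
    while snd_nxt < min(window_end, total):
        length = min(mss, window_end - snd_nxt, total - snd_nxt)
        out.append((snd_nxt, length))
        snd_nxt += length
    return out

# Fresh connection, 4096-byte window, 10000 bytes to send:
print(segments_to_send(0, 0, 4096, 10000))
# [(0, 1460), (1460, 1460), (2920, 1176)]

# After an ACK for byte 2920 (window still 4096), the window slides forward:
print(segments_to_send(2920, 4096, 4096, 10000))
# [(4096, 1460), (5556, 1460)]
```

When the receiver advertises a window of 0, the function returns an empty list and the sender must wait.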

4.5 Congestion Control

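Early TCP had no congestion control. Modern TCP adds a congestion window and employs algorithms such as slow start, which begins with a small window and increases the sending rate until loss or ACK feedback indicates the network limit. The sender transmits no more than the smaller of the congestion window and the receive window.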
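The growth pattern of slow start can be sketched as a per‑round trace of the congestion window, measured in segments. This is a heavily simplified, Tahoe‑style model: the additive phase after the threshold (congestion avoidance), the reset to the initial window on loss, and the function name cwnd_trace are assumptions added for illustration, and real implementations differ in many details.

```python
def cwnd_trace(rounds, ssthresh, loss_rounds=frozenset(), initial_cwnd=1):
    """Congestion window (in segments) at the start of each round trip."""
    cwnd = initial_cwnd
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)    # remember half the window at loss
            cwnd = initial_cwnd             # Tahoe-style: restart slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double every round
        else:
            cwnd += 1                       # assumed additive increase
    return trace

print(cwnd_trace(rounds=8, ssthresh=16, loss_rounds={5}))
# [1, 2, 4, 8, 16, 17, 1, 2]
```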

4.6 Abstraction

TCP abstracts data as segments. To the application, TCP provides a byte‑stream abstraction, which the OS exposes through the socket API.

End
