Understanding the Linux listen() System Call: Socket Creation, Queue Initialization, and Backlog Calculation
This article explores why a server must call listen() before accept() by dissecting the Linux kernel's listen system call, its interaction with socket objects, and how the kernel calculates and initializes the full and half‑connection queues based on backlog and system parameters.
The author introduces the topic by noting the necessity of calling listen() on a server socket before accepting client connections and sets out to uncover what the kernel does internally during this step.
1. Creating the socket
The server first creates a socket via the socket() function, which returns a file descriptor in user space but corresponds to a complex kernel object structure.
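To see from user space why the order matters, here is a minimal sketch, assuming Linux and an IPv4 TCP socket. The helper name accept_errno_without_listen is hypothetical. It creates a socket and calls accept() on it without ever calling listen(); on Linux the call is expected to fail immediately with EINVAL because the socket is not in the listening state.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: call accept() on a fresh TCP socket that never
 * called listen(). Returns the errno of the failed accept(), 0 if it
 * unexpectedly succeeded, or -1 if the socket could not be created. */
int accept_errno_without_listen(void)
{
    /* socket() returns an fd that refers to a kernel socket object */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int saved = 0;
    int conn = accept(fd, NULL, NULL);   /* no listen() was issued */
    if (conn < 0)
        saved = errno;
    else
        close(conn);

    close(fd);
    return saved;
}
```

A caller can compare the return value against EINVAL to confirm that the socket was rejected for not listening.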
2. Kernel execution of listen
2.1 listen system call
The source of the listen system call is in net/socket.c:
//file: net/socket.c
SYSCALL_DEFINE2(listen, int, fd, int, backlog)
{
    // locate the socket kernel object from fd
    sock = sockfd_lookup_light(fd, &err, &fput_needed);
    if (sock) {
        // get net.core.somaxconn
        somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
        if ((unsigned int)backlog > somaxconn)
            backlog = somaxconn;
        // invoke the protocol‑specific listen function
        err = sock->ops->listen(sock, backlog);
        ...
    }
}
The kernel translates the user‑space file descriptor to the corresponding socket object, clamps the requested backlog to net.core.somaxconn, and then calls the protocol‑specific listen implementation.
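The clamp can be modelled in user space as a small function. This is only a sketch of the comparison shown in the excerpt: clamp_backlog is a hypothetical name, and the somaxconn parameter stands in for the value of net.core.somaxconn.

```c
/* User-space model of the backlog clamp in the listen() system call.
 * somaxconn stands in for net.core.somaxconn. */
int clamp_backlog(int backlog, int somaxconn)
{
    /* The unsigned cast makes a negative backlog compare as a very
     * large value, so it is clamped to somaxconn as well. */
    if ((unsigned int)backlog > (unsigned int)somaxconn)
        backlog = somaxconn;
    return backlog;
}
```

With somaxconn set to 128, a backlog of 5 stays 5, a backlog of 512 becomes 128, and a backlog of -1 also becomes 128 because of the unsigned comparison.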
2.2 Protocol‑stack listen
The protocol‑specific function for IPv4 is inet_listen in net/ipv4/af_inet.c:
//file: net/ipv4/af_inet.c
int inet_listen(struct socket *sock, int backlog)
{
    if (old_state != TCP_LISTEN) {
        // start listening
        err = inet_csk_listen_start(sk, backlog);
    }
    // set the length of the full‑connection queue
    sk->sk_max_ack_backlog = backlog;
}
Because the system call has already clamped backlog, the value stored in sk_max_ack_backlog, which is the maximum length of the full‑connection (accept) queue, is the smaller of the user‑provided backlog and net.core.somaxconn.
2.3 Definition of the receive queue
The receive queue is represented by struct request_sock_queue inside struct inet_connection_sock:
//file: include/net/inet_connection_sock.h
struct inet_connection_sock {
    struct inet_sock icsk_inet;
    struct request_sock_queue icsk_accept_queue;
    ...
}
The definition of struct request_sock_queue contains pointers for the full‑connection queue (rskq_accept_head, rskq_accept_tail) and a pointer to the half‑connection hash table (listen_opt).
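A head pointer plus a tail pointer is the usual shape of a singly linked queue. The sketch below is a simplified user-space model, not the kernel code: struct req, struct accept_queue, queue_add and queue_remove are hypothetical names, and it assumes completed connections are appended at the tail and taken from the head in first-in, first-out order.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * needed to show the queue behaviour are modelled. */
struct req {
    int id;               /* hypothetical payload */
    struct req *next;     /* next completed connection */
};

struct accept_queue {
    struct req *head;     /* models rskq_accept_head */
    struct req *tail;     /* models rskq_accept_tail */
};

/* Append a completed connection at the tail. */
void queue_add(struct accept_queue *q, struct req *r)
{
    r->next = NULL;
    if (q->head == NULL)
        q->head = r;
    else
        q->tail->next = r;
    q->tail = r;
}

/* Remove and return the oldest connection, or NULL if the queue is empty. */
struct req *queue_remove(struct accept_queue *q)
{
    struct req *r = q->head;
    if (r == NULL)
        return NULL;
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    return r;
}
```

Keeping the tail pointer lets an append run in constant time, while an empty queue is represented simply by a NULL head.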
2.4 Allocation and initialization of the receive queue
The function inet_csk_listen_start calls reqsk_queue_alloc to allocate and initialise icsk_accept_queue:
//file: net/ipv4/inet_connection_sock.c
int inet_csk_listen_start(struct sock *sk, const int nr_table_entries)
{
    ...
    int rc = reqsk_queue_alloc(&icsk->icsk_accept_queue, nr_table_entries);
    ...
}
Inside reqsk_queue_alloc (found in net/core/request_sock.c) the kernel computes the size of the half‑connection hash table, allocates memory for it, and initialises the full‑connection queue head to NULL:
//file: net/core/request_sock.c
int reqsk_queue_alloc(struct request_sock_queue *queue,
                      unsigned int nr_table_entries)
{
    size_t lopt_size = sizeof(struct listen_sock);
    struct listen_sock *lopt;
    // limit half‑connection length
    nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
    nr_table_entries = max_t(u32, nr_table_entries, 8);
    nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
    // add room for the hash table: one pointer per entry
    lopt_size += nr_table_entries * sizeof(struct request_sock *);
    // allocate listen_sock (which holds the half‑connection hash table)
    if (lopt_size > PAGE_SIZE)
        lopt = vzalloc(lopt_size);
    else
        lopt = kzalloc(lopt_size, GFP_KERNEL);
    // initialise full‑connection queue head
    queue->rskq_accept_head = NULL;
    // attach half‑connection queue
    lopt->nr_table_entries = nr_table_entries;
    queue->listen_opt = lopt;
    ...
}
The half‑connection queue length is first clamped by sysctl_max_syn_backlog, then forced to be at least 8, and finally incremented by one and rounded up to a power of two. For performance, the kernel also records this length as a power‑of‑two exponent in max_qlen_log.
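The exponent can be derived from the table size with a short loop. The following is a sketch under an assumption: it takes the exponent to be the smallest value, starting from 3 (that is, 8 entries), whose power of two is at least the table size. The function name qlen_log is hypothetical.

```c
/* Smallest exponent e >= 3 such that (1 << e) >= nr_table_entries.
 * The e < 31 guard keeps the shift well defined for very large inputs. */
int qlen_log(unsigned int nr_table_entries)
{
    int e = 3;
    while (e < 31 && (1u << e) < nr_table_entries)
        e++;
    return e;
}
```

For a table of 16 entries this yields 4, and for 256 entries it yields 8, so a comparison against the queue length can be done with a shift instead of reading the full count.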
2.5 Calculation of half‑connection queue length
The effective half‑connection length becomes:
nr = min(backlog, somaxconn, tcp_max_syn_backlog);
nr = max(nr, 8);
nr = roundup_pow_of_two(nr + 1);
// minimum result is 16
Examples:
With somaxconn=128, tcp_max_syn_backlog=8192, and backlog=5, the final length is 16.
With the same system parameters but backlog=512, the final length is 256.
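The three steps can be checked with a small user-space function. This is a sketch, not kernel code: half_queue_len and roundup_pow2 are hypothetical names, and roundup_pow2 is a plain-loop stand-in for the kernel's roundup_pow_of_two().

```c
#include <stdint.h>

/* Round n up to the next power of two. Assumes 1 <= n <= 2^31,
 * which holds for realistic backlog values. */
static uint32_t roundup_pow2(uint32_t n)
{
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Model of the half-connection queue length calculation. */
uint32_t half_queue_len(uint32_t backlog, uint32_t somaxconn,
                        uint32_t max_syn_backlog)
{
    /* nr = min(backlog, somaxconn, tcp_max_syn_backlog) */
    uint32_t nr = backlog < somaxconn ? backlog : somaxconn;
    if (nr > max_syn_backlog)
        nr = max_syn_backlog;

    /* nr = max(nr, 8) */
    if (nr < 8)
        nr = 8;

    /* nr = roundup_pow_of_two(nr + 1) */
    return roundup_pow2(nr + 1);
}
```

Calling half_queue_len(5, 128, 8192) gives 16 and half_queue_len(512, 128, 8192) gives 256, matching the two examples. The +1 before rounding is why a clamped value that is already a power of two, such as 128, doubles to 256.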
Conclusion
Server‑side socket programming follows the sequence: socket() → bind() → listen() → accept(). The listen() call primarily allocates and initialises the receive queues: a linked list for full connections and a hash table for half connections. The full‑connection queue size is the smaller of the user‑provided backlog and net.core.somaxconn. The half‑connection queue size starts from the minimum of backlog, somaxconn, and tcp_max_syn_backlog; that value is raised to at least 8, incremented by one, and rounded up to a power of two, so the result is never smaller than 16.
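As a closing illustration, the user-space side of this sequence can be sketched as follows. It assumes Linux and IPv4 and binds only to the loopback address; make_listener is a hypothetical helper name, and passing port 0 asks the kernel to pick a free port.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: socket() -> bind() -> listen() on 127.0.0.1.
 * Returns the listening fd, or -1 on failure. */
int make_listener(unsigned short port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* listen() is the step that makes the kernel set up the queues */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* ready to be passed to accept() */
}
```

The backlog passed here is the value the kernel clamps against net.core.somaxconn, so requesting more than that limit has no additional effect on the full‑connection queue.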
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.