
Understanding Real-Time Data Flow and SIGPIPE in Linux Pipes

This article explains how Linux pipes transmit data in real time, explores buffering modes, demonstrates practical experiments with Python scripts, and details read/write rules, including SIGPIPE handling, providing developers with essential insights for effective command-line pipeline usage.

Efficient Ops

Many Linux users know the pipe symbol "|" for chaining commands, but how data actually flows through a pipe in real time, and what that means in practice, is often overlooked.

<code>echo 123 | awk '{print $0+123}'       # outputs 246</code>

Both ends of a pipe run concurrently: the left-hand command writes to the pipe as soon as it produces output, and the right-hand command reads and processes that data as it arrives, without waiting for the writer to finish.

Pipe Definition

A pipe is a kernel-managed buffer that connects the output of one process to the input of another. It behaves like a fixed-size circular buffer in memory: when it is empty, the reader blocks, and when it is full, the writer blocks. The pipe is destroyed once both ends have been closed.
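The "writer blocks when the pipe is full" behavior can be observed from a single process. The following is a minimal Python 3 sketch, not a definitive measurement: the helper name pipe_capacity and the 4096-byte chunk size are assumptions made for illustration. It switches the write end to non-blocking mode, so the point where a blocking writer would stall shows up as a BlockingIOError instead.

```python
import os


def pipe_capacity(chunk_size=4096):
    """Count how many bytes a fresh pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode turns "the writer blocks" into a BlockingIOError.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the kernel buffer is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe accepted", pipe_capacity(), "bytes before filling up")
```

The number printed depends on the kernel and its configuration, so treat it as an observation about one machine rather than a constant.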

Consider the command COMMAND1 | COMMAND2. COMMAND1's stdout is bound to the pipe's write end, and COMMAND2's stdin is bound to the pipe's read end, so whatever COMMAND1 writes becomes available to COMMAND2 right away.
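To make the wiring concrete, here is a rough Python 3 sketch of what the shell sets up for a two-command pipeline, using the standard subprocess module. The two child commands are stand-ins chosen for illustration (small Python one-liners that mimic the earlier echo and awk example), and run_pipeline is a hypothetical helper name.

```python
import subprocess
import sys


def run_pipeline():
    """Wire two child processes together the way a shell does for A | B."""
    # Stand-in for COMMAND1: prints 123 to its stdout (the pipe's write end).
    producer = subprocess.Popen(
        [sys.executable, "-c", "print(123)"],
        stdout=subprocess.PIPE,
    )
    # Stand-in for COMMAND2: reads stdin (the pipe's read end) and adds 123.
    consumer = subprocess.Popen(
        [sys.executable, "-c", "import sys; print(int(sys.stdin.read()) + 123)"],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    # Drop the parent's copy of the read end so only the consumer holds it.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode().strip()


if __name__ == "__main__":
    print(run_pipeline())
```

Closing the parent's copy of the read end matters: as long as any process keeps a descriptor open, the kernel does not treat that end of the pipe as closed.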

<code># 1.py
import time
while True:
    print('1111')    # single-argument print() runs on Python 2 and 3 alike
    time.sleep(3)
    print('2222')
    time.sleep(3)
</code>
<code># Run the script through a pipe
python 1.py | cat</code>

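The pipe itself never holds output back until the script finishes: data is available to cat as soon as it is written to the pipe. In this example, however, nothing appears for a long time, because Python buffers the output in its own process before it ever reaches the pipe.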

File I/O buffering modes affect this behavior:

Full buffering: data is written only when the buffer is full (typical for files and pipes).

Line buffering: data is flushed at each newline character (standard output when attached to a terminal).

No buffering: data is written immediately (standard error).

Python’s default stdout is line‑buffered when attached to a terminal but becomes fully buffered when redirected to a pipe, so the output may not appear until the buffer fills or flush() is called.
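One way to check which mode is in effect is to ask the stream itself. The sketch below assumes Python 3, where text streams expose a line_buffering attribute. The helper name describe_stdout and the file name check_stdout.py used afterwards are made up for illustration.

```python
import sys


def describe_stdout():
    """Report how this process's stdout is connected and buffered."""
    return {
        "is_terminal": sys.stdout.isatty(),
        # line_buffering is an attribute of Python 3 text streams.
        "line_buffered": getattr(sys.stdout, "line_buffering", None),
    }


if __name__ == "__main__":
    # Write to stderr so the report stays visible even when stdout is piped.
    print(describe_stdout(), file=sys.stderr)
```

Running it once as python3 check_stdout.py and once as python3 check_stdout.py | cat should show the two values change between the terminal and pipe cases.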

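<code># Method 1 (save as its own script): print more than the buffer holds (e.g., 4096 bytes)
import time
while True:
    print('1111' * 4096)   # 16384 characters, enough to overflow the buffer
    time.sleep(3)
    print('2222' * 4096)
    time.sleep(3)

# Method 2 (a separate script): manually flush after each print
import time, sys
while True:
    print('1111')
    sys.stdout.flush()     # push the buffered line into the pipe now
    time.sleep(3)
    print('2222')
    sys.stdout.flush()
    time.sleep(3)
</code>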

Running the two methods yields:

<code># Method 1 (full buffer)
1111... (one long line)
...sleep 3 seconds...
2222... (one long line)

# Method 2 (manual flush)
1111
...sleep 3 seconds...
2222
...sleep 3 seconds...
1111
...</code>

Thus the delay never comes from the pipe itself: once the writer's buffer fills or is flushed and the data is written to the pipe, the downstream command can read it immediately.
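The same effect can be measured rather than watched. The sketch below, written for Python 3, starts a child that prints one line and then sleeps for a second, and reports how long the line reached the parent before the child exited. It compares the default buffering with the interpreter's -u option, which makes stdout unbuffered. The helper name and the one-second sleep are assumptions for illustration.

```python
import os
import subprocess
import sys
import time

# Child program: print one line, then stay alive for a second.
CHILD = "import time; print('1111'); time.sleep(1)"


def gap_between_line_and_exit(extra_args=()):
    """Seconds between the child's first line arriving and the child exiting."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # so only extra_args controls buffering
    with subprocess.Popen(
        [sys.executable, *extra_args, "-c", CHILD],
        stdout=subprocess.PIPE,
        env=env,
    ) as child:
        child.stdout.readline()      # returns once '1111' reaches the pipe
        line_seen = time.monotonic()
        child.stdout.read()          # returns at EOF, when the child exits
    return time.monotonic() - line_seen


if __name__ == "__main__":
    print("default buffering: %.2f s" % gap_between_line_and_exit())
    print("python -u:         %.2f s" % gap_between_line_and_exit(["-u"]))
```

With default buffering the line only reaches the pipe when the child exits, so the gap should be close to zero. With -u the line arrives first and the gap should be close to the sleep time.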

Pipe Read/Write Rules and SIGPIPE

Key rules for pipe operations:

If no data is available to read: without O_NONBLOCK, read() blocks until data arrives; with O_NONBLOCK, read() returns -1 and sets errno to EAGAIN.

If the pipe is full: without O_NONBLOCK, write() blocks until space is freed; with O_NONBLOCK, write() returns -1 and sets errno to EAGAIN.

If all write‑end file descriptors are closed, read() returns 0 (EOF) once any data still in the pipe has been consumed.

If all read‑end file descriptors are closed, a subsequent write() generates the SIGPIPE signal; if the writer ignores or handles the signal, write() returns -1 and sets errno to EPIPE.

Writes of size ≤ PIPE_BUF (4096 bytes on Linux) are atomic; larger writes are not guaranteed to be atomic and may be interleaved with data from other writers.

When a writer receives SIGPIPE, it terminates by default.
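Several of these rules can be checked inside one process with os.pipe(). The following Python 3 sketch is illustrative only; the function name and the result keys are invented here. It relies on the fact that CPython ignores SIGPIPE at startup, so a write to a pipe with no reader raises BrokenPipeError (EPIPE) rather than killing the process.

```python
import errno
import os


def demonstrate_pipe_rules():
    """Exercise the empty-pipe, EOF, and closed-reader rules."""
    results = {}

    # Rule: empty pipe + O_NONBLOCK -> read() fails with EAGAIN.
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    try:
        os.read(read_fd, 1)
    except BlockingIOError as exc:
        results["empty_read_is_eagain"] = exc.errno == errno.EAGAIN

    # Rule: all write ends closed -> read() drains leftovers, then returns EOF.
    os.write(write_fd, b"last")
    os.close(write_fd)
    results["leftover"] = os.read(read_fd, 16)
    results["eof"] = os.read(read_fd, 16)
    os.close(read_fd)

    # Rule: all read ends closed -> write() fails. Assumes SIGPIPE is ignored
    # (CPython's default), so the failure surfaces as EPIPE, not a signal death.
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    try:
        os.write(write_fd, b"x")
    except BrokenPipeError as exc:
        results["write_is_epipe"] = exc.errno == errno.EPIPE
    finally:
        os.close(write_fd)
    return results


if __name__ == "__main__":
    for rule, outcome in demonstrate_pipe_rules().items():
        print(rule, "->", outcome)
```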

<code>#!/usr/bin/python
import time, sys
while True:
    time.sleep(10)        # leaves time to kill the reader between writes
    print('1111')
    sys.stdout.flush()    # forces the write; fails once the reader is gone
</code>

Run the script and kill the reading process:

<code>python 1.py | cat
# In another terminal:
ps -fe | grep -E 'cat|python'
# Kill the cat process
kill <cat_pid>
</code>

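Python ignores SIGPIPE at startup, so the script is not killed by the signal itself. Instead, its next write to the pipe (up to 10 seconds after cat is killed) fails with EPIPE, and the script exits with a broken‑pipe exception (IOError on Python 2, as shown here; BrokenPipeError on Python 3):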

<code>Traceback (most recent call last):
  File "1.py", line 6, in <module>
    sys.stdout.flush()
IOError: [Errno 32] Broken pipe
Terminated
</code>
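The traceback above comes from Python, which ignores SIGPIPE at startup and reports the failed write as an exception. To see the default action described in the rules (the writer being terminated by the signal), a process has to restore the default disposition first. The Python 3 sketch below does that in a child process and inspects its exit status; the WRITER snippet and the helper name are assumptions made for this illustration.

```python
import signal
import subprocess
import sys

# Child script: restore the default SIGPIPE action, then keep writing.
WRITER = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
while True:
    os.write(1, b"1111\\n")
"""


def writer_exit_status():
    """Start the writer, close the only read end, and return its exit status."""
    writer = subprocess.Popen(
        [sys.executable, "-c", WRITER],
        stdout=subprocess.PIPE,
    )
    writer.stdout.readline()   # wait until the writer is definitely running
    writer.stdout.close()      # closing the last read end breaks the pipe
    return writer.wait()


if __name__ == "__main__":
    status = writer_exit_status()
    # subprocess reports death by signal N as a return code of -N.
    print("killed by SIGPIPE:", status == -signal.SIGPIPE)
```

This is the same mechanism that lets a command such as yes stop quietly when a downstream head exits.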

An additional experiment shows that if the writer closes its end first, the reader receives EOF (read() returns 0).
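That experiment can be sketched in Python 3 as follows. The child command and the helper name read_until_eof are stand-ins chosen for illustration: the child writes two lines and exits, which closes the only write end, and the parent keeps reading until it gets an empty result.

```python
import subprocess
import sys


def read_until_eof():
    """Collect a short-lived writer's output, then observe EOF on the pipe."""
    writer = subprocess.Popen(
        [sys.executable, "-c", "print('1111'); print('2222')"],
        stdout=subprocess.PIPE,
    )
    lines = []
    with writer.stdout as pipe:
        while True:
            line = pipe.readline()
            if line == b"":
                # Zero bytes read: every write end of the pipe is closed.
                break
            lines.append(line.decode().rstrip("\n"))
    writer.wait()
    return lines


if __name__ == "__main__":
    print(read_until_eof())
```

The reader is not signalled or interrupted in this direction; EOF is simply the normal result of a read once no writer remains.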

Conclusion

By understanding how data flows through Linux pipes in real time, the impact of buffering, and the behavior of SIGPIPE and other read/write rules, developers can use pipelines more confidently and avoid common pitfalls.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
