
Understanding Real-Time Data Flow and SIGPIPE in Linux Pipes

This article explains how Linux pipes transmit data in real time, explores buffering modes, demonstrates practical experiments with Python scripts, and details read/write rules, including SIGPIPE handling, providing developers with essential insights for effective command-line pipeline usage.

Efficient Ops

May 23, 2019


Many Linux users know the pipe symbol "|" for chaining commands, but fewer stop to consider how data actually moves through a pipe in real time, or the practical consequences of that behavior.

echo 123 | awk '{print $0+123}'       # outputs 246

Both sides of a pipe run at the same time. The left-hand command writes to the pipe as soon as it produces output, and the right-hand command reads and processes that data as it arrives, without waiting for the writer to exit.

Pipe Definition

A pipe is a kernel-managed buffer that connects the output of one process to the input of another. It behaves like a circular buffer in memory: when it is empty, the reader blocks, and when it is full, the writer blocks. The kernel frees the pipe once every file descriptor referring to either end has been closed.
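To make the definition concrete, here is a minimal Python 3 sketch of a single process creating a pipe, writing into the kernel buffer, and reading the same bytes back. The helper name round_trip is only illustrative.

```python
import os

def round_trip(payload: bytes) -> bytes:
    # os.pipe() asks the kernel for a new pipe and returns (read_fd, write_fd).
    read_fd, write_fd = os.pipe()
    try:
        # A small write lands in the kernel buffer and returns at once,
        # even though nobody has read from the pipe yet.
        os.write(write_fd, payload)
        # The data is already waiting, so this read returns without blocking.
        return os.read(read_fd, len(payload))
    finally:
        # Once both ends are closed, the kernel frees the pipe.
        os.close(read_fd)
        os.close(write_fd)

if __name__ == "__main__":
    print(round_trip(b"hello pipe"))
```

The payload here is far smaller than the pipe's capacity, which is why the write does not block even with no reader active yet.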

Consider the command COMMAND1 | COMMAND2. The shell binds COMMAND1's stdout to the pipe's write end and COMMAND2's stdin to the pipe's read end, so whatever COMMAND1 writes becomes readable by COMMAND2 as soon as it reaches the pipe.
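The wiring a shell performs for COMMAND1 | COMMAND2 can be approximated from Python 3 with the subprocess module. This is a sketch, not how any particular shell is implemented. The two commands are hypothetical stand-ins, written as Python one-liners so the example does not depend on specific utilities being installed.

```python
import subprocess
import sys

# Hypothetical stand-ins for COMMAND1 and COMMAND2.
producer_cmd = [sys.executable, "-c", "print(123)"]
consumer_cmd = [sys.executable, "-c",
                "import sys; print(int(sys.stdin.read()) + 123)"]

def run_pipeline():
    # stdout=PIPE creates a pipe and binds the producer's stdout to its write end.
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    # The consumer's stdin is bound to the read end of that same pipe.
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close the parent's copy of the read end so only the consumer holds it.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode().strip()

if __name__ == "__main__":
    print(run_pipeline())
```

Closing the parent's copy of the read end matters: if the consumer exits early, the producer can only notice a broken pipe when no other process still holds that end open.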

# 1.py (Python 2 syntax)
import time
while True:
    print '1111'      # goes into Python's stdout buffer first
    time.sleep(3)
    print '2222'
    time.sleep(3)

# Run the script through a pipe
python 1.py | cat

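A pipe never holds output back until the script finishes: data reaches cat as soon as it is actually written to the pipe. With this script, however, nothing appears for a long time, because Python keeps the printed lines in its own stdout buffer instead of writing them to the pipe right away.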

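Standard I/O buffering comes in three modes, and the mode in effect determines when data actually reaches the pipe:

Full buffering: data is written only when the buffer fills up or is flushed explicitly (typical for files and pipes).

Line buffering: data is flushed at each newline character (standard output when attached to a terminal).

No buffering: data is written immediately (standard error).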

Python’s default stdout is line‑buffered when attached to a terminal but becomes fully buffered when redirected to a pipe, so the output may not appear until the buffer fills or flush() is called.
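One way to observe this switch is to ask a child interpreter how its own stdout is configured while that stdout is attached to a pipe. The sketch below uses Python 3, where the text layer exposes a line_buffering attribute. The helper name stdout_mode_when_piped is illustrative.

```python
import subprocess
import sys

# Child code: report whether stdout is a terminal and whether it is line-buffered.
CHILD = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"

def stdout_mode_when_piped():
    # stdout=PIPE attaches the child's stdout to a pipe instead of a terminal.
    result = subprocess.run([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode().split()

if __name__ == "__main__":
    print(stdout_mode_when_piped())
```

Through a pipe, both values come back False. Running the same one-liner directly in an interactive terminal should report True for both.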

# Method 1: Overflow the buffer on every print (each line is 16 KiB, far more than a typical 4096-byte buffer)
import time
while True:
    print '1111' * 4096
    time.sleep(3)
    print '2222' * 4096
    time.sleep(3)

# Method 2: Manually flush after each print
import time, sys
while True:
    print '1111'
    sys.stdout.flush()
    time.sleep(3)
    print '2222'
    sys.stdout.flush()
    time.sleep(3)

Running the two methods yields:

# Method 1 (full buffer)
1111... (one very long line, wrapped by the terminal)
...sleep 3 seconds...
2222... (one very long line, wrapped by the terminal)

# Method 2 (manual flush)
1111
...sleep 3 seconds...
2222
...sleep 3 seconds...
1111
...

Thus, as soon as data is written to the pipe (whether the buffer is full or flushed), the downstream command receives it immediately.
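The effect of flushing can also be measured rather than watched. The following Python 3 sketch starts a child that prints one line and then sleeps, and times how long the parent waits for that line with and without an explicit flush. The child script, the 1.5 second sleep, and the helper name are all assumptions made for illustration.

```python
import os
import subprocess
import sys
import time

# Child template: print a line, optionally flush, then sleep before exiting.
CHILD = """
import sys, time
print('1111')
if {flush}:
    sys.stdout.flush()
time.sleep(1.5)
"""

def seconds_until_first_line(flush):
    # Drop PYTHONUNBUFFERED so the child uses Python's default buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    start = time.monotonic()
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD.format(flush=flush)],
        stdout=subprocess.PIPE, env=env)
    child.stdout.readline()   # blocks until the child's data reaches the pipe
    elapsed = time.monotonic() - start
    child.wait()
    child.stdout.close()
    return elapsed

if __name__ == "__main__":
    print("without flush: %.2fs" % seconds_until_first_line(False))
    print("with flush:    %.2fs" % seconds_until_first_line(True))
```

Without the flush, the line only reaches the pipe when the child exits and its buffer is written out, so the wait includes the full sleep. With the flush, the wait is roughly the interpreter's startup time.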

Pipe Read/Write Rules and SIGPIPE

Key rules for pipe operations:

If no data is available to read:

Without O_NONBLOCK: read() blocks until data arrives.

With O_NONBLOCK: read() returns -1 and sets errno to EAGAIN.

If the pipe is full:

Without O_NONBLOCK: write() blocks until space is freed.

With O_NONBLOCK: write() returns -1 and sets errno to EAGAIN.

If all write‑end file descriptors are closed, read() returns 0 (EOF).

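If all read-end file descriptors are closed, a subsequent write() generates the SIGPIPE signal in the writing process; if that signal is ignored or handled, write() returns -1 and sets errno to EPIPE.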

Writes of size ≤ PIPE_BUF are atomic; larger writes are not guaranteed to be atomic.
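The non-blocking rules above can be checked from Python 3, where a failed non-blocking read() or write() surfaces as BlockingIOError carrying errno EAGAIN. This is a minimal sketch. The helper name and result keys are illustrative, and the measured capacity depends on the system.

```python
import errno
import os
import select

def nonblocking_demo():
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)    # sets O_NONBLOCK on the read end
    os.set_blocking(write_fd, False)   # and on the write end
    results = {}
    try:
        # Empty pipe + O_NONBLOCK: read() fails with EAGAIN instead of blocking.
        try:
            os.read(read_fd, 1)
        except BlockingIOError as exc:
            results["empty_read_eagain"] = (exc.errno == errno.EAGAIN)

        # Write PIPE_BUF-sized chunks until the kernel buffer is full;
        # the next write() then fails with EAGAIN instead of blocking.
        written = 0
        try:
            while True:
                written += os.write(write_fd, b"x" * select.PIPE_BUF)
        except BlockingIOError as exc:
            results["full_write_eagain"] = (exc.errno == errno.EAGAIN)
        results["capacity"] = written
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return results

if __name__ == "__main__":
    print(nonblocking_demo())
```

Each chunk is exactly PIPE_BUF bytes, so every write is atomic: it is either accepted in full or rejected with EAGAIN, never partially written.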

The default action for SIGPIPE is to terminate the process, so a writer that has not changed the signal's disposition is killed on its next write.

# Demonstrate SIGPIPE
#!/usr/bin/python
import time, sys
while True:
    time.sleep(10)
    print '1111'
    sys.stdout.flush()

Run the script and kill the reading process:

python 1.py | cat
# In another terminal:
ps -fe | grep -E 'cat|python'
# Kill the cat process
kill <cat_pid>

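On its next write, the script fails with a broken-pipe error. Python ignores SIGPIPE by default, so the writer is not killed by the signal; instead write() fails with EPIPE (errno 32) and Python raises an exception. The final "Terminated" line appears to be the shell reporting that cat was killed: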

Traceback (most recent call last):
  File "1.py", line 6, in <module>
    sys.stdout.flush()
IOError: [Errno 32] Broken pipe
Terminated
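Both outcomes of writing to a pipe with no reader can be reproduced in a short Python 3 sketch: an exception when SIGPIPE is ignored (Python's default), and death by signal when the default action is restored. The function names and the child script are illustrative assumptions. Note that Python 3 raises BrokenPipeError where the Python 2 traceback above shows IOError.

```python
import errno
import os
import signal
import subprocess
import sys

def write_after_reader_closed():
    # Python ignores SIGPIPE at startup, so the failed write surfaces as an
    # exception carrying EPIPE (errno 32) instead of killing the process.
    read_fd, write_fd = os.pipe()
    os.close(read_fd)                       # no reader is left
    try:
        os.write(write_fd, b"1111\n")
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)

# Child that restores the default SIGPIPE action, waits for EOF on stdin,
# and only then writes to its stdout.
CHILD = """
import os, signal, sys
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
sys.stdin.read()
os.write(1, b'1111\\n')
"""

def exit_status_with_default_sigpipe():
    child = subprocess.Popen([sys.executable, "-c", CHILD],
                             stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    child.stdout.close()    # close the only read end of the child's stdout pipe
    child.stdin.close()     # EOF on stdin tells the child to go ahead and write
    return child.wait()     # a negative value means "killed by that signal"

if __name__ == "__main__":
    print(write_after_reader_closed() == errno.EPIPE)
    print(exit_status_with_default_sigpipe() == -signal.SIGPIPE)
```

The stdin handshake is only there to guarantee that the read end is closed before the child writes. It is not part of the SIGPIPE mechanism itself.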

A further experiment covers the reverse case: if the writer closes its end first, the reader drains any remaining data and then sees end-of-file (read() returns 0).
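A minimal Python 3 sketch of that experiment, using a single process that holds both ends of the pipe (the helper name is illustrative):

```python
import os

def read_until_eof():
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"last words")
    os.close(write_fd)                  # the only write end is now closed
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if chunk == b"":                # empty result: read() returned 0, i.e. EOF
            break
        chunks.append(chunk)
    os.close(read_fd)
    return b"".join(chunks)

if __name__ == "__main__":
    print(read_until_eof())
```

Data written before the close is still delivered. EOF is reported only after the buffer has been drained.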

Conclusion

By understanding how data flows through Linux pipes in real time, the impact of buffering, and the behavior of SIGPIPE and other read/write rules, developers can use pipelines more confidently and avoid common pitfalls.
