Understanding How strace Works: A Step‑by‑Step Implementation Using ptrace
This article explains the inner workings of the classic strace command by walking through a handcrafted implementation that uses ptrace to attach to a target process, capture its system calls, read the ORIG_RAX register, and print the syscall name, while also detailing the relevant kernel source code.
In the field of performance observation, the long‑standing # strace -p {pid} command is widely used to monitor the system calls of a running process; this article reveals how strace can observe another process despite Linux's process isolation.
1. Hand‑crafted strace – To grasp strace’s principles, a minimal C program is presented that attaches to a target PID, registers for syscall notifications, reads the syscall number from the ORIG_RAX register, translates it to a name, and prints it. The core logic is shown below:
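#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/reg.h>   // ORIG_RAX (x86_64 only)
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid_t pid = atoi(argv[1]);
    int status;
    // 1. attach to the target process and wait until it has stopped
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("PTRACE_ATTACH");
        return 1;
    }
    waitpid(pid, &status, 0);
    while (1) {
        // 2. resume the target until its next syscall entry or exit
        ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
        waitpid(pid, &status, 0);
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        // 3. read and decode the syscall number (x86_64 numbering)
        long syscall_number = ptrace(PTRACE_PEEKUSER, pid, 8*ORIG_RAX, NULL);
        const char *syscall_name;
        switch (syscall_number) {
        case 0: syscall_name = "read"; break;
        case 1: syscall_name = "write"; break;
        case 2: syscall_name = "open"; break;
        case 3: syscall_name = "close"; break;
        default: syscall_name = "unknown"; break;
        }
        printf("Syscall: %s (number: %ld)\n", syscall_name, syscall_number);
    }
    return 0;
}
The program relies on three key steps: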
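Attach to the target process – ptrace(PTRACE_ATTACH, pid, NULL, NULL) creates a tracing relationship. This requires permission to trace the target: typically root (CAP_SYS_PTRACE), or the same user when the system's ptrace restrictions (such as Yama's ptrace_scope) allow it.
Register as the target’s syscall debugger – ptrace(PTRACE_SYSCALL, pid, NULL, NULL) resumes the target and tells the kernel to stop it and notify the tracer at the next syscall entry or exit.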
Read the syscall number – The kernel stores the number in the ORIG_RAX register; ptrace(PTRACE_PEEKUSER, pid, 8*ORIG_RAX, NULL) retrieves it.
2. Attaching to the target process – The kernel’s ptrace implementation first locates the target’s task_struct, then calls ptrace_attach, which performs permission checks and links the tracer to the target via ptrace_link. This inserts the target (the "child" in the kernel's terms) into the tracer’s ptraced list and sets its parent to the tracer, so that the tracer’s waitpid is notified whenever the target stops.
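The user-space side of the attach step can be sketched as follows. This is a minimal illustration, not strace's actual code: the helper name attach_and_wait is hypothetical, and the error handling is an assumption about what a careful tracer would want. It attaches, then waits for the stop that PTRACE_ATTACH triggers before any further ptrace request is made.

```c
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: attach to pid and wait until it has stopped.
 * Returns 0 on success, -1 on failure. */
int attach_and_wait(pid_t pid)
{
    /* Fails with EPERM if the kernel's permission checks reject the
     * tracer, or ESRCH if no such process exists. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("ptrace(PTRACE_ATTACH)");
        return -1;
    }

    /* PTRACE_ATTACH sends SIGSTOP to the target; most other ptrace
     * requests only work once the target has actually stopped. */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    if (!WIFSTOPPED(status)) {
        fprintf(stderr, "target %d did not stop as expected\n", (int)pid);
        return -1;
    }
    return 0;
}
```

A caller would pass the PID given on the command line and, when finished, release the target with ptrace(PTRACE_DETACH, pid, NULL, NULL) so that it keeps running untraced.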
3. Capturing the target’s SYSCALL
After attachment, the tracer repeatedly issues PTRACE_SYSCALL and sleeps with waitpid. When the target executes a syscall, the kernel’s syscall_trace_enter detects the SYSCALL_TRACE flag, calls ptrace_report_syscall_entry, and ultimately invokes ptrace_stop, which sets the target’s state to TASK_TRACED, records the exit code, notifies the tracer, and schedules the target out.
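To watch this stop-and-resume cycle from user space without needing another process's PID, the sketch below forks its own child and traces it. The helper name count_syscall_stops is hypothetical, and using PTRACE_TRACEME plus PTRACE_O_TRACESYSGOOD is a choice made for this illustration rather than something taken from the article's program. It counts how many times the child is stopped at a syscall boundary.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: trace a forked child and count its syscall stops.
 * Returns the number of stops, or -1 on error. */
long count_syscall_stops(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced by the parent, then stop and wait. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        getppid();              /* a syscall for the tracer to observe */
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* With TRACESYSGOOD, syscall stops are reported as SIGTRAP | 0x80,
     * so they cannot be confused with a genuine SIGTRAP. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return -1;

    long stops = 0;
    long pending_sig = 0;       /* signal to deliver when resuming */
    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)pending_sig) == -1)
            return -1;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;              /* child is gone, tracing is over */
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;            /* one stop at entry, one at exit */
            pending_sig = 0;
        } else {
            pending_sig = WSTOPSIG(status);  /* pass other signals on */
        }
    }
    return stops;
}
```

Each completed syscall produces two stops, one on entry and one on exit, which is why a loop that prints on every wakeup reports most syscalls twice.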
The tracer wakes up, reads the ORIG_RAX register via PTRACE_PEEKUSER, translates the number using a lookup table (e.g., /usr/include/x86_64-linux-gnu/asm/unistd_64.h), and prints the syscall name.
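The translation step can be sketched with a small table instead of a hand-written switch. The names syscall_entry, syscall_table and syscall_name are hypothetical, and the four entries are only a sample; a real tool would cover the full list. Taking the numbers from the SYS_* constants in <sys/syscall.h> avoids hard-coding values, on the assumption that the tracer and the target are built for the same architecture.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_read, SYS_write, ... for this architecture */

/* One row of the number-to-name table. */
struct syscall_entry {
    long nr;
    const char *name;
};

/* Sample entries only; the constants come from the system headers. */
static const struct syscall_entry syscall_table[] = {
    { SYS_read,   "read"   },
    { SYS_write,  "write"  },
    { SYS_openat, "openat" },
    { SYS_close,  "close"  },
};

/* Return the name for a syscall number, or "unknown" if not listed. */
const char *syscall_name(long nr)
{
    size_t count = sizeof syscall_table / sizeof syscall_table[0];
    for (size_t i = 0; i < count; i++) {
        if (syscall_table[i].nr == nr)
            return syscall_table[i].name;
    }
    return "unknown";
}
```

The tracer would call syscall_name() with the value read from ORIG_RAX. A linear scan is enough for a handful of entries; with the full table, indexing an array directly by syscall number would be the more natural choice.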
4. Summary – The overall flow consists of three phases: (1) ptrace(ATTACH) to link tracer and target, (2) ptrace(SYSCALL) plus waitpid to wait for syscall events, and (3) reading ORIG_RAX and converting the number to a human‑readable name. Because strace pauses the target on each syscall, it introduces noticeable context‑switch overhead and should be used cautiously in production environments.
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.