Using backtrace to Diagnose Linux Program Crashes
This article explains common causes of unexpected program termination on Linux and shows how to use glibc's backtrace functions, together with signal handling, to capture and analyze stack traces, so that problems such as memory overflows, null pointer dereferences, and other runtime errors can be located precisely.
While developing on Linux, the author frequently ran into abrupt program terminations that were frustrating to track down. To spare others the same pain, this article shares a systematic method for locating and resolving such crashes with backtrace.
1. Common Causes of Abnormal Program Exit on Linux
Memory overflow (out of memory): the program requests more memory than the system can provide, often because a loop allocates without freeing.
Null‑pointer dereference: accessing a pointer that has not been assigned a valid address.
File I/O errors: trying to read/write a non‑existent file, lacking permissions, or encountering file locks.
Insufficient system resources: exhausting file descriptors, process slots, network connections, etc.
Logical errors: incorrect conditions that lead to unexpected execution paths and crashes.
System exceptions: hardware faults or kernel crashes that terminate the process.
2. What is backtrace?
2.1 Definition and Purpose
A backtrace is the function call stack captured when a program crashes or aborts. It provides a "footprint" of the execution path that helps pinpoint where the failure occurred.
2.2 Working Principle
The call stack records the return address and other context for each function call. When a crash happens, backtrace walks the stack frames from the current function back toward the entry point and collects the return addresses. These addresses are then resolved to function names and offsets and formatted for the developer.
2.3 Example Call Stack
Assume functions A → B → C, where C crashes. Backtrace starts at C, retrieves its return address (pointing to B), then moves to B, and finally to A, producing a readable stack trace.
3. backtrace Functions and Usage
glibc provides three functions, declared in <execinfo.h>, to obtain and format stack information.
int backtrace(void **buffer, int size)
This function fills buffer with up to size return addresses and returns the number actually captured.
#define BT_BUF_SIZE 100
void *buffer[BT_BUF_SIZE];
int nptrs = backtrace(buffer, BT_BUF_SIZE);
char **backtrace_symbols(void *const *buffer, int size)
Converts the addresses returned by backtrace into an array of printable strings (function name, offset, address). The array is allocated with malloc and must be freed by the caller; the function returns NULL on failure.
char **strings = backtrace_symbols(buffer, nptrs);
if (strings != NULL) {
    for (int i = 0; i < nptrs; i++) {
        printf("%s\n", strings[i]);
    }
    free(strings); /* free only the array, not the individual strings */
}
void backtrace_symbols_fd(void *const *buffer, int size, int fd)
Writes the formatted stack trace directly to the file descriptor fd. It does not call malloc, so it can be used in situations where memory allocation might fail.
3.2 Usage Notes
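(1) Compilation options
Compile with -g so that addresses can later be mapped to source lines, and link with -rdynamic so that function names are exported to the dynamic symbol table, where backtrace_symbols can find them. Avoid optimization settings that remove the frame pointer or reshape the call stack (e.g., -O2 -fomit-frame-pointer); with them the trace may be incomplete or misleading.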
(2) Special cases
Static functions are not exported, so their names may not appear in the trace.
Inline functions have no separate stack frame, making them invisible to backtrace.
Tail‑call optimization can collapse frames, causing missing information.
4. Capturing System Signals to Obtain a Stack Trace
Linux signals such as SIGSEGV (invalid memory access) or SIGFPE (arithmetic error, e.g. integer division by zero) can be intercepted with signal(). The handler can then call backtrace to dump the stack.
#include <execinfo.h> /* backtrace, backtrace_symbols */
#include <signal.h>   /* signal, SIGSEGV */
#include <stdio.h>    /* printf, perror */
#include <stdlib.h>   /* exit, free */
/* Print the call stack when a fatal signal arrives, then exit. */
void signal_handle(int signum) {
    void *buffer[100];
    char **strings;
    int nptrs;
    printf("\n==========> catch signal %d <==========\n", signum);
    nptrs = backtrace(buffer, 100);
    strings = backtrace_symbols(buffer, nptrs);
    if (strings == NULL) {
        perror("backtrace_symbols");
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < nptrs; i++) {
        printf("%s\n", strings[i]);
    }
    free(strings);
    exit(EXIT_FAILURE);
}
int main(void) {
    signal(SIGSEGV, signal_handle);
    int *ptr = NULL;
    *ptr = 10; // triggers SIGSEGV
    return 0;
}
5. backtrace Analysis Case Study
5.1 Test Code
The test program contains both a null-pointer dereference and a division by zero, each of which triggers a different signal.
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void signal_handle(int signum) { /* same as above */ }
void null_pointer_dereference(void) {
    int *ptr = NULL;
    *ptr = 10; // SIGSEGV
}
void division_by_zero(void) {
    int a = 10;
    volatile int b = 0; // volatile keeps the compiler from folding the division away
    int result = a / b; // SIGFPE
    (void)result;
}
int main(void) {
    signal(SIGSEGV, signal_handle);
    signal(SIGFPE, signal_handle);
    null_pointer_dereference();
    // division_by_zero(); // uncomment to test SIGFPE instead
    return 0;
}
Running the program prints a stack trace similar to:
==========> catch signal 11 <==========
Dump stack start...
./test(signal_handle+0x55) [0x400975]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7f0ff12af4b0]
./test(null_pointer_dereference+0x10) [0x400a40]
./test(main+0x2f) [0x400a9f]
...
Dump stack end...
Running addr2line -e test 0x400a40 -f maps the address to the source line (test.c:19), confirming the fault location.
5.2 Dynamic Library Example
This example shows how to compile a shared library with -g -rdynamic, link the executable with -Wl,-rpath=., and use the process's /proc/<pid>/maps file to convert runtime addresses into the offsets that addr2line expects. It also demonstrates extracting a function's entry address with objdump -d and calculating the final offset.
6. Full Summary
The debugging workflow consists of (1) recognizing typical crash causes such as memory overflow, null pointers, or resource exhaustion; (2) compiling with debugging symbols and without frame‑pointer‑eliminating optimizations; (3) registering signal handlers for SIGSEGV, SIGFPE, etc., and invoking backtrace inside the handler; and (4) translating raw addresses to source lines with addr2line (or similar tools) to pinpoint the exact location of the fault.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.