Master Core Dumps: From Generation to Debugging with GDB on Linux
This article explains what a Core Dump is, how Linux generates the ELF‑based snapshot when a program crashes, common causes such as memory errors or signal mishandling, essential system configurations, and step‑by‑step GDB techniques for analyzing and fixing the underlying bugs.
Ever encountered a program that runs fine locally but crashes online with a vague "Segmentation fault"? The key to solving such crashes is the Core Dump – a snapshot of a program’s memory, registers, and stack at the moment of failure, allowing precise bug location without guesswork.
1. What is a Core Dump?
1.1 Core Dump file introduction
On Linux, a Core Dump (core file) is created when a process receives a fatal signal (e.g., SIGSEGV, SIGABRT) and the kernel writes the process’s address space and state to disk. The file acts like a "snapshot" of the crash, essential for post‑mortem debugging.
The Core Dump file consists of four parts: the ELF header, the program header table, a NOTE segment, and one or more LOAD segments.
1.2 Kernel view of Core Dump generation
The generation process involves four steps:
Step 1: A hardware exception (e.g., an invalid memory access) or a software event such as abort() causes the kernel to send the process a fatal signal.
Step 2: The kernel stops the process instead of returning it to user space and starts writing the Core Dump.
Step 3: The Core Dump file is written, containing the virtual address space, CPU registers, thread info, and signal details.
Step 4: The kernel releases all resources and fully removes the process.
(1) Signal handling phase: do_signal
Before returning to user space, the kernel checks pending signals and calls do_signal to process them.
// Simplified sketch; the real kernel routine does considerably more.
static void fastcall do_signal(struct pt_regs *regs) {
    siginfo_t info;
    int signr;
    struct k_sigaction ka;
    sigset_t *oldset;
    // Get the signal to deliver
    signr = get_signal_to_deliver(&info, &ka, regs, NULL);
    if (signr > 0) {
        // Signal-specific handling, possibly generating a Core Dump
    }
}
If the signal requires a Core Dump, the kernel prepares the dump.
(2) Signal acquisition phase: get_signal_to_deliver
This function extracts a pending signal from the process’s queues and decides whether a Core Dump should be generated. The listing is a simplified sketch: helper names such as should_generate_coredump and prepare_coredump are illustrative, not actual kernel symbols.
int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
                          struct pt_regs *regs, void *cookie) {
    sigset_t *mask = &current->blocked;
    int signr = 0;
    // Try the per-thread queue first, then the queue shared by the process
    while ((signr = dequeue_signal(mask, &current->pending)) ||
           (signr = dequeue_signal(mask, &current->shared_pending))) {
        struct sigpending *pending;
        struct sigqueue *q;
        q = find_signal_queue(signr, &current->pending);
        if (!q)
            q = find_signal_queue(signr, &current->shared_pending);
        *info = q->info;
        *return_ka = current->sigaction[signr - 1];
        if (should_generate_coredump(signr)) {
            prepare_coredump();
        }
        return signr;
    }
    return 0;
}
(3) Memory information recording phase
After deciding to generate a dump, the kernel records the process’s memory layout. The Core Dump is an ELF file containing a PT_NOTE segment (per-thread register state and signal information, process information, the auxiliary vector, and the list of mapped files) and PT_LOAD segments (heap, stack, data, etc.).
2. Core Dump generation mechanism
2.1 Trigger conditions
Core Dumps are typically triggered by the following signals:
SIGSEGV (signal 11): illegal memory access such as null‑pointer dereference.
#include <stdio.h>
#include <stdlib.h>
int main() {
    int *ptr = NULL;
    *ptr = 10; // triggers SIGSEGV
    return 0;
}
SIGABRT (signal 6): abort() or a failed assert.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
int main() {
    int num = 0;
    assert(num > 0); // triggers SIGABRT
    return 0;
}
SIGFPE (signal 8): fatal arithmetic error such as integer division by zero.
#include <stdio.h>
int main() {
    int a = 10;
    int b = 0;
    int c = a / b; // triggers SIGFPE
    return 0;
}
2.2 Configuration points
Use ulimit -c unlimited to allow unlimited Core Dump size; the setting applies to the current shell session and the processes started from it.
Ensure the program’s working directory is writable.
If the program changes its effective UID/GID, set /proc/sys/fs/suid_dumpable to 1.
Adjust /proc/sys/kernel/core_pattern to control dump location and naming.
3. Common causes of Core Dumps
3.1 Memory access errors
Null or wild pointer dereference – accessing a null pointer triggers SIGSEGV.
#include <iostream>
int main() {
    int *ptr = nullptr;
    *ptr = 10; // SIGSEGV
    return 0;
}
Buffer overflow – writing past an array’s bounds can corrupt memory and cause a crash.
#include <stdio.h>
#include <string.h> // strcpy
int main() {
    char buffer[10];
    strcpy(buffer, "123456789012345"); // overflow
    return 0;
}
3.2 Improper signal handling
SIGSEGV, SIGABRT, SIGFPE – if not caught, the default action is to terminate and dump core.
3.3 Resource limits and configuration
If ulimit -c is 0 or disk space is insufficient, no Core Dump will be written.
3.4 Multithreading issues
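Race conditions – unsynchronized access to shared data can lead to crashes. The counter below only ends up with an unpredictable value, but the same kind of race on pointers or containers can corrupt memory and crash the process.
#include <iostream>
#include <thread>
int sharedVariable = 0;
void increment() {
    for (int i = 0; i < 1000; ++i) {
        sharedVariable++; // no lock → race
    }
}
int main() {
    std::thread t1(increment), t2(increment);
    t1.join(); t2.join();
    std::cout << "Final: " << sharedVariable << std::endl;
    return 0;
}
Deadlock – two threads that each wait for a mutex the other holds block forever. A deadlocked process hangs rather than crashes, so no Core Dump appears on its own; sending it a core‑dumping signal (for example kill -ABRT <pid>) or running gcore <pid> captures one for inspection.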
#include <chrono> // std::chrono::milliseconds
#include <iostream>
#include <thread>
#include <mutex>
std::mutex m1, m2;
void f1() {
    m1.lock();
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    m2.lock(); // waits for f2, which holds m2
    m2.unlock();
    m1.unlock();
}
void f2() {
    m2.lock();
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    m1.lock(); // waits for f1, which holds m1
    m1.unlock();
    m2.unlock();
}
int main() { std::thread t1(f1), t2(f2); t1.join(); t2.join(); return 0; }
3.5 Dynamic memory management errors
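Double free – freeing the same pointer twice is undefined behavior; glibc usually detects it and aborts the process with SIGABRT.
#include <stdio.h>
#include <stdlib.h>
int main() {
    int *ptr = (int *)malloc(sizeof(int));
    free(ptr);
    free(ptr); // double free → crash
    return 0;
}
Memory leak – continuous allocation without release can exhaust memory; the process then typically ends with an uncaught std::bad_alloc (SIGABRT) or is killed by the kernel’s OOM killer (SIGKILL, which does not produce a Core Dump).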
#include <iostream>
void leak() { while (true) { int *p = new int; } }
int main() { leak(); return 0; }
3.6 Program logic errors
Infinite recursion – eventually overflows the stack, which raises SIGSEGV.
#include <stdio.h>
void recur() { recur(); }
int main() { recur(); return 0; }
Uncaught exception – an exception that no handler catches reaches std::terminate, which calls abort() and raises SIGABRT.
#include <iostream>
void thrower() { throw 1; }
int main() {
    thrower(); // no try/catch, so the exception is never handled
    return 0;
}
3.7 Hardware problems
Faulty RAM can cause SIGSEGV‑like crashes.
CPU overheating or defects may also trigger crashes.
4. Core Dump analysis and debugging
4.1 Using GDB to analyze a Core Dump
Load the executable and core file:
gdb ./my_program core.1234
Use bt to view the backtrace, p to inspect variables, and info registers to see CPU state.
(gdb) bt
#0 func3 (arg1=0x7fffffffde10, arg2=42) at my_file.c:123
#1 0x00005555555552b5 in func2 (arg=0x7fffffffde10) at main.c:234
#2 0x0000555555555350 in main () at main.c:345
4.2 Check system configuration
Ensure ulimit -c is not zero:
ulimit -c
Set it to unlimited if needed:
ulimit -c unlimited
Verify write permission on the dump directory, e.g., /var/core:
ls -l /var/core
sudo chmod 777 /var/core
4.3 GDB loading Core Dump
gdb my_program core.1234
After loading, use where / bt, p, and info registers as needed.
4.4 Useful GDB commands
where / bt – display the call stack.
p VAR – print a variable’s value.
info registers – show CPU registers at the time of the crash.
5. Core Dump practical cases
5.1 Simple case analysis
#include <stdio.h>
#include <stdlib.h>
void func() {
    int *ptr = NULL;
    *ptr = 10; // line 5: write through a null pointer
}

int main() {
    func(); // line 9
    return 0;
}
Compile with -g, run, then analyze with GDB:
gdb test core.12345
(gdb) bt
#0 func () at test.c:5
#1 main () at test.c:9
(gdb) p ptr
$1 = (int *) 0x0
The backtrace shows the crash at line 5 of test.c, inside func, and p ptr confirms that ptr is null.
5.2 Complex multithreaded scenario
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define ARRAY_SIZE 100
int shared_array[ARRAY_SIZE];
pthread_mutex_t mutex;
void *write_thread(void *arg) {
    for (int i = 0; i < ARRAY_SIZE; i++) {
        pthread_mutex_lock(&mutex);
        shared_array[i] = i;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}
void *read_thread(void *arg) {
    for (int i = 0; i < ARRAY_SIZE; i++) {
        pthread_mutex_lock(&mutex);
        int value = shared_array[i];
        printf("Read value: %d at index %d\n", value, i);
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}
int main() {
    pthread_t w, r;
    pthread_mutex_init(&mutex, NULL);
    pthread_create(&w, NULL, write_thread, NULL);
    pthread_create(&r, NULL, read_thread, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}
After a crash, load the core file:
gdb multi_thread_test core.67890
(gdb) info threads
(gdb) thread 2 # switch to the read thread
(gdb) bt
#0 read_thread (arg=0x0) at multi_thread_test.c:20
#1 pthread_mutex_lock ()
#2 main () at multi_thread_test.c:28
(gdb) p i
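$1 = 120
In this illustrative session, i is 120, which exceeds ARRAY_SIZE (100) and points to an out‑of‑bounds read as the cause of the Core Dump. The loops in the listing above stop at ARRAY_SIZE, so a value like this means the crashing build used a different bound or that i itself was overwritten.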
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
