Analyzing Core Dumps in Linux PHP Production Environments: Tools, Causes, and Solutions
This article explains what core dumps are, where the term comes from, how they are generated, and how they affect services. It then walks through a Linux PHP case study, debugging step by step with strace, gdb, and ulimit, and ends with a root‑cause analysis and remediation recommendations.
From Practical Problems to Theory
A PHP script in a Linux production environment produced a core dump. A team of engineers used tools such as strace (system‑call tracing) and gdb (debugger) to locate and analyze the issue, reproducing the problem in a stable environment and ultimately fixing it.
Background and History
The term core originates from magnetic core memory, the dominant RAM technology from the 1950s to the 1970s. Early core dumps were printed on paper as octal or hexadecimal listings; later they were written to magnetic media and finally to formatted files.
Terminology
core : name derived from magnetic core memory.
dump : snapshot of memory contents, originally printed, later stored on tape or disk.
Key Tools
strace : traces system calls and signals of a process.
gdb : GNU debugger for inspecting core files and live processes.
Linux : open‑source Unix‑like operating system.
backtrace : the glibc backtrace(3) function, which lets a program capture its own call stack for self‑debugging.
Impact of Core Dumps
A process that dumps core is terminated, which interrupts the service it provides. The time needed to write the core file grows with the process's memory size; for processes using more than 60 GB, a full core can take over 15 minutes to write and may exhaust the disk.
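The disk-exhaustion risk can be checked before a large process is allowed to dump. The sketch below is a rough heuristic and not part of the original investigation: it compares a process's resident memory with the free space in the directory that receives core files. The function names core_fits and check_pid are hypothetical, and using VmRSS as a stand-in for the eventual core size is an assumption, since the real size depends on which mappings the kernel writes out.

```shell
#!/bin/sh
# core_fits RSS_KB FREE_KB
# Succeeds only when FREE_KB is strictly larger than RSS_KB.
core_fits() {
    [ "$2" -gt "$1" ]
}

# check_pid PID DIR  (hypothetical helper; Linux /proc layout assumed)
check_pid() {
    pid=$1
    dir=$2
    # The line of interest looks like "VmRSS:     1234 kB".
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null) || return 2
    [ -n "$rss_kb" ] || return 2
    # df -Pk prints POSIX-format output in 1024-byte units; column 4 is "Available".
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 {print $4}')
    if core_fits "$rss_kb" "$free_kb"; then
        echo "ok: ${rss_kb} kB resident, ${free_kb} kB free in $dir"
    else
        echo "risk: ${rss_kb} kB resident, only ${free_kb} kB free in $dir"
        return 1
    fi
}
```

For example, check_pid "$$" /tmp reports on the current shell against /tmp.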
Generating a Core File
Core files can be triggered by signals whose default action is to dump core, such as SIGBUS, SIGSEGV, SIGILL, SIGABRT, and SIGQUIT. SIGKILL and SIGSTOP cannot be caught, and their default actions (terminate and stop) do not produce a core.
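In scripts and cron logs, a signal death usually shows up only as an exit status. The sketch below illustrates the common shell convention that a child killed by signal n is reported as 128 + n. The helper name signal_exit_status is hypothetical, and the core size limit is set to 0 inside the subshell so that the demonstration leaves no core file behind.

```shell
#!/bin/sh
# signal_exit_status SIGNAL
# Starts a child, sends it SIGNAL, and prints the status reported by wait.
signal_exit_status() {
    (
        ulimit -c 0          # assumption: suppress the core file for this demo
        sleep 30 &
        pid=$!
        kill -s "$1" "$pid"
        status=0
        wait "$pid" || status=$?
        echo "$status"
    )
}

signal_exit_status SEGV   # 128 + 11
signal_exit_status KILL   # 128 + 9
```

The exit status alone does not say whether a core was written; that depends on the signal's default action and on the core size limit. SIGQUIT is not used here because a non-interactive shell starts background commands with SIGQUIT ignored.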
Pressing Ctrl+\ in a terminal sends SIGQUIT to the foreground process:
work@VM_28_112_centos:~/duan/coredump$ sleep 10
^\Quit (core dumped)
The gcore utility can also create a core from a running process:
work@VM_28_112_centos:~/duan/coredump$ sleep 10 &
[3] 5299
work@VM_28_112_centos:~/duan/coredump$ gcore 5299
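Saved corefile core.5299
Core File Format
A core file is an ELF file of type CORE. Its ELF header, shown here in the format printed by readelf -h, looks like this: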
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 ...
Class: ELF64
Data: 2's complement, little endian
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
...
Case Study: PHP Service Crash
In a scheduled task, a PHP 7.0.13 script repeatedly produced core files under /tmp. The ulimit -c limit was only 29 blocks (≈30 KB), so the core files were truncated, which made debugging difficult.
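The block arithmetic can be sketched as follows. This assumes the limit was read in bash, which reports ulimit -c in 1024-byte blocks (512-byte blocks in POSIX mode); the helper name core_limit_bytes is hypothetical.

```shell
#!/bin/sh
# core_limit_bytes BLOCKS [BLOCK_SIZE]
# Converts a "ulimit -c" value to bytes. BLOCK_SIZE defaults to 1024,
# which is an assumption about the shell that reported the limit.
core_limit_bytes() {
    blocks=$1
    block_size=${2:-1024}
    if [ "$blocks" = "unlimited" ]; then
        echo unlimited
    else
        echo $(( blocks * block_size ))
    fi
}

core_limit_bytes 29   # prints 29696
```

A limit of 29696 bytes is far smaller than the memory image of a PHP process, so every core written under it is cut off long before the stack data gdb needs.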
work@VM_28_112_centos:~/duan/coredump$ cat /proc/sys/kernel/core_pattern
/tmp/core-%e-%p
work@VM_28_112_centos:~/duan/coredump$ ulimit -a
core file size (blocks, -c) 29
...
Running environment:
CentOS 7
PHP 7.0.13 with Phalcon 3.2.4 C‑extension
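The core_pattern shown earlier, /tmp/core-%e-%p, determines the names of the core files. In core(5), %e stands for the executable name and %p for the PID of the dumped process. The sketch below expands only those two specifiers; the function name expand_core_pattern is hypothetical, and the sed-based substitution assumes the executable name contains no | or & characters.

```shell
#!/bin/sh
# expand_core_pattern PATTERN EXE PID
# Expands %e and %p only; other core_pattern specifiers are left untouched.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s|%e|$2|g" -e "s|%p|$3|g"
}

expand_core_pattern '/tmp/core-%e-%p' php 31859   # prints /tmp/core-php-31859
```

This is why a crash of the php binary with PID 31859 produces a file named /tmp/core-php-31859.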
Debugging steps:
Inspect the core with gdb:
gdb /usr/local/matrix/bin/php /tmp/core-php-31859
Program terminated with signal 11, Segmentation fault.
#0 0x00007f2f1e2e430b in ?? ()
Because the core was truncated, gdb shows no usable stack information.
Increase the core size limit:
sudo su
ulimit -c unlimited
Reproduce the crash:
/usr/local/matrix/bin/php /data0/www/htdocs/cli/cli.php history_deal_export_v2 main
Segmentation fault (core dumped)
Collect a full backtrace with gdb (after fixing the limit):
#0 zephir_create_instance.constprop.180()
#1 Phalcon\Di\FactoryDefault::get()
#2 ...
Use strace to capture the signal that ended the process:
20:38:09.833890 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x18} ---
SEGV_MAPERR means the faulting address, 0x18, is not mapped; an address this close to zero is consistent with dereferencing a field through a NULL pointer.
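The gdb and strace steps above can be wrapped in small helpers so they are repeatable. This is a minimal sketch, not the team's actual tooling: the function names collect_backtrace and trace_until_crash are hypothetical, while the flags used (gdb -batch and -ex, strace -f, -tt and -o) are standard options of those tools.

```shell
#!/bin/sh
# collect_backtrace BINARY CORE
# Prints the backtrace from CORE without starting an interactive gdb session.
collect_backtrace() {
    binary=$1
    core=$2
    command -v gdb >/dev/null 2>&1 || { echo "gdb not found" >&2; return 127; }
    [ -r "$core" ] || { echo "core file not readable: $core" >&2; return 1; }
    # -batch exits once the commands have run; "bt" prints the call stack.
    gdb -batch -ex "bt" "$binary" "$core"
}

# trace_until_crash LOGFILE COMMAND [ARGS...]
# Runs COMMAND under strace and writes the trace to LOGFILE.
trace_until_crash() {
    log=$1
    shift
    command -v strace >/dev/null 2>&1 || { echo "strace not found" >&2; return 127; }
    # -f follows child processes, -tt adds microsecond timestamps, -o sets the log file.
    strace -f -tt -o "$log" "$@"
}
```

With the paths from this case, the first helper would be called as collect_backtrace /usr/local/matrix/bin/php /tmp/core-php-31859. For the second, the log path is the caller's choice; searching the resulting log for SIGSEGV locates the line that records the fault.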
The backtrace showed the crash occurring in zephir_create_instance while accessing a constant that does not exist in the Phalcon 3.2.4 extension. The faulty code path was:
use Phalcon\Di\FactoryDefault\Cli as CliDI, Phalcon\Cli\Console as ConsoleApp;
// ...
$console->handle($arguments); // triggers Phalcon DI resolution
Further analysis of the Phalcon source revealed that zephir_create_instance throws a fatal error when a class name is not a string, but the bug manifested as a segmentation fault only when a class member referenced an undefined constant.
Conclusion and Recommendations
The root cause is a bug in Phalcon 3.2.4 where accessing an undefined constant in a class member triggers a core dump.
Upgrade to Phalcon 3.4.1 (or later), where the issue is fixed.
Set ulimit -c unlimited and an appropriate core_pattern in production so that full cores are captured, keeping in mind the disk‑space cost of large cores described earlier.
Use automated tests to catch such constant‑access bugs before deployment.
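The core-capture recommendation can be verified with a short check run in the same environment as the service. The sketch below is illustrative only: the function names describe_core_limit and report_core_settings are hypothetical, and it assumes the Linux /proc layout used elsewhere in this article.

```shell
#!/bin/sh
# describe_core_limit VALUE
# Turns the output of "ulimit -c" into a short human-readable verdict.
describe_core_limit() {
    case "$1" in
        unlimited) echo "full cores allowed" ;;
        0)         echo "core dumps disabled" ;;
        *)         echo "cores truncated at $1 blocks" ;;
    esac
}

# report_core_settings
# Prints the current shell's core limit and the kernel's core_pattern.
report_core_settings() {
    echo "ulimit -c: $(describe_core_limit "$(ulimit -c)")"
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core_pattern: not readable on this system"
    fi
}

report_core_settings
```

The limit is inherited per process, so the check is only meaningful when run from the same user and launcher (for example the same cron environment) as the PHP job.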
Acknowledgements
Special thanks to the internal reviewers and contributors who helped with the investigation.