Why Does Your C++ Program Crash? Unveiling Global Object Initialization Order
This article explores how C++ programs start, detailing the role of the _start entry point, the initialization of global objects via .init_array, __libc_csu_init, and constructors, and provides practical techniques—including linking with crt files and using init_priority—to control and debug initialization order and avoid crashes.
Background
In C++ development, the initialization order of global objects can cause crashes when objects in different object files or shared libraries reference each other before they are initialized. It is therefore essential to understand when global variables are initialized, in what order, and how that order can be controlled.
Test Environment
CPU architecture: x86_64
Operating system and libraries: Debian GLIBC 2.28-10
Compiled by GNU CC version 8.3.0
Is the program entry point main?
Simple example:
<code>struct C {
    C() {}  // <== set a breakpoint here
};
C c;
int main() { return 0; }</code>
When this is compiled and run under gdb, the call stack at the breakpoint shows that execution starts at _start, not main. The entry point is defined by the e_entry field of the ELF header:
e_entry: The virtual address to which the system first transfers control, thus starting the process.
Inspecting sections with info files in gdb reveals the entry point address, which is the address of _start, the program's entry code.
_start
The canonical entry point, usually the first thing in the text segment. It extracts arguments from the stack and calls __libc_start_main:
<code>_start:
    xorl %ebp, %ebp
    mov %RDX_LP, %R9_LP    /* address of shared library termination function */
    popq %rsi              /* argc */
    mov %RSP_LP, %RDX_LP   /* argv */
    pushq %rsp
    mov $__libc_csu_fini, %R8_LP
    mov $__libc_csu_init, %RCX_LP
    mov $main, %RDI_LP
    call *__libc_start_main@GOTPCREL(%rip)
    hlt                    /* crash if exit returns */</code>
__libc_start_main
This function registers destructors, calls global constructors, invokes main, and finally calls exit:
<code>STATIC int __libc_start_main(int (*main)(int, char**, char**),
                             int argc, char **argv,
                             __typeof(main) init,
                             void (*fini)(void),
                             void (*rtld_fini)(void),
                             void *stack_end) {
    char **ev = &argv[argc + 1];
    __environ = ev;
    if (fini) __cxa_atexit((void (*) (void *)) fini, NULL, NULL);
    if (init) (*init)(argc, argv, __environ);
    int result = main(argc, argv, __environ);
    exit(result);
}</code>
The init argument passed in by _start is __libc_csu_init, so that function runs before main is called.
__libc_csu_init
<code>void __libc_csu_init(int argc, char **argv, char **envp) {
#ifndef NO_INITFINI
    _init();
#endif
    const size_t size = __init_array_end - __init_array_start;
    for (size_t i = 0; i < size; i++)
        (*__init_array_start[i])(argc, argv, envp);
}</code>
It first calls _init (defined in the .init section) and then iterates over the function pointers stored in .init_array, which include the constructor wrappers for global objects.
.init and .init_array
_init is assembled from crti.o (its prologue) and crtn.o (its epilogue) and typically does very little on a modern toolchain. The real work for global objects is performed by the functions listed in .init_array. Each entry is a wrapper generated by the compiler (named _GLOBAL__sub_I_ followed by a symbol name) that eventually calls the object's constructor via __static_initialization_and_destruction_0.
.preinit_array
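Functions placed in .preinit_array run even earlier than those in .init_array. Only an executable may carry this section; it is not processed for shared objects. In a dynamically linked program the dynamic linker invokes these functions from _dl_init, after the process image is mapped and relocated but before any shared library's initialization functions and before control reaches the executable's _start.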
init_priority attribute
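GCC provides the init_priority attribute (values 101 to 65535) to control the relative order in which namespace-scope objects of class type are constructed. Lower numbers are initialized earlier: the compiler emits their wrappers into .init_array sections that carry the priority as a suffix, and the linker places them ahead of entries with no priority. The ordering holds across the object files linked into a single executable or shared object, but it does not affect the order between different shared libraries.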
Cross‑translation‑unit initialization
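When linking multiple object files or static libraries, the linker merges their .init_array sections into a single array. Sections that carry a priority suffix are sorted by that number (e.g., .init_array.200 before .init_array.201). Entries without a suffix come after them and typically follow the order in which the object files appear on the link line, not the order of the original source or any dependency between the objects.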
Static vs. dynamic linking
With static linking, the constructors of every object file end up in one combined .init_array, so one global object may be constructed before another that it depends on, leading to crashes. With dynamic linking, the dynamic linker runs each shared library's .init_array in dependency order, and only then are the main executable's globals initialized. Globals in the executable can therefore rely on globals in the libraries it links against, which avoids many ordering problems.
Avoiding the static initialization order fiasco
The safest approach is to avoid inter‑module global variables. Use the Construct‑on‑First‑Use idiom (lazy initialization) or wrap globals in functions that return a reference to a static local object. This ensures objects are initialized when first needed.
ByteDance SYS Tech
Focused on system technology, sharing cutting‑edge developments, innovation and practice, and analysis of industry tech hotspots.