
How Linux Executes a Hello World Program: Understanding ELF Format and the execve Loading Process

This article explains step by step how a simple Hello World program is compiled, linked, and executed on Linux, covering the ELF executable format, the role of the file command, the fork‑execve process creation sequence, and the kernel’s binary loader implementation.

Refining Core Development Skills

Hello everyone, I'm Feige! Today we explore how a program runs on Linux using the simplest Hello World example.

#include <stdio.h>
int main(){
    printf("Hello, World!\n");
    return 0;
}

After compiling with gcc main.c -o helloworld and running ./helloworld, the program prints "Hello, World!". What actually happens during compilation, loading, and execution?

1. Understanding the Executable File Format

The compiled binary is an ELF (Executable and Linkable Format) file. Using file helloworld we see:

helloworld: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), ...

ELF is the standard binary format on Linux, consisting of an ELF header, a Program Header Table, Sections, and a Section Header Table.

1.1 ELF Header

The ELF header contains overall file attributes. Using readelf --file-header helloworld we can view fields such as Magic, Class, Type, Entry point address, and sizes. Important fields include:

Magic – identifies the file as ELF.

Class – indicates ELF64.

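Type – EXEC for a fixed-address executable such as this one; a position-independent executable (the default on many current distributions) reports DYN instead.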

Entry point address – where execution starts (e.g., 0x401040).

Size of this header – 64 bytes.

Additional fields describe program and section headers.

Start of program headers – offset of the Program Header Table.

Number of program headers – total count (e.g., 11).

Start of section headers – offset of the Section Header Table.

Number of section headers – total count (e.g., 30).

1.2 Program Header Table

Program headers describe loadable segments. Sections such as .text, .data, and .bss are grouped into Segments based on required memory permissions (R, W, X). The kernel loads each PT_LOAD segment into memory.

1.3 Section Header Table

Section headers describe each section individually (e.g., .text, .data, .bss). Using readelf --section-headers helloworld we see 30 sections, each with address, offset, size, and flags (W, A, X, etc.). The .text section contains the code, and in this build its start address matches the entry point.

// Example of uninitialized memory in .bss
int data1; // .bss
// Initialized data in .data
int data2 = 100; // .data
// Code in .text
int main(void) {
    ...
}

1.4 Inspecting the Entry Point

Using nm -n helloworld we see that the entry point 0x401040 is the address of _start, not main. _start performs the C runtime initialization (through glibc's __libc_start_main) before main at 0x401126 is called.

2. Overview of User Process Creation

The shell loads a program via fork followed by execve. A simplified shell snippet:

int main(int argc, char *argv[]) {
    ...
    pid = fork();
    if (pid == 0) { // child process
        execve("helloworld", argv, envp);
    } else {
        ...
    }
    ...
}

The fork system call creates a new task_struct; execve replaces the child’s memory image with the new binary.

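In the kernel, fork is defined in kernel/fork.c and ultimately calls do_fork, which allocates a new task_struct and copies resources. (The function names here follow the older kernel source this article walks through; later kernels renamed do_fork, so the exact names vary by version.)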

SYSCALL_DEFINE0(fork) {
    return do_fork(SIGCHLD, 0, 0, NULL, NULL);
}

During do_fork, copy_process creates the new task_struct, copies the parent's open files, filesystem information, memory descriptor, and namespaces, and allocates a PID.

static struct task_struct *copy_process(...){
    struct task_struct *p = dup_task_struct(current);
    // copy files, fs, mm, namespaces
    pid = alloc_pid(p->nsproxy->pid_ns);
    p->pid = pid_nr(pid);
    p->tgid = p->pid;
    ...
}

3. Linux Executable Loader

Linux registers binary format handlers in a global formats list. The ELF loader is represented by a linux_binfmt structure:

struct linux_binfmt {
    int (*load_binary)(struct linux_binprm *);
    int (*load_shlib)(struct file *);
    int (*core_dump)(struct coredump_params *);
};

The ELF handler elf_format sets load_binary to load_elf_binary and is registered at boot:

static struct linux_binfmt elf_format = {
    .module = THIS_MODULE,
    .load_binary = load_elf_binary,
    .load_shlib = load_elf_library,
    .core_dump = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
};

static int __init init_elf_binfmt(void) {
    register_binfmt(&elf_format);
    return 0;
}

4. execve Loads a User Program

execve reads the filename, arguments, and environment, then creates a linux_binprm object:

SYSCALL_DEFINE3(execve, const char __user *, filename, ...){
    struct filename *path = getname(filename);
    do_execve(path->name, argv, envp);
    ...
}

do_execve is a thin wrapper around do_execve_common, which allocates and initializes linux_binprm, reads the first 128 bytes of the file, and calls search_binary_handler to find a suitable loader.

static int do_execve_common(const char *filename, ...){
    struct linux_binprm *bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
    // initialize bprm, count args, env, allocate stack page
    prepare_binprm(bprm); // read 128‑byte header
    search_binary_handler(bprm);
    ...
}

A single stack page (4 KB) is set up as a VMA whose vm_end is STACK_TOP_MAX, and the initial stack pointer, just below the top of that page, is stored in bprm->p.

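static int __bprm_mm_init(struct linux_binprm *bprm){
    struct vm_area_struct *vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
    // one page at the very top of the user address space
    vma->vm_end = STACK_TOP_MAX;
    vma->vm_start = vma->vm_end - PAGE_SIZE;
    ...
    // initial stack pointer: one pointer-sized slot below the top
    bprm->p = vma->vm_end - sizeof(void *);
    return 0;
}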

After preparation, search_binary_handler iterates over the formats list and invokes each loader’s load_binary until one succeeds.

int search_binary_handler(struct linux_binprm *bprm){
    struct linux_binfmt *fmt;
    int try, retval;
    for (try = 0; try < 2; try++) {
        list_for_each_entry(fmt, &formats, lh) {
            // each handler inspects bprm->buf; the ELF one is load_elf_binary
            int (*fn)(struct linux_binprm *) = fmt->load_binary;
            retval = fn(bprm);
            if (retval >= 0)
                return retval;
        }
    }
    return -ENOEXEC;
}

4.1 ELF Header Parsing

load_elf_binary reads the ELF header from bprm->buf and validates the magic number, file type, and architecture. On a mismatch it returns -ENOEXEC so that the next handler can be tried.

4.2 Program Header Reading

The number of program headers (e_phnum) determines how many elf_phdr structures are read into memory.

4.3 Clearing Inherited Resources

flush_old_exec(bprm) discards what the process inherited from its parent at fork time (memory mappings, signal handlers, etc.) and installs the new mm_struct and stack pointer.

4.4 Loading Segments

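For each PT_LOAD segment, elf_map creates a memory mapping of the segment's file range at the appropriate virtual address. The contents are not copied eagerly; pages are read in on demand when they are first touched. The loader also updates mm_struct fields such as start_code, end_code, start_data, and end_data.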

4.5 Data Segment & Heap Initialization

The data segment is expanded with set_brk, which page-aligns the start and end addresses, allocates virtual memory via vm_brk, and sets mm->start_brk and mm->brk to the same address just past the data segment, so the heap starts out empty.

4.6 Jump to Entry Point

If the ELF contains a PT_INTERP segment, the kernel also maps the dynamic linker it names (e.g., /lib64/ld-linux-x86-64.so.2), and the entry point becomes the interpreter's e_entry offset by the address at which the interpreter was loaded. Otherwise the ELF's own e_entry is used. Finally start_thread(regs, entry, bprm->p) transfers control to user space.

5. Summary

Although a Hello World program appears trivial, its execution involves a sophisticated chain: ELF file structure, kernel binary format registration, fork‑execve process creation, allocation of a new memory descriptor, stack setup, segment mapping, heap preparation, and finally jumping to the program’s entry point. Understanding these steps provides a solid foundation for debugging and optimizing Linux applications.

In practice, some shells and process launchers use vfork (or posix_spawn) instead of fork to avoid the cost of duplicating the parent's address space just before execve replaces it.

Written by

Refining Core Development Skills

Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.
