
Deep Dive into Node.js Architecture and Core Module Implementation

This article provides a comprehensive overview of Node.js’s architecture, including its composition, code structure, startup process, event loop phases, process and thread management, core modules such as Cluster, libuv thread pool, signal handling, file operations, and networking protocols, illustrating how the runtime integrates V8 and operating‑system features.

ByteDance Web Infra

1. Node.js Composition

Node.js mainly consists of V8, libuv and third‑party libraries:

libuv: a cross‑platform asynchronous I/O library that also provides process, thread, signal, timer, inter‑process communication and thread‑pool functionality.

Third‑party libraries: an asynchronous DNS resolver (c‑ares), an HTTP parser (http_parser in older versions, llhttp in newer ones), an HTTP/2 library (nghttp2), a compression library (zlib), a cryptography library (OpenSSL), and others.

V8: parses and executes JavaScript and lets the embedding program expose its own native extensions to scripts. That extension mechanism is what makes Node.js possible.

2. Node.js Code Architecture

The codebase is divided into three layers: JavaScript, C++ and C.

JavaScript: the modules that developers use directly (e.g., http, fs).

C++: three parts: a wrapper around libuv; modules that do not wrap libuv functionality, such as Buffer (although some, like the crypto APIs, borrow libuv’s thread pool); and the V8 integration layer.

C: code that wraps operating‑system functionality, such as TCP and UDP handling.

After understanding the composition and code architecture, we can look at the Node.js startup process.

3. Node.js Startup Process

3.1 Registering Built‑in C++ Modules

Node.js calls registerBuiltinModules, which invokes a series of registerXXX functions generated by macros in each C++ module. Each registration inserts a node into a linked list that represents the built‑in modules.

These modules are accessed via internalBinding (available only to Node.js’s own JavaScript) or process.binding (exposed to user code, now deprecated).

3.2 Environment Object and Context Binding

After registration, Node.js creates an Environment object that holds runtime‑wide data and binds it to V8’s Context, allowing native code to retrieve the environment through the context.

3.3 Initialising the Module Loader

Node.js loads the C++ module loader and executes loader.js, which wraps both the C++ and native JS loaders and stores them in the environment.

It then loads the native JS loader and runs run_main_module.js, which finally loads the user’s JavaScript entry point.

Assume the user code contains:

require('net')

require('./myModule')

When require is executed:

Node.js checks whether the requested module is a native JS module. If it is not, the user module is loaded directly.

If it is native, the native JS loader loads it, and any C++ binding the module relies on is loaded through internalBinding.
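The first decision above, whether a specifier names a native module, can be observed from user code. The following is a minimal sketch, assuming the public builtinModules list from node:module reflects what the internal loader treats as native; classifySpecifier is a hypothetical helper and not part of Node.js.

```javascript
const { builtinModules } = require('node:module');

// Hypothetical helper: reports which loading path a specifier would take.
function classifySpecifier(specifier) {
  // Built-ins may be requested with or without the "node:" prefix.
  const name = specifier.startsWith('node:') ? specifier.slice(5) : specifier;
  return builtinModules.includes(name) ? 'native' : 'user';
}

console.log(classifySpecifier('net'));        // prints "native"
console.log(classifySpecifier('./myModule')); // prints "user"
```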

3.4 Executing User Code and the libuv Event Loop

After loading, Node.js runs the user’s JavaScript. The code typically schedules tasks for the event loop, such as creating a TCP handle when a server is listening.

net.createServer(() => {}).listen(80)

4. Event Loop

The event loop consists of seven phases:

timer – handles timers using a binary heap where the earliest expiration is at the root.

pending – runs callbacks that an earlier Poll I/O phase produced but deferred.

idle, prepare, check – custom phases executed on every loop iteration; idle and prepare run before Poll I/O, and check runs immediately after it.

Poll I/O – processes network I/O, signals, thread‑pool tasks, etc.

closing – runs callbacks for handles that are being closed.
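The relationship between the Poll I/O, check and timer phases is visible from JavaScript. The sketch below assumes that an fs callback is delivered during Poll I/O and that setImmediate callbacks run in the check phase; observePhaseOrder is a hypothetical helper name.

```javascript
const fs = require('node:fs');

// Resolves with the order in which a 0 ms timer and an immediate fire
// when both are scheduled from inside an I/O callback.
function observePhaseOrder() {
  return new Promise((resolve) => {
    const order = [];
    const done = () => { if (order.length === 2) resolve(order); };
    // fs.stat completes on the thread pool; its callback runs in Poll I/O.
    fs.stat('.', () => {
      setTimeout(() => { order.push('timeout'); done(); }, 0);  // timer phase, next iteration
      setImmediate(() => { order.push('immediate'); done(); }); // check phase, this iteration
    });
  });
}

observePhaseOrder().then((order) => console.log(order.join(' -> ')));
// prints "immediate -> timeout"
```

Because check follows Poll I/O within the same iteration, the immediate always wins when both are scheduled from an I/O callback.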

4.1 Timer Phase

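Timers are stored in a binary heap. During the timer phase, libuv repeatedly inspects the root of the heap: while the root has expired, the node is removed and its callback is executed, and processing stops at the first node that has not yet expired. If a node has the repeat flag, it is re‑inserted for the next interval.

At the JavaScript level, Node.js maintains its own heap. Each heap node holds a linked list of timers that share the same relative timeout, a map indexes those lists by relative timeout, and the whole structure is backed by a corresponding low‑level timeout node.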
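The timer phase can be modelled in a few lines. This is an illustrative sketch, not libuv's code: TimerHeap, runTimers and the expiry, repeat and callback fields are hypothetical names, and times are plain numbers rather than real clock readings.

```javascript
// Minimal binary min-heap keyed on expiry time.
class TimerHeap {
  constructor() { this.nodes = []; }
  peek() { return this.nodes[0]; }
  push(timer) {
    const a = this.nodes;
    a.push(timer);
    let i = a.length - 1;
    while (i > 0) { // sift up until the parent expires no later than the child
      const parent = (i - 1) >> 1;
      if (a[parent].expiry <= a[i].expiry) break;
      [a[parent], a[i]] = [a[i], a[parent]];
      i = parent;
    }
  }
  pop() {
    const a = this.nodes;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) { // sift down towards the smaller child
        const l = 2 * i + 1, r = l + 1;
        let min = i;
        if (l < a.length && a[l].expiry < a[min].expiry) min = l;
        if (r < a.length && a[r].expiry < a[min].expiry) min = r;
        if (min === i) break;
        [a[min], a[i]] = [a[i], a[min]];
        i = min;
      }
    }
    return top;
  }
}

// One timer phase: drain expired roots, re-arming repeating timers first.
function runTimers(heap, now) {
  while (heap.peek() && heap.peek().expiry <= now) {
    const timer = heap.pop();
    if (timer.repeat > 0) heap.push({ ...timer, expiry: now + timer.repeat });
    timer.callback();
  }
}

const heap = new TimerHeap();
const fired = [];
heap.push({ expiry: 30, repeat: 0, callback: () => fired.push('c') });
heap.push({ expiry: 10, repeat: 25, callback: () => fired.push('a') });
heap.push({ expiry: 20, repeat: 0, callback: () => fired.push('b') });
runTimers(heap, 20);
console.log(fired, heap.peek().expiry); // [ 'a', 'b' ] 30
```

The loop never scans the whole heap: it stops as soon as the root lies in the future, which is what makes the earliest‑at‑root layout worthwhile.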

4.2 check / idle / prepare Phases

Each of these phases maintains a queue; callbacks in the queue are executed but the nodes remain in the queue until explicitly removed.

4.3 pending / closing Phases

Both phases also use queues; callbacks are executed and the corresponding nodes are then removed.

4.4 Poll I/O Phase

The core data structure is an I/O observer that wraps a file descriptor, the events of interest and a callback, similar to epoll’s epoll_event. When a descriptor is added, an observer is created and inserted into libuv’s internal list via uv__io_start. During the Poll phase, libuv walks the observer list, registers the descriptors with epoll, waits for events, and dispatches the callbacks of the descriptors that are ready. When the wait has to block, its timeout is set to the time remaining until the nearest timer expires, so that timers still fire on time.

5. Processes and Inter‑Process Communication

5.1 Creating Processes

Node.js creates processes using a fork‑plus‑exec model. Fork copies the parent’s memory; exec loads a new program. Both asynchronous and synchronous creation modes are provided.

Asynchronous creation spawns a child that runs independently while the parent records the child’s metadata.

Synchronous creation blocks the parent: Node.js allocates a new, temporary event‑loop structure, spawns the child on it, and runs that loop until the child exits, so the parent’s original loop makes no progress in the meantime.
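Both creation modes are exposed through the child_process module. The sketch below runs a one‑line script in a child Node.js process each way; runSync and runAsync are hypothetical helper names, and the child is started from process.execPath so that no external program is assumed.

```javascript
const { spawn, spawnSync } = require('node:child_process');

// Synchronous creation: the call returns only after the child has exited.
function runSync(code) {
  const result = spawnSync(process.execPath, ['-e', code], { encoding: 'utf8' });
  return result.stdout.trim();
}

// Asynchronous creation: the parent keeps running and is told about the exit later.
function runAsync(code) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', code]);
    let out = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', () => resolve(out.trim()));
  });
}

console.log(runSync('console.log("sync child")'));
runAsync('console.log("async child")').then((out) => console.log(out));
```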

5.2 Inter‑Process Communication

On Unix‑like systems, Node.js uses Unix domain sockets for IPC because they support file‑descriptor passing. The implementation creates a socketpair and keeps one descriptor in the parent, which wraps it with send and on('message'). The child inherits the other descriptor and learns its number through an environment variable (NODE_CHANNEL_FD), then wraps it with matching send and receive functions.
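From user code, the channel is requested by adding an 'ipc' entry to the child's stdio configuration. The following is a minimal sketch: echoThroughChild and childSource are hypothetical names, and the child program is passed inline with -e so the example needs no second file.

```javascript
const { spawn } = require('node:child_process');

// Child program: echo every message back to the parent.
const childSource = `
  process.on('message', (msg) => process.send({ echoed: msg }));
`;

function echoThroughChild(message) {
  return new Promise((resolve, reject) => {
    // The fourth stdio entry asks Node.js to set up the IPC channel.
    const child = spawn(process.execPath, ['-e', childSource], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.on('error', reject);
    child.on('message', (reply) => {
      child.disconnect(); // closing the channel lets the child exit
      resolve(reply);
    });
    child.send(message);
  });
}

echoThroughChild('ping').then((reply) => console.log(reply)); // { echoed: 'ping' }
```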

6. Threads and Thread‑Based Communication

6.1 Thread Architecture

Although Node.js is single‑threaded at the JavaScript level, it supports worker threads. Each worker has its own event loop but shares libuv’s thread pool.

6.2 Creating a Worker Thread

When new Worker(...) is called, the main thread creates two communication structures, queues a message that tells the worker which JS file to load, and asks the operating system to spawn a thread. The new thread initialises its own environment, reads the message, loads the script and enters its own event loop.
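The sequence above is triggered by a single constructor call. This sketch passes the worker's script as source text through the eval option, purely to keep the example in one file; doubleInWorker and workerSource are hypothetical names.

```javascript
const { Worker } = require('node:worker_threads');

// Script the worker thread loads; with eval: true it is passed as source text.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  parentPort.postMessage(workerData.value * 2);
`;

function doubleInWorker(value) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { value } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

doubleInWorker(21).then((result) => console.log(result)); // prints 42
```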

6.3 Thread‑Based Communication

Communication uses MessageChannel, which consists of two linked MessagePort objects. A message is serialized, inserted into the opposite port’s queue under a lock, and the receiving thread is notified. The receiving thread processes the message during its Poll I/O phase.
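The queueing and serialization can be seen even within a single thread. The sketch below uses receiveMessageOnPort to dequeue synchronously instead of waiting for the event loop, which is a convenience for the demonstration rather than the usual way to receive; sendAndReceive is a hypothetical helper.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

// Sends a value through a fresh channel and returns what the other port receives.
function sendAndReceive(value) {
  const { port1, port2 } = new MessageChannel();
  // postMessage serializes the value and queues it on the opposite port.
  port1.postMessage(value);
  // Dequeue one message synchronously, if any is waiting.
  const entry = receiveMessageOnPort(port2);
  port1.close(); // closing one port closes both ends
  return entry ? entry.message : undefined;
}

const original = { task: 'resize', size: [640, 480] };
const received = sendAndReceive(original);
console.log(received);              // { task: 'resize', size: [ 640, 480 ] }
console.log(received === original); // false: the receiver gets a copy
```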

7. Cluster Module (Multi‑Process Scaling)

Node.js is single‑process by default; the cluster module enables a multi‑process server model. Two modes exist: master‑accept (master accepts connections and distributes them) and worker‑accept (workers share the listening socket).

The overall flow is:

Master calls fork to create workers.

Workers start a server; the master either creates a listening socket and passes file descriptors to workers (master‑accept) or lets workers share the socket (worker‑accept).

7.1 Master‑Accept

The master creates a listening socket, accepts connections, and distributes the file descriptors to workers.

7.2 Worker‑Accept

The master creates a socket, binds it, but does not listen; the socket is passed to a worker, which then accepts connections.
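In the public API the two modes are selected through cluster.schedulingPolicy. The sketch below assumes that the round‑robin policy (SCHED_RR) corresponds to master‑accept and that SCHED_NONE corresponds to worker‑accept; policyFor, startPrimary, startWorker and port 8000 are hypothetical choices for illustration.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');

// Maps this article's mode names onto the cluster module's scheduling constants.
function policyFor(mode) {
  return mode === 'master-accept' ? cluster.SCHED_RR : cluster.SCHED_NONE;
}

function startPrimary(workerCount, mode) {
  // Must be set before the first fork; it cannot be changed afterwards.
  cluster.schedulingPolicy = policyFor(mode);
  for (let i = 0; i < workerCount; i++) cluster.fork();
}

function startWorker(port) {
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(port);
}

// Typical entry point (left commented out so loading this file starts nothing):
// if (cluster.isPrimary) startPrimary(2, 'master-accept');
// else startWorker(8000);
```

Every worker calls listen on the same port; the cluster module intercepts the call and arranges the sharing described above, so the worker code is identical in both modes.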

8. libuv Thread Pool

Tasks that are I/O‑bound, DNS‑bound or CPU‑intensive are off‑loaded to libuv’s thread pool. The pool maintains a task queue; worker threads pull tasks, execute them, then push the completed task into the main loop’s completed‑task queue and notify the main thread via libuv’s async mechanism.
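A CPU‑intensive example is key derivation: crypto.pbkdf2 runs on the thread pool and reports back through a callback on the main thread. The sketch below queues several derivations at once; deriveKey and deriveMany are hypothetical helpers, and the salt and iteration count are arbitrary demonstration values. The pool has four threads by default and can be resized with the UV_THREADPOOL_SIZE environment variable.

```javascript
const crypto = require('node:crypto');

// Each pbkdf2 call becomes one task on libuv's thread pool; the callback
// runs back on the main thread once a pool thread has finished the work.
function deriveKey(password) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, 'demo-salt', 50000, 32, 'sha256', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

// All tasks are queued immediately; pool threads pick them up in parallel.
function deriveMany(passwords) {
  return Promise.all(passwords.map(deriveKey));
}

deriveMany(['a', 'b', 'c', 'd']).then((keys) => console.log(keys.length)); // prints 4
```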

9. Signal Handling

At the operating‑system level, a process’s signals are represented by a long integer, used as a bitmask, and an array of handlers. libuv maintains a red‑black tree of signal handles. When a signal is registered, a node is inserted; the first node causes libuv to create an I/O observer registered with epoll. On signal delivery, libuv looks up the handle in the tree and schedules the callback in the main thread’s Poll phase.

At the JavaScript level, Node.js relies on the newListener hook: when process.on('SIGINT', handler) is called, startListeningIfSignal creates the red‑black‑tree node, while the events module stores the subscription itself.
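The newListener technique works on any EventEmitter. The sketch below is a stand‑in for the process object, not Node.js's own code: SignalEmitter and its watched set are hypothetical, and adding a name to the set stands in for creating the low‑level signal handle.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for the process object.
class SignalEmitter extends EventEmitter {
  constructor() {
    super();
    this.watched = new Set(); // signals that have a low-level subscription
    // 'newListener' fires before a listener is added, for every event name.
    this.on('newListener', (eventName) => {
      const isSignal = typeof eventName === 'string' && eventName.startsWith('SIG');
      if (isSignal && !this.watched.has(eventName)) {
        this.watched.add(eventName); // stands in for creating the signal handle
      }
    });
  }
}

const proc = new SignalEmitter();
proc.on('SIGINT', () => {});
proc.on('SIGINT', () => {}); // second listener: no new low-level subscription
proc.on('exit', () => {});   // not a signal: ignored by the hook
console.log([...proc.watched]); // [ 'SIGINT' ]
```

The low‑level subscription is created lazily and only once per signal, while the emitter itself keeps track of every individual listener.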

10. File Operations and Watching

10.1 File Operations

Node.js offers synchronous (blocking) and asynchronous (non‑blocking) file APIs. Asynchronous calls are delegated to the libuv thread pool so the main thread remains responsive.
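The two API styles look like this side by side. A minimal sketch: demoFile is a hypothetical scratch file created in the system temporary directory, and readBlocking and readNonBlocking are illustrative helper names.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Scratch file for the demonstration, removed again when the process exits.
const demoFile = path.join(os.tmpdir(), `node-fs-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, 'hello');
process.on('exit', () => fs.rmSync(demoFile, { force: true }));

// Synchronous: the main thread blocks until the read completes.
function readBlocking(file) {
  return fs.readFileSync(file, 'utf8');
}

// Asynchronous: the read runs on the thread pool; the promise settles later.
function readNonBlocking(file) {
  return fs.promises.readFile(file, 'utf8');
}

console.log(readBlocking(demoFile)); // prints "hello"
readNonBlocking(demoFile).then((text) => console.log(text)); // prints "hello"
```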

10.2 File Watching

Two modes exist: polling (implemented with a timer that periodically checks file metadata) and inotify‑based subscription (efficient, event‑driven). The inotify implementation creates an inotify instance, registers the watched paths with it, and stores the callbacks in a red‑black tree; when events arrive, libuv’s Poll phase dispatches the appropriate callbacks.
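In the fs module the polling mode is fs.watchFile and the subscription mode is fs.watch. The sketch below starts both on the same path; watchBothWays is a hypothetical helper, and the 500 ms interval is an arbitrary demonstration value.

```javascript
const fs = require('node:fs');

// Starts both kinds of watcher on one path and returns a function that stops them.
function watchBothWays(file, onChange) {
  // Polling: a timer re-stats the file every `interval` milliseconds.
  const onPoll = (curr, prev) => {
    if (curr.mtimeMs !== prev.mtimeMs) onChange('poll');
  };
  fs.watchFile(file, { interval: 500 }, onPoll);

  // Subscription: the operating system notifies us (inotify on Linux).
  const watcher = fs.watch(file, () => onChange('notify'));

  return () => {
    fs.unwatchFile(file, onPoll);
    watcher.close();
  };
}

// Usage (the path is hypothetical):
// const stop = watchBothWays('/tmp/example.txt', (source) => console.log(source));
// ...later: stop();
```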

11. TCP Server Implementation

Calling http.createServer(cb).listen(port) performs the following steps:

Obtain a socket.

Bind the address to the socket.

Mark the socket as listening.

Register the socket with epoll to await incoming connections.

When a connection arrives, Node.js:

Calls accept to retrieve the TCP connection.

Creates a C++ object representing the connection.

Creates a corresponding JavaScript object for user‑level interaction.

Registers a readable event to receive data from the client.

If the single_accept flag is set, Node.js (through libuv) sleeps briefly after accepting a connection so that other processes get a chance to accept new connections, which improves load distribution.
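The whole sequence can be exercised with a small echo server. This sketch uses the net module rather than http to stay at the TCP level; echoOnce is a hypothetical helper, and port 0 asks the operating system for any free port.

```javascript
const net = require('node:net');

// Starts an echo server on an ephemeral port, sends one message, returns the reply.
function echoOnce(message) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => {
      // One socket object per accepted connection; pipe() subscribes to its
      // incoming data and writes it straight back to the client.
      socket.pipe(socket);
    });
    server.on('error', reject);
    // listen() obtains the socket, binds it, marks it as listening and
    // registers it with the event loop.
    server.listen(0, '127.0.0.1', () => {
      const client = net.connect(server.address().port, '127.0.0.1');
      let reply = '';
      client.setEncoding('utf8');
      client.on('error', reject);
      client.on('data', (chunk) => { reply += chunk; });
      client.on('end', () => server.close(() => resolve(reply)));
      client.end(message); // send the message, then half-close
    });
  });
}

echoOnce('hello').then((reply) => console.log(reply)); // prints "hello"
```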

12. UDP Sending Process

When sending a UDP packet, libuv enqueues the data, registers a writable event with epoll, and upon the event, iterates the queue to send each packet. After successful transmission, the packet is moved to a completed‑queue and a pending‑phase callback notifies the user.
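From JavaScript, the send path is driven by the dgram module. The sketch below sends one datagram over the loopback interface; sendDatagram is a hypothetical helper, and both sockets are created locally so no external peer is assumed.

```javascript
const dgram = require('node:dgram');

// Sends one datagram to a local receiver and resolves with what arrived.
function sendDatagram(text) {
  return new Promise((resolve, reject) => {
    const receiver = dgram.createSocket('udp4');
    const sender = dgram.createSocket('udp4');
    receiver.on('error', reject);
    receiver.on('message', (msg) => {
      receiver.close();
      resolve(msg.toString());
    });
    receiver.bind(0, '127.0.0.1', () => {
      const { port } = receiver.address();
      // send() queues the packet; the callback fires once it has been
      // handed to the operating system (or has failed).
      sender.send(text, port, '127.0.0.1', (err) => {
        sender.close();
        if (err) reject(err);
      });
    });
  });
}

sendDatagram('hello').then((text) => console.log(text)); // prints "hello"
```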

13. DNS Resolution

The APIs that map a host name to an address (or back) rely on blocking system calls, so they are off‑loaded to the libuv thread pool. Node.js submits a task, a worker thread calls the system resolver (getaddrinfo or getnameinfo), and on completion the main thread is notified; the result is then converted and delivered to the JavaScript callback.

Other DNS queries go through c‑ares, which implements the DNS protocol itself and does not need the thread pool: Node.js initialises c‑ares, registers a socket‑change callback, and lets c‑ares create a socket for each query; that socket is registered with epoll and the response is processed in the Poll phase.
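The two resolution paths correspond to different functions in the dns module: dns.lookup uses the system resolver on the thread pool, while the dns.resolve family sends queries through c‑ares. A minimal sketch follows; lookupHost and resolveHost are hypothetical helpers, and resolveHost needs a reachable DNS server, so only the lookup is run here.

```javascript
const dns = require('node:dns');
const net = require('node:net');

// dns.lookup: runs the system resolver on the thread pool.
function lookupHost(hostname) {
  return dns.promises.lookup(hostname).then(({ address }) => address);
}

// dns.resolve4: sends the DNS query itself through c-ares; no thread pool involved.
function resolveHost(hostname) {
  return dns.promises.resolve4(hostname);
}

lookupHost('localhost').then((address) => {
  console.log(net.isIP(address) !== 0); // true when the lookup succeeds
});
```

Because lookups occupy pool threads, a burst of slow dns.lookup calls can delay file and crypto work that shares the pool, whereas dns.resolve4 does not compete for those threads.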

14. Conclusion

This article presented a top‑down view of Node.js’s implementation, covering its composition, code architecture, startup sequence, event‑loop mechanics, process and thread models, core modules such as Cluster and libuv, as well as signal, file, TCP, UDP and DNS handling. Understanding these internals helps developers use Node.js more effectively.

For more details, see: https://github.com/theanarkh/understand-nodejs
