
An Overview of IPFS: Architecture, IPLD, and libp2p

The article explains IPFS as a peer‑to‑peer, content‑addressed storage system built on a five‑layer architecture, detailing how the Merkle‑DAG, Kademlia routing, IPLD’s multiformats, and libp2p’s modular network stack work together to enable decentralized, tamper‑resistant data storage and retrieval, complementing blockchain.

Meitu Technology

IPFS and blockchain are closely related: as blockchains evolve, their demand for data storage keeps growing. This article examines IPFS starting from its low‑level design and refers to the source code to analyze technical details.

Overview

IPFS stores large data off‑chain and uses cryptographic hashing to make tampering detectable: any change to the data changes its hash. Only the hash (the content identifier) is recorded on the blockchain, which satisfies storage requirements while keeping on‑chain costs low. Readers should be familiar with network programming, distributed storage, and basic blockchain concepts.

What is IPFS?

IPFS is a protocol designed to create a persistent, distributed storage and file‑sharing network. It can be viewed as a file system that supports peer‑to‑peer transfer.

Storage method: Distributed storage; files are split into blocks, each block gets a unique hash‑derived ID, and multiple copies may exist on different nodes for efficiency.

Content‑addressing: Each block’s unique ID allows retrieval of the block directly from the network.

To organize these blocks into logical files, IPFS uses a Merkle‑DAG (a combination of Merkle trees and directed acyclic graphs). This structure provides content addressing, tamper resistance, and deduplication.
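The chunking, content addressing, and deduplication described above can be illustrated with a minimal sketch. It assumes a plain dict as the block store, hex SHA‑256 digests as block IDs, and a JSON root node that lists its children; the names CHUNK_SIZE, add_file, and read_file are hypothetical, and real IPFS uses its own chunker and node encoding rather than this format.

```python
import hashlib
import json

CHUNK_SIZE = 4  # tiny on purpose; an illustrative value, not an IPFS default


def block_id(data: bytes) -> str:
    """Content-derived ID: here simply the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def add_file(data: bytes, store: dict) -> str:
    """Split data into blocks, store each under its hash, return the root ID."""
    links = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        cid = block_id(chunk)
        store[cid] = chunk  # identical chunks map to the same key (deduplication)
        links.append(cid)
    # The root node only lists child hashes, so its own hash depends on all of them.
    root = json.dumps({"links": links}).encode()
    root_id = block_id(root)
    store[root_id] = root
    return root_id


def read_file(root_id: str, store: dict) -> bytes:
    """Follow links from the root and verify every block against its ID."""
    root = store[root_id]
    if block_id(root) != root_id:
        raise ValueError("root node does not match its ID")
    out = b""
    for cid in json.loads(root)["links"]:
        chunk = store[cid]
        if block_id(chunk) != cid:
            raise ValueError("block does not match its ID")
        out += chunk
    return out


store = {}
root_a = add_file(b"abcdabcdabcd", store)  # three identical chunks, stored once
root_b = add_file(b"abcdabcdabcX", store)  # shares the "abcd" chunk with root_a
```

Because a parent's ID is computed over its children's IDs, changing any block changes every ID on the path up to the root, which is where the tamper resistance comes from.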

Kademlia (KAD) Algorithm

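KAD uses XOR to compute the distance between IDs, so node IDs and object IDs are addressed in the same key space. XOR is a valid metric: the distance from a key to itself is zero, the distance between two different keys is always positive, the distance is symmetric (d(x, y) = d(y, x)), and it satisfies the triangle inequality. Through KAD, IPFS assigns responsibility for a block's key to the nodes that are “close” to it in XOR space; in the IPFS DHT those nodes hold the records that say which peers provide the block, which is how lookup is distributed across the network.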
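As a small illustration of the XOR metric, the sketch below checks its properties exhaustively on a 4‑bit key space and then picks the nodes closest to a target key. Hashing names with SHA‑256 to obtain keys, and the helper names key_of and closest, are assumptions made for the example.

```python
import hashlib
from itertools import product


def key_of(name: bytes) -> int:
    """Map any identifier (node or content) into one 256-bit key space."""
    return int.from_bytes(hashlib.sha256(name).digest(), "big")


def distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR interpreted as an unsigned integer."""
    return a ^ b


def closest(target: int, nodes: list, k: int) -> list:
    """Return the k node keys nearest to target in XOR space."""
    return sorted(nodes, key=lambda n: distance(n, target))[:k]


# Exhaustively check the metric properties on a tiny 4-bit key space.
keys = range(16)
identity = all(distance(x, x) == 0 for x in keys)
positive = all(distance(x, y) > 0 for x, y in product(keys, keys) if x != y)
symmetric = all(distance(x, y) == distance(y, x) for x, y in product(keys, keys))
triangle = all(
    distance(x, z) <= distance(x, y) + distance(y, z)
    for x, y, z in product(keys, keys, keys)
)

# Node and content identifiers share one key space, so one lookup serves both.
node_keys = [key_of(b"node-%d" % i) for i in range(50)]
nearest = closest(key_of(b"some block"), node_keys, k=3)
```

For any target there is exactly one key at each distance, so the "closest" ordering never has ties.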

IPFS System Architecture

The architecture consists of five layers:

Naming – a PKI‑based namespace.

MerkleDAG – the internal logical data structure.

Exchange – the protocol for block data exchange between nodes.

Routing – implements node and object addressing.

Network – encapsulates P2P communication and transport.

From a data perspective, IPFS is divided into two major modules:

IPLD (InterPlanetary Linked Data) – defines data models and enables cross‑domain data interoperability.

libp2p – a modular network stack that handles data transport.

IPLD Details

IPLD provides a unified data model that resembles JSON and extends it with raw bytes and links, along with serialization formats, selectors (similar to CSS selectors), and transformation tools. It relies on a set of multiformats:

multihash – self‑describing hash (function code, length, digest).

multiaddr – self‑describing network address format.

multibase – encoding formats (binary, octal, decimal, hexadecimal, base58btc, base64, etc.).

multicodec – self‑describing codec identifiers.

multistream – self‑describing streams: the data is prefixed with a multicodec that identifies how the following bytes are encoded. For example, a sender creates a buffer, prepends a protobuf prefix, and then transmits it.
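To make the self‑describing idea concrete, here is a minimal sketch of a multihash and of multibase prefixes. It handles only sha2‑256 (function code 0x12) and three bases, and it assumes the code and length each fit in a single byte; the function names are hypothetical and not taken from any multiformats library.

```python
import base64
import hashlib

SHA2_256 = 0x12  # multihash function code for sha2-256


def multihash_sha256(data: bytes) -> bytes:
    """<function code><digest length><digest>; both prefixes fit in one byte here."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256, len(digest)]) + digest


def parse_multihash(mh: bytes) -> tuple:
    """Split a multihash whose code and length are single-byte varints."""
    code, length, digest = mh[0], mh[1], mh[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match the declared length")
    return code, length, digest


def multibase_encode(base: str, data: bytes) -> str:
    """Prefix the encoded text with the character that names the encoding."""
    if base == "base16":
        return "f" + data.hex()
    if base == "base32":
        return "b" + base64.b32encode(data).decode().lower().rstrip("=")
    if base == "base64":
        return "m" + base64.b64encode(data).decode().rstrip("=")
    raise ValueError("unsupported base in this sketch")


mh = multihash_sha256(b"hello")
code, length, digest = parse_multihash(mh)
as_hex = multibase_encode("base16", mh)  # begins with "f1220"
```

A reader of such a value needs no out‑of‑band agreement: the first character names the text encoding, and the first two bytes name the hash function and digest size.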

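The Content Identifier (CID) combines these multiformats. CIDv0 is the legacy format, kept for backward compatibility: it is a bare multihash encoded in base58btc, with the codec implicitly protobuf‑mdag (dag‑pb). CIDv1 makes every part explicit, carrying multibase, version, multicodec, and multihash fields, which offers greater flexibility.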
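The two CID layouts can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: it hashes the raw bytes directly with sha2‑256, writes the version and codec as single bytes (valid only for codes below 0x80), and uses base32 for CIDv1. The resulting strings will generally not match what an IPFS node prints for the same input, since a node chunks and wraps file data before hashing.

```python
import base64
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
DAG_PB = 0x70  # multicodec code for dag-pb (MerkleDAG protobuf)
RAW = 0x55     # multicodec code for raw bytes


def base58btc(data: bytes) -> str:
    """Plain base58btc encoding (Bitcoin alphabet), without a multibase prefix."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by the first alphabet character.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def sha256_multihash(data: bytes) -> bytes:
    """sha2-256 multihash: code 0x12, length 0x20, then the digest."""
    digest = hashlib.sha256(data).digest()
    return bytes([0x12, len(digest)]) + digest


def cid_v0(block: bytes) -> str:
    """CIDv0: just the base58btc multihash; base and codec are implied."""
    return base58btc(sha256_multihash(block))


def cid_v1(block: bytes, codec: int = RAW) -> str:
    """CIDv1: <multibase prefix><version><multicodec><multihash>, here in base32."""
    body = bytes([0x01, codec]) + sha256_multihash(block)
    return "b" + base64.b32encode(body).decode().lower().rstrip("=")


block = b"hello ipfs"
v0 = cid_v0(block)               # 46 characters, starts with "Qm"
v1_raw = cid_v1(block)           # starts with "bafkrei"
v1_pb = cid_v1(block, DAG_PB)    # starts with "bafybei"
```

The fixed prefixes follow from the fixed leading bytes: every CIDv0 begins with 0x12 0x20, and a base32 CIDv1 begins with the version and codec bytes, so the familiar "Qm" and "bafy" openings are a side effect of the encoding rather than separate markers.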

libp2p Details

libp2p defines a modular network stack consisting of:

Peer Routing – currently KAD routing and MDNS routing.

Swarm – handles transport, connection, and stream multiplexing.

Distributed Record Store – stores key‑value records (e.g., IPNS name publishing).

Discovery – supports bootstrap, random‑walk, and MDNS methods.

Routing uses K‑buckets to organize peers by the length of the prefix their ID shares with the local node's ID; each bucket holds up to 20 peers before splitting.
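A simplified routing table along these lines might look like the sketch below. It assumes 256‑bit keys derived with SHA‑256, starts with a single bucket, and splits only the last bucket when it fills up; a full bucket that cannot split simply rejects the peer. The class and method names are hypothetical, and production implementations add details such as evicting unresponsive peers.

```python
import hashlib

K = 20          # bucket capacity mentioned in the text
KEY_BITS = 256


def key_of(peer_id: bytes) -> int:
    """Derive a 256-bit routing key from a peer identifier."""
    return int.from_bytes(hashlib.sha256(peer_id).digest(), "big")


def common_prefix_len(a: int, b: int) -> int:
    """Number of leading bits two keys share (KEY_BITS if they are equal)."""
    return KEY_BITS - (a ^ b).bit_length()


class RoutingTable:
    def __init__(self, local_key: int):
        self.local = local_key
        self.buckets = [[]]  # the last bucket holds every peer with cpl >= its index

    def _index(self, key: int) -> int:
        return min(common_prefix_len(self.local, key), len(self.buckets) - 1)

    def add(self, key: int) -> bool:
        """Insert a peer key; return False if its bucket is full and cannot split."""
        while True:
            i = self._index(key)
            bucket = self.buckets[i]
            if key in bucket:
                return True
            if len(bucket) < K:
                bucket.append(key)
                return True
            if i != len(self.buckets) - 1 or i >= KEY_BITS - 1:
                return False  # simplification: no eviction of stale peers
            # Split the last bucket: peers sharing a longer prefix move one level down.
            stay = [p for p in bucket if common_prefix_len(self.local, p) == i]
            move = [p for p in bucket if common_prefix_len(self.local, p) > i]
            self.buckets[i] = stay
            self.buckets.append(move)


table = RoutingTable(key_of(b"local-peer"))
for n in range(500):
    table.add(key_of(b"peer-%d" % n))
```

About half of all random keys differ from the local key in the first bit, so distant regions of the key space are capped at 20 peers each while nearby regions keep splitting. The table therefore knows its own neighbourhood in detail and the rest of the network only coarsely.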

Swarm negotiates stream protocols via multistream‑select. The initiator proposes a protocol; if the receiver rejects it, another protocol is tried until a match is found or negotiation fails. An ls message can query all supported protocols.
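The negotiation loop can be sketched in memory, without real sockets. The framing used here (an unsigned‑varint length followed by the text and a newline), the /multistream/1.0.0 header, and the "na" rejection follow multistream‑select as commonly described; the Responder class, the negotiate helper, the /echo protocol IDs in the usage note, and the single‑frame ls reply are simplifications assumed for this example.

```python
MULTISTREAM = "/multistream/1.0.0"


def encode(msg: str) -> bytes:
    """Frame a message: unsigned-varint length, then the text plus a newline."""
    payload = msg.encode() + b"\n"
    n, prefix = len(payload), bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        prefix.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(prefix) + payload


def decode(frame: bytes) -> str:
    """Inverse of encode for a single complete frame."""
    n, shift, i = 0, 0, 0
    while True:
        byte = frame[i]
        n |= (byte & 0x7F) << shift
        i += 1
        shift += 7
        if not byte & 0x80:
            break
    return frame[i:i + n].decode().rstrip("\n")


class Responder:
    def __init__(self, supported: list):
        self.supported = supported

    def handle(self, frame: bytes) -> bytes:
        msg = decode(frame)
        if msg == MULTISTREAM or msg in self.supported:
            return encode(msg)  # echoing the message back means "accepted"
        if msg == "ls":
            # Simplified: all supported protocols are returned in one frame.
            return encode(" ".join(self.supported))
        return encode("na")     # "na" means the proposal is not available


def negotiate(responder: Responder, candidates: list):
    """Propose candidates in order; return the first accepted one, else None."""
    if decode(responder.handle(encode(MULTISTREAM))) != MULTISTREAM:
        return None
    for proto in candidates:
        if decode(responder.handle(encode(proto))) == proto:
            return proto
    return None
```

With a responder that supports only "/echo/1.0.0", calling negotiate(responder, ["/echo/2.0.0", "/echo/1.0.0"]) is answered with "na" for the first proposal and an echo for the second, so the function returns "/echo/1.0.0".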

Discovery mechanisms enable nodes to find each other via configured bootstrap nodes, random walks, or multicast DNS (MDNS) within LANs.

Conclusion

The article presented two core IPFS modules:

IPLD – defines and models data.

libp2p – solves data transport.

The two modules complement each other, and each can also be used independently in other projects. IPFS aims to replace HTTP for web content delivery and is complemented by Filecoin, whose incentive mechanism encourages node participation and improves data persistence.
