
Overview of Data Processing Units (DPUs) and Their Evolution in Data Centers

Data Processing Units (DPUs) have evolved from early I/O processors into modern programmable ASICs and FPGA-based accelerators that integrate networking, storage, and compute functions to offload work from CPUs. With contributions from companies like Fungible, Nvidia, Intel, and emerging Chinese firms, DPUs are reshaping data-center and edge architectures.


Introduction

The concept of a Data Processing Unit (DPU) was first introduced by Fungible in 2016, and the term gained traction after Nvidia launched its BlueField DPU in 2020. Since then, many companies, including Marvell, Pensando, Broadcom, and Intel, as well as Chinese firms such as Zhongke Yushu and Xingyun Zhili, have entered the DPU market.

Multiple Definitions of DPU

Wikipedia defines a DPU as a programmable specialized circuit with hardware acceleration for data-centric computing, a broad definition that can include FPGAs and certain switch chips. Fungible's DPU enables hyper-disaggregation of compute and storage at data-center scale, focusing on CPU-storage isolation. Marvell's DPU targets networking equipment (switches, routers, firewalls, smart NICs). Nvidia's DPU combines a multi-core CPU, high-performance network interfaces, and programmable acceleration engines, resembling a CPU with five functional blocks (compute, storage, control, I/O, and networking).

Historical Perspective: I/O Processors

The idea of an I/O processor dates back to the IBM 709 (1958) and IBM 360 (1964), which used channel controllers to manage peripheral devices and offload the CPU. As device speeds diverged, the Northbridge/Southbridge architecture emerged, separating high-speed peripherals (memory, GPU) from low-speed ones (keyboard, mouse) and further relieving the CPU for compute-intensive tasks.

Network Processors

Network processors (NPs) are ASICs specialized for packet handling. Early designs (e.g., Freescale's parts, circa 2000) offered programmable interfaces and flexibility but were limited to networking functions. Later NPs, such as IBM's PowerEN, integrate additional engines (crypto, regex, compression, XML) to accelerate common workloads, illustrating the trend toward multifunctional accelerators that DPUs aim to embody.

From NICs to DPUs

The evolution of network interface cards (NICs) shows increasing functionality and programmability, culminating in smart NICs built on FPGAs (e.g., Ethernity). DPUs inherit these capabilities, requiring both programmable interfaces and the ability to handle compute and storage tasks alongside networking.

XPUs and the Shift to Data-Centric Computing

As data-center workloads become data-centric, CPUs struggle with high-speed data movement and protocol processing. Companies have introduced XPUs (e.g., Amazon Nitro, Facebook's data-center acceleration research) to offload networking, storage, and security functions. China's KPU and IBM's z15 mainframe also illustrate attempts to free CPUs from I/O-heavy duties.
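The pressure described above can be made concrete with a back-of-the-envelope cycle budget. The sketch below uses illustrative assumptions that are not from the article (minimum 64-byte Ethernet frames, 20 bytes of preamble and inter-frame gap per frame, a single 3 GHz core) to show how few CPU cycles are available per packet at a 200 Gb/s line rate:

```python
# Back-of-the-envelope: per-packet CPU cycle budget at line rate.
# Assumed figures (illustrative, not from the article): 64-byte minimum
# Ethernet frames, 20 bytes of preamble + inter-frame gap, a 3 GHz core.

LINK_BPS = 200e9        # one 200 Gb/s port
FRAME_BYTES = 64 + 20   # minimum frame plus preamble and inter-frame gap
CPU_HZ = 3e9            # a single 3 GHz core

packets_per_sec = LINK_BPS / (FRAME_BYTES * 8)  # ~297.6 million packets/s
cycles_per_packet = CPU_HZ / packets_per_sec    # ~10 cycles per packet

print(f"{packets_per_sec / 1e6:.1f} Mpps, {cycles_per_packet:.1f} cycles/packet")
```

At roughly ten cycles per minimum-size packet, even trivial per-packet work saturates a core, which is the economic argument for moving packet, storage, and security processing onto dedicated offload hardware.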

Current DPU Offerings

Intel – IPU (Infrastructure Processing Unit)

Intel's IPU, announced in 2021, mirrors DPU functionality with three main traits: security, storage acceleration, and network acceleration. It combines programmable logic (ASIC or FPGA) with Arm cores to run ISP/CSP services, offering high-bandwidth packet processing, RDMA, NVMe, and IPsec.

ASIC IPU – Mount Evans: features a network unit (up to 4×200 Gb/s, SR-IOV, RDMA, NVMe, IPsec) and a compute unit (16 Arm N1 cores, cache, and acceleration engines).

FPGA IPU – Oak Springs Canyon: built from an Agilex FPGA and a Xeon-D SoC, running embedded Linux for flexible acceleration.

Fungible – F1 and S1

Fungible offers two DPU chips: the S1 for smart-NIC roles and the F1 for server-side workloads. The F1 comprises a data cluster (192 threads for data movement, protection, and analysis), a control cluster (running Linux and handling security and cryptography), and a network unit (low-latency Ethernet, P4 support, packet encryption).
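The P4 support mentioned for the F1 refers to programmable match-action packet pipelines. As a purely conceptual illustration (in Python rather than P4, with invented field names and table entries, and in no way Fungible's actual API), a match-action stage boils down to a lookup key mapped to an action:

```python
# Toy match-action table, illustrating the abstraction that P4 programs
# express on a DPU's programmable pipeline. All names and entries here
# are hypothetical; real P4 programs compile to hardware table resources.

def drop(pkt):
    """Discard the packet."""
    return None

def forward(port):
    """Build an action that sets the packet's egress port."""
    def action(pkt):
        pkt["egress_port"] = port
        return pkt
    return action

# Exact-match table keyed on destination IP (hypothetical entries).
table = {
    "10.0.0.1": forward(1),
    "10.0.0.2": forward(2),
    "10.0.0.3": drop,
}

def apply(pkt, default=drop):
    """Look up the packet's key and run the matching action (or default)."""
    action = table.get(pkt["dst_ip"], default)
    return action(pkt)

print(apply({"dst_ip": "10.0.0.2"}))  # {'dst_ip': '10.0.0.2', 'egress_port': 2}
```

On a DPU, the control cluster installs the table entries while the data path executes the lookups and actions at line rate; the Python above only models the control-plane/data-plane split, not its performance.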

Future Outlook

DPUs aim to address the limitations of the von Neumann architecture by offloading I/O-intensive tasks, providing programmable acceleration for networking, storage, and security, and enabling more efficient resource isolation in data-center and edge environments.

Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
