
High‑Performance Computing (HPC) Network Requirements and RDMA Technologies

This article explains how modern data‑center compute demands drive the need for high‑throughput, low‑latency networking, compares TCP/IP with RDMA‑based transports such as InfiniBand, iWARP, and RoCE, and recommends lossless Ethernet for large‑scale HPC deployments.

Architects' Tech Alliance

With 5G, big data, IoT and AI reshaping society, data‑center resources are shifting from pure storage to compute power, making network performance a critical factor for the emerging "compute‑center" paradigm.

Single‑core scaling has stalled as process nodes approach 3 nm, and adding cores quickly raises power consumption; high‑performance computing (HPC) clusters are therefore becoming the norm, and as they grow from P‑scale to E‑scale the network must deliver ever higher bandwidth and lower latency.

HPC workloads fall into three typical network scenarios: loosely‑coupled (e.g., financial risk, remote sensing), tightly‑coupled (e.g., electromagnetic simulation, fluid dynamics) which demand ultra‑low latency, and data‑intensive (e.g., weather forecasting, genome sequencing) which require high throughput and moderate latency.

To meet these demands, the industry is replacing TCP/IP with RDMA (Remote Direct Memory Access), because RDMA offers kernel‑bypass and zero‑copy data movement, dramatically reducing both latency (to ~1 µs) and CPU utilization.

TCP/IP suffers from fixed stack latency of tens of microseconds and high CPU load: each packet incurs multiple context switches, data copies and CPU‑bound protocol processing, which becomes a bottleneck for AI, SSD‑based storage and other microsecond‑scale systems.

RDMA eliminates most of these overheads: the NIC reads and writes application memory directly, cutting CPU usage from near 100 % to a few percent and shrinking end‑to‑end latency from the tens of microseconds typical of TCP/IP to well under 10 µs.
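The zero‑copy idea behind this can be sketched in ordinary Python. A `memoryview` exposes a buffer without duplicating it (the in‑process analogue of an RDMA NIC reading application memory in place), while `bytes()` takes a snapshot copy, as a copy‑based protocol stack does at each layer. This is only an analogy, not RDMA itself:

```python
# Zero-copy vs. copy semantics, illustrated with ordinary Python buffers.
# An RDMA NIC accesses application memory in place; a copy-based stack
# duplicates the payload at every layer and pays CPU for each copy.

payload = bytearray(b"sensor data v1")

view = memoryview(payload)   # zero-copy: shares the underlying memory
snapshot = bytes(payload)    # copy: an independent duplicate

payload[-1:] = b"2"          # the application updates its buffer in place

print(bytes(view))   # sees the update: b'sensor data v2'
print(snapshot)      # stale copy:      b'sensor data v1'
```

The view tracks the live buffer with no extra copy or CPU work; the snapshot reflects the cost a TCP/IP‑style stack pays on every packet.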

Three RDMA transport options exist:

InfiniBand: native RDMA protocol with the highest throughput and lowest latency, but it requires proprietary switches and lacks IP‑based interoperability.

iWARP: runs RDMA over TCP, so it can use standard Ethernet hardware, yet it inherits most of TCP's performance penalties.

RoCE (v1/v2): maps RDMA onto Ethernet. RoCEv1 works within a single Layer‑2 broadcast domain, while RoCEv2 adds routability via UDP encapsulation. Both need lossless Ethernet to preserve RDMA performance, because RDMA is extremely sensitive to packet loss.
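The layering that makes RoCEv2 routable can be sketched with `struct`: the RDMA payload rides inside an ordinary UDP datagram (IANA destination port 4791), which any IP router can forward, whereas RoCEv1 sits directly on an Ethernet ethertype and cannot leave its Layer‑2 domain. The 12‑byte stand‑in below mimics the size of the InfiniBand Base Transport Header; its contents are a placeholder, not a real BTH:

```python
# Sketch of RoCEv2's layering: ... / IP / UDP / InfiniBand transport payload.
# The IANA-assigned UDP destination port for RoCEv2 is 4791; everything
# below UDP is ordinary routable IP, which is what RoCEv1 lacks.
import struct

ROCEV2_UDP_PORT = 4791

def udp_header(src_port: int, payload: bytes) -> bytes:
    """Minimal UDP header: src port, dst port, length, checksum (0 is legal for IPv4)."""
    return struct.pack("!HHHH", src_port, ROCEV2_UDP_PORT, 8 + len(payload), 0)

bth = b"\x00" * 12                       # placeholder for the 12-byte Base Transport Header
datagram = udp_header(0xC000, bth) + bth

# A router classifies the packet from the outer UDP header alone:
src, dst, length, _ = struct.unpack("!HHHH", datagram[:8])
print(dst)     # 4791 -> recognized as RoCEv2
```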

Given the market dominance of Ethernet and the high OPEX of InfiniBand, RoCEv2 over lossless Ethernet is the preferred solution for large‑scale HPC deployments, provided that the packet‑loss rate stays below roughly 10⁻⁵ (0.001 %), since even slightly higher loss causes RDMA throughput to collapse.
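Why such a tiny loss budget? A toy model makes the collapse visible. Assuming go‑back‑N recovery (the scheme early RoCE NICs used), a single drop forces retransmission of roughly the whole in‑flight window W, so useful throughput scales like 1 / (1 + pW). The window size of 1000 packets here is an illustrative assumption, not a measured value:

```python
# Toy model of RDMA's loss sensitivity under go-back-N recovery:
# one drop retransmits ~the whole in-flight window W, so the fraction
# of link capacity doing useful work is roughly 1 / (1 + p * W).
def goodput(p: float, window: int = 1000) -> float:
    """Approximate useful-throughput fraction at packet-loss rate p."""
    return 1.0 / (1.0 + p * window)

for p in (1e-5, 1e-4, 1e-3, 1e-2):
    print(f"loss {p:7.0e} -> goodput {goodput(p):5.1%}")
# loss 1e-05 -> goodput 99.0%
# loss 1e-03 -> goodput 50.0%
```

At a 10⁻⁵ loss rate the link still runs near full speed, but at 0.1 % loss half the capacity is wasted on retransmission, which is why lossless Ethernet is a hard requirement rather than an optimization.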

Tags: network performance, RDMA, data center, HPC, InfiniBand, RoCE, iWARP
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
