Tagged articles

91 articles

Page 1 of 1

Architects' Tech Alliance

May 14, 2026 · Artificial Intelligence

Jensen Huang’s China Visit: Could It Revive GPU Prospects? Inside Nvidia’s DGX H200 Cluster Design

The article reviews the US‑approved export of Nvidia's DGX H200, the lack of deliveries, Jensen Huang’s surprise China trip that may speed approvals, and then provides a detailed technical breakdown of the DGX H200 cluster’s compute and storage networking, topology, optical link choices, and cable count estimates.

AI infrastructureDGX H200Data Center Networking

0 likes · 8 min read

Jensen Huang’s China Visit: Could It Revive GPU Prospects? Inside Nvidia’s DGX H200 Cluster Design

Architects' Tech Alliance

Oct 12, 2025 · Artificial Intelligence

How InfiniBand Powers AI Training: Deep Dive into RDMA, RoCEv2, and High‑Speed Interconnects

This article explains how InfiniBand’s architecture, native RDMA, GPUDirect, and evolving bandwidth enable ultra‑low‑latency, high‑throughput communication for AI model training, compares it with Ethernet, and details the role of RoCEv2 and other high‑performance interconnect technologies.

AI trainingGPU interconnectHigh‑Performance Networking

0 likes · 9 min read

How InfiniBand Powers AI Training: Deep Dive into RDMA, RoCEv2, and High‑Speed Interconnects

Architects' Tech Alliance

Jul 29, 2025 · Artificial Intelligence

Why NVIDIA Spectrum‑X and Quantum InfiniBand Are Redefining AI Data Center Networks

The article explains how AI‑driven data center networks must handle massive distributed workloads, why traditional Ethernet falls short, and how NVIDIA’s Spectrum‑X Ethernet and Quantum InfiniBand use loss‑less RDMA, dynamic routing, advanced congestion control, and hardware‑accelerated collective communication to deliver the bandwidth, latency, and scalability required for generative AI and large‑scale model training.

AIInfiniBandNVIDIA

0 likes · 8 min read

Why NVIDIA Spectrum‑X and Quantum InfiniBand Are Redefining AI Data Center Networks

Architects' Tech Alliance

Jul 19, 2025 · Artificial Intelligence

Best GPU Cluster Network for Large‑Scale AI: NVLink, InfiniBand, RoCE & DDC

This article compares the main networking technologies used in large‑scale AI GPU clusters—NVLink, InfiniBand, RoCE Ethernet, and the emerging DDC full‑schedule fabric—examining latency, lossless transmission, congestion control, cost, power and scalability to help engineers choose the optimal solution for training massive language models.

AI trainingDDCData Center

0 likes · 15 min read

Best GPU Cluster Network for Large‑Scale AI: NVLink, InfiniBand, RoCE & DDC

Architects' Tech Alliance

Jul 7, 2025 · Operations

Choosing the Right AI Data Center Network: InfiniBand vs RoCE

This article outlines the high‑performance networking requirements for AI data center training, compares InfiniBand and RoCE solutions, discusses their advantages in bandwidth, latency, scalability and cost, and provides design guidelines for building scalable, low‑latency, non‑blocking AI‑centric network architectures.

AIData CenterHigh Performance Computing

0 likes · 10 min read

Choosing the Right AI Data Center Network: InfiniBand vs RoCE

Architects' Tech Alliance

May 31, 2025 · Artificial Intelligence

GPU Cluster Scaling: Understanding Scale‑Up and Scale‑Out for AI Pods

This article explains the concepts of AI Pods and GPU clusters, compares vertical (scale‑up) and horizontal (scale‑out) expansion, describes XPU types, discusses internal and inter‑pod communication, and evaluates the benefits and drawbacks of each scaling approach along with relevant networking technologies.

AI PodsGPUInfiniBand

0 likes · 10 min read

GPU Cluster Scaling: Understanding Scale‑Up and Scale‑Out for AI Pods

Architects' Tech Alliance

May 26, 2025 · Fundamentals

Understanding RDMA, InfiniBand, and RoCEv2 for High‑Performance Distributed Training

The article explains how distributed AI training performance depends on reducing inter‑card communication latency, introduces RDMA technology and its implementations (InfiniBand, RoCEv2, iWARP), compares their latency and scalability against traditional TCP/IP, and outlines the hardware components and trade‑offs of InfiniBand and RoCEv2 networks.

InfiniBandRDMARoCEv2

0 likes · 12 min read

Understanding RDMA, InfiniBand, and RoCEv2 for High‑Performance Distributed Training

Architects' Tech Alliance

May 23, 2025 · Artificial Intelligence

Why High‑Performance Networks Are Critical for Large‑Scale AI Model Training

The whitepaper explains that AI model training and inference rely on massive data computation, with model sizes reaching billions of parameters, demanding low‑latency, high‑bandwidth, stable, scalable, and manageable networks; it compares RDMA‑based InfiniBand and RoCE solutions and offers design recommendations for future AI compute clusters.

AIHigh‑Performance NetworkingInfiniBand

0 likes · 10 min read

Why High‑Performance Networks Are Critical for Large‑Scale AI Model Training

Architects' Tech Alliance

May 15, 2025 · Industry Insights

Why InfiniBand Still Beats Ethernet: Deep Dive into RDMA, Omni‑Path, and Protocol Layers

This article provides a comprehensive technical analysis of InfiniBand architecture, its protocol stack, comparison with Ethernet‑based RDMA solutions like RoCE and iWARP, and an overview of Omni‑Path, highlighting performance advantages, design trade‑offs, and practical limitations.

High Performance ComputingInfiniBandOmni‑Path

0 likes · 19 min read

Why InfiniBand Still Beats Ethernet: Deep Dive into RDMA, Omni‑Path, and Protocol Layers

Architects' Tech Alliance

May 13, 2025 · Industry Insights

How NVIDIA Builds AI Supercomputers: From H100 to GH200 and GB200 SuperPods

This article analyzes NVIDIA's evolving AI supercomputer architectures—detailing the H100‑based 256‑GPU SuperPod, the GH200‑based 256‑GPU SuperPod with integrated Grace CPU, and the GB200‑based 576‑GPU SuperPod—examining their NVLink and InfiniBand topologies, bandwidth limits, and scalability challenges.

AIGPUHPC

0 likes · 11 min read

How NVIDIA Builds AI Supercomputers: From H100 to GH200 and GB200 SuperPods

Linux Kernel Journey

May 8, 2025 · Artificial Intelligence

How Tencent’s TRMT Tech Delivered a Huge Speedup to DeepSeek’s Large‑Model Network

DeepSeek engineers highlighted Tencent’s open‑source TRMT and DeepEP contributions that boost GPU‑to‑GPU communication by up to 300%, double RoCE performance and add a further 30% gain on InfiniBand, while addressing lane‑utilization and CPU‑control bottlenecks through three targeted optimizations.

DeepEPDeepSeekGPU communication

0 likes · 6 min read

How Tencent’s TRMT Tech Delivered a Huge Speedup to DeepSeek’s Large‑Model Network

May 7, 2025 · Artificial Intelligence

How Tencent’s DeepEP Doubles GPU Communication Speed on RoCE Networks

Tencent engineers highlighted a massive speedup in DeepSeek’s open‑source DeepEP communication framework, revealing how their TRMT‑based optimizations—dynamic multi‑QP topology awareness, IBGDA‑driven CPU‑bypass, and atomic signaling—boost RoCE network throughput up to 300% and add another 30% gain when applied to InfiniBand, effectively doubling GPU communication performance for large AI models.

AI model trainingDeepEPGPU communication

0 likes · 8 min read

How Tencent’s DeepEP Doubles GPU Communication Speed on RoCE Networks

Architects' Tech Alliance

Apr 21, 2025 · Industry Insights

Why RDMA Is Overtaking TCP/IP in HPC: OSI, Leaf‑Spine, and NVIDIA SuperPOD Explained

This article analyzes how the traditional OSI/TCP‑IP model is giving way to RDMA in high‑performance computing, compares Ethernet, InfiniBand and RoCE, evaluates leaf‑spine versus three‑tier data‑center designs, and examines NVIDIA SuperPOD architectures with detailed technical metrics.

Data CenterHPCInfiniBand

0 likes · 14 min read

Why RDMA Is Overtaking TCP/IP in HPC: OSI, Leaf‑Spine, and NVIDIA SuperPOD Explained

Architects' Tech Alliance

Mar 29, 2025 · Industry Insights

Why Network Becomes the New Bottleneck for AI Training and How InfiniBand vs RoCE Compare

AI large‑model training relies on GPU clusters, generating massive inter‑node traffic that turns network performance into the primary bottleneck, prompting a detailed comparison of InfiniBand and RoCE protocols, their histories, strengths, limitations, and the need for next‑generation network chip architectures.

AIData CenterHPC

0 likes · 5 min read

Why Network Becomes the New Bottleneck for AI Training and How InfiniBand vs RoCE Compare

Feb 13, 2025 · Fundamentals

Understanding InfiniBand RDMA: Architecture, Advantages, and NVIDIA Quantum-2

InfiniBand RDMA, designed to network server buses, offers high bandwidth and ultra‑low latency through zero‑copy, kernel‑bypass communication, with a layered architecture (L1‑L5) and hardware components like Quantum‑2 Switch, ConnectX‑7 RNIC, and SHARP acceleration, supported by the Verbs API and OFED stack.

InfiniBandQuantum-2RDMA

0 likes · 25 min read

Understanding InfiniBand RDMA: Architecture, Advantages, and NVIDIA Quantum-2

Architects' Tech Alliance

Dec 8, 2024 · Industry Insights

Why InfiniBand Still Beats Ethernet: Deep Dive into RDMA, Omni‑Path, and iWARP

This article provides a comprehensive technical analysis of InfiniBand’s protocol layers, topology, and performance advantages, compares Omni‑Path’s architecture, explains RDMA fundamentals, and details Ethernet‑based RDMA protocols such as RoCE and iWARP, highlighting their trade‑offs and use cases.

High-Performance ComputingInfiniBandOmni‑Path

0 likes · 18 min read

Why InfiniBand Still Beats Ethernet: Deep Dive into RDMA, Omni‑Path, and iWARP

BirdNest Tech Talk

Dec 1, 2024 · Fundamentals

How to Exchange RDMA Connection Parameters: Methods, Pros, and Pitfalls

Establishing an RDMA connection requires exchanging key parameters such as LID, QP number, and memory keys, and this article systematically outlines the essential information, compares six exchange methods—from static configuration to distributed services—and evaluates their advantages, drawbacks, and suitable scenarios.

Distributed SystemsInfiniBandParameter Exchange

0 likes · 7 min read

How to Exchange RDMA Connection Parameters: Methods, Pros, and Pitfalls

Architects' Tech Alliance

Nov 7, 2024 · Industry Insights

Why RDMA, InfiniBand, and RoCE Are Redefining High‑Performance Data Center Networks

This article examines the evolution from the OSI and TCP/IP models to RDMA‑based technologies, compares traditional three‑tier and leaf‑spine architectures, analyzes NVIDIA SuperPOD designs, and evaluates Ethernet, InfiniBand, and RoCE switches to guide high‑throughput, low‑latency data‑center networking decisions.

Data Center NetworkingHigh Performance ComputingInfiniBand

0 likes · 13 min read

Why RDMA, InfiniBand, and RoCE Are Redefining High‑Performance Data Center Networks

Architects' Tech Alliance

Oct 11, 2024 · Industry Insights

Why Common Network Misconceptions Hurt AI Performance and How to Fix Them

The article explains how prevalent misunderstandings in data‑center network design—such as altering end‑to‑end link speeds, overlooking switch radix, and choosing inappropriate buffering architectures—can increase latency and reduce AI workload efficiency, and it outlines the benefits of InfiniBand, cut‑through switching, scalable radix, and resilient AI‑cloud management solutions.

AIBuffer ArchitectureCut-through Switching

0 likes · 9 min read

Why Common Network Misconceptions Hurt AI Performance and How to Fix Them

Architects' Tech Alliance

Sep 25, 2024 · Fundamentals

NVIDIA Quantum‑2 InfiniBand Platform: Technical Overview, Q&A, and Deployment Guidance

This article explains the growing demand for high‑performance computing, introduces NVIDIA's Quantum‑2 InfiniBand platform with its high‑speed, low‑latency capabilities, provides a curated list of related technical articles, and offers an extensive Q&A covering compatibility, cabling, UFM, PCIe limits, and best‑practice deployment for AI and HPC workloads.

AIGPUInfiniBand

0 likes · 11 min read

NVIDIA Quantum‑2 InfiniBand Platform: Technical Overview, Q&A, and Deployment Guidance

Architects' Tech Alliance

Sep 12, 2024 · Artificial Intelligence

Comparison of InfiniBand and RoCEv2 Architectures for AI Compute Networks

This article examines the two dominant AI compute network architectures, InfiniBand and RoCEv2, detailing their designs, flow‑control mechanisms, performance, cost and scalability characteristics, and evaluates their respective advantages and limitations to guide network selection for AI data centers.

AI computeInfiniBandRDMA

0 likes · 9 min read

Comparison of InfiniBand and RoCEv2 Architectures for AI Compute Networks

Architects' Tech Alliance

Sep 8, 2024 · Artificial Intelligence

Design and Architecture of Multi‑Million GPU Clusters for Large‑Scale AI Model Training

The article surveys the network architectures and congestion‑control techniques used in massive GPU clusters—such as Byte’s megascale, Baidu HPN, Alibaba HPN7, and Tencent Xingmai 2.0—highlighting how high‑bandwidth, low‑latency designs and advanced RDMA technologies enable training of trillion‑parameter multimodal AI models.

Data CenterGPU clustersHPN

0 likes · 11 min read

Design and Architecture of Multi‑Million GPU Clusters for Large‑Scale AI Model Training

Architects' Tech Alliance

Aug 18, 2024 · Artificial Intelligence

RDMA, InfiniBand, RoCE, and iWARP: High‑Performance Networking for Large‑Scale Generative AI Model Training

The article explains how RDMA technologies—including InfiniBand, RoCE, and iWARP—provide high‑throughput, low‑latency, CPU‑free data transfer for massive generative AI model training, compares their architectures, and discusses modern network designs and load‑balancing strategies to optimize AI‑focused data‑center networks.

AI trainingHigh‑Performance ComputingInfiniBand

0 likes · 11 min read

RDMA, InfiniBand, RoCE, and iWARP: High‑Performance Networking for Large‑Scale Generative AI Model Training

Architects' Tech Alliance

Aug 12, 2024 · Industry Insights

How Shanghai Jiao Tong University Built China’s First Campus‑Scale ARM HPC Cluster with Huawei Kunpeng

This article details Shanghai Jiao Tong University's design and deployment of the nation’s first campus‑level high‑performance computing cluster based on Huawei Kunpeng 920 ARM processors, covering background, user challenges, unified storage, network topology, containerized software delivery, and performance validation with LAMMPS and GATK.

ARMHPCInfiniBand

0 likes · 12 min read

How Shanghai Jiao Tong University Built China’s First Campus‑Scale ARM HPC Cluster with Huawei Kunpeng

Architects' Tech Alliance

Jul 19, 2024 · Industry Insights

What Nvidia’s Blackwell GPUs and PCIe 6/7 Mean for AI and Data Centers

The article analyzes Nvidia's Blackwell‑based GB200, HGX B200 and HGX B100 servers, their integration of Blackwell GPUs and Grace CPUs, the shift to PCIe 6.0/7.0, the accompanying Quantum‑X800 InfiniBand platform and 1.6 T optical modules, and projects rapid market growth driven by AI workloads.

AIBlackwellData Center

0 likes · 9 min read

What Nvidia’s Blackwell GPUs and PCIe 6/7 Mean for AI and Data Centers

Architects' Tech Alliance

Jul 15, 2024 · Industry Insights

Why Ethernet Is Overtaking InfiniBand in AI and Data Center Networks

The article analyzes the 2022 global and Chinese switch markets, explains how distributed computing and generative AI workloads rely on high‑performance switches, compares Ethernet and InfiniBand technologies—including bandwidth, latency, and cost factors—and outlines major vendor strategies and future trends in the networking industry.

AIData CenterInfiniBand

0 likes · 14 min read

Why Ethernet Is Overtaking InfiniBand in AI and Data Center Networks

Architects' Tech Alliance

Jul 7, 2024 · Operations

Designing High‑Performance Cluster Networks for AI Large Models: InfiniBand vs RoCE

The article analyzes the networking challenges of AI super‑large models, comparing InfiniBand and RoCE technologies, and presents design guidelines for ultra‑scale, high‑bandwidth, low‑latency, and highly stable cluster interconnects to maximize GPU utilization and overall training efficiency.

AIGPU interconnectHigh‑Performance Computing

0 likes · 14 min read

Designing High‑Performance Cluster Networks for AI Large Models: InfiniBand vs RoCE

Architects' Tech Alliance

Jul 7, 2024 · Operations

Overview of Popular GPU/TPU Cluster Networking Technologies: NVLink, InfiniBand, RoCE, and DDC

This article reviews the main GPU/TPU cluster networking solutions—including NVLink, InfiniBand, RoCE Ethernet, and DDC full‑schedule fabrics—examining their latency, loss‑free transmission, congestion control, cost, scalability, and suitability for large‑scale LLM training workloads.

AI trainingDDCGPU networking

0 likes · 16 min read

Overview of Popular GPU/TPU Cluster Networking Technologies: NVLink, InfiniBand, RoCE, and DDC

Architects' Tech Alliance

Jun 26, 2024 · Industry Insights

Why RoCE Is Overtaking InfiniBand in AI Compute: Insights from the UEC Alliance

The article examines the rise of the Ultra Ethernet Consortium (UEC), its new specifications, and how industry leaders like Broadcom, Nvidia, and Meta are shifting from InfiniBand to RoCE to meet the high‑throughput, low‑latency demands of AI and HPC workloads, highlighting technical advantages and future trends.

HPCInfiniBandRoCE

0 likes · 8 min read

Why RoCE Is Overtaking InfiniBand in AI Compute: Insights from the UEC Alliance

Architects' Tech Alliance

Jun 20, 2024 · Artificial Intelligence

Comparative Analysis of InfiniBand and RoCEv2 Architectures for AI Compute Networks

This article provides a detailed comparison of InfiniBand and RoCEv2 network architectures, examining their technical features, flow‑control mechanisms, performance, cost, and suitability for AI compute environments to guide designers in selecting the optimal solution.

AI computeInfiniBandPerformance

0 likes · 9 min read

Comparative Analysis of InfiniBand and RoCEv2 Architectures for AI Compute Networks

Architects' Tech Alliance

May 23, 2024 · Cloud Computing

Design and Comparison of High‑Performance Cloud Data Center Networks for AI Computing

This article analyzes traditional cloud data center network limitations for AI workloads and compares various high‑bandwidth, low‑latency architectures—including two‑layer and three‑layer fat‑tree designs, InfiniBand, and RoCE—providing best‑practice recommendations for building scalable, non‑blocking AI‑Pool networks.

AI computingFat-TreeGPU clusters

0 likes · 12 min read

Design and Comparison of High‑Performance Cloud Data Center Networks for AI Computing

Architects' Tech Alliance

May 19, 2024 · Industry Insights

InfiniBand vs RoCEv2: Which High‑Performance Network Wins AI Compute?

With AI models growing to billions of parameters, the choice of high‑performance interconnect—InfiniBand or RoCEv2—directly impacts training speed, scalability, latency, and operational complexity, and this article analyzes their architectures, performance metrics, vendor ecosystems, and suitability for large‑scale AI clusters.

AIHigh Performance ComputingInfiniBand

0 likes · 13 min read

InfiniBand vs RoCEv2: Which High‑Performance Network Wins AI Compute?

Architects' Tech Alliance

May 11, 2024 · Industry Insights

Why Network Interconnects Are the New Bottleneck for Large‑Model AI Training

The rapid growth of AI large‑model training and inference is driving unprecedented demand for compute and high‑speed networking, prompting a shift from traditional GPU clusters to super‑pooled intelligent computing centers that must balance multiple intra‑ and inter‑node interconnect solutions such as NVLink, OAM/UBB, InfiniBand and RoCEv2.

AIData CenterInfiniBand

0 likes · 6 min read

Why Network Interconnects Are the New Bottleneck for Large‑Model AI Training

Architects' Tech Alliance

May 9, 2024 · Industry Insights

Why RoCE Is Reshaping High‑Performance Computing Networks

The article provides a detailed technical analysis of RoCE (RDMA over Converged Ethernet), its two protocol versions, packet overhead, congestion‑control mechanisms, Soft‑RoCE implementation, and the challenges and performance implications of deploying RoCE in modern HPC environments compared to InfiniBand and traditional Ethernet solutions.

Congestion ControlHPCInfiniBand

0 likes · 17 min read

Why RoCE Is Reshaping High‑Performance Computing Networks

Architects' Tech Alliance

May 5, 2024 · Operations

Essential Q&A on NVIDIA Quantum‑2 InfiniBand: Compatibility, Cabling, and Performance

This article compiles detailed technical Q&A about NVIDIA's Quantum‑2 InfiniBand platform, covering compatibility of CX7 NDR ports, cabling options, switch connections, UFM deployment, PCIe bandwidth limits, and performance considerations for high‑performance computing clusters.

CablingHPCInfiniBand

0 likes · 14 min read

Essential Q&A on NVIDIA Quantum‑2 InfiniBand: Compatibility, Cabling, and Performance

Architects' Tech Alliance

May 5, 2024 · Artificial Intelligence

Why InfiniBand Is the Secret Weapon for AIGC Training Performance

The article examines how InfiniBand’s specialized features—collective communication, in‑network computing, adaptive routing, congestion control, cut‑through forwarding, shallow buffering, and self‑healing—are optimized for large‑scale AI‑generated content (AIGC) training, delivering higher bandwidth, lower latency, and greater fault tolerance than Ethernet alternatives.

AI trainingAIGCAdaptive routing

0 likes · 10 min read

Why InfiniBand Is the Secret Weapon for AIGC Training Performance

Architects' Tech Alliance

May 3, 2024 · Fundamentals

From OSI Model to RDMA: High‑Performance Networking, Leaf‑Spine Architecture, and Switch Selection

This article examines the evolution of network protocols from the OSI seven‑layer model and TCP/IP to RDMA technologies such as InfiniBand and RoCE, compares traditional three‑tier and leaf‑spine data‑center designs, and evaluates Ethernet, InfiniBand, and RoCE switches for high‑throughput, low‑latency HPC environments.

Data center architectureInfiniBandLeaf-Spine

0 likes · 13 min read

From OSI Model to RDMA: High‑Performance Networking, Leaf‑Spine Architecture, and Switch Selection

Architects' Tech Alliance

May 1, 2024 · Industry Insights

How NVIDIA’s Blackwell Platform Redefines AI Supercomputing Networks

The article examines NVIDIA’s Blackwell platform network architecture, detailing the fifth‑generation NVLink, sixth‑generation PCIe, 800 Gb/s InfiniBand and Ethernet adapters, the DGX B200 and GB200 configurations, new IB and Ethernet switches, and the implications of increased optical module demands for large‑scale AI clusters.

AI supercomputingBlackwellDGX

0 likes · 10 min read

How NVIDIA’s Blackwell Platform Redefines AI Supercomputing Networks

Architects' Tech Alliance

Apr 28, 2024 · Industry Insights

Why RoCE v2 Is Outpacing InfiniBand for Modern Data Centers

This article provides an in‑depth technical analysis of RoCE v2, covering its architecture, NIC requirements, and detailed comparisons with InfiniBand across physical layers, protocol stacks, switching, congestion handling, routing, and topology, while also highlighting the UEC alliance’s new transport protocol initiative.

High Performance ComputingInfiniBandRDMA

0 likes · 12 min read

Why RoCE v2 Is Outpacing InfiniBand for Modern Data Centers

360 Smart Cloud

Apr 25, 2024 · Cloud Native

Building High‑Performance RoCE v2 and InfiniBand Networks in a Cloud‑Native Environment for Large‑Model Training

This article explains how to construct high‑performance RoCE v2 and InfiniBand networks within a cloud‑native Kubernetes environment, detailing the underlying technologies, required components, configuration steps, and performance test results that demonstrate significant communication speed improvements for large‑scale AI model training.

AI trainingCloud NativeHigh‑Performance Networking

0 likes · 12 min read

Building High‑Performance RoCE v2 and InfiniBand Networks in a Cloud‑Native Environment for Large‑Model Training

Architects' Tech Alliance

Apr 23, 2024 · Industry Insights

Which GPU Cluster Network Wins for LLM Training? NVLink, InfiniBand, RoCE & DDC Compared

This article analyzes the main GPU/TPU cluster networking options—NVLink, InfiniBand, RoCE Ethernet, and DDC full‑schedule fabrics—examining latency, lossless transmission, congestion control, cost, power, and scalability to determine their suitability for large‑scale LLM training.

DDCData center fabricsGPU networking

0 likes · 18 min read

Which GPU Cluster Network Wins for LLM Training? NVLink, InfiniBand, RoCE & DDC Compared

Architects' Tech Alliance

Apr 21, 2024 · Fundamentals

Understanding RDMA: InfiniBand, RoCE, and Their Role in High‑Performance AI Model Training

This article explains how Remote Direct Memory Access (RDMA) technologies such as InfiniBand and RoCE bypass OS kernels to achieve ultra‑low latency and high bandwidth, discusses their hardware implementations, cost considerations, and their critical impact on large‑scale AI model training and HPC network design.

AIGPUHigh‑Performance Computing

0 likes · 11 min read

Understanding RDMA: InfiniBand, RoCE, and Their Role in High‑Performance AI Model Training

Architects' Tech Alliance

Apr 18, 2024 · Industry Insights

Why InfiniBand Dominates Modern HPC: Speed, Latency, and Scalability Explained

This article provides a comprehensive technical overview of InfiniBand, covering its rapid adoption in top supercomputers, detailed performance advantages such as ultra‑high bandwidth, CPU offload, sub‑microsecond latency, flexible scalability, QoS, SHARP acceleration, and a comparison with Ethernet, Fibre Channel, and Omni‑Path, while also outlining HDR switch and NIC product families.

Data CenterHDRHPC

0 likes · 20 min read

Why InfiniBand Dominates Modern HPC: Speed, Latency, and Scalability Explained

Linux Code Review Hub

Apr 8, 2024 · Fundamentals

Understanding Memory Semantics: Definitions, IB vs CXL, and Common Confusions

The article explains what memory semantics means in modern data‑center contexts, compares Infiniband and CXL definitions, clarifies load/store and DMA operations, and highlights the distinction between memory space and memory access with concrete examples and references.

CXLDMAInfiniBand

0 likes · 11 min read

Understanding Memory Semantics: Definitions, IB vs CXL, and Common Confusions

Architects' Tech Alliance

Apr 3, 2024 · Industry Insights

InfiniBand vs. RoCE v2: Choosing the Best Network for AI Data Centers

This article provides a detailed technical comparison between InfiniBand and RoCE v2, covering architecture, lossless transmission, adaptive routing, major vendors, performance, scalability, operational complexity, and cost considerations to help AI data center architects select the most suitable high‑performance network solution.

AI data centerHigh‑Performance NetworkingInfiniBand

0 likes · 13 min read

InfiniBand vs. RoCE v2: Choosing the Best Network for AI Data Centers

Architects' Tech Alliance

Feb 29, 2024 · Industry Insights

Choosing the Right GPU Cluster Network: NVLink, InfiniBand, RoCE & DDC Explained

This article examines the key GPU/TPU cluster networking options—NVLink, InfiniBand, RoCE Ethernet, and emerging DDC full‑scheduling fabrics—detailing their latency, loss‑less transmission, congestion control, cost, power, and scalability considerations for large‑scale AI training deployments.

AI trainingDDC fabricGPU networking

0 likes · 18 min read

Choosing the Right GPU Cluster Network: NVLink, InfiniBand, RoCE & DDC Explained

Architects' Tech Alliance

Feb 14, 2024 · Industry Insights

Why InfiniBand Is Outpacing Ethernet in High‑Performance Computing

This article provides a comprehensive overview of InfiniBand technology, covering its history, architecture, packet format, layer functions, switching mechanisms, and performance advantages over Ethernet, while highlighting its rapid growth and future prospects in HPC environments.

High Performance ComputingInfiniBandRDMA

0 likes · 15 min read

Why InfiniBand Is Outpacing Ethernet in High‑Performance Computing

Architects' Tech Alliance

Jan 29, 2024 · Industry Insights

Can Ethernet Keep Up with InfiniBand? 2023 Market Trends and Forecast

An IDC‑based analysis shows that while Ethernet remains dominant in data‑center networking, both Ethernet and InfiniBand are growing, with detailed vendor shares, port shipment estimates, and speed‑grade trends revealing a nuanced competitive landscape for 2023‑2024.

AIData CenterIDC

0 likes · 11 min read

Can Ethernet Keep Up with InfiniBand? 2023 Market Trends and Forecast

Architects' Tech Alliance

Jan 8, 2024 · Industry Insights

Why Ethernet Still Beats InfiniBand in Data Centers: 2023 Market Insights

Despite the surge in AI investments, IDC data shows that high‑speed Ethernet continues to outpace InfiniBand in data‑center switch sales, with strong growth in both revenue and port shipments across DC, campus, and edge markets in Q3 2023.

Data CenterIDCIndustry Insights

0 likes · 11 min read

Why Ethernet Still Beats InfiniBand in Data Centers: 2023 Market Insights

Architects' Tech Alliance

Dec 24, 2023 · Artificial Intelligence

Overview of Popular GPU/TPU Cluster Networking Technologies for LLM Training

This article examines the main GPU/TPU cluster networking options—including NVLink, InfiniBand, RoCE Ethernet Fabric, and DDC full‑schedule networks—explaining their latency, loss‑less transmission, congestion control, cost, scalability, and suitability for large‑scale LLM training workloads.

GPU networkingInfiniBandLLM training

0 likes · 18 min read

Overview of Popular GPU/TPU Cluster Networking Technologies for LLM Training

Architects' Tech Alliance

Dec 6, 2023 · Artificial Intelligence

The Relationship Between Switches, Network Protocols, and AI in Modern Data Centers

This article explains how network protocols and switch architectures—including OSI layers, TCP/IP, RDMA, InfiniBand, RoCE, and leaf‑spine designs—support high‑throughput, low‑latency AI and HPC workloads, compares Ethernet and InfiniBand markets, and examines NVIDIA’s Spectrum/X and SuperPOD solutions.

AIData Center NetworkingInfiniBand

0 likes · 11 min read

The Relationship Between Switches, Network Protocols, and AI in Modern Data Centers

Architects' Tech Alliance

Nov 29, 2023 · Fundamentals

RDMA vs TCP/IP: Protocol Comparison and Overview of RDMA Types for Distributed Storage Networks

This article explains the differences between RDMA (including RoCE, InfiniBand, and iWARP) and traditional TCP/IP, describes the three main RDMA network types, compares their characteristics, and outlines their typical usage in distributed storage environments.

InfiniBandRDMARoCE

0 likes · 7 min read

RDMA vs TCP/IP: Protocol Comparison and Overview of RDMA Types for Distributed Storage Networks

Architects' Tech Alliance

Aug 10, 2023 · Industry Insights

InfiniBand vs RoCEv2: Which Network Powers AI Model Training?

This article examines the architecture of AI compute clusters, explaining offline training and inference pipelines, the role of RDMA, and the technical differences between InfiniBand and RoCEv2—including latency, bandwidth, scalability, cost, and vendor considerations—to help engineers choose the optimal high‑performance network for large‑model training.

AI computeHigh‑Performance NetworkingInfiniBand

0 likes · 13 min read

InfiniBand vs RoCEv2: Which Network Powers AI Model Training?

Architects' Tech Alliance

Jul 24, 2023 · Operations

NVIDIA Quantum‑2 InfiniBand Platform Overview and Technical Q&A

This article introduces NVIDIA's Quantum‑2 InfiniBand solution for high‑performance computing, explains its HDR 200 Gb/s architecture, and provides a comprehensive Q&A covering cable compatibility, SuperPod networking, UFM management, PCIe bandwidth, and RDMA support for both IB and Ethernet environments.

InfiniBandPCIeRDMA

0 likes · 9 min read

NVIDIA Quantum‑2 InfiniBand Platform Overview and Technical Q&A

Alibaba Cloud Infrastructure

Jun 16, 2023 · Cloud Computing

Predictable Network and High‑Performance Network Architecture for Large‑Scale AI Training

The article examines how Alibaba Cloud’s Predictable Network, InfiniBand versus Ethernet trade‑offs, and the HPN high‑performance network design together address the extreme bandwidth, latency, scalability and reliability requirements of modern large‑model AI training workloads in cloud data centers.

AI trainingCloud ComputingHigh Performance Computing

0 likes · 24 min read

Predictable Network and High‑Performance Network Architecture for Large‑Scale AI Training

Architects' Tech Alliance

Jun 10, 2023 · Fundamentals

Understanding RDMA: How Direct Memory Access Boosts Data Center Performance

This article explains the principles of DMA and RDMA, compares RDMA protocols such as InfiniBand, RoCE, and iWARP, outlines their performance advantages, and reviews the key standards bodies, open‑source communities, hardware vendors, and real‑world adoption in high‑performance data centers.

DMAData CenterHigh Performance Computing

0 likes · 15 min read

Understanding RDMA: How Direct Memory Access Boosts Data Center Performance

Baidu Geek Talk

May 10, 2023 · Artificial Intelligence

Baidu's AI Infrastructure for Large-Scale LLM Training: Architecture, Challenges, and Optimization

Baidu’s AI infrastructure combines a massive InfiniBand‑linked GPU cluster, Kunlun chips, the PaddlePaddle framework, and the Wenxin model suite with 4D hybrid parallelism, elastic fault tolerance, and a two‑stage training pipeline to overcome computation, memory, and communication walls, delivering world‑leading MLPerf performance for large‑scale LLMs.

GPU clusterInfiniBandLarge Language Model

0 likes · 15 min read

Baidu's AI Infrastructure for Large-Scale LLM Training: Architecture, Challenges, and Optimization

Architects' Tech Alliance

Apr 19, 2023 · Fundamentals

Implementation and Performance Evaluation of a Domestic ARM‑Based High‑Performance Computing Cluster at Shanghai Jiao Tong University

The article describes how Shanghai Jiao Tong University built a campus‑level HPC platform using Huawei Kunpeng 920 ARM processors, detailing system architecture, unified storage and scheduling, containerized software deployment, network topology, Lustre file system integration, and performance results of LAMMPS and GATK compared with traditional X86 clusters.

ARMHPCInfiniBand

0 likes · 11 min read

Implementation and Performance Evaluation of a Domestic ARM‑Based High‑Performance Computing Cluster at Shanghai Jiao Tong University

Open Source Linux

Apr 14, 2023 · Fundamentals

Why InfiniBand Is the Fastest Growing High‑Speed Interconnect for HPC

This article provides a comprehensive overview of InfiniBand technology, covering its history, architecture, packet structure, layer hierarchy, switching mechanisms, and performance advantages over Ethernet, highlighting its role as a low‑latency, high‑bandwidth solution for high‑performance computing.

High Performance ComputingInfiniBandRDMA

0 likes · 14 min read

Why InfiniBand Is the Fastest Growing High‑Speed Interconnect for HPC

Architects' Tech Alliance

Mar 26, 2023 · Fundamentals

Comprehensive Overview of InfiniBand Technology and Architecture

This article provides an in‑depth examination of InfiniBand, covering its rapid development as a high‑bandwidth, low‑latency interconnect technology, the InfiniBand Trade Association, detailed packet structures, layered architecture, switching mechanisms, and a comparative analysis with Ethernet, highlighting its advantages for high‑performance computing.

Data TransferHPCHigh Performance Computing

0 likes · 14 min read

Comprehensive Overview of InfiniBand Technology and Architecture

Refining Core Development Skills

Oct 24, 2022 · Fundamentals

Low‑Latency Network Architecture for High‑Frequency Trading

This article explains how high‑frequency trading firms achieve ultra‑low network latency by combining proximity deployment, dedicated links, microwave transmission, InfiniBand, low‑latency switches, kernel bypass, RDMA, TCP offload engines and FPGA acceleration, and summarizes the impact of each technique on overall request latency.

FPGAInfiniBandKernel Bypass

0 likes · 16 min read

Low‑Latency Network Architecture for High‑Frequency Trading

Architects' Tech Alliance

Oct 9, 2022 · Fundamentals

Understanding InfiniBand and RDMA: Concepts and Configuration Guide

This article provides an overview of InfiniBand and Remote Direct Memory Access (RDMA), explains their underlying protocols and hardware, and offers detailed step‑by‑step guidance for configuring InfiniBand, RDMA, RoCE, and related services on Red Hat Enterprise Linux systems.

InfiniBandLinuxRDMA

0 likes · 9 min read

Understanding InfiniBand and RDMA: Concepts and Configuration Guide

Architects' Tech Alliance

Jul 18, 2022 · Industry Insights

Why RDMA Is Essential for Future HPC Data Centers: From TCP Limits to RoCEv2

The article analyzes how the shift of data centers toward compute-centric architectures drives the need for high‑performance networking, explains the shortcomings of TCP/IP, compares InfiniBand, iWarp and RoCE protocols, and shows why loss‑less Ethernet with RDMA is critical for modern HPC workloads.

Data CenterHPCInfiniBand

0 likes · 12 min read

Why RDMA Is Essential for Future HPC Data Centers: From TCP Limits to RoCEv2

Architects' Tech Alliance

May 18, 2022 · Fundamentals

Intel Omni-Path Architecture and InfiniBand: Protocol Layers, Topology, and RDMA Overview

This article explains Intel Omni-Path Architecture and InfiniBand, describing their protocol stack, network topology, RDMA technology, and related protocols such as RoCE and iWARP, while also comparing their design goals and performance characteristics.

High Performance ComputingInfiniBandOmni‑Path

0 likes · 19 min read

Intel Omni-Path Architecture and InfiniBand: Protocol Layers, Topology, and RDMA Overview

Architects' Tech Alliance

May 14, 2022 · Fundamentals

High‑Performance Computing Network Solutions: RoCE v2, RDMA, and InfiniBand Overview

The article explains how high‑performance computing (HPC) networks overcome TCP/IP limitations by using RDMA‑based technologies such as RoCE v1/v2 and InfiniBand, detailing their architectures, advantages, vendor implementations, and cost‑effective migration to Ethernet‑based solutions for GPU‑driven workloads.

HPCHighPerformanceComputingInfiniBand

0 likes · 7 min read

High‑Performance Computing Network Solutions: RoCE v2, RDMA, and InfiniBand Overview

Architects' Tech Alliance

Mar 4, 2022 · Operations

What Is InfiniBand RDMA and How to Configure It on RHEL 8?

This guide explains the fundamentals of InfiniBand and RDMA, details the InfiniBand Verbs API, outlines the steps required for kernel data handling, and provides practical configuration instructions for RoCE, IPoIB, and the subnet manager on Red Hat Enterprise Linux 8.

IPoIBInfiniBandRDMA

0 likes · 11 min read

What Is InfiniBand RDMA and How to Configure It on RHEL 8?

Architects' Tech Alliance

Sep 27, 2021 · Fundamentals

High‑Performance Computing (HPC) Network Requirements and RDMA Technologies

The article explains how modern data‑center compute demands drive the need for high‑throughput, low‑latency networking, compares TCP/IP with RDMA‑based solutions such as InfiniBand, iWARP and RoCE, and recommends loss‑less Ethernet for large‑scale HPC deployments.

Data CenterHPCInfiniBand

0 likes · 10 min read

High‑Performance Computing (HPC) Network Requirements and RDMA Technologies

Architects' Tech Alliance

Apr 28, 2021 · Industry Insights

Why InfiniBand Is Outpacing Ethernet in High‑Performance Computing

The article provides a comprehensive technical overview of InfiniBand, covering its history, standards, architecture layers, packet format, performance advantages, and a detailed comparison with Ethernet, highlighting why it has become the preferred high‑speed interconnect for HPC workloads.

Data TransferHigh Performance ComputingInfiniBand

0 likes · 15 min read

Why InfiniBand Is Outpacing Ethernet in High‑Performance Computing

Architects' Tech Alliance

Mar 7, 2021 · Fundamentals

Understanding RDMA: InfiniBand, iWARP, and RoCE Technologies and Their Differences

This article explains Remote Direct Memory Access (RDMA), its origins in InfiniBand, the Ethernet‑based variants iWARP and RoCE (including RoCEv1 and RoCEv2), compares their architectures, performance characteristics, and deployment requirements for high‑performance computing and data‑center networks.

High‑Performance NetworkingInfiniBandRDMA

0 likes · 11 min read

Understanding RDMA: InfiniBand, iWARP, and RoCE Technologies and Their Differences

Architects' Tech Alliance

Jun 21, 2020 · Fundamentals

An Overview of InfiniBand Technology: Architecture, Packet Structure, and Comparison with Ethernet

This article provides a comprehensive overview of InfiniBand, covering its history, the InfiniBand Trade Association, architecture layers, packet format, switching mechanisms, performance advantages, and a detailed comparison with Ethernet for high‑performance computing environments.

Ethernet ComparisonInfiniBandRDMA

0 likes · 14 min read

An Overview of InfiniBand Technology: Architecture, Packet Structure, and Comparison with Ethernet

Architects' Tech Alliance

Apr 3, 2020 · Industry Insights

Why InfiniBand Beats TCP/IP: Deep Dive into Architecture and Socket Direct

This article explains how InfiniBand’s RDMA‑based architecture, layered protocol stack, and Mellanox Socket Direct technology deliver far higher bandwidth, lower latency, and better CPU efficiency than traditional TCP/IP networks, and it presents performance test results that show up to an 80% latency reduction.

FabricHigh Performance ComputingInfiniBand

0 likes · 11 min read

Why InfiniBand Beats TCP/IP: Deep Dive into Architecture and Socket Direct

Architects' Tech Alliance

Mar 26, 2020 · Fundamentals

Understanding InfiniBand: Technology, Market Landscape, and Its Role in Supercomputing

This article provides a comprehensive overview of InfiniBand technology, its high‑bandwidth low‑latency architecture, dominant market players such as Mellanox, and its critical impact on the performance and evolution of global supercomputing systems as reflected in TOP500 rankings.

InfiniBandMellanoxSupercomputers

0 likes · 11 min read

Understanding InfiniBand: Technology, Market Landscape, and Its Role in Supercomputing

Architects' Tech Alliance

Jul 18, 2019 · Fundamentals

Overview of OpenFabrics Enterprise Distribution (OFED) and InfiniBand Software Architecture

This article provides a comprehensive overview of the OpenFabrics Enterprise Distribution (OFED) and the InfiniBand software architecture, covering its history, components, middleware, protocol stack, and how it enables high‑performance, low‑latency networking for IP, storage, and compute applications.

High-Performance ComputingInfiniBandLinux

0 likes · 11 min read

Overview of OpenFabrics Enterprise Distribution (OFED) and InfiniBand Software Architecture

Architects' Tech Alliance

Jun 13, 2019 · Fundamentals

Understanding OpenFabrics Enterprise Distribution (OFED) and the InfiniBand Software Architecture

This article explains the OpenFabrics Enterprise Distribution (OFED) ecosystem, its history, the InfiniBand hardware and software stack, key protocols such as IPoIB, SDP and iSER, and how these technologies enable high‑performance, low‑latency networking across Linux, Windows and virtualized environments.

High-Performance ComputingInfiniBandLinux

0 likes · 12 min read

Understanding OpenFabrics Enterprise Distribution (OFED) and the InfiniBand Software Architecture

Architects' Tech Alliance

Jun 9, 2019 · Fundamentals

Detailed Overview of NVMe Architecture and NVMe over Fabrics

This article provides a comprehensive technical overview of NVMe architecture, the NVMe‑over‑Fabric extensions—including InfiniBand, RoCE, iWARP, Fibre Channel, and TCP—explaining their RDMA‑based advantages, protocol differences, and practical considerations for data‑center storage deployments.

Fibre ChannelInfiniBandNVMe

0 likes · 12 min read

Detailed Overview of NVMe Architecture and NVMe over Fabrics

Architects' Tech Alliance

Apr 24, 2019 · Industry Insights

Why RDMA Is the Game‑Changer for High‑Performance Networking in AI Workloads

This article examines the rise of RDMA high‑performance networking, explains its technical advantages over traditional TCP/IP, showcases real‑world deployments in machine learning at Baidu, and explores future use cases in storage, GPU communication, and core services.

GPU DirectHigh‑Performance NetworkingInfiniBand

0 likes · 9 min read

Why RDMA Is the Game‑Changer for High‑Performance Networking in AI Workloads

Architects' Tech Alliance

Mar 11, 2019 · Fundamentals

Understanding Mellanox InfiniBand Technology and Its Role in High‑Performance Computing

The article explains Nvidia's $6.9 billion acquisition of Mellanox, outlines Mellanox's history and product portfolio, and provides a detailed overview of InfiniBand architecture, network topologies, protocols, and related software stacks such as OFED, highlighting their importance for data‑center, HPC, and cloud environments.

Data CenterHigh‑Performance ComputingInfiniBand

0 likes · 14 min read

Understanding Mellanox InfiniBand Technology and Its Role in High‑Performance Computing

Architects' Tech Alliance

Feb 3, 2019 · Fundamentals

Understanding GPUDirect RDMA: Principles, Implementation, and Performance

This article explains the background of GPU communication, introduces DMA and RDMA fundamentals, describes how GPUDirect RDMA enables direct GPU-to-GPU memory access across machines, and presents performance results showing reduced latency and increased bandwidth for distributed deep‑learning training.

GPU communicationGPUDirectInfiniBand

0 likes · 7 min read

Understanding GPUDirect RDMA: Principles, Implementation, and Performance

Architects' Tech Alliance

Jan 13, 2019 · Fundamentals

Overview of InfiniBand Technology and Its Protocol Stack

This article provides a comprehensive overview of InfiniBand technology, covering its open‑standard architecture, history, OFED software stack, protocol layers, performance advantages over traditional storage networks, and its primary use cases in high‑performance computing and data‑center environments.

High-Performance ComputingInfiniBandOFED

0 likes · 11 min read

Overview of InfiniBand Technology and Its Protocol Stack

Architects' Tech Alliance

Jan 10, 2019 · Fundamentals

Understanding RDMA: Principles, Advantages, and Implementation Details

This article explains the challenges of high‑performance computing and big‑data workloads on traditional TCP/IP stacks, introduces RDMA technology, its variants (InfiniBand, RoCE, iWARP), key protocols, hardware components, and how it achieves ultra‑low latency and high throughput with minimal CPU involvement.

InfiniBandNetwork ProtocolsRDMA

0 likes · 13 min read

Understanding RDMA: Principles, Advantages, and Implementation Details

Architects' Tech Alliance

Nov 25, 2018 · Industry Insights

Why RDMA Makes NVMe‑over‑Fabric Faster: A Deep Dive into Fabrics, FC, InfiniBand, RoCE and TCP

The article examines how NVMe‑over‑Fabric extends NVMe beyond PCIe using various fabrics—FC, InfiniBand, RoCE v2, iWARP and TCP—highlighting RDMA’s zero‑copy, kernel‑bypass and CPU‑free advantages, and comparing protocol differences, performance trade‑offs, and the evolution toward NVMe/TCP.

Fibre ChannelInfiniBandNVMe

0 likes · 13 min read

Why RDMA Makes NVMe‑over‑Fabric Faster: A Deep Dive into Fabrics, FC, InfiniBand, RoCE and TCP

Architects' Tech Alliance

Nov 7, 2018 · Fundamentals

Survey of Network Types and Vendors in High‑Performance Computing (HPC) Environments

The Intersect360 2016 survey of 474 HPC sites covering 723 compute systems, 633 storage systems and 638 LANs reveals that Ethernet and InfiniBand dominate system interconnect, storage and LAN networks, with Mellanox and Cisco accounting for over half of installations, while newer technologies such as 10 GE, 40 G, 56 G InfiniBand and Omni‑Path show evolving market shares driven by bandwidth and latency demands.

CiscoHPCInfiniBand

0 likes · 10 min read

Survey of Network Types and Vendors in High‑Performance Computing (HPC) Environments

Architects' Tech Alliance

Oct 31, 2018 · Fundamentals

Understanding InfiniBand: Architecture, Protocols, and Performance

InfiniBand is a high‑performance network protocol that uses credit‑based flow control and switched fabric architecture to provide low latency, high bandwidth, and reliable data transfer, offering advantages over TCP/IP such as reduced packet loss, efficient RDMA, and support for various upper‑layer protocols.

High Performance ComputingInfiniBandRDMA

0 likes · 10 min read

Understanding InfiniBand: Architecture, Protocols, and Performance

Architects' Tech Alliance

Oct 28, 2018 · Fundamentals

Understanding OpenFabrics Enterprise Distribution (OFED) and InfiniBand Software Architecture

This article provides a comprehensive overview of OpenFabrics Enterprise Distribution (OFED), its history, component stack, and the layered InfiniBand software architecture, explaining how various protocols such as IPoIB, SDP, and iSER enable high‑performance, low‑latency networking for Linux and Windows applications.

High-Performance ComputingInfiniBandLinux

0 likes · 8 min read

Understanding OpenFabrics Enterprise Distribution (OFED) and InfiniBand Software Architecture

Architects' Tech Alliance

Sep 13, 2018 · Fundamentals

Understanding NVMe over Fabrics (NVMe‑oF) with InfiniBand and NetApp EF570/E5700 Architecture

This article explains the fundamentals of NVMe‑oF, the role of InfiniBand components such as HCAs, switches, subnet managers and gateways, and why NetApp chose InfiniBand for its EF570/E5700 storage systems, highlighting performance benefits and protocol coexistence.

Data CenterInfiniBandNVMe-oF

0 likes · 9 min read

Understanding NVMe over Fabrics (NVMe‑oF) with InfiniBand and NetApp EF570/E5700 Architecture

Architects' Tech Alliance

Apr 22, 2018 · Fundamentals

An Overview of Remote Direct Memory Access (RDMA): Principles, Comparisons, and Implementations

This article provides a comprehensive overview of Remote Direct Memory Access (RDMA), detailing its underlying principles, performance advantages over traditional TCP/IP, various protocol families such as InfiniBand, RoCE, and iWARP, and their respective hardware and software requirements.

High Performance ComputingInfiniBandLow latency

0 likes · 9 min read

An Overview of Remote Direct Memory Access (RDMA): Principles, Comparisons, and Implementations

Architects' Tech Alliance

Apr 8, 2018 · Fundamentals

Understanding High‑Performance Computing (HPC): Market Size, Technologies, Metrics, and Core Components

This article provides a comprehensive overview of high‑performance computing, covering its rapid market growth, definition, classification into high‑throughput and distributed computing, key hardware components such as CPUs, GPUs, memory types, networking technologies like InfiniBand, performance metrics, benchmarking tools, and parallel file systems.

GPUHPCHigh Performance Computing

0 likes · 11 min read

Understanding High‑Performance Computing (HPC): Market Size, Technologies, Metrics, and Core Components

Architects' Tech Alliance

Jun 23, 2017 · Fundamentals

Analysis of Intel Omni-Path vs. InfiniBand: Architecture, Products, and Performance

The article provides a detailed analysis of Intel’s Omni-Path and InfiniBand technologies, covering their histories, architectural differences, product lineups, performance benchmarks, and market positioning within high‑performance computing; it also examines the role of the InfiniBand Trade Association, the impact of acquisitions by Intel and Mellanox, and the future prospects of both interconnect solutions.

High-Performance ComputingInfiniBandIntel

0 likes · 9 min read

Analysis of Intel Omni-Path vs. InfiniBand: Architecture, Products, and Performance

Architects' Tech Alliance

Jun 20, 2017 · Fundamentals

An Introduction to Remote Direct Memory Access (RDMA) and Its Supporting Protocols

Remote Direct Memory Access (RDMA) is a high‑performance networking technology that moves data directly between the memories of two computers without OS involvement, offering low latency, zero‑copy transfers and reduced CPU load through protocols such as InfiniBand, RoCE and iWARP.

InfiniBandRDMARoCE

0 likes · 9 min read

An Introduction to Remote Direct Memory Access (RDMA) and Its Supporting Protocols

Architects' Tech Alliance

Jun 8, 2017 · Cloud Computing

Mellanox InfiniBand Technology Overview: Architecture, Protocol Stack, and Product Portfolio

This article provides a comprehensive overview of Mellanox's InfiniBand solutions, covering the company's background, network architecture, routing algorithms, Fat‑Tree topology, the OFED software stack, management tools, MPI support, adapters, switches, routers, cables, and related products for high‑performance computing and cloud data centers.

Cloud ComputingData Center NetworkingFat-Tree

0 likes · 21 min read

Mellanox InfiniBand Technology Overview: Architecture, Protocol Stack, and Product Portfolio

Architects' Tech Alliance

Jun 5, 2017 · Fundamentals

Overview of InfiniBand Technology: Development, Advantages, Architecture, Protocol Layers, and Applications

This article provides a comprehensive overview of InfiniBand technology, covering its history, performance advantages over traditional interconnects, architectural concepts, layered protocol specifications, and typical use cases in high‑performance computing and data‑center environments.

Data CenterHigh-Performance ComputingInfiniBand

0 likes · 14 min read

Overview of InfiniBand Technology: Development, Advantages, Architecture, Protocol Layers, and Applications