Tag

AI infrastructure

19 articles collected under this tag.

DataFunTalk
Jun 15, 2025 · Artificial Intelligence

Sam Altman Reveals the ‘Stargate’ AI Infrastructure Blueprint and Its $500B Future

In a Bloomberg Originals interview, OpenAI CEO Sam Altman discusses the massive “Stargate” infrastructure project, exploding demand for AI compute, multi‑partner collaborations, a projected $500 billion investment, GPU bottlenecks, and his vision for AI’s role in science, employment and humanity’s future.

AI funding · AI future · AI infrastructure
25 min read
Baidu Geek Talk
Apr 14, 2025 · Artificial Intelligence

PaddlePaddle Framework 3.0: Five Core Breakthroughs Reshaping Large Model Development

PaddlePaddle Framework 3.0 delivers five breakthroughs—dynamic‑static unified automatic parallelism, integrated training‑inference pipelines, high‑order scientific differentiation, a neural‑network compiler with automatic operator fusion, and streamlined heterogeneous chip adaptation—drastically reducing development effort, boosting training speed, and expanding compatibility for large‑scale AI models.

AI infrastructure · Distributed Training · Large Language Models
23 min read
AntData
Mar 5, 2025 · Cloud Native

DeepSeek 3FS Network Communication Module: Design, Implementation, and Impact on AI Infrastructure

This article provides an in‑depth analysis of DeepSeek's open‑source 3FS distributed storage system, focusing on its network communication module, RDMA‑based design, core classes such as IBSocket, Listener, and IOWorker, and how these innovations advance high‑performance AI infrastructure.

AI infrastructure · Distributed Storage · Folly Coroutines
15 min read
Alibaba Cloud Infrastructure
Jan 20, 2025 · Cloud Computing

2024 Alibaba Cloud Infrastructure Network Team: AI‑Scale Network Innovations, Academic Achievements, Open‑Source Contributions and Industry Outreach

The 2024 report of Alibaba Cloud's Infrastructure Network team details AI‑driven network breakthroughs, high‑performance protocol stacks, large‑scale monitoring systems, numerous top‑conference paper acceptances, open‑source ecosystem initiatives, and extensive industry outreach, highlighting the evolving AI infra landscape.

AI infrastructure · Conference Papers · High Performance Computing
19 min read
DataFunSummit
Dec 30, 2024 · Artificial Intelligence

Colossal-AI: A Scalable Framework for Distributed Training of Large Models

This presentation introduces the challenges of the large‑model era, describes the Colossal‑AI architecture—including N‑dimensional parallelism, heterogeneous storage, and zero‑code experience—shows benchmark results and real‑world use cases, and answers audience questions about its integration with PyTorch and advanced parallel strategies.

AI infrastructure · Benchmark · Colossal-AI
11 min read
DataFunSummit
Dec 24, 2024 · Artificial Intelligence

Considerations and Practices for Adapting Large‑Model Inference Engines to Domestic Chips

This article examines the importance of domestic large‑model inference engines, compares Chinese and international chips, evaluates four architectural approaches, discusses practical challenges such as performance loss and model support, and outlines future expectations for high‑performance, heterogeneous‑chip inference solutions.

AI infrastructure · Domestic Chip · Inference Engine
9 min read
Alibaba Cloud Infrastructure
Nov 29, 2024 · Artificial Intelligence

Mooncake: Open-Source KVCache-Centric Large Model Inference Architecture Co-Developed by Alibaba Cloud and Tsinghua University

In June 2024, Alibaba Cloud and Tsinghua University's MADSys Lab announced the open‑source Mooncake architecture, a KVCache‑centered large‑model inference framework that boosts throughput, lowers cost, and standardizes resource‑pooling techniques for high‑performance AI inference across industry and academia.

AI infrastructure · Alibaba Cloud · KVCache
4 min read
DevOps
Nov 27, 2024 · Artificial Intelligence

Elon Musk’s Colossus Supercomputer: Building 100,000 GPUs in 122 Days and Its Impact on AI Infrastructure

The article analyzes Elon Musk’s Colossus AI supercomputer—its 100,000 NVIDIA H100 GPUs, record‑fast 122‑day construction, vertical‑integration strategy, and the broader implications for U.S. AI infrastructure dominance and China’s competing challenges in funding and chip supply.

AI Strategy · AI infrastructure · Elon Musk
13 min read
Baidu Geek Talk
Oct 30, 2024 · Cloud Computing

Baidu Cloud Infrastructure for AI-Native Era

Baidu Intelligent Cloud outlines how its evolving, high-performance infrastructure—featuring rapid 3-minute instance provisioning, over 200 Gbps of network bandwidth, elastic computing, specialized storage, and AI-driven MLOps tools—enables AI-native model training and deployment across booming sectors such as automotive and finance, supporting the industry's shift to AI-centric cloud services.

AI infrastructure · Case Studies · Distributed Systems
9 min read
360 Tech Engineering
Oct 15, 2024 · Artificial Intelligence

Implementation and Optimization of 360 AI Compute Center: Infrastructure, Network, Kubernetes, and Training/Inference Acceleration

The article details the design and deployment of 360's AI Compute Center, covering GPU server selection, high‑performance networking, Kubernetes‑based cluster management, advanced scheduling, training and inference acceleration techniques, and a comprehensive AI development platform with visualization and fault‑tolerance features.

AI infrastructure · Distributed Computing · GPU cluster
21 min read
Alibaba Cloud Infrastructure
Oct 12, 2024 · Fundamentals

Alibaba Cloud Server R&D Team Publishes Three Papers on High‑Density PCIe 6.0, 100G‑PAM4 Ethernet, and Immersion‑Cooling PCB Materials at IEEE EPEPS 2024 and PCB West 2024

Alibaba Cloud's server R&D team presented three research papers at IEEE EPEPS 2024 and PCB West 2024 covering high‑density PCIe 6.0 crosstalk optimization, 100G‑PAM4 Ethernet performance under air and immersion cooling, and sustainable low‑cost PCB materials for immersion‑cooled computer systems, highlighting their relevance to AI infrastructure and data‑center design.

AI infrastructure · PCB Materials · PCIe 6.0
10 min read
360 Zhihui Cloud Developer
Oct 11, 2024 · Artificial Intelligence

How 360 Built a Thousand‑GPU AI Supercomputer with Kubernetes and Advanced Scheduling

This article details the design and implementation of 360’s AI Computing Center, covering server selection, network topology, Kubernetes scheduling, training and inference acceleration, and the AI platform’s core, visualization, and fault‑tolerance capabilities for large‑scale AI workloads.

AI infrastructure · Distributed Training · GPU cluster
22 min read
DataFunSummit
Sep 24, 2024 · Artificial Intelligence

Streaming Data Pipelines and Scaling Laws for Efficient Large‑Model Training

The article discusses the challenges of training ever‑larger AI models on internet‑scale data, critiques traditional batch ETL pipelines, and proposes a streaming data‑flow architecture with dynamic data selection and a shared‑memory/Alluxio middle layer to decouple data processing from model training, improving efficiency and scalability.

AI infrastructure · Large Models · data pipelines
20 min read
360 Zhihui Cloud Developer
Sep 19, 2024 · Operations

How TAI Platform Optimizes Large‑Model Scheduling and Fault Recovery on Kubernetes

This article explains how the TAI platform leverages Kubernetes and Volcano to tackle fault, efficiency, and usability challenges in large‑model training and inference, detailing custom resources, automated fault detection, and advanced scheduling strategies that boost resource utilization and performance.

AI infrastructure · Kubernetes · Large Models
9 min read
AntTech
Sep 15, 2024 · Artificial Intelligence

Dr. Wang Jian’s Keynote on AI, AI+, and AI Infrastructure at the 2024 Inclusion·Bund Conference

In his 2024 Inclusion·Bund Conference keynote, Dr. Wang Jian traces the short yet intense history of artificial intelligence, explains the emergence of AI+, discusses the pivotal role of transformer‑based models and AI infrastructure, and reflects on how cloud computing and innovative business models are reshaping the AI ecosystem.

AI · AI infrastructure · AI+
16 min read
Architects' Tech Alliance
Sep 8, 2024 · Artificial Intelligence

Design and Architecture of Ten‑Thousand‑GPU‑Scale Clusters for Large‑Scale AI Model Training

The article surveys the network architectures and congestion‑control techniques used in massive GPU clusters—such as ByteDance's MegaScale, Baidu HPN, Alibaba HPN7, and Tencent Xingmai 2.0—highlighting how high‑bandwidth, low‑latency designs and advanced RDMA technologies enable training of trillion‑parameter multimodal AI models.

AI infrastructure · Data Center · GPU clusters
11 min read
DataFunSummit
Aug 24, 2024 · Databases

Cloud‑Native Storage Solutions for Large‑Scale Vector Data with Milvus and Zilliz

This article presents a comprehensive overview of Zilliz’s cloud‑native vector database ecosystem, detailing Milvus’s distributed architecture, indexing and query capabilities, related tools such as Towhee and GPTCache, storage challenges, tiered storage designs, performance metrics, and real‑world AI use cases like code‑assist and RAG‑based Q&A systems.

AI infrastructure · ANN Search · Large Scale Storage
21 min read
DataFunTalk
Jul 8, 2024 · Artificial Intelligence

Challenges and Techniques for Distributed Training of Large Language Models

This article reviews the historical background of large language models, the major challenges of distributed training—chiefly massive compute and memory demands—and the technical ecosystem that addresses them, including data parallelism, pipeline parallelism with 1F1B scheduling, and optimization frameworks such as DeepSpeed.

AI infrastructure · DeepSpeed · Distributed Training
22 min read
Baidu Tech Salon
May 15, 2024 · Artificial Intelligence

Accelerating Large Model Training and Inference with Baidu Baige AIAK‑LLM

Baidu Baige’s AIAK‑LLM suite accelerates large‑model training and inference by boosting Model FLOPS Utilization through techniques such as TP communication overlap, hybrid recompute, zero‑offload, automatic parallel‑strategy search, multi‑chip support, and inference‑specific optimizations, achieving over 60% speedup and seamless Hugging Face integration.

AI infrastructure · AIAK-LLM · Baidu Baige
26 min read