Tagged articles

51 articles

May 24, 2026 · Artificial Intelligence

How Tsinghua & Tencent Mixed‑X Won the MLSys 2026 MoE Inference Challenge with a 4.1× Speedup

The Tsinghua‑Tencent Mixed‑X team captured the MLSys 2026 MoE inference optimization championship by analyzing NPU bottlenecks, redesigning data movement, applying expert‑level sharding, continuous DMA, PSUM batching, and an Agent‑based optimizer, achieving a 4.1× end‑to‑end speedup while preserving bit‑level output fidelity.

Agent optimizer · Inference Optimization · MLSys 2026

0 likes · 14 min read

Ray's Galactic Tech

Apr 19, 2026 · Operations

How to Make Real‑Time Speech Translation Reliable: Observability & Load‑Testing Secrets

This article dissects the challenges of building a production‑grade real‑time speech translation pipeline, explains why low latency, high accuracy, and resource contention are opposing forces, and then walks through a four‑layer architecture, metric design, tracing, structured logging, capacity planning, and a multi‑stage load‑testing methodology with concrete code examples and real‑world failure patterns.

Microservices · Observability · load-testing

0 likes · 39 min read

SuanNi

Mar 29, 2026 · Artificial Intelligence

How an AI Agent Outperformed NVIDIA Engineers in 7‑Day GPU Kernel Optimization

This article analyzes the AVO system, an autonomous AI agent that replaces traditional evolutionary search pipelines to iteratively improve CUDA attention kernels on NVIDIA's Blackwell B200 GPU, achieving up to 10.5% higher throughput than hand‑tuned implementations after a week of nonstop optimization.

AI · CUDA · GPU optimization

0 likes · 13 min read

dbaplus Community

Feb 24, 2026 · Cloud Native

How CPU Architecture Bottlenecks Cripple Netflix’s Container Scaling

Netflix discovered that scaling hundreds of containers on modern CPUs caused severe contention on mount‑related kernel locks, with performance varying across AWS instance types, NUMA designs, and hyper‑threading. In response, the team redesigned containerd mounting and adopted hardware‑aware scheduling to restore efficient scaling.

CPU architecture · Hyper-threading · NUMA

0 likes · 16 min read

Ray's Galactic Tech

Jan 28, 2026 · Operations

Building a Full Performance Engineering Loop with Spring Boot, SkyWalking, and Prometheus

This guide walks through constructing a sustainable performance‑engineering pipeline—from monitoring and metrics collection with SkyWalking, Prometheus, and Grafana, through targeted load testing and bottleneck analysis, to capacity modeling and alert solidification—for Spring Boot services.

Grafana · Prometheus · Spring Boot

0 likes · 8 min read

DeWu Technology

Jan 7, 2026 · Operations

From Chaos to Clarity: Building Full‑Stack Observability for Poizon’s Algorithm Ecosystem

This article details how Poizon’s algorithm platform evolved from fragmented tracing to a unified, scenario‑driven observability system that standardizes traces, metrics, logs, and events, introduces a knowledge‑graph of algorithm scenes, and applies compression, async reporting, and advanced anomaly detection to improve stability and debugging efficiency.

Algorithm Platform · Anomaly Detection · Log Standardization

0 likes · 26 min read

Refining Core Development Skills

Sep 3, 2025 · Operations

When Should You Hire a Dedicated Performance Engineering Team?

This article explains why modern enterprises increasingly need specialized performance engineering teams, outlines their ROI through cost savings, latency reduction, scalability, and engineering efficiency, details the engineers' responsibilities, and provides practical hiring guidelines and real‑world case studies.

Latency Reduction · Scalability · cost optimization

0 likes · 29 min read

Airbnb Technology Team

Aug 18, 2025 · Operations

How Airbnb’s Impulse Framework Enables Scalable, Decentralized Load Testing

This article explains how Airbnb’s internal Impulse load‑testing‑as‑a‑service framework combines a load generator, dependency simulator, traffic collector, and test‑API generator to provide decentralized, containerized performance testing that integrates seamlessly with CI/CD pipelines and mimics production traffic.

Airbnb · Containerization · Decentralized

0 likes · 11 min read

Baobao Algorithm Notes

Aug 1, 2025 · Artificial Intelligence

Why Training Large Language Models Feels Like Alchemy—and How to Master It

This article breaks down the hardware bottlenecks of large‑scale LLM training, explains the Roofline performance model, arithmetic intensity, and how computation and communication costs interact on GPUs and TPUs, offering concrete formulas and examples for efficient scaling.

Arithmetic intensity · Distributed computing · GPU

0 likes · 12 min read

Bitu Technology

Mar 14, 2025 · Backend Development

Designing Argos: A Next‑Generation Load‑Testing Tool for Super‑Bowl Scale and the Path to a 10x Engineer in the AI Era

The article recounts a Tubi meetup where senior engineers presented Argos, a cloud‑native load‑testing platform built with curl, Lambda, and ClickHouse to handle Super Bowl‑level traffic, while also discussing engineering mindset, cross‑team processes, and how AI tools empower developers to become 10x engineers.

10x engineer · AI tools · Cloud Functions

0 likes · 13 min read

FunTester

Feb 26, 2025 · Industry Insights

8 Software Testing Trends Shaping 2025: AI, Low‑Code, Shift‑Left/Right & More

The article outlines eight major software testing trends for 2025—including AI‑driven test automation, low‑code tools, shift‑left/right practices, chaos engineering, DevSecOps security testing, performance engineering, and autonomous testing—while advising engineers on skill upgrades and cross‑functional collaboration.

AI testing · DevSecOps · Shift-Left

0 likes · 16 min read

Python Programming Learning Circle

Jan 6, 2025 · Fundamentals

Beyond Moore's Law: Software, Algorithms, and Architecture as New Performance Drivers

The article examines how, as Moore's Law ends, performance gains will increasingly rely on software optimization, algorithmic advances, and hardware architecture innovations, illustrated by matrix multiplication benchmarks and discussions of Dennard scaling, parallelism, and emerging technologies.

Moore's Law · hardware architecture · performance engineering

0 likes · 10 min read

Airbnb Technology Team

Jul 2, 2024 · Frontend Development

Airbnb Page Performance Score (PPS): Multi‑Platform Metrics, Weighting, and Evolution

Airbnb created the Page Performance Score (PPS), a unified 0‑100 metric that aggregates platform‑specific initial‑load and post‑load user‑centric measurements for Web, iOS, and Android, using weighted curves to enable cross‑page, cross‑team comparisons, track organizational weighted averages, and evolve with new metrics while preserving a stable scale.

Airbnb · Metrics · Page Performance Score

0 likes · 10 min read

DevOps

Jun 16, 2024 · Operations

Performance Engineering Challenges and Practices for Software‑Defined Vehicles

The article examines how the shift to Software‑Defined Vehicles introduces complex performance engineering challenges across software, hardware, and organizational domains, and proposes an engineering‑driven, continuous‑observability approach—including modeling, monitoring, iterative optimization, and specialized team structures—to sustainably improve automotive software performance.

Observability · Performance Optimization · SDV

0 likes · 17 min read

Python Programming Learning Circle

May 28, 2024 · Fundamentals

Beyond Moore's Law: Leveraging Software, Algorithms, and Architecture for Future Performance Gains

With Moore's Law reaching its limits, a recent Science paper by MIT, Nvidia, and Microsoft researchers argues that future computing performance will rely on improvements in the software stack, algorithmic innovations, and hardware architecture, as demonstrated by performance engineering benchmarks and evolving hardware trends.

Algorithms · Moore's Law · Post-Moore Era

0 likes · 9 min read

iKang Technology Team

May 11, 2024 · Operations

How to Conduct Full‑Stack Load Testing for Reliable Production Systems

Full‑link load testing evaluates the performance of an entire application stack—from user interface to databases—by simulating real‑world traffic, isolating test data, verifying security and SLA thresholds, measuring key metrics such as throughput and response time, and comparing tools like tcpcopy and goreplay to ensure system stability and scalability.

Metrics · full-stack · load-testing

0 likes · 7 min read

iQIYI Technical Product Team

May 10, 2024 · Operations

Full‑Link Load Testing of iQIYI Playback Service: Process, Tools, and Outcomes

iQIYI implemented full‑link load testing of its playback service using LoadMaker for traffic generation and Rover for link control. The team mapped the topology, created weighted user scenarios, and safely applied load to production‑like environments. The tests validated capacity at several times the historical peak, uncovered bottlenecks, and enabled several performance and disaster‑recovery improvements without impacting real users.

capacity planning · iQIYI · load-testing

0 likes · 10 min read

JD Retail Technology

Feb 5, 2024 · Mobile Development

Efficient Optimization of JD App Review API: Reducing Peak Requests by 85% without Impacting User Experience

By analyzing user behavior and decoupling the review layer from the product detail flow, the JD app team introduced a scroll‑triggered, configurable request mechanism and fallback handling, achieving up to an 85% reduction in peak review API QPS during major sales events without degrading user experience.

API throttling · Asynchronous Loading · Mobile Optimization

0 likes · 7 min read

Tencent Cloud Developer

Sep 21, 2023 · Frontend Development

Memory Optimization of Desktop QQ: A Stage Summary

To curb Desktop QQ’s excessive Electron memory consumption, the team applied comprehensive profiling, code and resource slimming, thumbnail image generation, visible‑only DOM rendering, layer merging, Lottie animation tuning, API caching, and leak removal, achieving average usage around 228 MB and keeping all six processes below 300 MB with ongoing monitoring.

Desktop application · Electron · Memory Optimization

0 likes · 24 min read

Didi Tech

Aug 17, 2023 · Operations

Construction of a Full-Link Load Testing Simulation Measurement System for Didi Ride-Hailing

The article details how Didi’s ride‑hailing team built a system that measures how faithfully full‑link load tests simulate production traffic. It quantifies test coverage across five dimensions (interface, scenario, category, link, and module) using normalized metrics, traffic prediction, and scoring formulas to identify gaps, improve stability, and guide future capacity‑planning enhancements.

Didi · Ride Hailing · load-testing

0 likes · 16 min read

DaTaobao Tech

Jun 14, 2023 · Artificial Intelligence

Optimizing NeRF for Real-Time Mobile 3D Rendering in Alibaba's Object Drawer

Alibaba’s Taobao engineers detail how they transformed slow, high‑quality NeRF reconstruction into a real‑time mobile solution by combining an Octree‑Tiny‑MLP architecture, SNeRG optimizations, and a high‑frequency voxel reduction that shrank models to ~5 MB and achieved ~6 FPS on low‑end Android phones, targeting sub‑1 MB models and 50 FPS.

3D Reconstruction · Mobile Optimization · NeRF

0 likes · 10 min read

DaTaobao Tech

May 31, 2023 · Mobile Development

From Intern to Senior Engineer: Lessons on Writing Quality Android Code

This article shares a senior engineer’s journey from internships to three years at Taobao, offering practical advice on writing readable, high‑performance Android code, mastering design principles, handling performance metrics, and maintaining a growth mindset while contributing to a mobile‑focused tech team.

APM · Android · Design Patterns

0 likes · 13 min read

dbaplus Community

Mar 22, 2023 · Databases

Scaling an Airline Ticket Order Database: From Monolith to 64‑Shard Sharding

The article details how a rapidly growing airline ticket order system was re‑architected by identifying performance bottlenecks, applying vertical and horizontal sharding, optimizing cache layers, implementing dual‑write mechanisms, and planning a phased migration to achieve ten‑fold QPS growth while reducing resource usage and operational risk.

Cache Optimization · Distributed Systems · Dual Write

0 likes · 38 min read

NetEase Yanxuan Technology Product Team

Feb 20, 2023 · Big Data

Data Task Optimization Techniques and Practices

The article surveys unconventional offline data‑task optimizations—such as distribution‑by, seeded random shuffling, explode‑based skew mitigation, hash bucketing, task‑parallelism tuning, and multi‑insert materialization—organized by point, line, and surface perspectives, and stresses that effective performance gains require both technical tricks and business‑driven pipeline adjustments.

Distributed computing · Hive · SQL Tuning

0 likes · 16 min read

Weimob Technology Center

Feb 3, 2023 · Operations

How Full‑Link Load Testing Became the Secret Weapon for E‑Commerce Mega‑Sales

This article explains how micro‑enterprise SaaS leader Weimob built a full‑link load‑testing platform to simulate real‑world traffic for major shopping festivals, detailing the challenges, architecture, capabilities, results, and future plans for ensuring system stability and performance at scale.

JMeter · capacity planning · e-commerce

0 likes · 16 min read

ByteDance SYS Tech

Jan 6, 2023 · Fundamentals

How ByteDance Scaled Profile‑Guided Optimization to Boost CPU Efficiency

This article explains ByteDance's large‑scale adoption of profile‑guided optimization (PGO), covering its principles, instrumentation and sampling methods, the automated platform built for data collection and compilation, and the resulting performance gains across dozens of critical services.

ByteDance · Compiler Optimization · Instrumentation

0 likes · 12 min read

Bilibili Tech

Dec 27, 2022 · Operations

Optimizing QUIC Gateway Performance with AF_XDP

Bilibili’s video CDN replaced its traditional TCP‑based gateway with a QUIC/HTTP‑3 gateway and, to curb the extra CPU load caused by complex UDP handling, adopted AF_XDP kernel‑bypass sockets that redirect packets via XDP, cutting CPU usage by about half, raising peak bandwidth to roughly 9 Gbps and improving per‑bandwidth efficiency by up to 30 %.

AF_XDP · Linux kernel · QUIC

0 likes · 14 min read

iQIYI Technical Product Team

Jul 8, 2022 · Mobile Development

Performance Optimization Practices for iQIYI International Mobile App

To prevent massive user loss from slow loads, iQIYI International overhauled its mobile app’s network stack: it switched to HTTPDNS, enabled TCP Fast Open, upgraded to HTTP/2 and TLS 1.3, compressed payloads with Brotli and WebP, and adopted protobuf, caching, and fallback protocols. The changes cut latency, reduced failures, and improved video playback across Southeast Asian markets.

HTTP/2 · TLS · edge computing

0 likes · 10 min read

Bilibili Tech

Jun 24, 2022 · Cloud Native

Evolution and Design of Bilibili's Load‑Testing Platform (Platform 2.0)

Bilibili’s load‑testing platform evolved from ad‑hoc JMeter scripts to a fully automated, self‑service system (Platform 2.0) that uses a custom load client, adaptive scheduling, and flexible scenario modes—including traffic replay and data‑isolated testing—to efficiently stress‑test over a hundred microservices for large‑scale events, with further integration and circuit‑breaker enhancements planned.

Distributed Systems · Microservices · cloud-native

0 likes · 27 min read

Baidu Geek Talk

May 25, 2022 · Backend Development

Large-Scale C/C++ Service Compilation Performance Optimization and Platformization (OMAX)

The article details OMAX’s end‑to‑end platform for large‑scale C/C++ service compilation, covering optimization flags, profile‑guided and link‑time techniques, Facebook BOLT post‑link tuning, and real‑world results that cut CPU use, latency and deployment time while shrinking binary size.

Bolt · C · Cloud Services

0 likes · 24 min read

SQB Blog

May 9, 2022 · Operations

How Havok Enables Realistic Full‑Link Load Testing for Scalable Services

This article explains how the Havok full‑link load testing platform was designed and built to replay real traffic safely, provide capacity‑assessment data, support multiple test types, and offer real‑time monitoring and circuit‑breaker protection for large‑scale online services.

Monitoring · capacity planning · full‑link testing

0 likes · 16 min read

Tencent Cloud Developer

Nov 30, 2021 · Cloud Native

Tencent sTGW TQUIC: Reducing Login Latency by 30% and Boosting 500 ms Success Rate to 90%

Tencent’s sTGW team introduced the TQUIC protocol stack, a lightweight QUIC/HTTP‑3 implementation with 0‑RTT handshakes, connection migration, and real‑time frames, cutting user login latency by 30 % and raising 500 ms download success from 60 % to 90 % in weak‑network conditions while shrinking the Android library to roughly 3 MB.

Cloud Native · QUIC · Real-time Transmission

0 likes · 14 min read

DevOpsClub

Oct 11, 2021 · R&D Management

How Ant Group Scaled R&D Efficiency with a Data‑Driven Insight Platform

Ant Group built a comprehensive R&D Insight system that combines measurement infrastructure, a unified metric framework, and a comprehensive evaluation model to turn massive development data into actionable diagnostics, enabling company‑wide, team‑level, and outsourcing efficiency improvements across thousands of engineers.

DevOps · R&D metrics · Software measurement

0 likes · 29 min read

High Availability Architecture

Jun 2, 2021 · Operations

Design and Implementation of Full‑Link Load Testing at Dada Group

This article describes Dada Group’s evolution from a simple 1:1 test environment to a sophisticated machine‑labeling load‑testing solution, detailing core design, isolation techniques, custom testing platform, model construction, pre‑heat strategies, and post‑test analysis that ensure system stability during high‑traffic events.

Distributed Systems · Microservices · load-testing

0 likes · 16 min read

Xueersi Online School Tech Team

Mar 12, 2021 · Operations

Evolution of Live Streaming Load Testing and Stability Assurance for an Online Education Platform

The article details how an online education provider progressively enhanced its live‑streaming performance testing framework—from rudimentary "stone age" checks to automated, data‑driven "information age" practices—by restructuring services, refining test scenarios, introducing traffic replay, and automating script generation to achieve more reliable and efficient stability assurance.

automation · load-testing · online education

0 likes · 12 min read

21CTO

Jan 15, 2021 · Operations

How iQIYI Scaled Its Payment System with Full‑Link Load Testing

This article details iQIYI's end‑to‑end load‑testing methodology for its payment platform, covering problem identification, core‑link mapping, environment setup, realistic traffic modeling, execution safeguards, results from capacity verification and stress testing, and future plans for a unified testing solution.

Operations · capacity planning · load-testing

0 likes · 12 min read

Amap Tech

Nov 19, 2020 · Operations

TestPG Load‑Testing Platform: Precise Pressure Control Architecture and Practice

The TestPG load‑testing platform, built on a master‑slave architecture with Redis‑driven dynamic configuration, delivers fine‑grained, cluster‑ and interface‑level pressure control that automates load‑generator allocation, shortens holiday testing cycles to three days, and produces realistic traffic models for Gaode’s nationwide services.

Distributed Systems · automation · load-testing

0 likes · 14 min read

iQIYI Technical Product Team

Sep 18, 2020 · Operations

Full-Chain Load Testing Practices for iQIYI Payment System

iQIYI’s payment team built a full‑chain load‑testing framework that isolates data, mocks dependencies, constructs realistic multi‑service traffic, and executes protected tests to expose bottlenecks, guide scaling and optimizations, and ultimately ensure reliable payment services during traffic spikes, while planning a unified automation platform.

Monitoring · capacity planning · full-chain testing

0 likes · 13 min read

JD Retail Technology

Oct 31, 2019 · Operations

Collaborative Load Testing for JD.com 11.11 Event: Organizational Changes, Scale Expansion, and ForceBot Traffic Recording & Replay

The article details JD.com's coordinated effort to prepare for the 11.11 shopping festival by expanding load‑testing scale, improving cross‑team collaboration, and enhancing the ForceBot platform with traffic recording and replay capabilities to achieve more realistic and efficient full‑chain performance evaluations.

JD.com · Operations · forcebot

0 likes · 7 min read

Architecture Digest

Aug 26, 2019 · Operations

Ensuring System Stability for High‑Scale Services: Full‑Link Load Testing at Gaode

The article describes how Gaode handles the challenges of supporting over 100 million daily active users by applying capacity planning, traffic control, disaster recovery, monitoring, rehearsal, and a self‑built full‑link load‑testing platform that simulates realistic traffic, manages resources, and provides detailed reporting to guarantee system stability.

Gaode · full‑link testing · load-testing

0 likes · 20 min read

Amap Tech

Aug 20, 2019 · Operations

Full‑Link Load Testing and Stability Assurance at Gaode: Architecture, Practices, and Future Directions

To guarantee stability for over 100 million daily users, Gaode combines capacity planning, traffic control, disaster recovery, monitoring, and pre‑plan drills with a self‑built full‑link load‑testing platform (TestPG) that replays realistic traffic in production‑like environments, isolates test loads, provides rapid configuration, detailed debugging, automated error capture, and comprehensive reporting, while planning future enhancements such as integrated topology monitoring, advanced pressure models, and confidence evaluation.

Distributed Systems · capacity planning · load-testing

0 likes · 20 min read

Architecture Talk

Apr 28, 2019 · Operations

How Didi Engineered a Full‑Link Load‑Testing Platform to Safeguard Millions of Daily Rides

This article details Didi's 2016 full‑link load‑testing initiative, covering data‑isolation strategies, virtual driver/passenger tooling, trace‑based traffic marking, staged deployment tactics, and the operational insights gained from stress‑testing a massive ride‑hailing platform.

Data Isolation · Didi · Ride Hailing

0 likes · 12 min read

Baidu Intelligent Testing

Apr 16, 2018 · Operations

Online Load‑Testing Practices for Baidu Nuomi Marketing Activities

This article presents a comprehensive case study of Baidu Nuomi's online load‑testing methodology for high‑traffic marketing events, covering capacity estimation, test planning, execution, anti‑attack measures, platform architecture, and lessons learned to ensure system reliability and performance under peak loads.

Online Testing · capacity planning · load-testing

0 likes · 16 min read

ITFLY8 Architecture Home

Apr 12, 2018 · Backend Development

How WhatsApp Scales to 450 Million Users with Just 32 Engineers

This article examines WhatsApp's high‑reliability architecture, detailing how a tiny team of 32 engineers leverages Erlang, FreeBSD, and custom BEAM patches to support hundreds of nodes, thousands of cores, and hundreds of terabytes of memory for over 450 million active users.

Erlang · Scalability · WhatsApp

0 likes · 22 min read

Efficient Ops

Jan 15, 2018 · Operations

How to Build a Full‑Chain Load‑Testing Platform for E‑Commerce in 2 Days

This article details how Xiaohongshu tackled rapid growth challenges by designing, implementing, and operating a full‑link performance testing platform in just two days, covering system architecture, testing models, collaborative deployment, capacity planning, and practical advice for teams seeking reliable e‑commerce load testing.

Operations · e-commerce · full-chain testing

0 likes · 9 min read

Efficient Ops

Jun 19, 2017 · Operations

How JD.com’s ForceBot Revolutionized 618 Sale Load Testing

This article examines JD.com’s 618 shopping festival performance, the deployment of unmanned delivery robots, and the design and architecture of the ForceBot full‑link load‑testing system that enabled precise capacity planning and bottleneck detection for massive e‑commerce traffic.

System Architecture · capacity planning · e-commerce

0 likes · 8 min read

Alibaba Cloud Developer

Jun 1, 2017 · Operations

How Alibaba Engineers Capacity Planning and Full‑Link Load Testing for Massive Sales Events

This article explains Alibaba's four‑step capacity‑planning methodology, the various single‑machine load‑testing techniques, the design of a full‑link load‑testing platform for Double‑11, and the dynamic flow‑control framework that together ensure system stability during extreme traffic spikes.

Alibaba · Operations · capacity planning

0 likes · 18 min read

21CTO

Apr 13, 2017 · Operations

Mastering Internet Performance Engineering and Capacity Planning

This article presents a comprehensive methodology for internet performance engineering, covering non‑functional quality goals, detailed metrics for application servers, databases, caches and message queues, a practical technical review outline, and a real‑world capacity‑planning case study with both maximal and minimal resource solutions.

Backend Architecture · Non-functional Requirements · Operations

0 likes · 24 min read

Efficient Ops

Jan 12, 2017 · Operations

How JD’s ForceBot Revolutionizes Full‑Chain Load Testing for Massive Shopping Events

ForceBot is JD.com’s comprehensive full‑chain load‑testing platform that simulates user behavior across the entire purchase flow, isolates test traffic, leverages Docker‑based agents, GRPC services, and real‑time data analytics to identify bottlenecks, optimize resource planning, and support both routine and peak‑traffic scenarios.

automation · load-testing · performance engineering

0 likes · 16 min read

ITPUB

Jul 7, 2016 · Databases

How Software Performance Engineering Boosts Database Optimization

The talk explains how systematic software performance engineering, through six optimization patterns such as Fast Path, Batching, Flex Path, First Things First, Coupling, and Alternate Routes, can identify and resolve database performance bottlenecks without merely adding more hardware resources.

Optimization · Resource Management · Software

0 likes · 14 min read

ITPUB

Feb 25, 2014 · Databases

Why Database Query Optimization Matters and Key Strategies to Cut Scan Times

The article explains how costly full‑table scans in large databases, such as a bank's million‑record account table, can be reduced to minutes through effective query‑optimization techniques, and outlines the main strategies, automation possibilities, and future trends in SQL performance.

Database Optimization · Full Table Scan · Query Tuning

0 likes · 3 min read
