
Stack Overflow Architecture Overview: Hardware, Scaling, and Infrastructure (2015)

This article provides a detailed overview of Stack Overflow's 2015 architecture, covering daily traffic growth, hardware upgrades, redundancy principles, DNS and ISP routing, HAProxy load balancing, IIS/ASP.NET web layer, Redis caching, WebSocket services, Elasticsearch search, SQL Server databases, and the open‑source tools that support the platform.

Architecture Digest

The article begins by comparing Stack Overflow's daily traffic metrics from November 2013 to February 2016, showing significant increases in HTTP requests, data transfer, Redis hits, and SQL queries, while highlighting improvements in request processing times due to hardware upgrades and performance tuning.

It then lists the major hardware components used in 2015, including four SQL Server machines, eleven IIS web servers, two Redis servers, three tag‑engine servers, three Elasticsearch nodes, four HAProxy load balancers, upgraded 10 Gbps network fabrics, Fortinet firewalls, and Cisco ASR routers.

The key operating principle is redundancy at every level: dual 10 Gbps network links, dual power supplies, rack‑level backup, and off‑site copies in a Colorado data center.

Internet connectivity relies on CloudFlare DNS, with internal DNS servers as a fallback, and on four ISPs (Level 3, Zayo, Cogent, Lightower). Traffic is routed active/active over BGP through ASR‑1001 and ASR‑1001‑X routers, and redundant 10 Gbps MPLS links connect the data centers.

Load balancing is handled by HAProxy 1.5.15 on CentOS 7, which performs TLS termination, rate limiting, and per‑host routing. Each HAProxy machine has dual 10 Gbps links for the internal network and the DMZ, plus ample memory for SSL session caching.
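To make the rate-limiting idea concrete, here is a minimal Python sketch of a per-client token bucket. This is a generic technique swapped in for illustration, not HAProxy's own mechanism or Stack Overflow's configuration; the class names, the `rate` and `capacity` values, and the use of the client IP as the key are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TokenBucket:
    """Allows `rate` requests per second, with bursts up to `capacity`."""
    rate: float      # tokens added per second (hypothetical value)
    capacity: float  # maximum burst size (hypothetical value)
    tokens: float
    updated: float   # timestamp of the last refill, in seconds

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client key (assumed here to be the client IP)."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_ip: str, now: float) -> bool:
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            # New clients start with a full bucket.
            bucket = TokenBucket(self.rate, self.capacity, self.capacity, now)
            self.buckets[client_ip] = bucket
        return bucket.allow(now)
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real deployment would read a monotonic clock and would also need to expire idle buckets.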

The web layer runs IIS 8.5 with ASP.NET MVC 5.2.3 on nine primary servers and two staging/meta servers, serving all Stack Exchange sites in a multi‑tenant fashion, while separate services (Careers, API, Mobile) run on dedicated instances.
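Multi-tenant serving means one application instance decides which site a request belongs to, typically from the `Host` header. The Python sketch below illustrates that lookup under stated assumptions: the `Site` type, the `SITES` table, and the `connection_name` values are hypothetical and are not taken from the Stack Exchange code base.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Site:
    name: str
    connection_name: str  # hypothetical key for per-site settings


# Hypothetical tenant table; a real system would load this from configuration.
SITES: Dict[str, Site] = {
    "stackoverflow.com": Site("Stack Overflow", "so-main"),
    "superuser.com": Site("Super User", "su-main"),
    "serverfault.com": Site("Server Fault", "sf-main"),
}


def resolve_site(host_header: str) -> Optional[Site]:
    """Map an HTTP Host header to a tenant, or None if the host is unknown."""
    host = host_header.strip().lower()
    host = host.split(":", 1)[0]  # drop an optional port
    if host.startswith("www."):
        host = host[4:]
    return SITES.get(host)
```

Every request then carries its resolved `Site` through the rest of the pipeline, so the same servers can answer for any tenant.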

Behind the web tier is a service layer of IIS servers on Windows Server 2012 R2 that hosts the tag engine and the Providence API, with careful socket affinity to reduce redundant cache loads.

Redis provides L1 (in‑process) and L2 (central) caching, pub/sub for cache invalidation, and powers machine‑learning recommendation services; the primary Redis nodes have 256 GB RAM, while the Providence nodes have 384 GB.
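The interaction between the in-process L1 cache, the central L2 cache, and pub/sub invalidation can be sketched in Python as follows. This is a simplified in-memory model, not the StackExchange.Redis API: a plain dict stands in for the central Redis store, `InvalidationBus` stands in for a pub/sub channel, and all class and method names are assumptions.

```python
from typing import Callable, Dict, List


class InvalidationBus:
    """Stand-in for a pub/sub channel: fans a key out to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, key: str) -> None:
        for callback in self._subscribers:
            callback(key)


class TwoTierCache:
    """One instance per web server: a local L1 dict in front of a shared L2."""

    def __init__(self, l2: Dict[str, str], bus: InvalidationBus) -> None:
        self.l1: Dict[str, str] = {}
        self.l2 = l2  # shared store, standing in for central Redis
        self.bus = bus
        bus.subscribe(self._evict_local)

    def _evict_local(self, key: str) -> None:
        self.l1.pop(key, None)

    def get(self, key: str, load: Callable[[], str]) -> str:
        if key in self.l1:          # L1 hit: no network round trip
            return self.l1[key]
        value = self.l2.get(key)    # L2 lookup
        if value is None:
            value = load()          # miss on both tiers: go to the source
            self.l2[key] = value
        self.l1[key] = value
        return value

    def set(self, key: str, value: str) -> None:
        self.l2[key] = value
        self.bus.publish(key)       # every server drops its stale L1 copy
        self.l1[key] = value
```

The point of the invalidation message is that a write on one web server cannot leave a stale L1 entry on another; the next read there falls through to L2 and picks up the new value.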

WebSockets are delivered via the NetGain server, handling up to 500,000 concurrent connections for real‑time notifications, with observed traffic patterns shown in accompanying charts.

Search is powered by Elasticsearch 1.4, with three‑node clusters per data center, full‑SSD storage, and 192 GB RAM, chosen over SQL full‑text search for scalability and cost efficiency.

SQL Server is the authoritative data source: two AlwaysOn availability groups (on Dell R720xd and R730xd hardware) host the main Q&A, Careers, OpenID, chat, and other databases. Dapper is the primary micro‑ORM.
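For readers unfamiliar with the term, a micro-ORM runs hand-written SQL and maps the resulting rows onto typed objects, without generating queries or tracking changes. The Python sketch below illustrates that idea with the standard-library `sqlite3` module; it is an analogy only, not Dapper's API, and the `Question` type, the `questions` table, and the `query` helper are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import List, Type, TypeVar

T = TypeVar("T")


@dataclass
class Question:
    id: int
    title: str
    score: int


def query(conn: sqlite3.Connection, cls: Type[T], sql: str,
          params: tuple = ()) -> List[T]:
    """Run hand-written SQL and map each row onto `cls` by column name."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    wanted = {f.name for f in fields(cls)}
    return [
        cls(**{col: val for col, val in zip(columns, row) if col in wanted})
        for row in cursor
    ]


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory table so the sketch can be tried end to end."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO questions VALUES (?, ?, ?)",
        [(1, "First question", 5), (2, "Second question", 12)],
    )
    return conn
```

Calling `query(make_demo_db(), Question, "SELECT id, title, score FROM questions WHERE score >= ?", (10,))` returns a list of `Question` objects. The SQL stays fully under the caller's control, which is the main appeal of this style for performance-sensitive code.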

The article concludes with a list of open‑source .NET tools used across the stack, such as Dapper, StackExchange.Redis, MiniProfiler, Exceptional, Jil, Sigil, NetGain, Opserver, and Bosun, and notes that future posts will dive deeper into hardware specifications and other architectural topics.

Written by

Architecture Digest
