Designing Scalable IT Infrastructure for Large Internet Enterprises: A Practical Overview
This article presents a comprehensive overview of the IT infrastructure model used by a fast‑growing internet company, detailing layered design from business to data‑center, high‑availability services, automation, and the five‑fold approach of service, standardization, automation, self‑service, and technology.
Speaker Introduction
Chen Yitai, Senior Manager of Operations at Vancl, is responsible for IDC data‑center and website technical operations as well as internal corporate IT systems and network maintenance. He has over ten years of experience in IT infrastructure; he previously worked at the Microsoft China Technical Center and later joined Vancl, where he has been deeply involved in building the company's system and network infrastructure.
Today I am pleased to share my talk titled "Overview of IT Infrastructure for Mid‑to‑Large Internet Enterprises".
I witnessed Vancl's rapid growth from 2011 to 2013, when the company peaked at over 13,000 employees. That raises the question: what is distinctive about its internal IT infrastructure?
Topic Introduction
Fundamentally, IT is a service, and its customer is the enterprise. The business therefore determines the direction of IT adoption and the characteristics of the IT infrastructure.
The IT infrastructure design philosophy I will present follows a model that serves business goals, using the model as a framework to discuss concrete implementations.
Insights from the TCP/IP Model
We are all familiar with the TCP/IP and OSI models, which guide us in solving technical problems.
In the TCP/IP model, the top layer is the application layer, followed by transport, network, data‑link, and physical layers, each providing services to the layer above.
This reflects a top‑down, layered, and modular design approach.
IT Design Model and Implementation
Borrowing this layered approach and placing a business layer on top yields five layers: Business Layer, Application System Layer, Base System Layer, Network Layer, and Data‑Center Layer. Two cross‑cutting aspects complete the model: Resource Management & Monitoring, and Auditing & Security.
Brief description of each layer:
Business Layer – company’s administrative, financial, and sales activities.
Application System Layer – technical realization of business, e.g., ERP, OA, CRM.
Base System Layer – familiar to operations staff: DNS, DHCP, file services, mail.
Network Layer – routers, switches, firewalls forming the inter‑connected network.
Data‑Center Layer – servers and switches physically housed in data‑centers; broadly includes all physical IT assets.
Resource management, monitoring, security, and auditing span all layers.
In practice, the enterprise IT infrastructure, according to this model, covers the Base System, Network, and Data‑Center layers; the upper layers influence how these foundational layers are designed and built to meet business growth and application requirements.
Vancl’s implementation follows a set of five principles—service‑orientation, standardization, automation, self‑service, and technology‑driven development.
Historical Background
Using Vancl as a case study, we trace the evolution from business needs to concrete system and network designs.
From 2011 to 2013, the employee count grew from ~300 to >13,000, and the number of sites grew from 4 to about 40 nationwide (28 warehouses, 6 offices, and 6 data‑centers), creating massive operational pressure.
Typical daily tasks included unpacking boxes, distributing computers, and provisioning hundreds of PCs at once.
Business and Application System Layers
Vancl’s business is divided into four major areas: Enterprise Office, Warehouse Center, Call Center, and Express Logistics.
Warehouse operations illustrate how virtual goods on the website correspond to physical inventory across ~30 warehouses, influencing IT design.
Base System Layer – DNS and DHCP
High availability is critical for warehouse and call‑center services.
Vancl primarily uses Windows Server for DNS and DHCP because of Active Directory integration, which simplifies management and provides built‑in redundancy.
Each site runs two DNS servers whose zones are synchronized through AD replication. Because either server can accept updates, this provides two‑way redundancy that a traditional master‑slave DNS setup lacks.
Clients configure two local DNS servers for failover.
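The client‑side failover can be pictured with a minimal Python sketch. The resolver functions, hostname, and address below are hypothetical stand‑ins, not Vancl's configuration; on a real client the operating system's stub resolver performs this step.

```python
from typing import Callable, Optional, Sequence

# A resolver takes a hostname and returns an IP string, or raises on failure.
Resolver = Callable[[str], str]


def resolve_with_failover(name: str, resolvers: Sequence[Resolver]) -> Optional[str]:
    """Try each configured DNS server in order; return the first answer."""
    for resolver in resolvers:
        try:
            return resolver(name)
        except OSError:
            # Server unreachable or timed out: fall through to the next one.
            continue
    return None


def dead_server(name: str) -> str:
    # Hypothetical first server that is currently down.
    raise TimeoutError("DNS server did not respond")


def healthy_server(name: str) -> str:
    # Hypothetical second server holding the same replicated zone data.
    records = {"fileserver.example.internal": "10.1.0.20"}
    return records[name]


answer = resolve_with_failover(
    "fileserver.example.internal", [dead_server, healthy_server]
)
print(answer)
```

Because both servers hold the same zone data, the order in which a client lists them affects latency during an outage but not the answer it receives.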
Internal and external DNS namespaces are kept separate to avoid conflicts and improve security.
About DHCP
Subnet and IP‑pool planning must consider future growth; Vancl uses a 10.0.0.0 private network.
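As an illustration of planning for growth, the sketch below carves the 10.0.0.0/8 range into one block per site using Python's standard `ipaddress` module. The /16‑per‑site sizing and the site names are assumptions made for the example; the talk does not state Vancl's actual allocation.

```python
import ipaddress

# The 10.0.0.0 private range comes from the text; /16 per site is an assumption.
company_net = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical site names, purely for illustration.
sites = ["hq-office", "warehouse-01", "call-center", "idc-01"]

# Carve one /16 per site; a /8 yields 256 such blocks, leaving room to grow.
site_blocks = dict(zip(sites, company_net.subnets(new_prefix=16)))

for site, block in site_blocks.items():
    print(f"{site}: {block} ({block.num_addresses} addresses)")

spare = 2 ** (16 - company_net.prefixlen) - len(sites)
print(f"unallocated /16 blocks: {spare}")
```

Allocating on fixed, aligned boundaries like this also keeps per‑site routes easy to summarize as the number of sites grows.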
Two DHCP servers per site follow a "50/50 principle": each server handles a distinct half of the address range, so the two never hand out conflicting leases.
Windows DHCP clustering depends on a shared quorum (arbitration) disk, which can itself become a single point of failure; Vancl therefore prefers the simpler split‑range approach.
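The split‑range idea can be sketched in a few lines of Python. The helper name `split_pool` and the pool boundaries are hypothetical; the point is only that the two halves never overlap, so the servers need no shared state.

```python
import ipaddress
from typing import Tuple

Range = Tuple[str, str]


def split_pool(first: str, last: str) -> Tuple[Range, Range]:
    """Split an inclusive address pool into two non-overlapping halves."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end <= start:
        raise ValueError("pool needs at least two addresses")
    mid = start + (end - start) // 2

    def to_ip(value: int) -> str:
        return str(ipaddress.IPv4Address(value))

    return (to_ip(start), to_ip(mid)), (to_ip(mid + 1), to_ip(end))


# Hypothetical pool inside a site subnet; the addresses are illustrative only.
server_a, server_b = split_pool("10.1.0.50", "10.1.0.249")
print("DHCP server A:", server_a)
print("DHCP server B:", server_b)
```

The trade‑off is capacity: if one server fails, only half the pool remains available, so each half should be large enough to serve the whole site on its own.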
Base System Layer – File & Print
Windows file servers provide a unified access path (\\vancloa.cn\file\), departmental quotas, hot‑backup, rollback, role‑based permissions, and centralized printing.
Base System Layer – Email
Vancl operates multiple mail domains (vancl.cn, vancl.com, vjia.com, rufengda.com), using Exchange Server 2010 for internal mail and Postfix for outbound bulk mail and spam filtering.
Mail traffic falls into four categories:
General office mail.
High‑volume business mail (e.g., [email protected]) processed by backend systems.
Internal alert mail.
Mass marketing and order‑status mail to customers.
Exchange offers high availability, scalability, and mobile synchronization; it is deployed on virtual machines.
Base System Layer – Instant Messaging
Vancl integrates traditional telephony, IP telephony, PC, and mobile communication via a unified platform combining SipX, Microsoft Lync, and PSTN, providing a seamless enterprise communication system.
Base System Layer – Account & Permission Management
Windows Active Directory serves as the central account repository; all servers (Windows and Linux) join the domain.
Account management combines automation and self‑service; permissions are enforced via AD security groups and custom web services (PMS permission system).
Base System Layer – AD Design
Organizational Units (OUs) are flattened and categorized by object type (accounts, distribution groups, security groups, computers) rather than traditional hierarchy, simplifying automation and improving performance.
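One reason a flat layout helps automation is that an object's location depends only on its type, never on a department tree that changes with every reorganization. The sketch below shows this; the OU names, domain components, and object names are placeholders, not Vancl's actual directory layout, and escaping of special characters in names is omitted.

```python
# Hypothetical flat OU layout: one OU per object type.
BASE_DN = "DC=example,DC=internal"
OU_BY_TYPE = {
    "account": "Accounts",
    "distribution_group": "DistributionGroups",
    "security_group": "SecurityGroups",
    "computer": "Computers",
}


def object_dn(common_name: str, object_type: str) -> str:
    """Build a distinguished name; placement depends only on object type."""
    ou = OU_BY_TYPE[object_type]
    return f"CN={common_name},OU={ou},{BASE_DN}"


print(object_dn("zhang.san", "account"))
print(object_dn("WH01-PC-017", "computer"))
```

A provisioning script built this way never has to look up or create intermediate containers before creating an object.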
Base System Layer – IT Operations Automation
Standardization is achieved through Microsoft System Center Configuration Manager (SCCM), which handles OS imaging, software distribution, and remote assistance, enabling rapid, consistent deployment across ~40 sites and thousands of PCs.
Virtualization
All core services (AD, Lync, SipX, Exchange, SCCM, file servers) run on virtual machines; redundancy is ensured by placing redundant services on separate hosts and racks.
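The placement rule (redundant copies of a service on separate hosts and racks) lends itself to an automated check. The following is a minimal sketch over a hypothetical inventory; the VM, host, and rack names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class VM(NamedTuple):
    name: str
    service: str
    host: str
    rack: str


def placement_violations(vms: List[VM]) -> Dict[str, List[str]]:
    """Return, per service, the problems with how its replicas are placed."""
    by_service = defaultdict(list)
    for vm in vms:
        by_service[vm.service].append(vm)

    problems: Dict[str, List[str]] = {}
    for service, replicas in by_service.items():
        issues = []
        if len(replicas) < 2:
            issues.append("no redundant replica")
        if len({vm.host for vm in replicas}) < len(replicas):
            issues.append("replicas share a host")
        if len({vm.rack for vm in replicas}) < len(replicas):
            issues.append("replicas share a rack")
        if issues:
            problems[service] = issues
    return problems


inventory = [
    VM("dc01", "AD", "host-a", "rack-1"),
    VM("dc02", "AD", "host-b", "rack-2"),
    VM("ex01", "Exchange", "host-a", "rack-1"),
    VM("ex02", "Exchange", "host-c", "rack-1"),
]
print(placement_violations(inventory))
```

In this sample the AD replicas pass, while the two Exchange VMs sit on different hosts but in the same rack, so a rack‑level failure would take both down.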
Network Layer
Key design principles:
No single point of failure for critical nodes and links; dual links (fiber and VPN) connect sites, with multiple paths between IDC cores.
Dynamic routing using OSPF and BGP for fast convergence and intelligent path selection.
Site‑to‑site VPNs use IPsec tunnels as backup to dedicated lines; client VPNs use Microsoft solutions that tunnel over HTTPS, which passes through most firewalls and is easy for users.
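To make the interplay of redundant links and dynamic routing concrete, here is a small Python sketch of cost‑based path selection using Dijkstra's shortest‑path algorithm, which is what OSPF runs over its link‑state database. The three‑node topology and the link costs are assumptions for illustration: fiber is given a low cost and the VPN path a high one, so the VPN carries traffic only when the fiber link is gone.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Adjacency map: node -> {neighbor: link cost}. Lower cost is preferred.
Graph = Dict[str, Dict[str, int]]


def best_path(graph: Graph, src: str, dst: str) -> Optional[Tuple[int, List[str]]]:
    """Dijkstra's shortest path: returns (total cost, list of hops) or None."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


def without_link(graph: Graph, a: str, b: str) -> Graph:
    """Copy of the topology with the a<->b link removed (simulated failure)."""
    return {
        node: {n: c for n, c in links.items() if {node, n} != {a, b}}
        for node, links in graph.items()
    }


# Hypothetical topology: a warehouse reaches the IDC over fiber (cost 10)
# or over an IPsec VPN across the internet (total cost 100).
topology: Graph = {
    "warehouse": {"idc": 10, "internet": 50},
    "internet": {"warehouse": 50, "idc": 50},
    "idc": {"warehouse": 10, "internet": 50},
}

print(best_path(topology, "warehouse", "idc"))
print(best_path(without_link(topology, "warehouse", "idc"), "warehouse", "idc"))
```

With the fiber link present, the direct path wins; once it is removed, the same computation falls back to the VPN path without any manual reconfiguration.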
QoS
Bandwidth is scarce; dual‑exit sites and high‑availability designs are complemented by QoS policies that prioritize business traffic.
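A strict‑priority scheduler is the simplest way to picture what "prioritizing business traffic" means. The sketch below is a toy model with hypothetical traffic classes and packet names; real devices typically combine priority queues with weighted schemes so that low‑priority traffic is not starved entirely.

```python
import heapq
from itertools import count

# Hypothetical traffic classes; a lower number is served first.
PRIORITY = {"voice": 0, "warehouse-erp": 1, "office-web": 2, "bulk-backup": 3}


class PriorityScheduler:
    """Strict-priority queue: always send the most important waiting packet."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, traffic_class: str, packet: str) -> None:
        heapq.heappush(self._heap, (PRIORITY[traffic_class], next(self._seq), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


link = PriorityScheduler()
link.enqueue("bulk-backup", "backup-chunk-1")
link.enqueue("office-web", "http-get")
link.enqueue("voice", "rtp-frame-1")
link.enqueue("warehouse-erp", "pick-order-update")
link.enqueue("voice", "rtp-frame-2")

sent = [link.dequeue() for _ in range(len(link))]
print(sent)
```

Voice frames leave first in arrival order, then warehouse traffic, then office browsing, with the backup chunk sent last even though it arrived first.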
Data‑Center Layer
Considerations include location, temperature, power, cooling, rack capacity, and interconnects; requirements differ between office data‑centers (focus on compute density) and warehouse sites (emphasis on environmental robustness).
"Five‑fold" IT Service Model
The five principles—service‑orientation, standardization, automation, self‑service, and technology‑driven implementation—guide the entire infrastructure.
An ITIL‑based incident management system provides menu‑driven service requests, automatic routing based on user attributes, and SLA‑driven escalation, improving service quality and visibility.
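Attribute‑based routing and SLA‑driven escalation can be expressed as two small lookups. In the sketch below, the queue names, ticket categories, and SLA durations are all hypothetical; in a real deployment they would come from the service catalog.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical routing table and SLA targets, for illustration only.
QUEUE_BY_SITE_TYPE = {
    "warehouse": "warehouse-support",
    "call-center": "callcenter-support",
}
DEFAULT_QUEUE = "office-helpdesk"
SLA_BY_CATEGORY = {
    "network-outage": timedelta(hours=1),
    "account": timedelta(hours=8),
}
DEFAULT_SLA = timedelta(hours=24)


@dataclass
class Ticket:
    requester_site_type: str
    category: str
    opened_at: datetime

    @property
    def queue(self) -> str:
        # Route by an attribute of the requester, not by manual triage.
        return QUEUE_BY_SITE_TYPE.get(self.requester_site_type, DEFAULT_QUEUE)

    def needs_escalation(self, now: datetime) -> bool:
        # Escalate once the ticket has been open longer than its SLA target.
        sla = SLA_BY_CATEGORY.get(self.category, DEFAULT_SLA)
        return now - self.opened_at > sla


opened = datetime(2013, 5, 6, 9, 0)
ticket = Ticket("warehouse", "network-outage", opened)
print(ticket.queue)
print(ticket.needs_escalation(opened + timedelta(minutes=30)))
print(ticket.needs_escalation(opened + timedelta(hours=2)))
```

Keeping the rules in data rather than in code means a new site type or a tighter SLA is a table change, not a software release.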
Self‑service account management (AES) allows employees to request mailboxes, reset passwords, and join/leave groups without IT intervention, freeing staff for higher‑value tasks.
Automation links HR systems to AD and mail systems, enabling batch provisioning and de‑provisioning of accounts.
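At its core, this kind of HR‑driven automation is a set comparison between the HR roster and the accounts that already exist. The sketch below computes the batch plan only; the employee IDs are invented, and the actual calls that would create or disable AD accounts and mailboxes are deliberately left out.

```python
from typing import Dict, List, Set


def plan_sync(hr_active: Set[str], directory: Set[str]) -> Dict[str, List[str]]:
    """Compare the HR roster with existing accounts and plan the batch."""
    return {
        # On the HR roster but no account yet: create account and mailbox.
        "provision": sorted(hr_active - directory),
        # Account exists but the person is no longer on the roster: disable it.
        "deprovision": sorted(directory - hr_active),
    }


# Hypothetical employee IDs, for illustration only.
hr_active = {"e1001", "e1002", "e1004"}
directory = {"e1001", "e1002", "e1003"}

plan = plan_sync(hr_active, directory)
print(plan)
```

Because the plan is derived from the current state of both systems, running it repeatedly is safe: once the two sets match, both lists come back empty.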
Standardization underpins self‑service and automation; technology enforces standards through concrete software systems.
Resource Management and Monitoring
Comprehensive monitoring and resource management provide historical data for capacity planning and performance tuning.
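As one example of turning historical monitoring data into a capacity plan, the sketch below fits a straight line to past usage and estimates how many periods remain before a resource fills up. The monthly figures and the 6 TB capacity are made up, and a linear trend is an assumption; real growth may be seasonal or accelerating.

```python
from typing import Optional, Sequence, Tuple


def linear_fit(ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line y = slope * x + intercept over x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_full(ys: Sequence[float], capacity: float) -> Optional[float]:
    """Periods after the last sample until the trend line reaches capacity."""
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None  # usage is flat or shrinking; no exhaustion forecast
    return max(0.0, (capacity - intercept) / slope - (len(ys) - 1))


# Hypothetical monthly disk usage in TB for one file server.
usage_tb = [2.0, 2.5, 3.0, 3.5, 4.0]
print(periods_until_full(usage_tb, capacity=6.0))
```

Here usage grows by 0.5 TB per month from 4.0 TB, so the trend reaches 6.0 TB four months after the last sample.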
Security and Auditing
Security is a multi‑layer concern: defense in depth, continuous auditing, and integrated safeguards protect against failures at any layer.
Thank you for reading this overview of IT infrastructure for a mid‑to‑large internet enterprise.