Building a Resilient, High‑Performance Website: Domains, CDN, Security & Ops
This guide outlines a comprehensive, step‑by‑step strategy for creating a highly available, secure, and scalable website—from buying and protecting multiple domains, configuring DNS and CDN, setting up image and database servers, to implementing monitoring, redundancy, high‑concurrency testing, and disaster‑recovery plans.
1. Domain
Purchase a pool of 50 to 100 domains, including primary and promotional ones, from GoDaddy for stability, and buy domain protection. Manage DNS on Cloudflare, DNSPod, ZNDNS, or a self‑hosted DNS server so that a domain can resolve to multiple IPs chosen by proximity and DNS changes take effect quickly.
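A quick way to confirm that a domain really does resolve to more than one address is to count the distinct A records a resolver returns. The sketch below assumes `dig` is installed; the function names, the domain, and the resolver address are illustrative and not from the original setup.

```shell
#!/usr/bin/env bash
# Count the distinct IPv4 addresses in `dig +short` style output read on stdin.
# Non-address lines (for example CNAME targets) are ignored.
count_ipv4_answers() {
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' | sort -u | wc -l | tr -d ' '
}

# Report whether DOMAIN resolves to at least two addresses via RESOLVER.
# Usage (hypothetical names): check_multi_ip www.example.com 1.1.1.1
check_multi_ip() {
    local domain="$1" resolver="$2" n
    n=$(dig +short A "$domain" "@$resolver" | count_ipv4_answers)
    if [ "$n" -ge 2 ]; then
        echo "$domain: $n addresses via $resolver"
    else
        echo "$domain: only $n address(es) via $resolver" >&2
        return 1
    fi
}
```

Running the check against resolvers in different regions gives a rough picture of whether proximity-based answers are being served.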
2. CDN
Buy a CDN service (e.g., Cloudflare). Point the domain to the CDN, which caches content worldwide and forwards uncached requests to the core server. The service should provide at least 200 Gbps of DDoS mitigation.
3. Image Server
Deploy image‑caching servers in mainland China; Nginx can serve as an image cache to improve access speed.
4. Data Center
Select a data center close to the majority of users; for high‑bandwidth needs, consider US locations. Test ping latency from across the country with tools such as chinaz. Choose a provider with strong attack defense, good reliability, real‑time status monitoring, and responsive support.
5. Homepage
Use a cloud host for an advertising landing page that links to the game homepage. If the link includes a port number, put the core server behind a CDN or in a data center that does not require ICP filing (备案), so that users reach the site by domain name only.
6. Monitoring System
Implement real‑time monitoring that detects attacks and log spikes, and ship logs to a syslog server. Use Cacti for visualization, monitor bandwidth, analyze where log entries come from, and set up alarms for abnormal conditions.
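Analyzing log sources can start with something as small as counting requests per client IP. The following is a minimal sketch, assuming an access log whose first field is the client address (as in Nginx's default format); the function name, the threshold, and the log path in the usage comment are illustrative.

```shell
#!/usr/bin/env bash
# Print "count ip" for every client IP with more than THRESHOLD requests,
# busiest first. Reads access-log lines on stdin.
top_talkers() {
    local threshold="${1:-100}"
    awk -v t="$threshold" '
        { hits[$1]++ }
        END { for (ip in hits) if (hits[ip] > t + 0) print hits[ip], ip }
    ' | sort -rn
}

# Example (path is an assumption):
#   tail -n 10000 /var/log/nginx/access.log | top_talkers 500
```

An alarm can then be as simple as sending a notification whenever the function prints anything.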
7. Attack Defense
Small attacks can be blocked with Nginx and iptables. Large‑scale attacks that saturate bandwidth require high‑defense data centers (≥200 Gbps of mitigation capacity). Block single‑IP sources at the data center and use the CDN to absorb traffic; complete prevention of massive DDoS is impossible.
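For the small-attack case, iptables can cap concurrent connections per source and drop a known offender outright. The sketch below only prints the rules so they can be reviewed before being applied as root; the function names and the default limit of 50 are assumptions, not values from the original setup.

```shell
#!/usr/bin/env bash
# Print a rule that drops new HTTP connections from any source IP that
# already holds more than MAX concurrent connections (connlimit match).
emit_connlimit_rule() {
    local max="${1:-50}"
    echo "iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above $max -j DROP"
}

# Print a rule that drops all traffic from one offending source address.
emit_block_rule() {
    local ip="$1"
    echo "iptables -I INPUT -s $ip -j DROP"
}

# Example: review the output, then run the printed command as root.
#   emit_connlimit_rule 100
#   emit_block_rule 203.0.113.7
```

Rules like these help only while the uplink still has spare bandwidth; once the link is saturated, filtering has to happen upstream at the data center or CDN.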
8. Redundancy
Design capacity for double the expected concurrent users (e.g., 2,000 for an expected 1,000) to handle traffic spikes during events.
9. Servers
Equip servers with three NICs: one for external user traffic, one for internal server communication, and one for SSH management. Use multiple IPs per NIC, RAID‑1 disks, dual CPUs, dual power supplies, and avoid single points of failure. The shield server can be lower‑spec, but network connectivity to the core server must be excellent.
10. Database
Implement master‑slave replication with off‑site backups. Configure Nginx upstream clustering. Separate front‑end (user‑facing) and back‑end (admin) services on different machines; other services can share a virtual machine. Use Gmail for corporate email.
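Replication is only useful while the slave is actually keeping up, so it is worth checking its status regularly. The sketch below assumes MySQL and the classic `SHOW SLAVE STATUS\G` field names (newer MySQL releases rename these to `Replica_*`); the function name and the 60-second lag limit are illustrative.

```shell
#!/usr/bin/env bash
# Read `SHOW SLAVE STATUS\G` output on stdin and succeed only if both
# replication threads are running and the lag is at most MAX_LAG seconds.
replication_healthy() {
    awk -v max="${1:-60}" '
        $1 == "Slave_IO_Running:"      { io  = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        END {
            ok = (io == "Yes" && sql == "Yes" && lag != "" && lag != "NULL" && lag + 0 <= max + 0)
            exit !ok
        }
    '
}

# Example (credentials and host are left to the local setup):
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_healthy 60 || echo "replication problem"
```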
11. Test Environment
Maintain three test environments alongside production: developer machines, a LAN‑based test setup, and an Internet‑based test setup. The LAN test environment should run on stable, rack‑mounted equipment. Use SVN or Git for code management.
12. Shield and Core Server
Ensure ping connectivity between the shield server and the core server to verify network reachability.
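A reachability check of this kind is easy to script and run from cron on the shield server. This is a minimal sketch; the function name and the host name in the usage comment are hypothetical. ICMP success shows only that the network path is up, not that the service port is answering.

```shell
#!/usr/bin/env bash
# Ping HOST a few times and report the result; returns non-zero on failure
# so the caller can raise an alarm.
check_reachable() {
    local host="$1" count="${2:-3}"
    if ping -c "$count" "$host" >/dev/null 2>&1; then
        echo "OK $host"
    else
        echo "UNREACHABLE $host" >&2
        return 1
    fi
}

# Example (hypothetical host name):
#   check_reachable core.internal || echo "alert: core server unreachable"
```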
13. Operations Staff
At least two ops personnel are needed; one manager plus one engineer is sufficient. Document all procedures and maintain 24‑hour on‑call coverage. A single network admin can handle networking.
14. Data Center (Large Organizations)
Large enterprises should own a dedicated core data center rather than renting, with specialized teams for DB, network, security, and storage engineering.
15. Linux System Optimization
Optimize Linux, Nginx, and applications based on CPU and memory limits.
16. Security
Rotate all passwords every three months, especially domain and email accounts, as they are critical and vulnerable.
17. LAN
Build a stable LAN with at least two 10 Mbps lines, plus a mobile Wi‑Fi hotspot for staff devices.
18. Ops Tools
Standardize tools: SQLyog for databases, SecureCRT for SSH, KeePass for password management, WinSCP for file transfer, and so on. Encourage continuous learning, especially reading English technical documentation.
19. Disaster Recovery Plan
Develop a backup‑and‑restore plan, conduct regular drills, and verify backup usability to avoid catastrophic failures.
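One small piece of verifying backup usability is checking that a backup file has not been corrupted since it was written. The sketch below assumes GNU `sha256sum`; the function names and file name are illustrative. A matching checksum proves integrity only, so a real drill should still restore the backup into a test environment.

```shell
#!/usr/bin/env bash
# Record a checksum file next to the backup (run right after the backup is written).
seal_backup() {
    ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# Verify the backup against its recorded checksum; non-zero exit means the
# file is missing or its contents have changed.
verify_backup() {
    ( cd "$(dirname "$1")" && sha256sum --status -c "$(basename "$1").sha256" )
}

# Example (hypothetical path):
#   seal_backup /backup/db-2024-01-01.sql.gz
#   verify_backup /backup/db-2024-01-01.sql.gz || echo "backup is corrupt"
```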
20. Server Security
Apply comprehensive security configurations covering user, application, system, and file layers to prevent unauthorized access.
21. High‑Concurrency Testing
Simulate 2,000 concurrent users to assess load, choose optimal IPs, data center, and bandwidth. Invest where necessary and know where to save.
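The 2,000 figure follows from doubling an expected 1,000 users, and a load generator such as ApacheBench (`ab`) can drive that many concurrent connections. The sketch below only builds the command line; the function name, the URL, and the ten requests per simulated user are assumptions. A single test machine may need raised file-descriptor limits to sustain this concurrency.

```shell
#!/usr/bin/env bash
# Build an ApacheBench command: -c is concurrent connections, -n is total
# requests, -k enables HTTP keep-alive.
# Arguments: URL EXPECTED_USERS [HEADROOM_FACTOR] [REQUESTS_PER_USER]
build_ab_command() {
    local url="$1" expected_users="$2" factor="${3:-2}" per_user="${4:-10}"
    local concurrency=$(( expected_users * factor ))
    echo "ab -k -n $(( concurrency * per_user )) -c $concurrency $url"
}

# Example (hypothetical test URL; never point this at production unannounced):
#   build_ab_command http://test.example.com/ 1000
```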
22. Sharing
The two ops members must share all information, passwords, and configuration steps with each other; the manager is responsible for building a cohesive, skilled team.
23. Logs
Record every server operation with timestamps; perform risk assessment before any production change.
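A shared habit of logging each operation is easier to keep when it costs one short command. The following is a minimal sketch; the `oplog` name, the log location, and the tab-separated layout are assumptions rather than an established convention.

```shell
#!/usr/bin/env bash
# Where operations are recorded; override by exporting OPLOG_FILE.
OPLOG_FILE="${OPLOG_FILE:-$HOME/ops-changes.log}"

# Append one line per operation: UTC timestamp, host, user, free-text note.
oplog() {
    printf '%s\t%s\t%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(uname -n)" "${USER:-unknown}" "$*" \
        >> "$OPLOG_FILE"
}

# Example:
#   oplog "reloaded nginx after upstream change"
```

Writing the entry before the change, with the risk assessment in the note, keeps the record useful even if the change itself goes wrong.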
24. Ops Philosophy
Focus on availability, monitoring & alerting, capacity planning, process standards, knowledge management, and automation.
25. Ops Work After Launch
Post‑launch tasks include version upgrades, monitoring, status statistics, routine inspections, incident handling, change adjustments, cluster management, performance tuning, DB optimization, scaling, security, and ops‑related development.
26. Connection Count Example
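<code># Total connections to port 80 on $ip (trailing space avoids matching :8080)
netstat -ant | grep -F "$ip:80 " | wc -l
# Established connections only
netstat -ant | grep -F "$ip:80 " | grep ESTABLISHED | wc -l</code>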