
Essential Ops Insights: Tool Choices, Automation, and Best Practices from a Senior Ops Expert

This article compiles expert Q&A on operations, covering tool selection, monitoring, Linux version choices, automation platforms, security, Docker, backup strategies, and career advice, offering practical guidance for modern infrastructure management.

Efficient Ops
We face a constantly changing world where business needs, technical architectures, and open‑source versus commercial tools evolve rapidly; only a scientific operations methodology can keep pace.

Speaker Profile

Xu Feng, senior operations specialist with 10 years of experience at Shanda Games, holds a senior information systems project management certification and has led the design and implementation of an automated operations platform.

Key Q&A Highlights

1. Recommended open‑source monitoring tools?

Zabbix is a solid choice: it supports SNMP, its own agent, and custom templates, and it can also monitor business-level metrics. For security monitoring, consider Tenable Nessus together with IDS/IPS solutions.

2. Preferred Linux distribution for servers?

CentOS 6.x is the primary choice for its stability and familiarity, though Ubuntu may be selected for faster kernel updates.

3. Cache solutions beyond Redis/Codis?

For non‑persistent data like session IDs, Memcached with consistent hashing is recommended.
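As a rough illustration of why consistent hashing suits a Memcached pool, the sketch below places each cache node at many virtual points on a hash ring and sends a key to the next point clockwise. It is a minimal sketch, not the speaker's implementation: the class name `ConsistentHashRing`, the node names, the `replicas` count, and the choice of SHA-256 are all assumptions made for the example. Production Memcached clients usually ship their own ring implementation.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hypothetical hash ring: removing a node remaps only the keys it owned."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points per node (assumed value)
        self._points = []          # sorted hash positions on the ring
        self._owner = {}           # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Any stable hash works; SHA-256 is an arbitrary choice here.
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if point in self._owner:
                continue           # extremely unlikely collision: keep first owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def get_node(self, key):
        if not self._points:
            raise ValueError("ring is empty")
        # First ring position after the key's hash, wrapping around at the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

With three nodes on the ring, `ring.get_node("session:42")` always returns the same node for the same key, and taking one node out of the pool only moves the session keys that lived on it; keys on the surviving nodes stay where they were.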

4. Automation deployment tools besides Jenkins?

Hudson (the predecessor of Jenkins) and Jenkins both offer rich plugin ecosystems; effective use of plugins is key.

5. Which MySQL variant to use?

The official MySQL version is preferred for its support and ecosystem.

6. Log collection and analysis tools?

The ELK stack (Elasticsearch, Logstash, Kibana) is widely used; despite its learning curve, it remains worth exploring.

7. Where to find scripts from the book?

Scripts are being organized and can be cloned from https://github.com/xufengnju/books.git .

8. Ansible vs. Chef/Puppet for configuration management?

Choose the tool you are most comfortable with; each has distinct strengths. The team’s IaaS platform is self‑developed on KVM.

9. Load‑balancing capacity of LVS and HAProxy?

Capacity depends on connection rates, PPS, and latency; typically, the load balancer is not the bottleneck, and LVS DR mode can be used for high‑throughput scenarios.

10. Monitoring service health?

The custom monitoring system polls game servers to track online player counts and visualizes them.

11. CMDB and batch management?

Batch management is done via SSH with a design similar to Ansible; a self‑developed CMDB stores foundational data.
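The agentless, SSH-based approach described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the team's actual tool: the function names `run_ssh` and `run_batch` are hypothetical, the sketch shells out to the system OpenSSH client with key-based login already configured, and the host list would in practice come from the CMDB.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_ssh(host, command, timeout=30):
    """Run one command on one host via the system ssh client (assumed installed)."""
    argv = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, command]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        return 124, "", "timed out"    # 124 mirrors the timeout(1) convention
    except OSError as exc:
        return 127, "", str(exc)       # e.g. ssh binary not found


def run_batch(hosts, command, runner=run_ssh, max_workers=10):
    """Run `command` on every host in parallel.

    Returns {host: (returncode, stdout, stderr)} in the order hosts were given.
    `runner` is injectable so the fan-out logic can be exercised without SSH.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```

A call such as `run_batch(["game-01", "game-02"], "uptime")` (hypothetical host names) returns one result tuple per host, so failed hosts can be filtered by a non-zero return code and retried separately.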

12. Traffic mirroring for testing?

Chapter 15 of the book covers NIC promiscuous mode and raw socket techniques for this purpose; open-source projects such as tcpcopy are also worth exploring.

13. System and network tuning on CentOS 6?

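Size tcp_max_tw_buckets according to how many sockets actually sit in TIME_WAIT. For workloads dominated by short-lived connections, tcp_tw_reuse can also help. tcp_tw_recycle is available on CentOS 6 kernels, but it is known to break clients behind NAT and was removed in Linux 4.12, so enable it with caution.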
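Before changing any of these settings it helps to measure the TIME_WAIT load. The sketch below counts sockets per TCP state from the text of `/proc/net/tcp` and reads a sysctl value from `/proc/sys`. The function names `count_tcp_states` and `read_sysctl` are hypothetical helpers for this example, and the code only covers IPv4 sockets on Linux.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per state from the contents of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:   # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()                      # fourth column is the state
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def read_sysctl(name):
    """Read a sysctl such as 'net.ipv4.tcp_max_tw_buckets' from /proc/sys."""
    with open("/proc/sys/" + name.replace(".", "/")) as handle:
        return handle.read().strip()
```

On a Linux host, comparing `count_tcp_states(open("/proc/net/tcp").read())["TIME_WAIT"]` with `int(read_sysctl("net.ipv4.tcp_max_tw_buckets"))` shows how close the server is to the configured limit.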

14. Upgrading a large fleet of servers?

Stateless microservice architectures enable gray‑scale upgrades via front‑end load balancers; monolithic setups are more challenging.
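One way to picture a gray-scale upgrade behind a load balancer is as a loop over small batches: drain a batch, upgrade it, health-check it, and put only the healthy servers back into rotation. The sketch below is a simplified illustration, not the speaker's process. The function `rolling_upgrade` and its four callbacks (`drain`, `upgrade`, `healthy`, `restore`) are hypothetical hooks that would wrap whatever load balancer and deployment tooling is actually in use.

```python
def rolling_upgrade(servers, batch_size, drain, upgrade, healthy, restore):
    """Upgrade servers batch by batch, stopping at the first unhealthy batch.

    drain(server):   take the server out of load-balancer rotation
    upgrade(server): deploy the new version
    healthy(server): return True if the upgraded server passes its checks
    restore(server): put the server back into rotation
    Returns (upgraded, failed); failed servers are left drained for inspection.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    upgraded, failed = [], []
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]
        for server in batch:
            drain(server)
        for server in batch:
            upgrade(server)
        for server in batch:
            if healthy(server):
                restore(server)
                upgraded.append(server)
            else:
                failed.append(server)
        if failed:
            break    # stop the rollout; later batches keep the old version
    return upgraded, failed
```

Starting with a batch size of one or two servers keeps the blast radius small; because the services are stateless, the remaining servers keep handling traffic on the old version if the rollout stops.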

15. Docker adoption in game operations?

Docker is being evaluated for limited testing; integration with existing operations requires careful network and storage planning.

16. Security measures for operations?

The book covers Linux hardening; for web security, tools like ModSecurity and Tenable Nessus are suggested, along with regular penetration testing.

17. Career advice for operations engineers?

Embrace DevOps, strengthen programming skills (Python, Perl), focus on weak areas, and study high‑quality operations literature.

18. Future of operations?

Automation, predictive capacity planning, and AI‑driven self‑operations are the envisioned direction.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. It focuses on operations transformation and aims to support readers throughout their operations careers.