Backend Development 14 min read

Zhihu’s Migration from Python to Go: Architecture, Process, and Lessons Learned

Zhihu’s backend team migrated high‑traffic services from Python to Go, detailing the motivations, step‑by‑step migration process, performance gains, project structure, testing practices, static analysis, and operational lessons such as graceful traffic shifting and service deprecation.

High Availability Architecture
High Availability Architecture
High Availability Architecture
Zhihu’s Migration from Python to Go: Architecture, Process, and Lessons Learned

Background

Zhihu’s community backend was primarily written in Python. Rapid user growth and increasing business complexity caused CPU and resource pressure, exposing Python’s lower runtime efficiency and higher maintenance cost.

Why Golang

Golang was chosen for its native concurrency, mature internal ecosystem, static typing, single‑binary deployment, and comparable learning curve, offering advantages over Java and Python for high‑concurrency services.

Transformation Results

Core services (member RPC, comments, Q&A) have been fully rewritten in Go, saving over 80% of server resources, reducing development and maintenance costs, and improving internal Go component libraries.

Implementation Process

Because Zhihu’s micro‑service architecture isolates resources, the migration proceeded in five steps: (1) rewrite logic in a new Go service while keeping the external protocol unchanged and sharing the original service’s resources; (2) verify correctness via side‑by‑side request comparison for read APIs and unit/QA testing for write APIs; (3) gray‑scale traffic by proxying requests to the new service; (4) switch traffic entry once 100% traffic is handled by Go; (5) decommission the old Python service.

Go Project Practices

Key practices include a clear project layout (bin, cmd, gen‑go, pkg, thrift_files, vendor), interface‑based layering with impl and mock packages to enable testing, early static analysis using gometalinter (vet, golint, errcheck), feature‑level degradation with circuit , panic‑recover middleware for error handling, careful goroutine management, and ensuring resp.Body.Close() to avoid leaks.

Conclusion

The migration, completed in Q2/Q3 2018 by the community architecture and content technology teams, demonstrates how a large‑scale Python service can be refactored to Go, achieving significant resource savings, higher reliability, and a reusable framework for future projects.

performance optimizationmicroservicestestingGostatic analysisBackend Migration
High Availability Architecture
Written by

High Availability Architecture

Official account for High Availability Architecture.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.