Backend Development · 8 min read

Migrating the Core Course Service from Ruby RPC to Go gRPC: Challenges, Decisions, and Results

This article recounts the migration of the Core Course (CC) service from a custom Ruby RPC framework to a Go‑based gRPC solution, detailing technical choices, performance bottlenecks, testing strategies, monitoring setup, and the substantial improvements achieved after launch.

Liulishuo Tech Team

The Core Course (CC) project, launched in early 2016, initially relied on a Ruby RPC service, CourseCenter RPC, built on a home-grown framework (Scorix) with the pure-Ruby ruby-protobuf library. While it met early development needs, the service offered poor multi-language client support and suffered significant performance problems and memory leaks traceable to the pure-Ruby protobuf implementation.

To address these problems, the team rewrote the service as "hexley" using Go and gRPC (then at its stable 1.0 release). Go was chosen for its performance, ease of deployment, and the team's enthusiasm for the language. A typical client call to the service looked like this:

```ruby
client.lesson.find_by(lesson_id: "laix")
```

Key challenges and solutions during the rewrite included:

Dependency Management: After evaluating options, Bazel was adopted for building Go code.
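The article doesn't show the build files themselves; with the open-source rules_go rules, a minimal BUILD.bazel for one Go package might look like the sketch below (target names and the import path are hypothetical):

```python
load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")

go_library(
    name = "lesson",
    srcs = ["lesson.go"],
    importpath = "github.com/liulishuo/hexley/lesson",
    visibility = ["//visibility:public"],
)

go_test(
    name = "lesson_test",
    srcs = ["lesson_test.go"],
    embed = [":lesson"],
)
```

Bazel's explicit dependency graph is what makes it attractive for monorepo-style Go builds: each package declares exactly what it needs, and incremental builds stay fast.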

Project Structure: Packages were organized by business domain rather than generic controller/util layers, avoiding ambiguous naming.

Testing: The Go testing package was supplemented with golang/mock for mock generation and stretchr/testify for suite-based setup/cleanup, while simple projects could rely on TestMain.
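golang/mock generates fakes from interfaces; the underlying pattern is interface-based substitution, sketched below with a hand-written fake (LessonStore, fakeStore, and Describe are illustrative names, not from the article):

```go
package main

import "fmt"

// LessonStore is the dependency the service code talks to.
// golang/mock would generate a mock from an interface like this.
type LessonStore interface {
	FindLesson(id string) (string, error)
}

// fakeStore is a hand-written stand-in used in tests.
type fakeStore struct {
	lessons map[string]string
}

func (f *fakeStore) FindLesson(id string) (string, error) {
	title, ok := f.lessons[id]
	if !ok {
		return "", fmt.Errorf("lesson %q not found", id)
	}
	return title, nil
}

// Describe is the unit under test: it depends only on the interface,
// so tests can inject the fake instead of a real database.
func Describe(s LessonStore, id string) string {
	title, err := s.FindLesson(id)
	if err != nil {
		return "unknown lesson"
	}
	return "Lesson: " + title
}

func main() {
	fake := &fakeStore{lessons: map[string]string{"laix": "Intro"}}
	fmt.Println(Describe(fake, "laix")) // Lesson: Intro
	fmt.Println(Describe(fake, "nope")) // unknown lesson
}
```

Because the unit under test only sees the interface, swapping the fake for a generated gomock or a real implementation requires no changes to the tested code.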

ORM Choice: The team initially used go-xorm/xorm, forking it to add the context support required by OpenCensus tracing.

Gray Release Strategy: Implemented user‑specific and percentage‑based traffic routing to safely migrate traffic to the new service.
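The article doesn't include the routing code; a common way to implement this kind of rollout is to allowlist specific users and hash everyone else into a stable bucket, so each user consistently lands on the same backend as the percentage grows. A sketch under that assumption:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// routeToNewService decides whether a request goes to the new Go service.
// Allowlisted users always do; everyone else is hashed into one of 100
// stable buckets and routed when their bucket falls under the rollout
// percentage.
func routeToNewService(userID string, allowlist map[string]bool, percent uint32) bool {
	if allowlist[userID] {
		return true
	}
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32()%100 < percent
}

func main() {
	allow := map[string]bool{"qa-user": true}
	fmt.Println(routeToNewService("qa-user", allow, 0))   // true: allowlisted
	fmt.Println(routeToNewService("someone", allow, 100)) // true: full rollout
	fmt.Println(routeToNewService("someone", allow, 0))   // false: rollout off
}
```

Hashing (rather than random sampling per request) matters here: a user who hits the new service once keeps hitting it, which keeps behavior consistent and makes problems attributable.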

Monitoring: Built gRPC interceptors with context to collect metrics, using prometheus for metrics, Sentry for error reporting, and OpenCensus for full‑stack tracing.

Data Access Optimization: Moved static course content from the database to S3, loading it into memory and watching for changes with a background goroutine, reducing unnecessary I/O.

After launch, the service handles more than 17K QPS during peak hours, over three times the previous traffic, while using fewer machines. Mean latency stays under 5 ms, the 95th percentile stays under 25 ms, and the earlier protobuf memory-leak issue has disappeared.

The migration also deepened the team’s appreciation for Go. Initial frustrations with repetitive error handling gave way to using interfaces for clean abstraction, as illustrated below:

```go
// Lesson selection interface
type LessonSelector interface {
	// define selection and common behavior
}

// Common logic implementation
type Common struct{}

// Specific course A only implements its unique logic
type A struct {
	*Common
}
```

Overall, the rewrite demonstrates how thoughtful language and tooling choices, combined with robust testing, monitoring, and deployment strategies, can dramatically improve backend service performance and reliability.

Tags: performance, gRPC, Ruby