
How Switching to StringBuilder Made Template Rendering 10× Faster

An in-depth case study of Alipay’s card-coupon template engine shows that replacing costly String.replace calls with a custom StringBuilder implementation boosts rendering performance more than tenfold, with caching contributing only modest gains, while also reducing memory overhead.

macrozheng

1. Background

1.1 Business Background

Alipay’s card pack stores users’ membership cards and coupons. Both the card cell and the card detail are rendered by combining a static template with dynamic data, which is then presented to the end user.

Figure 1 shows how coupon data is displayed on the client side, and Figure 2 illustrates the data assembly process.

Figure 1: Coupon data display on the client side

Figure 2: Client‑side data assembly process

1.2 Problem Discovery

During a recent project we revisited the coupon assembly and rendering logic. The legacy code (Figure 3) has been in use for about ten years and simply replaces variables delimited by $ symbols with dynamic values. Because this logic is core and high‑frequency, we investigated potential performance improvements.

Figure 3: Original template variable replacement code

The initial analysis identified two inefficiencies:

Each loop iteration performs two indexOf operations.

Each loop iteration creates a new substring.

We asked:

Can we reduce the number of indexOf and substring calls?

Is it necessary to search the template for variables on every request?
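For orientation, a loop with the two inefficiencies listed above might look like the following. This is a hypothetical reconstruction under stated assumptions, not the code from Figure 3: the class name `LegacyTemplate`, the `Map` of values, and the choice to substitute an empty string for unknown variables are all illustrative.

```java
import java.util.Map;

final class LegacyTemplate {
    // Hypothetical reconstruction of a "$name$" replacement loop; not the original code.
    static String render(String template, Map<String, String> values) {
        String result = template;
        int from = 0;
        while (true) {
            int start = result.indexOf('$', from);           // first indexOf per iteration
            if (start < 0) break;
            int end = result.indexOf('$', start + 1);        // second indexOf per iteration
            if (end < 0) break;
            String name = result.substring(start + 1, end);  // new substring per iteration
            String value = values.getOrDefault(name, "");    // assumption: unknown -> empty
            // String.replace rescans the whole string and allocates a new one.
            result = result.replace("$" + name + "$", value);
            from = start + value.length();                   // continue after the inserted value
        }
        return result;
    }
}
```

Every iteration here rescans and reallocates the full string, which is why the cost grows with both template length and variable count.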

2. Performance Optimizations

We iteratively refined the implementation across five versions, ultimately achieving more than a ten‑fold speed increase.

2.1 Optimization V1

V1 removes indexOf and substring entirely and introduces a two-pointer scan that extracts all variables first, then replaces them in a second pass.

Figure 4: V1 implementation
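A minimal sketch of the two-pass idea, assuming variables are written as `$name$` and that unknown variables render as an empty string. The class and method names are hypothetical and the code is not taken from Figure 4.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class TemplateV1 {
    // Pass 1: two-pointer scan over the characters; no indexOf or substring calls.
    static Set<String> extractVariables(String template) {
        Set<String> names = new HashSet<>();
        char[] chars = template.toCharArray();
        int open = -1;                        // index of the pending opening '$', or -1
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] != '$') continue;
            if (open < 0) {
                open = i;                     // left pointer: a variable starts here
            } else {
                // right pointer: the variable ends here; copy the name between the two
                names.add(new String(chars, open + 1, i - open - 1));
                open = -1;
            }
        }
        return names;
    }

    // Pass 2: one String.replace per distinct variable (the part V1 leaves unchanged).
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (String name : extractVariables(template)) {
            result = result.replace("$" + name + "$", values.getOrDefault(name, ""));
        }
        return result;
    }
}
```

Because String.replace substitutes every occurrence at once, an unordered set of distinct names is sufficient at this stage.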

2.2 Optimization V2

Since static templates rarely change, we cached the mapping between a template ID and its variable list, eliminating the need to extract variables on each request. The cache is implemented with Google Guava.

Figure 5: Cache implementation example

Figure 6: V2 implementation
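The caching idea can be sketched as follows. The article's cache uses Google Guava; to keep this example free of dependencies it substitutes the JDK's `ConcurrentHashMap.computeIfAbsent`, which is an assumption of this sketch and lacks the size limits and expiry that Guava's `CacheBuilder` offers. The class name, the `extractions` counter, and the helper method are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class VariableCache {
    // Template ID -> variable names found in that template.
    private final ConcurrentMap<String, Set<String>> byTemplateId = new ConcurrentHashMap<>();
    // Counts how often extraction actually runs (for illustration only).
    final AtomicInteger extractions = new AtomicInteger();

    Set<String> variablesFor(String templateId, String template) {
        // The extraction runs only on the first request for a given ID.
        return byTemplateId.computeIfAbsent(templateId, id -> {
            extractions.incrementAndGet();
            return extract(template);
        });
    }

    // Pairs up '$' delimiters and collects the names between them.
    private static Set<String> extract(String template) {
        Set<String> names = new HashSet<>();
        int open = -1;
        for (int i = 0; i < template.length(); i++) {
            if (template.charAt(i) != '$') continue;
            if (open < 0) {
                open = i;
            } else {
                names.add(template.substring(open + 1, i));
                open = -1;
            }
        }
        return names;
    }
}
```

Keying by template ID assumes a template's text never changes under the same ID; if it can, the entry would need to be invalidated or the key extended with a version.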

2.3 Performance Comparison (1)

Benchmarking V1 and V2 (Figure 7) shows that both improve over the original, but the gains diminish as traffic grows, indicating other bottlenecks remain.

Figure 7: V1 vs V2 performance

Further analysis revealed that String.replace itself is costly: in older JDKs such as JDK 8, each call compiles a literal regex Pattern, and every call allocates a new String object.

Figure 8: String.replace implementation
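For readers without the figure, the JDK 8 version of `String.replace(CharSequence, CharSequence)` amounts to the regex pipeline sketched here. Newer JDKs replaced it with a non-regex implementation, so the compilation cost applies to older runtimes. The wrapper class `ReplaceLikeJdk8` is a hypothetical name used only to make the sketch callable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceLikeJdk8 {
    // Behaves like source.replace(target, replacement), spelled out step by step.
    static String replace(String source, CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL)   // compiled on every call
                .matcher(source)                                      // new Matcher per call
                .replaceAll(Matcher.quoteReplacement(replacement.toString())); // new String
    }
}
```

Calling this once per variable per request means one Pattern, one Matcher, and one intermediate String for each substitution.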

2.4 Optimization V3

V3 replaces String.replace with a manual StringBuilder approach, avoiding repeated pattern compilation and intermediate string creation.

Figure 9: V3 implementation

Note: The variable extraction in V2 returned a Set, which discards ordering and collapses duplicate variables. V3 switches to an ordered List to preserve correctness.
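One way to see why the ordered List matters: if the renderer consumes cached names positionally while appending to a StringBuilder, the list must hold one entry per occurrence, in template order. The sketch below illustrates that idea under assumptions. The class name, the `ConcurrentHashMap` stand-in for the Guava cache, and the empty-string default are all hypothetical, and this is not the code from Figure 9.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TemplateV3 {
    // Stand-in cache (assumption): template ID -> ordered variable names.
    private static final Map<String, List<String>> CACHE = new ConcurrentHashMap<>();

    // Names in order of appearance, duplicates kept: "$a$-$b$-$a$" -> [a, b, a].
    static List<String> extractOrdered(String template) {
        List<String> names = new ArrayList<>();
        int open = -1;
        for (int i = 0; i < template.length(); i++) {
            if (template.charAt(i) != '$') continue;
            if (open < 0) {
                open = i;
            } else {
                names.add(template.substring(open + 1, i));
                open = -1;
            }
        }
        return names;
    }

    static String render(String templateId, String template, Map<String, String> values) {
        List<String> names = CACHE.computeIfAbsent(templateId, id -> extractOrdered(template));
        StringBuilder out = new StringBuilder(template.length() + 16);
        int next = 0;                          // next cached name to consume
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c == '$' && next < names.size()) {
                String name = names.get(next++);
                out.append(values.getOrDefault(name, ""));
                i += name.length() + 2;        // skip over "$name$"
            } else {
                out.append(c);                 // literal text, or a trailing unpaired '$'
                i++;
            }
        }
        return out.toString();
    }
}
```

With a Set, the second `$a$` in `$a$-$b$-$a$` would have no entry to consume, and iteration order would not be guaranteed to match the template.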

2.5 Optimization V4

Building on V3, V4 removes the cache entirely and keeps only the StringBuilder replacement, achieving the largest speedup (Figure 10).

Figure 10: V4 implementation
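A cache-free, single-pass version could look roughly like this. It is a sketch assuming `$name$` delimiters and an empty-string default for unknown variables, with a hypothetical class name, rather than the code from Figure 10.

```java
import java.util.Map;

final class TemplateV4 {
    static String render(String template, Map<String, String> values) {
        StringBuilder out = new StringBuilder(template.length() + 16);
        int literalStart = 0;   // start of the literal text not yet copied
        int open = -1;          // index of the pending opening '$', or -1
        for (int i = 0; i < template.length(); i++) {
            if (template.charAt(i) != '$') continue;
            if (open < 0) {
                open = i;
            } else {
                out.append(template, literalStart, open);       // literal before "$name$"
                String name = template.substring(open + 1, i);  // key for the map lookup
                out.append(values.getOrDefault(name, ""));
                literalStart = i + 1;
                open = -1;
            }
        }
        // Copy the tail, including any unpaired '$'.
        out.append(template, literalStart, template.length());
        return out.toString();
    }
}
```

The template is scanned once and the only full-size allocation is the final `toString()`, which is consistent with the article's finding that the cache adds little once String.replace is gone.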

2.6 Performance Comparison (2)

Figure 11: V1‑V4 performance comparison

Using StringBuilder yields more than a ten‑fold performance increase.

V4 is slightly faster than V3 with cache, but V3’s code is more readable. Combining the readability of V3 with the speed of V4 suggests a hybrid V5.

2.7 Optimization V5

V5 extracts variables, drops the cache, and uses StringBuilder for replacement, achieving the best balance of performance and maintainability.

Figure 12: V5 implementation and 1‑million‑iteration benchmark
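The separation V5 describes, an extraction step followed by a StringBuilder assembly step with no cache, might be sketched as below, together with a naive 1-million-iteration timing loop. All names are hypothetical and the code is not from Figure 12. A plain `System.nanoTime` loop is only indicative, and a harness such as JMH would be needed for trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TemplateV5 {
    // Step 1: record the [open, close] indices of every "$name$" token, in order.
    static List<int[]> locateVariables(String template) {
        List<int[]> spans = new ArrayList<>();
        int open = -1;
        for (int i = 0; i < template.length(); i++) {
            if (template.charAt(i) != '$') continue;
            if (open < 0) {
                open = i;
            } else {
                spans.add(new int[] {open, i});
                open = -1;
            }
        }
        return spans;
    }

    // Step 2: stitch literals and values together with a single StringBuilder.
    static String render(String template, Map<String, String> values) {
        StringBuilder out = new StringBuilder(template.length() + 16);
        int literalStart = 0;
        for (int[] span : locateVariables(template)) {
            out.append(template, literalStart, span[0]);
            String name = template.substring(span[0] + 1, span[1]);
            out.append(values.getOrDefault(name, ""));
            literalStart = span[1] + 1;
        }
        out.append(template, literalStart, template.length());
        return out.toString();
    }

    // Naive timing loop; results vary with JIT warm-up and hardware.
    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "Bob");
        values.put("n", "10");
        String template = "Hi $name$, $n$ off";
        String last = "";
        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            last = render(template, values);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(last + " rendered 1,000,000 times in " + elapsedMs + " ms");
    }
}
```

Splitting the work into two named steps costs an extra list allocation per request, which fits the article's ranking of V5 slightly behind V3 and V4 while being easier to read.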

3. Summary

Across five optimization iterations, overall performance improved by more than ten times.

The ranking from fastest to slowest is V4 > V3 > V5 > V2 > V1 > original. The critical bottleneck was String.replace; caching provided modest gains, while StringBuilder delivered the major boost.

Key takeaways:

String.replace incurs pattern compilation and creates new string objects, consuming CPU and memory.

Replacing it with StringBuilder reduces intermediate object creation, lowers GC pressure, and cuts CPU load.

Such performance gains translate into substantial resource savings at scale. For example, a 10% speedup in a 2000-server deployment could free the equivalent of roughly 200 servers, improving stability and reducing scaling costs.

In short, the simple switch from String.replace to StringBuilder is what made a 20-line piece of code ten times faster.
