Optimizing PHP strtolower with SSE2 in PHP 8
This article explains how PHP's case‑insensitive function handling can be accelerated by implementing a locale‑independent strtolower using SSE2 SIMD instructions in PHP 8, compares it with previous table‑lookup methods, and discusses a further Yaf‑specific optimization.
In PHP, class, function, and method names are case‑insensitive, so the engine normalizes them to lowercase, often using strtolower . While PHP already minimizes lowercase conversions, dynamic names still require it, making a faster implementation valuable.
The author previously demonstrated SIMD‑based character replacement using SSE2 and now presents a high‑performance, locale‑independent strtolower implementation for PHP 8, with a similar approach applicable to strtoupper .
Because ASCII uppercase letters (A‑Z) differ from their lowercase counterparts (a‑z) by 32, the conversion can be performed by detecting uppercase characters and adding 32. Traditional implementations processed characters one by one or used a 256‑entry lookup table.
The new SSE2 version processes 16 bytes at a time, using intrinsics to load data, compare ranges, mask uppercase characters, add the delta, and store results. The core code is:
const __m128i _A = _mm_set1_epi8('A' - 1);
const __m128i Z_ = _mm_set1_epi8('Z' + 1);
const __m128i delta = _mm_set1_epi8('a' - 'A');
do {
__m128i op = _mm_loadu_si128((__m128i*)p);
__m128i gt = _mm_cmpgt_epi8(op, _A);
__m128i lt = _mm_cmplt_epi8(op, Z_);
__m128i mingle = _mm_and_si128(gt, lt);
__m128i add = _mm_and_si128(mingle, delta);
__m128i lower = _mm_add_epi8(op, add);
_mm_storeu_si128((__m128i*)q, lower);
p += 16;
q += 16;
} while (p + 16 <= end);This loop loads 16 characters, identifies uppercase letters with two comparisons, adds the 32‑offset where needed, and writes back the transformed block, yielding a noticeable speedup over per‑character processing.
The optimization is applied only when the locale is the default "C"; custom locales disable it.
The author also experimented with a Yaf‑specific variant that reduces the two comparisons to a single one by adjusting the guard value:
const __m128i upper_guard = _mm_set1_epi8('A' + 128);
__m128i in = _mm_loadu_si128((__m128i*)str);
__m128i rot = _mm_sub_epi8(in, upper_guard);
__m128i upper = _mm_cmpgt_epi8(rot, _mm_set1_epi8(-128 + 'Z' - 'A'));This method simplifies the detection logic but was not merged into PHP 8 due to readability concerns and modest performance gains.
Overall, the SIMD‑based approach demonstrates how leveraging SSE2 can significantly accelerate string case conversion in PHP, especially for large data sets.
Beike Product & Technology
As Beike's official product and technology account, we are committed to building a platform for sharing Beike's product and technology insights, targeting internet/O2O developers and product professionals. We share high-quality original articles, tech salon events, and recruitment information weekly. Welcome to follow us.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.