Master StarRocks Lateral Join & Unnest for Efficient Row‑to‑Column Transformations
StarRocks 1.18 introduces Lateral Join + Unnest for elegant row‑to‑multiple‑row conversion, Fast Decimal with up to 38‑digit precision and 4× faster arithmetic, extensive Bitmap function enhancements, and a suite of import, storage, operator, and ecosystem optimizations that dramatically boost query performance and scalability.
Lateral Join & Unnest
Converting a single row into multiple rows is a common ETL task, and using intermediate tables can be cumbersome. StarRocks 1.18 adds native support for LATERAL JOIN combined with the UNNEST table function, allowing the function to reference columns from the left table directly.
Typical uses:
Expand an array into multiple rows.
Combined with split, turn a delimited string (for example, a comma‑separated list) into rows.
Combined with bitmap_to_array, convert a Bitmap into multiple ID rows, which makes both the conversion and downstream analysis more efficient.
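The second and third uses can be sketched as follows. The table and column names (user_tags, tag_list, user_groups, members) are hypothetical and do not come from the release; tag_list is assumed to be a comma‑separated VARCHAR column and members a BITMAP column. Both queries use the simplified syntax and the default output column name unnest.

```sql
-- Hypothetical table user_tags(user_id INT, tag_list VARCHAR), e.g. (1, 'male,student').
-- split() turns the string into an array; UNNEST emits one row per element.
SELECT user_id, unnest AS tag
FROM user_tags, UNNEST(split(tag_list, ','));

-- Hypothetical table user_groups(group_id INT, members BITMAP).
-- bitmap_to_array() turns the Bitmap into an array of IDs; UNNEST emits one row per ID.
SELECT group_id, unnest AS member_id
FROM user_groups, UNNEST(bitmap_to_array(members));
```

In both cases the table function reads a column of the left table directly, so no intermediate table is needed.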
Example table user:
SELECT * FROM user;
+---------+-----------------------+
| user_id | label |
+---------+-----------------------+
| 1 | ['male','student'] |
| 2 | ['male','employee'] |
| 3 | ['female','employee'] |
| 4 | ['male','student'] |
+---------+-----------------------+
Using LATERAL JOIN with UNNEST to split the label array:
-- General syntax
SELECT unnest, COUNT(unnest)
FROM user
CROSS JOIN LATERAL UNNEST(label)
GROUP BY unnest;
-- Simplified syntax
SELECT unnest, COUNT(unnest)
FROM user, UNNEST(label)
GROUP BY unnest;
Both queries produce identical results:
+----------+-----------------+
| unnest | COUNT('unnest') |
+----------+-----------------+
| female | 1 |
| male | 3 |
| employee | 2 |
| student | 2 |
+----------+-----------------+
Intermediate result of the lateral join:
+---------+----------+
| user_id | unnest |
+---------+----------+
| 1 | male |
| 1 | student |
| 2 | male |
| 2 | employee |
| 3 | female |
| 3 | employee |
| 4 | male |
| 4 | student |
+---------+----------+
Fast Decimal
Decimal types store exact fixed‑point numbers, which makes them essential for financial calculations. In version 1.18, StarRocks upgrades to Fast Decimal, which offers up to 38 digits of precision and significant performance gains:
Supports wider Decimal types with up to 38 significant digits.
Uses 64‑bit integers for Decimal(M≤18, D) instead of the previous uniform 128‑bit representation, reducing instruction count for arithmetic and conversion.
Algorithmic optimizations, especially for multiplication, make Decimal arithmetic roughly 4× faster.
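A minimal sketch of how the two Decimal widths might appear in a table definition. The table name payments_demo, its columns, and the bucket and replica settings are assumptions for illustration only; the width comments restate the points above rather than documented internals.

```sql
-- Hypothetical table; names and bucket/replica settings are assumptions.
CREATE TABLE payments_demo (
    payment_id BIGINT,
    amount     DECIMAL(18, 2),   -- M <= 18: 64-bit representation, per the notes above
    fx_rate    DECIMAL(38, 10)   -- wider type, up to 38 significant digits
)
DUPLICATE KEY(payment_id)
DISTRIBUTED BY HASH(payment_id) BUCKETS 8
PROPERTIES ("replication_num" = "1");

INSERT INTO payments_demo VALUES
    (1, 1999.99, 7.1234567890),
    (2,   10.01, 0.9876543210);

-- Exact decimal arithmetic; multiplication is the case called out as most improved.
SELECT payment_id, amount * fx_rate AS converted
FROM payments_demo;
```

Keeping a column at 18 digits or fewer when the data allows it should let it use the narrower representation described above.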
Bitmap Optimization
The Bitmap type excels at high‑cardinality analysis and exact deduplication. StarRocks 1.18 adds new Bitmap functions and improves performance:
bitmap_andnot(BITMAP lhs, BITMAP rhs): returns the set difference (elements in lhs that are not in rhs).
bitmap_xor(BITMAP lhs, BITMAP rhs): returns the symmetric difference (elements that appear in exactly one of the two sides).
bitmap_remove(BITMAP lhs, BIGINT input): removes input from lhs and returns the result.
Optimizations to the storage format and SIMD‑based processing yield a 2‑10× speedup for Bitmap operations.
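The three new functions can be tried on literal Bitmaps, as in this sketch. It assumes bitmap_from_string and bitmap_to_string are available to build and print the Bitmaps; the sets in the comments follow from the function descriptions above, not from a captured run.

```sql
-- bitmap_from_string builds a Bitmap from a comma-separated list of IDs;
-- bitmap_to_string renders the result as text for inspection.
SELECT
    -- elements of {1,2,3} that are not in {3,4}: the set {1,2}
    bitmap_to_string(bitmap_andnot(bitmap_from_string('1,2,3'), bitmap_from_string('3,4'))) AS only_in_lhs,
    -- elements that appear in exactly one side: the set {1,2,4}
    bitmap_to_string(bitmap_xor(bitmap_from_string('1,2,3'), bitmap_from_string('3,4'))) AS symmetric_diff,
    -- {1,2,3} with the value 3 removed: the set {1,2}
    bitmap_to_string(bitmap_remove(bitmap_from_string('1,2,3'), 3)) AS without_3;
```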
Other Optimizations
Additional improvements in version 1.18 cover import, storage, operator, and ecosystem aspects:
Import: Broker load now supports ZSTD‑compressed files; imports are stable at up to 1,000 columns and 10 TB per table.
Storage: The Bitmap storage format is optimized, making bitmap_union and bitmap_union_count about 10× faster; column names can now contain spaces and Chinese characters.
Operator: DELETE syntax is enhanced (it supports IN expressions and deletes that do not specify a partition); TOPN sorting is 50‑150% faster for large limits; HyperLogLog functions are 3‑5× faster.
Ecosystem: Tableau compatibility is raised to 95% (TDVT) and ODBC5 connector compatibility to 98%; Hive external tables now support ViewFs.
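The Operator items can be illustrated with a short sketch. The table events_demo and its columns are hypothetical, and event_id is assumed to be a key column so that it can appear in a DELETE condition.

```sql
-- Hypothetical table events_demo(event_id BIGINT, event_time DATETIME, ...).
-- DELETE with an IN predicate and no PARTITION clause, per the Operator item above.
DELETE FROM events_demo
WHERE event_id IN (1001, 1002, 1003);

-- A TopN query with a large limit, the pattern the sorting improvement targets.
SELECT event_id, event_time
FROM events_demo
ORDER BY event_time DESC
LIMIT 100000;
```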
For full details, refer to the StarRocks 1.18 release notes and documentation.
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.
