
DongSQL V1.2.0: Dual Performance and Stability Gains for Retail Databases

DongSQL V1.2.0 introduces a suite of kernel upgrades: group‑commit unicast, semi‑sync replication tweaks, a plan cache, single‑point query enhancements, massive‑table fast startup, SIMD acceleration, hotspot‑row updates, and high‑volume testing support. Together they deliver performance and stability improvements across retail workloads.

JD Tech

1. Group‑Commit Unicast Notification

Background: In the original group‑commit flow, follower threads wait on a single shared condition variable (COND_done) until the leader has flushed the binlog. The leader then broadcasts on that variable, so under high concurrency every follower wakes at once and contends for the same mutex.

Solution: DongSQL V1.2.0 allocates an independent condition variable (THD::COND_wakeup_ready) per thread and switches from broadcast to per‑thread unicast. After the leader finishes flushing, it iterates the follower queue and wakes each thread individually, dramatically reducing contention.
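
The unicast idea can be sketched in a few lines of Python. This is only an illustration of the wake‑up pattern, not DongSQL's C++ code: the Follower class, leader_commit and run_group are hypothetical names, and the flush is a stand‑in callback. Each follower owns a private condition variable, and the leader walks the queue and notifies followers one at a time.

```python
import threading


class Follower:
    """One committing session; owns a private condition variable."""

    def __init__(self, name):
        self.name = name
        # Private to this thread, in the spirit of THD::COND_wakeup_ready.
        self.cond = threading.Condition()
        self.committed = False

    def wait_for_commit(self):
        with self.cond:
            # Predicate loop: safe even if the leader signals before we wait.
            while not self.committed:
                self.cond.wait()


def leader_commit(queue, flush):
    """Flush once for the whole group, then wake each follower individually."""
    flush()
    for follower in queue:
        with follower.cond:
            follower.committed = True
            follower.cond.notify()  # unicast: wakes only this follower


def run_group(n_followers):
    """Start n followers, let one leader commit for them, report the outcome."""
    flushes = []
    queue = [Follower(f"thd-{i}") for i in range(n_followers)]
    threads = [threading.Thread(target=f.wait_for_commit) for f in queue]
    for t in threads:
        t.start()
    leader_commit(queue, lambda: flushes.append("binlog"))
    for t in threads:
        t.join(timeout=5)
    woken = sum(1 for t in threads if not t.is_alive())
    return woken, len(flushes)
```

Calling run_group(8) performs a single simulated flush and wakes all eight followers. Because no two followers share a condition variable or its lock, a wake‑up never makes unrelated threads compete for the same mutex.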

Advantages:

Applicable to all scenarios with master‑slave replication enabled.

Each thread waits independently, eliminating the thundering‑herd effect.

High‑concurrency write workloads see noticeable gains; low‑concurrency workloads remain unchanged.

Test environment: 16 CPU / 32 GB, sysbench with 16 tables, each containing 10 million rows.

Performance data: (see images for oltp_insert, oltp_update_non_index, oltp_delete).

-- Enable group‑commit unicast (default ON)
SET GLOBAL binlog_group_commit_unicast_mode = ON;

2. Semi‑Sync Replication Optimization

Key changes:

Update binlog end position immediately after flush, reducing follower wait latency.

Flush relay log on ACK, ensuring consistency while improving replication speed.

Dual‑side optimization (transaction side and log side) for comprehensive gains.

Test scenario: sysbench oltp_write_only on 20 tables (10 million rows each) in a master‑slave semi‑sync setup.

Results: DongSQL 8.0 shows up to 25 % improvement in low‑concurrency cases; DongSQL 5.7 also benefits.

-- Relevant system variables
rpl_update_binlog_end_pos_after_flush = ON   -- effective on master
rpl_semi_sync_flush_relay_log_on_ack = ON   -- effective on slave

3. Execution Plan Cache

Problem: Earlier versions regenerated the execution plan on every execution of an identical SQL statement, wasting CPU cycles.

Solution: V1.2.0 adds a Plan Cache that stores SELECT execution plans, avoiding repeated generation and boosting query throughput.

-- Enable plan cache
SET GLOBAL plan_cache_enabled = ON;
-- View cache status
SHOW STATUS LIKE 'plan_cache%';
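
As a rough illustration of what a plan cache does, the Python sketch below keys plans by a normalized statement text and evicts the least recently used entry. The article does not say how DongSQL keys or evicts its cache, so the normalization rule, the PlanCache class and its capacity parameter are all assumptions made for the example.

```python
import re
from collections import OrderedDict

# Quoted strings and standalone numbers are treated as parameters (assumption).
_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\b\d+(?:\.\d+)?\b")


def normalize(sql):
    """Replace literals with '?' and collapse whitespace and case."""
    return " ".join(_LITERAL.sub("?", sql).split()).lower()


class PlanCache:
    """LRU cache from normalized statement text to an execution plan."""

    def __init__(self, optimize, capacity=128):
        self._optimize = optimize      # the expensive planning function
        self._capacity = capacity
        self._plans = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql):
        key = normalize(sql)
        if key in self._plans:
            self._plans.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return self._plans[key]
        self.misses += 1
        plan = self._optimize(key)         # plan generated only on a miss
        self._plans[key] = plan
        if len(self._plans) > self._capacity:
            self._plans.popitem(last=False)  # evict least recently used
        return plan
```

With this sketch, "SELECT * FROM t1 WHERE id = 5" and "select * from t1 where id = 42" normalize to the same key, so the second call is served from the cache without invoking the optimizer again.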

Test setup: sysbench standard test (8 CPU / 16 GB, 10 tables, 1 million rows each) comparing cache ON vs OFF.

Performance data: (see images for select_random_ranges and select_random_points).

Best‑practice scenarios:

Frequent execution of identical SQL patterns.

High‑concurrency OLTP short queries.

Prepared statements and parameterized queries.

4. Single‑Point Query Optimization

V1.1.0 recap: Introduced bypass for primary‑key equality queries, skipping part of the SQL layer.

V1.2.0 enhancements:

Support for Unique Index queries.

Support for VARCHAR and CHAR column types.

-- Primary‑key query optimization
SELECT * FROM orders WHERE order_id = 1001;  -- INT primary key
-- Unique‑index query optimization
SELECT * FROM users WHERE email = 'user@example.com';  -- VARCHAR unique index (placeholder address)
-- CHAR type optimization
SELECT * FROM products WHERE product_code = 'ABC123';  -- CHAR unique index

Note: Applicable only to single‑table primary‑key or unique‑index lookups.

5. Massive‑Table Fast Startup

Problem: At startup, the engine opens every tablespace file and reads its first page to obtain the space ID. With hundreds of thousands of tables, startup can take minutes.

Typical scenario: DongDAL sharding architecture where a single instance may hold tens of thousands of tables.

Solution: V1.2.0 stores space‑id in file extended attributes (xattr). During startup, the engine reads the attribute directly, avoiding per‑file page parsing.

-- System variable (default ON)
innodb_fast_initial_fil_scan = ON;
-- View the xattr of a table file (shell command, not SQL)
getfattr -n user.dongsql.space_id ./t1.ibd
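
The lookup order described above (xattr first, page parsing as a fallback) can be sketched in Python. The attribute name comes from the article; everything else is an assumption for illustration: the value is stored here as decimal ASCII, the helper names are hypothetical, and the fallback reads the 4‑byte big‑endian space‑id field at offset 34 of the page header, which is where stock InnoDB keeps it.

```python
import os
import struct

XATTR_NAME = "user.dongsql.space_id"  # attribute name taken from the article
SPACE_ID_OFFSET = 34                  # stock InnoDB page-header offset (assumed unchanged)


def read_space_id_from_page(path):
    """Slow path: open the file and parse the space id out of the first page."""
    with open(path, "rb") as f:
        f.seek(SPACE_ID_OFFSET)
        return struct.unpack(">I", f.read(4))[0]


def write_space_id_xattr(path, space_id):
    """Store the space id as an extended attribute; False if unsupported."""
    if not hasattr(os, "setxattr"):   # os.setxattr is Linux-only
        return False
    try:
        os.setxattr(path, XATTR_NAME, str(space_id).encode("ascii"))
        return True
    except OSError:                   # e.g. filesystem without user xattrs
        return False


def read_space_id(path):
    """Fast path via xattr, falling back to page parsing when it is missing."""
    if hasattr(os, "getxattr"):
        try:
            return int(os.getxattr(path, XATTR_NAME).decode("ascii"))
        except (OSError, ValueError):
            pass                      # attribute absent or unreadable
    return read_space_id_from_page(path)
```

The fallback keeps the scan correct on filesystems that do not support user extended attributes; only the speedup is lost there.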

Features:

Automatic xattr write on table creation.

Compatibility: tablespace files created by older versions get their xattr populated automatically on the first startup after upgrade.

Safety: detects and repairs xattr conflicts.

Runtime toggle support.

Performance breakthrough: Test on an 8 CPU / 16 GB instance with 100 k–1 M tables shows dramatic reduction in scan time and overall startup latency.

6. SIMD Acceleration

Background: SQL digest (hash) is critical for performance monitoring; computing it becomes a bottleneck under high concurrency.

Optimization: V1.2.0 uses AVX2 instructions to accelerate the DIGEST_HASH_TO_STRING function, leveraging SIMD parallelism.

-- System variable (default ON)
enable_digest_hash_to_string_avx2 = ON;

Test environment: 8 CPU / 16 GB, DongSQL 8.0, 16 tables (1 million rows each), oltp_point_select workload.

Result: Significant throughput increase; applicable only on CPUs supporting AVX2 (Intel Haswell+, AMD Excavator+).

7. Statement Outline (Ported to 5.7)

V1.1.0 introduced Statement Outline in the 8.0 branch to pin execution plans. V1.2.0 back‑ports this capability to the 5.7 branch, providing:

Execution‑plan fixation (binding a specific plan to a SQL statement).

Hint injection (dynamically adding optimizer hints).

Emergency intervention (quickly remedying problematic SQL in production).

CALL dbms_outln.add_index_outline('ecommerce_db', '', 1, 'USE INDEX', 'idx_status', '', 'SELECT * FROM orders WHERE status = "PAID"');
CALL dbms_outln.add_optimizer_outline('ecommerce_db', '', 1, '/*+ ccl_queue_digest(5) */', 'SELECT * FROM hot_products WHERE status = 1');

8. Hotspot‑Row Update Optimization

Business pain: High‑frequency updates to the same row (e.g., inventory deduction, balance decrement) cause massive lock contention, semi‑sync ACK delays, B+‑tree traversal overhead, and heavy deadlock detection.

Traditional approach limitation: Hint‑based throttling and fast commit reduce the cost of deadlock detection but do not eliminate the inherently serial execution of UPDATE statements.

New design:

SQL‑layer row cache: Maintains a full copy of InnoDB rows in the SQL layer protected by a mutex.

Dual update units: Each row has two alternating update units; only one holds the X‑lock at a time.

Leader‑Follower model:

Leader thread acquires the X‑lock, updates InnoDB, writes the new version to the SQL‑layer cache, then flushes redo & binlog.

Follower threads read from the cache, apply updates, and write back without entering InnoDB.

Key optimizations:

Reduced B+‑tree traversal (only leader traverses).

Lower deadlock detection cost (fewer concurrent waiting threads).

Network‑delay amortization (single semi‑sync ACK shared by the whole update unit).

Crash‑recovery safety via binlog integrity markers.
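
The leader‑follower merging can be approximated with a small Python sketch. It is a deliberate simplification and not DongSQL's design in detail: followers here just hand their delta to the leader instead of updating a cached row copy, and the dual update units, the X‑lock, redo and binlog are not modeled. HotRow and run_hotspot are hypothetical names.

```python
import threading


class HotRow:
    """Merges concurrent increments so storage is touched once per batch."""

    def __init__(self, value=0):
        self._lock = threading.Lock()  # protects the pending list and leader flag
        self._pending = []             # (delta, done_event) pairs awaiting merge
        self._leader_active = False
        self.storage_value = value     # stands in for the row stored in InnoDB
        self.storage_writes = 0        # number of times "storage" was written

    def add(self, delta):
        done = threading.Event()
        with self._lock:
            self._pending.append((delta, done))
            is_leader = not self._leader_active
            if is_leader:
                self._leader_active = True
        if not is_leader:
            done.wait()                # follower: the leader applies our delta
            return
        while True:                    # leader: drain batches until none remain
            with self._lock:
                batch, self._pending = self._pending, []
                if not batch:
                    self._leader_active = False
                    return
            self.storage_value += sum(d for d, _ in batch)  # one write per batch
            self.storage_writes += 1
            for _, event in batch:
                event.set()            # release every thread in this batch


def run_hotspot(n_threads):
    """Have n threads each add 1 to the same row; return (value, writes)."""
    row = HotRow()
    threads = [threading.Thread(target=row.add, args=(1,)) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return row.storage_value, row.storage_writes
```

After run_hotspot(64) the row value is 64, while the number of storage writes is at most 64 and shrinks as more updates arrive while a leader is busy. That is the effect the design relies on: per‑update costs are paid once per batch rather than once per statement.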

-- Enable hotspot update (default ON)
SET GLOBAL hotspot = ON;
SET GLOBAL hotspot_for_autocommit = ON;   -- also works in autocommit mode
SET GLOBAL hotspot_for_trigger = ON;      -- enables tables with UPDATE triggers
SET GLOBAL hotspot_update_max_wait_time = 5000000;   -- leader wait timeout (µs)

Test environment: 8 CPU / 16 GB, DongSQL 8.0, single table with 10 rows, hotspot update on id=1.

Results:

Low‑concurrency (≤2 threads): 10‑30 % performance drop due to merge overhead.

High‑concurrency (≥4 threads): dramatic gains; 128 threads reach 34,127 TPS, ≈9× baseline.

Latency drops from 72.69 ms to 3.78 ms (≈95 % reduction) under high load.
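
The quoted latency reduction can be checked directly from the two measurements given above:

```python
before_ms = 72.69  # latency without hotspot optimization, from the results above
after_ms = 3.78    # latency with hotspot optimization, from the results above

# Relative reduction in percent.
reduction_pct = (before_ms - after_ms) / before_ms * 100
print(f"{reduction_pct:.1f}%")  # prints 94.8%
```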

Best practice: Enable for high‑concurrency hotspot updates (e.g., inventory, balance, ID generators); avoid in low‑concurrency workloads; always use the COMMIT_ON_SUCCESS hint.

UPDATE /*+ COMMIT_ON_SUCCESS */ t1 SET val = val + 1 WHERE id = 1;

9. High‑Volume Testing Support

Business pain: Traditional load testing either creates a separate instance and duplicates all data, which consumes significant resources, or runs against the production instance and impacts business traffic.

Solution: V1.2.0 adds a low‑priority queue in the thread pool. SQL statements marked with the DONGSQL_LOW_PRIO hint are routed to this queue, while normal business SQL stays in the high‑priority queue.

Architecture:

Auto‑enable module: Detects the presence of the low‑priority hint and turns on traffic‑detection.

Traffic‑detection module: Uses MSG_PEEK to read the first N bytes of a network packet before full parsing; if the hint is present, the task is placed into the low‑priority queue.

Auto‑disable module: Periodically checks if load‑testing is inactive and turns off detection after a 30‑second grace period.

Core innovations:

MSG_PEEK pre‑read technique enables hint detection without affecting normal packet processing.

Automatic on/off mechanism ensures zero overhead when load‑testing is absent.
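
The MSG_PEEK pre‑read can be demonstrated with ordinary sockets in Python. This is a sketch under simplifying assumptions: the payload is raw SQL text rather than a real client/server protocol packet, PEEK_BYTES is an arbitrary value for the "first N bytes", and classify and demo are hypothetical helpers. The hint text is the one shown in the article.

```python
import socket

LOW_PRIO_HINT = b"/*+ DONGSQL_LOW_PRIO */"  # hint text from the article
PEEK_BYTES = 64                              # hypothetical value for "first N bytes"


def classify(conn):
    """Peek at the head of the incoming data without consuming it."""
    head = conn.recv(PEEK_BYTES, socket.MSG_PEEK)
    return "low" if LOW_PRIO_HINT in head else "high"


def demo(sql):
    """Send one statement over a socket pair and classify it on arrival."""
    client, server = socket.socketpair()
    try:
        client.sendall(sql)
        queue = classify(server)
        # The peek left the data in the socket buffer, so the normal read
        # path still receives the complete statement.
        payload = server.recv(4096)
        return queue, payload
    finally:
        client.close()
        server.close()
```

For the low‑priority SELECT from the example below, demo returns "low" together with the untouched statement bytes; the same statement without the hint is classified as "high".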

-- Example load‑testing SQL (low priority)
SELECT /*+ DONGSQL_LOW_PRIO */ id, name FROM user WHERE age > 18;
UPDATE /*+ DONGSQL_LOW_PRIO */ user SET age = 21 WHERE id = 1;
-- Normal business SQL (high priority)
SELECT id, name FROM user WHERE age > 18;
UPDATE user SET age = 21 WHERE id = 1;

System variables:

enable_high_volume_testing = ON   -- default ON
load_testing_disable_cycle_period = 30   -- auto‑disable interval (seconds)

10. Other Important Optimizations

10.1 Monitoring Table Enhancement

The new table performance_schema.dongsql_statement_summary records end‑to‑end metrics for each statement (parse time, optimizer time, execution time, lock wait, etc.). The threshold parameter dongsql_query_time controls which statements are persisted.

SELECT * FROM performance_schema.dongsql_statement_summary WHERE digest_text LIKE '%orders%' ORDER BY timer_wait DESC LIMIT 10;
SET GLOBAL dongsql_query_time = 0.1;   -- record only statements >100 ms

10.2 Jemalloc Profiling

Integrates malloc_stats_print for memory‑usage analysis, aiding leak detection and tuning.

10.3 fdatasync Optimization

Replaces fsync with fdatasync to skip unnecessary metadata flushes, yielding ~5 % throughput gain and latency reduction in write‑only sysbench workloads.
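
The difference between the two calls is easy to show at the system‑call level. The Python sketch below is not DongSQL code; durable_append is a hypothetical helper that appends to a file and then syncs with fdatasync when available, falling back to fsync on platforms that lack it.

```python
import os


def durable_append(path, data, data_only=True):
    """Append data and make it durable before returning."""
    # fdatasync skips metadata that is not needed to read the data back
    # (such as timestamps); it is missing on some platforms, hence the fallback.
    sync = getattr(os, "fdatasync", os.fsync) if data_only else os.fsync
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:          # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        sync(fd)
    finally:
        os.close(fd)
```

Note that fdatasync still flushes metadata required to retrieve the data, such as a changed file size, so the saving is largest when writes land in already allocated space.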

11. Performance Benchmark Summary

Aggregated charts (see images) show that V1.2.0 outperforms V1.1.x across the benchmarked scenarios:

Write‑intensive workloads benefit most from group‑commit unicast and semi‑sync replication improvements.

Read‑intensive workloads gain from single‑point query enhancements and SIMD acceleration.

Mixed read/write workloads see an overall ~40 % uplift.

Overall version comparison (16 CPU / 32 GB, sysbench 16 tables × 1 million rows) confirms substantial gains for both 8.0 and 5.7 branches.

12. Future Roadmap

Continuous performance tuning of the core execution engine for OLTP.

Intelligent operations: richer monitoring, diagnostic tools.

Cloud‑native evolution: separation of compute and storage, delivering high‑performance, low‑cost database services.

13. Conclusion

From V1.1.0 to V1.2.0, DongSQL has achieved breakthroughs in hotspot‑row updates, replication performance, plan caching, high‑volume testing, massive‑table startup, and SIMD acceleration. These advances not only boost raw performance but also provide a more robust technical foundation for JD’s retail scenarios.
