Tag

Apache Druid

Xingsheng Youxuan Technology Community
Jul 5, 2022 · Databases

Mastering Apache Druid: Architecture, Real-Time Ingestion, and Query Optimization

Apache Druid is a distributed, column‑store OLAP engine designed for massive real‑time data ingestion and sub‑second queries; this article explains its LSM‑tree‑inspired architecture, DataSource and Segment structures, memory‑based querying, practical deployment steps, common pitfalls, and optimization techniques for high‑throughput analytics.

Apache Druid · Data ingestion · Distributed Database
Shopee Tech Team
Jan 13, 2022 · Big Data

Engineering Practices and Performance Optimizations of Apache Druid for Real‑Time OLAP at Shopee

Shopee’s engineering team scaled a 100‑node Apache Druid cluster for real‑time OLAP by redesigning the Coordinator load‑balancing algorithm, adding incremental metadata pulls, introducing a segment‑merged result cache, and building exact‑count and flexible sliding‑window operators, while planning cloud‑native deployment.

Apache Druid · Big Data · Cache
Beike Product & Technology
Jul 1, 2021 · Big Data

Oak Off‑Heap Key‑Value Map and Its Application in Apache Druid for Real‑Time and Batch Ingestion

The article introduces Oak, an off‑heap concurrent key‑value map, explains its design and performance benefits over ConcurrentSkipListMap, and details extensive offline and real‑time ingestion experiments in Apache Druid that demonstrate reduced memory usage, lower CPU consumption, and faster data loading.

Apache Druid · Incremental Index · Java
Tencent Cloud Developer
Sep 22, 2020 · Big Data

Evolution and Architecture of Beike's OLAP Platform: From Hive/MySQL to Multi‑Engine Flexibility

Beike’s OLAP platform has progressed from a basic Hive‑MySQL batch pipeline to a Kylin‑based single‑engine solution, and now to a flexible multi‑engine architecture that uses a query‑engine layer to route metrics across Kylin, Druid, ClickHouse and Doris, dramatically cutting cube‑build times, supporting real‑time ingestion, and paving the way for further engine consolidation and automated performance routing.

Apache Druid · Apache Kylin · Beike
Big Data Technology Architecture
Aug 13, 2020 · Databases

Deep Dive into Apache Druid V1 Storage Format: Index Structures and Disk Layout

This article provides a detailed analysis of Apache Druid V1's column‑oriented storage format, covering dimension dictionaries, variable‑length encoded values, bitmap inverted indexes, array handling, and the physical metadata layout that enables sub‑second OLAP queries on massive datasets.

Apache Druid · OLAP · Storage Format
DataFunTalk
Jul 4, 2020 · Databases

Deep Dive into Apache Druid V1 Data Storage Format: Index Structures and Disk Layout

This article provides an in‑depth analysis of Apache Druid V1's column‑oriented storage format, covering dimension structures, dictionaries, variable‑length integer encoding, inverted indexes, array handling, and how these components are used during query execution.

Apache Druid · Columnar Database · Indexing
DataFunTalk
Jun 14, 2020 · Big Data

Practical Experience and Optimization of Apache Druid for Real‑Time OLAP at iQIYI

This article describes how iQIYI evaluated various OLAP engines, selected Apache Druid for real‑time analytics, detailed its architecture, identified performance bottlenecks in the Coordinator, Overlord, and indexing, applied configuration and resource‑allocation optimizations, and built a user‑friendly RAP platform to democratize real‑time data analysis.

Apache Druid · Real-time OLAP · Streaming Analytics
iQIYI Technical Product Team
May 29, 2020 · Big Data

Real-Time OLAP with Apache Druid at iQiyi: Architecture, Optimizations, and Platform Practices

iQiyi replaced its offline OLAP stack with Apache Druid, leveraging its real‑time, multi‑dimensional engine and a five‑component architecture, while solving coordinator and overlord bottlenecks, optimizing indexing resources, adopting KIS mode, and building the self‑service RAP platform that now powers thousands of low‑latency dashboards.

Apache Druid · KIS · Real-time OLAP
iQIYI Technical Product Team
Jan 9, 2020 · Big Data

Design and Evolution of iQIYI Real-Time Analysis Platform (RAP)

iQIYI’s Real‑Time Analysis Platform (RAP) combines Apache Druid with Spark/Flink to deliver minute‑level, low‑latency multidimensional analytics via a web wizard, supporting hundreds of streaming tasks and thousands of reports across membership, recommendation, and TV monitoring, while simplifying development and maintenance.

Apache Druid · Flink · OLAP
Youzan Coder
Feb 13, 2019 · Big Data

Druid OLAP Platform Practice at YouZan: Architecture, Features, and Challenges

YouZan adopted Metamarkets' Druid OLAP platform, which offers millisecond‑level interactive queries, high availability, horizontal scalability, and rich SQL/API query types, by configuring simple ingestion tasks that automatically manage real‑time and batch data, tiered hot/cold storage, and monitoring, while still facing ingestion limits, lack of joins, and occasional latency spikes.

Apache Druid · Druid · OLAP