Tag

Kudu

Architect
Oct 6, 2021 · Big Data

Design and Implementation of a Real-time and Offline Integrated Query System

This article details the requirements, architecture, and implementation of a real-time and offline integrated query system, covering data ingestion via Debezium and Confluent Platform, storage in Kudu and HDFS, querying with Presto and Kylin, and strategies for data synchronization, partitioning, and scaling.

Debezium · Kafka · Kudu
19 min read
DataFunTalk
Sep 1, 2020 · Big Data

NetEase Real-Time Computing Platform (Sloth): Architecture, Practices, and Future Outlook

This article introduces NetEase's real-time computing platform Sloth, detailing its architecture, component layers, integrated IDE, operational tooling, and unified metadata management, along with challenges such as Kudu write amplification. It also proposes a tiered real-time data-warehouse model with a vision for storage-compute separation and unified batch-stream APIs.

Flink · Kafka · Kudu
13 min read
Big Data Technology Architecture
Jun 16, 2020 · Big Data

Real-time Multi-dimensional Analytics and SlimBase State Backend at Kuaishou: Flink Applications and Optimizations

This article describes how Kuaishou leverages Apache Flink for large‑scale real‑time multi‑dimensional analytics, details the architecture of its analytics platform using Kudu storage and KwaiBI, and introduces SlimBase—a lightweight, embedded shared state backend that replaces RocksDB to reduce I/O, latency, and CPU overhead.

Flink · Kuaishou · Kudu
17 min read
Big Data Technology Architecture
Feb 3, 2020 · Big Data

NetEase Data Foundation Platform Construction – Technical Sharing

This article, originally shared by NetEase's data expert Jiang Hongxiang on DataFun, outlines the construction of NetEase's data foundation platform, covering database kernel insights and the pairing of the ad-hoc query engine Impala with the distributed storage system Kudu.

Data Infrastructure · Impala · Kudu
4 min read
DataFunTalk
Jan 16, 2019 · Big Data

NetEase Data Infrastructure: Database Technologies and Big Data Platform Overview

This article presents NetEase Hangzhou Research Institute's experience in building a data infrastructure, covering database innovations such as InnoSQL, NTSDB, and InnoRocks, as well as the integration of big‑data components like HDFS, Spark, Impala, and Kudu to enable efficient storage, processing, and real‑time analytics.

Database · Impala · InnoSQL
12 min read