Tag

Real-time Processing

1 view collected around this technical thread.

Full-Stack Internet Architecture
May 27, 2025 · Big Data

Understanding Event Streaming in Kafka: Core Concepts, Architecture, and Use Cases

This article explains Kafka's event streaming concept: what events and streams are; core components such as producers, topics, partitions, consumers, and persistence; and typical use cases including real‑time data pipelines, event‑driven architecture, stream processing, and log aggregation. It highlights Kafka's role as foundational big‑data infrastructure.

Event Streaming · Kafka · Message Queues
0 likes · 7 min read
Full-Stack Internet Architecture
May 20, 2025 · Big Data

Why Learn Kafka? Core Benefits, Use Cases, and a Summary

This article explains why Kafka is widely adopted by top companies, outlines its high throughput, scalability, and durability, and describes key real‑time data pipeline, stream processing, and big‑data integration scenarios, concluding that mastering Kafka is essential for modern backend and data engineering roles.

Data Engineering · Kafka · Real-time Processing
0 likes · 4 min read
php中文网 Courses
Apr 7, 2025 · Backend Development

Implementing Sliding Window Algorithms in PHP for Real-Time Data Processing

This article introduces the sliding window technique, demonstrates efficient PHP implementations for computing averages and handling real-time streams, provides optimization strategies, and outlines practical applications such as financial analysis, network monitoring, and recommendation systems, highlighting performance considerations for backend development.

PHP · Real-time Processing · Sliding Window
0 likes · 6 min read
AntData
Mar 20, 2025 · Big Data

Design and Optimization of Real‑time Data Lake Tables with Paimon and Flink for Advertising Diagnostics

This article presents a comprehensive exploration of using Apache Paimon and Flink to design lake tables that support minute‑level latency, low cost, and unified batch‑stream processing for advertising data, covering schema design, partitioning strategies, performance trade‑offs, cost analysis, and operational best practices.

Advertising Analytics · Data Lake · Paimon
0 likes · 34 min read
Zhuanzhuan Tech
Mar 13, 2025 · Backend Development

Design and Implementation of a Real-Time Product Tagging Platform for a Second‑Hand E‑Commerce System

This article presents a comprehensive technical case study of a three‑layer product‑tagging platform that addresses the challenges of fine‑grained operations, ensures real‑time tag updates, guarantees data consistency, and eliminates read bottlenecks through traffic separation, event‑driven processing, deduplication MQ, and multi‑level caching.

Real-time Processing · backend architecture · caching
0 likes · 13 min read
vivo Internet Technology
Dec 18, 2024 · Big Data

Kafka Streams: Architecture, Configuration, and Monitoring Use Cases

Kafka Streams is a client library that enables low‑latency, fault‑tolerant real‑time processing of Kafka data through configurable topologies, time semantics, and state stores, and the article explains its architecture, essential configurations, monitoring‑focused ETL example, performance tuning, and strategies for handling partition skew.

ETL · Java · Kafka Streams
0 likes · 25 min read
JD Retail Technology
Oct 11, 2024 · Big Data

JD Retail Data Lake Architecture: Challenges, Optimizations, and Future Plans

This article presents JD Retail's data lake architecture overhaul, detailing the shortcomings of the Lambda model, the migration to Flink‑Hudi‑Spark pipelines, performance gains, storage savings, unified APIs, and upcoming improvements for resilience and automation.

Data Lake · Hudi · Real-time Processing
0 likes · 11 min read
DataFunSummit
Jul 1, 2024 · Big Data

Optimizing JD Retail Data Architecture: From Lambda to Real‑time Unified Processing with Flink, Hudi, and StarRocks

This article details JD Retail's transition from a complex Lambda architecture to a unified real‑time data pipeline using Flink, Hudi, and StarRocks, addressing data completeness versus latency, reducing maintenance costs, improving storage efficiency, and delivering faster, more consistent analytics for business users.

Hudi · JD Retail · Real-time Processing
0 likes · 13 min read
DataFunTalk
May 13, 2024 · Big Data

Data Integration Maturity Model: From ETL to EtLT

The article examines the evolution of data integration architectures—from traditional ETL through ELT to the emerging EtLT model—highlighting their advantages, disadvantages, industry trends, maturity stages, and practical guidance for enterprises and professionals navigating modern big‑data pipelines.

DataOps · ELT · ETL
0 likes · 31 min read
iQIYI Technical Product Team
Apr 26, 2024 · Big Data

iQIYI Real-time Lakehouse: Stream‑Batch Unified Architecture

iQIYI replaced its costly Lambda architecture with a unified Iceberg‑based lakehouse that combines Flink streaming and batch processing, cutting data latency from hours to minutes, supporting thousands of tables via a multi‑table sink, guaranteeing completeness, and saving millions of RMB in operational costs.

Data Lake · Iceberg · Real-time Processing
0 likes · 18 min read
DataFunSummit
Mar 17, 2024 · Big Data

OPPO Smart Data Lakehouse: Architecture, Real‑time Lakehouse, and Technical Practices

This article presents OPPO's smart data lakehouse solution, describing its massive EB‑scale architecture, the integration of batch and streaming engines, the Glacier service for table management, schema‑adaptive ingestion, performance optimizations, and future technical road‑maps for unified data processing.

Data Lakehouse · Iceberg · OPPO
0 likes · 15 min read
DataFunTalk
Mar 5, 2024 · Big Data

Changan Automotive Big Data Platform: Challenges and Practices in Connected Vehicle Scenarios

This article outlines the rapid growth of data in the smart automotive sector and details Changan's big data platform challenges—high cost, data accessibility, and operational complexity—and the practical migration from a Lambda to a unified Kappa architecture that delivers significant storage, compute, and maintenance efficiencies.

Connected Vehicles · Kappa architecture · Real-time Processing
0 likes · 14 min read
Xiaohongshu Tech REDtech
Mar 4, 2024 · Big Data

Integrating Data Lake Technologies with Data Warehouse Architecture at Xiaohongshu: Practices and Performance Optimizations

Xiaohongshu’s data‑warehouse team integrated Apache Iceberg‑based data‑lake techniques into its existing warehouse, replacing the legacy Hive/Spark stack with global sorting, Z‑order, and upsert‑enabled tables, which cut query latency by up to 90 %, boosted data freshness by 50 %, slashed storage costs by 83 % and saved tens of thousands of GB‑hours of compute daily.

Apache Iceberg · Data Lake · Performance Optimization
0 likes · 19 min read
DataFunTalk
Feb 27, 2024 · Big Data

Best Practices of Cloud‑Native OLAP Architecture and Logistics Warning at Jushuitan

This article presents Jushuitan's cloud‑native OLAP architecture, detailing its evolution, current big‑data stack—including DataWorks, MaxCompute, Flink, Hologres, and Aerospike—along with logistics warning workflows, rule‑matching mechanisms, real‑time processing challenges, and future scalability plans.

Hologres · OLAP · Real-time Processing
0 likes · 20 min read
DataFunTalk
Feb 25, 2024 · Big Data

Implementation Practice of Bilibili's Tag System: Evolution, Architecture, and Future Plans

This article details Bilibili's tag system from its 2021 inception through successive redesigns, describing the three‑layer architecture, data flow pipelines using Hive, Iceberg, Spark and ClickHouse, crowd selection DSL, online services with Redis, performance optimizations, and upcoming governance and quality initiatives.

ClickHouse · Data Engineering · Real-time Processing
0 likes · 12 min read
DataFunSummit
Jan 25, 2024 · Big Data

Best Practices of Jushuitan Cloud‑Native OLAP Architecture and Logistics Warning

This article presents Jushuitan's cloud‑native OLAP architecture, covering business background, data‑warehouse evolution, real‑time processing with Flink, Hologres, and Aerospike, and detailed logistics‑warning use cases, followed by technical challenges, future outlook, and a Q&A on implementation details.

Logistics Warning · OLAP · Real-time Processing
0 likes · 20 min read
vivo Internet Technology
Jan 24, 2024 · Big Data

Evolution of Vivo's Trillions-Scale Data Architecture: Dual-Active Real-Time and Offline Computing

Vivo’s trillion‑scale data platform evolved into a dual‑active real‑time and offline architecture that leverages multi‑datacenter clusters, Kafka/Pulsar caching, a unified sorting layer, HBase‑backed dimension tables, and micro‑batch Spark jobs to deliver low‑cost, high‑performance processing, 99.9% availability, and 99.9995% data integrity.

Data Architecture · Data Integrity · HBase
0 likes · 16 min read
Bilibili Tech
Dec 15, 2023 · Artificial Intelligence

Bilibili's AI-Powered Video Frame Interpolation: Techniques, Challenges, and Deployment

Bilibili’s AI‑driven frame‑interpolation pipeline upgrades low‑frame-rate videos to smooth high‑frame-rate 1080p playback by optimizing optical‑flow models for large motion, texture and text artifacts, pruning for speed, and deploying via the BVT SDK across on‑demand and live streams.

AI · Real-time Processing · Video Frame Interpolation
0 likes · 14 min read
Zhuanzhuan Tech
Dec 14, 2023 · Big Data

Design and Implementation of a Data Service Platform for New Media Business

This article details the background, challenges, design principles, and implementation of a unified data service platform—including data modeling, multi-source governance, real-time processing, and a Doris-based storage solution—to support large‑scale video data for a new media operation.

Apache Doris · Real-time Processing · big data
0 likes · 7 min read