Design and Evolution of Airbnb's Log Data Storage and Query Platform

The article describes how Airbnb's data infrastructure team built a next‑generation log storage and query platform to improve data quality, timeliness, flexibility, and anomaly detection, outlining the system architecture, key requirements, five improvement areas, and the resulting benefits.

Qunar Tech Salon

As Airbnb’s business grew rapidly, traditional batch‑processing and unstructured log handling could no longer meet application needs, prompting the data infrastructure team to develop a new log storage and query platform focused on data quality, real‑time availability, flexible querying, multidimensional analytics, and anomaly detection.

Background: Logs are the bridge between the product and the data warehouse, and they feed fraud detection, user acquisition, A/B testing, and product decisions. The previous system, however, carried more than 800 types of unstructured JSON logs and suffered from frequent errors, a lack of monitoring, and poor reliability.

Platform Requirements: The new platform must ensure data timeliness (predictable ingestion), completeness (no loss or duplication), and quality (valid, deserializable records).
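
The three requirements can be made concrete as checks over a batch of ingested records. The following is a minimal sketch under stated assumptions, not the platform's actual implementation: the field names `event_id` and `emitted_at`, the function `check_batch`, and the one-hour delay limit are all hypothetical and do not come from the article.

```python
import json
from collections import Counter


def check_batch(raw_records, expected_ids, now, max_delay_s=3600):
    """Report quality, completeness, and timeliness problems for one batch.

    raw_records: iterable of JSON strings. Each record is assumed to carry a
    hypothetical "event_id" field and an "emitted_at" field (epoch seconds).
    expected_ids: the event ids the producers claim to have sent.
    now: current time in epoch seconds.
    """
    seen = Counter()
    invalid = 0
    late = 0
    for raw in raw_records:
        try:
            record = json.loads(raw)
            event_id = record["event_id"]
            emitted_at = float(record["emitted_at"])
        except (ValueError, KeyError, TypeError):
            # Quality: the record cannot be deserialized into the expected shape.
            invalid += 1
            continue
        seen[event_id] += 1
        if now - emitted_at > max_delay_s:
            # Timeliness: the record arrived later than the allowed delay.
            late += 1
    return {
        "invalid": invalid,
        "late": late,
        # Completeness: nothing lost and nothing delivered twice.
        "missing": sorted(set(expected_ids) - set(seen)),
        "duplicated": sorted(i for i, n in seen.items() if n > 1),
    }
```

A clean batch yields zero invalid and late records and empty `missing` and `duplicated` lists; any other result points at the requirement that was violated.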

Architecture Overview: Client applications (web and mobile) generate logs that are sent to Kafka via proxies. Downstream Camus jobs consume the Kafka messages and land the data in Hive tables, which are queried offline through Hive and Presto, while derived databases feed data products back to front‑end services. The stack is built on Ruby services and internal clusters.

Key Improvement Areas (five):

Module monitoring to guarantee correctness and reliability of each pipeline component.

End‑to‑end audit mechanisms for overall system reliability.

Enforced log format constraints to reduce invalid logs and improve data quality.

Anomaly detection modules for rapid identification of failures.

Real‑time stream processing to enable faster queries and aggregations.
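
The article does not say how the log format constraints are enforced. As one possible illustration of the idea, the sketch below rejects records that do not match a declared schema before they enter the pipeline. The `Field` class, the `search_click` schema, and its field names are hypothetical examples, not Airbnb's definitions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Field:
    name: str
    type_: type
    required: bool = True


# Hypothetical schema for an illustrative "search_click" log type.
SEARCH_CLICK_SCHEMA = (
    Field("event_id", str),
    Field("user_id", int),
    Field("timestamp_ms", int),
    Field("query", str, required=False),
)


def validate(record: dict, schema: Sequence[Field]) -> List[str]:
    """Return a list of schema violations; an empty list means the record is valid."""
    errors = []
    known = {f.name for f in schema}
    for f in schema:
        if f.name not in record:
            if f.required:
                errors.append(f"missing required field: {f.name}")
            continue
        value = record[f.name]
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        wrong_bool = f.type_ is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, f.type_):
            errors.append(f"wrong type for {f.name}: expected {f.type_.__name__}")
    for name in record:
        if name not in known:
            errors.append(f"unknown field: {name}")
    return errors
```

Rejecting unknown fields as well as missing ones is a deliberate choice in this sketch: it keeps ad hoc fields from creeping back in, which is the failure mode the article attributes to the old unstructured JSON logs.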

Module Monitoring Details: For each pipeline component, monitor process health and CPU/memory usage, compare input and output data volumes, and track seasonal patterns in those volumes so that an alert fires when a value deviates from the expected pattern.
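
The volume comparison and the seasonal check described above might look roughly like the following. This is a simplified sketch, assuming hourly counts with a daily cycle; the function names, the 1% tolerance, and the three-standard-deviation threshold are illustrative assumptions rather than values from the article.

```python
from statistics import mean, pstdev


def volume_mismatch(count_in, count_out, tolerance=0.01):
    """True when a component's output volume differs from its input by more than tolerance."""
    if count_in == 0:
        return count_out != 0
    return abs(count_in - count_out) / count_in > tolerance


def seasonal_anomaly(history, current, period=24, threshold=3.0):
    """Compare `current` with past values at the same point in the cycle.

    history: past counts, oldest first, one per interval (for example hourly).
    current: the count for the interval immediately after `history`.
    period: number of intervals in one seasonal cycle (24 for hourly data
    with a daily pattern).
    """
    # Values at the same phase in earlier cycles, e.g. the same hour on previous days.
    same_phase = history[len(history) % period::period]
    if len(same_phase) < 2:
        return False  # not enough history to judge
    baseline = mean(same_phase)
    spread = pstdev(same_phase)
    if spread == 0:
        return current != baseline
    return abs(current - baseline) / spread > threshold
```

Comparing against the same hour on earlier days, rather than against the previous hour, keeps the normal daily rise and fall of traffic from triggering alerts.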

Conclusion: The revamped log platform provides system‑level detection and alerts, quantifies platform reliability, enforces log format standards, introduces real‑time stream processing, and offers an anomaly detection service for swift issue identification.

Tags: monitoring, data pipeline, real-time processing, log platform, Airbnb
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
