Overview of the Qirin Big Data Platform: Architecture, Modules, and Capabilities
This article gives an overview of the Qirin big‑data platform. It describes the architecture and the core modules (resource management, metadata, data ingestion, task development, interactive query, and self‑service analysis) and outlines future development plans for the system.
The Qirin big‑data platform, developed by the 360 System Department since 2010, offers a one‑stop solution covering the entire big‑data lifecycle, from data acquisition and storage through processing to analysis. It serves more than 30 departments and over 1,000 users, and spans more than 25,000 servers with petabyte‑scale storage.
Platform Architecture
The functional view consists of eight modules (from bottom to top): resource management, metadata management, data collection, task development, interactive analysis, data services, permission center, and system management.
Resource Account Management
Qirin uses a multi‑tenant model with departments, resource groups, and project accounts to isolate storage, compute, and permission resources. Project accounts are required for all resource requests; personal accounts are not permitted.
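As a rough illustration of the tenancy model described above, the following Python sketch nests project accounts inside resource groups inside departments and rejects requests that do not come from a project account. The class names, the request_resource function, and the quota keys are all hypothetical; they are not Qirin's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectAccount:
    name: str
    # Hypothetical quota map, e.g. {"hdfs_tb": 10}
    quotas: dict = field(default_factory=dict)


@dataclass
class ResourceGroup:
    name: str
    projects: dict = field(default_factory=dict)


@dataclass
class Department:
    name: str
    groups: dict = field(default_factory=dict)


def request_resource(dept, group_name, account, account_kind, resource, amount):
    """Record a resource request under department -> group -> project.

    Only project accounts may request resources; personal accounts are refused.
    """
    if account_kind != "project":
        raise PermissionError("resources must be requested through a project account")
    group = dept.groups.setdefault(group_name, ResourceGroup(group_name))
    project = group.projects.setdefault(account, ProjectAccount(account))
    project.quotas[resource] = project.quotas.get(resource, 0) + amount
    return project
```

Keeping quotas on the project account, rather than on a person, is what lets storage, compute, and permissions be isolated per project in a model like this.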
Resource Management
The module allows users to request various big‑data resources, including storage (HDFS, Hive, HBase, Cassandra), compute (YARN, containers), message buses (Kafka, QBus, NSQ), analytical engines (Druid, Doris), indexing systems (Poseidon, Elasticsearch), and service resources such as ScribeQ.
Metadata Management
The module provides a unified metadata view across the platform and exposes APIs that support data asset construction and governance. It manages data sources such as MySQL, Hive, Kafka, and HDFS, and catalogs objects such as tables, topics, and HDFS paths.
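One way to picture a unified catalog over these sources is a single registry keyed by source type and object kind. The sketch below is a minimal in-memory version; CatalogEntry, Catalog, and the sample object names are assumptions made for illustration, not the platform's real metadata API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    source_type: str  # e.g. "mysql", "hive", "kafka", "hdfs"
    object_kind: str  # e.g. "table", "topic", "path"
    name: str


class Catalog:
    """In-memory stand-in for a unified metadata view."""

    def __init__(self):
        self._entries = []

    def register(self, source_type, object_kind, name):
        entry = CatalogEntry(source_type, object_kind, name)
        self._entries.append(entry)
        return entry

    def find(self, source_type=None, object_kind=None):
        """Return entries matching the given filters; None means 'any'."""
        return [
            e for e in self._entries
            if (source_type is None or e.source_type == source_type)
            and (object_kind is None or e.object_kind == object_kind)
        ]
```

For example, registering a Hive table, a MySQL table, and a Kafka topic and then calling find(object_kind="table") returns both tables regardless of which system holds them.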
Data Collection (xCollector)
The data ingestion module supports real‑time collection from logs, files, and Kafka, routing data to Kafka, HDFS, or Elasticsearch. It follows a four‑step pipeline: collect → parse → transform → send, often performing parsing and transformation at the edge.
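The four steps can be sketched as a chain of Python generators. This is only an illustration of the collect, parse, transform, send shape: the JSON input format, the "level" field, and the list used as a sink are assumptions, and none of the function names come from xCollector.

```python
import json


def collect(lines):
    """Yield non-empty raw records; a list stands in for a log file or topic."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def parse(raw_records):
    """Turn raw JSON lines into dicts, skipping anything that is not a JSON object."""
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record


def transform(events):
    """Normalize the (hypothetical) 'level' field and drop debug events."""
    for event in events:
        level = str(event.get("level", "info")).upper()
        if level != "DEBUG":
            yield {**event, "level": level}


def send(events, sink):
    """Append events to the sink and return how many were sent."""
    count = 0
    for event in events:
        sink.append(event)
        count += 1
    return count


def run_pipeline(lines, sink):
    """Run collect -> parse -> transform -> send over the input lines."""
    return send(transform(parse(collect(lines))), sink)
```

Because each stage is a generator, records flow through one at a time, which loosely mirrors doing the parsing and transformation close to the source before anything is shipped to the destination.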
Task Development Platform
The platform supports visual workflow composition and code development for both batch and streaming jobs. Available task types include basic tasks (Shell, TransX, conditional flow), offline computing (MapReduce, Spark, Flink, offline SQL), and real‑time computing (Flink streaming, FlinkSQL). Open APIs are also provided for integration.
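A composed workflow is essentially a dependency graph of tasks. The sketch below, which assumes Python 3.9+ for the standard graphlib module, derives a valid run order from such a graph. The task names and the dictionary representation are hypothetical; the source does not describe how Qirin stores or schedules workflows.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(workflow):
    """Return task names in an order that respects dependencies.

    `workflow` maps each task name to the set of tasks it depends on.
    Raises ValueError if the dependencies form a cycle.
    """
    try:
        return list(TopologicalSorter(workflow).static_order())
    except CycleError as exc:
        # graphlib puts the detected cycle in the second element of args.
        raise ValueError(f"workflow has a cycle: {exc.args[1]}") from exc


# Hypothetical workflow mixing task types mentioned in the text.
EXAMPLE_WORKFLOW = {
    "shell_prepare": set(),
    "spark_aggregate": {"shell_prepare"},
    "sql_report": {"spark_aggregate"},
    "flink_export": {"spark_aggregate"},
}
```

Calling execution_order(EXAMPLE_WORKFLOW) always places shell_prepare first and spark_aggregate second; the relative order of the two downstream tasks is not constrained by the graph.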
Interactive Query (BigSQL)
BigSQL provides fast, cross‑data‑source SQL queries using engines such as JDBC, Spark, Hive, and Presto. Supported sources include Hive, MySQL, Druid, Elasticsearch, and HBase. It also supports query templates and result visualization.
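To make the idea of a query template concrete, here is a small sketch that binds a named parameter into a saved SQL statement. It uses Python's built-in sqlite3 purely as a stand-in engine; the template, table, and sample rows are invented for the example, and BigSQL's actual template syntax is not described in the source.

```python
import sqlite3

# Hypothetical template: ":status" is a named parameter bound at run time.
JOBS_BY_DEPT = (
    "SELECT dept, COUNT(*) AS n FROM jobs "
    "WHERE status = :status GROUP BY dept ORDER BY dept"
)


def run_template(conn, template, params):
    """Execute a parameterized query and return all rows."""
    return conn.execute(template, params).fetchall()


def demo_connection():
    """Build a small in-memory table to query (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (dept TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?)",
        [("search", "ok"), ("search", "failed"), ("ads", "ok"), ("search", "ok")],
    )
    return conn
```

Binding values through parameters rather than string concatenation lets the same template be reused safely with different inputs, for instance {"status": "ok"} in one run and {"status": "failed"} in the next.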
Self‑Service Analysis
Built on open‑source components, this lightweight visual analytics tool enables users to create dimensional models, drag‑and‑drop dashboards, and share reports via links.
Future Plans
The roadmap includes continuous enhancement of metadata, task development, and data collection capabilities; opening more APIs for data governance; expanding resource management services; delivering generic big‑data solutions for external projects; and integrating container‑cloud management for elastic, cloud‑native big‑data services.
In summary, Qirin 1.0 delivers a comprehensive, one‑stop big‑data operating system and development platform, covering metadata, ingestion, storage, computation, scheduling, and basic visualization, with ongoing investment to productize and service‑orient the platform for broader enterprise use.
360 Smart Cloud
Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.