Key Big Data Terminology: Offline vs Real-time Computing, Real-time vs Ad Hoc Queries, OLTP vs OLAP, Row vs Column Storage

This article explains fundamental big‑data concepts by comparing offline (batch) and real‑time (stream) computing, distinguishing real‑time queries from ad‑hoc queries, clarifying OLTP versus OLAP workloads, and outlining the differences between row‑based and column‑based storage architectures.

Big Data Technology Architecture

Aug 21, 2019

01 Offline Computing vs Real-time Computing

Offline computing (batch processing) works on static, bounded data that has already been collected and tolerates high latency. It suits periodic jobs such as reports; typical frameworks include MapReduce and Spark SQL. Real-time computing (stream processing) handles data continuously as it arrives and targets low latency. It is used for streaming ETL and monitoring, with frameworks such as Spark Streaming (micro-batch) and Flink (event-driven).
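The contrast between the three processing styles can be sketched in plain Python. This is only an illustration under simplifying assumptions: the function names and the `orders` data are hypothetical, and no real framework API (MapReduce, Spark, Flink) is used.

```python
from typing import Iterable, Iterator, List


def batch_total(records: List[float]) -> float:
    """Offline style: the full dataset exists before the job starts."""
    return sum(records)


def streaming_totals(events: Iterable[float]) -> Iterator[float]:
    """Event-driven style: update state and emit a result for every event."""
    running = 0.0
    for value in events:
        running += value
        yield running


def micro_batch_totals(events: Iterable[float], batch_size: int) -> Iterator[float]:
    """Micro-batch style: buffer a few events, then emit once per small batch."""
    running = 0.0
    buffer: List[float] = []
    for value in events:
        buffer.append(value)
        if len(buffer) == batch_size:
            running += sum(buffer)
            buffer.clear()
            yield running
    if buffer:  # flush the last, possibly incomplete, batch
        running += sum(buffer)
        yield running


# Hypothetical order amounts used only for this illustration.
orders = [10.0, 20.0, 5.0, 15.0, 50.0]

print(batch_total(orders))                  # one answer after all data is read
print(list(streaming_totals(orders)))       # one answer per event
print(list(micro_batch_totals(orders, 2)))  # one answer per batch of two events
```

All three produce the same final total; what differs is how early and how often intermediate results become available.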

02 Real-time Query vs Ad Hoc Query

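A real-time query (online query) returns fresh data almost immediately and is usually exposed to applications through an API; HBase, for example, provides low-latency access. An ad hoc query is an interactive, SQL-based query that a user writes on the spot against a data warehouse, using engines such as Hive, Impala, or Presto. It differs from a real-time query in that its shape is not known in advance, so the system cannot prepare the answer ahead of time and typically responds more slowly.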
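A rough way to picture the difference, assuming a toy in-memory setup: a dictionary stands in for a key-value store serving real-time lookups, and Python's built-in `sqlite3` stands in for a SQL engine answering ad hoc questions. The table, data, and function names are hypothetical; this is not how HBase, Hive, Impala, or Presto are actually called.

```python
import sqlite3

# Hypothetical order records: (user_id, item, amount).
rows = [
    ("u1", "book", 12.5),
    ("u2", "pen", 1.5),
    ("u1", "lamp", 30.0),
]

# Real-time style: the access pattern is fixed in advance (lookup by key),
# so the answer is kept ready and served with a single lookup.
latest_order_by_user = {}
for user_id, item, amount in rows:
    latest_order_by_user[user_id] = (item, amount)


def get_latest_order(user_id):
    """Serve a predefined question through a narrow, API-like function."""
    return latest_order_by_user.get(user_id)


# Ad hoc style: the question is written at query time as free-form SQL
# and is computed by scanning the stored data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id TEXT, item TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def ad_hoc(sql):
    """Run whatever SQL the analyst just typed."""
    return conn.execute(sql).fetchall()


print(get_latest_order("u1"))
print(ad_hoc(
    "SELECT user_id, SUM(amount) FROM orders GROUP BY user_id ORDER BY user_id"
))
```

The lookup function can only answer the question it was built for, while the SQL path can answer any question at the cost of computing it on demand.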

03 OLTP vs OLAP

OLTP (Online Transaction Processing) supports frequent, short transactional operations (insert, update, delete) with strong consistency, as in banking or order systems. OLAP (Online Analytical Processing) supports complex analytical queries for decision support; real-time OLAP stores such as Apache Druid and ClickHouse are common implementations.
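The two workload shapes can be sketched with Python's built-in `sqlite3`, used here only as a stand-in: a real deployment would typically run OLTP and OLAP on separate systems. The schema, account names, and the `transfer` helper are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE transfers (src TEXT, dst TEXT, amount INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """OLTP style: a short transaction that either fully applies or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute("INSERT INTO transfers VALUES (?, ?, ?)", (src, dst, amount))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


transfer("alice", "bob", 30)
transfer("alice", "bob", 20)
transfer("bob", "alice", 10)
rejected = transfer("bob", "alice", 500)  # exceeds bob's balance

balances = dict(conn.execute("SELECT id, balance FROM accounts"))

# OLAP style: a read-only aggregate over the accumulated history.
report = conn.execute(
    "SELECT src, COUNT(*), SUM(amount) FROM transfers GROUP BY src ORDER BY src"
).fetchall()

print(rejected, balances)
print(report)
```

The transactional path touches a handful of rows and must leave balances consistent even on failure; the analytical path changes nothing but reads across the whole history.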

04 Row‑based Storage vs Column‑based Storage

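Row-based storage (e.g., MySQL, Oracle) stores each complete record together. This favors write performance and OLTP workloads, but a query that needs only a few columns still has to read whole rows, which incurs higher read I/O. Column-based storage (e.g., the Parquet file format, the Arrow in-memory format) stores the values of each column together, which optimizes read-heavy OLAP queries through column pruning (reading only the columns a query references) and compression.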
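As a minimal sketch of the two layouts, the same small table can be held as a list of records (row-oriented) or as one list per column (column-oriented). The data and helper names are hypothetical, and run-length encoding is used here only as one simple example of why grouping a column's values helps compression; it is not a description of what Parquet or Arrow do internally.

```python
# Row layout: each record is kept together.
rows = [
    {"id": 1, "country": "US", "amount": 10},
    {"id": 2, "country": "US", "amount": 20},
    {"id": 3, "country": "DE", "amount": 5},
]

# Column layout: all values of one column are kept together.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def sum_amount_row_layout(table):
    """Every full record is visited, although only one field is needed."""
    return sum(record["amount"] for record in table)


def sum_amount_column_layout(table):
    """Column pruning: only the 'amount' column is read."""
    return sum(table["amount"])


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


print(sum_amount_row_layout(rows))
print(sum_amount_column_layout(columns))
print(run_length_encode(columns["country"]))
```

Appending a new record is a single operation in the row layout but touches every column list in the column layout, which mirrors the write-versus-read trade-off described above.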
