
Key Big Data Terminology: Offline vs Real-time Computing, Real-time vs Ad Hoc Queries, OLTP vs OLAP, Row vs Column Storage

This article explains fundamental big‑data concepts by comparing offline (batch) and real‑time (stream) computing, distinguishing real‑time queries from ad‑hoc queries, clarifying OLTP versus OLAP workloads, and outlining the differences between row‑based and column‑based storage architectures.

Big Data Technology Architecture

01 Offline Computing vs Real-time Computing

Offline computing (batch processing) runs over bounded, static data sets and tolerates high latency, which makes it suitable for periodic jobs such as reports; common frameworks include MapReduce and Spark SQL. Real-time computing (stream processing) handles continuous data streams with low latency and is used for tasks such as streaming ETL and monitoring; frameworks include Spark Streaming, which processes data in micro-batches, and Flink, which is event-driven.
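The difference between the three processing styles can be sketched in plain Python. This is only an illustration under simple assumptions: the `EVENTS` list and the function names are hypothetical, and none of this is the API of MapReduce, Spark, or Flink.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List

# Hypothetical click events: each item is the page that was viewed.
EVENTS: List[str] = ["home", "cart", "home", "pay", "home", "cart"]


def batch_count(events: List[str]) -> Dict[str, int]:
    """Offline style: the full, bounded data set exists before the job runs."""
    return dict(Counter(events))


def micro_batch_count(events: List[str], batch_size: int) -> Iterator[Dict[str, int]]:
    """Micro-batch style: collect a small chunk, then update state once per chunk."""
    state: Counter = Counter()
    for start in range(0, len(events), batch_size):
        state.update(events[start:start + batch_size])
        yield dict(state)  # one result per chunk


def event_driven_count(events: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Event-driven style: update state and emit a result for every single event."""
    state: Counter = Counter()
    for page in events:
        state[page] += 1
        yield dict(state)  # one result per event
```

All three arrive at the same final counts. What changes is how often a result becomes available: once at the end of the job, once per chunk, or once per event.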

02 Real-time Query vs Ad Hoc Query

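A real-time query (online query) returns fresh data with very low latency and is usually exposed to applications through an API; HBase, for example, provides low-latency access. An ad hoc query is an interactive, SQL-based query that a user writes on demand against a data warehouse, using engines such as Hive, Impala, or Presto. Unlike a real-time query, its shape is not known in advance, so the system cannot prepare for it ahead of time in the same way.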
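A rough way to picture the contrast is a keyed lookup next to a free-form SQL query. In the sketch below, a Python dict stands in for a low-latency key-value store and an in-memory SQLite database stands in for a warehouse engine. Both stand-ins, the `ORDERS` data, and the function names are assumptions made for illustration; this is not how HBase, Hive, Impala, or Presto are actually called.

```python
import sqlite3
from typing import Dict, List, Tuple

Order = Tuple[str, str, str, float]

# Hypothetical orders: (order_id, customer, category, amount).
ORDERS: List[Order] = [
    ("o1", "alice", "books", 30.0),
    ("o2", "bob", "games", 60.0),
    ("o3", "alice", "games", 45.0),
]


def build_key_index(rows: List[Order]) -> Dict[str, Order]:
    """Real-time query path: rows are keyed up front, so a lookup is one get."""
    return {row[0]: row for row in rows}


def build_warehouse(rows: List[Order]) -> sqlite3.Connection:
    """Ad hoc query path: load rows into a SQL table that accepts any query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id TEXT, customer TEXT, category TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def ad_hoc(conn: sqlite3.Connection, sql: str) -> list:
    """Run whatever SQL the analyst typed; the query shape is not known ahead of time."""
    return conn.execute(sql).fetchall()
```

The keyed lookup answers one fixed question ("give me order `o2`"), while `ad_hoc` accepts any question that can be expressed in SQL, for example total spend per customer.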

03 OLTP vs OLAP

OLTP (Online Transaction Processing) supports frequent, short transactional operations (insert, update, delete) with strong consistency and is typical of banking or order systems. OLAP (Online Analytical Processing) supports complex analytical queries for decision support; it is often implemented with real-time OLAP stores such as Apache Druid or ClickHouse.
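The two workload shapes can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; it is an assumption for illustration, not a recommendation for either workload, and the table and function names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def open_bank() -> sqlite3.Connection:
    """Create a tiny in-memory bank with two accounts and a transfer log."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("CREATE TABLE transfers (src TEXT, dst TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """OLTP style: a short transaction on a few rows that fully succeeds or fully fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # The CHECK constraint rejects an overdraft, which undoes the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute("INSERT INTO transfers VALUES (?, ?, ?)", (src, dst, amount))
        return True
    except sqlite3.IntegrityError:
        return False


def total_sent_by_account(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """OLAP style: a read-only aggregate that scans the whole transfer history."""
    return conn.execute(
        "SELECT src, SUM(amount) FROM transfers GROUP BY src ORDER BY src"
    ).fetchall()
```

`transfer` touches a handful of rows and must never leave money half-moved, whereas `total_sent_by_account` changes nothing but reads every row in the table.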

04 Row‑based Storage vs Column‑based Storage

Row-based storage (e.g., MySQL, Oracle) stores each complete record together, which favors write performance and OLTP workloads but incurs higher read I/O when a query needs only a few columns. Column-based storage (e.g., the Parquet file format or the Arrow in-memory format) stores each column separately, which optimizes read-heavy OLAP queries through column pruning and compression.
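Column pruning and compression can be illustrated with ordinary Python lists. This is a toy model under stated assumptions: the `ROWS` data and function names are hypothetical, "values touched" is only a stand-in for read I/O, and run-length encoding is just one simple compression scheme, not a description of how Parquet or Arrow work internally.

```python
from typing import Dict, List, Tuple

COLUMN_NAMES = ("order_id", "customer", "category", "amount")

# Row layout: each tuple is one complete record stored together.
ROWS: List[Tuple[int, str, str, int]] = [
    (1, "alice", "books", 30),
    (2, "bob", "books", 60),
    (3, "carol", "books", 45),
    (4, "dave", "games", 15),
]


def to_columns(rows: List[tuple], names: Tuple[str, ...]) -> Dict[str, list]:
    """Pivot row tuples into one list per column (a columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def sum_amount_row_store(rows: List[tuple]) -> Tuple[int, int]:
    """Row layout: every full record is read even though only one field is needed."""
    total = 0
    values_touched = 0
    for row in rows:
        values_touched += len(row)  # the whole record comes along
        total += row[3]
    return total, values_touched


def sum_amount_column_store(columns: Dict[str, list]) -> Tuple[int, int]:
    """Column layout: only the 'amount' column is read (column pruning)."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def run_length_encode(values: list) -> List[Tuple[object, int]]:
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    encoded: List[Tuple[object, int]] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1] = (value, encoded[-1][1] + 1)
        else:
            encoded.append((value, 1))
    return encoded
```

Both layouts produce the same sum, but the row layout touches 16 values and the column layout touches 4. Because the values of one column sit next to each other, the `category` column also collapses to two run-length pairs.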

Written by

Big Data Technology Architecture
