Databases 11 min read

Row Store vs Column Store in openGauss: Comparison, Advantages, Disadvantages, and Experimental Evaluation

This article explains the fundamental differences between row‑based and column‑based storage in databases, compares their read/write performance, presents experimental results on openGauss showing space and speed trade‑offs, and offers practical guidance on when to choose each storage model.

Architecture Digest

Jul 19, 2022

Row Store vs Column Store in openGauss: Comparison, Advantages, Disadvantages, and Experimental Evaluation

Many beginners learn databases using relational tables, which store data in a row‑oriented format (row store) where each record’s fields are placed together on disk.

Some databases also support column‑oriented storage (column store), which stores each column’s values contiguously on disk.

Storage Comparison

The differences are illustrated in the figure below.

In a row store, the attribute values of a record are stored together, followed by the next record’s values. In a column store, all values of a single attribute are stored consecutively, giving each column its own contiguous space.

Consider which model is more suitable for queries and which for updates.

Write‑Performance Comparison

1) Row store writes are performed in a single operation. Because the write is handled by the operating system’s file system, success or failure is guaranteed and data integrity is clear.

2) Column store must split a row into individual columns for storage, resulting in many more write operations and additional disk‑head movements, so write latency is higher. Consequently, row store has a clear advantage for writes.

3) Data modification is essentially a write operation, so row store also outperforms column store for updates.

Read‑Performance Comparison

1) Row store typically reads an entire row; when only a few columns are needed, extra columns are read and later filtered in memory, adding overhead.

2) Column store reads only the required column(s) or a contiguous segment, eliminating redundancy and making it especially efficient for projection queries.

3) Because each column contains homogeneous data types, parsing is straightforward and CPU‑efficient, unlike row store where mixed types require frequent type conversion.

4) Uniform column data enables effective compression; different columns can use different algorithms, improving I/O performance.

Advantages & Disadvantages

The choice of storage type is the first step in table design and should be driven by business requirements. Both models have pros and cons, and the appropriate one depends on the workload.

Row Store

Column Store

Advantages

Data kept together; INSERT/UPDATE are simple.

Only the needed columns are read during queries.

Projection is highly efficient.

Any column can serve as an index.

Disadvantages

When selecting a subset of columns, all data is still read.

Selected columns must be re‑assembled after reading.

INSERT/UPDATE are more complex.

Point queries are not well‑suited.

Applicable Scenarios

Point queries (few records, index‑based simple queries).

Workloads with frequent INSERT/UPDATE/DELETE.

Analytical queries (OLAP, data‑warehouse) where few columns are aggregated over many rows.

Ad‑hoc queries where filter conditions are uncertain and full‑table scans are needed.

Experimental Evaluation

openGauss supports hybrid storage; the storage orientation can be specified at table creation. The following experiment was performed on a Huawei Cloud server with openGauss Enterprise 3.0.0 and openEuler 20.03.

Two tables were created: custom1 (row store) and custom2 (column store). 500,000 rows were inserted into each.

# create table custom1 (id integer, name varchar2(20));
CREATE TABLE
# create table custom2 (id integer, name varchar2(20)) with (orientation = column);
CREATE TABLE
# insert into custom1 select n,'testtt'||n from generate_series(1,500000) n;
INSERT 0 500000
# insert into custom2 select * from custom1;
INSERT 0 500000

Storage size comparison:

# \d+
 List of relations
 Schema | Name    | Type  | Owner | Size   | Storage
--------+---------+-------+-------+--------+--------------------------------------
 public | custom1 | table | omm   | 24 MB  | {orientation=row,compression=no}
 public | custom2 | table | omm   | 3104 kB| {orientation=column,compression=low}

The column‑store table occupies roughly one‑seventh of the row‑store table’s space.

Insert latency for a single new record:

# explain analyze insert into custom1 values(1,'zhang3');
QUERY PLAN
... Total runtime: 0.135 ms
# explain analyze insert into custom2 values(1,'zhang3');
QUERY PLAN
... Total runtime: 0.207 ms

Finally, the test tables were dropped:

# drop table custom1;
DROP TABLE
# drop table custom2;
DROP TABLE

Selection Recommendations

Frequent updates → prefer row store.

Frequent small inserts → row store; bulk inserts → column store.

Wide tables with few queried columns → column store; narrow tables with most columns queried → row store.

If queries touch only a small fraction of columns (<50% of total), column store is advantageous.

Column store offers higher compression but consumes more CPU.

Precautions

Column stores have many restrictions: they do not support arrays, generated columns, global temporary tables, or foreign keys, and the supported data types are fewer than in row stores. Always consult the database documentation before using column‑oriented tables.

Interested readers can further test scenarios such as creating very wide tables or performing extensive UPDATE operations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

SQL performance comparison Database Storage Column Store openGauss Row Store

Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.