Databases 6 min read

Performance Impact of Using TIMESTAMP vs VARCHAR for Date‑Time Data in Large Tables

Replacing a VARCHAR column that stores timestamp strings with a native TIMESTAMP type in a massive sorting table reduced query time from over 20 minutes to 39 seconds, demonstrating a 30‑plus‑fold speedup due to smaller storage, faster CPU comparisons, and reduced I/O.

Qunar Tech Salon

May 1, 2015

Performance Impact of Using TIMESTAMP vs VARCHAR for Date‑Time Data in Large Tables

A large table used for product ranking grew from millions to tens of millions of rows, causing the basic sorting query to expand from under ten minutes to more than forty minutes. After analyzing the schema, it was found that the timestamp column was stored as a VARCHAR string despite having an index.

By altering the column to a native TIMESTAMP type, the query time dropped dramatically from over twenty minutes (≈1200 s) to just 39 seconds.

According to the ISO standard, a timestamp string such as '2013-04-07 12:33:20.000+08' occupies 26 bytes. Databases can store timestamps as integers or floating‑point numbers: a 4‑byte integer for a timezone‑less, millisecond‑less timestamp, or an 8‑byte value for a full‑precision timestamp with timezone. In contrast, a VARCHAR representation requires 4 bytes for length plus the actual data length, resulting in roughly 23–30 bytes per value. This means VARCHAR storage is 5–7 times larger than native timestamp storage.

From a CPU perspective, comparing an 8‑byte integer involves a few memory accesses and a single register comparison. Comparing a 23‑ or 30‑byte string requires multiple 4‑byte chunks, effectively doubling the number of memory accesses and leading to 12–16 times slower comparisons.

On the I/O side, ten million rows of 8‑byte timestamps occupy about 80 MB, while the same number of 23‑byte or 30‑byte strings occupy roughly 230 MB and 300 MB respectively. This translates to three to four times more disk blocks (≈160 k vs 460 k‑600 k blocks of 512 bytes), further degrading performance.

Summing the storage, CPU, and I/O factors yields an expected speedup of 36–64×, which aligns closely with the observed reduction from 20 minutes to 39 seconds (≈30×). The conclusion is clear: whenever a native data type exists, it should be used instead of storing equivalent data as strings.

Author: He Weiping, Search and Database Researcher at Qunar, contributor to the Chinese PostgreSQL manual and Programming Perl translation.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

performance Optimization varchar timestamp

Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.