Choosing Faster JSON Libraries in Python: ujson, rapidjson, and orjson Comparison
The article investigates why Python's built‑in json.dumps can be slow on large data structures, compares the performance of ujson, python‑rapidjson, simplejson, and the Rust‑based orjson, provides benchmark code, installation commands, and discusses a bug in ujson's indent handling.
While serializing a large list with Python's built-in json.dumps, the author noticed slow performance and looked into faster alternatives: ujson (UltraJSON), python-rapidjson, and simplejson. The author later came across orjson, which is written in Rust.
All four libraries can be installed via pip (pip install python-rapidjson, pip install simplejson, pip install ujson, pip install orjson). The author wrote a benchmark script (test.py) that generates a list of dictionaries and measures the time taken by json.dumps, ujson.dumps, rapidjson.dumps, and later orjson.dumps.
<code># test.py
# Usage: python test.py <num_items> <lib>
import string
import sys
from time import time

num = int(sys.argv[1])
lib = sys.argv[2]

# Build a list of `num` dictionaries, each mapping every ASCII letter to itself.
items = []
for i in range(num):
    items.append({c: c for c in string.ascii_letters})

# Note: the library import happens inside the timed region.
start = time()
if lib == 'ujson':
    import ujson
    ujson.dumps(items)
elif lib == 'rapidjson':
    import rapidjson
    rapidjson.dumps(items)
elif lib == 'orjson':
    import orjson
    orjson.dumps(items)
else:
    # Any other value falls back to the standard library.
    import json
    json.dumps(items)
print(time() - start)
</code>
Running the script with different sizes (1000, 10000, 100000, 1000000) and libraries produced timing results consistent with the benchmarks reported in “Benchmark of Python JSON libraries”. The author also included screenshots of those benchmark tables.
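The script above times a single run and includes the library import in the measurement. As a rough sketch of an alternative, the harness below imports each library before timing and takes the best of several runs with timeit. The function names (build_items, load_dumps, bench) are hypothetical and not from the article; libraries that are not installed are skipped, and no particular timings are implied.

```python
import importlib
import string
import timeit


def build_items(num):
    """Return `num` dictionaries mapping each ASCII letter to itself."""
    return [{c: c for c in string.ascii_letters} for _ in range(num)]


def load_dumps(candidates=("json", "ujson", "rapidjson", "orjson")):
    """Import each candidate module and collect its dumps(); skip missing ones."""
    found = {}
    for name in candidates:
        try:
            found[name] = importlib.import_module(name).dumps
        except ImportError:
            continue  # library not installed in this environment
    return found


def bench(num=1000, repeat=3):
    """Best-of-`repeat` wall time in seconds for one dumps() call per library."""
    items = build_items(num)
    results = {}
    for name, dumps in load_dumps().items():
        # Imports are already done, so only serialization is timed.
        timings = timeit.repeat(lambda: dumps(items), number=1, repeat=repeat)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    # Print the installed libraries from fastest to slowest.
    for name, seconds in sorted(bench(10000).items(), key=lambda kv: kv[1]):
        print(f"{name:10s} {seconds:.6f}")
```

Taking the minimum over a few repeats reduces noise from other processes, which matters most for the smaller list sizes.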
The tests confirmed that orjson, which is implemented in Rust, outperforms ujson and rapidjson, making it the preferred choice for high‑performance JSON serialization in Python.
The article also documents a bug in ujson 3.0.0 and 3.1.0: the indent parameter is not honored for values greater than 1. Example REPL sessions show the issue, and the author suggests pinning ujson to version 2.0.3 until the bug is fixed.
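One way to check whether an installed version is affected is to measure the indentation a dumps function actually produces. The sketch below is illustrative only: indent_width and honors_indent are hypothetical helper names, and it assumes the function under test accepts an indent keyword the way json.dumps and ujson.dumps do. It does not reproduce the article's REPL sessions.

```python
import json


def indent_width(dumps, indent):
    """Return the number of leading spaces on the first nested output line."""
    text = dumps({"a": 1}, indent=indent)
    if isinstance(text, bytes):
        text = text.decode()
    lines = text.splitlines()
    if len(lines) < 2:
        return 0  # everything on one line: no indentation was applied
    nested = lines[1]
    return len(nested) - len(nested.lstrip(" "))


def honors_indent(dumps, indent=4):
    """True if the nested line is indented by exactly `indent` spaces."""
    return indent_width(dumps, indent) == indent


if __name__ == "__main__":
    print("json honors indent=4:", honors_indent(json.dumps))
    try:
        import ujson
        print("ujson honors indent=4:", honors_indent(ujson.dumps))
    except ImportError:
        print("ujson is not installed")
```

A check like this could run in a test suite so that an upgrade to an affected ujson release is caught before formatted output changes.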
Finally, the article notes that orjson's dumps() returns bytes and does not accept the indent argument. To obtain a formatted string, pass an option flag and decode the resulting bytes, e.g., json_str = orjson.dumps(record, option=orjson.OPT_INDENT_2).decode().
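The following minimal sketch wraps that call in a helper. The name to_pretty_json and the sample record are hypothetical, and the fallback to the standard library when orjson is not installed is an addition for illustration, not something the article describes.

```python
import json

try:
    import orjson  # third-party; install with `pip install orjson`
except ImportError:
    orjson = None


def to_pretty_json(record):
    """Return `record` as a 2-space-indented JSON string (str, not bytes)."""
    if orjson is not None:
        # orjson.dumps returns bytes and takes option flags instead of indent=
        return orjson.dumps(record, option=orjson.OPT_INDENT_2).decode()
    # Fallback for environments without orjson
    return json.dumps(record, indent=2)


record = {"name": "example", "tags": ["a", "b"]}
json_str = to_pretty_json(record)
print(json_str)
```

Keeping the decode step inside one helper means callers always receive a str, whichever library ends up doing the serialization.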