
How Erasure Coding Cuts Storage Costs in Ozone: A Deep Dive

This article explains how Erasure Coding (EC) improves data reliability and dramatically reduces storage overhead in Ozone by leveraging hot‑cold data characteristics, intelligent tiering, dynamic EC ratios, and repair throttling, while also discussing performance trade‑offs and limitations.

360 Zhihui Cloud Developer

01 Introduction

Erasure Coding (EC) is an encoding technique used in RAID and in communication systems to ensure data reliability: EC-encoded data can still be decoded when a small portion of it is damaged. Ozone originally guaranteed reliability with multiple replicas, typically three copies, which triples the raw storage cost. To lower this cost, Ozone adopts EC.

1.2 Data Hot‑Cold Characteristics

Data is classified by access frequency: hot data is accessed frequently and stored on high-performance media, while cold data is rarely accessed and stored on cheaper media. IDC reports that in production environments hot data accounts for about 10% of the total, warm data for about 30%, and cold data for about 60%.

02 EC in Ozone

2.1 Advantages of EC

With three-replica storage, a 2 MB file consumes 6 MB of raw capacity. With EC in a 3-2-512 KB configuration (three data blocks and two parity blocks per stripe, 512 KB each), the same file occupies 4 MB, roughly 33% less than three replicas. A 4-2-512 KB configuration reduces the occupied space to 3 MB, a saving of about 50%. The higher the ratio of data blocks to parity blocks, the greater the storage savings.
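The figures above follow from a simple accounting rule, written out here as an assumption inferred from the article's numbers rather than as Ozone's documented on-disk layout: data is stored unpadded, and every stripe adds p parity blocks of full size c. For a file of size F stored with d data blocks per stripe:

```latex
S_{\mathrm{EC}} = F + p \, c \left\lceil \frac{F}{d \, c} \right\rceil ,
\qquad
S_{\mathrm{replica}} = 3F
```

With F = 2 MB and c = 0.5 MB, the 3-2 configuration needs two stripes, so S = 2 + 2 × 0.5 × 2 = 4 MB. The 4-2 configuration needs one stripe, so S = 2 + 2 × 0.5 × 1 = 3 MB. Three replicas give 6 MB.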

2.2 EC in Object Storage Products

2.2.1 Intelligent Tiering

Hot files, critical for business, are stored with three replicas for fast access. Warm or cold files, which tolerate higher latency, are stored using EC encoding, reducing cost. Based on file lifecycle, Ozone offers an intelligent tiered storage product that automatically migrates data from replica‑based storage to EC‑based storage as access frequency declines.

2.2.2 Archive Storage

When data becomes cold, Ozone provides an archive storage tier that uses high EC ratios. Archive storage is suitable for long‑term, infrequently accessed data, but incurs higher read latency and requires a thawing process that temporarily creates a three‑replica copy for reading.
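The tiering and archive behaviour described in 2.2 can be pictured as a small policy function. This is only a sketch: the day thresholds, the layout labels, and the function names are hypothetical and chosen for illustration, and the article does not say how Ozone measures access frequency or which ratios each tier uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    # Hypothetical thresholds; real values would come from lifecycle rules.
    warm_after_days: int = 30
    cold_after_days: int = 180


def choose_layout(days_since_last_access: int, policy: TierPolicy = TierPolicy()) -> str:
    """Pick a storage layout from how recently the object was accessed."""
    if days_since_last_access < policy.warm_after_days:
        return "3-replica"          # hot: fast access, highest cost
    if days_since_last_access < policy.cold_after_days:
        return "EC 3-2"             # warm: EC, lower cost, higher latency
    return "EC 6-3 (archive)"       # cold: high EC ratio, must be thawed to read


def needs_thaw(layout: str) -> bool:
    """Archive objects are first copied to a temporary 3-replica copy before reads."""
    return layout.endswith("(archive)")


for age in (3, 60, 400):
    layout = choose_layout(age)
    print(age, layout, "thaw before read" if needs_thaw(layout) else "direct read")
```

Keeping the thresholds in a separate policy object makes it easy to tune when data moves between tiers without touching the decision logic.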

2.3 Dynamic EC Ratios

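The EC ratio is not the only factor in storage savings: the block size (the third field of the configuration) also influences the final occupied space. For a 512 KB file with a 3-2 EC ratio, block sizes of 256 KB, 512 KB, and 1 MB result in total storage consumption of 1 MB, 1.5 MB, and 2.5 MB respectively. These figures are consistent with parity blocks being written at full block size even when the file does not fill a stripe, so an oversized block wastes space on small files. Selecting, per file, the EC configuration that yields the smallest space usage maximizes storage efficiency.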
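A minimal sketch of that selection follows. It assumes the accounting implied by the article's numbers (unpadded data plus full-size parity blocks for every stripe), which may differ from what Ozone actually writes to disk. The function names `ec_footprint` and `best_config` are illustrative, not Ozone APIs.

```python
import math

KB = 1024
MB = 1024 * KB


def ec_footprint(file_size: int, data: int, parity: int, block_size: int) -> int:
    """Bytes occupied under the assumed model: data as-is, parity at full block size."""
    stripes = math.ceil(file_size / (data * block_size))
    return file_size + stripes * parity * block_size


def best_config(file_size: int, candidates):
    """Return the (data, parity, block_size) tuple with the smallest footprint."""
    return min(candidates, key=lambda cfg: ec_footprint(file_size, *cfg))


candidates = [(3, 2, 256 * KB), (3, 2, 512 * KB), (3, 2, 1 * MB)]
file_size = 512 * KB

for data, parity, block in candidates:
    used = ec_footprint(file_size, data, parity, block)
    print(f"{data}-{parity}-{block // KB} KB: {used / MB} MB")

print("smallest:", best_config(file_size, candidates))
```

Under this model the three candidates come out at 1.0 MB, 1.5 MB, and 2.5 MB, matching the figures in the text, and the 256 KB block size is selected for the 512 KB file.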

2.4 Limitations of EC

Performance impact: with a 6-3-512 KB EC configuration, each stripe of a file is written as nine blocks (six data, three parity) distributed across different nodes. A read needs responses from at least six DataNodes, whereas three-replica mode can serve a read from a single replica, so EC increases the request load on the cluster.

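Repair cost: damaged blocks must be reconstructed from the surviving blocks, which is CPU-intensive. Higher EC ratios also spread each stripe over more nodes, which increases the probability that at least one block is lost.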
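To see why wider stripes need repair more often, here is a rough estimate. It assumes, purely for illustration, that each block is lost independently with the same probability `q`. Real failures are correlated, so treat the numbers as a trend rather than a prediction.

```python
def p_any_block_lost(total_blocks: int, q: float) -> float:
    """Probability that at least one of `total_blocks` blocks is lost,
    assuming independent losses with per-block probability q."""
    return 1 - (1 - q) ** total_blocks


q = 0.01  # hypothetical per-block loss probability
for name, data, parity in (("3-2", 3, 2), ("4-2", 4, 2), ("6-3", 6, 3)):
    total = data + parity
    print(f"EC {name}: {total} blocks, P(repair needed) = {p_any_block_lost(total, q):.4f}")
```

The probability grows with the number of blocks in a stripe, so a 6-3 stripe triggers repair more often than a 3-2 stripe even though it tolerates more simultaneous losses.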

03 EC Repair Throttling

Higher EC ratios raise the likelihood of block loss. A lost block is repaired through one of two mechanisms: read-triggered asynchronous repair, which starts when a read hits a missing replica, and report-driven repair, in which DataNodes periodically report to the Storage Container Manager (SCM) and SCM initiates repair if a replica is missing.

Repair tasks consume CPU and, without limits, can overload a DataNode. Ozone introduces a token‑bucket based throttling mechanism that caps the number of concurrent repair tasks per DataNode. Tasks exceeding the limit are queued and may be dropped if they wait beyond a maximum timeout.
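The behaviour described above can be sketched as a pool of tokens with a waiting queue. This is not Ozone's implementation: the class and method names are hypothetical, and the sketch assumes a token is returned when a repair finishes, since the article does not say how tokens are replenished. Time is passed in explicitly so the example stays deterministic.

```python
from collections import deque


class RepairThrottle:
    """Caps concurrent repair tasks on one DataNode; excess tasks wait in a queue."""

    def __init__(self, max_concurrent: int, max_wait_seconds: float):
        self.tokens = max_concurrent      # one token per repair slot
        self.max_wait = max_wait_seconds  # queued tasks older than this are dropped
        self.queue = deque()              # (task_id, enqueue_time), FIFO
        self.running = set()
        self.dropped = []

    def submit(self, task_id, now: float) -> None:
        """Start the task if a token is free, otherwise queue it."""
        if self.tokens > 0:
            self.tokens -= 1
            self.running.add(task_id)
        else:
            self.queue.append((task_id, now))

    def complete(self, task_id, now: float) -> None:
        """Return the task's token and hand it to the next eligible queued task."""
        self.running.remove(task_id)
        self.tokens += 1
        while self.queue and self.tokens > 0:
            queued_id, enqueued = self.queue.popleft()
            if now - enqueued > self.max_wait:
                self.dropped.append(queued_id)  # waited too long: drop it
                continue
            self.tokens -= 1
            self.running.add(queued_id)


throttle = RepairThrottle(max_concurrent=2, max_wait_seconds=10.0)
throttle.submit("a", now=0)
throttle.submit("b", now=0)
throttle.submit("c", now=1)   # no token left: queued
throttle.submit("d", now=5)   # queued behind "c"
throttle.complete("a", now=12)
print(sorted(throttle.running), throttle.dropped)
```

When "a" finishes at time 12, "c" has waited 11 seconds and is dropped, while "d" has waited 7 seconds and takes the freed token. A dropped task is not lost for good in the scheme the article describes, since a later read or a periodic report can trigger the repair again.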


