Optimizing I/O for Data-Intensive Analytics in Cloud-Native Environments: Insights from Uber Presto
This whitepaper examines the growing industry trend of migrating data-intensive analytics workloads from on-premises to cloud-native environments. It argues that cloud storage introduces a distinct cost model, one that charges per API call as well as per byte, and therefore demands finer-grained performance optimization than on-premises systems, and it presents case-study findings from Uber's Presto deployment that expose fragmented I/O patterns and their financial impact.
Through an empirical study of Uber’s production Presto workload, the authors observed highly fragmented data‑access patterns—over 50 % of reads are smaller than 10 KB and more than 90 % are under 1 MB—highlighting that traditional I/O optimizations that ignore storage‑API call costs can lead to excessive expenses in the cloud.
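To see why per-call pricing makes fragmented reads expensive, consider a simple cost model in which each GET request carries a fixed charge. The sketch below uses an illustrative price roughly modeled on S3-style GET pricing; the exact rate and read sizes are assumptions, not figures from the study, but the 10 KB read size mirrors the fragmentation the authors observed.

```python
# Illustrative request-charge model for cloud object-storage reads.
# The price below is an assumption in the style of S3 GET pricing,
# not a quoted rate from any provider or from the whitepaper.
PRICE_PER_1K_REQUESTS = 0.0004  # USD per 1,000 GET requests (assumed)

def request_cost(total_bytes: int, read_size: int) -> float:
    """Request-charge portion of scanning total_bytes in read_size chunks."""
    n_requests = -(-total_bytes // read_size)  # ceiling division
    return n_requests / 1000 * PRICE_PER_1K_REQUESTS

GB = 1 << 30
fragmented = request_cost(GB, 10 * 1024)       # 10 KB reads, as in the study
coalesced = request_cost(GB, 8 * 1024 * 1024)  # 8 MB reads for comparison

print(f"10 KB reads: ${fragmented:.4f} in request charges per GB scanned")
print(f" 8 MB reads: ${coalesced:.6f} in request charges per GB scanned")
```

Under this model, scanning a gigabyte in 10 KB chunks incurs roughly three orders of magnitude more request charges than scanning it in 8 MB chunks, which is why optimizations that count only bytes moved, not calls made, break down in the cloud.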
Drawing on the case study, the whitepaper distills a set of I/O-optimization principles and tactics, illustrating how to redesign I/O for cloud environments to improve both cost-effectiveness and performance.
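One tactic commonly used in such redesigns is read coalescing: merging nearby byte ranges into a single range request when the gap between them is small, since reading a few wasted bytes is often cheaper than paying another per-request charge. The sketch below is a generic illustration of that idea, not the paper's specific algorithm; the gap threshold is an assumption.

```python
def coalesce_ranges(ranges, max_gap=1 << 20):
    """Merge byte ranges separated by at most max_gap into single requests.

    Rationale: under per-call pricing, fetching a small unused gap is
    often cheaper than issuing a separate GET. The 1 MB default
    threshold is an illustrative assumption, not a value from the paper.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

# Two small reads 50 bytes apart collapse into one request;
# a distant read stays separate.
print(coalesce_ranges([(0, 100), (150, 300), (5_000_000, 5_000_100)],
                      max_gap=1000))
```

A scheduler built on this idea trades a bounded amount of extra bytes transferred for a large reduction in request count, which is the favorable direction under cloud storage's cost model.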
Readers are offered a new perspective on system design for cloud‑computing platforms, enabling them to devise efficient I/O strategies tailored to data‑intensive applications in cloud‑native settings.
Source: DataFunSummit, the official account of the DataFun community, which shares big-data and AI industry summit news and speaker talks.