Migrating Data‑Intensive Analytics to Cloud‑Native Environments: Cost‑Aware I/O Optimization Insights from Uber Presto
This whitepaper examines the industry trend of moving data‑intensive analytics workloads to cloud‑native platforms. It argues that the cost model of cloud storage demands finer‑grained I/O optimization, and illustrates the point with an empirical case study of Uber’s production Presto environment and its fragmented access patterns.
The whitepaper starts from the migration of data‑intensive analytics applications from on‑premises environments to cloud‑native platforms, and emphasizes that cloud storage introduces a distinct cost model that calls for more detailed performance‑optimization strategies.
Using empirical observations from Uber’s production Presto deployment, the study shows that data‑access patterns are highly fragmented: over 50% of accesses are smaller than 10 KB, and more than 90% are under 1 MB. With reads this small, the number of storage API calls, and therefore their financial impact in the cloud, becomes significant.
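To make the cost effect of fragmentation concrete, the following minimal Python sketch compares the number of storage requests, and the resulting request charges, for reading the same volume of data in 10 KB pieces versus 1 MB pieces. The price constant and the 1 TiB volume are hypothetical values chosen for illustration; they do not come from the whitepaper or from any specific provider's price list.

```python
def requests_needed(total_bytes: int, bytes_per_request: int) -> int:
    """Number of requests needed to read total_bytes (rounded up)."""
    return -(-total_bytes // bytes_per_request)


def request_cost(num_requests: int, price_per_1000: float) -> float:
    """Request charges when the store bills a fixed price per 1000 calls."""
    return num_requests / 1000 * price_per_1000


# Hypothetical inputs, for illustration only.
PRICE_PER_1000_GETS = 0.0004   # assumed price per 1000 read requests
TOTAL_BYTES = 1024 ** 4        # 1 TiB scanned

small_reads = requests_needed(TOTAL_BYTES, 10 * 1024)     # 10 KB per request
large_reads = requests_needed(TOTAL_BYTES, 1024 * 1024)   # 1 MB per request

print("10 KB reads:", small_reads, "requests,",
      request_cost(small_reads, PRICE_PER_1000_GETS), "in request charges")
print("1 MB reads: ", large_reads, "requests,",
      request_cost(large_reads, PRICE_PER_1000_GETS), "in request charges")
```

Under these assumptions the request count, and with it the request charge, differs by roughly two orders of magnitude for the same amount of data read, which is why access size matters more under a per-request cost model than it does on premises.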
The authors present a case‑study‑driven I/O optimization framework, outlining logical steps and techniques to redesign I/O paths for cloud environments, thereby improving cost‑effectiveness and performance for data‑intensive workloads.
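The summary does not list the individual techniques in the framework. As one illustration of what redesigning an I/O path for a per-request cost model can look like, the sketch below coalesces nearby byte ranges into fewer, larger reads. This is a generic technique and an assumption on our part, not necessarily the authors' method; the function name `coalesce_ranges` and the `max_gap` parameter are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def coalesce_ranges(ranges: List[Range], max_gap: int) -> List[Range]:
    """Merge byte ranges whose gap is at most max_gap bytes.

    Fewer, larger ranges mean fewer storage API calls, at the price of
    reading up to max_gap unneeded bytes between merged ranges.
    """
    if not ranges:
        return []
    ordered = sorted(ranges)
    merged = [ordered[0]]
    for offset, length in ordered[1:]:
        last_offset, last_length = merged[-1]
        last_end = last_offset + last_length
        if offset - last_end <= max_gap:
            # Close enough (or overlapping): extend the previous range.
            new_end = max(last_end, offset + length)
            merged[-1] = (last_offset, new_end - last_offset)
        else:
            merged.append((offset, length))
    return merged


# Three small reads; the first two are 50 bytes apart.
reads = [(0, 100), (150, 50), (10_000, 200)]
print(coalesce_ranges(reads, max_gap=64))
```

The choice of `max_gap` is a trade-off: a larger value saves more requests but transfers more unused bytes, so a sensible value depends on the provider's request and transfer pricing.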
The whitepaper offers readers actionable insights for designing efficient I/O strategies tailored to cloud‑native systems, and is positioned as a starting point for further research and implementation.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.