
From Cost‑Efficiency to Value‑Centric: The Evolution of Data Systems in the Data+AI Era

The article reviews the rapid advances in generative AI and big‑data technologies, traces the historical development of data infrastructure, and argues that modern data systems are shifting from a cost‑efficiency focus to a value‑centric paradigm driven by multimodal, unstructured data, vector search, and machine‑oriented services.

AntData

Over the past two years, generative AI has achieved major breakthroughs, combining massive data with compute power to drive a wave of product innovation. At the same time, data technology is entering a new stage that brings both unprecedented challenges and unprecedented opportunities.

Looking back at the evolution of data technologies: in the 1990s, the rise of the Internet and efficient database storage enabled digitalization for SMEs and e‑commerce, which in turn drove the development of high‑performance, high‑availability database systems.

From 2003 to 2006, Google's papers on the Google File System, MapReduce, and Bigtable launched the industrial big‑data era, while the spread of the mobile internet, smartphones, and apps enriched user profiles and spurred personalized services.

In 2017, the paper "Attention Is All You Need" laid the foundation for generative AI, making intelligent services built around large models possible and ushering in a new era of data‑intelligence convergence.

Data systems are shifting from a cost‑efficiency focus to a value‑centric one. In the big‑data era, the technology focused on infrastructure metrics such as latency, throughput, and resource cost. In the data‑intelligence era, the scale, diversity, and quality of data assets become the key determinants of how well intelligent applications perform.

Data production is expanding beyond traditional web crawling to include fine‑grained records from wearables, smart appliances, and IoT devices, turning all observed information into valuable digital assets.

High‑quality, professionally annotated data is increasingly critical for large‑model training, making data labeling and synthesis essential for maintaining data quality.

Data forms are shifting from structured to unstructured; IDC predicts that by 2027 unstructured data will account for 86.8% of total data (≈250 ZB). Processing such multimodal data (text, image, audio, video) introduces new challenges in cleaning, quality assessment, mining, and auditing.

Data services are extending from serving human users to serving machines and agents: they must let agents exchange semantic representations, improve multimedia encoding and decoding, and meet stringent latency and throughput requirements in immersive interactions.

Hybrid scalar‑and‑vector retrieval is a critical technology for new search and interaction scenarios. It requires efficient storage and indexing, along with real‑time data streams that capture users' immediate interests.
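To make the idea of scalar‑and‑vector retrieval concrete, here is a minimal sketch, not a description of any particular product. It assumes a small in‑memory collection, a pre‑filter on a scalar field, and brute‑force cosine ranking; the names `hybrid_search`, `ITEMS`, and the `category` field are hypothetical and chosen only for illustration. A production system would typically replace the linear scan with an approximate nearest‑neighbor index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(items, query_vec, scalar_filter, top_k=2):
    """Pre-filter on scalar fields, then rank the survivors by vector similarity."""
    candidates = [item for item in items if scalar_filter(item)]
    ranked = sorted(
        candidates,
        key=lambda item: cosine(item["vec"], query_vec),
        reverse=True,
    )
    return [item["id"] for item in ranked[:top_k]]


# Hypothetical toy collection: each item has scalar metadata and an embedding.
ITEMS = [
    {"id": "a", "category": "video", "vec": [1.0, 0.0]},
    {"id": "b", "category": "video", "vec": [0.6, 0.8]},
    {"id": "c", "category": "audio", "vec": [1.0, 0.1]},
    {"id": "d", "category": "video", "vec": [0.0, 1.0]},
]

result = hybrid_search(ITEMS, [1.0, 0.0], lambda item: item["category"] == "video")
print(result)
```

Filtering before ranking keeps item `c` out of the results even though its embedding is close to the query. Whether to filter before or after the vector search is a design choice that depends on how selective the scalar condition is.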

The data‑intelligence era also calls for experiment‑driven engineering pipelines built for rapid iteration. These pipelines must evaluate the scale, structure, content, and trustworthiness of data, and manage data versioning, sampling, and feedback loops.
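Data versioning and sampling can be illustrated with a small sketch built on content hashing. It assumes records are JSON‑serializable dictionaries with a stable ID; the function names `dataset_version` and `in_sample` and the `salt` parameter are hypothetical, and this is one possible approach rather than the method the article describes.

```python
import hashlib
import json


def dataset_version(records):
    """Return a short, order-independent content hash for a dataset snapshot."""
    digests = sorted(
        hashlib.sha256(json.dumps(r, sort_keys=True).encode("utf-8")).hexdigest()
        for r in records
    )
    return hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()[:12]


def in_sample(record_id, rate, salt="exp-1"):
    """Decide sample membership by hashing the record ID into [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000 < rate


# Hypothetical records used only to exercise the two helpers.
snapshot = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]
edited = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world!"}]

print(dataset_version(snapshot) == dataset_version(list(reversed(snapshot))))
print(dataset_version(snapshot) == dataset_version(edited))
```

Because both helpers are deterministic, the same snapshot always maps to the same version string and the same record always lands in the same sample for a given salt. This makes experiments on a data version repeatable.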

Future data ecosystems must support open data value discovery, secure multi‑party data flow, privacy protection, and value‑based settlement to sustain healthy, sustainable development.

Ant Group’s data team has been building a multimodal storage and compute engine, vector database capabilities, and mixed‑retrieval analytics to support the evolving data‑intelligence landscape, emphasizing the shift from cost‑efficiency to value‑centric data systems.

Written by

AntData

Ant Data leverages Ant Group's leading technological innovation in big data, databases, and multimedia, with years of industry practice. Through long-term technology planning and continuous innovation, we strive to build world-class data technology and products.
