AntData
Author

AntData draws on Ant Group's technological innovation in big data, databases, and multimedia, and on years of industry practice. Through long-term technology planning and continuous innovation, we strive to build world-class data technology and products.

26 articles · 0 likes · 86 views · 0 comments
Recent Articles

AntData
Mar 7, 2025 · Artificial Intelligence

Design and Implementation of a Cloud‑Native AI Storage Acceleration System (PCache) for Large‑Scale Model Training

This article examines the challenges of AI storage for massive models, describes Ant Group's multi‑cloud, high‑availability PCache architecture, and details its GPU‑mixed deployment, metadata services, data‑link optimizations, and performance results that enable petabyte‑scale training with low cost and high stability.

AI storage · Large Models · PCache
0 likes · 19 min read
AntData
Mar 4, 2025 · Big Data

Design and Analysis of 3FS: An AI‑Optimized Distributed File System

The article gives a comprehensive overview of 3FS, an AI‑focused distributed file system that uses FoundationDB for metadata, CRAQ for chunk replication, and a hybrid FUSE/native client architecture. It covers the system's design, components, fault handling, and performance considerations for large‑scale training workloads.

AI storage · CRAQ replication · Distributed File System
0 likes · 25 min read
AntData
Dec 17, 2024 · Databases

Designing Database Services for Modern Online Business: Scalability, Agility, Security, and Cost Optimization

The article examines how database services must evolve to meet the high‑availability, real‑time response, horizontal scalability, application agility, security compliance, and cost‑optimization demands of modern online businesses, using Ant Group’s multi‑generation architecture and technologies such as distributed middleware, HTAP/HSAP, and polyglot persistence as examples.

Application Agility · Database Services · HTAP
0 likes · 22 min read
AntData
Dec 11, 2024 · Big Data

Flex: A Stream‑Batch Integrated Vectorized Engine for Flink

This article introduces Flex, a Flink‑compatible stream‑batch vectorized engine built on Velox and Gluten, explains the SIMD‑based execution model, details native operator optimizations, fallback mechanisms, correctness and usability improvements, and presents performance results and future development plans.

Distributed computing · Flink · SIMD
0 likes · 17 min read
AntData
Nov 18, 2024 · Databases

Modern Data Paradigms: From Relational Databases to Vector Retrieval and AI

This article surveys the evolution of modern data technologies—from the 4V characteristics of big data and the limitations of traditional relational databases, through the rise of NoSQL and polyglot persistence, to embedding‑driven vector search, hybrid retrieval and RAG, illustrating how each paradigm frees applications from data constraints.

Artificial Intelligence · Big Data · Data Architecture
0 likes · 30 min read
AntData
Oct 16, 2024 · Artificial Intelligence

Building a Data Assistant Application with DB‑GPT V0.6.0

This tutorial walks through the end‑to‑end process of creating a data‑assistant application using DB‑GPT V0.6.0, covering prerequisite deployment, knowledge‑base construction, sub‑agent creation, RAG‑based QA, AWEL workflow installation, intent‑recognition knowledge base, and unified multi‑agent orchestration.

AI · DB-GPT · Data Assistant
0 likes · 12 min read
AntData
Sep 26, 2024 · Databases

Apache HoraeDB (CeresDB): An Open‑Source Distributed Time‑Series Database

Apache HoraeDB (CeresDB) is an open‑source, distributed, high‑availability time‑series database developed by Ant Group, supporting multi‑dimensional queries, compatible with Prometheus and OpenTSDB, and offering SQL and OLAP capabilities for use cases such as APM, IoT monitoring, financial analytics, and AI‑infra observability.

Distributed Systems · Observability · Open Source
0 likes · 5 min read
AntData
Sep 26, 2024 · Artificial Intelligence

DB-GPT: Open-Source AI-Native Data Application Development Framework

DB‑GPT is an open‑source AI‑native data‑application framework that provides multi‑model management, Text‑to‑SQL optimization, RAG, multi‑agent collaboration, and intelligent workflow orchestration, enabling developers to build scalable large‑model database applications, with proven enterprise adoption, community growth, and academic publications.

AI · Open Source · RAG
0 likes · 6 min read
AntData
Sep 9, 2024 · Big Data

From Cost‑Efficiency to Value‑Centric: The Evolution of Data Systems in the Data+AI Era

The article reviews the rapid advances in generative AI and big‑data technologies, traces the historical development of data infrastructure, and argues that modern data systems are shifting from a cost‑efficiency focus to a value‑centric paradigm driven by multimodal, unstructured data, vector search, and machine‑oriented services.

@Data · Artificial Intelligence · Big Data
0 likes · 18 min read