Author

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.

1.7k

Articles

Likes

5.4k

Views

Comments

Latest from DataFunSummit

100 recent articles max

DataFunSummit

Apr 23, 2026 · Databases

How Hologres Dynamic Table Redefines Data Processing with Incremental Computing

The article analyzes the limitations of traditional batch and stream processing, introduces Hologres Dynamic Table as a declarative, incremental‑compute framework that bridges the gap between low‑cost batch jobs and low‑latency streaming, and validates its performance with benchmarks and real‑world case studies.

Dynamic Table · Hologres · Performance Benchmark

0 likes · 13 min read

DataFunSummit

Apr 23, 2026 · Artificial Intelligence

Ontology + Large Model: How Knora Solves Hallucination and Execution Gaps in Enterprise AI

The article details how Knora 4.0 integrates ontology with large‑model AI to create a reusable, extensible enterprise AI platform that mitigates hallucination, stabilises output, and enables autonomous end‑to‑end execution, illustrated with LED production line case studies, architectural breakdowns, and a roadmap for future intelligent agents.

autonomous agents · enterprise AI · knowledge graph

0 likes · 17 min read

DataFunSummit

Apr 22, 2026 · Artificial Intelligence

Why the Overlooked Agent Harness Is the Real Reason AI Projects Fail

The article explains that the hidden infrastructure layer called Agent Harness, which is responsible for prompt, context, and tool orchestration, determines whether impressive AI agent demos can survive production. It highlights context rot, compounding errors, and verification loops, and reports concrete benchmark improvements.

AI Agents · Agent Harness · Context Management

0 likes · 14 min read

DataFunSummit

Apr 22, 2026 · Artificial Intelligence

From Flawed RAG to Production‑Ready: Deep Dive into Scaling Retrieval‑Augmented Generation

This expert roundtable dissects why RAG often fails in production—low recall, hallucinations, cost overruns—and walks through concrete diagnostics, hybrid search designs, knowledge‑engineering tricks, GraphRAG and Agentic RAG advances, plus practical deployment, security, and cost‑optimization guidelines.

AI deployment · Agentic RAG · Hybrid Search

0 likes · 20 min read

DataFunSummit

Apr 21, 2026 · Industry Insights

How SelectDB Cuts 60% Costs and Boosts Real‑Time Performance for New Energy Batteries

The whitepaper analyzes the data‑driven transformation of the new‑energy battery sector, outlines four core challenges—massive data streams, fast‑changing R&D demands, long manufacturing cycles, and multi‑dimensional quality standards—and demonstrates how SelectDB’s unified lake‑warehouse architecture delivers million‑level throughput, second‑level latency, up to 30× query speedup, and 60% cost reduction across real‑world case studies.

Big Data · Data Warehouse · New Energy

0 likes · 18 min read

DataFunSummit

Apr 21, 2026 · Industry Insights

How AI Search & Recommendation Systems Beat Multi-Modal, High-Concurrency Hurdles

This article reviews cutting‑edge technical practices from Alibaba Cloud AI Search, Huawei Noah's recommendation platform, and Baidu's GRAB model, detailing how multi‑agent RAG architectures, large‑language‑model enhancements, and generative ranking overcome high‑concurrency, multi‑modal data, and feature‑engineering bottlenecks.

AI Search · Generative Ranking · Industry Insights

0 likes · 6 min read

DataFunSummit

Apr 20, 2026 · Artificial Intelligence

Why Ontology‑Driven Agents Are the Key to Safe, Controllable Enterprise AI

The article analyses the current hype around AI agents, explains why pure prompt‑based constraints fail in complex business scenarios, and proposes an ontology‑driven Harness Engineering framework that embeds architectural constraints, context engineering, and a traceable feedback loop to achieve secure, business‑level controllability.

AI Agents · Context Engineering · Knora

0 likes · 21 min read

DataFunSummit

Apr 20, 2026 · Industry Insights

How Apache Gravitino Solves Data Fragmentation in the Multi‑Cloud AI Era

At a Data for AI meetup, Datastrato's VP of Engineering Shi Shaofeng explains how Apache Gravitino's metadata federation, metalake architecture, and unified access control address multi‑cloud data fragmentation, compliance, and AI‑driven governance, and outlines the version 1.1.0 enhancements and the roadmap for 1.2.0.

AI data governance · Apache Gravitino · Metadata Management

0 likes · 12 min read

DataFunSummit

Apr 19, 2026 · Big Data

How OPPO Built a Multi‑Modal Data Lake with Gravitino and Curvine

OPPO’s data‑lake team, led by David, detailed their transition from Hive‑Spark to a unified multi‑modal lake, leveraging Gravitino for cross‑engine metadata management and the open‑source Curvine cache to eliminate data silos, boost I/O performance, and support massive image, recommendation, and AI‑Agent workloads.

Big Data · Distributed Cache · Metadata Management

0 likes · 11 min read

DataFunSummit

Apr 19, 2026 · Artificial Intelligence

How to Build a Multimodal Product Search Engine with Embedding and Vector Retrieval on Elasticsearch Serverless

This article explains a complete multimodal product search solution that combines text and image embeddings, dense, sparse, and hybrid models, vector similarity metrics, and Elasticsearch Serverless features such as dense_vector, sparse_vector, hybrid search, quantization, and RRF ranking to achieve fast, accurate, and cost‑effective retrieval.

AI · Elasticsearch · Embedding

0 likes · 20 min read