Author

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.

1.7k

Articles

Likes

5.4k

Views

Comments

Latest from DataFunSummit

100 recent articles max

DataFunSummit

May 9, 2026 · Artificial Intelligence

DeepEye: Building an Autonomous, Human‑Steerable Data Agent System

The article presents DeepEye, an open‑source autonomous data‑agent platform that combines LLM reasoning, workflow orchestration, and human‑in‑the‑loop control to enable end‑to‑end analysis of heterogeneous data, and introduces a six‑level capability taxonomy to guide its evolution from manual to fully autonomous operation.

Data Agent · DeepEye · Human-in-the-Loop

0 likes · 18 min read


DataFunSummit

May 8, 2026 · Artificial Intelligence

Agent Architecture in Action: Building Next‑Gen Recommendation and Search Systems

This article reviews cutting‑edge AI search and recommendation technologies, covering Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's generative ranking model GRAB, while detailing their design challenges, multi‑modal retrieval strategies, performance gains, and real‑world deployment results.

AI Search · Agentic RAG · Generative Ranking

0 likes · 6 min read


DataFunSummit

May 7, 2026 · Artificial Intelligence

From Text to Images: Building Multimodal Product Search with Elasticsearch Serverless

This article walks through a complete multimodal product search solution, explaining how embedding and vector retrieval technologies—combined with Elasticsearch Serverless and Alibaba Cloud AI Search—enable image‑based and semantic queries, detailing the architecture, key algorithms, quantization tricks, and practical deployment steps.

AI Search · Elasticsearch · Embedding

0 likes · 22 min read


DataFunSummit

May 7, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Level Memory in Volcano Engine’s OpenClaw

The article details Volcano Engine’s LAS AI team’s analysis, selection, and deep optimization of the LanceDB vector database as the core memory plugin for the enterprise‑grade OpenClaw (ArkClaw) agent platform, covering comparative evaluation, custom enhancements, and a vision for a cloud‑edge collaborative memory lake.

ArkClaw · AutoDream · Context Engine

0 likes · 16 min read


DataFunSummit

May 6, 2026 · Artificial Intelligence

Inside 1688’s Inference‑Based Recommendation System: Architecture, Challenges, and Future Directions

This article details how Alibaba 1688 tackles the “information cocoon” problem by deploying large‑model inference‑based recommendation, describing its three‑layer architecture, multi‑stage user demand analysis, long‑cycle behavior compression, prompt engineering, trend mining, near‑line serving, and future enhancements.

Large Language Model · Multimodal · behavior compression

0 likes · 23 min read


DataFunSummit

May 5, 2026 · Artificial Intelligence

How Huawei Noah’s KAR Project Leverages LLMs to Advance Recommendation Systems

The article reviews the evolution of recommendation systems from deep learning to large language models, analyzes core challenges such as noisy implicit feedback and limited semantic understanding, and details Huawei Noah’s KAR solution, which uses factorized prompting, multi‑expert adapters, and AI‑Agent architectures to achieve a 1.5% AUC lift along with validated online A/B test results.

AI agent · AUC · Huawei

0 likes · 5 min read


DataFunSummit

May 5, 2026 · Big Data

A New Data Lake Paradigm: Volcano Engine’s Multi‑Modal Data Lake Built on Lance

The article presents Volcano Engine’s AI‑focused data lake built on the Lance format, detailing why traditional lakes fall short for multimodal data, the engineering enhancements such as Binary Copy Compaction, Lance Insight, distributed vector indexing, JSON‑based tagging, Row‑ID shuffle optimization, and real‑world case studies that demonstrate significant performance and cost gains.

AI · Binary Copy Compaction · Distributed Vector Index

0 likes · 18 min read


DataFunSummit

May 4, 2026 · Artificial Intelligence

DeepSeek’s MCTS Failure: The ‘Roast Chicken and Baijiu’ Dilemma in LLM Training

The article examines why DeepSeek’s large‑model training cannot yet leverage Monte Carlo Tree Search, detailing its reliance on SFT, GRPO‑driven CoT activation, and rejection sampling, contrasting this with Google’s PRM‑based approaches, and proposing an MCTS‑powered data‑generation pipeline to overcome the “roast chicken and baijiu” training dilemma.

GRPO · Monte Carlo Tree Search · Process Reward Model

0 likes · 14 min read


DataFunSummit

May 4, 2026 · Artificial Intelligence

Inside Alibaba Cloud AI Search: Agentic RAG Architecture and Multi‑Agent Techniques

Alibaba Cloud AI Search tackles high‑concurrency, multimodal, and multi‑hop queries by evolving its Agentic RAG architecture from a single agent to a coordinated multi‑agent system that integrates planning, retrieval, and generation and leverages hybrid vector‑text‑DB‑graph recall, GPU‑accelerated indexing, quantization, NL2SQL, and multimodal search. The article includes performance data and real‑world case studies.

AI Search · Agentic RAG · Alibaba Cloud

0 likes · 6 min read


DataFunSummit

May 4, 2026 · Artificial Intelligence

Best Practices for Persistent, Reliable AI Agent Memory: Insights from the ‘Memory in the Age of AI Agents’ Paper

The article analyzes the 2025 "Memory in the Age of AI Agents" paper, presenting its three‑dimensional classification of AI memory (Forms, Functions, Dynamics), comparing token‑level, parameter‑level, and latent‑space approaches, evaluating major frameworks such as Mem0, Letta, Zep, and ReMem, and offering concrete guidance on design, forgetting mechanisms, retrieval strategies, and future research directions.

AI memory · agentic AI · latent space memory

0 likes · 17 min read
