Author

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.

Latest from DataFunTalk

May 1, 2026 · Artificial Intelligence

Why Ontology Is the Semantic Operating System for Large‑Model AI

The article argues that in the era of powerful large models, enterprises lack a unified, computable, and evolvable semantic layer—ontology—that acts as a semantic operating system, bridging business concepts, data, and AI to enable reliable, actionable intelligence.

Large Models · Open Source · enterprise AI

0 likes · 16 min read

May 1, 2026 · Artificial Intelligence

Evolving Agent Development: Simplifying Multi‑Source Real‑Time Context from an Environment‑Engineering Perspective

The article analyzes why AI agents thrive in software engineering yet lag in many industries, attributing the gap to insufficient real‑time, multi‑source context, and proposes a five‑dimensional framework—information completeness, sensory management, knowledge reconciliation, change governance, and low entry barrier—illustrated with Alibaba Cloud EventHouse solutions.

AI Agents · Change Governance · Context Management

0 likes · 15 min read

Apr 30, 2026 · Artificial Intelligence

How GenericAgent Cuts Token Costs by 10× While Boosting AI Agent Performance

The technical report on GenericAgent, a self‑evolving LLM‑based agent, shows that by maximizing context information density and using a minimal atomic toolset with hierarchical memory, it achieves up to ten‑fold token savings, 100% task accuracy, and progressive efficiency gains across multiple benchmarks.

AI benchmarks · GenericAgent · LLM

0 likes · 15 min read

Apr 29, 2026 · Big Data

How Xiaohongshu Revamped Its Data Architecture for the Big AI Data Era

Xiaohongshu transformed its data platform from a simple ClickHouse‑based analytics stack to a unified lakehouse with generic incremental compute, cutting architecture complexity, resource cost, and development effort by roughly one‑third while supporting petabyte‑scale, sub‑second queries across its 350 million‑user app.

Big Data · ClickHouse · Data Architecture

0 likes · 22 min read

Apr 29, 2026 · Artificial Intelligence

Hinton Warns: $4.8 Trillion AI Market Locked In – Is AGI a Foolish Term?

In a stark address at the World Digital Conference, Geoffrey Hinton warned that only about 1% of AI research focuses on safety while the $4.8 trillion market races ahead, critiquing the term AGI, outlining three classes of AI risk, and highlighting the dangerous concentration of AI power and resources worldwide.

AGI · AI Market · AI governance

0 likes · 12 min read

Apr 28, 2026 · Artificial Intelligence

From “Lobster” to Ontology: DACon Reveals the Next Trend in Self‑Evolving AI Agents

The DACon conference in Shanghai gathered over 8,000 developers and experts, showcasing 50 talks that explored self‑evolving AI agents, the open‑source GenericAgent framework, data‑governance ontology, Agent‑Ready big‑data infrastructure, and AI+AR ecosystems, while highlighting practical case studies and future industry directions.

AI Agents · AI+AR · Big Data

0 likes · 11 min read

Apr 28, 2026 · Industry Insights

Why China Can’t Replicate Palantir – Not a Penguin in the Sahara, but a Different Beast

The article dissects Palantir’s rise—backed by In‑Q‑Tel, F‑class shares, and a subscription model—showing how the U.S. political‑capital ecosystem created a unique AI powerhouse that China’s project‑based procurement, legal constraints, and capital structure cannot emulate, and proposes a vertically‑focused, long‑term AI strategy suited to China’s own soil.

Chinese tech ecosystem · In-Q-Tel · Palantir

0 likes · 15 min read

Apr 28, 2026 · Artificial Intelligence

Manifold AI’s WorldScape 0.2 Tops WorldArena: How MoE Drives Superior Physics and 3D Understanding

Manifold AI’s WorldScape 0.2 achieved the highest overall score on the embodied world‑model benchmark WorldArena, outperforming giants like Google and Nvidia by excelling in comprehensive perception, physics compliance, and 3D accuracy while using only about 10% of the parameters of competing models, thanks to a newly introduced MoE architecture.

Embodied AI · MoE · Scaling Law

0 likes · 9 min read

Apr 27, 2026 · Artificial Intelligence

Ontology + Large Model: How Knora Tackles Enterprise AI Hallucination and Execution Gaps

The article analyses how Knora 4.0 combines enterprise ontologies with large‑model AI to eliminate hallucinations, provide stable semantic constraints, and enable end‑to‑end autonomous execution across complex business scenarios, illustrated with LED production‑line use cases and a detailed platform architecture.

AI Platform · Knora · Large Language Model

0 likes · 17 min read

Apr 26, 2026 · Artificial Intelligence

Building an Enterprise‑Grade RAG 2.0 System: Architecture, Challenges, and Best Practices

This article analyses the practical construction of an enterprise‑level Retrieval‑Augmented Generation (RAG) 2.0 system, covering background issues of large models, a modular architecture, layered offline/online pipelines, hybrid retrieval, ranking strategies, prompt engineering, and deployment insights drawn from China Mobile’s production experience.

Hybrid Retrieval · RAG · Ranking Models

0 likes · 22 min read
