
Ximalaya's ChatBI: Applying Large‑Model AI to Build an Intelligent Business Intelligence Platform

This article presents Ximalaya's practical exploration of a large‑model‑driven BI system called ChatBI, detailing the background challenges, product architecture, implementation workflow, model‑optimization techniques, launch results, and future directions for data‑intelligent operations.


Background – Ximalaya faced high barriers for business users to access data, slow response times, and inflexible dashboards, while data engineers struggled with limited resources, high development costs, and low consumption of curated data. To address these issues, the team aimed to create a large‑model‑based BI solution that reduces development pressure and simplifies business usage.

Product Architecture – ChatBI is offered as a web portal, DingTalk robot, and open API. The system consists of two layers: the ChatBI layer (user‑facing interfaces) and the Data Intelligence Engine layer (agents that handle intent recognition, metric definition, data query, NL‑to‑SQL, data development, and governance). The overall architecture is organized into five layers: Model Access, Knowledge Management, Tooling, Agent Capabilities, and Product Features, supporting both metric‑level and table‑level queries.
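The split between metric-level and table-level queries can be sketched as a simple dispatch: governed metric definitions are answered from a curated registry, while ad-hoc table-level questions are handed to the NL-to-SQL agent. This is an illustrative sketch only; the registry contents, names, and routing rule below are assumptions, not Ximalaya's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    # "metric" queries hit curated metric definitions; "table" queries
    # fall through to NL-to-SQL over raw tables, as described above.
    level: str


# Hypothetical curated metric registry owned by the metric-definition agent.
METRICS = {
    "dau": "SELECT COUNT(DISTINCT user_id) FROM daily_active",
}


def route(query: Query) -> str:
    """Return governed SQL for known metrics, or defer to the NL-to-SQL agent."""
    if query.level == "metric":
        key = query.text.lower().strip()
        if key in METRICS:
            return METRICS[key]  # metric-level: reuse governed, pre-approved SQL
        raise KeyError(f"undefined metric: {query.text}")
    return "nl2sql"  # table-level: hand off to the NL-to-SQL agent
```

The design choice here mirrors the article's layering: metric-level answers stay deterministic and governed, and only the open-ended table-level path pays the cost (and risk) of model-generated SQL.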

Implementation Details – The workflow starts with knowledge preparation: modeling tables, fields, SQL dialects, business rules, and examples into a knowledge base, vectorizing unstructured knowledge, and building evaluation datasets. User queries trigger intent detection, prompt rewriting, knowledge retrieval, and NL‑to‑SQL generation, followed by validation, execution, and chart rendering. The system also includes automated testing, logging, and feedback loops.
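The query path described above — intent detection, SQL generation, then validation before execution — can be sketched as a minimal pipeline. The keyword-based intent check and the stubbed model call are placeholders (a real system would invoke the LLM with retrieved schema and examples); the schema and table names are invented for illustration. Validating generated SQL against the schema with SQLite's `EXPLAIN QUERY PLAN` is one cheap way to implement the validation step.

```python
import sqlite3

# Hypothetical schema fragment, standing in for tables registered
# in the knowledge base.
SCHEMA = "CREATE TABLE daily_plays (dt TEXT, album_id INTEGER, plays INTEGER);"


def detect_intent(question: str) -> str:
    """Toy intent detection: route data questions to NL-to-SQL, else chitchat."""
    keywords = ("how many", "plays", "count", "top", "trend")
    return "nl2sql" if any(k in question.lower() for k in keywords) else "chitchat"


def generate_sql(question: str) -> str:
    """Placeholder for the large-model call."""
    return "SELECT dt, SUM(plays) FROM daily_plays GROUP BY dt"


def validate_sql(sql: str) -> bool:
    """Check generated SQL against the schema before executing it for real."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    try:
        conn.execute("EXPLAIN QUERY PLAN " + sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


def answer(question: str):
    """Run the intent -> generate -> validate chain; None means no SQL answer."""
    if detect_intent(question) != "nl2sql":
        return None
    sql = generate_sql(question)
    return sql if validate_sql(sql) else None
```

Rejecting invalid SQL before execution, as in `validate_sql`, is what lets the feedback loop mentioned above log failures for regression datasets instead of surfacing raw database errors to business users.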

Model Optimization – Techniques used include Prompt Engineering, Retrieval‑Augmented Generation (RAG), Fine‑Tuning, combined RAG + Fine‑Tuning, multi‑agent orchestration, and continuous model upgrades. Optimizations span knowledge (high‑quality tables and rules), technical (advanced prompting, multi‑agent split, vector/graph retrieval), product (interactive guidance, multi‑turn conversation, explainability), and quality assurance (unit tests, dataset coverage, online monitoring).
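The RAG portion of these optimizations can be illustrated with a minimal retrieval sketch: knowledge entries are embedded, the top-k most similar to the question are retrieved, and they are prepended to the prompt before the LLM call. The bag-of-words "embedding" and the sample knowledge entries below are stand-ins for a real embedding model and Ximalaya's actual knowledge base, purely for illustration.

```python
import math
from collections import Counter

# Toy knowledge-base entries (table descriptions, business rules) — the kind
# of unstructured knowledge the article says gets vectorized for retrieval.
KNOWLEDGE = [
    "daily_plays stores per-album play counts by date",
    "revenue_facts stores subscription revenue by region",
    "business rule: exclude test accounts from all metrics",
]


def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k knowledge entries most similar to the question."""
    q = embed(question)
    ranked = sorted(KNOWLEDGE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question: str) -> str:
    """Prepend retrieved knowledge to the question before the LLM call."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"
```

Combining this retrieval step with fine-tuning, as the article describes, means the model sees both task-adapted weights and fresh, query-specific schema context at inference time.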

Launch Results – Within two weeks of release, ChatBI's unique visitors (UV) surpassed those of the existing self-service data-extraction tool, page views (PV) reached half of the target, query latency fell severalfold, and answer accuracy stabilized at around 85%.

Future Outlook – The roadmap includes enhancing intent recognition, smart rewriting, error correction, and chart generation, as well as developing DataOps agents for SQL generation, optimization, and troubleshooting, ultimately integrating all data products with natural‑language interaction capabilities.

Tags: AI agents, Prompt Engineering, RAG, Business Intelligence, large language model, data intelligence
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
