Tagged articles
924 articles
Wuming AI
Feb 3, 2026 · Artificial Intelligence

How Short‑Term vs Long‑Term Memory Works in LLM‑Powered Autonomous Agents

This article demystifies short‑term and long‑term memory in LLM‑driven autonomous agents, explaining their mechanisms, limitations, and practical implementations such as sliding windows, summarization, and vector‑based retrieval, while illustrating each concept with concrete Cherry Studio examples and relevant research references.

Cherry Studio · LLM · Memory Management
7 min read
Architecture and Beyond
Feb 1, 2026 · Artificial Intelligence

5 High‑ROI Strategies to Supercharge RAG Retrieval Performance

This article outlines five practical engineering strategies—multi‑vector retrieval, manual splitting and labeling, scalar enhancement, context augmentation, and dense‑sparse vector integration—that together address common RAG retrieval bottlenecks and dramatically improve recall stability and answer quality.

BM25 · Engineering · LLM
17 min read
SpringMeng
Jan 30, 2026 · Artificial Intelligence

Hands‑On Guide: Build AI Agent Chatbots on Windows with RagFlow

Programmer Xiao Meng walks through a complete Windows setup for AI‑powered customer service agents using RagFlow, covering prerequisites, Docker and Ollama installation, model download, container deployment, configuration of knowledge bases, and testing, based on five real‑world projects.

AI chatbot · Large Language Model · Ollama
7 min read
Alibaba Cloud Developer
Jan 29, 2026 · Backend Development

How to Build a BFF Agent with LangGraph: A Step‑by‑Step Guide

This article walks through integrating an AI‑powered Agent into an internal BFF platform using LangGraph, detailing the architectural choices, state‑graph implementation, prompt engineering, knowledge‑base construction, tool integration, conversation handling, and context compression techniques to enable reliable script generation, execution, and validation.

AI · Agent · LangGraph
24 min read
Old Zhang's AI Learning
Jan 28, 2026 · Artificial Intelligence

RAG-Anything: A Universal RAG Framework for PDFs, Office Docs, and Images

RAG-Anything is an open-source, end-to-end multimodal RAG framework that ingests PDFs, Office files, images, and scientific papers, parses them with high fidelity using MinerU, builds a multimodal knowledge graph, and enables hybrid retrieval. The article also notes its resource and dependency considerations.

AI · Document Processing · Knowledge Base
7 min read
PaperAgent
Jan 27, 2026 · Artificial Intelligence

How Agentic‑R Boosts Multi‑Turn Retrieval for LLMs by 2–3 EM Points

This article analyzes the Agentic‑R framework, which upgrades traditional single‑hop Retrieval‑Augmented Generation by introducing dual‑perspective scoring and a bidirectional flywheel, resulting in absolute EM improvements of 2–3 points across seven QA datasets and a 10–15% reduction in search rounds.

LLM · Multi-hop reasoning · RAG
6 min read
Data STUDIO
Jan 27, 2026 · Artificial Intelligence

How Python RAG Architectures Can Tame Large‑Model Hallucinations: A Complete Guide to 9 Designs

This article explains why large‑language‑model hallucinations are risky, introduces Retrieval‑Augmented Generation (RAG) as a remedy, and walks through nine Python‑based RAG architectures—standard, conversational, corrective, adaptive, fusion, HyDE, self‑RAG, agentic, and graph RAG—detailing their workflows, code examples, strengths, weaknesses, and a decision‑making map for selecting the right design.

AI Hallucination · LangChain · Large Language Models
29 min read
Efficient Ops
Jan 26, 2026 · Artificial Intelligence

Why AI Skills Will Redefine Agents Beyond MCP

This article explains how AI Skills serve as structured knowledge bases that complement, rather than replace, the Model Context Protocol (MCP), enhance Retrieval‑Augmented Generation, and drive three major trends: standardized agent stacks, low‑code knowledge engineering, and the emergence of personal AI agents.

AI agents · AI ecosystem · Knowledge Engineering
8 min read
Old Meng AI Explorer
Jan 24, 2026 · Artificial Intelligence

How UltraRAG Turns Complex RAG Deployment into a One‑Click, No‑Code Process

UltraRAG, an open‑source RAG framework from THUNLP and NEUIR, consolidates data construction, model fine‑tuning, and evaluation into a zero‑code WebUI, offering one‑click multimodal knowledge‑base creation, modular deployment, and multi‑dimensional metrics that boost retrieval accuracy by up to 30% while halving development time.

AI · Open‑Source · RAG
9 min read
AI Waka
Jan 24, 2026 · Artificial Intelligence

Building Production‑Ready AI Agents with NVIDIA’s Nemotron Stack

The article explains how NVIDIA’s Nemotron Stack combines ultra‑fast speech recognition, multimodal retrieval, and advanced safety models into a unified, low‑latency pipeline, offering practical integration code, performance insights, and deployment options for turning experimental AI agents into production‑grade services.

AI agents · Content Safety · Deployment
9 min read
AI Waka
Jan 24, 2026 · Artificial Intelligence

2026 Agentic AI Roadmap: How to Build Autonomous AI Agents

This comprehensive 2026 roadmap outlines the essential programming foundations, core agent architectures, LLM and API integrations, tool usage, memory management, RAG systems, deployment strategies, monitoring, and security practices needed to design, develop, and operate autonomous AI agents.

AI roadmap · LLM integration · Python
10 min read
Tech Verticals & Horizontals
Jan 23, 2026 · Artificial Intelligence

Comparing 9 Major Agent Development Frameworks: Choosing the Best Fit

This article provides an in‑depth comparison of nine mainstream AI agent development frameworks—Pydantic AI, SmolAgents, DeepAgents, LlamaIndex, CAMEL, AutoGen, CrewAI, LangGraph, and OpenAI Agents SDK—detailing their design principles, strengths, weaknesses, typical scenarios, and guidance for selecting or mixing them in production.

Agent Frameworks · LLM · LangChain
30 min read
Java Architecture Diary
Jan 22, 2026 · Artificial Intelligence

Unlock Java Power with Claude Agent SDK: From One‑Shot to Reactive APIs

This article explains how Claude Code, a super‑intelligent AI agent, differs from traditional code‑completion tools, outlines the limitations of its official SDK, and provides a comprehensive guide to the community‑driven Claude Agent SDK for Java, including its one‑shot, blocking, and reactive APIs and a practical RAG‑based Q&A example.

AI Agent · Claude Code · Java SDK
10 min read
JakartaEE China Community
Jan 20, 2026 · Backend Development

How to Build AI‑Powered Java Apps with Helidon and LangChain4j

This article explains how Helidon 4.2 integrates the LangChain4j framework to simplify adding large‑language‑model capabilities, covering core features, Maven setup, configuration, component creation, dependency injection, annotations, custom tools, and sample applications such as a coffee‑shop assistant.

AI integration · Helidon · Java
14 min read
Tencent Cloud Developer
Jan 20, 2026 · Artificial Intelligence

From Transformers to Agents: A Complete Timeline of Large Language Model Evolution

This article traces the evolution of large language models from the 2017 Transformer breakthrough through successive milestones such as BERT, GPT‑3, RLHF alignment, multimodal extensions, open‑source alternatives, and the rise of retrieval‑augmented generation, AI agents, and emerging protocols that shape modern AI applications.

Large Language Models · Open-source models · Prompt Engineering
44 min read
macrozheng
Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

AI · LLM · RAG
7 min read
PaperAgent
Jan 15, 2026 · Artificial Intelligence

How GAG Enables Zero‑Retrieval, Single‑Token Private Knowledge Injection in LLMs

The article presents GAG, a third‑generation framework that injects proprietary domain knowledge into frozen large language models using a single token, eliminating retrieval, avoiding base model updates, and maintaining a constant inference budget while delivering strong performance on private QA and public benchmarks.

AI alignment · GAG · LLM
8 min read
Sohu Tech Products
Jan 14, 2026 · Artificial Intelligence

Build a Zero‑Cost Open‑Source RAG Smart Document Q&A System from Scratch

This guide walks through building an open‑source Retrieval‑Augmented Generation (RAG) system that indexes local files with Everything, uses hybrid BM25‑vector search via Elasticsearch, and answers questions with a local LLM, covering architecture, core techniques, deployment steps, performance tweaks, and common pitfalls.

Elasticsearch · LLM · Open Source
11 min read
Tech Verticals & Horizontals
Jan 14, 2026 · Artificial Intelligence

Why Parallelism Matters: Designing Multi‑Agent Architectures for Scalable AI Systems

The article explains why parallelism is crucial for large‑scale AI systems—addressing I/O latency and reliability—by detailing core agent patterns, multi‑agent architectures, reliability strategies, and advanced retrieval‑augmented generation techniques, each illustrated with concrete Jupyter notebooks.

AI governance · Parallelism · RAG
6 min read
Instant Consumer Technology Team
Jan 13, 2026 · Artificial Intelligence

Scalable Enterprise AI Assistant: Intent Planning, Context Engineering, Data Iteration

This article details the end‑to‑end design of an enterprise AI office assistant, covering the three‑layer framework of intent planning, context engineering, and data self‑iteration, the key pain points of intent understanding, knowledge integration, and quality control, and practical architectural and implementation solutions for scalable deployment.

AI Assistant · Agent Collaboration · Context Engineering
25 min read
Fun with Large Models
Jan 12, 2026 · Artificial Intelligence

Why You Should Master Large‑Model Training: A Full‑Process Practical Guide

The article explains why mastering large‑model training is crucial for professionals, researchers, and enterprises, outlines the end‑to‑end pipeline—from data preparation and pre‑training to instruction fine‑tuning and RLHF alignment—compares training with RAG, and presents a structured learning roadmap.

AI agents · Data Engineering · PyTorch
14 min read
AI Algorithm Path
Jan 11, 2026 · Artificial Intelligence

How Vector Embeddings Enable AI to Understand Anything

This article explains the principle of vector embeddings, shows how they turn words, images, audio and other data into dense numeric vectors, compares them with one‑hot encoding, describes static and contextual models, training methods, similarity metrics, and a wide range of real‑world AI applications.

AI fundamentals · Multimodal · RAG
15 min read
PaperAgent
Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG
7 min read
Old Meng AI Explorer
Jan 9, 2026 · Artificial Intelligence

How UltraRAG Turns RAG Deployment into a Zero‑Code, Multi‑Modal Powerhouse

UltraRAG, an open‑source RAG framework from THUNLP and NEUIR, eliminates data‑cooking, retrieval tuning, and fine‑tuning hurdles by offering a zero‑code Web UI, one‑click data synthesis, multimodal support, modular design, and comprehensive evaluation, enabling enterprises, developers, and researchers to launch domain‑specific RAG systems up to twice as fast with up to 30% higher accuracy.

Artificial Intelligence · Open-source · RAG
10 min read
Old Meng AI Explorer
Jan 8, 2026 · Artificial Intelligence

How UltraRAG Turns RAG Development into a Zero‑Code, One‑Click Process

UltraRAG, an open‑source RAG framework from THUNLP and NEUIR, offers a zero‑code WebUI that streamlines data construction, model fine‑tuning, and multi‑dimensional evaluation, boosting retrieval accuracy by up to 30% and cutting deployment time in half for enterprises, AI developers, and researchers.

AI · Open-source · RAG
10 min read
Sohu Tech Products
Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

AI · Knowledge Base · LLM
14 min read
Youzan Coder
Jan 6, 2026 · Artificial Intelligence

How to Build Efficient Code Search with Vector Embeddings and AST Indexing

This article explains the motivations, techniques, and practical implementations of code indexing—covering semantic vector‑based RAG pipelines and AST‑based structural analysis—to improve code navigation, AI‑assisted queries, security scanning, and development efficiency.

AI development · AST · RAG
17 min read
Advanced AI Application Practice
Jan 6, 2026 · Artificial Intelligence

Enterprise-Grade AI + Knowledge Graph for Automating Complex API Test Scenarios

The article details how an AI‑driven test platform combines large language models with a corporate‑level knowledge graph to automatically generate end‑to‑end API test scripts for complex business flows, overcoming context loss, dependency gaps, and scalability limits of single‑interface generation.

AI · API testing · Knowledge Graph
12 min read
Tech Freedom Circle
Jan 5, 2026 · Artificial Intelligence

A Three‑Step Guide to Mastering RAG Semantic‑Loss Interview Questions

RAG (Retrieval‑Augmented Generation) is a hot interview topic, and many candidates stumble on semantic‑loss issues; this article dissects a real JD interview case, identifies three core shortcomings, and presents a three‑step technical solution—structure restoration, semantic splitting, and hybrid retrieval—plus a ready‑to‑use answer template.

AI Interview · Document Parsing · Hybrid Search
25 min read
DataFunTalk
Jan 4, 2026 · Artificial Intelligence

How Agentic RAG and Generative Ranking Are Redefining AI Search and Recommendation

This article summarizes three cutting‑edge AI techniques—Alibaba Cloud's Agentic RAG architecture for multimodal search, Huawei Noah's large‑model‑driven recommendation system evolution, and Baidu's generative ranking (GRAB) model for ads—detailing their challenges, designs, performance gains, and practical deployment insights.

AI Search · Generative Ranking · Large Language Models
7 min read
Mingyi World Elasticsearch
Jan 3, 2026 · Artificial Intelligence

Build Your Own AI Coding Assistant in 5 Minutes: A Hands‑On Guide

The article analyzes common pain points of traditional AI coding chats—repetitive context input, lengthy prompts, and generic answers—and demonstrates how to create a persistent, expert‑level AI coding assistant using Coco AI, with step‑by‑step configuration, example prompts, and future RAG enhancements.

AI Agent · Coco AI · DeepSeek
9 min read
dbaplus Community
Jan 1, 2026 · Artificial Intelligence

Boost LLM Retrieval Accuracy with MCP – A Superior Alternative to RAG

This guide explains why traditional Retrieval‑Augmented Generation (RAG) struggles with precision, introduces the Model Context Protocol (MCP) as a standardized way for large language models to interact with external data sources, and provides step‑by‑step instructions for integrating MCP with MongoDB using Cherry Studio and VS Code + Cline.

FunctionCall · MCP · MongoDB
25 min read
Old Meng AI Explorer
Dec 30, 2025 · Artificial Intelligence

How UltraRAG Delivers One‑Click, No‑Code RAG Deployment and Boosts Retrieval Accuracy

UltraRAG, an open‑source RAG framework from THUNLP and NEUIR, consolidates data construction, model fine‑tuning, and evaluation into a zero‑code WebUI, offering multimodal knowledge‑base creation, one‑click optimization, robust multi‑dimensional evaluation, and micro‑service deployment that can raise retrieval accuracy by up to 30% and halve development time.

AI · Open-source · RAG
10 min read
Alibaba Cloud Big Data AI Platform
Dec 29, 2025 · Cloud Native

How a Visual Platform Cut Search Costs by 60% with All‑in‑Elasticsearch

This case study details how a major internet visual platform consolidated its log, keyword, and vector search workloads onto Alibaba Cloud Elasticsearch, eliminating three separate pipelines, reducing write costs by 60%, cutting storage expenses by over 60%, and achieving multi‑fold performance gains through serverless scaling, FalconSeek engine optimizations, and unified monitoring.

Elasticsearch · RAG · Search Architecture
10 min read
Mingyi World Elasticsearch
Dec 28, 2025 · Artificial Intelligence

Building an Elasticsearch‑Powered RAG Q&A System: Theory and Full Code Walkthrough

This article walks through the principles of Retrieval‑Augmented Generation (RAG) and provides a complete Python implementation using Elasticsearch, covering document chunking, semantic embedding, bulk indexing, hybrid BM25‑vector search, RRF result fusion, prompt design, LLM invocation, and a practical demo.

Elasticsearch · Hybrid Search · Prompt Engineering
9 min read
Architecture and Beyond
Dec 27, 2025 · Artificial Intelligence

Turning Claude Skill Folders into Scalable Industry Workflows

This article explains how Anthropic's Claude Skill folders let you package domain expertise, scripts, and resources into reusable modules, differentiate Skills from prompts, combine them with MCP tools and workflows, and build a robust mixed Agent‑Workflow architecture for reliable enterprise automation.

AI agents · Claude · MCP
18 min read
AI Architecture Hub
Dec 27, 2025 · Artificial Intelligence

How GraphRAG Turns Knowledge Graphs into Smarter Retrieval for LLMs

GraphRAG extends traditional Retrieval‑Augmented Generation by building a knowledge graph from documents, extracting entities and relationships, performing community detection, and supporting both local and global searches, offering detailed step‑by‑step guidance, code examples, configuration tips, and a comparison with classic RAG approaches.

GraphRAG · Knowledge Graph · LLM
28 min read
360 Tech Engineering
Dec 26, 2025 · Artificial Intelligence

15 Chunking Strategies to Supercharge Retrieval‑Augmented Generation

This article presents fifteen practical chunking techniques—ranging from line‑by‑line and fixed‑size chunking to semantic and hierarchical methods—explaining their principles, ideal use‑cases, concrete input examples, chunk outputs, and key advantages or cautions for improving Retrieval‑Augmented Generation with large language models.

AI · Chunking · Data Retrieval
28 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG
15 min read
Architect
Dec 25, 2025 · Artificial Intelligence

How GraphRAG Boosts Retrieval Accuracy with Knowledge Graphs – A Complete Guide

This article explains why traditional RAG suffers from hallucinations, introduces GraphRAG’s knowledge‑graph‑based approach, walks through its indexing and query pipelines—including text splitting, entity‑relation extraction, graph construction, community detection, and local vs. global retrieval—provides practical setup commands, Neo4j visualization steps, and compares its performance with classic RAG.

Embedding · GraphRAG · Knowledge Graph
27 min read
Open Source Tech Hub
Dec 25, 2025 · Artificial Intelligence

Explore Symfony AI: Bringing Native AI Capabilities to PHP

Symfony AI v0.1.0 launches with a suite of PHP components that let developers integrate OpenAI‑style models, vector stores, autonomous agents, and chat persistence directly into Symfony apps, offering easy installation, rich demos, and a dedicated website for hands‑on experimentation.

AI · Machine Learning · OpenAI
6 min read
AI Architecture Hub
Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Agent · Embedding
9 min read
PMTalk Product Manager Community
Dec 24, 2025 · Artificial Intelligence

Why AI Hallucinates and How Product Managers Can Tame It

The article explains the internal and external causes of AI hallucinations, examines how pre‑training data flaws and fine‑tuning choices amplify them, and presents a five‑pronged technical toolbox—including RAG, prompt engineering, chain‑of‑thought, self‑verification, and safety APIs—plus risk‑based product strategies for different industries.

AI Hallucination · Prompt Engineering · RAG
12 min read
Tencent Cloud Developer
Dec 24, 2025 · Backend Development

How IMA Scaled Its AI Knowledge Base from Monolith to Micro‑services

This article walks through the end‑to‑end design of IMA's AI‑driven knowledge base, covering its definition, core business flow, architecture evolution, data ingestion pipelines, management challenges, asynchronous processing, permission modeling, and the business value demonstrated by the prototype.

AI Architecture · Access Control · Data Consistency
14 min read
DataFunTalk
Dec 23, 2025 · Artificial Intelligence

Unlocking AI Search: Agentic RAG, LLM‑Powered Recommendations, and Generative Ranking Explained

This article summarizes three cutting‑edge AI search and recommendation techniques—Alibaba Cloud's Agentic RAG architecture, Huawei's LLM‑enhanced recommendation system evolution, and Baidu's generative ranking model GRAB—detailing their challenges, design choices, performance gains, and practical deployment insights.

AI · Generative Ranking · Multi‑Agent
7 min read
Architect's Alchemy Furnace
Dec 21, 2025 · Artificial Intelligence

Deploy and Explore Open WebUI: A Feature‑Rich Self‑Hosted AI Platform

Open WebUI is a self‑hosted, extensible AI platform that runs fully offline, supports multiple LLM back‑ends such as Ollama and OpenAI‑compatible APIs, offers built‑in RAG, role‑based access, multi‑model chat, markdown/LaTeX, image generation, and provides detailed Docker, pip, and Kubernetes installation guides with ready‑to‑run commands.

AI Platform · LLM · Open WebUI
11 min read
Architecture and Beyond
Dec 21, 2025 · Artificial Intelligence

Designing RAG for Industry‑Specific AI Agents: From Data to Safe Execution

This article explains how to build Retrieval‑Augmented Generation (RAG) for industry‑specific AI agents, covering required capabilities, metrics, data sources, indexing, hybrid retrieval, decision‑point integration, layered output, permission controls, rollout strategies, and common pitfalls to ensure reliable and secure automation.

Agent Design · Knowledge retrieval · RAG
17 min read
Mingyi World Elasticsearch
Dec 20, 2025 · Artificial Intelligence

How to Build an Enterprise‑Grade Intelligent Document QA System with Everything plus RAG

This article walks through the need for fast, accurate answers from massive document collections, compares plain keyword search and pure LLM chat, and presents a hybrid Retrieval‑Augmented Generation solution built with open‑source components, detailing architecture, hybrid retrieval, prompt engineering, deployment, performance tuning, and common pitfalls.

Elasticsearch · Hybrid Retrieval · Prompt Engineering
12 min read
Architect's Journey
Dec 19, 2025 · Artificial Intelligence

Why Context Engineering Is the Hottest AI Skill in 2025

The article explains how context engineering—building a dynamic system that supplies AI with user intent, dialogue history, long‑term memory, external knowledge and tool definitions—outperforms traditional prompt engineering, eliminates hallucinations, and enables AI to complete complex, end‑to‑end tasks.

AI · AI agents · Context Engineering
8 min read
PaperAgent
Dec 18, 2025 · Artificial Intelligence

Can Ontology‑Aware KG‑RAG Double Table QA Performance on Industrial Standards?

This article presents an ontology‑aware knowledge‑graph RAG framework that transforms complex, hierarchical industrial standard documents into a graph of sections, atomic propositions, and refined triples, achieving nearly double F1 scores on table‑based QA tasks and robust performance on long documents.

Knowledge Graph · LLM · Ontology
6 min read
Architects' Tech Alliance
Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM
15 min read
Aikesheng Open Source Community
Dec 16, 2025 · Databases

How to Build Predictive and Generative AI Apps with MySQL AI

MySQL AI adds built‑in LLMs, embeddings, vector storage, AutoML and a graphical console to on‑premise MySQL, enabling developers to create predictive and generative AI applications—including fraud detection, semantic search, RAG and NL2SQL—without external vector databases or GPUs.

AutoML · MySQL AI · Predictive AI
15 min read
JakartaEE China Community
Dec 16, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System with LangChain4j and Ollama 3

This guide walks through the importance of Retrieval‑Augmented Generation, outlines the core LangChain4j and Ollama 3 components, and provides a complete Java example (Maven setup, document ingestion, embedding creation, similarity search, prompt construction, and response generation) to demonstrate a functional RAG pipeline.

Embedding · Java · LLM
0 likes · 9 min read
PaperAgent
Dec 14, 2025 · Artificial Intelligence

GPT‑5.2 vs Gemini 3 Pro: Coding Tests, NeurIPS 2025 Paper Insights, and RAG Refactor

The article evaluates GPT‑5.2 and Gemini 3 Pro on real‑world coding tasks, analyzes trends from the 6000 papers presented at NeurIPS 2025, and demonstrates how to extract and refactor the tree‑building component of the open‑source RAPTOR RAG system into an independent module.

AI model evaluation · Code Refactoring · GPT-5.2
0 likes · 5 min read
Architect's Alchemy Furnace
Dec 13, 2025 · Artificial Intelligence

Explore 100+ Open‑Source LLM Apps and How to Run Them Locally

This guide presents a curated collection of over a hundred open‑source large language model applications—including AI agents, RAG pipelines, and domain‑specific tools—explains their categories, showcases example projects, and provides step‑by‑step instructions to clone and run them on your own machine.

AI agents · GitHub · LLM
0 likes · 8 min read
Fun with Large Models
Dec 7, 2025 · Frontend Development

Building a Multimodal RAG Front‑End with Trae Solo: A Vibe‑Coding Guide

This article walks through a three‑step Vibe‑Coding workflow—structured prompt creation, prompt optimization with DeepSeek, and precise bug‑fix guidance—to automatically generate, refine, and extend a React + TypeScript front‑end for a multimodal RAG system using Trae Solo, covering architecture, streaming chat, and PDF citation features.

AI programming · Frontend · LangChain
0 likes · 22 min read
dbaplus Community
Dec 7, 2025 · Artificial Intelligence

How AI Agents Can Revolutionize Data Governance: A Step‑by‑Step Blueprint

This article explains how AI agents transform traditional data governance by introducing a four‑layer perception‑decision‑execution‑learning architecture, detailing the required technologies, tool integrations, code examples, deployment steps, team roles, security safeguards, and practical rollout strategies for enterprises seeking automated, intelligent data management.

AI Agent · Data Governance · LangChain
0 likes · 10 min read
Old Meng AI Explorer
Dec 5, 2025 · Industry Insights

How Bisheng Turns Enterprise AI Deployment into a Zero‑Code, One‑Stop Process

Bisheng, an open‑source LLM DevOps platform, solves the fragmented, high‑threshold, and compliance‑heavy challenges of enterprise AI by offering a zero‑code visual workflow, all‑in‑one RAG/Agent capabilities, strict security controls, and high‑precision document parsing, enabling rapid, secure AI application rollout.

AI Platform · LLM DevOps · Low‑code
0 likes · 11 min read
Instant Consumer Technology Team
Dec 4, 2025 · Artificial Intelligence

How to Build an AI‑Powered Jira Assistant with LangGraph, RAG, and MCP

This article walks through the design and implementation of an AI‑driven Jira assistant that uses LangGraph as the agent brain, Retrieval‑Augmented Generation for knowledge access, and a Model Context Protocol (MCP) server to execute Jira operations, complete with architecture diagrams, code snippets, and practical use cases.

AI Agent · Jira Automation · LangGraph
0 likes · 12 min read
DataFunTalk
Dec 4, 2025 · Artificial Intelligence

Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking: Cutting‑Edge AI Search Techniques

This article reviews three advanced AI search solutions—Alibaba Cloud's Agentic RAG architecture for multi‑modal retrieval, Huawei's LLM‑enhanced recommendation system with factorized prompting, and Baidu's generative ranking model GRAB—detailing their technical challenges, design choices, performance gains, and deployment insights.

AI Search · Baidu · Generative Ranking
0 likes · 8 min read
Baidu MEUX
Dec 3, 2025 · User Experience Design

Boost User Research with AI: Automating Short Feedback Classification & Long‑Form Insight Extraction

This article explains how AI large‑language models can automate short user‑feedback classification and extract insights from long interview texts, offering practical prompting tips, fine‑tuning strategies, and Retrieval‑Augmented Generation workflows to make user research faster, more accurate, and less labor‑intensive.

AI · Feedback Classification · Large Language Models
0 likes · 11 min read
macrozheng
Dec 3, 2025 · Databases

How Redis’s New Multithreaded Query Engine Boosts Vector Search Performance

Redis has introduced a multithreaded query engine that dramatically reduces latency and increases throughput—up to 16×—for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications compared to traditional single‑threaded architectures and competing vector databases.

Performance Benchmark · RAG · Redis
0 likes · 6 min read
Yiche Technology
Dec 3, 2025 · Artificial Intelligence

How Milvus Powered a Scalable AI Assistant for Car Queries with Vector Search

This article details how an automotive AI assistant migrated from keyword matching to a Milvus‑based vector retrieval system, overcoming semantic gaps, scaling to millions of daily queries, optimizing indexing, introducing multi‑vector and sparse‑vector search, and building a real‑time RAG pipeline with Flink.

AI Assistant · Milvus · RAG
0 likes · 12 min read
Data STUDIO
Dec 3, 2025 · Artificial Intelligence

Pixeltable: One Table to Power Multimodal AI with Declarative Python

Pixeltable introduces a unified table abstraction that treats images, text, embeddings and model outputs as columns, enabling declarative multimodal AI pipelines, eliminating glue code, supporting built‑in vector indexing, versioned experiments, extensible custom functions, and a concise 30‑line RAG implementation.

Pixeltable · Python · RAG
0 likes · 15 min read
DataFunTalk
Dec 2, 2025 · Artificial Intelligence

How Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking Are Redefining AI Search

This article reviews three cutting‑edge AI search and recommendation techniques—Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's GRAB generative ranking model—detailing their design challenges, multi‑modal retrieval strategies, performance gains, and real‑world deployment results.

AI Search · AI agents · Generative Ranking
0 likes · 8 min read
AsiaInfo Technology: New Tech Exploration
Dec 2, 2025 · Artificial Intelligence

How LLMs Can Revolutionize Test Case Generation: Methods, Benefits, and Challenges

This article examines the shortcomings of manual test case creation, explains how large language models (LLMs) can dramatically improve efficiency, coverage, consistency, and knowledge sharing in software testing, outlines the key capabilities required, and presents a detailed end‑to‑end solution with practical steps, evaluation metrics, and future outlook.

AI automation · Knowledge Base · LLM
0 likes · 20 min read
Instant Consumer Technology Team
Dec 1, 2025 · Artificial Intelligence

Understanding AIGC, RAG, Function Calling, and the MCP Protocol: A Practical AI Guide

This article explains the fundamentals of AI‑generated content (AIGC), the Retrieval‑Augmented Generation (RAG) technique, Function Calling, autonomous agents, and the Model Context Protocol (MCP), highlighting their evolution, technical workflows, limitations, and real‑world examples for developers.

AI · AIGC · Function Calling
0 likes · 19 min read
Fun with Large Models
Nov 30, 2025 · Artificial Intelligence

Multimodal RAG with LangChain: PDF Parsing, Chunking, and Citation Guide

This article walks through building a LangChain‑based multimodal RAG system that parses PDFs (both native and scanned), splits them into semantic chunks, stores embeddings in a vector database, and generates answers with precise source citations, complete with code samples and API integration.

FastAPI · LangChain · PDF parsing
0 likes · 20 min read
Fun with Large Models
Nov 27, 2025 · Artificial Intelligence

Mastering Coze Knowledge Base: A Step‑by‑Step Low‑Code Agent Guide

This article provides a comprehensive, hands‑on guide to Coze's knowledge base, covering its core concepts, key features, practical use‑case scenarios, detailed creation steps, configuration options, prompt design, testing methods, and a comparison with variables, memory, and databases.

Coze · Knowledge Base · Low‑code
0 likes · 15 min read
Old Meng AI Explorer
Nov 27, 2025 · Artificial Intelligence

How UltraRAG Turns RAG Deployment into a Zero‑Code, One‑Click Process

UltraRAG, an open‑source RAG framework co‑developed by Tsinghua and NEUIR, offers a zero‑code WebUI that streamlines data construction, model fine‑tuning, and multi‑dimensional evaluation, boosting retrieval accuracy by up to 30% and cutting deployment time by half across legal, medical, and research use cases.

AI · Open-source · RAG
0 likes · 11 min read
Java Tech Enthusiast
Nov 26, 2025 · Artificial Intelligence

How LLM, RAG, and AI Agents Work Together

The article clarifies how large language models (LLM), retrieval‑augmented generation (RAG), and AI agents complement each other, describing the brain‑like reasoning of LLMs, the dynamic knowledge access provided by RAG, and the autonomous action capabilities of AI agents, plus practical usage scenarios.

AI Agent · Artificial Intelligence · LLM
0 likes · 7 min read
Alibaba Cloud Developer
Nov 26, 2025 · Artificial Intelligence

Unlocking AI-Powered Customer Service: From RAG to Deep Evaluation and Optimization

This article explores how the rapid growth of large language models reshapes intelligent customer service, detailing the evolution from rule‑based NLP bots to Retrieval‑Augmented Generation (RAG) and AI‑native agents, and presents a comprehensive framework for evaluating, diagnosing, and continuously improving chatbot performance using LLM‑driven metrics and context engineering.

AI · Context Engineering · Customer Service
0 likes · 46 min read
PMTalk Product Manager Community
Nov 25, 2025 · Product Management

Avoid the 3 Common AI Product Management Pitfalls: Prompt Engineering, RAG, and Fine‑Tuning

The article examines why AI product managers repeatedly fall into three traps—over‑relying on prompt engineering, blindly adopting Retrieval‑Augmented Generation, or costly fine‑tuning—by presenting real‑world failures, debunking myths, and offering a five‑layer decision framework with cost, data, resource, and risk analysis to choose the right solution.

AI product management · Prompt Engineering · RAG
0 likes · 24 min read
Wu Shixiong's Large Model Academy
Nov 22, 2025 · Artificial Intelligence

Why Your RAG System Slows Down Over Time and How to Fix It

The article explains why a production Retrieval‑Augmented Generation (RAG) system becomes slower as it runs—due to growing embedding costs, expanding vector databases, heavier re‑ranking, and larger prompts—and provides concrete engineering optimizations such as batching, async concurrency, caching, partitioned retrieval, HNSW tuning, replica scaling, answer caching, and prompt sparsification to keep performance stable.

AI Engineering · Performance Optimization · RAG
0 likes · 10 min read
JD Tech Talk
Nov 21, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Retrieval‑Augmented Generation

This article explains why effective chunking is crucial for RAG performance, compares seven major chunking strategies—including fixed‑size, semantic, recursive, document‑structure, agent‑driven, sentence, and paragraph methods—and offers practical guidance on selecting and optimizing chunks for real‑world AI applications.

AI · Chunking · RAG
0 likes · 10 min read
JD Cloud Developers
Nov 21, 2025 · Artificial Intelligence

Why Chunking Strategy Makes or Breaks RAG Performance

This article explains how different chunking methods—fixed size, semantic, recursive, document‑based, agent‑driven, sentence‑level, and paragraph‑level—affect Retrieval‑Augmented Generation, offering practical guidelines, metrics, and optimization tips for real‑world deployments.

AI · Chunking · RAG
0 likes · 9 min read
Wu Shixiong's Large Model Academy
Nov 21, 2025 · Artificial Intelligence

How to Build a Multi‑Layer Cache for Dynamic RAG Systems

This article explains why dynamic Retrieval‑Augmented Generation (RAG) requires a layered caching strategy rather than simple result caching, details a four‑level cache architecture—including embedding, search, answer, and pipeline caches—provides practical key‑generation and TTL guidelines, and outlines dirty‑data defenses to keep caches consistent and performant.

AI Engineering · Caching · LLM
0 likes · 10 min read
Baidu Maps Tech Team
Nov 19, 2025 · Artificial Intelligence

Boosting Socio‑Economic Q&A: The ARAG Framework Merges Structured Data Analysis with RAG

ARAG introduces a novel Retrieval‑Augmented Generation framework that tightly integrates LLM‑driven structured data analysis with unstructured information retrieval, addressing the “structured + unstructured” reasoning gap in socio‑economic queries, and demonstrates superior accuracy, robustness, and hallucination resistance through extensive evaluations.

Data Analysis · Hallucination Mitigation · LLM
0 likes · 12 min read
Wu Shixiong's Large Model Academy
Nov 19, 2025 · Artificial Intelligence

How to Build a Reliable Dynamic Incremental RAG Pipeline for Real‑Time Data

This article explains why dynamic incremental RAG is harder than static RAG, identifies the three main points where recall accuracy breaks, and presents a three‑stage engineering pipeline—including a quality‑control layer, two‑stage retrieval, and reference‑injection generation—to keep real‑time data retrieval both accurate and robust.

AI · Dynamic Data · RAG
0 likes · 13 min read
Alibaba Cloud Developer
Nov 19, 2025 · Artificial Intelligence

Building an AI-Powered Proofreading Agent for Media: Architecture, Prompt Engineering, and Evaluation

This article details a practical case study of designing, implementing, and evaluating an AI-driven proofreading agent for a media client, covering background challenges, a three‑layer architecture, prompt engineering techniques, RAG knowledge‑base construction, model selection, fine‑tuning, automated metrics, and lessons learned.

AI · Large Language Model · Proofreading
0 likes · 26 min read
JakartaEE China Community
Nov 18, 2025 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This article explains why Retrieval‑Augmented Generation improves LLM accuracy, outlines the key Langchain4j and Ollama 3 components, and provides a step‑by‑step Java example (Maven setup, document ingestion, embedding, similarity search, prompt creation, and response generation) that demonstrates a functional RAG pipeline.

Embedding · Java · LLM
0 likes · 8 min read
Alipay Experience Technology
Nov 18, 2025 · Mobile Development

Boosting KMP Native Cross‑Platform Development with AI Agents: Real‑World Practices

This article details how Alipay's engineering team built an AI‑Agent‑powered coding assistant for Kotlin Multiplatform (KMP) native cross‑platform development, covering architecture, UI generation from designs and images, RAG‑based knowledge retrieval, crash analysis, and future directions for AI‑driven software engineering.

AI Agent · Compose Multiplatform · KMP
0 likes · 20 min read
Wu Shixiong's Large Model Academy
Nov 16, 2025 · Artificial Intelligence

How to Slash RAG First‑Token Latency: Practical Engineering Strategies

This guide breaks down the three layers of a RAG pipeline—embedding, vector retrieval, and system architecture—and provides concrete engineering tactics such as batch embedding, async concurrency, caching, ANN indexing, partitioning, connection pooling, and async pipelines to dramatically reduce Time‑to‑First‑Token latency.

Async Pipeline · Embedding · RAG
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 14, 2025 · Artificial Intelligence

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

This article explains why function‑call accuracy is critical for LLM agents, identifies four common failure causes, and presents a systematic, five‑step engineering framework—including dynamic routing, chain‑of‑thought planning, result validation, memory injection, and log‑driven optimization—backed by concrete examples and quantitative improvements.

Function Calling · Interview preparation · LLM
0 likes · 10 min read
DataFunTalk
Nov 11, 2025 · Artificial Intelligence

How Alibaba Cloud’s AI Search Redefines Vector Retrieval and RAG

This article outlines the evolution of Alibaba Cloud AI Search, covering its two product lines (enhanced Elasticsearch and self‑developed OpenSearch), key Agentic RAG technologies, serverless architecture, vector and LLM‑driven search capabilities, and future directions in AI‑powered search.

AI Search · Alibaba Cloud · Elasticsearch
0 likes · 4 min read
DaTaobao Tech
Nov 10, 2025 · Artificial Intelligence

How Tmall’s AI Transforms Test Case Generation for Faster, Smarter QA

This article details Tmall's technology team's deep AI‑driven testing practice, outlining industry challenges, the need for intelligent test case generation, and a comprehensive strategy that combines prompt engineering, RAG‑based knowledge bases, and platform integration to boost coverage, reduce manual effort, and accelerate release cycles.

AI testing · Knowledge Base · Large Language Models
0 likes · 10 min read
Data Party THU
Nov 9, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Effective RAG: Fixed, Recursive, Semantic, Structured, and Delayed

This article walks through the core RAG pipeline, explains why chunking is the linchpin of retrieval quality, and provides detailed definitions, trade‑offs, and implementation examples for five chunking techniques—fixed, recursive, semantic, structure‑aware, and delayed—so you can choose the right approach for any document‑heavy AI application.

AI · Chunking · LLM
0 likes · 10 min read
DataFunSummit
Nov 8, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World AI Solutions with RAG and Agents

This article examines Tencent's large language model deployments across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑playing, while deep‑diving into the RAG, GraphRAG, and Agent technologies that enable smarter, more reliable AI applications.

AI · Agent · LLM
0 likes · 4 min read