Topic: RAG
Collection size: 167 articles
JD Tech Talk
Jul 16, 2024 · Artificial Intelligence

Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models

TaD, a task‑aware decoding technique jointly developed by JD.com and Tsinghua University and presented at IJCAI 2024, leverages differences between pre‑ and post‑fine‑tuned LLM outputs to construct knowledge vectors, significantly reducing hallucinations across various models, tasks, and data‑scarce scenarios, especially when combined with RAG.

AI · LLM · RAG
0 likes · 18 min read
Rare Earth Juejin Tech Community
Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems, detailing common failure points, architectural components such as authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations, and provides actionable best‑practice recommendations for developers and technical leaders.

Enterprise AI · LLM · Observability
0 likes · 32 min read
Rare Earth Juejin Tech Community
Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI Bot · Coze · LLM
0 likes · 25 min read
Architect
Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG
0 likes · 11 min read
Architect
Aug 2, 2024 · Artificial Intelligence

Building AI‑Native Applications with Spring AI: A Complete Tutorial

This article explains how to quickly develop an AI‑native application using Spring AI, covering core features such as chat models, prompt templates, function calling, structured output, image generation, embedding, vector stores, and Retrieval‑Augmented Generation (RAG), and provides end‑to‑end Java code examples for building a simple AI‑driven service.

AI native · Backend · Java
0 likes · 40 min read
DataFunSummit
Jan 26, 2025 · Artificial Intelligence

ChatBI in Automotive Enterprises: Challenges, Architecture, and Implementation

This article examines the rise of ChatBI in automotive companies, outlining current BI challenges (summarized as the “five no's” and “five difficulties”), the motivations for adopting ChatBI, its evolving architecture, and practical implementation steps toward data‑driven decision making.

AI · Automotive · ChatBI
0 likes · 17 min read
DataFunSummit
Jan 22, 2025 · Artificial Intelligence

RAG2.0 Engine Design Challenges and Implementation

This article presents a comprehensive overview of the RAG2.0 engine design, covering RAG1.0 limitations, effective chunking methods, accurate retrieval techniques, advanced multimodal processing, hybrid search strategies, database indexing choices, and future directions such as agentic RAG and memory‑enhanced models.

Chunking · Hybrid Search · Multimodal
0 likes · 23 min read
DataFunSummit
Jan 11, 2025 · Artificial Intelligence

Generative AI Applications, MLOps, and LLMOps: A Comprehensive Overview

This article presents a detailed overview of generative AI lifecycle management, covering practical use cases such as email summarization, the roles of providers, fine‑tuners and consumers, MLOps/LLMOps processes, retrieval‑augmented generation, efficient fine‑tuning methods like PEFT, and Amazon Bedrock services for model deployment and monitoring.

Amazon Bedrock · Generative AI · LLMOps
0 likes · 14 min read
DataFunSummit
Oct 27, 2024 · Artificial Intelligence

How Siemens Harnesses Generative AI to Build the Enterprise Knowledge Chatbot “XiaoYu”

This article describes Siemens' journey in applying generative AI and Retrieval‑Augmented Generation to create an internal knowledge chatbot, detailing the business challenges, technical architecture, data integration, multi‑modal capabilities, deployment outcomes, and strategic lessons for enterprise AI adoption.

AI chatbot · Enterprise Knowledge Management · Generative AI
0 likes · 21 min read
DataFunSummit
Sep 4, 2024 · Artificial Intelligence

How Elasticsearch Powers Retrieval‑Augmented Generation (RAG) Applications

This article explains how Elasticsearch’s advanced search capabilities—including vector and semantic search, hardware acceleration, hybrid retrieval, model re‑ranking, multi‑vector support, and integrated security—enable robust RAG implementations and outlines future directions such as a new compute engine, stronger vector engines, and cloud‑native serverless deployment.

AI · Elasticsearch · Hybrid Search
0 likes · 9 min read
DataFunSummit
Aug 29, 2024 · Artificial Intelligence

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end‑to‑end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM‑driven dialogue, knowledge‑augmented retrieval, long‑context handling, safety measures, multimodal expression (voice and facial animation), and system‑level performance optimizations for real‑time deployment.

AI · LLM · Multimodal
0 likes · 18 min read
DataFunTalk
Mar 15, 2024 · Artificial Intelligence

NVIDIA’s NeMo Framework and TensorRT‑LLM: Full‑Stack Solutions for Large Language Models and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end ecosystem for large language models, covering the NeMo Framework’s data processing, distributed training, model fine‑tuning, inference acceleration with TensorRT‑LLM, deployment via Triton, and Retrieval‑Augmented Generation (RAG) techniques that enhance model reliability and performance.

AI · NVIDIA · NeMo
0 likes · 16 min read
DataFunTalk
Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases are gaining traction because they sharply reduce the costs of storage, learning, and scaling, as well as the cost of working around large‑model limitations, by enabling semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures. The article presents them as essential for modern AI applications, though it is written in a promotional context.

AI · RAG · big data
0 likes · 11 min read
Architecture & Thinking
Jun 19, 2024 · Artificial Intelligence

Build AI‑Native Apps Quickly with Spring AI: From Chat Models to RAG

This guide explains what an AI‑native application is, compares AI‑native and AI‑based approaches, and walks through Spring AI’s core features—including chat models, prompt templates, function calling, structured output, image generation, embedding, and vector stores—showing step‑by‑step code examples and how to assemble a complete AI‑native app with RAG support.

AI native application · Java · Prompt Engineering
0 likes · 43 min read
Java Architecture Diary
Feb 13, 2025 · Artificial Intelligence

Create a Java RAG System Using DeepSeek R1, Milvus, and Spring

This guide walks through building a Java RAG system with DeepSeek R1, Milvus, and Spring, covering environment setup, vector model integration via OpenAI protocol, Maven dependencies, data embedding, and a chat endpoint that combines semantic retrieval with LLM generation.

AI integration · DeepSeek · Java
0 likes · 11 min read
macrozheng
Jan 20, 2025 · Artificial Intelligence

How Redis’s New Multithreaded Query Engine Boosts Vector Search for Real‑Time AI Apps

Redis has introduced a multithreaded query engine that dramatically lowers latency and multiplies throughput for vector‑based retrieval, enabling real‑time RAG applications to approach the 100 ms response target while scaling vertically to billions of documents.

AI performance · Multithreading · RAG
0 likes · 6 min read
Instant Consumer Technology Team
Jun 12, 2025 · Artificial Intelligence

How to Build a Production-Ready RAG System with Qwen3 Embedding and Reranker Models

This guide walks through using Alibaba's new Qwen3-Embedding and Qwen3-Reranker models to build a two‑stage Retrieval‑Augmented Generation pipeline with Milvus, covering environment setup, data ingestion, vector indexing, reranking, and LLM‑driven answer generation, demonstrating production‑grade performance across multilingual queries.

LLM · Milvus · Python
0 likes · 19 min read
Instant Consumer Technology Team
Jun 4, 2025 · Artificial Intelligence

Unlocking Retrieval-Augmented Generation: Theory, Practice, and Future Trends

This comprehensive article examines Retrieval‑Augmented Generation (RAG), covering its historical evolution, core theory, implementation variants, practical code examples, diverse applications, current controversies, and future research directions within the AI and NLP landscape.

Generative Models · Natural Language Processing · RAG
0 likes · 21 min read
Instant Consumer Technology Team
May 29, 2025 · Artificial Intelligence

How to Build an Agent‑Powered Financial Q&A System with RAG and SQL

This article explains how to construct a financial question‑answering agent that automatically decides between SQL queries and RAG retrieval, covering intent recognition, tool creation, prompt design, agent initialization, and end‑to‑end testing with Python code.

LangChain · Python · RAG
0 likes · 13 min read
Qunhe Technology Quality Tech
Aug 14, 2024 · Artificial Intelligence

Should Your Testing Team Build a Private LLM or Use RAG with a General Model?

This article compares the high costs and technical challenges of building a private large language model with the benefits, flexibility, and lower risk of using Retrieval‑Augmented Generation (RAG) on a general LLM, offering practical guidance for testing teams seeking AI assistance.

AI · Model Deployment · RAG
0 likes · 11 min read