RAG | BestHub

Collection size

167 articles


JD Tech Talk

Jul 16, 2024 · Artificial Intelligence

Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models

TaD, a task‑aware decoding technique jointly developed by JD.com and Tsinghua University and presented at IJCAI 2024, leverages differences between pre‑ and post‑fine‑tuned LLM outputs to construct knowledge vectors, significantly reducing hallucinations across various models, tasks, and data‑scarce scenarios, especially when combined with RAG.

AI · LLM · RAG

0 likes · 18 min read


Rare Earth Juejin Tech Community

Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems. It details common failure points and the main architectural components (authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations) and provides actionable best‑practice recommendations for developers and technical leaders.

Enterprise AI · LLM · Observability

0 likes · 32 min read


Rare Earth Juejin Tech Community

Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI Bot · Coze · LLM

0 likes · 25 min read


Architect

Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG

0 likes · 11 min read


Architect

Aug 2, 2024 · Artificial Intelligence

Building AI‑Native Applications with Spring AI: A Complete Tutorial

This article explains how to quickly develop an AI‑native application using Spring AI, covering core features such as chat models, prompt templates, function calling, structured output, image generation, embedding, vector stores, and Retrieval‑Augmented Generation (RAG), and provides end‑to‑end Java code examples for building a simple AI‑driven service.

AI native · Backend · Java

0 likes · 40 min read


DataFunSummit

Jan 26, 2025 · Artificial Intelligence

ChatBI in Automotive Enterprises: Challenges, Architecture, and Implementation

This article examines the rise of ChatBI in automotive companies, outlining current BI challenges (summarized as the “five no's” and the “five difficulties”), the motivations for adopting ChatBI, its evolving architecture, and practical implementation steps toward data‑driven decision making.

AI · Automotive · ChatBI

0 likes · 17 min read


DataFunSummit

Jan 22, 2025 · Artificial Intelligence

RAG2.0 Engine Design Challenges and Implementation

This article presents a comprehensive overview of the RAG2.0 engine design, covering RAG1.0 limitations, effective chunking methods, accurate retrieval techniques, advanced multimodal processing, hybrid search strategies, database indexing choices, and future directions such as agentic RAG and memory‑enhanced models.

Chunking · Hybrid Search · Multimodal

0 likes · 23 min read


DataFunSummit

Jan 11, 2025 · Artificial Intelligence

Generative AI Applications, MLOps, and LLMOps: A Comprehensive Overview

This article presents a detailed overview of generative AI lifecycle management, covering practical use cases such as email summarization, the roles of providers, fine‑tuners and consumers, MLOps/LLMOps processes, retrieval‑augmented generation, efficient fine‑tuning methods like PEFT, and Amazon Bedrock services for model deployment and monitoring.

Amazon Bedrock · Generative AI · LLMOps

0 likes · 14 min read


DataFunSummit

Oct 27, 2024 · Artificial Intelligence

How Siemens Harnesses Generative AI to Build the Enterprise Knowledge Chatbot “XiaoYu”

This article describes Siemens' journey in applying generative AI and Retrieval‑Augmented Generation to create an internal knowledge chatbot, detailing the business challenges, technical architecture, data integration, multi‑modal capabilities, deployment outcomes, and strategic lessons for enterprise AI adoption.

AI chatbot · Enterprise Knowledge Management · Generative AI

0 likes · 21 min read


DataFunSummit

Sep 4, 2024 · Artificial Intelligence

How Elasticsearch Powers Retrieval‑Augmented Generation (RAG) Applications

This article explains how Elasticsearch’s advanced search capabilities—including vector and semantic search, hardware acceleration, hybrid retrieval, model re‑ranking, multi‑vector support, and integrated security—enable robust RAG implementations and outlines future directions such as a new compute engine, stronger vector engines, and cloud‑native serverless deployment.

AI · Elasticsearch · Hybrid Search

0 likes · 9 min read


DataFunSummit

Aug 29, 2024 · Artificial Intelligence

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end‑to‑end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM‑driven dialogue, knowledge‑augmented retrieval, long‑context handling, safety measures, multimodal expression (voice and facial animation), and system‑level performance optimizations for real‑time deployment.

AI · LLM · Multimodal

0 likes · 18 min read


DataFunTalk

Mar 15, 2024 · Artificial Intelligence

NVIDIA’s NeMo Framework and TensorRT‑LLM: Full‑Stack Solutions for Large Language Models and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end ecosystem for large language models, covering the NeMo Framework’s data processing, distributed training, model fine‑tuning, inference acceleration with TensorRT‑LLM, deployment via Triton, and Retrieval‑Augmented Generation (RAG) techniques that enhance model reliability and performance.

AI · NVIDIA · NeMo

0 likes · 16 min read


DataFunTalk

Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases are gaining traction because they sharply reduce the costs of storage, learning, and scaling, as well as the cost of working around large‑model limitations. They do so through semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures, which makes them essential for modern AI applications, although the article is written in a promotional context.

AI · RAG · big data

0 likes · 11 min read


Architecture & Thinking

Jun 19, 2024 · Artificial Intelligence

Build AI‑Native Apps Quickly with Spring AI: From Chat Models to RAG

This guide explains what an AI‑native application is, compares AI‑native and AI‑based approaches, and walks through Spring AI’s core features—including chat models, prompt templates, function calling, structured output, image generation, embedding, and vector stores—showing step‑by‑step code examples and how to assemble a complete AI‑native app with RAG support.

AI native application · Java · Prompt Engineering

0 likes · 43 min read


Java Architecture Diary

Feb 13, 2025 · Artificial Intelligence

Create a Java RAG System Using DeepSeek R1, Milvus, and Spring

This guide walks through building a Java RAG system with DeepSeek R1, Milvus, and Spring, covering environment setup, vector model integration via the OpenAI protocol, Maven dependencies, data embedding, and a chat endpoint that combines semantic retrieval with LLM generation.

AI integration · DeepSeek · Java

0 likes · 11 min read


macrozheng

Jan 20, 2025 · Artificial Intelligence

How Redis’s New Multithreaded Query Engine Boosts Vector Search for Real‑Time AI Apps

Redis has introduced a multithreaded query engine that dramatically lowers latency and multiplies throughput for vector‑based retrieval, enabling real‑time RAG applications to approach the 100 ms response target while scaling vertically to billions of documents.

AI performance · Multithreading · RAG

0 likes · 6 min read


Instant Consumer Technology Team

Jun 12, 2025 · Artificial Intelligence

How to Build a Production-Ready RAG System with Qwen3 Embedding and Reranker Models

This guide walks through using Alibaba's new Qwen3-Embedding and Qwen3-Reranker models to build a two‑stage Retrieval‑Augmented Generation pipeline with Milvus, covering environment setup, data ingestion, vector indexing, reranking, and LLM‑driven answer generation, demonstrating production‑grade performance across multilingual queries.

LLM · Milvus · Python

0 likes · 19 min read


Instant Consumer Technology Team

Jun 4, 2025 · Artificial Intelligence

Unlocking Retrieval-Augmented Generation: Theory, Practice, and Future Trends

This comprehensive article examines Retrieval‑Augmented Generation (RAG), covering its historical evolution, core theory, implementation variants, practical code examples, diverse applications, current controversies, and future research directions within the AI and NLP landscape.

Generative Models · Natural Language Processing · RAG

0 likes · 21 min read


Instant Consumer Technology Team

May 29, 2025 · Artificial Intelligence

How to Build an Agent‑Powered Financial Q&A System with RAG and SQL

This article explains how to construct a financial question‑answering agent that automatically decides between SQL queries and RAG retrieval, covering intent recognition, tool creation, prompt design, agent initialization, and end‑to‑end testing with Python code.

LangChain · Python · RAG

0 likes · 13 min read


Qunhe Technology Quality Tech

Aug 14, 2024 · Artificial Intelligence

Should Your Testing Team Build a Private LLM or Use RAG with a General Model?

This article compares the high costs and technical challenges of building a private large language model with the benefits, flexibility, and lower risk of using Retrieval‑Augmented Generation (RAG) on a general LLM, offering practical guidance for testing teams seeking AI assistance.

AI · Model Deployment · RAG

0 likes · 11 min read
