Artificial Intelligence · 10 min read

Retrieval‑Augmented Generation (RAG) with LangChain: Concepts and Python Implementation

Retrieval‑Augmented Generation (RAG) with LangChain lets developers enhance large language models without fine‑tuning: the user query is embedded, relevant documents are fetched from a vector store, the retrieved context is inserted into a prompt template, and the model generates a concise, source‑grounded answer. The result is low‑cost, up‑to‑date knowledge with fewer hallucinations.

iKang Technology Team

RAG (Retrieval‑Augmented Generation) is a technique that enhances large language models (LLMs) with external knowledge sources to improve answer accuracy and reduce hallucinations.

The article first explains why fine‑tuning is costly and how RAG offers a low‑cost, fast alternative. It then describes the three‑step RAG workflow: retrieval, augmentation, and generation.

Retrieval: the user query is embedded and matched against a vector database to fetch the most relevant documents.

Augmentation: the retrieved context is inserted into a prompt template.

Generation: the prompt with context is fed to the LLM to produce the final answer.
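The three steps can be sketched end to end with a toy example. The bag‑of‑words "embedding" and cosine ranking below are purely illustrative (real pipelines use learned embeddings and a vector database), and all names are made up for this sketch:

```python
import re
from collections import Counter
from math import sqrt

# A miniature "vector store": two documents held in memory.
documents = [
    "The president praised Justice Breyer for his service.",
    "The budget proposal increases funding for renewable energy.",
]

def embed(text):
    # Toy embedding: a bag-of-words count vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query):
    # Retrieval: rank stored documents by similarity to the query vector.
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def augment(query, context):
    # Augmentation: insert the retrieved context into a prompt template.
    return f"Question: {query}\nContext: {context}\nAnswer:"

query = "What did the president say about Justice Breyer?"
prompt = augment(query, retrieve(query))
# Generation: the filled prompt would now be sent to the LLM.
```

Even at this scale, the structure matches the real pipeline: the only pieces LangChain swaps in are production-grade embeddings, storage, and the model call.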

Implementation using LangChain (an open‑source AI application framework) and Python is demonstrated. The steps include:

Loading raw data (text, PDF, CSV, etc.) with appropriate document loaders.

Chunking documents into smaller pieces using CharacterTextSplitter.

Embedding chunks with OpenAI embeddings and storing them in a Weaviate vector store.

Creating a retriever from the vector store.

Defining a prompt template and assembling a RAG chain that connects the retriever, prompt, and an OpenAI chat model.
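To build intuition for the chunk_size and chunk_overlap parameters used in the snippets, here is a simplified fixed-stride splitter in plain Python. This is an approximation: CharacterTextSplitter actually splits on a separator and then merges pieces, rather than slicing at fixed offsets.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    # Each chunk starts chunk_size - chunk_overlap characters after the
    # previous one, so consecutive chunks share chunk_overlap characters
    # of context across the boundary.
    stride = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), stride)]

# A 1200-character input yields chunks starting at offsets 0, 450, 900.
chunks = split_text("".join(chr(97 + i % 26) for i in range(1200)))
```

The overlap keeps a sentence that straddles a chunk boundary fully visible in at least one chunk, which helps retrieval quality.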

Key code snippets:

# Load the raw document: download the sample text and read it with TextLoader.
import requests
from langchain.document_loaders import TextLoader

url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)

loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()

# Chunk the document into 500-character pieces with 50 characters of overlap.
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# Embed the chunks with OpenAI embeddings and store them in an
# embedded (locally running) Weaviate instance.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Weaviate
import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(embedded_options=EmbeddedOptions())
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    by_text=False,
)

# Expose the vector store as a retriever.
retriever = vectorstore.as_retriever()

# Define the prompt template that receives the retrieved context.
from langchain.prompts import ChatPromptTemplate

template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:"""
prompt = ChatPromptTemplate.from_template(template)

# Assemble the RAG chain: retriever -> prompt -> chat model -> string output.
from langchain.chat_models import ChatOpenAI
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

query = "What did the president say about Justice Breyer"
rag_chain.invoke(query)
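To see what the chain expression is doing, here is a plain-Python mimic of its data flow. The retriever and model are stubs invented for this sketch (a real run calls Weaviate and OpenAI); the dict at the head of the chain fans the input query out to both branches in parallel before piping the result through the prompt, model, and parser:

```python
# Mimics: {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser()

def fake_retriever(query):
    # Stub: a real retriever returns the top-matching chunks from the store.
    return "He thanked Justice Breyer for his service."

def fake_llm(prompt):
    # Stub: a real chat model generates an answer grounded in the Context.
    return "The president thanked Justice Breyer for his service."

TEMPLATE = "Question: {question}\nContext: {context}\nAnswer:"

def rag_chain(query):
    # 1. Parallel dict: fan the input out to both branches;
    #    RunnablePassthrough just forwards the query unchanged.
    inputs = {"context": fake_retriever(query), "question": query}
    # 2. Prompt: fill the template with both values.
    prompt = TEMPLATE.format(**inputs)
    # 3. LLM call, then 4. output parsing (here: already a plain string).
    return fake_llm(prompt)

answer = rag_chain("What did the president say about Justice Breyer")
```

The pipe operator in LangChain composes these four stages into a single runnable, which is why one invoke call drives the whole pipeline.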

The example query returns a concise answer derived from the retrieved context, demonstrating how RAG can provide up‑to‑date, verifiable information while reducing hallucinations and training costs.

Benefits of using RAG with LLMs include:

Access to the latest and most accurate content with source traceability.

Reduced risk of hallucinations and leakage of sensitive data.

Lower financial overhead by avoiding frequent model retraining.

Python · LLM · LangChain · RAG · vector database · retrieval
Written by iKang Technology Team

The iKang tech team shares their technical and practical experiences in medical‑health projects.