Artificial Intelligence · 10 min read

Retrieval‑Augmented Generation (RAG) with LangChain: Concepts and Python Implementation

Retrieval‑Augmented Generation (RAG) with LangChain lets developers enhance large language models without fine‑tuning: the user query is embedded, relevant documents are fetched from a vector store, the retrieved context is inserted into a prompt template, and the model generates a concise, source‑grounded answer. The result is low‑cost, up‑to‑date knowledge with fewer hallucinations.

iKang Technology Team

RAG (Retrieval‑Augmented Generation) is a technique that enhances large language models (LLMs) with external knowledge sources to improve answer accuracy and reduce hallucinations.

The article first explains why fine‑tuning is costly and how RAG offers a low‑cost, fast alternative. It then describes the three‑step RAG workflow: retrieval, augmentation, and generation.

Retrieval: the user query is embedded and matched against a vector database to fetch the most relevant documents.

Augmentation: the retrieved context is inserted into a prompt template.

Generation: the prompt with context is fed to the LLM to produce the final answer.
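The three steps can be sketched end to end with a toy example. The bag‑of‑words "embedding" and cosine ranking below are purely illustrative (real pipelines use learned embeddings and a vector database), and all names are made up for this sketch:

```python
import re
from collections import Counter
from math import sqrt

# A miniature "vector store": two documents held in memory.
documents = [
    "The president praised Justice Breyer for his service.",
    "The budget proposal increases funding for renewable energy.",
]

def embed(text):
    # Toy embedding: a bag-of-words count vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query):
    # Retrieval: rank stored documents by similarity to the query vector.
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def augment(query, context):
    # Augmentation: insert the retrieved context into a prompt template.
    return f"Question: {query}\nContext: {context}\nAnswer:"

query = "What did the president say about Justice Breyer?"
prompt = augment(query, retrieve(query))
# Generation: the filled prompt would now be sent to the LLM.
```

Even at this scale, the structure matches the real pipeline: the only pieces LangChain swaps in are production-grade embeddings, storage, and the model call.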

Implementation using LangChain (an open‑source AI application framework) and Python is demonstrated. The steps include:

Loading raw data (text, PDF, CSV, etc.) with appropriate document loaders.

Chunking documents into smaller pieces using CharacterTextSplitter.

Embedding chunks with OpenAI embeddings and storing them in a Weaviate vector store.

Creating a retriever from the vector store.

Defining a prompt template and assembling a RAG chain that connects the retriever, prompt, and an OpenAI chat model.
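To build intuition for the chunk_size and chunk_overlap parameters used in the snippets, here is a simplified fixed-stride splitter in plain Python. This is an approximation: CharacterTextSplitter actually splits on a separator and then merges pieces, rather than slicing at fixed offsets.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    # Each chunk starts chunk_size - chunk_overlap characters after the
    # previous one, so consecutive chunks share chunk_overlap characters
    # of context across the boundary.
    stride = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), stride)]

# A 1200-character input yields chunks starting at offsets 0, 450, 900.
chunks = split_text("".join(chr(97 + i % 26) for i in range(1200)))
```

The overlap keeps a sentence that straddles a chunk boundary fully visible in at least one chunk, which helps retrieval quality.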

Key code snippets:

# Load the raw document: download the sample text and read it with TextLoader.
import requests
from langchain.document_loaders import TextLoader

url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)

loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()

# Chunk the document into 500-character pieces with 50 characters of overlap.
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# Embed the chunks with OpenAI embeddings and store them in an
# embedded (locally running) Weaviate instance.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Weaviate
import weaviate
from weaviate.embedded import EmbeddedOptions

client = weaviate.Client(embedded_options=EmbeddedOptions())
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    by_text=False,
)

# Expose the vector store as a retriever.
retriever = vectorstore.as_retriever()

# Define the prompt template that receives the retrieved context.
from langchain.prompts import ChatPromptTemplate

template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:"""
prompt = ChatPromptTemplate.from_template(template)

# Assemble the RAG chain: retriever -> prompt -> chat model -> string output.
from langchain.chat_models import ChatOpenAI
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

query = "What did the president say about Justice Breyer"
rag_chain.invoke(query)
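To see what the chain expression is doing, here is a plain-Python mimic of its data flow. The retriever and model are stubs invented for this sketch (a real run calls Weaviate and OpenAI); the dict at the head of the chain fans the input query out to both branches in parallel before piping the result through the prompt, model, and parser:

```python
# Mimics: {"context": retriever, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser()

def fake_retriever(query):
    # Stub: a real retriever returns the top-matching chunks from the store.
    return "He thanked Justice Breyer for his service."

def fake_llm(prompt):
    # Stub: a real chat model generates an answer grounded in the Context.
    return "The president thanked Justice Breyer for his service."

TEMPLATE = "Question: {question}\nContext: {context}\nAnswer:"

def rag_chain(query):
    # 1. Parallel dict: fan the input out to both branches;
    #    RunnablePassthrough just forwards the query unchanged.
    inputs = {"context": fake_retriever(query), "question": query}
    # 2. Prompt: fill the template with both values.
    prompt = TEMPLATE.format(**inputs)
    # 3. LLM call, then 4. output parsing (here: already a plain string).
    return fake_llm(prompt)

answer = rag_chain("What did the president say about Justice Breyer")
```

The pipe operator in LangChain composes these four stages into a single runnable, which is why one invoke call drives the whole pipeline.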

The example query returns a concise answer derived from the retrieved context, demonstrating how RAG can provide up‑to‑date, verifiable information while reducing hallucinations and training costs.

Benefits of using RAG with LLMs include:

Access to the latest and most accurate content with source traceability.

Reduced risk of hallucinations and leakage of sensitive data.

Lower financial overhead by avoiding frequent model retraining.

Python · LLM · LangChain · RAG · vector database · retrieval
Written by iKang Technology Team

The iKang tech team shares their technical and practical experiences in medical‑health projects.