Getting Started with LangChain in Java: Building Large Language Model Applications
This tutorial introduces the fundamentals of LangChain, explains large language models, prompt engineering, word embeddings, and demonstrates how to use the Java implementation LangChain4j with Maven dependencies, model I/O, memory, retrieval, chains, and agents to build sophisticated LLM‑driven applications.
1. Introduction
In this tutorial we explore LangChain, a framework for developing applications powered by large language models (LLMs). We first review basic concepts of language models that will help in understanding the rest of the guide.
Although LangChain primarily provides Python and JavaScript/TypeScript versions, it can also be used from Java. We will discuss the building blocks of LangChain and experiment with them in Java.
2. Background
Before diving into why a framework is needed for LLM‑based applications, we clarify what a language model is and the typical complexities involved when using them.
2.1. Large Language Models
A language model is a probabilistic model of natural language that can generate sequences of words. Large language models (LLMs) are massive neural networks with billions of parameters, typically pretrained on vast amounts of unlabelled data using self‑supervised and semi‑supervised learning, then fine‑tuned and prompt‑engineered for specific tasks.
LLMs can perform translation, summarisation, content generation and many other NLP tasks, and are offered by major cloud providers such as Microsoft Azure (Llama 2, GPT‑4) and Amazon Bedrock.
2.2. Prompt Engineering
Prompt engineering is a quick, training‑free way to steer an LLM toward a specific task by providing structured text that describes the task. It enables in‑context learning, safer model usage, and the integration of domain knowledge and external tools.
Techniques such as chain‑of‑thought prompting are becoming popular, allowing the model to break a problem into intermediate steps before producing a final answer.
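As a minimal illustration of the idea (the prompt wording here is our own, not part of any LangChain API), a chain‑of‑thought prompt simply instructs the model to show intermediate reasoning before its final answer:

```java
public class ChainOfThoughtPrompt {

    // Wraps a question in a chain-of-thought instruction so that the model
    // produces intermediate reasoning steps before stating its final answer.
    static String wrap(String question) {
        return "Think step by step and show your intermediate reasoning, "
                + "then state the final answer.\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        System.out.println(wrap(
                "A train travels 60 km in 45 minutes. What is its average speed in km/h?"));
    }
}
```

The wrapped prompt can then be sent to any chat model; the instruction nudges the model to decompose the arithmetic into steps instead of guessing a single-token answer.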
2.3. Word Embeddings
Representing words as dense vectors (embeddings) improves model performance. Common algorithms include Word2Vec and GloVe, which encode semantic information that can be used for similarity search and as additional context for LLMs.
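Similarity search over embeddings typically relies on cosine similarity between vectors. The toy 3‑dimensional vectors below are invented for illustration (real embedding models produce hundreds of dimensions), but the computation is the standard one:

```java
public class CosineSimilarity {

    // Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).
    // Ranges from -1 (opposite direction) to 1 (same direction).
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings": semantically related words end up with
        // nearby vectors, unrelated words with distant ones.
        double[] king = {0.8, 0.6, 0.1};
        double[] queen = {0.7, 0.7, 0.2};
        double[] banana = {0.1, 0.2, 0.9};
        System.out.printf("king~queen:  %.3f%n", cosine(king, queen));
        System.out.printf("king~banana: %.3f%n", cosine(king, banana));
    }
}
```

Vector stores used later in this guide perform exactly this kind of comparison, just at scale and over high‑dimensional vectors.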
3. Building an LLM Tech Stack with LangChain
Effective prompting is key to leveraging LLMs. LangChain simplifies tasks such as creating prompt templates, invoking language models, and feeding user‑specific data from various sources.
The framework also supports chaining multiple model calls, memory of past interactions, logging, monitoring, streaming, and other maintenance tasks.
4. LangChain for Java
LangChain was launched as an open‑source project in 2022, initially in Python. A Java version called LangChain4j is community‑maintained, compatible with Java 8+ and Spring Boot 2/3. Add the following Maven dependency to use it:
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<version>0.23.0</version>
</dependency>
5. LangChain Building Blocks
LangChain offers modular components that abstract common LLM operations.
5.1. Model I/O
LangChain provides prompt templating and dynamic model input handling. Example of a prompt template in Java:
PromptTemplate promptTemplate = PromptTemplate.from("Tell me a {{adjective}} joke about {{content}}.");
Map<String, Object> variables = new HashMap<>();
variables.put("adjective", "funny");
variables.put("content", "computers");
Prompt prompt = promptTemplate.apply(variables);
5.2. Memory
Memory stores past conversation turns so the model can reference earlier information. Example using a token‑window chat memory:
ChatMemory chatMemory = TokenWindowChatMemory.withMaxTokens(300, new OpenAiTokenizer(GPT_3_5_TURBO));
chatMemory.add(userMessage("Hello, my name is Kumar"));
AiMessage answer = model.generate(chatMemory.messages()).content();
System.out.println(answer.text()); // Hello Kumar! How can I help you today?
chatMemory.add(answer);
chatMemory.add(userMessage("What is my name?"));
AiMessage answerWithName = model.generate(chatMemory.messages()).content();
System.out.println(answerWithName.text()); // Your name is Kumar.
chatMemory.add(answerWithName);
5.3. Retrieval
Retrieval‑augmented generation (RAG) fetches relevant external data and feeds it to the LLM. Typical steps include loading documents, splitting them, embedding the chunks, storing embeddings in a vector store, and performing semantic search.
Document document = FileSystemDocumentLoader.loadDocument("simpson's_adventures.txt");
DocumentSplitter splitter = DocumentSplitters.recursive(100, 0, new OpenAiTokenizer(GPT_3_5_TURBO));
List<TextSegment> segments = splitter.split(document);
EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
embeddingStore.addAll(embeddings, segments);
String question = "Who is Simpson?";
Embedding questionEmbedding = embeddingModel.embed(question).content();
List<EmbeddingMatch<TextSegment>> relevantEmbeddings = embeddingStore.findRelevant(questionEmbedding, 3, 0.7);
// Use relevantEmbeddings as context for the LLM
6. Advanced LangChain Applications
6.1. Chains
Chains allow sequential execution of multiple components. Example of a conversational retrieval chain:
ConversationalRetrievalChain chain = ConversationalRetrievalChain.builder()
.chatLanguageModel(chatModel)
.retriever(EmbeddingStoreRetriever.from(embeddingStore, embeddingModel))
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.promptTemplate(PromptTemplate.from("Answer the following question to the best of your ability: {{question}}\n\nBase your answer on the following information:\n{{information}}"))
.build();
String answer = chain.execute("Who is Simpson?");
6.2. Agents
Agents treat the LLM as a reasoning engine that decides which actions to take and in what order, optionally granting access to external tools.
In LangChain4j, an AI service can be defined with tools such as a calculator:
public class AIServiceWithCalculator {
static class Calculator {
@Tool("Calculates the length of a string")
int stringLength(String s) { return s.length(); }
@Tool("Calculates the sum of two numbers")
int add(int a, int b) { return a + b; }
}
}
interface Assistant { String chat(String userMessage); }
Assistant assistant = AiServices.builder(Assistant.class)
.chatLanguageModel(OpenAiChatModel.withApiKey(OPENAI_API_KEY))
.tools(new AIServiceWithCalculator.Calculator())
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.build();
String question = "What is the sum of the numbers of letters in the words \"language\" and \"model\"?";
String answer = assistant.chat(question);
System.out.println(answer); // The sum ... is 13.
Agents can overcome limitations of LLMs on tasks requiring arithmetic, temporal reasoning, or other specialized capabilities by providing appropriate tools.
7. Conclusion
This guide covered the essential elements for building LLM‑driven applications with LangChain, highlighted the value of prompt engineering, embeddings, memory, retrieval, chains, and agents, and demonstrated how the Java implementation LangChain4j can be used to create robust AI services.