Spring AI ChatMemory: Concepts, Practical Setup, and Common Issues
This guide explains how Spring AI abstracts LLM conversation memory using a three‑layer architecture, demonstrates configuring MessageWindowChatMemory with a sliding‑window strategy, shows two ways to register the memory advisor, and provides complete Maven, YAML, and Java code examples with test screenshots.
Scenario
This article follows an earlier one, "Spring AI Advisor full guide: interceptor mechanism and practical walkthrough", which implemented basic session memory; here the topic is explored in more depth.
Large language models (LLMs) are stateless: each call is independent. If the model is told "My name is Xiao Ming" in the first turn and asked "What is my name?" in the second turn, it cannot answer, because the second request carries no trace of the first.
To achieve coherent multi‑turn dialogue, the core idea is to send the complete conversation history with each request. Spring AI encapsulates this process through ChatMemory and the Advisor mechanism, allowing developers to enable memory, conversation isolation, and even persistence with minimal configuration.
Core Concepts
Spring AI splits conversation‑memory management into three layers, each with a distinct role:
ChatMemory (memory strategy layer) – decides which messages to keep and when to trim (e.g., retain only the most recent N messages).
ChatMemoryRepository (storage layer) – purely handles CRUD of messages (in‑memory, JDBC, Redis, etc.).
MessageChatMemoryAdvisor (interceptor layer) – automatically injects the conversation history into each request and saves new messages after the response.
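The division of labour between the three layers can be sketched in plain Java. This is only an illustration under stated assumptions: the types `Repository`, `Memory`, and `Advisor` and their methods are hypothetical simplifications, not the real Spring AI interfaces, and messages are reduced to strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified types that mirror the three roles described above.
// They are NOT the real Spring AI interfaces or signatures.
class ThreeLayerSketch {

    // Storage layer: only reads and writes message lists by conversation ID.
    interface Repository {
        List<String> find(String conversationId);
        void save(String conversationId, List<String> messages);
    }

    static class InMemoryRepository implements Repository {
        private final Map<String, List<String>> store = new HashMap<>();

        @Override
        public List<String> find(String conversationId) {
            return new ArrayList<>(store.getOrDefault(conversationId, List.of()));
        }

        @Override
        public void save(String conversationId, List<String> messages) {
            store.put(conversationId, new ArrayList<>(messages));
        }
    }

    // Strategy layer: decides what is kept; here, the last maxMessages entries.
    static class Memory {
        private final Repository repository;
        private final int maxMessages;

        Memory(Repository repository, int maxMessages) {
            this.repository = repository;
            this.maxMessages = maxMessages;
        }

        void add(String conversationId, String message) {
            List<String> all = repository.find(conversationId);
            all.add(message);
            int from = Math.max(0, all.size() - maxMessages);
            repository.save(conversationId, all.subList(from, all.size()));
        }

        List<String> get(String conversationId) {
            return repository.find(conversationId);
        }
    }

    // Interceptor layer: injects the history before the call and stores both turns.
    static class Advisor {
        private final Memory memory;

        Advisor(Memory memory) {
            this.memory = memory;
        }

        String call(String conversationId, String userMessage,
                    Function<List<String>, String> model) {
            memory.add(conversationId, "user: " + userMessage);
            String reply = model.apply(memory.get(conversationId));
            memory.add(conversationId, "assistant: " + reply);
            return reply;
        }
    }

    public static void main(String[] args) {
        Advisor advisor = new Advisor(new Memory(new InMemoryRepository(), 10));
        // Stand-in for a stateless model: it only sees what it is handed.
        Function<List<String>, String> model =
                messages -> "I received " + messages.size() + " message(s)";
        System.out.println(advisor.call("001", "My name is Xiao Ming", model));
        System.out.println(advisor.call("001", "What is my name?", model));
    }
}
```

In this sketch the second call hands the stand-in model three messages (both user turns plus the first reply), which is what lets a real model answer the follow-up question. Swapping `InMemoryRepository` for another `Repository` would change where messages live without touching the other two layers.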
MessageWindowChatMemory: Sliding‑Window Memory
MessageWindowChatMemory is the recommended implementation. It maintains a fixed‑size window of messages per conversation (default 20). When the limit is exceeded, the oldest messages are removed, while system messages are retained. Because the window counts messages rather than tokens, it bounds the context sent to the model only approximately, but it is enough to keep the history, and with it the token usage, from growing without limit.
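The eviction rule can be illustrated with a small stand-alone sketch. It is a simplified model of the behaviour described above, not Spring AI's actual code: the `Role` enum, the `Msg` record, and the `trim` method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of sliding-window trimming; not Spring AI's implementation.
class SlidingWindowSketch {

    enum Role { SYSTEM, USER, ASSISTANT }

    record Msg(Role role, String text) {}

    // Removes the oldest non-system messages until the list fits the window.
    // System messages are never removed, so the result can exceed maxMessages
    // only if the system messages alone are more numerous than the window.
    static List<Msg> trim(List<Msg> messages, int maxMessages) {
        List<Msg> result = new ArrayList<>(messages);
        int excess = result.size() - maxMessages;
        for (int i = 0; i < result.size() && excess > 0; ) {
            if (result.get(i).role() != Role.SYSTEM) {
                result.remove(i);   // drop the oldest non-system message
                excess--;
            } else {
                i++;                // keep system messages and move on
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Msg> history = List.of(
                new Msg(Role.SYSTEM, "You are a helpful assistant."),
                new Msg(Role.USER, "My name is Xiao Ming"),
                new Msg(Role.ASSISTANT, "Hello, Xiao Ming."),
                new Msg(Role.USER, "What is my name?"));
        // Window of 3: the system message stays, the first user message is evicted.
        trim(history, 3).forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

With a window of 3, the message that introduced the name is the one that gets evicted. A window that is too small therefore makes the model "forget" early facts, which is worth keeping in mind when choosing maxMessages.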
Two Ways to Register an Advisor
Global registration (at builder time) – ChatClient.builder().defaultAdvisors(...) applies to all requests.
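Per‑request registration (on demand) – prompt().advisors(a -> a.param(...)) passes runtime parameters such as conversationId. Memory needs both approaches: the MessageChatMemoryAdvisor is registered once, and each request supplies its own conversation ID so that the advisor loads the matching history. If no ID is supplied, the advisor falls back to a default conversation ID, and all callers then share a single history.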
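Why the conversation ID has to travel with each request can be shown with a tiny sketch of parameter resolution. The key name `conversationId`, the fallback value `default`, and the `resolve` method are assumptions made for illustration; they are not taken from the Spring AI source.

```java
import java.util.Map;

// Hypothetical sketch: how an advisor might pick the conversation to load.
class ConversationIdSketch {

    static final String CONVERSATION_ID_KEY = "conversationId"; // illustrative key
    static final String DEFAULT_ID = "default";                  // illustrative fallback

    // Per-request parameters win; without them every caller lands on DEFAULT_ID.
    static String resolve(Map<String, Object> requestParams) {
        Object id = requestParams.get(CONVERSATION_ID_KEY);
        return id != null ? id.toString() : DEFAULT_ID;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of(CONVERSATION_ID_KEY, "001"))); // prints 001
        System.out.println(resolve(Map.of()));                           // prints default
    }
}
```

The second call shows the risk of relying on global registration alone: two unrelated users who both omit the ID would read and write the same history.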
Implementation
pom.xml
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.3.3</version> <!-- downgrade to stable version -->
</parent>

<groupId>com.example</groupId>
<artifactId>spring-ai-ollama-demo</artifactId>
<version>1.0</version>

<properties>
    <java.version>17</java.version>
    <spring-ai.version>1.1.2</spring-ai.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Spring AI Ollama core -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-ollama</artifactId>
        <version>${spring-ai.version}</version>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>spring-milestones</id>
        <url>https://repo.spring.io/milestone</url>
        <snapshots><enabled>false</enabled></snapshots>
    </repository>
</repositories>

application.yml
server:
  port: 886
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        model: qwen2.5:7b-instruct
        options:
          temperature: 0.7
          num-ctx: 4096 # Ollama context window size
logging:
  level:
    org.springframework.ai.chat.client.advisor: DEBUG # observe memory injection logs

MemoryConfig – Creating ChatMemory
package com.badao.ai.config;

import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.InMemoryChatMemoryRepository;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MemoryConfig {

    @Bean
    public ChatMemory chatMemory() {
        // 1. Create the underlying storage repository (in‑memory implementation)
        InMemoryChatMemoryRepository repository = new InMemoryChatMemoryRepository();
        // 2. Wrap with a sliding‑window strategy, limiting to the 10 most recent messages
        return MessageWindowChatMemory.builder()
                .chatMemoryRepository(repository)
                .maxMessages(10)
                .build();
    }
}

ChatConfig – Registering the Memory Advisor
package com.badao.ai.config;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatConfig {

    @Bean
    public ChatClient chatClient(ChatModel chatModel, ChatMemory chatMemory) {
        return ChatClient.builder(chatModel)
                .defaultAdvisors(
                        MessageChatMemoryAdvisor.builder(chatMemory).build()
                )
                .build();
    }
}

Controller – Multi‑turn Chat API
package com.badao.ai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory; // import ChatMemory interface
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api")
public class MemoryChatController {

    private final ChatClient chatClient;

    public MemoryChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @PostMapping("/chat/memory")
    public ChatResponse chatWithMemory(@RequestBody MemoryChatRequest request) {
        String result = chatClient.prompt()
                .user(request.message())
                .advisors(advisor -> advisor.param(
                        ChatMemory.CONVERSATION_ID,
                        request.conversationId()
                ))
                .call()
                .content();
        return new ChatResponse(200, "success", result);
    }

    public record MemoryChatRequest(String message, String conversationId) {}

    public record ChatResponse(int code, String msg, String data) {}
}

Testing Verification
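Test the session memory by sending a request with conversation ID "001" that states a name, then a second request with the same ID that asks for the name. The model should answer with the name from the first turn. The following screenshots show the expected behavior.
Next, send a request with conversation ID "002" and ask for the name again. This conversation has no history, so the model should not know the name. The screenshot demonstrates that the memory is correctly scoped per conversation.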
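The same three requests can also be sent from code. The following is a minimal client sketch using the JDK's built-in HTTP client; it assumes the application is running locally on the port configured in application.yml, and the class and method names (`MemoryChatClientSketch`, `body`, `request`) are hypothetical. The JSON body is assembled by hand, so it only works for messages without quotes or backslashes.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical test client for the /api/chat/memory endpoint shown above.
class MemoryChatClientSketch {

    // Assumption: the app runs on localhost with server.port=886.
    static final String ENDPOINT = "http://localhost:886/api/chat/memory";

    // Hand-built JSON; inputs must not contain quotes or backslashes.
    static String body(String conversationId, String message) {
        return "{\"message\":\"" + message
                + "\",\"conversationId\":\"" + conversationId + "\"}";
    }

    static HttpRequest request(String conversationId, String message) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(conversationId, message)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String[][] turns = {
                {"001", "My name is Xiao Ming"},
                {"001", "What is my name?"},  // same conversation: history is replayed
                {"002", "What is my name?"}   // new conversation: no history available
        };
        for (String[] turn : turns) {
            HttpResponse<String> response = client.send(
                    request(turn[0], turn[1]), HttpResponse.BodyHandlers.ofString());
            System.out.println(turn[0] + " -> " + response.body());
        }
    }
}
```

Running `main` requires the Spring Boot application and a local Ollama instance to be up; the exact wording of the replies depends on the model, so no output is shown here.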
The Dominant Programmer
Resources and tutorials for programmers' advanced learning journey. Advanced tracks in Java, Python, and C#. Blog: https://blog.csdn.net/badao_liumang_qizhi
