Validate AI-Generated JSON in Spring Boot with JSON Schema – A Step-by-Step Guide
This article shows how to integrate the networknt JSON Schema validator into a Spring Boot application to enforce structured AI output. It covers dependency setup, schema definition, the service implementation, a custom output validator, and exception handling, so that malformed JSON from an AI model is caught before it reaches downstream code.
In the previous article we showed how to use Ollama to generate JSON‑structured output. This follow‑up explains how to guarantee that the AI‑generated data conforms to an expected structure by using JSON Schema validation inside a Spring Boot application.
Project Dependencies
Add the JSON Schema validator library to your pom.xml:
<code><dependency>
    <groupId>com.networknt</groupId>
    <artifactId>json-schema-validator</artifactId>
    <version>1.4.0</version>
</dependency>
</code>
networknt/json-schema-validator is a lightweight, high‑performance JSON Schema validator for Java applications. It handles complex nested JSON structures, reports clear error messages, and lets you reuse compiled schemas so that repeated validation stays cheap.
AI Structured Output
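OpenAI’s Function Calling feature lets you describe the desired output format as a JSON Schema. Forcing the model to call such a function steers it toward output that matches the schema, but conformance is not guaranteed, which is why the response is still validated on the server side.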
Define Output Schema
Below is an example schema for generating a product review:
<code>{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "review": {
      "type": "object",
      "properties": {
        "rating": { "type": "number" },
        "summary": { "type": "string" },
        "sentiment": {
          "type": "string",
          "enum": ["positive", "neutral", "negative"],
          "description": "overall sentiment"
        }
      },
      "required": ["rating", "summary", "sentiment"]
    }
  },
  "required": ["review"]
}
</code>
Implement AI Service
Create a Spring service that calls the OpenAI API and validates the response against the schema:
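<code>import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;
import java.util.List;

// OpenAiClient, ChatCompletionRequest, Message and FunctionDefinition come from the
// OpenAI client used in this project; their imports are omitted here.
// AIOutputValidator is the component defined in the Output Validator section.
@Service
public class AIReviewService {
    private static final String SCHEMA_PATH = "schemas/review-output-schema.json";

    private final OpenAiClient openAiClient;
    private final AIOutputValidator outputValidator;
    private final ObjectMapper objectMapper;

    @Value("classpath:" + SCHEMA_PATH)
    private Resource schemaResource;

    public AIReviewService(OpenAiClient openAiClient, AIOutputValidator outputValidator,
                           ObjectMapper objectMapper) {
        this.openAiClient = openAiClient;
        this.outputValidator = outputValidator;
        this.objectMapper = objectMapper;
    }

    public ReviewResponse generateStructuredReview(String productDescription) {
        var request = ChatCompletionRequest.builder()
                .model("gpt-4")
                .messages(List.of(
                        new Message("system", "You are a professional product review analyst. "
                                + "Generate a structured review based on the description provided."),
                        new Message("user", productDescription)))
                .functions(List.of(new FunctionDefinition(
                        "generate_review",
                        "Generate product review",
                        schemaResource))) // JSON Schema as function parameter definition
                .functionCall("generate_review") // force the function
                .build();
        return openAiClient.createChatCompletion(request)
                .map(this::validateAndParseResponse)
                .orElseThrow(() -> new AIGenerationException("Failed to generate review"));
    }

    private ReviewResponse validateAndParseResponse(String jsonResponse) {
        // Validate JSON against the schema.
        // isValid() and getErrors() are assumed accessors on ValidationResult.
        ValidationResult result = outputValidator.validateOutput(jsonResponse, SCHEMA_PATH);
        if (!result.isValid()) {
            throw new InvalidOutputException("AI generated invalid review format", result.getErrors());
        }
        try {
            // Parse the validated JSON
            return objectMapper.readValue(jsonResponse, ReviewResponse.class);
        } catch (JsonProcessingException e) {
            // readValue throws a checked exception; report it as invalid output
            throw new InvalidOutputException("AI output could not be mapped to ReviewResponse",
                    List.of(e.getOriginalMessage()));
        }
    }
}
</code>
Output Validator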
A dedicated component loads and caches the compiled schemas and validates AI output against them at runtime:
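<code>import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SpecVersion;
import com.networknt.schema.ValidationMessage;
import org.springframework.core.io.ClassPathResource;
import org.springframework.stereotype.Component;

import java.io.InputStream;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

@Component
public class AIOutputValidator {

    // Compiled schemas keyed by classpath location, so each file is parsed only once
    private final Map<String, JsonSchema> schemaCache = new ConcurrentHashMap<>();
    private final ObjectMapper objectMapper;

    public AIOutputValidator(ObjectMapper objectMapper) {
        this.objectMapper = objectMapper;
    }

    public ValidationResult validateOutput(String output, String schemaPath) {
        JsonSchema schema = getOrLoadSchema(schemaPath);
        try {
            JsonNode outputNode = objectMapper.readTree(output);
            Set<ValidationMessage> errors = schema.validate(outputNode);
            if (errors.isEmpty()) {
                return ValidationResult.success();
            }
            return ValidationResult.failure(errors.stream()
                    .map(ValidationMessage::getMessage)
                    .collect(Collectors.toList()));
        } catch (Exception e) {
            // The output could not be parsed as JSON at all
            return ValidationResult.failure(List.of("Invalid JSON format: " + e.getMessage()));
        }
    }

    private JsonSchema getOrLoadSchema(String schemaPath) {
        return schemaCache.computeIfAbsent(schemaPath, this::loadSchema);
    }

    private JsonSchema loadSchema(String schemaPath) {
        try (InputStream in = new ClassPathResource(schemaPath).getInputStream()) {
            JsonNode schemaNode = objectMapper.readTree(in);
            // V7 matches the draft-07 "$schema" declared in the schema file
            return JsonSchemaFactory.getInstance(SpecVersion.VersionFlag.V7)
                    .getSchema(schemaNode);
        } catch (Exception e) {
            throw new SchemaLoadException("Failed to load schema: " + schemaPath, e);
        }
    }
}
</code>
Error Handling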
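The validator and the exception handler in this article rely on a few small types that the article does not show: ValidationResult, InvalidOutputException, AIGenerationException, SchemaLoadException, and ErrorResponse. The following is a minimal sketch of what they might look like. Only the constructors and factory methods that appear in the article's code are taken from the source; the accessors isValid() and getErrors(), and all field names, are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.List;

/** Outcome of a schema check: valid, or a list of human-readable errors. */
final class ValidationResult {
    private final boolean valid;
    private final List<String> errors;

    private ValidationResult(boolean valid, List<String> errors) {
        this.valid = valid;
        this.errors = List.copyOf(errors); // unmodifiable defensive copy
    }

    static ValidationResult success() {
        return new ValidationResult(true, Collections.emptyList());
    }

    static ValidationResult failure(List<String> errors) {
        return new ValidationResult(false, errors);
    }

    // Assumed accessors; the article only shows success() and failure().
    public boolean isValid() { return valid; }
    public List<String> getErrors() { return errors; }
}

/** Thrown when the model output does not satisfy the schema. */
class InvalidOutputException extends RuntimeException {
    private final List<String> validationErrors;

    InvalidOutputException(String message, List<String> validationErrors) {
        super(message);
        this.validationErrors = List.copyOf(validationErrors);
    }

    public List<String> getValidationErrors() { return validationErrors; }
}

/** Thrown when the AI call itself yields no usable result. */
class AIGenerationException extends RuntimeException {
    AIGenerationException(String message) { super(message); }
}

/** Thrown when a schema file cannot be read or compiled. */
class SchemaLoadException extends RuntimeException {
    SchemaLoadException(String message, Throwable cause) { super(message, cause); }
}

/** Error body returned to API clients (field names are hypothetical). */
final class ErrorResponse {
    private final String code;
    private final String message;
    private final List<String> details;

    ErrorResponse(String code, String message, List<String> details) {
        this.code = code;
        this.message = message;
        this.details = List.copyOf(details);
    }

    public String getCode() { return code; }
    public String getMessage() { return message; }
    public List<String> getDetails() { return details; }
}
```

The types are package-private here so that the sketch fits in one file. In a real project each would normally be a public class in its own file.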
Define a controller advice to translate validation and generation errors into HTTP responses:
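<code>import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.servlet.mvc.method.annotation.ResponseEntityExceptionHandler;

import java.util.Collections;

@ControllerAdvice
public class AIExceptionHandler extends ResponseEntityExceptionHandler {

    // The model returned output that failed schema validation
    @ExceptionHandler(InvalidOutputException.class)
    public ResponseEntity<ErrorResponse> handleInvalidOutput(InvalidOutputException ex) {
        ErrorResponse error = new ErrorResponse(
                "AI_OUTPUT_VALIDATION_ERROR",
                "AI generated output format is invalid",
                ex.getValidationErrors()
        );
        return ResponseEntity.badRequest().body(error);
    }

    // The AI call produced no usable result
    @ExceptionHandler(AIGenerationException.class)
    public ResponseEntity<ErrorResponse> handleAIGeneration(AIGenerationException ex) {
        ErrorResponse error = new ErrorResponse(
                "AI_GENERATION_ERROR",
                "Failed to generate AI content",
                Collections.singletonList(ex.getMessage())
        );
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE).body(error);
    }
}
</code>
Summary and Best Practices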
Validating AI output against a JSON Schema catches responses that deviate from the expected structure before they reach downstream code, which improves reliability and simplifies later processing. A schema checks structure and data types, not the quality of the content itself. Key recommendations:
Include the JSON Schema in the prompt: embedding the schema directly in the prompt makes it more likely that the model follows the required structure; keep the schema concise.
Use a structured prompt template, for example:
<code>Generate output according to the following JSON Schema:
{
  "type": "object",
  "properties": {
    "field1": {"type": "string"},
    "field2": {"type": "number"}
  },
  "required": ["field1", "field2"]
}
</code>
Optimize output quality:
Provide examples of the desired output.
Clearly state required fields and their data types.
Emphasize strict adherence to JSON syntax in the prompt.
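The prompt recommendations above can be combined in a small helper. The sketch below is only an illustration: the class name SchemaPromptBuilder, its parameters, and the exact wording of the instructions are hypothetical and not part of the original project. It assembles the schema, one example response, the list of required fields, and a reminder to return JSON only.

```java
import java.util.List;

/** Hypothetical helper that assembles a schema-driven prompt. */
final class SchemaPromptBuilder {

    private SchemaPromptBuilder() {
        // static helper, not meant to be instantiated
    }

    /**
     * @param schemaJson     the JSON Schema text to embed (keep it concise)
     * @param exampleJson    one example of a valid response
     * @param requiredFields names of the fields the model must always return
     */
    static String build(String schemaJson, String exampleJson, List<String> requiredFields) {
        return new StringBuilder()
                .append("Generate output according to the following JSON Schema:\n")
                .append(schemaJson.strip()).append("\n\n")
                .append("Example of a valid response:\n")
                .append(exampleJson.strip()).append("\n\n")
                .append("Required fields: ")
                .append(String.join(", ", requiredFields)).append("\n")
                .append("Respond with a single JSON object only, without prose or code fences.")
                .toString();
    }
}
```

The returned string could be used as the system or user message of the chat request; which role works better depends on the model and is worth testing.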
Applying these practices reduces retries and error‑handling costs, making AI‑enhanced backend services more stable and production‑ready.
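When a response still fails validation, one common option is to retry a bounded number of times and feed the validation errors back to the model. The article does not prescribe a retry strategy, so the following is a sketch under assumptions: ValidatingRetry is a hypothetical name, the model is abstracted as a function from prompt to raw output, and the validator as a function from output to a list of error messages (empty meaning valid).

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical bounded retry loop around "generate, then validate". */
final class ValidatingRetry {

    private ValidatingRetry() {
        // static helper, not meant to be instantiated
    }

    /**
     * @param model       maps a prompt to the raw model output
     * @param validator   maps an output to its validation errors (empty list = valid)
     * @param prompt      the original prompt
     * @param maxAttempts upper bound on model calls, at least 1
     * @return the first output that passes validation
     * @throws IllegalStateException if no attempt produced valid output
     */
    static String generateValid(UnaryOperator<String> model,
                                Function<String, List<String>> validator,
                                String prompt,
                                int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        String currentPrompt = prompt;
        List<String> lastErrors = List.of();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String output = model.apply(currentPrompt);
            lastErrors = validator.apply(output);
            if (lastErrors.isEmpty()) {
                return output;
            }
            // Tell the model what was wrong so the next attempt can correct it
            currentPrompt = prompt
                    + "\n\nYour previous answer was rejected for these reasons:\n- "
                    + String.join("\n- ", lastErrors)
                    + "\nReturn corrected JSON only.";
        }
        throw new IllegalStateException(
                "No valid output after " + maxAttempts + " attempts: " + lastErrors);
    }
}
```

In the service from this article, the validator argument could delegate to AIOutputValidator.validateOutput and return its error list. Keeping maxAttempts small limits both latency and API cost.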
Java Architecture Diary
Committed to sharing original, high‑quality technical articles; no fluff or promotional content.