Introduction to Spring Batch and Its Core Concepts
This article provides a comprehensive overview of Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, chunk processing, skip strategies, and practical guidelines for building robust batch processing solutions in Java.
Spring Batch is a lightweight, comprehensive batch processing framework from the Spring portfolio, designed for building the robust batch applications that are vital to the daily operations of enterprise systems.
Typical batch use cases include automated large‑scale data processing without user interaction, periodic complex business rule execution, and integration of data from internal or external systems.
Spring Batch Architecture
A typical batch job reads a large number of records from a database, file, or queue, processes them, and writes the results back. The framework provides reusable features such as transaction management, job restart, skip logic, and resource management.
The overall architecture consists of Jobs composed of Steps, each Step containing an ItemReader, ItemProcessor, and ItemWriter. Jobs, Steps, and their executions are persisted in a JobRepository and launched via a JobLauncher.
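As a sketch of how these pieces fit together, here is a minimal configuration using the Spring Batch 5 builder API (the bean names and the no-op tasklet are illustrative assumptions, not part of the original article):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class ImportJobConfig {

    // A Job is assembled from one or more Steps; its state is persisted
    // through the JobRepository so it can be restarted after a failure.
    @Bean
    public Job importJob(JobRepository jobRepository, Step helloStep) {
        return new JobBuilder("importJob", jobRepository)
                .start(helloStep)
                .build();
    }

    // A minimal tasklet Step; real jobs typically use chunk-oriented steps.
    @Bean
    public Step helloStep(JobRepository jobRepository, PlatformTransactionManager txManager) {
        return new StepBuilder("helloStep", jobRepository)
                .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED, txManager)
                .build();
    }
}
```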
Core Concepts
Job
A Job is the top‑level abstraction representing the entire batch process. Its interface defines methods such as getName(), isRestartable(), and execute(JobExecution).
public interface Job {
String getName();
boolean isRestartable();
void execute(JobExecution execution);
JobParametersIncrementer getJobParametersIncrementer();
JobParametersValidator getJobParametersValidator();
}
JobInstance
A JobInstance represents a logical run of a Job, uniquely identified by the Job name and its identifying JobParameters.
public interface JobInstance {
/** Get unique id for this JobInstance. */
long getInstanceId();
/** Get job name. */
String getJobName();
}
JobParameters
JobParameters are used to differentiate JobInstances, often containing values such as execution dates.
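For example, a run date and a chunk size can be supplied as parameters (the parameter names and values here are illustrative); two runs with different identifying parameter values yield two distinct JobInstances:

```java
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;

public class JobParametersExample {
    public static void main(String[] args) {
        // Identifying parameters distinguish one JobInstance from another.
        JobParameters params = new JobParametersBuilder()
                .addString("run.date", "2024-01-01")
                .addLong("chunk.size", 100L)
                .toJobParameters();
        System.out.println(params.getString("run.date")); // prints 2024-01-01
    }
}
```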
JobExecution
JobExecution represents a single attempt to run a Job, providing status, start/end times, and exit information.
public interface JobExecution {
long getExecutionId();
String getJobName();
BatchStatus getBatchStatus();
Date getStartTime();
Date getEndTime();
String getExitStatus();
Date getCreateTime();
Date getLastUpdatedTime();
Properties getJobParameters();
}
Step and StepExecution
A Step encapsulates a distinct phase of a batch job. Each execution is represented by a StepExecution , which tracks metrics, transaction counts, and timestamps.
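These metrics can be read from the StepExecution after a Step finishes, for example in a listener (a hypothetical logging listener; it assumes the Spring Batch 5 API, where StepExecutionListener's methods are default methods so only afterStep needs overriding):

```java
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;

// Illustrative listener that logs StepExecution metrics after the Step completes.
public class MetricsLoggingListener implements StepExecutionListener {

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        System.out.println(stepExecution.getStepName()
                + ": read=" + stepExecution.getReadCount()
                + ", written=" + stepExecution.getWriteCount()
                + ", skipped=" + stepExecution.getSkipCount());
        return stepExecution.getExitStatus(); // keep the Step's own exit status
    }
}
```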
JobRepository and JobLauncher
JobRepository persists Jobs, Steps, and their executions. The @EnableBatchProcessing annotation auto‑configures it. JobLauncher starts a Job with given parameters.
public interface JobLauncher {
JobExecution run(Job job, JobParameters jobParameters)
throws JobExecutionAlreadyRunningException, JobRestartException,
JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
ItemReader, ItemProcessor, ItemWriter
These abstractions handle reading, processing, and writing data within a Step. Spring Batch provides many implementations, such as JdbcPagingItemReader and JdbcCursorItemReader.
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    Map<String, Object> parameterValues = new HashMap<>();
    parameterValues.put("status", "NEW");
    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(parameterValues)
            .rowMapper(customerCreditMapper())
            .pageSize(1000)
            .build();
}

private JdbcCursorItemReader<Map<String, Object>> buildItemReader(final DataSource dataSource, String tableName, String tenant) {
    JdbcCursorItemReader<Map<String, Object>> itemReader = new JdbcCursorItemReader<>();
    itemReader.setDataSource(dataSource);
    itemReader.setSql("sql here");
    itemReader.setRowMapper(new ColumnMapRowMapper());
    return itemReader;
}
Chunk Processing
Spring Batch processes data in chunks: a chunk size (e.g., 10) determines how many items are read and processed before the chunk is written and the transaction commits.
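A chunk-oriented Step might be declared like this (Spring Batch 5 builder API; the reader, processor, and writer beans and the String item types are assumed for illustration):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class ChunkStepConfig {

    // Reads items one by one, buffers 10 of them, then writes the whole
    // chunk and commits — one transaction per chunk, not per item.
    @Bean
    public Step processStep(JobRepository jobRepository,
                            PlatformTransactionManager txManager,
                            ItemReader<String> reader,
                            ItemProcessor<String, String> processor,
                            ItemWriter<String> writer) {
        return new StepBuilder("processStep", jobRepository)
                .<String, String>chunk(10, txManager)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
```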
Skip Strategy and Failure Handling
Skip policies allow a Step to ignore a configurable number of exceptions. Methods such as skipLimit(), skip(), and noSkip() control this behavior.
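For instance, a fault-tolerant chunk Step can skip up to ten flat-file parse errors while never skipping a missing input file (Spring Batch 5 API; the surrounding reader/writer beans are assumed):

```java
import java.io.FileNotFoundException;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileParseException;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class SkipStepConfig {

    @Bean
    public Step skippingStep(JobRepository jobRepository,
                             PlatformTransactionManager txManager,
                             ItemReader<String> reader,
                             ItemWriter<String> writer) {
        return new StepBuilder("skippingStep", jobRepository)
                .<String, String>chunk(10, txManager)
                .reader(reader)
                .writer(writer)
                .faultTolerant()
                .skip(FlatFileParseException.class)   // tolerate bad records...
                .skipLimit(10)                        // ...but at most 10 of them
                .noSkip(FileNotFoundException.class)  // never skip a missing file
                .build();
    }
}
```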
Batch Processing Guidelines
Keep architecture simple and avoid overly complex logic in a single batch application.
Place processing and storage physically close to reduce I/O.
Minimize system resource usage, especially I/O, by performing as much work in memory as possible.
Avoid redundant operations; aggregate data during the initial processing phase.
Allocate sufficient memory at startup to prevent frequent reallocations.
Assume worst‑case data integrity and implement thorough validation and checksums.
Conduct performance testing with realistic data volumes early.
Plan and test backup strategies for both databases and file‑based data.
Preventing Automatic Job Startup
To disable automatic job execution on application startup, set the following property:
spring.batch.job.enabled=false
Memory Exhaustion During Reading
If a reader loads the entire dataset at once, the JVM may exhaust its heap. Solutions include switching to a paging reader (such as JdbcPagingItemReader) or increasing the JVM heap size. A typical symptom is a crash-log entry such as:
Resource exhaustion event: the JVM was unable to allocate memory from the heap.