Spring Batch Architecture Overview and Core Concepts
This article introduces Spring Batch as a lightweight, comprehensive batch‑processing framework for enterprise applications, explains its overall architecture, and walks through its core concepts: Job, JobInstance, JobParameters, JobExecution, Step, StepExecution, ExecutionContext, JobRepository, JobLauncher, ItemReader, ItemProcessor, ItemWriter, chunk processing, and skip/failure handling. It closes with best‑practice guidelines and common troubleshooting tips.
Spring Batch Overview
Spring Batch is a lightweight, full‑featured batch processing framework provided by Spring. It is designed for enterprise scenarios that require large‑scale, automated, and reliable data processing without user interaction, such as end‑of‑month calculations, insurance premium calculations, and massive daily transaction handling.
The framework builds on core Spring features (POJO‑based development, productivity, and ease of use) while offering advanced services like transaction management, job restart, skip logic, and resource management. It is not a scheduling framework.
Spring Batch Architecture
A typical batch job reads a large number of records from a database, file, or queue, processes them, and writes the results back. The following diagram (omitted) illustrates this flow.
In Spring Batch, a Job consists of one or more Steps. Each step defines its own ItemReader, ItemProcessor, and ItemWriter. Jobs are launched via a JobLauncher, and their execution metadata is persisted in a JobRepository.
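The Job/Step relationship can be sketched as a minimal configuration. This assumes Spring Batch 5's JobBuilder API; the bean and step names (nightlyJob, loadStep, reportStep) are illustrative placeholders, not from the original article:

```java
// A Job assembled from two Steps run in sequence.
// Assumes Spring Batch 5's JobBuilder; names are illustrative.
@Bean
public Job nightlyJob(JobRepository jobRepository, Step loadStep, Step reportStep) {
    return new JobBuilder("nightlyJob", jobRepository)
            .start(loadStep)      // first step
            .next(reportStep)     // runs only if loadStep completes
            .build();
}
```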
Core Concepts
Job
A Job is the top‑level abstraction representing the entire batch process. It is defined by the Job interface:
/**
* Batch domain object representing a job. Job is an explicit abstraction
* representing the configuration of a job specified by a developer. It should
* be noted that restart policy is applied to the job as a whole and not to a
* step.
*/
public interface Job {

    String getName();

    boolean isRestartable();

    void execute(JobExecution execution);

    JobParametersIncrementer getJobParametersIncrementer();

    JobParametersValidator getJobParametersValidator();
}

Implementations include SimpleJob and FlowJob. A job contains one or more Steps and can share common attributes such as listeners and skip policies.
JobInstance
A JobInstance represents a logical execution of a job definition. Its interface is:
public interface JobInstance {

    /** Get unique id for this JobInstance. */
    long getInstanceId();

    /** Get job name. */
    String getJobName();
}

Each time a job is launched with a distinct set of parameters, a new JobInstance is created.
JobParameters
JobParameters hold the key‑value pairs used to start a job and to differentiate job instances (e.g., a date parameter for a daily end‑of‑day job).
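A minimal sketch of building parameters with Spring Batch's JobParametersBuilder; the parameter names here are illustrative:

```java
// Launching with a distinct "date" value creates a new JobInstance;
// relaunching with the same parameters targets the existing instance.
JobParameters params = new JobParametersBuilder()
        .addString("date", "2024-06-30")  // identifying parameter for a daily job
        .addLong("chunk.size", 10L)
        .toJobParameters();
```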
JobExecution
JobExecution represents a single attempt to run a JobInstance . Its interface includes methods to obtain execution id, status, start/end times, exit status, and the associated JobParameters :
public interface JobExecution {

    long getExecutionId();

    String getJobName();

    BatchStatus getBatchStatus();

    Date getStartTime();

    Date getEndTime();

    String getExitStatus();

    Date getCreateTime();

    Date getLastUpdatedTime();

    Properties getJobParameters();
}

Step and StepExecution
A Step encapsulates an independent phase of a job. Each execution of a step creates a StepExecution, which stores step‑level statistics, timestamps, and an ExecutionContext (a map of key‑value data used for restart).
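How a component can use the ExecutionContext for restart can be sketched with a hypothetical ItemStreamReader (the class, item list, and the "reader.index" key are invented for illustration):

```java
import java.util.List;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamReader;

// Hypothetical reader that records its position so a restarted step resumes
// where the last successful chunk commit left off.
public class ResumableReader implements ItemStreamReader<String> {

    private final List<String> items = List.of("a", "b", "c");
    private int index;

    @Override
    public void open(ExecutionContext context) {
        // On restart, Spring Batch hands back the last persisted context.
        index = context.containsKey("reader.index") ? context.getInt("reader.index") : 0;
    }

    @Override
    public void update(ExecutionContext context) {
        // Called before each chunk commit; the value is persisted in the JobRepository.
        context.putInt("reader.index", index);
    }

    @Override
    public String read() {
        return index < items.size() ? items.get(index++) : null; // null signals end of input
    }

    @Override
    public void close() { }
}
```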
JobRepository
The JobRepository persists metadata about jobs, steps, and their executions in tables such as batch_job_instance, batch_job_execution, and batch_step_execution.
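The persisted metadata can also be read programmatically through Spring Batch's JobExplorer. A sketch, assuming a jobExplorer bean is injected and a job named "nightlyJob" exists (both assumptions):

```java
// Inspect the ten most recent instances of a job and their execution attempts.
List<JobInstance> instances = jobExplorer.getJobInstances("nightlyJob", 0, 10);
for (JobInstance instance : instances) {
    for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
        System.out.println(execution.getStatus() + " started " + execution.getStartTime());
    }
}
```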
JobLauncher
The JobLauncher starts a job with given parameters:
public interface JobLauncher {

    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}

ItemReader, ItemProcessor, ItemWriter
These three abstractions implement the read‑process‑write pattern. Spring Batch provides many ready‑made implementations (e.g., JdbcPagingItemReader, JdbcCursorItemReader). Example reader configuration:
@Bean
public JdbcPagingItemReader<CustomerCredit> itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
    Map<String, Object> parameterValues = new HashMap<>();
    parameterValues.put("status", "NEW");

    return new JdbcPagingItemReaderBuilder<CustomerCredit>()
            .name("creditReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(parameterValues)
            .rowMapper(customerCreditMapper())
            .pageSize(1000)
            .build();
}
@Bean
public SqlPagingQueryProviderFactoryBean queryProvider() {
    SqlPagingQueryProviderFactoryBean provider = new SqlPagingQueryProviderFactoryBean();
    provider.setSelectClause("select id, name, credit");
    provider.setFromClause("from customer");
    provider.setWhereClause("where status=:status");
    provider.setSortKey("id");
    return provider;
}

Writer and processor examples follow the same pattern and can be customized as needed.
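As a hedged sketch of that pattern, a processor and writer matching the reader above might look like this (the CustomerCredit getters and the update statement are assumptions about the example domain):

```java
// Hypothetical processor: returning null filters the item out of the chunk.
@Bean
public ItemProcessor<CustomerCredit, CustomerCredit> processor() {
    return credit -> credit.getCredit().compareTo(BigDecimal.ZERO) > 0 ? credit : null;
}

// Writer counterpart of the paging reader, updating the same customer table.
@Bean
public JdbcBatchItemWriter<CustomerCredit> itemWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<CustomerCredit>()
            .dataSource(dataSource)
            .sql("update customer set credit = :credit where id = :id")
            .beanMapped()  // bind :credit and :id from CustomerCredit getters
            .build();
}
```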
Chunk Processing
Spring Batch groups items into chunks. The chunk size (e.g., 10) determines how many items are read and processed before the whole chunk is written and the transaction commits.
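A chunk‑oriented step can be sketched as follows, assuming Spring Batch 5's StepBuilder API (the step name and item type are illustrative):

```java
// Read and process 10 items at a time, then write them and commit in one transaction.
@Bean
public Step creditStep(JobRepository jobRepository,
                       PlatformTransactionManager transactionManager,
                       ItemReader<CustomerCredit> reader,
                       ItemWriter<CustomerCredit> writer) {
    return new StepBuilder("creditStep", jobRepository)
            .<CustomerCredit, CustomerCredit>chunk(10, transactionManager) // chunk size 10
            .reader(reader)
            .writer(writer)
            .build();
}
```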
Skip and Failure Handling
Skip logic allows a step to tolerate a configurable number of exceptions. Methods such as skipLimit(), skip(), and noSkip() control how many items may be skipped, which exceptions are skippable, and which are fatal.
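A sketch of a fault‑tolerant step, modeled on the configuration style shown above (step name and exception choices are illustrative, assuming a flat‑file input):

```java
// Tolerate up to 10 skippable exceptions: parse errors are skipped,
// but a missing file is declared fatal and fails the step immediately.
@Bean
public Step faultTolerantStep(JobRepository jobRepository,
                              PlatformTransactionManager transactionManager,
                              ItemReader<String> reader,
                              ItemWriter<String> writer) {
    return new StepBuilder("faultTolerantStep", jobRepository)
            .<String, String>chunk(10, transactionManager)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .skipLimit(10)
            .skip(FlatFileParseException.class)
            .noSkip(FileNotFoundException.class)
            .build();
}
```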
Batch Development Guidelines
Keep the batch architecture simple and avoid overly complex logic in a single job.
Process data close to where it is stored to reduce I/O.
Allocate sufficient memory at startup to avoid frequent reallocations.
Assume worst‑case data integrity and add validation checks.
Perform pressure testing with realistic data volumes.
Plan and test backup strategies for both database and file data.
Common Configuration Tips
Disable Automatic Job Startup
Set spring.batch.job.enabled=false in application.properties to prevent jobs from running on application start.
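With auto‑start disabled, a job can still be launched on demand through the JobLauncher, for example from a REST endpoint. A sketch (the controller, endpoint path, and parameter name are hypothetical):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobController {

    private final JobLauncher jobLauncher;
    private final Job nightlyJob;

    public JobController(JobLauncher jobLauncher, Job nightlyJob) {
        this.jobLauncher = jobLauncher;
        this.nightlyJob = nightlyJob;
    }

    @PostMapping("/jobs/nightly")
    public String run(@RequestParam String date) throws Exception {
        // A distinct "date" creates a new JobInstance for each business day.
        JobParameters params = new JobParametersBuilder()
                .addString("date", date)
                .toJobParameters();
        return jobLauncher.run(nightlyJob, params).getStatus().toString();
    }
}
```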
Memory Exhaustion Fix
If a reader loads all records at once, switch to a paging reader (e.g., JdbcPagingItemReader ) or increase JVM heap size.
Conclusion
The article provides a comprehensive guide to building, configuring, and troubleshooting Spring Batch jobs, covering all essential components and best practices for reliable large‑scale data processing.
Code Ape Tech Column
Former Ant Group P8 engineer and pure technologist, sharing full‑stack Java, interview, and career advice through this column. Site: java-family.cn