Integrating Spring Batch with Spring Boot: A Step‑by‑Step Tutorial
This article provides a comprehensive guide on using Spring Batch within a Spring Boot application, covering CSV and database data sources, detailed configuration beans, custom readers, processors, writers, listeners, validation, and execution via a REST controller, along with practical troubleshooting tips.
Spring Batch is a robust, easy‑to‑use batch‑processing framework built on Spring, offering built‑in logging, transaction management, retry, skip, and restart capabilities.
The tutorial demonstrates two typical scenarios: reading large volumes of data from a CSV file and from a MySQL table, then persisting the processed results.
A simple table, `bloginfo`, stores the blog records (comments translated from the original Chinese):

CREATE TABLE `bloginfo` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `blogAuthor` varchar(255) NULL DEFAULT NULL COMMENT 'blog author id',
  `blogUrl` varchar(255) NULL DEFAULT NULL COMMENT 'blog URL',
  `blogTitle` varchar(255) NULL DEFAULT NULL COMMENT 'blog title',
  `blogItem` varchar(255) NULL DEFAULT NULL COMMENT 'blog column/category',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=89031 CHARACTER SET utf8 COLLATE=utf8_general_ci ROW_FORMAT=Dynamic;
The Maven pom.xml includes the essential dependencies: spring-boot-starter-web, spring-boot-starter-batch, mybatis-spring-boot-starter, the MySQL driver, and the Druid datasource.
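For reference, the dependency block could look like the following sketch (group/artifact ids are the standard coordinates; the explicit versions shown are illustrative assumptions, since the Boot starters are normally version-managed by the parent POM):

```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.mybatis.spring.boot</groupId>
        <artifactId>mybatis-spring-boot-starter</artifactId>
        <version>2.1.4</version><!-- illustrative version -->
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>druid-spring-boot-starter</artifactId>
        <version>1.1.22</version><!-- illustrative version -->
    </dependency>
</dependencies>
```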
All batch components are defined in MyBatchConfig.java as Spring beans:
@Bean public JobRepository myJobRepository(DataSource dataSource, PlatformTransactionManager tm) throws Exception { ... }
@Bean public SimpleJobLauncher myJobLauncher(DataSource ds, PlatformTransactionManager tm) throws Exception { ... }
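Filled in, these two beans typically delegate to JobRepositoryFactoryBean; the following is a sketch assuming a MySQL-backed metadata store, not the article's verbatim code:

```java
import javax.sql.DataSource;

import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.transaction.PlatformTransactionManager;

// Inside MyBatchConfig. "mysql" matches the tutorial's datasource.
@Bean
public JobRepository myJobRepository(DataSource dataSource, PlatformTransactionManager tm) throws Exception {
    JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
    factory.setDatabaseType("mysql");      // SQL dialect for the BATCH_* metadata tables
    factory.setDataSource(dataSource);
    factory.setTransactionManager(tm);
    return factory.getObject();
}

@Bean
public SimpleJobLauncher myJobLauncher(DataSource ds, PlatformTransactionManager tm) throws Exception {
    SimpleJobLauncher launcher = new SimpleJobLauncher();
    launcher.setJobRepository(myJobRepository(ds, tm)); // reuse the repository defined above
    return launcher;
}
```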
@Bean public Job myJob(JobBuilderFactory jobs, Step myStep) { ... }
@Bean public Step myStep(StepBuilderFactory sbf, ItemReader<BlogInfo> reader, ItemWriter<BlogInfo> writer, ItemProcessor<BlogInfo, BlogInfo> processor) { ... }
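A plausible shape for these two beans, using the stock JobBuilderFactory/StepBuilderFactory APIs (the chunk size and skip settings here are illustrative assumptions):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;

// Inside MyBatchConfig.
@Bean
public Job myJob(JobBuilderFactory jobs, Step myStep) {
    return jobs.get("myJob")
            .incrementer(new RunIdIncrementer()) // new JobInstance per run
            .flow(myStep)
            .end()
            .listener(myJobListener())           // the MyJobListener bean defined below
            .build();
}

@Bean
public Step myStep(StepBuilderFactory sbf, ItemReader<BlogInfo> reader,
                   ItemWriter<BlogInfo> writer, ItemProcessor<BlogInfo, BlogInfo> processor) {
    return sbf.get("myStep")
            .<BlogInfo, BlogInfo>chunk(500)      // commit interval: items per transaction
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .skip(Exception.class)
            .skipLimit(10)                       // tolerate a handful of bad records
            .build();
}
```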
@Bean public ItemReader<BlogInfo> reader() { FlatFileItemReader<BlogInfo> r = new FlatFileItemReader<>(); r.setResource(new ClassPathResource("static/bloginfo.csv")); ... }
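A completed version of the reader bean would wire a DelimitedLineTokenizer and a BeanWrapperFieldSetMapper; the column names follow the table schema, and the header-skip line is an assumption about the CSV layout:

```java
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.context.annotation.Bean;
import org.springframework.core.io.ClassPathResource;

@Bean
public ItemReader<BlogInfo> reader() {
    FlatFileItemReader<BlogInfo> reader = new FlatFileItemReader<>();
    reader.setResource(new ClassPathResource("static/bloginfo.csv"));
    reader.setLinesToSkip(1); // assumes the CSV carries a header row

    // Map comma-separated columns onto BlogInfo properties by name.
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames("blogAuthor", "blogUrl", "blogTitle", "blogItem");

    BeanWrapperFieldSetMapper<BlogInfo> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(BlogInfo.class);

    DefaultLineMapper<BlogInfo> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    reader.setLineMapper(lineMapper);
    return reader;
}
```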
@Bean public ItemProcessor<BlogInfo, BlogInfo> processor() { MyItemProcessor p = new MyItemProcessor(); p.setValidator(myBeanValidator()); return p; }
@Bean public ItemWriter<BlogInfo> writer(DataSource ds) { JdbcBatchItemWriter<BlogInfo> w = new JdbcBatchItemWriter<>(); w.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>()); w.setSql("insert into bloginfo (blogAuthor,blogUrl,blogTitle,blogItem) values(:blogAuthor,:blogUrl,:blogTitle,:blogItem)"); w.setDataSource(ds); return w; } — note that the BeanPropertyItemSqlParameterSourceProvider is required so the :named parameters bind to BlogInfo properties.
@Bean public MyJobListener myJobListener() { return new MyJobListener(); }
@Bean public MyReadListener myReadListener() { return new MyReadListener(); }
@Bean public MyWriteListener myWriteListener() { return new MyWriteListener(); }
@Bean public MyBeanValidator<BlogInfo> myBeanValidator() { return new MyBeanValidator<>(); }
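As a sketch, the job listener can simply log start, finish, and elapsed time; the exact log wording here is an assumption:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

// Job-level callbacks: invoked once before and once after the whole job.
public class MyJobListener implements JobExecutionListener {

    private static final Logger log = LoggerFactory.getLogger(MyJobListener.class);
    private long start;

    @Override
    public void beforeJob(JobExecution jobExecution) {
        start = System.currentTimeMillis();
        log.info("job {} starting", jobExecution.getJobInstance().getJobName());
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        log.info("job finished with status {} in {} ms",
                jobExecution.getStatus(), System.currentTimeMillis() - start);
    }
}
```

The read and write listeners follow the same pattern, implementing ItemReadListener and ItemWriteListener respectively to log failed items.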
The domain object BlogInfo.java defines fields id, blogAuthor, blogUrl, blogTitle, blogItem with getters, setters and a toString() method.
The MyBatis mapper BlogMapper.java provides an @Insert for persisting BlogInfo and a @Select for querying records by author ID.
The processor MyItemProcessor.java extends ValidatingItemProcessor, invokes super.process(item) for JSR-303 validation, and then modifies blogTitle based on the value of blogItem (e.g., setting a custom title when the item is "springboot").
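The title-rewriting rule can be illustrated with a dependency-free sketch. The real class extends Spring Batch's ValidatingItemProcessor and calls super.process(item) first; the class name and the exact prefix rule below are assumptions for illustration only:

```java
// Hypothetical helper isolating the transformation applied in MyItemProcessor.
// The "springboot:" prefix is an assumed example of the custom title.
class TitleRule {
    public static String rewriteTitle(String blogItem, String blogTitle) {
        if ("springboot".equals(blogItem)) {
            return "springboot:" + blogTitle; // custom title for the springboot column
        }
        return blogTitle; // all other columns pass through unchanged
    }
}
```

In the actual processor, this logic runs inside process(BlogInfo item) after validation succeeds, and the modified item is returned to the writer.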
A custom validator MyBeanValidator uses the JSR‑303 Validator to collect constraint violations and throws a ValidationException when any are found.
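A minimal sketch of such a validator, bridging JSR-303 bean validation into Spring Batch's Validator contract (error-message formatting is an assumption):

```java
import java.util.Set;

import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.ValidatorFactory;

import org.springframework.batch.item.validator.ValidationException;
import org.springframework.batch.item.validator.Validator;
import org.springframework.beans.factory.InitializingBean;

public class MyBeanValidator<T> implements Validator<T>, InitializingBean {

    private javax.validation.Validator validator;

    @Override
    public void afterPropertiesSet() {
        // Build the default JSR-303 validator once, at bean initialization.
        ValidatorFactory factory = Validation.buildDefaultValidatorFactory();
        this.validator = factory.getValidator();
    }

    @Override
    public void validate(T value) throws ValidationException {
        Set<ConstraintViolation<T>> violations = validator.validate(value);
        if (!violations.isEmpty()) {
            StringBuilder message = new StringBuilder();
            for (ConstraintViolation<T> violation : violations) {
                message.append(violation.getMessage()).append('\n');
            }
            // Throwing here lets the step's skip policy decide what happens to the item.
            throw new ValidationException(message.toString());
        }
    }
}
```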
Job execution is triggered via a simple REST controller, TestController.java, which autowires SimpleJobLauncher and the Job, builds JobParameters, and calls jobLauncher.run(myJob, jobParameters).
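The controller might look like this sketch; the timestamp parameter is an assumption, added because a completed JobInstance with identical parameters is not re-run:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController {

    @Autowired
    private SimpleJobLauncher jobLauncher;

    @Autowired
    private Job myJob;

    @GetMapping("/testJob")
    public String testJob() throws Exception {
        // A unique parameter forces a fresh JobInstance on every request.
        JobParameters params = new JobParametersBuilder()
                .addLong("timestamp", System.currentTimeMillis())
                .toJobParameters();
        jobLauncher.run(myJob, params);
        return "job launched";
    }
}
```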
After calling the /testJob endpoint, the CSV data is read, processed, and inserted into the bloginfo table; Spring Batch's metadata tables (e.g., BATCH_JOB_EXECUTION) are also created automatically.
The article then extends the example to a second job that reads from the database using MyBatisCursorItemReader, processes records with a new processor, MyItemProcessorNew, and writes to a new table, bloginfonew. It also explains a runtime issue encountered with the Druid datasource and recommends switching to Spring Boot's default HikariCP pool.
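The database-backed reader could be wired as follows; the mapper statement id and package name are assumptions, and @StepScope is used so the authorId job parameter can be injected at step runtime:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.ibatis.session.SqlSessionFactory;
import org.mybatis.spring.batch.MyBatisCursorItemReader;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;

// @StepScope defers bean creation until the step runs, so jobParameters resolve.
@Bean
@StepScope
public MyBatisCursorItemReader<BlogInfo> itemReaderNew(
        SqlSessionFactory sqlSessionFactory,
        @Value("#{jobParameters[authorId]}") String authorId) {
    MyBatisCursorItemReader<BlogInfo> reader = new MyBatisCursorItemReader<>();
    reader.setSqlSessionFactory(sqlSessionFactory);
    // Fully-qualified MyBatis statement id; assumed to point at the @Select in BlogMapper.
    reader.setQueryId("com.example.mapper.BlogMapper.queryInfoById");
    Map<String, Object> params = new HashMap<>();
    params.put("authorId", authorId);
    reader.setParameterValues(params);
    return reader;
}
```

A cursor reader streams rows one at a time instead of loading the whole result set, which keeps memory flat for large tables.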
Finally, a new endpoint /testJobNew?authorId=12345 demonstrates the full flow: database read → custom processing based on author ID → insertion into bloginfonew , confirming that the batch job works correctly after the datasource adjustment.