Quick Start Spring Batch with Spring Boot 3
Contents
- Introduction
- Target Audience
- Repository Setup
- Skeleton Batch Guide
- DB and CSV Batch Guide
- Continuous Integration
- Conclusion
Introduction
Hello, I'm Miyashita, an engineer from KINTO Technologies' Common Service Development Group[1][2][3][4][5].
While developing batch processes with Spring Boot 3, I encountered several challenges with Spring Batch's class structure and annotations, and the transition to Spring Boot 3 added further complexity, particularly around multi-database configurations and error handling.
A further difficulty was the lack of Japanese documentation for Spring Boot 3. Even with the official documentation, I had trouble finding configurations that matched my requirements, which led to a cycle of trial and error and made me acutely aware of how challenging it is to build a batch process from scratch.
In this article, I've compiled information to help others facing similar challenges implement Spring Batch more smoothly:
Spring Boot 3 Batch Skeleton
- Immediate deployment through GitHub clone and run
- Ready for business logic implementation
- Pre-configured for production use
Common Batch Processing Use Cases
In addition to the skeleton, I'll introduce sample code for batch processing utilizing a multi-database configuration:
- Batch process for exporting DB records to CSV
- Batch process for importing CSV data to DB
I hope this article helps make Spring Batch implementation and development easier.
Target Audience
This guide is designed for developers who are:
✓ New to Spring Batch 5 framework implementation
✓ Returning to Spring Batch after a hiatus
✓ Migrating from Spring Boot 2 Batch due to end of support
✓ Experienced with Java main method batches but need Spring Batch
✓ Seeking rapid batch development solutions
Repository Setup
Clone the repository to your local machine. For GUI-based operations, you can use GitHub Desktop.
Repository Structure
The repository is structured as follows:
.
├── gradlew # Gradle wrapper for build automation
├── settings.gradle
├── compose.yaml # Docker Compose configuration
├── init-scripts # Database initialization
│ ├── 1-create-table.sql
│ └── 2-insert-data.sql
├── dbAndCsvBatch # DB/CSV processing implementation
│ ├── build.gradle
│ └── src
│ ├── main
│ └── test
└── skeletonBatch # Basic batch implementation
├── build.gradle
└── src
├── main
└── test
Skeleton Batch Guide
This is skeleton code that becomes a complete batch process just by adding business logic. It contains only the necessary code components.
Core Concepts: Jobs and Steps
Spring Batch is built on two fundamental concepts: Jobs and Steps.
Jobs
A Job represents the complete batch process:
- Contains one or multiple Steps
- Manages batch execution flow
- Supports sequential and parallel processing
Job Capabilities
| Capability | Implementation | Example |
|---|---|---|
| Sequential | Chain multiple jobs | Daily data processing pipeline |
| Error Handling | Define recovery actions | Automated error notifications |
| Concurrent | Run multiple jobs simultaneously | Parallel data processing |
| Parallel | Multiple instances of same job | Large dataset processing |
Step
A Step represents a processing unit within a job. One job can register multiple steps, and step behavior can be designed flexibly like jobs.
Implementation Method
Steps can be implemented using either Chunk or Tasklet processing.
1. Chunk Processing
This is a method for efficiently processing large volumes of data.
Implementation Characteristics
Processing is divided into three phases:
- Reader: Reads data in fixed quantities
  - Example: Reading 100 lines at a time from a CSV file using FlatFileItemReader
- Processor: Processes data one item at a time
  - Example: Converting date formats using CustomProcessor
- Writer: Outputs data in bulk
  - Example: Bulk INSERT to the DB using JdbcBatchItemWriter

Transactions are committed per chunk (e.g., every 100 items). This means that even if an error occurs midway, the successfully completed chunks remain committed.
2. Tasklet Processing
This is a method for executing simple, single-unit processes.
Characteristics
- Simple implementation
- Easy to understand processing flow
- Straightforward debugging
Choosing Between Implementation Methods
| Aspect | Chunk | Tasklet |
|---|---|---|
| Data Volume | Suitable for large volumes | Suitable for small volumes |
| Processing Complexity | Suitable for complex processing | Suitable for simple processing |
| Management Cost | Higher | Lower |
| Debugging | Somewhat complex | Easy |
| Transaction | By chunk | By process unit |
Summary
- Job is the management unit for the entire batch process
- Step is the execution unit for specific processes
- Chunk is suitable for efficient processing of large data volumes
- Tasklet is optimal for simple processing
- Choose between Chunk and Tasklet based on processing requirements
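As a rough sketch of the two styles using Spring Batch 5's builder API (the step names and the placeholder reader/processor/writer lambdas here are illustrative, not the repository's actual code):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.transaction.PlatformTransactionManager;

public class StepStyles {

    // Chunk style: read -> process -> write, committed every 100 items.
    Step chunkStep(JobRepository jobRepository, PlatformTransactionManager txManager) {
        return new StepBuilder("chunk-step", jobRepository)
                .<String, String>chunk(100, txManager)  // chunk size = commit interval
                .reader(() -> null)                     // placeholder ItemReader (null = no more input)
                .processor(item -> item)                // pass-through ItemProcessor
                .writer(items -> { /* bulk output */ }) // placeholder ItemWriter
                .build();
    }

    // Tasklet style: a single execute() call holds the whole process.
    Step taskletStep(JobRepository jobRepository, PlatformTransactionManager txManager) {
        return new StepBuilder("tasklet-step", jobRepository)
                .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED, txManager)
                .build();
    }
}
```

The only structural difference is the builder call: `chunk(...)` wires three collaborating components, while `tasklet(...)` takes one self-contained unit.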
YAML Configuration and Spring Batch Management Tables
Let's first explain the basic configuration needed to run Spring Batch in batch-dedicated mode.
YAML Configuration
This configuration allows the Spring Boot application to run as a standalone process without a web server.
While typical Spring Boot applications maintain their process by starting servers like Tomcat, this is unnecessary for batch processing. With this configuration, the process terminates when batch processing completes.
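For reference, the key property behind this behavior in Spring Boot 3 is spring.main.web-application-type; a minimal sketch of such a configuration might look like:

```yaml
spring:
  main:
    web-application-type: none  # no embedded Tomcat; the process exits when the job completes
```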
Spring Batch Management Tables
Spring Batch requires management tables in a database to record execution results. However, managing databases and tables can be cumbersome. Moreover, you might need to run batches in environments without databases. In such cases, using H2 in-memory database is recommended. When the process ends:
- Memory is released
- Database and tables are cleared
- Next execution starts with a fresh state
- H2 is automatically configured if no database settings are specified in application.yml
Job Class Guide
This is the core class for registering and executing Jobs in Spring Batch.
Class Definition Key Points
- @Configuration
  - Recognized as a Spring Batch configuration class
  - Indicates that this class provides Spring configuration
  - Required annotation for Job definition classes
- @Bean
  - Applied to methods that generate objects managed by the Spring Framework
  - Here, allows Spring to manage the Job created by createSampleJob
  - The Job instance is used during batch execution
Dependency Class Roles
- JobRepository: Manages job execution state
- PlatformTransactionManager: Maintains database consistency
- SampleLogic: Handles the actual business processing
Processing Flow
- Output job registration start log
- Create Step (define processing unit)
- Create Job and register Step
- Output job registration completion log
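Putting these points together, a minimal sketch of such a Job definition class might look like the following (the job and step names match the execution logs; details of the actual repository class may differ):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class SampleJobConfig {

    @Bean
    public Job createSampleJob(JobRepository jobRepository,
                               PlatformTransactionManager txManager,
                               SampleLogic sampleLogic) {
        // Create the Step (the processing unit)...
        Step step = new StepBuilder("processing-step", jobRepository)
                .tasklet(sampleLogic, txManager)  // one transaction per tasklet execution
                .build();
        // ...then create the Job and register the Step.
        return new JobBuilder("sample-job", jobRepository)
                .start(step)
                .build();
    }
}
```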
Transaction Management
- Normal completion: Database changes are confirmed when all processing succeeds
- Abnormal termination: Database changes are rolled back when errors occur
This ensures the reliability of batch processing operations.
Logic Class Guide
This class defines the actual processing content of the batch.
Class Definition Key Points
- @Component
  - Makes this class manageable by Spring
  - Allows usage with @Autowired in other classes
- Tasklet Interface
  - Interface representing a Spring Batch processing unit
  - Implement the actual processing in the execute method
Processing Flow
- Output batch processing start log
- Call SampleService's process method
- Output completion log on normal termination
- Log exception and re-throw on error occurrence
- Output batch processing end log
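A minimal sketch of such a logic class, assuming a SampleService as described in the Service Class Guide (the exact repository code may differ):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.stereotype.Component;

@Component
public class SampleLogic implements Tasklet {

    private static final Logger log = LoggerFactory.getLogger(SampleLogic.class);
    private final SampleService sampleService;

    public SampleLogic(SampleService sampleService) {
        this.sampleService = sampleService;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
            throws Exception {
        log.info("----------- START ----------- Batch Processing -----------");
        try {
            sampleService.process();  // delegate to the business logic
        } catch (Exception e) {
            log.error("Batch processing failed", e);
            throw e;  // re-throw so Spring Batch marks the step as FAILED
        } finally {
            log.info("----------- END ----------- Batch Processing -----------");
        }
        return RepeatStatus.FINISHED;
    }
}
```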
Error Handling
- Normal completion: Return RepeatStatus.FINISHED to complete the batch
- Abnormal termination: Catch the exception, log it, and notify the upper layer
Service Class Guide
This class is for implementing actual business logic.
Class Definition Key Points
- @Service
  - Recognized as a Spring Framework service class
  - Indicates this class implements business logic
- process Method
  - Method for implementing the batch's business processing
  - Includes start and completion log output
Customization Options
You can modify:
- Class name
- Method name
- Arguments and return values
- Business logic implementation (e.g., data validation, transformation, external system integration)
This concludes our explanation of the main skeleton batch classes. This loosely coupled design makes the batch easy to maintain and test.
Running the Batch
Running from IDE
Start the BatchApp class. It works like a regular Spring Boot application with no special startup arguments required.
Running via Gradle from Terminal
Execute the following command:
./gradlew :skeletonBatch:bootRun
Running by Generating and Executing JAR File
Gradle's default task is configured to generate a JAR file. Execute as follows:
cd skeletonBatch
../gradlew
java -jar build/libs/batch-skeleton*.jar
Checking Execution Logs
Let's examine the Spring Batch execution flow through the logs.
1. Job Registration Check
----------- Registering job: sample -----------
----------- Job registered successfully: sample-job -----------
During Spring Boot startup, the batch job 'sample-job' was successfully registered.
2. Batch Processing Execution
Started BatchApp in 0.456 seconds (process running for 0.616)
Running default command line with: []
Job: [SimpleJob: [name=sample-job]] launched with the following parameters: [{}]
Executing step: [processing-step]
----------- START ----------- Batch Processing -----------
--- Starting batch process ---
--- Batch process completed ---
Processing completed successfully
----------- END ----------- Batch Processing -----------
Step: [processing-step] executed in 3ms
Job: [SimpleJob: [name=sample-job]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 9ms
Important Points from Logs:
- 'sample-job' executed and 'processing-step' started
- START and END markers clearly show processing boundaries
- Step completed successfully and job finished with COMPLETED status
- Application terminated normally
Skeleton Batch Summary
Advantages
- Complete batch process by just adding business logic
- Integration with schedulers like cron using generated JAR file
- High maintainability through simple structure
This skeleton code aims to contribute to more efficient batch development.
DB and CSV Batch Guide
Overview
This project includes two batch processes:
- DB to CSV Export
  - Exports database records to CSV format
  - Customizable extraction conditions via startup arguments
  - Works with default settings
- CSV to DB Import
  - Bulk imports CSV data into the database
  - Requires a CSV file, so run the export batch first
Local Database Setup
This batch uses a MySQL local database. Follow these steps for setup:
- Start MySQL Container
docker compose up -d
- Verify Setup
Connect to MySQL container and check sample data:
docker exec -it mysql-container mysql -u sampleuser -psamplepassword sampledb
Run sample query:
mysql>
SELECT *
FROM member
WHERE delete_flag = 0
AND type IN (1, 2, 3)
ORDER BY type ASC;
mysql> exit
Bye
Entity Class Auto-generation
After cloning the GitHub repository, you might encounter an "Entity class not found" error. This occurs because jOOQ needs to auto-generate the Entity classes.
Running Auto-generation
Either command will generate Entity classes:
# Execute default task
cd dbAndCsvBatch/
../gradlew
# Or directly execute jOOQ generation
../gradlew generateJooq
The build.gradle is configured for immediate use without additional settings. Default task execution includes:
- Cleanup of build results
- Java code formatting (Google Java Format)
- Entity class auto-generation (jOOQ plugin)
- Compilation
- Static analysis and testing (JUnit, coverage, SpotBugs)
- Executable JAR generation
Generated Entity Classes
Check generated Entity classes with:
tree dbAndCsvBatch/build/generated-src
# If tree command is not installed:
ls -R dbAndCsvBatch/build/generated-src
Example output:
dbAndCsvBatch/build/generated-src/jooq
└── main
└── com
└── example
└── batch
└── jooq
├── DefaultCatalog.java
├── Keys.java
├── Sampledb.java
├── Tables.java
└── tables
├── Member.java
└── records
└── MemberRecord.java
Handling "Class Not Found" Errors
Your IDE might report Entity classes not found. Resolve this by either:
- Add generated-src/jooq to IDE's build path
- Copy classes from generated-src/jooq to dbAndCsvBatch/src/main/java
Production Environment Setup
Multi-database Configuration
This batch uses a dual database configuration:
- H2 (In-memory Database)
  - Used for Spring Batch management tables
  - Resets when the process ends
  - Ideal for development and testing
- MySQL (Business Logic Database)
  - Used for actual business data processing
  - Runs in a Docker container
Configuration File (application-local.yml)
Spring Boot's configuration file defines two databases under spring.datasource.
The configuration file includes:
- Two database settings
- Profile suffixes (local and server)
- Runtime profile selection capability
The server profile is intended for your production MySQL server settings.
Data Source Configuration Class
Key Annotations
- @Configuration
  - Marks the class as Spring configuration
  - Spring manages the beans defined here
- @ConfigurationProperties
  - Maps YAML settings to class properties
  - Example: maps 'spring.datasource.mysqlmain'
- @BatchDataSource
  - Specifies the data source for Spring Batch management tables
  - Applied to the H2 database
- @Primary
  - Designates the default data source
  - Applied to the MySQL data source
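A minimal sketch of such a configuration class (the property prefix spring.datasource.mysqlmain comes from the configuration above; spring.datasource.h2batch is a hypothetical name for the H2 settings):

```java
import javax.sql.DataSource;
import org.springframework.boot.autoconfigure.batch.BatchDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourceConfig {

    // MySQL: the default data source used for business data.
    @Bean
    @Primary
    @ConfigurationProperties("spring.datasource.mysqlmain")
    public DataSource mysqlDataSource() {
        return DataSourceBuilder.create().build();
    }

    // H2: dedicated to Spring Batch's own management tables.
    // (the property prefix here is a hypothetical example)
    @Bean
    @BatchDataSource
    @ConfigurationProperties("spring.datasource.h2batch")
    public DataSource batchDataSource() {
        return DataSourceBuilder.create().build();
    }
}
```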
Job Class Guide
New Features Added
- Automatic Run ID Increment: .incrementer(new RunIdIncrementer())
  - Automatically increments the job execution number (run.id)
  - Used for execution history management
- Listener Integration: .listener(listener)
  - Adds processing before and after job execution
  - Useful for logging, email notifications, etc.
Logic Class Guide
This example demonstrates basic logging implementation, but you can add custom processing like error handling and notifications.
Parameter Configuration
@Value("${batch.types:2,3,4}")
private String typesConfig;
- Uses 2,3,4 as the default value even without an entry in application.yml
- Can be overridden by passing --batch.types=1,2 at runtime
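The comma-separated string can then be split into the integer list used by the query with plain Java; a small self-contained sketch (the class name BatchTypeParser is illustrative):

```java
import java.util.Arrays;
import java.util.List;

public class BatchTypeParser {

    /** Parses a comma-separated types string such as "2,3,4" into a list of integers. */
    public static List<Integer> parseTypes(String typesConfig) {
        return Arrays.stream(typesConfig.split(","))
                .map(String::trim)         // tolerate spaces, e.g. "1, 2"
                .map(Integer::valueOf)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(parseTypes("2,3,4"));  // the default value
        System.out.println(parseTypes("1, 2"));   // a runtime override like --batch.types=1,2
    }
}
```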
Error Handling
contribution.setExitStatus(ExitStatus.FAILED);
- Explicitly sets failure status on error
- Automatically sets COMPLETED status on normal completion
Repository Class and jOOQ Guide
About jOOQ
jOOQ is a library that allows writing SQL as Java code:
- Express SQL syntax directly in Java code
- Auto-generate Entity classes from table definitions
- Type-safe SQL operations
Select Processing
Key Features
- SQL-like Syntax
  - Direct expression of SQL in Java code
  - Leverages existing SQL knowledge
- Auto-generated Class Usage
  - MEMBER: Represents the table definition (column names, etc.)
  - MemberRecord: Used for record mapping
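A sketch of what such a select might look like with jOOQ's DSL, mirroring the query that appears in the execution logs (column types, such as the byte cast for delete_flag, are assumptions):

```java
import static com.example.batch.jooq.Tables.MEMBER;

import com.example.batch.jooq.tables.records.MemberRecord;
import java.util.List;
import org.jooq.DSLContext;
import org.jooq.Result;

public class MemberSelectRepository {

    private final DSLContext dsl;

    public MemberSelectRepository(DSLContext dsl) {
        this.dsl = dsl;
    }

    /** Mirrors the logged query: delete_flag = 0 AND type IN (...) ORDER BY type. */
    public Result<MemberRecord> findActiveByTypes(List<Integer> types) {
        return dsl.selectFrom(MEMBER)
                  .where(MEMBER.DELETE_FLAG.eq((byte) 0)
                          .and(MEMBER.TYPE.in(types)))
                  .orderBy(MEMBER.TYPE.asc())
                  .fetch();
    }
}
```

The SQL structure is visible directly in the Java code, and a typo in a column name becomes a compile error rather than a runtime failure.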
Bulk Insert Processing
Bulk Insert Features
- Registers multiple records at once instead of one-by-one
- Common in batch processing to reduce database load
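A sketch of how such a multi-row insert can be built with jOOQ's chained values() calls, mirroring the logged INSERT statement (the MemberCsvRow record is a hypothetical holder, not the repository's actual class):

```java
import static com.example.batch.jooq.Tables.MEMBER;

import java.util.List;
import org.jooq.DSLContext;

public class MemberInsertRepository {

    private final DSLContext dsl;

    public MemberInsertRepository(DSLContext dsl) {
        this.dsl = dsl;
    }

    // Builds a single multi-row INSERT instead of one statement per record.
    public int bulkInsert(List<MemberCsvRow> rows) {
        var insert = dsl.insertInto(MEMBER,
                MEMBER.NAME, MEMBER.EMAIL, MEMBER.PHONE, MEMBER.ADDRESS, MEMBER.TYPE);
        for (MemberCsvRow row : rows) {
            insert = insert.values(row.name(), row.email(), row.phone(), row.address(), row.type());
        }
        return insert.execute();  // returns the number of affected rows
    }

    /** Hypothetical CSV row holder; the real class may differ. */
    public record MemberCsvRow(String name, String email, String phone, String address, Integer type) {}
}
```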
Repository Class Summary
- Use jOOQ's Gradle plugin to auto-generate entities from database schema
- Auto-generated classes enable type-safe SQL writing in Java
- Easy to handle schema changes - just regenerate classes
DB to CSV Batch Execution Guide
Run the DB to CSV batch following similar steps as the skeleton batch.
Running via Gradle
./gradlew :dbAndCsvBatch:bootRun
Running JAR File
cd dbAndCsvBatch/
../gradlew
java -jar build/libs/batch-dbAndCsv*.jar
Running without arguments produces this error:
org.springframework.beans.factory.BeanCreationException:
at com.example.batch.DbAndCsvBatchApp.main(DbAndCsvBatchApp.java:16)
Caused by: java.lang.IllegalArgumentException: Job name must be specified in case of multiple jobs
Why This Error Occurs
Unlike the skeleton batch with a single job, this project contains multiple jobs:
- CSV to DB: Importing CSV data into database
- DB to CSV: Exporting database data to CSV
The framework cannot determine which job to execute without explicit specification.
Running with Arguments
Specify the job name and environment in startup arguments:
--spring.batch.job.name=DB_TO_CSV --spring.profiles.active=local
- --spring.batch.job.name=DB_TO_CSV: Specifies which job to run
- --spring.profiles.active=local: Activates local environment settings
Example Commands
# Using Gradle
./gradlew :dbAndCsvBatch:bootRun --args="--spring.batch.job.name=DB_TO_CSV --spring.profiles.active=local"
# Using JAR directly
cd dbAndCsvBatch/
java -jar build/libs/batch-dbAndCsv*.jar --spring.batch.job.name=DB_TO_CSV --spring.profiles.active=local
Examining the Logs
Let's check the execution logs:
1. Initial Startup Log
##### KEY:"sun.java.command", VALUE:"com.example.batch.DbAndCsvBatchApp --spring.batch.job.name=DB_TO_CSV --spring.profiles.active=local"
##### Spring Batch ##### - Job: DB_TO_CSV, Profile: local
This confirms our startup arguments were correctly received.
2. Job Start Log
----------- JOB [Job Name:DB_TO_CSV] START! -----------
Executing step: [DB_TO_CSV-step]
Fetching members with types = [1, 2, 3]
Shows BatchNotificationListener's beforeJob method execution and configured types.
3. Database Operation Log
-> with bind values : select `sampledb`.`member`.`id`, `sampledb`.`member`.`type`, `sampledb`.`member`.`name`, `sampledb`.`member`.`email`, `sampledb`.`member`.`phone`, `sampledb`.`member`.`address`, `sampledb`.`member`.`delete_flag`, `sampledb`.`member`.`created_at`, `sampledb`.`member`.`updated_at`
from `sampledb`.`member`
where (`sampledb`.`member`.`delete_flag` = 0 and `sampledb`.`member`.`type` in (1, 2, 3)) order by `sampledb`.`member`.`type`
Version : Database version is supported by dialect MYSQL: 9.1.0
Fetched result : +----+----+----------+----------------------+----------+-----------------------------+-----------+-------------------+-------------------+
: | id|type|name |email |phone |address |delete_flag|created_at |updated_at |
: +----+----+----------+----------------------+----------+-----------------------------+-----------+-------------------+-------------------+
: | 1| 1|John Doe |john.doe@example.com |1234567890|123 Main St, City, Country | 0|2024-12-07T09:46:07|2024-12-07T09:46:07|
: | 2| 1|Jane Smith|jane.smith@example.com|0987654321|456 Oak St, Town, Country | 0|2024-12-07T09:46:07|2024-12-07T09:46:07|
: | 26| 1|John Doe |john.doe@example.com |1234567890|123 Main St, City, Country | 0|2024-12-09T05:36:37|2024-12-09T05:36:37|
: | 27| 1|Jane Smith|jane.smith@example.com|0987654321|456 Oak St, Town, Country | 0|2024-12-09T05:36:37|2024-12-09T05:36:37|
: | 3| 2|ABC Corp |contact@abccorp.com |5678901234|789 Pine St, Village, Country| 0|2024-12-07T09:46:07|2024-12-07T09:46:07|
: +----+----+----------+----------------------+----------+-----------------------------+-----------+-------------------+-------------------+
Fetched row(s) : 5 (or more)
Batch process completed successfully.
Step: [DB_TO_CSV-step] executed in 193ms
Job: [SimpleJob: [name=DB_TO_CSV]] completed with the following parameters: [{'run.id':'{value=1, type=class java.lang.Long, identifying=true}'}] and the following status: [COMPLETED] in 212ms
----------- JOB [Job Name:DB_TO_CSV] FINISHED! status:[COMPLETED] -----------
jOOQ's query and result logging is enabled in logback.xml for easier debugging.
Generated CSV Check
"id","type","name","email","phone","address","deleteFlag","createdAt","updatedAt"
"1","1","John Doe","john.doe@example.com","1234567890","123 Main St, City, Country","0","2024-12-11T06:05:26","2024-12-11T06:05:26"
"2","1","Jane Smith","jane.smith@example.com","0987654321","456 Oak St, Town, Country","0","2024-12-11T06:05:26","2024-12-11T06:05:26"
"3","2","ABC Corp","contact@abccorp.com","5678901234","789 Pine St, Village, Country","0","2024-12-11T06:05:26","2024-12-11T06:05:26"
"5","3","Alice Premium","alice.premium@example.com","4561237890","987 Maple St, City, Country","0","2024-12-11T06:05:26","2024-12-11T06:05:26"
"6","3","Charlie Davis","charlie.davis@example.com","1112223333",,"0","2024-12-11T06:05:26","2024-12-11T06:05:26"
The CSV file is generated using the OpenCSV library.
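A minimal sketch of such CSV output with OpenCSV's bean writer (the MemberCsvBean and its fields are hypothetical; the actual bean and column layout may differ):

```java
import com.opencsv.bean.CsvBindByName;
import com.opencsv.bean.StatefulBeanToCsvBuilder;
import java.io.FileWriter;
import java.io.Writer;
import java.util.List;

public class MemberCsvExporter {

    /** Hypothetical CSV bean; the real class and its fields may differ. */
    public static class MemberCsvBean {
        @CsvBindByName public String id;
        @CsvBindByName public String name;
    }

    public void export(String path, List<MemberCsvBean> rows) throws Exception {
        try (Writer out = new FileWriter(path)) {
            new StatefulBeanToCsvBuilder<MemberCsvBean>(out)
                    .withApplyQuotesToAll(true)  // quote every field, as in the file above
                    .build()
                    .write(rows);
        }
    }
}
```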
Testing Runtime Argument Override
Let's try customizing the types parameter at runtime:
--spring.batch.job.name=DB_TO_CSV --batch.types=4,5 --spring.profiles.active=local
Log Output
-> with bind values : select `sampledb`.`member`.`id`, `sampledb`.`member`.`type`, `sampledb`.`member`.`name`, `sampledb`.`member`.`email`, `sampledb`.`member`.`phone`, `sampledb`.`member`.`address`, `sampledb`.`member`.`delete_flag`, `sampledb`.`member`.`created_at`, `sampledb`.`member`.`updated_at` from `sampledb`.`member` where (`sampledb`.`member`.`delete_flag` = 0 and
`sampledb`.`member`.`type` in (4, 5)) order by `sampledb`.`member`.`type`
This confirms that the configuration file values were successfully overridden by command line arguments.
CSV to DB Batch Execution
Run the batch with the specified job name:
--spring.batch.job.name=CSV_TO_DB --spring.profiles.active=local
Log Output
insert into `sampledb`.`member` (`name`, `email`, `phone`, `address`, `type`) values
('Premium Corp', 'premium@corporate.com', '8889997777', '555 High St, City, Country', 4),
('Elite Ltd', 'elite@elitecorp.com', '4445556666', '777 Sky Ave, Town, Country', 4),
('Guest User1', 'guest1@example.com', '', 'Guest Address 1, City, Country', 5),
('Guest User2', 'guest2@example.com', '9998887777', '', 5)
Affected row(s) : 4
This confirms successful bulk insertion of records.
Continuous Integration
This project implements CI (Continuous Integration) using GitHub Actions. Quality management is efficiently handled through automated processing triggered by code pushes or pull requests.
Main Workflow
- MySQL Setup
- Launch MySQL using Docker Compose
- Verify required tables
- JDK 21 Setup
- Install Java 21
- Configure build environment
- jOOQ Class Generation
- Auto-generate entity classes from database schema
- Build and Test
- Execute Gradle build
- Run quality checks:
- JUnit tests
- Jacoco coverage
- SpotBugs analysis
This workflow enables:
- Immediate detection of compilation errors
- Quick identification of JUnit test failures
- Rapid response to issues
- Consistent code quality maintenance
Dynamic Badge Updates
When the build and tests succeed, a dynamic badge is displayed on the GitHub README, indicating the status of the project.
Code Coverage Visualization
This project uses Codecov to measure and visualize test coverage. Coverage reports are automatically generated during pull requests, and the coverage rate can be checked via this badge:
This enables:
- Visual tracking of test coverage
- Quick detection of coverage changes
- Enhanced transparency in quality management
Detailed coverage reports are available on the Codecov dashboard.
Conclusion
We hope you found this guide helpful!
This project provides:
- Skeleton code foundation for efficient Spring Batch development
- Common use cases like "DB to CSV" and "CSV to DB"
- Flexible customization through database settings and CSV layouts
We hope this skeleton helps streamline your batch development by allowing you to focus on business logic implementation.
If you found this article helpful and got your Spring Batch up and running quickly, we would appreciate a ⭐ on our GitHub repository!
Thanks for reading!
[1] Post by Common Service Development Group Member 1: "Implementing Domain-Driven Design (DDD) in Payment Platform with Global Expansion in Mind"
[2] Post by Common Service Development Group Member 2: "Success Story: New System Development through Remote Mob Programming by Team Members with Less Than One Year Experience"
[3] Post by Common Service Development Group Member 3: "Improving Deploy Traceability Across Multiple Environments Using JIRA and GitHub Actions"
[4] Post by Common Service Development Group Member 4: "Development Environment Setup Using VSCode Dev Container"
[5] Post by Common Service Development Group Member 5: "Guide to Setting Up S3 Local Development Environment Using MinIO (AWS SDK for Java 2.x)"