MyBatis Streaming Query: Concepts, Cursor API, Implementation, and Use Cases
This article introduces MyBatis streaming queries, explains the Cursor interface and its methods, provides step‑by‑step code examples for configuration and implementation, discusses practical application scenarios such as large‑scale data processing, and highlights important considerations for efficient and safe usage.
MyBatis streaming queries allow the result set to be returned as an iterator (Cursor) instead of loading all data into memory, which is useful for processing large datasets efficiently.
Environment Configuration
JDK version: 1.8
IDE: IntelliJ IDEA 2020.1
Spring Boot: 2.3.9.RELEASE
mybatis-spring-boot-starter: 2.1.4
What Is MyBatis Streaming Query?
When using MyBatis as the persistence layer, a streaming query returns a org.apache.ibatis.cursor.Cursor iterator, allowing rows to be fetched one by one, reducing memory consumption.
Cursor Interface
The Cursor interface defines three key methods:
isOpen() : checks if the cursor is currently open.
isConsumed() : determines whether all rows have been read.
getCurrentIndex() : returns the index of the current row within the total result set.
public interface Cursor
extends Closeable, Iterable
{
// 判断 cursor 是否正处于打开状态
boolean isOpen();
// 判断查询结果是否全部读取完
boolean isConsumed();
// 查询已读取数据在全部数据里的索引位置
int getCurrentIndex();
}Code Implementation
Define a mapper method that returns a Cursor<Person> and a regular count query:
@Mapper
public interface PersonDao {
Cursor<Person> selectByCursor();
Integer queryCount();
}Configure the corresponding SQL statements in the MyBatis XML mapper.
In the service layer, open a SqlSession , obtain the mapper, and iterate over the cursor, processing data in batches (e.g., 1000 rows) while keeping the session open until all rows are consumed, then commit and close the session:
@Service
@Slf4j
public class PersonServiceImpl implements IPersonService {
@Autowired
private SqlSessionFactory sqlSessionFactory;
@Override
public void getOneByAsync() throws InterruptedException {
new Thread(() -> {
SqlSession sqlSession = sqlSessionFactory.openSession();
try {
PersonDao mapper = sqlSession.getMapper(PersonDao.class);
Cursor<Person> cursor = mapper.selectByCursor();
Integer total = mapper.queryCount();
List<Person> batch = new ArrayList<>();
int batchIndex = 0;
for (Person person : cursor) {
if (batch.size() < 1000) {
batch.add(person);
} else {
batchIndex++;
log.info("----{}、从cursor取数据达到1000条,开始处理数据", batchIndex);
// simulate processing
Thread.sleep(1000);
log.info("----{}、从cursor中取出的1000条数据已经处理完毕", batchIndex);
batch.clear();
batch.add(person);
}
if (total == cursor.getCurrentIndex() + 1) {
// process remaining data
batchIndex++;
log.info("----{}、从cursor取数据达到1000条,开始处理数据", batchIndex);
Thread.sleep(1000);
log.info("----{}、从cursor中取出的1000条数据已经处理完毕", batchIndex);
batch.clear();
}
}
if (cursor.isConsumed()) {
log.info("----查询sql匹配中的数据已经消费完毕!");
}
sqlSession.commit();
log.info("----提交事务");
} catch (Exception e) {
e.printStackTrace();
sqlSession.rollback();
} finally {
if (sqlSession != null) {
sqlSession.close();
log.info("----关闭sqlSession");
}
}
}).start();
}
}Application Scenarios
Streaming queries are ideal for processing massive data sets, such as generating payroll reports for 500,000 employees, where loading all rows at once would cause memory overflow and long garbage‑collection pauses.
Typical approach: partition data by month/department, submit each partition as an asynchronous task to a thread pool, use CountDownLatch to wait for all tasks, and within each task employ MyBatis streaming to fetch and process data in manageable batches.
Considerations
Streaming reduces memory usage but may increase total processing time; it requires keeping the SqlSession open for the entire iteration, handling transactions manually, and being aware of potential connection timeouts.
Proper batch size selection, error handling, and timely commit/rollback are essential to avoid data loss or resource leaks.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.