Efficient Import and Export of Massive Data Using POI and EasyExcel
This article explains how to handle large‑scale Excel import and export in Java by comparing the traditional POI workbook implementations, selecting the appropriate one based on data volume and requirements, and presenting a high‑performance solution built on EasyExcel, batch processing, and JDBC transactions for millions of rows.
Many projects need to import data from Excel into a database and export it back to Excel; the author ran into performance and memory problems with very large datasets and set out to solve them.
Traditional POI versions and their pros/cons – HSSFWorkbook works with the old .xls format and is limited to 65,535 rows; XSSFWorkbook supports the newer .xlsx format up to 1,048,576 rows; SXSSFWorkbook (available from POI 3.8) uses a low‑memory, disk‑backed approach that can handle millions of rows but has limitations such as no formula evaluation and restricted sheet cloning.
Choosing the right workbook – For data under about 70,000 rows either HSSFWorkbook or XSSFWorkbook is sufficient; for data exceeding 70,000 rows without needing styles or formulas, SXSSFWorkbook is recommended; when styles or formulas are required, XSSFWorkbook with batch writes should be used.
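The selection rules above can be captured in a small helper. This is a sketch with hypothetical names (`WorkbookChooser`, `choose`); the thresholds come from the article, not from any POI API:

```java
// Hypothetical helper illustrating the workbook-selection rules above.
public class WorkbookChooser {

    enum WorkbookType { HSSF, XSSF, SXSSF }

    // rows: expected row count; needsStylesOrFormulas: whether the export
    // must evaluate formulas or apply rich styles (unsupported by SXSSF).
    static WorkbookType choose(int rows, boolean needsStylesOrFormulas) {
        if (rows <= 65535) {
            return WorkbookType.HSSF;   // legacy .xls format is enough
        }
        if (needsStylesOrFormulas || rows <= 70000) {
            return WorkbookType.XSSF;   // full-featured .xlsx, write in batches
        }
        return WorkbookType.SXSSF;      // streaming, disk-backed sliding window
    }

    public static void main(String[] args) {
        System.out.println(choose(50_000, false));    // HSSF
        System.out.println(choose(3_000_000, false)); // SXSSF
        System.out.println(choose(300_000, true));    // XSSF
    }
}
```

The helper encodes the article's rule of thumb only; in a real project the decision may also depend on concurrent export load and available heap.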
Massive data export (3 million rows) with EasyExcel – The solution splits the query into batches (e.g., 200 k rows per query), writes each batch to a sheet, starts a new sheet after 1 million rows, and repeats until all data is exported. The author provides a complete Java export snippet that calculates the required number of sheets and batch writes, and uses ExcelWriter to stream data directly to the HTTP response.
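The batching arithmetic can be checked with plain numbers before looking at the full export method. A sketch, using illustrative constants that mirror the article's ExcelConstants (3,000,000 total rows, 1,000,000 rows per sheet, 200,000 rows per write):

```java
// Worked example of the sheet/batch arithmetic used by the export below.
public class BatchMath {
    static final int TOTAL = 3_000_000;       // rows to export
    static final int PER_SHEET = 1_000_000;   // rows per sheet (.xlsx limit is 1,048,576)
    static final int PER_WRITE = 200_000;     // rows fetched per paged query

    static int sheetNum() {
        return TOTAL % PER_SHEET == 0 ? TOTAL / PER_SHEET : TOTAL / PER_SHEET + 1;
    }

    static int writesPerFullSheet() {
        return PER_SHEET / PER_WRITE;
    }

    static int writesForLastSheet() {
        int rest = TOTAL % PER_SHEET;          // rows left over for the final sheet
        if (rest == 0) return writesPerFullSheet();
        return rest % PER_WRITE == 0 ? rest / PER_WRITE : rest / PER_WRITE + 1;
    }

    public static void main(String[] args) {
        System.out.println(sheetNum());            // 3 sheets
        System.out.println(writesPerFullSheet());  // 5 writes of 200 k per sheet
        System.out.println(writesForLastSheet());  // 5 (the last sheet is also full)
    }
}
```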
public void dataExport300w(HttpServletResponse response) {
    OutputStream outputStream = null;
    try {
        long startTime = System.currentTimeMillis();
        // Set download headers before touching the response body.
        String fileName = "excel100w";
        response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
        response.setCharacterEncoding("utf-8");
        response.setHeader("Content-Disposition", "attachment;filename=" + fileName + ".xlsx");
        outputStream = response.getOutputStream();
        ExcelWriter writer = new ExcelWriter(outputStream, ExcelTypeEnum.XLSX);
        Table table = new Table(1);
        List<List<String>> titles = new ArrayList<>();
        titles.add(Arrays.asList("onlineseqid"));
        titles.add(Arrays.asList("businessid"));
        // ... other columns ...
        table.setHead(titles);
        int totalCount = actResultLogMapper.findActResultLogByCondations(3000001);
        int sheetDataRows = ExcelConstants.PER_SHEET_ROW_COUNT;  // rows per sheet
        int writeDataRows = ExcelConstants.PER_WRITE_ROW_COUNT;  // rows per paged query
        int sheetNum = totalCount % sheetDataRows == 0 ? totalCount / sheetDataRows : totalCount / sheetDataRows + 1;
        int oneSheetWriteCount = sheetDataRows / writeDataRows;
        // Writes needed for the final (possibly partial) sheet.
        int lastSheetWriteCount = totalCount % sheetDataRows == 0 ? oneSheetWriteCount
                : (totalCount % sheetDataRows % writeDataRows == 0
                        ? totalCount % sheetDataRows / writeDataRows
                        : totalCount % sheetDataRows / writeDataRows + 1);
        List<List<String>> dataList = new ArrayList<>();
        for (int i = 0; i < sheetNum; i++) {
            Sheet sheet = new Sheet(i, 0);
            sheet.setSheetName("Sheet" + i);
            for (int j = 0; j < (i != sheetNum - 1 ? oneSheetWriteCount : lastSheetWriteCount); j++) {
                dataList.clear();
                // Page j+1 within this sheet, offset by the pages already consumed by earlier sheets.
                PageHelper.startPage(j + 1 + oneSheetWriteCount * i, writeDataRows);
                List<ActResultLog> resultList = actResultLogMapper.findByPage100w();
                if (!CollectionUtils.isEmpty(resultList)) {
                    resultList.forEach(item -> dataList.add(Arrays.asList(
                            item.getOnlineseqid(), item.getBusinessid(), item.getBecifno(),
                            item.getIvisresult(), item.getCreatedby(), Calendar.getInstance().getTime().toString(),
                            item.getUpdateby(), Calendar.getInstance().getTime().toString(), item.getRisklevel())));
                }
                writer.write0(dataList, sheet, table);
            }
        }
        writer.finish();
        outputStream.flush();
        long endTime = System.currentTimeMillis();
        System.out.println("Export finished at: " + endTime + "ms");
        System.out.println("Export took: " + (endTime - startTime) / 1000 + "s");
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (outputStream != null) {
            try { outputStream.close(); } catch (Exception e) { e.printStackTrace(); }
        }
    }
}
Massive data import (3 million rows) – The import reads the Excel file in batches (e.g., 200 k rows), accumulates each batch in a list, and when the list reaches the threshold performs a JDBC batch insert inside a manually controlled transaction. The author provides a listener class for EasyExcel, a JDBC utility class, and a service method that builds a PreparedStatement and executes addBatch() followed by executeBatch() and commit().
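The listener's batching pattern can be sketched without the EasyExcel dependency. The class and method names here (`BatchingListener`, `onRow`, `onSheetFinished`) are hypothetical stand-ins for the article's listener and its call into the JDBC service:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal stand-in for the EasyExcel listener described above: cache rows,
// hand each full batch to a sink (the JDBC insert service), and flush the
// remainder when the sheet ends. Names are illustrative, not the EasyExcel API.
public class BatchingListener {
    private final int batchSize;                       // e.g. 200_000 in the article
    private final List<List<String>> cache = new ArrayList<>();
    private final Consumer<List<List<String>>> sink;   // e.g. the batch-insert service method
    private int flushes = 0;

    BatchingListener(int batchSize, Consumer<List<List<String>>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per parsed row (EasyExcel's listener calls this "invoke").
    void onRow(List<String> row) {
        cache.add(row);
        if (cache.size() >= batchSize) flush();
    }

    // Called when the sheet ends (EasyExcel calls this "doAfterAllAnalysed").
    void onSheetFinished() {
        if (!cache.isEmpty()) flush();
    }

    private void flush() {
        sink.accept(new ArrayList<>(cache));   // pass a copy so the cache can be cleared
        cache.clear();
        flushes++;
    }

    int flushCount() { return flushes; }
}
```

With 10 rows and a batch size of 3, this produces three full flushes plus one remainder flush, which is the same shape the real listener gives the insert service below.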
public Map<String, String> import2DBFromExcel10w(List<Map<Integer, String>> dataList) {
    Map<String, String> result = new HashMap<>();
    if (dataList.isEmpty()) { result.put("empty", "0000"); return result; }
    Connection conn = null;
    PreparedStatement ps = null;
    try {
        long startTime = System.currentTimeMillis();
        conn = JDBCDruidUtils.getConnection();
        conn.setAutoCommit(false);   // manual transaction: commit once per batch
        String sql = "insert into ACT_RESULT_LOG (onlineseqid,businessid,becifno,ivisresult,createdby,createddate,updateby,updateddate,risklevel) values (?,?,?,?,?,?,?,?,?)";
        ps = conn.prepareStatement(sql);
        for (Map<Integer, String> item : dataList) {
            ps.setString(1, item.get(0));
            ps.setString(2, item.get(1));
            ps.setString(3, item.get(2));
            ps.setString(4, item.get(3));
            ps.setString(5, item.get(4));
            ps.setTimestamp(6, new Timestamp(System.currentTimeMillis()));
            ps.setString(7, item.get(6));
            ps.setTimestamp(8, new Timestamp(System.currentTimeMillis()));
            ps.setString(9, item.get(8));
            ps.addBatch();
        }
        ps.executeBatch();
        conn.commit();
        long endTime = System.currentTimeMillis();
        System.out.println(dataList.size() + " rows imported in " + (endTime - startTime) + "ms");
        result.put("success", "1111");
    } catch (Exception e) {
        // Roll back the partial batch so the table is left unchanged.
        try { if (conn != null) conn.rollback(); } catch (Exception ignored) { }
        result.put("exception", "0000");
        e.printStackTrace();
    } finally {
        JDBCDruidUtils.close(conn, ps);
    }
    return result;
}
Test results – Exporting 3 million rows took about 2 minutes 15 seconds, while importing the same amount using the described batch JDBC approach took roughly 8 seconds. The author also notes that EasyExcel can export 1.02 million rows in 105 seconds (≈163 MB file) and that the overall solution is fast enough for practical use.
Conclusion – By selecting the proper POI workbook, leveraging EasyExcel for low‑memory streaming, and using batch JDBC with manual transaction control, massive data import and export become feasible and performant, allowing developers to list such experience on their resumes.