Understanding and Solving Memory Overflow in Apache POI Excel Processing with EasyExcel
The article explains why Apache POI loads entire .xlsx zip files into memory causing OOM, demonstrates the issue with code examples, and shows how switching to EasyExcel's streaming read dramatically reduces memory usage while handling large Excel files in Java.
Apache POI is a popular Java library for processing Excel files, but it often causes OutOfMemory errors because it loads the entire .xlsx package (a zip of XML files) into memory.
The article explains the internal structure of .xlsx files, shows how POI's XSSFWorkbook and OPCPackage.open methods read the whole zip, and provides a code snippet illustrating the problem.
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
public class ExcelReadTest {
public static void main(String[] args) {
String filename = "example.xlsx";
try (FileInputStream fileInputStream = new FileInputStream(new File(filename))) {
Workbook workbook = new XSSFWorkbook(fileInputStream);
Sheet sheet = workbook.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
switch (cell.getCellType()) {
case STRING:
System.out.print(cell.getStringCellValue() + "\t");
break;
case NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
System.out.print(cell.getDateCellValue() + "\t");
} else {
System.out.print(cell.getNumericCellValue() + "\t");
}
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue() + "\t");
break;
case FORMULA:
System.out.print(cell.getCellFormula() + "\t");
break;
default:
System.out.print(" ");
}
}
System.out.println();
}
} catch (IOException e) {
e.printStackTrace();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
}It then discusses that SXSSFWorkbook only helps with writing, not reading, and recommends using EasyExcel for reading large Excel files, which streams data to temporary files and dramatically reduces memory consumption.
package excel.read;
import com.alibaba.excel.EasyExcel;
import com.alibaba.excel.context.AnalysisContext;
import com.alibaba.excel.read.listener.ReadListener;
public class EasyExcelReadTest {
public static void main(String[] args) {
String filename = "example.xlsx";
EasyExcel.read(filename, new PrintDataListener()).sheet().doRead();
}
}
class PrintDataListener implements ReadListener
{
@Override
public void invoke(Object data, AnalysisContext context) {
System.out.println(data);
}
@Override
public void doAfterAllAnalysed(AnalysisContext context) {
// after all data processed
}
@Override
public void onException(Exception exception, AnalysisContext context) throws Exception {
// handle exception
}
}Memory usage comparisons using Arthas are presented, showing POI consuming over 1 GB while EasyExcel stays below 100 MB for the same 27 MB file.
These findings demonstrate that EasyExcel provides a memory‑efficient alternative for reading large Excel files in Java applications.
IT Services Circle
Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.