Optimizing Java File Compression: From Buffered Streams to NIO Channels and Memory‑Mapped Files
This article shows how to speed up Java code that compresses multiple images into a zip archive. By replacing unbuffered streams with buffered I/O, then moving to NIO channels, direct buffers, memory-mapped files, and pipes, the compression time for a 20 MB test set drops from about 30 seconds to roughly 1.3 seconds.
A requirement arose to receive ten photos from the front end, compress them into a zip file, and stream the result; the initial unbuffered Java implementation took about 30 seconds for a 20 MB test set.
Original unbuffered code:
public static void zipFileNoBuffer() {
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile))) {
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            try (InputStream input = new FileInputStream(JPG_FILE)) {
                zipOut.putNextEntry(new ZipEntry(FILE_NAME + i));
                int temp;
                while ((temp = input.read()) != -1) { // one native read call per byte
                    zipOut.write(temp);               // one native write call per byte
                }
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Test output: fileSize:20M, consum time:29599 ms (about 30 seconds).
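The snippets in this article reference constants (ZIP_FILE, JPG_FILE, FILE_NAME, SUFFIX_FILE, FILE_SIZE) and a printInfo helper that the original post never shows. A minimal sketch of what they might look like; every name, path, and value here is an assumption, not the author's actual code:

```java
import java.io.File;

// Hypothetical support code for the snippets in this article; the original
// post does not show these definitions, so all names and paths are guesses.
public class ZipDemoSupport {
    static final String ZIP_FILE = "/tmp/photos.zip";      // output archive (assumed path)
    static final String JPG_FILE_PATH = "/tmp/photo.jpg";  // source image (assumed path)
    static final String JPG_FILE = JPG_FILE_PATH;
    static final String FILE_NAME = "photo";               // base name for zip entries
    static final String SUFFIX_FILE = ".jpg";
    static final long FILE_SIZE = new File(JPG_FILE_PATH).length(); // one image, in bytes

    // Prints the total size processed and the elapsed time since beginTime.
    static long printInfo(long beginTime) {
        long elapsed = System.currentTimeMillis() - beginTime;
        System.out.println("fileSize:" + (FILE_SIZE * 10) / (1024 * 1024) + "M");
        System.out.println("consum time:" + elapsed + " ms");
        return elapsed;
    }
}
```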
First optimization – a buffered stream reduces the number of native read and write calls by moving data through an in-memory buffer in larger blocks.
Optimized buffered code:
public static void zipFileBuffer() {
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(zipOut)) {
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(JPG_FILE))) {
                zipOut.putNextEntry(new ZipEntry(FILE_NAME + i));
                int temp;
                while ((temp = bufferedInputStream.read()) != -1) {
                    bufferedOutputStream.write(temp);
                }
                // Flush before the next putNextEntry, otherwise buffered bytes
                // from this entry would land after the next entry's header.
                bufferedOutputStream.flush();
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Result: fileSize:20M, consum time:1808 ms (about 2 seconds).
Second optimization – using NIO Channel and transferTo eliminates the explicit byte‑by‑byte loop and lets the OS move data directly between channels.
Channel‑based code:
public static void zipFileChannel() {
    long beginTime = System.currentTimeMillis();
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         WritableByteChannel writableByteChannel = Channels.newChannel(zipOut)) {
        for (int i = 0; i < 10; i++) {
            try (FileChannel fileChannel = new FileInputStream(JPG_FILE).getChannel()) {
                zipOut.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
                // Let the OS move the bytes between channels; no user-space loop.
                fileChannel.transferTo(0, FILE_SIZE, writableByteChannel);
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Result: fileSize:20M, consum time:1416 ms (about 1.4 seconds).
The article then discusses kernel‑space vs user‑space, explaining why each read/write system call incurs overhead and how direct buffers avoid the kernel copy step.
Direct vs. non-direct buffers – with a non-direct (heap) buffer, every channel read or write copies data between the Java heap and a temporary native buffer; a direct buffer lives in native memory that the OS can access directly, skipping that extra copy. The trade-offs are slower allocation, memory outside the normal heap limits, and deallocation that depends on garbage-collection timing.
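The distinction is visible directly in the ByteBuffer API; a small illustration (not from the original article):

```java
import java.nio.ByteBuffer;

public class BufferKinds {
    public static void main(String[] args) {
        // Non-direct (heap) buffer: backed by a Java byte[]. Channel I/O must
        // copy its contents through a temporary native buffer on each call.
        ByteBuffer heap = ByteBuffer.allocate(8 * 1024);

        // Direct buffer: native memory outside the Java heap. Channels can
        // hand its address straight to the OS, skipping the extra copy, but
        // allocation is slower and reclamation depends on GC timing.
        ByteBuffer direct = ByteBuffer.allocateDirect(8 * 1024);

        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
    }
}
```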
Memory‑mapped file version (direct buffer) further improves speed:
public static void zipFileMap() {
    long beginTime = System.currentTimeMillis();
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         WritableByteChannel writableByteChannel = Channels.newChannel(zipOut)) {
        for (int i = 0; i < 10; i++) {
            zipOut.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
            // Map the image into memory, then close the file; the mapping
            // stays valid after the channel is closed.
            try (RandomAccessFile raf = new RandomAccessFile(JPG_FILE_PATH, "r");
                 FileChannel channel = raf.getChannel()) {
                MappedByteBuffer mappedByteBuffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, FILE_SIZE);
                writableByteChannel.write(mappedByteBuffer);
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Result: fileSize:20M, consum time:1305 ms, comparable to the pure Channel approach.
Finally, a Pipe‑based solution demonstrates asynchronous processing using a producer thread that writes to a pipe and a consumer that reads from it, again leveraging transferTo for each image.
public static void zipFilePip() {
    long beginTime = System.currentTimeMillis();
    try (WritableByteChannel out = Channels.newChannel(new FileOutputStream(ZIP_FILE))) {
        Pipe pipe = Pipe.open();
        // Producer: compresses images into the pipe on another thread.
        CompletableFuture.runAsync(() -> runTask(pipe));
        // Consumer: drains the pipe into the zip file.
        ReadableByteChannel readableByteChannel = pipe.source();
        ByteBuffer buffer = ByteBuffer.allocate((int) (FILE_SIZE * 10));
        while (readableByteChannel.read(buffer) >= 0) {
            buffer.flip();
            out.write(buffer);
            buffer.clear();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    printInfo(beginTime);
}

public static void runTask(Pipe pipe) {
    try (ZipOutputStream zos = new ZipOutputStream(Channels.newOutputStream(pipe.sink()));
         WritableByteChannel out = Channels.newChannel(zos)) {
        for (int i = 0; i < 10; i++) {
            zos.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
            try (FileChannel jpgChannel = new FileInputStream(JPG_FILE_PATH).getChannel()) {
                jpgChannel.transferTo(0, FILE_SIZE, out);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The article concludes that even simple optimizations, such as switching to buffered I/O, using NIO channels, or memory-mapping files, can dramatically improve performance, and it encourages readers to apply these techniques themselves to reinforce their understanding.
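As a closing sketch not found in the original article, the same task can be written far more compactly with java.nio.file.Files.copy, which streams through an internal buffer, so no manual loop or explicit BufferedInputStream is needed; the method and entry names here are placeholders:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipFilesCopy {
    // Writes `copies` zip entries, each containing the bytes of `source`.
    // Files.copy buffers internally, replacing the hand-written read loop.
    public static void zip(Path source, Path zip, int copies) throws IOException {
        try (ZipOutputStream zipOut = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (int i = 0; i < copies; i++) {
                zipOut.putNextEntry(new ZipEntry(i + ".jpg"));
                Files.copy(source, zipOut);
                zipOut.closeEntry();
            }
        }
    }
}
```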
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.