
Implementing High-Performance Online PDF Preview with PDF.js and Chunk Loading

This article describes how to achieve fast online PDF preview by splitting PDF files into chunks, using PDF.js to load and render only visible pages on demand, and optimizing memory by clearing off‑screen content while supporting rotation, zoom, and page navigation.

政采云技术

The author received a product requirement to enable users to view PDF files online with operations such as rotation, zoom, and jumping to a specific page.

Initial attempts using iframe/embed/object tags or the PDF.js library with a simple iframe viewer worked for small files but caused noticeable delays for large PDFs because the entire file had to be downloaded before rendering could start.

The core idea is to avoid downloading the whole PDF at once. Instead, the PDF is split into chunks (e.g., 5 pages per chunk) on the server, and the client downloads and renders only the chunks that intersect the user’s current viewport, similar to lazy‑loading or data pagination.

Server‑side: the PDF is split with a library such as itextpdf. The server and client then agree on a JSON format that describes each chunk, for example:

{
  "startPage": 1,
  "endPage": 5,
  "totalPage": 100,
  "url": "http://test.com/asset/fhdf82372837283.pdf"
}
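The range arithmetic behind this format can be sketched as follows. This is only an illustration in JavaScript, not the server code from the article (which uses itextpdf); the function name chunkRanges and the pagesPerChunk parameter are assumptions, and the url field is omitted because it depends on where the server stores each chunk.

```javascript
// Hypothetical helper: list the page range covered by each chunk.
// Pages are 1-based and both ends of a range are inclusive, as in the JSON above.
function chunkRanges(totalPage, pagesPerChunk = 5) {
  const ranges = [];
  for (let startPage = 1; startPage <= totalPage; startPage += pagesPerChunk) {
    ranges.push({
      startPage,
      // The last chunk may be shorter than pagesPerChunk.
      endPage: Math.min(startPage + pagesPerChunk - 1, totalPage),
      totalPage,
    });
  }
  return ranges;
}
```

For a 12-page document with 5 pages per chunk, this yields the ranges 1 to 5, 6 to 10, and 11 to 12.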

Client‑side workflow:

Fetch the first chunk (e.g., startPage = 1) via an AJAX call to obtain its URL.

Call PDF.js’s getDocument(url), which returns a PDFDocumentLoadingTask; when its promise resolves, iterate through the pages of the loaded chunk, store each PDFPageProxy in a pages array, and mark its load status.

Once a page’s proxy is available, trigger rendering.
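A minimal sketch of this workflow is shown below. It assumes the chunk JSON has already been fetched; the names createPages, loadChunk, and onPageReady are hypothetical, and the LOADING and LOADED status values are assumptions (the article only names pageLoadStatus.WAIT). PDF.js's getDocument is passed in as a parameter so the sketch does not depend on how the library is bundled.

```javascript
// Assumed status values; only WAIT appears in the article.
const pageLoadStatus = { WAIT: 0, LOADING: 1, LOADED: 2 };

// One placeholder entry per page of the whole document (pageNo is 1-based).
function createPages(totalPage) {
  return Array.from({ length: totalPage }, (_, i) => ({
    pageNo: i + 1,
    loadStatus: pageLoadStatus.WAIT,
    pdfPage: null, // PDFPageProxy once the chunk has loaded
    dom: null,     // page element once rendered
  }));
}

// Load one chunk and store its page proxies in the shared pages array.
// getDocument is PDF.js's getDocument; chunk follows the JSON format above.
async function loadChunk(getDocument, chunk, pages, onPageReady) {
  for (let no = chunk.startPage; no <= chunk.endPage; no++) {
    pages[no - 1].loadStatus = pageLoadStatus.LOADING;
  }
  const pdfDocument = await getDocument(chunk.url).promise;
  // Page numbers inside the chunk file restart at 1.
  for (let i = 1; i <= pdfDocument.numPages; i++) {
    const page = pages[chunk.startPage + i - 2];
    if (!page) break;
    page.pdfPage = await pdfDocument.getPage(i);
    page.loadStatus = pageLoadStatus.LOADED;
    onPageReady(page); // e.g. trigger rendering
  }
}
```

In a browser, the first argument would be the library's getDocument function (for instance pdfjsLib.getDocument when PDF.js is imported under that name), and onPageReady would call the page rendering routine.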

Rendering a chunk:

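// Measure the first loaded page at scale 1; every page is assumed to share this size.
const viewport = pdfPage.getViewport({ scale: 1, rotation: 0 });
const pageSize = { width: viewport.width, height: viewport.height };
// Create one container tall enough for all pages, so the scrollbar reflects the whole document.
const contentView = document.createElement('div');
contentView.style.position = 'relative'; // positioning anchor for the absolutely positioned pages
contentView.style.width = pageSize.width + 'px';
contentView.style.height = (totalPage * (pageSize.height + PAGE_INTVERVAL)) + PAGE_INTVERVAL + 'px';
pdfContainer.appendChild(contentView);
// Render a single page
function renderPageContent(page) {
  if (page.dom) return; // already rendered
  // scale and rotation hold the viewer's current zoom and rotation
  const viewport = page.pdfPage.getViewport({ scale, rotation });
  const canvas = document.createElement('canvas');
  const context = canvas.getContext('2d');
  // The canvas uses the base page size, which matches the viewport only at scale 1 and rotation 0
  canvas.height = pageSize.height;
  canvas.width = pageSize.width;
  const pageDom = document.createElement('div');
  // Place the page at its slot in the full document: (pageNo - 1) page heights plus the gaps
  pageDom.style.position = 'absolute';
  pageDom.style.top = ((page.pageNo - 1) * (pageSize.height + PAGE_INTVERVAL)) + PAGE_INTVERVAL + 'px';
  pageDom.style.width = pageSize.width + 'px';
  pageDom.style.height = pageSize.height + 'px';
  pageDom.appendChild(canvas);
  // render() returns a RenderTask; drawing completes asynchronously
  page.pdfPage.render({ canvasContext: context, viewport });
  page.dom = pageDom;
  contentView.appendChild(pageDom);
}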
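The layout arithmetic in the snippet above can be isolated into two small functions, which makes it easier to check by hand. This is a sketch: the names pageTop and contentHeight are hypothetical, and the 800 px page height and 10 px gap in the example are assumed values, not taken from the article.

```javascript
// Top offset in px of a page (1-based pageNo) inside the content container.
// Each page occupies pageHeight plus one gap, with one extra gap at the very top.
function pageTop(pageNo, pageHeight, gap) {
  return (pageNo - 1) * (pageHeight + gap) + gap;
}

// Total height in px of the container holding every page of the document.
function contentHeight(totalPage, pageHeight, gap) {
  return totalPage * (pageHeight + gap) + gap;
}
```

With 800 px pages and a 10 px gap, page 1 starts at 10 px, page 3 starts at 1630 px, and a 100-page document needs a container 81010 px tall.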

Scroll‑based chunk loading:

// _ is lodash; debouncing avoids recomputing on every scroll event.
const scrollPdf = _.debounce(() => {
  const scrollTop = pdfContainer.scrollTop;
  const height = pdfContainer.clientHeight; // visible height of the scroll container
  // 1-based number of the page under the vertical center of the viewport
  const pageIndex = scrollTop > 0 ? Math.ceil((scrollTop + height / 2) / (pageSize.height + PAGE_INTVERVAL)) : 1;
  loadBefore(pageIndex);
  loadAfter(pageIndex);
}, 200);
const SLICE_COUNT = 5; // pages per chunk; must match the server-side split
// Load the chunk that ends at the nearest chunk boundary at or before pageIndex.
function loadBefore(pageIndex) {
  const start = (Math.floor(pageIndex / SLICE_COUNT) * SLICE_COUNT) - (SLICE_COUNT - 1);
  if (start > 0) {
    const prevPage = pages[start - 1] || {};
    if (prevPage.loadStatus === pageLoadStatus.WAIT) loadPdfData(start);
  }
}
// Load the chunk that starts right after that boundary.
function loadAfter(pageIndex) {
  const start = (Math.floor(pageIndex / SLICE_COUNT) * SLICE_COUNT) + 1;
  if (start <= pages.length) {
    const nextPage = pages[start - 1] || {};
    if (nextPage.loadStatus === pageLoadStatus.WAIT) loadPdfData(start);
  }
}
function loadPdfData(startPage) {
  // fetch the chunk JSON for startPage, get its URL, call PDF.js getDocument, etc.
}
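To see which chunks a given scroll position requests, the same arithmetic can be written as pure functions without DOM access. This is a sketch under the same assumptions as the snippet above; the names centerPageIndex and chunkStartsFor are hypothetical, and the sample sizes (800 px pages, 10 px gap, 600 px viewport) are made up for illustration.

```javascript
const SLICE_COUNT = 5; // pages per chunk, as in the article

// 1-based number of the page under the vertical center of the viewport.
function centerPageIndex(scrollTop, viewHeight, pageHeight, gap) {
  return scrollTop > 0
    ? Math.ceil((scrollTop + viewHeight / 2) / (pageHeight + gap))
    : 1;
}

// Start pages of the chunks that loadBefore and loadAfter would consider
// for a given page index, before the load-status check is applied.
function chunkStartsFor(pageIndex, totalPage) {
  const boundary = Math.floor(pageIndex / SLICE_COUNT) * SLICE_COUNT;
  const before = boundary - (SLICE_COUNT - 1); // chunk ending at the boundary
  const after = boundary + 1;                  // chunk starting after it
  const starts = [];
  if (before > 0) starts.push(before);
  if (after <= totalPage) starts.push(after);
  return starts;
}
```

For a 100-page document, page 3 maps to the chunk starting at page 1, page 7 maps to the chunks starting at pages 1 and 6, and page 10 maps to the chunks starting at pages 6 and 11. Scrolling to 1620 px in a 600 px viewport puts page 3 at the center.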

Optimizations to keep memory usage low:

Only keep DOM nodes for a window of pages around the visible area (e.g., ±10 pages); off‑screen pages are removed via clearPage.

While a chunk is downloading, show a lightweight loading placeholder (renderPageLoading).
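The windowing idea could look roughly like the sketch below. The article names clearPage but does not show its body, so the implementation here is an assumption, as are the helper names visibleWindow and trimPages. Each page entry is assumed to have a pageNo and a dom property, with dom set to null when the page is not rendered.

```javascript
// Inclusive range of page numbers whose DOM nodes should be kept.
function visibleWindow(currentPage, totalPage, radius = 10) {
  return {
    first: Math.max(1, currentPage - radius),
    last: Math.min(totalPage, currentPage + radius),
  };
}

// Assumed implementation: detach the page element and forget it,
// so a later render call can rebuild the page from its PDFPageProxy.
function clearPage(page) {
  if (!page.dom) return;
  page.dom.remove();
  page.dom = null;
}

// Remove rendered pages outside the window; returns how many were cleared.
function trimPages(pages, currentPage, radius = 10) {
  const { first, last } = visibleWindow(currentPage, pages.length, radius);
  let cleared = 0;
  for (const page of pages) {
    if (page.dom && (page.pageNo < first || page.pageNo > last)) {
      clearPage(page);
      cleared++;
    }
  }
  return cleared;
}
```

Resetting page.dom to null matters because the rendering routine skips any page that still has a dom reference.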

Potential pitfalls:

The solution assumes all PDF pages have the same size. If pages differ, the calculated positions can be off, causing misplaced or overlapped content.

Two mitigation strategies are discussed: (a) scale each page to a uniform size on the client (may affect visual fidelity), or (b) have the server pre‑compute and return each page’s exact dimensions, letting the client compute positions accurately (more computation).
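Strategy (b) can be sketched as follows, assuming the server returns an array of page heights in px (one entry per page, in order). The function names buildPageTops and pageAtOffset are hypothetical. Offsets are accumulated once, and the page at a given scroll offset is then found with a binary search instead of a division by a fixed page height.

```javascript
// Precompute the top offset of every page from its individual height.
// The layout matches the uniform case: one gap above the first page and one after each page.
function buildPageTops(heights, gap) {
  const tops = [];
  let y = gap;
  for (const h of heights) {
    tops.push(y);
    y += h + gap;
  }
  return { tops, totalHeight: y };
}

// 1-based number of the last page whose top is at or above the offset.
function pageAtOffset(tops, offset) {
  let lo = 0;
  let hi = tops.length - 1;
  let found = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (tops[mid] <= offset) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found + 1;
}
```

For heights of 800, 400, and 800 px with a 10 px gap, the tops are 10, 820, and 1230 px, the container is 2040 px tall, and an offset of 900 px falls on page 2.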

In summary, by splitting PDFs into chunks and loading only the needed parts with PDF.js, the team achieved fast initial display and smooth scrolling for large PDFs, satisfying the product’s performance requirements.

Written by

政采云技术

ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.
