
Implementing High-Performance Online PDF Preview with PDF.js and Chunk Loading

This article describes how to achieve fast online PDF preview by splitting PDF files into chunks, using PDF.js to load and render only visible pages on demand, and optimizing memory by clearing off‑screen content while supporting rotation, zoom, and page navigation.

政采云技术

May 31, 2020

The author received a product requirement to enable users to view PDF files online with operations such as rotation, zoom, and jumping to a specific page.

Initial attempts using iframe/embed/object tags or the PDF.js library with a simple iframe viewer worked for small files but caused noticeable delays for large PDFs because the entire file had to be downloaded before rendering could start.

The core idea is to avoid downloading the whole PDF at once. Instead, the PDF is split into chunks (e.g., 5 pages per chunk) on the server, and the client downloads and renders only the chunks that intersect the user’s current viewport, similar to lazy‑loading or data pagination.

Server‑side: a library such as itextpdf splits the PDF into chunks. The server and client then agree on a JSON format that describes each chunk, for example:

{
  "startPage": 1,
  "endPage": 5,
  "totalPage": 100,
  "url": "http://test.com/asset/fhdf82372837283.pdf"
}
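To make the chunk boundaries concrete, here is a minimal sketch of how a page number could be mapped to the chunk that contains it. The helper name chunkRangeForPage and the constant CHUNK_SIZE are assumptions for illustration; the article only specifies the JSON fields and the "5 pages per chunk" example.

```javascript
// Hypothetical helper (not from the article): maps a 1-based page number
// to the startPage/endPage of the chunk that contains it.
const CHUNK_SIZE = 5; // follows the "5 pages per chunk" example

function chunkRangeForPage(pageNo, totalPage, chunkSize = CHUNK_SIZE) {
  if (!Number.isInteger(pageNo) || pageNo < 1 || pageNo > totalPage) {
    throw new RangeError(`pageNo must be between 1 and ${totalPage}`);
  }
  // Chunks start at pages 1, 6, 11, ... when chunkSize is 5
  const startPage = Math.floor((pageNo - 1) / chunkSize) * chunkSize + 1;
  // The last chunk may be shorter than chunkSize
  const endPage = Math.min(startPage + chunkSize - 1, totalPage);
  return { startPage, endPage, totalPage };
}

console.log(chunkRangeForPage(7, 100)); // { startPage: 6, endPage: 10, totalPage: 100 }
```

The same arithmetic could run on the server when it answers a chunk request, or on the client when it decides which chunk to ask for.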

Client‑side workflow:

Fetch the first chunk (e.g., startPage = 1) via an AJAX call to obtain its URL.

Call PDF.js’s getDocument(url), which returns a PDFDocumentLoadingTask; when its promise resolves, iterate through the pages of the loaded chunk, store each PDFPageProxy in a pages array, and mark each page’s load status.

Once a page’s proxy is available, trigger rendering.
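The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the article's actual code: fetchChunkInfo is a hypothetical function that resolves to the chunk JSON, getDocument is expected to be PDF.js's getDocument (passed in so the sketch stays self-contained), and the LOADING and LOADED status names are invented alongside the WAIT status that appears in the scroll handler.

```javascript
// WAIT appears in the article's scroll handler; LOADING and LOADED are assumed names.
const pageLoadStatus = { WAIT: 0, LOADING: 1, LOADED: 2 };

// One entry per page of the whole document, all waiting to be loaded.
function createPages(totalPage) {
  return Array.from({ length: totalPage }, (_, i) => ({
    pageNo: i + 1,
    loadStatus: pageLoadStatus.WAIT,
    pdfPage: null, // PDFPageProxy once the chunk is loaded
    dom: null,     // page container once rendered
  }));
}

// deps.fetchChunkInfo(startPage): hypothetical, resolves to { startPage, endPage, totalPage, url }.
// deps.getDocument(url): expected to be PDF.js's getDocument, which returns a loading task.
async function loadChunk(startPage, pages, { fetchChunkInfo, getDocument }) {
  // Mark the chunk's first page right away so repeated scroll events do not refetch it
  pages[startPage - 1].loadStatus = pageLoadStatus.LOADING;
  const info = await fetchChunkInfo(startPage);
  for (let no = info.startPage; no <= info.endPage; no++) {
    pages[no - 1].loadStatus = pageLoadStatus.LOADING;
  }
  const pdfDoc = await getDocument(info.url).promise;
  // Page i inside the chunk file is page (info.startPage + i - 1) of the full document
  for (let i = 1; i <= pdfDoc.numPages; i++) {
    const page = pages[info.startPage + i - 2];
    if (!page) break;
    page.pdfPage = await pdfDoc.getPage(i);
    page.loadStatus = pageLoadStatus.LOADED;
  }
  return info;
}
```

Passing fetchChunkInfo and getDocument in as parameters is only a convenience for this sketch; real code could call them directly.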

Rendering a chunk:

// Use the first loaded page at scale 1 as the reference size for every page
const viewport = pdfPage.getViewport({ scale: 1, rotation: 0 });
const pageSize = { width: viewport.width, height: viewport.height };
// create a container tall enough for all pages, including those not loaded yet
const contentView = document.createElement('div');
contentView.style.width = pageSize.width + 'px';
contentView.style.height = (totalPage * (pageSize.height + PAGE_INTVERVAL)) + PAGE_INTVERVAL + 'px';
pdfContainer.appendChild(contentView);
// render a single page (scale and rotation are the viewer's current settings)
function renderPageContent(page) {
  if (page.dom) return; // already rendered
  const viewport = page.pdfPage.getViewport({ scale, rotation });
  const canvas = document.createElement('canvas');
  const context = canvas.getContext('2d');
  // Size the canvas from the viewport that is actually rendered, so the page is
  // not clipped when scale is not 1 or the page is rotated
  canvas.height = viewport.height;
  canvas.width = viewport.width;
  // Layout still relies on pageSize; it must be recomputed when scale or rotation change
  const pageDom = document.createElement('div');
  pageDom.style.position = 'absolute';
  pageDom.style.top = ((page.pageNo - 1) * (pageSize.height + PAGE_INTVERVAL)) + PAGE_INTVERVAL + 'px';
  pageDom.style.width = pageSize.width + 'px';
  pageDom.style.height = pageSize.height + 'px';
  pageDom.appendChild(canvas);
  page.pdfPage.render({ canvasContext: context, viewport });
  page.dom = pageDom;
  contentView.appendChild(pageDom);
}

Scroll‑based chunk loading:

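// `_` is lodash; debounce keeps the handler from running on every scroll event
const scrollPdf = _.debounce(() => {
  const scrollTop = pdfContainer.scrollTop;
  // DOM elements expose their visible height as clientHeight, not height
  const height = pdfContainer.clientHeight;
  // 1-based number of the page under the vertical middle of the viewport
  const pageIndex = scrollTop > 0 ? Math.ceil((scrollTop + height / 2) / (pageSize.height + PAGE_INTVERVAL)) : 1;
  loadBefore(pageIndex);
  loadAfter(pageIndex);
}, 200);
const SLICE_COUNT = 5; // pages per chunk
// Load the chunk that ends at the nearest multiple of SLICE_COUNT at or below pageIndex
function loadBefore(pageIndex) {
  const start = (Math.floor(pageIndex / SLICE_COUNT) * SLICE_COUNT) - (SLICE_COUNT - 1);
  if (start > 0) {
    const prevPage = pages[start - 1] || {};
    if (prevPage.loadStatus === pageLoadStatus.WAIT) loadPdfData(start);
  }
}
// Load the chunk that starts right after that multiple
function loadAfter(pageIndex) {
  const start = (Math.floor(pageIndex / SLICE_COUNT) * SLICE_COUNT) + 1;
  if (start <= pages.length) {
    const nextPage = pages[start - 1] || {};
    if (nextPage.loadStatus === pageLoadStatus.WAIT) loadPdfData(start);
  }
}
// startPage is the first page number of the chunk to load
function loadPdfData(startPage) {
  // fetch chunk JSON, get URL, call PDF.js getDocument, etc.
}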
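To see which chunks the scroll handler requests, the sketch below repeats the loadBefore/loadAfter arithmetic but returns the chunk start pages as data instead of triggering downloads. The function names chunkStartsFor and pageIndexAt are assumptions for illustration.

```javascript
const SLICE_COUNT = 5; // pages per chunk, as in the scroll handler

// Same arithmetic as loadBefore/loadAfter, returned as a list of chunk start pages.
function chunkStartsFor(pageIndex, totalPage, sliceCount = SLICE_COUNT) {
  const base = Math.floor(pageIndex / sliceCount) * sliceCount;
  const before = base - (sliceCount - 1); // start used by loadBefore
  const after = base + 1;                 // start used by loadAfter
  const starts = [];
  if (before > 0) starts.push(before);
  if (after <= totalPage) starts.push(after);
  return starts;
}

// Page under the vertical middle of the viewport, as computed in scrollPdf.
function pageIndexAt(scrollTop, viewportHeight, pageHeight, interval) {
  if (scrollTop <= 0) return 1;
  return Math.ceil((scrollTop + viewportHeight / 2) / (pageHeight + interval));
}

console.log(chunkStartsFor(3, 100));   // [ 1 ]
console.log(chunkStartsFor(5, 100));   // [ 1, 6 ]
console.log(chunkStartsFor(7, 100));   // [ 1, 6 ]
console.log(chunkStartsFor(100, 100)); // [ 96 ]
```

Tracing a few values shows that the current chunk is always covered, and that the following chunk is requested once the user reaches the last page of the current one (page 5 already requests the chunk starting at page 6).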

Optimizations to keep memory usage low:

Only keep DOM nodes for a window of pages around the visible area (e.g., ±10 pages); off‑screen pages are removed via clearPage.

While a chunk is downloading, show a lightweight loading placeholder (renderPageLoading).
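The windowing idea can be sketched as below. The article names clearPage but does not show it, so this body is an assumption: it detaches the page's DOM node and resets page.dom so that renderPageContent can rebuild the page later. KEEP_RANGE, pageWindow, and clearOutsideWindow are likewise hypothetical names.

```javascript
const KEEP_RANGE = 10; // pages kept on each side of the current page (the "±10" example)

// First and last page numbers that should keep their DOM nodes.
function pageWindow(currentPage, totalPage, range = KEEP_RANGE) {
  return {
    first: Math.max(1, currentPage - range),
    last: Math.min(totalPage, currentPage + range),
  };
}

// Assumed implementation: remove the node from the document and forget it.
// renderPageContent returns early while page.dom is set, so it must be reset.
function clearPage(page) {
  if (!page.dom) return;
  page.dom.remove();
  page.dom = null;
}

// Clears every rendered page outside the window; returns how many were cleared.
function clearOutsideWindow(pages, currentPage, range = KEEP_RANGE) {
  const { first, last } = pageWindow(currentPage, pages.length, range);
  let cleared = 0;
  for (const page of pages) {
    if ((page.pageNo < first || page.pageNo > last) && page.dom) {
      clearPage(page);
      cleared++;
    }
  }
  return cleared;
}
```

A natural place to call clearOutsideWindow would be the debounced scroll handler, right after the current page index is computed.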

Potential pitfalls:

The solution assumes all PDF pages have the same size. If pages differ, the calculated positions can be off, causing misplaced or overlapped content.

Two mitigation strategies are discussed: (a) scale each page to a uniform size on the client, which may affect visual fidelity, or (b) have the server pre‑compute and return each page’s exact dimensions so the client can compute positions accurately, at the cost of extra computation.
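For strategy (b), positions could be derived from a list of per-page heights rather than a single pageSize. The sketch below assumes the server returns such a list (the pageHeights array and the PAGE_GAP value are hypothetical); it builds each page's top offset and uses a binary search to find the page at a given scroll offset.

```javascript
const PAGE_GAP = 8; // assumed gap in px between pages

// tops[i] is the top offset of page i + 1; totalHeight is the container height.
function buildPageTops(pageHeights, gap = PAGE_GAP) {
  const tops = [];
  let y = gap;
  for (const h of pageHeights) {
    tops.push(y);
    y += h + gap;
  }
  return { tops, totalHeight: y };
}

// Binary search for the last page whose top is at or above offset y (1-based result).
function pageAtOffset(tops, y) {
  let lo = 0;
  let hi = tops.length - 1;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (tops[mid] <= y) lo = mid;
    else hi = mid - 1;
  }
  return lo + 1;
}

const { tops, totalHeight } = buildPageTops([100, 200, 100], 10);
console.log(tops, totalHeight);       // [ 10, 120, 330 ] 440
console.log(pageAtOffset(tops, 125)); // 2
```

With uniform heights this reduces to the same layout as the fixed-size formula, so it could replace that formula without changing the result for documents whose pages all match.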

In summary, by splitting PDFs into chunks and loading only the needed parts with PDF.js, the team achieved fast initial display and smooth scrolling for large PDFs, satisfying the product’s performance requirements.
