
Comprehensive Survey and Implementation Guide for File Preview Solutions

This article presents an extensive survey of file preview options—including commercial services, front‑end libraries, and server‑side converters—detailing their advantages, limitations, implementation steps, and code examples for handling DOCX, PPTX, XLSX, and PDF formats in web applications.

Top Architect

When faced with the need to preview documents, the author investigated various solutions and categorized them into paid services, front‑end implementations, and back‑end conversions.

1. Commercial Preview Services

The commercial options surveyed are Microsoft Office Viewer, Google Drive Viewer, Alibaba Cloud IMM, XDOC, Office Web 365, and WPS Open Platform. For each, the author notes usage, limitations (e.g., file size limits, animation support), and pricing.
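As a rough illustration of how the hosted viewers are typically embedded, the sketch below builds iframe URLs for the Microsoft and Google viewers. The function names are hypothetical, and the sketch assumes the file URL is publicly reachable, since both services fetch the file from their own servers.

```typescript
// Hypothetical helpers: build an embeddable viewer URL for a public file URL.
// The file URL is percent-encoded so its own query string survives intact.
function officeViewerUrl(fileUrl: string): string {
  return 'https://view.officeapps.live.com/op/embed.aspx?src=' + encodeURIComponent(fileUrl)
}

function googleViewerUrl(fileUrl: string): string {
  return 'https://docs.google.com/viewer?url=' + encodeURIComponent(fileUrl) + '&embedded=true'
}
```

The returned string would then be used as the src of an iframe on the preview page.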

2. Front‑End Preview Solutions

PPTX Preview

The only open‑source project the author found is github.com/g21589/PPTX2HTML, which is outdated. The author therefore proposes parsing PPTX files directly, following the Office Open XML standard.

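import JSZip from 'jszip'

// readXmlFile is the author's helper (not shown in this article). It is assumed to
// read one zip entry and return the XML parsed into a plain object.
declare function readXmlFile(zip: JSZip, path: string): Promise<any>

const SLIDE_TYPE = 'application/vnd.openxmlformats-officedocument.presentationml.slide+xml'
const SLIDE_LAYOUT_TYPE = 'application/vnd.openxmlformats-officedocument.presentationml.slideLayout+xml'

// Collect the part names of all slides and slide layouts from [Content_Types].xml
async function getContentTypes(zip: JSZip) {
  const contentTypes = await readXmlFile(zip, '[Content_Types].xml')
  const overrides = contentTypes['Types']['Override']
  const slides: string[] = []
  const slideLayouts: string[] = []
  for (const override of overrides) {
    const { ContentType, PartName } = override['attrs']
    // PartName starts with '/', which is stripped to get the path inside the zip
    if (ContentType === SLIDE_TYPE) {
      slides.push(PartName.slice(1))
    } else if (ContentType === SLIDE_LAYOUT_TYPE) {
      slideLayouts.push(PartName.slice(1))
    }
  }
  return { slides, slideLayouts }
}

// Load PPTX data (pptxData holds the binary content of the .pptx file)
const zip = await JSZip.loadAsync(pptxData)
const filesInfo = await getContentTypes(zip)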

The code above parses [Content_Types].xml to locate the slides and slide layouts. Further steps include extracting slide information, loading themes, and rendering slides onto a canvas.
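Two small pieces of those later steps can be sketched without any library. The helper names below are hypothetical. The sort assumes slide file numbering matches presentation order, which is common but not guaranteed (the authoritative order is the sldIdLst element in ppt/presentation.xml). The unit conversion relies on the Office Open XML definition of 914400 EMU per inch.

```typescript
// Extract the numeric suffix from a part name such as 'ppt/slides/slide10.xml'.
// Names that do not match sort first (number 0).
function slideNumber(partName: string): number {
  const match = /slide(\d+)\.xml$/.exec(partName)
  return match ? Number(match[1]) : 0
}

// Sort numerically so that slide10.xml comes after slide2.xml, not before it.
function sortSlides(partNames: string[]): string[] {
  return partNames.slice().sort((a, b) => slideNumber(a) - slideNumber(b))
}

// Shape positions and sizes in PPTX are stored in EMU (English Metric Units).
const EMU_PER_INCH = 914400

// Convert EMU to CSS pixels for drawing on a canvas (96 px per inch by default).
function emuToPx(emu: number, dpi: number = 96): number {
  return (emu / EMU_PER_INCH) * dpi
}
```

For example, a slide width of 9144000 EMU (10 inches) maps to 960 px at 96 DPI.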

PDF Preview

Browsers can display PDFs directly via <iframe src="viewFileUrl"/>, but for a consistent UI the author recommends using PDF.js.

import * as pdfjs from 'pdfjs-dist'
import type { PDFDocumentProxy } from 'pdfjs-dist'
// Worker entry point shipped with pdfjs-dist 2.x/3.x builds
import pdfjsWorker from 'pdfjs-dist/build/pdf.worker.entry'

class PdfPreview {
  private pdfDoc: PDFDocumentProxy | undefined
  pageNumber: number = 1
  total: number = 0
  dom: HTMLElement
  pdf: string | ArrayBuffer
  constructor(pdf: string | ArrayBuffer, dom: HTMLElement | undefined) {
    this.pdf = pdf
    this.dom = dom ? dom : document.body
  }
  // Load the document and render every page in order
  async pdfPreview() {
    pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker
    const doc = await pdfjs.getDocument(this.pdf).promise
    this.pdfDoc = doc
    this.total = doc.numPages
    for (let i = 1; i <= this.total; i++) {
      await this.getPdfPage(i)
    }
  }
  // Render one page into a new canvas appended to the container
  private async getPdfPage(number: number) {
    if (!this.pdfDoc) {
      throw new Error('pdfDoc is undefined')
    }
    const page = await this.pdfDoc.getPage(number)
    // getViewport needs an explicit scale. The viewport already maps PDF
    // coordinates to canvas coordinates, so no extra transform is passed.
    const viewport = page.getViewport({ scale: 1 })
    const canvas = document.createElement('canvas')
    this.dom.appendChild(canvas)
    const context = canvas.getContext('2d')
    if (!context) {
      throw new Error('2D canvas context is unavailable')
    }
    canvas.width = Math.floor(viewport.width)
    canvas.height = Math.floor(viewport.height)
    canvas.style.width = canvas.width + 'px'
    canvas.style.height = canvas.height + 'px'
    // render() returns a task; wait for it so pages finish in order
    await page.render({ canvasContext: context, viewport }).promise
    return { success: true, data: page }
  }
}
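Rendering at scale 1 can look blurry on high‑density screens and ignores the container width. One possible refinement, sketched below with a hypothetical helper, is to compute a fit‑to‑width scale and a separate backing‑store size. The page dimensions are assumed to come from a scale‑1 viewport and the pixel ratio from window.devicePixelRatio.

```typescript
interface CanvasSize {
  scale: number       // scale that makes the page fill the container width
  cssWidth: number    // canvas.style.width in CSS pixels
  cssHeight: number   // canvas.style.height in CSS pixels
  pixelWidth: number  // canvas.width (backing store, device pixels)
  pixelHeight: number // canvas.height (backing store, device pixels)
}

// Compute canvas dimensions so one page fills the container width and
// stays sharp on screens where one CSS pixel spans several device pixels.
function fitPageToWidth(
  pageWidth: number,
  pageHeight: number,
  containerWidth: number,
  pixelRatio: number = 1
): CanvasSize {
  const scale = containerWidth / pageWidth
  const cssWidth = Math.floor(pageWidth * scale)
  const cssHeight = Math.floor(pageHeight * scale)
  return {
    scale,
    cssWidth,
    cssHeight,
    pixelWidth: Math.floor(cssWidth * pixelRatio),
    pixelHeight: Math.floor(cssHeight * pixelRatio)
  }
}
```

With this approach the viewport would be requested at scale * pixelRatio, the canvas width and height set to the pixel values, and the canvas style set to the CSS values.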

DOCX Preview

The docx-preview npm package is used to render DOCX files to HTML.

import { renderAsync } from 'docx-preview'

// Render a DOCX buffer into bodyContainer, or into a new <div> appended
// to <body> when no container is supplied
export const renderDocx = async (options) => {
  const { bodyContainer, styleContainer, buffer, docxOptions = {} } = options
  const defaultOptions = { className: 'docx', ignoreLastRenderedPageBreak: false }
  // Caller-supplied options override the defaults
  const configuration = Object.assign({}, defaultOptions, docxOptions)
  const container = bodyContainer || document.body.appendChild(document.createElement('div'))
  return renderAsync(buffer, container, styleContainer, configuration)
}

XLSX Preview

The @vue-office/excel package provides Vue 2/3 components for rendering Excel files.
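With one library per format, a preview page needs some way to route a file to the right renderer. The sketch below is one minimal way to do that by file extension; the function and the returned labels are hypothetical names that only mirror the four front‑end approaches described in this section.

```typescript
type Previewer = 'pptx-parser' | 'pdfjs' | 'docx-preview' | 'vue-office-excel' | 'unsupported'

// Pick a front-end previewer from the file extension (case-insensitive).
function pickPreviewer(fileName: string): Previewer {
  const dot = fileName.lastIndexOf('.')
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : ''
  switch (ext) {
    case 'pptx':
      return 'pptx-parser'
    case 'pdf':
      return 'pdfjs'
    case 'docx':
      return 'docx-preview'
    case 'xlsx':
      return 'vue-office-excel'
    default:
      return 'unsupported'
  }
}
```

Extension checks are only a convenience; a file with a misleading name would still fail inside the chosen renderer, so the caller should handle rendering errors as well.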

3. Server‑Side Preview Solutions

OpenOffice Conversion

The following Java code uses JODConverter to start an OpenOffice service and convert documents to PDF.

package org.example;

import org.artofsolving.jodconverter.OfficeDocumentConverter;
import org.artofsolving.jodconverter.office.DefaultOfficeManagerConfiguration;
import org.artofsolving.jodconverter.office.OfficeManager;
import java.io.File;

public class OfficeUtil {
  private static OfficeManager officeManager;
  private static int[] port = {8100};

  // Start the OpenOffice process that JODConverter talks to
  public static void startService() {
    DefaultOfficeManagerConfiguration configuration = new DefaultOfficeManagerConfiguration();
    try {
      System.out.println("Starting the office conversion service...");
      // Adjust to the local OpenOffice installation directory
      configuration.setOfficeHome("C:\\Program Files (x86)\\OpenOffice 4");
      configuration.setPortNumbers(port);
      configuration.setTaskExecutionTimeout(1000L * 60 * 30);   // 30 minutes
      configuration.setTaskQueueTimeout(1000L * 60 * 60 * 24);  // 24 hours
      officeManager = configuration.buildOfficeManager();
      officeManager.start();
      System.out.println("Office conversion service started.");
    } catch (Exception e) {
      System.out.println("Failed to start the office conversion service. Details: " + e);
    }
  }

  public static void stopService() {
    System.out.println("Stopping the office conversion service...");
    if (officeManager != null) {
      officeManager.stop();
    }
    System.out.println("Office conversion service stopped.");
  }

  public static void convertToPDF(String inputFile, String outputFile) {
    startService();
    try {
      System.out.println("Converting document: " + inputFile + " --> " + outputFile);
      OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
      converter.convert(new File(inputFile), new File(outputFile));
    } finally {
      // Always stop the service, even if the conversion throws
      stopService();
    }
  }

  public static void main(String[] args) {
    convertToPDF("/Users/koolearn/Desktop/asdf.docx", "/Users/koolearn/Desktop/adsf.pdf");
  }
}

kkFileView

kkFileView is a Java‑based preview service. Building and running it requires installing Java, Maven, and LibreOffice, then building the project and starting the server.

brew install java
brew install maven
echo 'export JAVA_HOME=$(/usr/libexec/java_home)' >> ~/.zshrc
source ~/.zshrc
brew install libreoffice
mvn clean install -DskipTests
After launching, the web UI allows uploading files for preview.

OnlyOffice

OnlyOffice provides both open‑source and enterprise editions for document preview and collaborative editing.
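For preview only, the OnlyOffice editor can be opened in read‑only mode. The sketch below builds a configuration object of the shape the Document Server's JavaScript API expects; the helper name is hypothetical, the mapping covers only the three Office formats discussed here, and the key is assumed to be a caller‑supplied identifier that changes whenever the file content changes.

```typescript
type DocumentType = 'word' | 'cell' | 'slide'

interface ViewerConfig {
  documentType: DocumentType
  document: { fileType: string; key: string; title: string; url: string }
  editorConfig: { mode: 'view' }
}

// Map a file extension to the OnlyOffice document type, if supported here.
function documentTypeFor(ext: string): DocumentType | undefined {
  switch (ext) {
    case 'docx':
      return 'word'
    case 'xlsx':
      return 'cell'
    case 'pptx':
      return 'slide'
    default:
      return undefined
  }
}

// Build a read-only configuration for one document.
function buildViewerConfig(title: string, url: string, key: string): ViewerConfig {
  const dot = title.lastIndexOf('.')
  const fileType = dot >= 0 ? title.slice(dot + 1).toLowerCase() : ''
  const documentType = documentTypeFor(fileType)
  if (!documentType) {
    throw new Error('Unsupported file type: ' + fileType)
  }
  return {
    documentType,
    document: { fileType, key, title, url },
    editorConfig: { mode: 'view' }
  }
}
```

On a page that has loaded the Document Server's api.js, the result would be passed to new DocsAPI.DocEditor('placeholder', config), where 'placeholder' is the id of the host element.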

4. Summary

For public, non‑confidential files, Microsoft’s online viewer is recommended.

When security and stability requirements are high and a budget is available, Alibaba Cloud IMM is a viable option.

Server‑side converters (OpenOffice, kkFileView, OnlyOffice) offer the most complete preview capabilities.

If no budget or server is available, front‑end libraries enable zero‑cost client‑side rendering.
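These recommendations can be read as a small decision procedure. The sketch below encodes them as a function; the names are hypothetical, and the order in which the conditions are checked is an assumption, since the article does not rank the options against each other.

```typescript
interface Constraints {
  filesArePublic: boolean // files are public and non-confidential
  hasBudget: boolean      // a paid service is acceptable
  hasServer: boolean      // a server is available for conversion
}

type Recommendation =
  | 'microsoft-online-viewer'
  | 'alibaba-cloud-imm'
  | 'server-side-converter'
  | 'front-end-libraries'

// Walk through the summary's cases from top to bottom (assumed precedence).
function recommend(c: Constraints): Recommendation {
  if (c.filesArePublic) return 'microsoft-online-viewer'
  if (c.hasBudget) return 'alibaba-cloud-imm'
  if (c.hasServer) return 'server-side-converter'
  return 'front-end-libraries'
}
```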

