
Turn Web Interactions into Video: Recording, Incremental Snapshots, and rrweb Playback

This article explains how to capture user interactions on insurance web pages by taking DOM snapshots, creating incremental snapshots with MutationObserver, leveraging the rrweb library for recording and replay, and converting the recorded data into high‑frame‑rate video using Puppeteer and FFmpeg to ensure reliable evidence.

WeDoctor Frontend Technology
Fan Yang, a programmer who hopes to write music with code.

Background

In an insurance project, regulatory requirements demand that the purchase process be fully traceable so disputes can be resolved with evidence. Simply showing backend data is insufficient; a video of the exact user actions provides convincing proof.

DOM Snapshot

To view the page state at a specific moment, we clone the entire DOM and its CSS, then re‑render it in the browser.

<code>const cloneDoc = document.documentElement.cloneNode(true); // record
document.replaceChild(cloneDoc, document.documentElement); // playback</code>

This creates a snapshot, but the cloned document exists only in memory.

Serialization

We serialize the cloned DOM to a string, store it on the server, and later deserialize it for playback.

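<code>const serializer = new XMLSerializer();
const str = serializer.serializeToString(cloneDoc); // record: DOM to string, ready to upload
// playback: parse the stored string back into the live document
document.documentElement.innerHTML = str;</code>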

Timed Snapshot

Recording a video requires at least 24 frames per second. To achieve this we would clone the page every 41.7 ms (1000 ms / 24 fps). However, cloning the entire page 24 times per second causes severe performance degradation, massive network overhead, and unnecessary duplication when the page does not change.
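To make the cost concrete, here is a rough back-of-the-envelope sketch. The 200 KB snapshot size and the one-minute session are assumed figures for illustration only, not measurements from the project, and `fullSnapshotCost` is a hypothetical helper.

```javascript
// Rough cost of taking full-page snapshots at a fixed frame rate.
// All inputs are illustrative assumptions, not measured values.
function fullSnapshotCost(fps, snapshotBytes, seconds) {
  const intervalMs = 1000 / fps;               // time between two snapshots
  const snapshots = Math.round(fps * seconds); // how many clones are taken
  const totalBytes = snapshots * snapshotBytes;
  return { intervalMs, snapshots, totalBytes };
}

// One minute of recording at 24 fps with an assumed 200 KB page.
const cost = fullSnapshotCost(24, 200 * 1024, 60);
console.log(cost.intervalMs.toFixed(1));            // "41.7" ms between snapshots
console.log(cost.snapshots);                        // 1440 full clones
console.log(Math.round(cost.totalBytes / 2 ** 20)); // about 281 MiB to upload
```

Even with a modest page size, most of that data would be identical from one frame to the next.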

Incremental Snapshot

Instead of full snapshots, we record only the changes after the initial full clone. This reduces data size, eliminates redundant frames, and simplifies playback by applying only the recorded mutations.

Only changed parts are stored, saving bandwidth and improving performance.

Changes are recorded only when they occur, avoiding duplicate data.

During playback, the first frame (full HTML) is rendered, then each subsequent change is applied in order, reproducing the user flow like a video.

Example of recorded events:

<code>const events = [
    {fullHtml: "<html>...</html>"},       // first frame: the full snapshot
    {id: 'dom2', type: '#fff -> red'},    // then one entry per change, in order
    {id: 'dom4', type: '#fff -> green'}
];</code>
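The playback idea can be sketched without a browser. In the example below the page is reduced to a plain map of element id to background colour, and the event shape (`full`, `from`, `to`) is a simplified assumption made for this illustration; `replay` is a hypothetical helper, not part of any library.

```javascript
// Replay a list of events: one full snapshot followed by incremental changes.
// The "page" is a plain map of element id -> background colour (a stand-in for the DOM).
function replay(events, upTo = events.length) {
  const [first, ...rest] = events;
  const state = { ...first.full };             // frame 0: the full snapshot
  // Apply the recorded changes in order, stopping after `upTo` events in total.
  for (const change of rest.slice(0, Math.max(0, upTo - 1))) {
    state[change.id] = change.to;
  }
  return state;
}

const demoEvents = [
  { full: { dom1: "#fff", dom2: "#fff", dom3: "#fff", dom4: "#fff" } },
  { id: "dom2", from: "#fff", to: "red" },
  { id: "dom4", from: "#fff", to: "green" },
];

console.log(replay(demoEvents, 2)); // state after the first change only
console.log(replay(demoEvents));    // final state with both changes applied
```

Because every intermediate state can be rebuilt from the first frame plus a prefix of the change list, seeking to any point in the recording is just a matter of choosing `upTo`.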

MutationObserver

To detect DOM changes, we use the browser's MutationObserver API, which batches mutation records.

<code>setTimeout(() => {
  let dom2 = document.getElementById("dom2");
  dom2.style.background = "red";
  let dom4 = document.getElementById("dom4");
  dom4.style.background = "green";
}, 5000);

const callback = function(mutationsList, observer) {
  for (const mutation of mutationsList) {
    if (mutation.type === "childList") {
      console.log("Child nodes added or removed.");
    } else if (mutation.type === "attributes") {
      console.log("Element attributes changed.");
    }
  }
};

document.addEventListener("DOMContentLoaded", function() {
  const observer = new MutationObserver(callback);
  observer.observe(document.body, {
    attributes: true,
    childList: true,
    subtree: true,
  });
});</code>

The observer reports only the mutated element (target) and the type of mutation, enabling efficient incremental snapshots.
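To store these mutations, the records have to be turned into plain data that can be sent to the server. The sketch below shows one possible way to do that. The event shape (`kind`, `id`, `parentId`, and so on) and the `getId` helper that maps a node to a stable id are assumptions for this example; only the `type`, `target`, `attributeName`, `addedNodes`, and `removedNodes` fields come from the MutationRecord interface.

```javascript
// Turn MutationRecord-like objects into plain, JSON-serializable events.
// `getId` is a hypothetical helper that maps a node to a stable id.
function toIncrementalEvents(records, getId, timestamp = Date.now()) {
  const events = [];
  for (const record of records) {
    if (record.type === "attributes") {
      events.push({
        timestamp,
        kind: "attribute",
        id: getId(record.target),
        name: record.attributeName,
        // Read the new value from the node; the record itself only names the attribute.
        value: record.target.getAttribute(record.attributeName),
      });
    } else if (record.type === "childList") {
      events.push({
        timestamp,
        kind: "childList",
        parentId: getId(record.target),
        added: Array.from(record.addedNodes, (node) => getId(node)),
        removed: Array.from(record.removedNodes, (node) => getId(node)),
      });
    }
  }
  return events;
}

// Demo with fake nodes so the sketch runs outside a browser.
const fakeNode = (id, attrs = {}) => ({ id, getAttribute: (name) => attrs[name] ?? null });
const records = [
  { type: "attributes", target: fakeNode("dom2", { style: "background: red;" }), attributeName: "style" },
  { type: "childList", target: fakeNode("list"), addedNodes: [fakeNode("item3")], removedNodes: [] },
];
const out = toIncrementalEvents(records, (node) => node.id, 1000);
console.log(JSON.stringify(out, null, 2));
```

In a browser, the same function could be called from the observer callback with the real `mutationsList`, as long as `getId` can resolve each node to the id assigned when the full snapshot was taken.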

Interactive Elements

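MutationObserver does not capture user input in input, textarea, or select elements, because typing or selecting changes the element's value property rather than a DOM attribute or child node. We therefore listen to the input and change events, and for programmatic value changes we override the property setter.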

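<code>const input = document.getElementById("input");
// Keep the native accessor so the input still works after we hook it.
const native = Object.getOwnPropertyDescriptor(HTMLInputElement.prototype, "value");
Object.defineProperty(input, "value", {
  configurable: true,
  get: function() {
    return native.get.call(this);
  },
  set: function(val) {
    console.log("Input value updated:", val); // record the change here
    native.set.call(this, val);
  },
});
input.value = 123; // triggers the setter above</code>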
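The event-listening half of this approach could look like the sketch below. `recordInputs` and its `emit` callback are hypothetical names, and the emitted event shape is an assumption. The demo at the end uses a bare `EventTarget` in place of a real form field so that it also runs outside a browser.

```javascript
// Record user edits by listening for `input` and `change` events on a root node.
// `emit` is a hypothetical callback that stores or uploads the event.
function recordInputs(root, emit) {
  const handler = (event) => {
    const el = event.target;
    emit({
      timestamp: Date.now(),
      kind: event.type,
      id: el.id,
      // Checkboxes and radios carry their state in `checked`, not `value`.
      value: el.type === "checkbox" || el.type === "radio" ? el.checked : el.value,
    });
  };
  // Capture phase: one listener on the root sees events from all descendants.
  const options = { capture: true };
  root.addEventListener("input", handler, options);
  root.addEventListener("change", handler, options);
  // Return a function that stops recording.
  return () => {
    root.removeEventListener("input", handler, options);
    root.removeEventListener("change", handler, options);
  };
}

// Demo: a bare EventTarget stands in for a text field.
const field = Object.assign(new EventTarget(), { id: "name", type: "text", value: "Alice" });
const captured = [];
const stop = recordInputs(field, (e) => captured.push(e));
field.dispatchEvent(new Event("input"));
stop();
field.dispatchEvent(new Event("input")); // ignored after stop()
console.log(captured);
```

In a real page the root would typically be `document`, so a single pair of listeners covers every form field, including ones added later.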

rrweb

The overall approach mirrors the open‑source library rrweb (record‑replay‑web), which also captures mouse movements, viewport size, and provides sandboxed playback with time‑calibration.

Installation

<code>npm install --save rrweb</code>

Recording

<code>const events = [];
let stopFn = rrweb.record({
  emit(event) {
    events.push(event); // collect every event rrweb emits
    if (events.length > 100) {
      stopFn(); // stop when enough events are collected
      // serialize and send events to server
    }
  },
});</code>
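For longer sessions it may be preferable to upload events in batches rather than stopping after a fixed count. The following is a minimal sketch of such a buffer. `createEventBuffer` and `send` are hypothetical names, and the batch size is an arbitrary assumption; in practice `send` would be whatever upload call your backend expects.

```javascript
// Buffer events and flush them in batches instead of one request per event.
// `send` is a hypothetical upload function supplied by the caller.
function createEventBuffer(send, maxSize = 50) {
  let buffer = [];
  const flush = () => {
    if (buffer.length === 0) return; // nothing to send
    const batch = buffer;
    buffer = [];
    send(batch);
  };
  const add = (event) => {
    buffer.push(event);
    if (buffer.length >= maxSize) flush();
  };
  return { add, flush };
}

// Demo with a stand-in `send` that just collects the batches.
const sentBatches = [];
const eventBuffer = createEventBuffer((batch) => sentBatches.push(batch), 3);
for (let i = 0; i < 7; i++) {
  eventBuffer.add({ type: 3, timestamp: 1000 + i });
}
eventBuffer.flush(); // send the remainder, for example when the page is hidden
console.log(sentBatches.map((batch) => batch.length)); // [ 3, 3, 1 ]
```

With rrweb, the buffer would be fed from the `emit` callback, for example `emit(event) { eventBuffer.add(event); }`.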

Playback

<code>const events = []; // fetched from server
const replayer = new rrweb.Replayer(events);
replayer.play();</code>

Static Resource Expiration

Recorded data may reference external assets (images, CSS, fonts). If those resources are removed in later deployments, playback will show broken links, undermining the evidence. This is a critical issue for long‑term auditability.

JSON to Video

To create a stable artifact, we convert rrweb JSON data into a video. Using Puppeteer, we replay the events in a headless browser, capture screenshots at a high frame rate, and stitch them together with FFmpeg.

Key steps:

Set frame rate (e.g., 50 fps → capture every 20 ms).

Pause playback before each capture to avoid timing drift caused by the ~300 ms screenshot latency.

Pipe captured PNG buffers into FFmpeg, specifying codec, bitrate, and frame rate.

<code>async updateCanvas() {
  const frameMs = 20; // 50 fps, so one frame every 20 ms
  while (this.imgIndex * frameMs < this.timeLength) {
    // Pause the replayer at this frame's time offset before capturing,
    // so the slow screenshot cannot cause timing drift.
    await this.page.evaluate((offset) => {
      window.chromePlayer.pause(offset);
    }, this.imgIndex * frameMs);
    const buffer = await this.iframe.screenshot({
      type: 'png',
      encoding: 'binary',
    });
    this.readAble.push(buffer); // queue the PNG frame for FFmpeg
    this.imgIndex++;
  }
  this.stopCut();
}

stopCut() {
  this.readAble.push(null); // end of stream: no more frames
  this.ffmpeg
    .videoCodec('mpeg4')
    .videoBitrate('1000k')
    .inputFPS(50) // must match the 20 ms capture interval
    .on('end', () => { console.log('\n video conversion succeeded'); })
    .on('error', (e) => { console.log('error happened:' + e); })
    .save('./res.mp4');
}</code>
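The frame schedule behind this loop can be worked out ahead of time. The sketch below derives the recording length from the millisecond `timestamp` that each rrweb event carries, then lists the offsets at which playback is paused for a screenshot. `planFrames` is a hypothetical helper, the sample events are made up, and the 300 ms default is the approximate screenshot latency quoted in this article rather than a measured constant.

```javascript
// Plan the frames needed to convert a recording into video.
// Assumes `events` is non-empty and sorted by its millisecond `timestamp`.
function planFrames(events, fps = 50, screenshotMs = 300) {
  const durationMs = events[events.length - 1].timestamp - events[0].timestamp;
  const frameMs = 1000 / fps;
  const frameCount = Math.ceil(durationMs / frameMs);
  // Time offsets at which playback is paused and a screenshot is taken.
  const offsets = Array.from({ length: frameCount }, (_, i) => i * frameMs);
  // Rough conversion time if every screenshot takes `screenshotMs`.
  const estimatedSeconds = (frameCount * screenshotMs) / 1000;
  return { durationMs, frameCount, offsets, estimatedSeconds };
}

// A made-up one-second recording.
const plan = planFrames([{ timestamp: 5000 }, { timestamp: 5400 }, { timestamp: 6000 }]);
console.log(plan.frameCount);          // 50
console.log(plan.offsets.slice(0, 3)); // [ 0, 20, 40 ]
console.log(plan.estimatedSeconds);    // 15
```

Lowering the frame rate reduces the number of screenshots, and therefore the conversion time, proportionally, at the cost of less smooth video.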

Conclusion

Because Puppeteer screenshotting is slow (≈300 ms per frame), converting one second of rrweb data to video at 50 fps currently takes about 15 seconds (50 frames × ≈300 ms), which is not fast enough. Contributions are welcome to improve the efficiency and robustness of this rrweb‑to‑video tool. Source code: https://github.com/gumuqi/rrweb-to-video
