Understanding Video Frame Color Spaces: RGB, YUV, and FFmpeg Pixel Formats

This article explains the fundamentals of video frame color spaces, covering how video frames are composed, the differences between RGB and YUV, FFmpeg pixel format definitions, chroma subsampling schemes, storage layouts, and conversion formulas, with code examples and visual illustrations.

Tencent IMWeb Frontend Team
When you start working on frontend audio/video development, for example extracting frames with FFmpeg compiled to WASM, you need a grasp of basic concepts such as video frames and color encoding. This article introduces the color spaces used in video frames.

1. Video Frames

A video consists of a series of frames, each shown for a fixed interval such as 1/24 or 1/30 of a second, which together create the impression of continuous motion. Each frame is a matrix of pixels expressed in either the RGB or the YUV color space. In FFmpeg, the header libavutil/pixfmt.h defines many pixel formats, most of which belong to the RGB or YUV families.

<code>enum AVPixelFormat {
  // ... (less important types omitted)
  AV_PIX_FMT_YUV420P,   ///< planar YUV 4:2:0, 12bpp, (1 Cr & Cb sample per 2x2 Y samples)
  AV_PIX_FMT_YUYV422,   ///< packed YUV 4:2:2, 16bpp, Y0 Cb Y1 Cr
  AV_PIX_FMT_YUV422P,   ///< planar YUV 4:2:2, 16bpp, (1 Cr & Cb sample per 2x1 Y samples)
  AV_PIX_FMT_UYVY422,   ///< packed YUV 4:2:2, 16bpp, Cb Y0 Cr Y1
  AV_PIX_FMT_YUV444P,   ///< planar YUV 4:4:4, 24bpp, (1 Cr & Cb sample per 1x1 Y samples)
  AV_PIX_FMT_YUV440P,   ///< planar YUV 4:4:0 (1 Cr & Cb sample per 1x2 Y samples)

  AV_PIX_FMT_RGB24,     ///< packed RGB 8:8:8, 24bpp, RGBRGB...
  AV_PIX_FMT_BGR24,     ///< packed RGB 8:8:8, 24bpp, BGRBGR...

  AV_PIX_FMT_ARGB,      ///< packed ARGB 8:8:8:8, 32bpp, ARGBARGB...
  AV_PIX_FMT_RGBA,      ///< packed RGBA 8:8:8:8, 32bpp, RGBARGBA...
  AV_PIX_FMT_ABGR,      ///< packed ABGR 8:8:8:8, 32bpp, ABGRABGR...
  AV_PIX_FMT_BGRA,      ///< packed BGRA 8:8:8:8, 32bpp, BGRABGRA...

  AV_PIX_FMT_RGB565BE,  ///< packed RGB 5:6:5, 16bpp, (msb)   5R 6G 5B(lsb), big-endian
  AV_PIX_FMT_RGB565LE,  ///< packed RGB 5:6:5, 16bpp, (msb)   5R 6G 5B(lsb), little-endian
  AV_PIX_FMT_RGB555BE,  ///< packed RGB 5:5:5, 16bpp, (msb)1X 5R 5G 5B(lsb), big-endian
  AV_PIX_FMT_RGB555LE,  ///< packed RGB 5:5:5, 16bpp, (msb)1X 5R 5G 5B(lsb), little-endian

  AV_PIX_FMT_BGR565BE,  ///< packed BGR 5:6:5, 16bpp, (msb)   5B 6G 5R(lsb), big-endian
  AV_PIX_FMT_BGR565LE,  ///< packed BGR 5:6:5, 16bpp, (msb)   5B 6G 5R(lsb), little-endian
  AV_PIX_FMT_BGR555BE,  ///< packed BGR 5:5:5, 16bpp, (msb)1X 5B 5G 5R(lsb), big-endian
  AV_PIX_FMT_BGR555LE,  ///< packed BGR 5:5:5, 16bpp, (msb)1X 5B 5G 5R(lsb), little-endian
};
</code>

Each format comment begins with either packed or planar, which describes the storage layout. The YUV types are followed by numbers such as 4:2:0 or 4:2:2, which indicate the chroma subsampling scheme.

2. RGB and YUV

RGB and YUV are both color spaces. RGB uses three channels (red, green, and blue) to represent colors. YUV separates luminance (Y) from chrominance (U and V), so a black-and-white image can be displayed from the Y channel alone, without any color information.

1. RGB

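In CSS, developers frequently use RGB or RGBA values. FFmpeg defines 16-bit formats (RGB555, RGB565), 24-bit formats (RGB24), and 32-bit formats (RGBA/ARGB). Each also has a variant with the component order reversed (BGR555, BGR565, BGR24, BGRA/ABGR), and the 16-bit formats carry a BE or LE suffix that gives the byte order of each 16-bit value in memory.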

<code># RGB555
XRRR RRGG GGGB BBBB

# RGB565
RRRR RGGG GGGB BBBB
</code>
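The bit layouts above can be sketched in C as follows. This is only an illustration: the function names `pack_rgb565`, `pack_rgb555`, and `store_u16` are hypothetical helpers, not FFmpeg APIs, and the sketch assumes 8-bit input channels whose low bits are simply truncated.

```c
#include <stdint.h>

/* Pack 8-bit R, G, B into one 16-bit RGB565 value: RRRRRGGG GGGBBBBB.
   The low 3 (R, B) or 2 (G) bits of each channel are dropped. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Pack into RGB555: XRRRRRGG GGGBBBBB, leaving the unused top bit at 0. */
static uint16_t pack_rgb555(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

/* Write a 16-bit value to memory in big-endian or little-endian byte order,
   which is the distinction the BE/LE suffix of the format names expresses. */
static void store_u16(uint8_t out[2], uint16_t v, int big_endian) {
    uint8_t hi = (uint8_t)(v >> 8);
    uint8_t lo = (uint8_t)(v & 0xFF);
    if (big_endian) {
        out[0] = hi;
        out[1] = lo;
    } else {
        out[0] = lo;
        out[1] = hi;
    }
}
```

For example, pure red (255, 0, 0) packs to 0xF800 in RGB565, which is stored as the bytes F8 00 in big-endian order and 00 F8 in little-endian order.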

RGB24 allocates 8 bits per channel (24 bits total). The 32-bit formats add an 8-bit alpha channel and are named RGBA or ARGB according to where the alpha byte sits.

<code># RGB24
RRRRRRRR GGGGGGGG BBBBBBBB

# RGBA (32-bit)
RRRRRRRR GGGGGGGG BBBBBBBB AAAAAAAA
</code>

2. YUV

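YUV is used here as an umbrella term for the closely related Y'UV, YCbCr, and YPbPr encodings. Video pipelines use it to reduce bandwidth, because the chroma channels can be stored at a lower resolution than luminance. Common subsampling formats include YUV420, YUV422, and YUV444, which differ in how chroma information is sampled.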

YUV ↔ RGB conversion

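Conversion formulas, using the BT.601 coefficients with R, G, and B normalized to the range 0 to 1 (YUV to RGB first, then RGB to YUV):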

<code>R = Y + 1.13983 * V
G = Y - 0.39465 * U - 0.58060 * V
B = Y + 2.03211 * U
</code>
<code>Y = 0.299 * R + 0.587 * G + 0.114 * B
U = -0.14713 * R - 0.28886 * G + 0.436 * B
V = 0.615 * R - 0.51499 * G - 0.10001 * B
</code>
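As a minimal sketch, the two sets of formulas translate directly into C. The struct and function names (`RGB`, `YUV`, `rgb_to_yuv`, `yuv_to_rgb`) are made up for this example, and it assumes floating-point channels with R, G, and B in the range 0 to 1 rather than the 8-bit integer samples found in real frames.

```c
typedef struct { double r, g, b; } RGB;  /* each channel in [0, 1] */
typedef struct { double y, u, v; } YUV;  /* y in [0, 1]; u and v are signed */

/* RGB -> YUV, using the coefficients from the formulas above. */
static YUV rgb_to_yuv(RGB c) {
    YUV out;
    out.y =  0.299   * c.r + 0.587   * c.g + 0.114   * c.b;
    out.u = -0.14713 * c.r - 0.28886 * c.g + 0.436   * c.b;
    out.v =  0.615   * c.r - 0.51499 * c.g - 0.10001 * c.b;
    return out;
}

/* YUV -> RGB, the inverse transform. */
static RGB yuv_to_rgb(YUV c) {
    RGB out;
    out.r = c.y                 + 1.13983 * c.v;
    out.g = c.y - 0.39465 * c.u - 0.58060 * c.v;
    out.b = c.y + 2.03211 * c.u;
    return out;
}
```

Because the coefficients are rounded, a round trip reproduces the input only approximately. For any gray input (R = G = B), U and V come out at roughly zero, which is why the Y channel alone is enough to show a black-and-white image.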

Sampling

Chroma subsampling reduces color data by exploiting the fact that human vision is less sensitive to color detail than to brightness. The notation J:A:B (e.g., 4:2:2) describes the sampling within a reference block that is J pixels wide and 2 rows high.

J – horizontal sampling width, usually 4.

A – number of chroma samples in the first row of J pixels.

B – number of new chroma samples in the second row of J pixels; 0 means the second row reuses the chroma of the first row.

YUV 4:4:4

Full sampling: each Y sample has its own U and V samples, so a frame takes the same space as the original RGB image (24 bits per pixel at 8 bits per sample).

YUV 4:2:2

Every two horizontally adjacent Y samples share one UV pair, reducing the data size to two-thirds of the original (16 bits per pixel).

YUV 4:2:0

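The most common scheme for video frames: in the first row, every two Y samples share one UV pair, while the second row samples only Y and reuses the chroma of the row above. Each 2x2 block of Y samples therefore shares one UV pair, which halves the data size (12 bits per pixel).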
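To make the size differences concrete, here is a rough sketch that computes the bytes needed for one frame under a given J:A:B scheme. The function `yuv_frame_bytes` is a hypothetical helper, and it assumes J = 4, 8 bits per sample, B equal to either 0 or A, and no row padding.

```c
#include <stddef.h>

/* Bytes needed for one 8-bit YUV frame under 4:a:b subsampling.
   Assumes J = 4, b is either 0 or a, and rows are not padded. */
static size_t yuv_frame_bytes(size_t width, size_t height, int a, int b) {
    /* a chroma samples per 4 luma samples horizontally (rounded up). */
    size_t chroma_w = (width * (size_t)a + 3) / 4;
    /* b == 0 means every second row reuses the chroma of the row above. */
    size_t chroma_h = (b == 0) ? (height + 1) / 2 : height;
    /* One full-resolution Y plane plus two chroma planes (U and V). */
    return width * height + 2 * chroma_w * chroma_h;
}
```

For a 1920x1080 frame this gives 6,220,800 bytes for 4:4:4, 4,147,200 bytes for 4:2:2, and 3,110,400 bytes for 4:2:0, matching the full, two-thirds, and one-half ratios described above.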

Storage formats

Pixel data can be stored as packed or planar. A packed format interleaves the channel values of each pixel in a single array (e.g., RGBRGB...). A planar format stores each channel in its own contiguous plane: all Y values first, then all U, then all V (e.g., I420).

Packed

Typical for RGB. In YUV, packed formats such as YUYV and UYVY are based on 4:2:2 sampling.

YUYV example order: Y0 U0 Y1 V0, Y2 U2 Y3 V2.

UYVY example order: U0 Y0 V0 Y1, U2 Y2 V2 Y3.
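The two byte orders can be read back with a small amount of index arithmetic. The sketch below is illustrative only: `YUVPixel`, `yuyv_get`, and `uyvy_get` are hypothetical names, and it assumes 8-bit samples and a row that holds an even number of pixels.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t y, u, v; } YUVPixel;

/* Read pixel x from one row of packed YUYV data.
   Every 4 bytes hold two pixels: Y0 U Y1 V. */
static YUVPixel yuyv_get(const uint8_t *row, size_t x) {
    const uint8_t *pair = row + (x / 2) * 4;  /* start of the 2-pixel group */
    YUVPixel p;
    p.y = pair[(x % 2) * 2];  /* Y0 at offset 0, Y1 at offset 2 */
    p.u = pair[1];            /* U shared by both pixels */
    p.v = pair[3];            /* V shared by both pixels */
    return p;
}

/* Same idea for UYVY, where every 4 bytes are laid out as U Y0 V Y1. */
static YUVPixel uyvy_get(const uint8_t *row, size_t x) {
    const uint8_t *pair = row + (x / 2) * 4;
    YUVPixel p;
    p.y = pair[(x % 2) * 2 + 1];  /* Y0 at offset 1, Y1 at offset 3 */
    p.u = pair[0];
    p.v = pair[2];
    return p;
}
```

Both pixels of a pair return the same U and V values, which is exactly the 4:2:2 sharing described in the sampling section.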

Planar

All Y samples are stored first, followed by U and V. Example: I420 (YUV 4:2:0).
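As a sketch of how a planar buffer is addressed, the following computes where the Y, U, and V samples of a given pixel live in an I420 buffer. `I420Index` and `i420_index` are hypothetical names, and the code assumes a single tightly packed buffer with no row padding; frames decoded by FFmpeg may use padded rows, in which case the per-plane stride has to be used instead of the width.

```c
#include <stddef.h>

typedef struct { size_t y, u, v; } I420Index;

/* Byte offsets of the Y, U and V samples for pixel (x, row) in a tightly
   packed I420 buffer: a full Y plane, then a quarter-size U plane, then V. */
static I420Index i420_index(size_t width, size_t height, size_t x, size_t row) {
    size_t chroma_w = (width + 1) / 2;   /* chroma planes are half width  */
    size_t chroma_h = (height + 1) / 2;  /* ... and half height (rounded up) */
    size_t y_size = width * height;
    size_t c_size = chroma_w * chroma_h;
    /* Each 2x2 block of pixels maps to one chroma sample. */
    size_t c_off = (row / 2) * chroma_w + x / 2;

    I420Index idx;
    idx.y = row * width + x;
    idx.u = y_size + c_off;
    idx.v = y_size + c_size + c_off;
    return idx;
}
```

In a 4x4 frame, for instance, the Y plane occupies bytes 0 to 15, the U plane bytes 16 to 19, and the V plane bytes 20 to 23, and the four pixels of the top-left 2x2 block all resolve to the same U and V offsets.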

3. Conclusion

Many YUV formats exist, but they all rest on the same principles and differ mainly in subsampling scheme and storage layout. This variety reflects the lack of a unified standard, which once caused considerable confusion.

FFmpeg's YUV-to-RGB conversion runs on the CPU. Because the conversion formulas are linear, they can be written as a matrix multiplication, which can be moved to the GPU for better performance.
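To illustrate the matrix form, the YUV-to-RGB formulas from section 2 can be written as a 3x3 matrix applied to the vector (Y, U, V). The sketch below runs on the CPU and only shows the structure a GPU shader would evaluate per pixel; `YUV_TO_RGB` and `mat3_apply` are hypothetical names, and normalized floating-point samples are assumed.

```c
/* Rows produce R, G and B; columns multiply Y, U and V.
   Coefficients are the ones from the YUV -> RGB formulas in section 2. */
static const double YUV_TO_RGB[3][3] = {
    { 1.0,  0.0,      1.13983 },
    { 1.0, -0.39465, -0.58060 },
    { 1.0,  2.03211,  0.0     },
};

/* out = m * in, for a 3x3 matrix and a 3-component vector. */
static void mat3_apply(const double m[3][3], const double in[3], double out[3]) {
    for (int i = 0; i < 3; i++) {
        out[i] = m[i][0] * in[0] + m[i][1] * in[1] + m[i][2] * in[2];
    }
}
```

Every pixel is transformed by the same matrix and independently of its neighbors, which is why this step maps well onto GPU hardware.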

References

libavutil/pixfmt.h source code

Wikipedia – Film frame

Wikipedia – pixel format

Wikipedia – chroma subsampling

Wikipedia – YUV

Zhihu – Video and frame basics

Audio‑Video Development – Understanding YUV sampling and formats
