
Understanding Digital Image Acquisition, Color Spaces, and Video Encoding Basics

This article explains how digital images are captured as pixel arrays, describes the physiological basis of color perception, compares RGB and YUV color spaces, and introduces video compression concepts such as spatial and temporal redundancy, frame types, and common encoding formats.

New Oriental Technology

A video is a sequence of frames; each frame is a captured digital image whose pixels are stored in binary form.

A digital image consists of a rectangular grid of pixels, each holding a color value. In a grayscale image each pixel stores a single intensity from 0 (black) to 255 (white); in a color image each pixel combines red, green, and blue components, each also ranging from 0 to 255.
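A minimal sketch of this pixel-grid model, using a hypothetical 2x2 color image (the image contents here are made up for illustration):

```python
# A digital image as a rectangular grid of pixels.
# Each pixel is an (R, G, B) triple; each channel ranges from 0 to 255.
width, height = 2, 2
image = [
    [(255, 0, 0), (0, 255, 0)],      # row 0: pure red, pure green
    [(0, 0, 255), (128, 128, 128)],  # row 1: pure blue, mid gray
]

# Addressing a pixel by row and column:
r, g, b = image[1][0]
print(r, g, b)  # 0 0 255
```

Real images are just much larger versions of the same structure, typically stored as contiguous byte arrays rather than nested lists.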

Human color perception relies on three types of cone cells: S, M, and L, most sensitive to short (blue), medium (green), and long (red) wavelengths, respectively. Rod cells detect brightness in low-light conditions.

The RGB color space encodes each pixel with three bytes (R, G, B), allowing 16,777,216 possible colors (24‑bit true color).
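The 24-bit figure follows directly from three 8-bit channels. A small sketch of packing one pixel into a single integer:

```python
# Packing an (R, G, B) pixel into one 24-bit value: 8 bits per channel.
def pack_rgb(r: int, g: int, b: int) -> int:
    return (r << 16) | (g << 8) | b

# White is the largest representable value.
print(pack_rgb(255, 255, 255))  # 16777215
# Three 8-bit channels give 2^24 distinct colors.
print(2 ** 24)                  # 16777216
```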

The YUV color space separates luminance (Y) from chrominance (U, V), enabling higher compression by storing full‑resolution brightness and lower‑resolution color information; YUV is often represented as Y'CbCr in digital video.
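The storage saving from lower-resolution chrominance can be made concrete. The sketch below compares raw frame sizes for RGB versus YUV with 4:2:0 chroma subsampling (one U and one V sample per 2x2 block of pixels), assuming 8 bits per sample and a hypothetical 1920x1080 frame:

```python
# Bytes per raw frame: RGB vs YUV 4:2:0, at 8 bits per sample.
w, h = 1920, 1080

rgb_bytes = w * h * 3                            # three full-resolution channels
yuv420_bytes = w * h + 2 * (w // 2) * (h // 2)   # full-res Y, quarter-res U and V

print(rgb_bytes)     # 6220800
print(yuv420_bytes)  # 3110400 -- half the size before any compression
```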

Conversion between RGB and YUV is performed with the following formulas:

Y = 0.299 * R + 0.587 * G + 0.114 * B
U = -0.147 * R - 0.289 * G + 0.436 * B
V = 0.615 * R - 0.515 * G - 0.100 * B

R = Y + 1.14 * V
G = Y - 0.39 * U - 0.58 * V
B = Y + 2.03 * U
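A direct sketch of these formulas as functions (the coefficients are the approximations given above, so a round trip recovers the original values only to within a small error):

```python
# RGB -> YUV using the article's coefficients.
def rgb_to_yuv(r: float, g: float, b: float) -> tuple:
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v

# YUV -> RGB, the approximate inverse.
def yuv_to_rgb(y: float, u: float, v: float) -> tuple:
    r = y + 1.14 * v
    g = y - 0.39 * u - 0.58 * v
    b = y + 2.03 * u
    return r, g, b

y, u, v = rgb_to_yuv(200, 100, 50)
print(yuv_to_rgb(y, u, v))  # approximately (200, 100, 50)
```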

Video encoding reduces massive raw data size by eliminating spatial redundancy (similar pixels within a frame) and temporal redundancy (similarities between consecutive frames) using intra‑frame (spatial) and inter‑frame (temporal) compression techniques.
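The idea behind temporal compression can be sketched in a few lines: instead of storing each frame whole, store the per-pixel difference from the previous frame. The toy 1-D grayscale "frames" below are made up for illustration; real codecs use far more sophisticated motion-compensated prediction, but the principle is the same:

```python
# Exploiting temporal redundancy: store deltas between consecutive frames.
frame1 = [10, 10, 10, 200]
frame2 = [10, 10, 12, 200]  # nearly identical to frame1

# Encoder: the delta is mostly zeros, which compresses very well.
delta = [b - a for a, b in zip(frame1, frame2)]
print(delta)  # [0, 0, 2, 0]

# Decoder: reconstruct frame2 from frame1 plus the delta.
reconstructed = [a + d for a, d in zip(frame1, delta)]
assert reconstructed == frame2
```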

Frames are classified as I‑frames (self‑contained key frames), P‑frames (predicted from previous frames), and B‑frames (bidirectional prediction), each offering different compression ratios and decoding requirements.

Common encoding standards such as H.264/H.265 use inter‑frame compression for efficient streaming, while intra‑frame codecs like ProRes prioritize quality and editing convenience.

Container formats (e.g., MP4, MOV, MKV) act as wrappers that package encoded video, audio, and subtitles, whereas the codec determines how the visual data is compressed.
