
What Is WebRTC? Overview, Architecture, Signaling, and Demo Implementation

This article explains WebRTC as a cross‑platform, low‑latency real‑time communication technology, covering its definition, three‑layer architecture, JavaScript APIs, signaling process, NAT traversal mechanisms, a complete demo code example, and a practical Douyin business use case.

TikTok Frontend Technology Team

What Is WebRTC

Cross‑platform, low‑latency, end‑to‑end audio‑video real‑time communication technology

Overview

WebRTC (Web Real‑Time Communication) generally refers to real‑time audio‑video communication, but the broader RTC concept also covers IM, images, whiteboards, file sharing, and other rich‑media interactions. WebRTC is both a set of browser APIs and a set of protocols.

Typical scenarios include P2P video calls, conference calls, live streaming, remote access, online education, tele‑medicine, IoT devices (drones, cameras, smart speakers) and more.

Any endpoint—browser, desktop app, Android/iOS device or IoT—can interoperate as long as it follows the WebRTC specifications and has IP connectivity.

The goal of WebRTC is to let web developers create rich real‑time multimedia applications in the browser with simple JavaScript, without installing plugins or handling low‑level signal processing.

WebRTC Principles

Three‑Layer Architecture

Your Web App Layer

Implements the real‑time communication application.

Web API Layer

This layer exposes the WebRTC JavaScript APIs to developers. The main APIs are:

API categories: Media Stream API, RTCPeerConnection, and the Peer‑to‑peer Data API.

Media Stream API: Access the camera and microphone via MediaStream to obtain synchronized audio‑video streams.

RTCPeerConnection: Represents a WebRTC connection between the local machine and a remote peer, providing methods to create, maintain, monitor, and close the connection.

Peer‑to‑peer Data API: Creates a high‑throughput, low‑latency data channel (RTCDataChannel) for arbitrary data transfer.
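As a sketch of the Peer‑to‑peer Data API, the snippet below opens an RTCDataChannel next to a media session. The channel label "chat" and the JSON message envelope are illustrative choices, not anything mandated by the API.

```javascript
// Pure helper: wrap a chat message in a JSON envelope for the channel.
// The envelope shape is an illustrative assumption.
function makeChatEnvelope(user, text){
  return JSON.stringify({type: 'chat', user, text, ts: Date.now()});
}

// Browser-only sketch: attach a data channel to an existing RTCPeerConnection.
function setupDataChannel(peerConnection){
  // Creating the channel before createOffer() includes it in the SDP.
  const channel = peerConnection.createDataChannel('chat');
  channel.onopen = () => channel.send(makeChatEnvelope('alice', 'hello'));
  channel.onmessage = e => {
    const msg = JSON.parse(e.data);
    console.log(`${msg.user}: ${msg.text}`);
  };
  return channel;
}
```

The remote side receives the channel through the peer connection's `ondatachannel` event rather than creating one itself.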

WebRTC Core Layer (Four Layers)

WebRTC C/C++ API (PeerConnection): Implements P2P connection, audio/video capture, transmission, and non‑media data.

Session Management / Abstract Signaling: Manages sessions for audio, video and data streams.

Audio Engine, Video Engine, Transport: Core processing modules.

Hardware Adaptation Layer: Handles device‑level capture/rendering, video capture, and network I/O; these modules can be overridden for custom implementations.

WebRTC Communication

WebRTC uses RTCPeerConnection to exchange media streams between browsers. After creating an RTCPeerConnection instance, two negotiation steps are required:

Media negotiation – determine stream characteristics (resolution, codecs, etc.) via SDP.

Network negotiation – exchange ICE candidates to discover reachable network addresses.

Signaling

Before a WebRTC connection can be established, a signaling process exchanges metadata so that the two peers can locate each other. Signaling messages are plain text and can be transported via WebSockets or any other channel.

Signaling purposes include:

Control messages to open/close the connection.

Error notifications.

Media adaptation data (codec, bandwidth, etc.).

Security key exchange.

Network configuration (IP, ports).

The signaling server acts as an intermediary that helps both peers establish a connection while minimizing privacy exposure.
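Since WebRTC does not standardize any signaling format, a minimal channel can be sketched over WebSocket as follows; the `{kind, payload}` message shape and the server URL are assumptions for illustration.

```javascript
// Pure helpers: encode/decode signaling messages (illustrative format).
function encodeSignal(kind, payload){
  return JSON.stringify({kind, payload});
}
function decodeSignal(raw){
  const {kind, payload} = JSON.parse(raw);
  return {kind, payload};
}

// Browser-only sketch: forward SDP and ICE candidates through a WebSocket.
function connectSignaling(pc, url){
  const ws = new WebSocket(url); // e.g. 'wss://signal.example.com' (hypothetical)
  // Send local ICE candidates to the peer as they become available.
  pc.onicecandidate = e => {
    if (e.candidate) ws.send(encodeSignal('candidate', e.candidate));
  };
  ws.onmessage = async e => {
    const {kind, payload} = decodeSignal(e.data);
    if (kind === 'offer') {
      await pc.setRemoteDescription(payload);
      const answer = await pc.createAnswer();
      await pc.setLocalDescription(answer);
      ws.send(encodeSignal('answer', pc.localDescription));
    } else if (kind === 'answer') {
      await pc.setRemoteDescription(payload);
    } else if (kind === 'candidate') {
      await pc.addIceCandidate(payload);
    }
  };
  return ws;
}
```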

Session Description Protocol (SDP)

Media negotiation details are described using SDP, a line‑based text format of `<type>=<value>` pairs. An SDP description contains session‑level fields followed by one or more media descriptions, each typically mapping to a single media stream.
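For illustration, here is a heavily trimmed SDP description and a small helper that lists the media types it declares; real browser-generated SDP contains many more attribute lines.

```javascript
// Trimmed, illustrative SDP: session-level lines first, then one
// "m=" media section per stream.
const sampleSdp = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000'
].join('\r\n');

// Each "m=" line opens a media description; the first token after
// "m=" is the media type.
function mediaTypes(sdp){
  return sdp.split('\r\n')
    .filter(line => line.startsWith('m='))
    .map(line => line.slice(2).split(' ')[0]);
}

console.log(mediaTypes(sampleSdp)); // → [ 'audio', 'video' ]
```

In a real call this string comes from `pc.localDescription.sdp` after `createOffer()` resolves.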

SDP Handshake

Loosely analogous to TCP's three‑way handshake, WebRTC performs an offer/answer exchange over the signaling channel. At minimum this involves four kinds of messages: sending the offer, receiving the answer, and exchanging ICE candidates in both directions.

NAT Traversal (Hole Punching)

To establish a direct peer‑to‑peer channel across NATs and firewalls, WebRTC uses ICE, which integrates STUN and TURN protocols.

STUN: Discovers the public IP/port and NAT type.

TURN: Relays traffic through a server when a direct connection fails.

ICE: Tries direct and STUN‑derived candidates first, then falls back to TURN relay, so a connection can be established in almost any network environment.
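Each ICE candidate string carries a "typ" token that reveals how the address was discovered, which makes the STUN/TURN distinction visible in code. A small sketch, with illustrative candidate strings:

```javascript
// Map an ICE candidate line to the connection path it represents:
// "host"  = local interface address (direct)
// "srflx" = server-reflexive address learned via STUN (direct)
// "relay" = address on a TURN server (relayed)
function candidatePath(candidateLine){
  const match = candidateLine.match(/ typ (\w+)/);
  const typ = match ? match[1] : 'unknown';
  return {host: 'direct (local address)',
          srflx: 'direct (public address via STUN)',
          relay: 'relayed (through a TURN server)'}[typ] || 'unknown';
}

console.log(candidatePath('candidate:1 1 udp 2122260223 192.168.1.2 54321 typ host'));
// → direct (local address)
console.log(candidatePath('candidate:2 1 udp 1686052607 203.0.113.5 54321 typ srflx raddr 192.168.1.2 rport 54321'));
// → direct (public address via STUN)
```

In the browser these strings arrive as `event.candidate.candidate` inside the `onicecandidate` handler.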

WebRTC Demo

Below is a minimal demo that creates a local video element, a remote video element, and uses JavaScript to establish a peer connection.

Demo HTML

<!DOCTYPE html>
<html>
<head>
  <title>WebRTC Demo</title>
  <style type="text/css">
    #remote{position:absolute;top:100px;left:100px;width:500px;}
    #local{position:absolute;top:120px;left:480px;width:100px;z-index:9999;border:1px solid #ddd;}
  </style>
</head>
<body>
  <video id="local" autoplay></video>
  <video id="remote" autoplay></video>
  <script type="text/javascript" src="./main.js"></script>
</body>
</html>

Demo JavaScript (main.js)

// Feature detection for the APIs used below.
function hasUserMedia(){
  return !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
}
function hasRTCPeerConnection(){
  return !!window.RTCPeerConnection;
}

// For demo purposes both peers live in the same page, so SDP and ICE
// candidates are handed over directly instead of through a signaling server.
function startPeerConnection(stream){
  const remoteVideo = document.getElementById("remote");
  const config = {iceServers: [{urls: 'stun:stun.l.google.com:19302'}]};
  const localConnection = new RTCPeerConnection(config);
  const remoteConnection = new RTCPeerConnection(config);

  localConnection.onicecandidate = e => { if (e.candidate) remoteConnection.addIceCandidate(e.candidate); };
  remoteConnection.onicecandidate = e => { if (e.candidate) localConnection.addIceCandidate(e.candidate); };

  // Render remote tracks as they arrive.
  remoteConnection.ontrack = e => { remoteVideo.srcObject = e.streams[0]; };
  stream.getTracks().forEach(track => localConnection.addTrack(track, stream));

  // Offer/answer exchange, performed in-page here.
  localConnection.createOffer()
    .then(offer => Promise.all([
      localConnection.setLocalDescription(offer),
      remoteConnection.setRemoteDescription(offer)
    ]))
    .then(() => remoteConnection.createAnswer())
    .then(answer => Promise.all([
      remoteConnection.setLocalDescription(answer),
      localConnection.setRemoteDescription(answer)
    ]))
    .catch(err => console.error(err));
}

function main(){
  const localVideo = document.getElementById("local");
  if (!hasUserMedia()) { alert("No getUserMedia API"); return; }
  navigator.mediaDevices.getUserMedia({video: true, audio: false})
    .then(stream => {
      localVideo.srcObject = stream;
      if (hasRTCPeerConnection()) startPeerConnection(stream);
      else alert("No RTCPeerConnection API");
    })
    .catch(err => console.log(err));
}

main();

Business Scenario: Douyin "Xiao An" Human‑initiated Call

Douyin’s security assistant uses a WebRTC SDK provided by Volcano Engine. The flow is:

Create and initialize the client engine via createEngine.

Join an RTC room with engine.joinRoom, configuring auto‑publish/subscribe.

Capture local audio/video with startAudioCapture / startVideoCapture, then publish and play locally.

Subscribe to and play remote streams.

Leave the room with leaveRoom when the call ends.

Key parameters for establishing the connection are appId, token, roomId, and uid.

SDK Call Flow (Code Snippet)

// Step 1: Fetch RTC init config
const createVoip = async () => {
  const data = await InterveneServer.createVoip({
    ToUserId: callParams?.ToUserId,
    VoipType: 1,
    BizScene: callParams?.BizScene,
    Desc: callParams?.Desc,
    Command: callParams?.Command
  });
  setToken(data?.Token);
  config.current = {appId: data?.AppId, roomId: data?.RoomId, uid: data?.MyAppUserId, voipUUid: data?.VoipUUid};
  setJoin(true);
  setCallStatus(CallStatus.WAITING);
};

// Step 2: Create RTC instance
export default class RtcComponent extends React.Component {
  rtc = new RtcClient(this.props);
  componentDidMount(){ this.props.onRef(this.rtc); }
  // Renders nothing itself; the component only exposes the RTC client instance.
  render(){ return null; }
}

// Step 3: Init RTC, bind events, join room
const initRTC = async () => {
  const {roomId, uid} = config.current || {};
  if(!roomId || !uid || !rtc.current) return;
  rtc.current.bindEngineEvents();
  await rtc.current.join(token, roomId, uid);
  await rtc.current.createLocalStream(res => {
    const {code, devicesStatus} = res;
    if(code === ERROR_CODE || devicesStatus.audio === FAILED){ setMicOn(false); return; }
  });
};

// Step 4: Handle remote stream addition
const handleStreamAdd = useCallback(event => {
  const stream = event.stream;
  const userId = stream.userId;
  if(count.current < 3 && !remoteStreams[userId]){
    remoteStreams[userId] = stream;
    stream.playerComp = null; // player component JSX omitted from the original snippet
    setRemoteStreams({...remoteStreams});
    count.current += 1;
  }
}, [remoteStreams]);

Appendix

WebRTC vs RTMP: RTMP is a TCP‑based streaming protocol with typical latencies of several seconds, historically tied to Flash; WebRTC runs over UDP (SRTP/SCTP) and targets sub‑second, bidirectional communication.

WebRTC vs WebSocket: WebSocket provides a reliable, ordered message channel between a browser and a server over TCP; WebRTC provides peer‑to‑peer media and data transport, with the option of unreliable, unordered delivery for latency‑sensitive data.

Written by

TikTok Frontend Technology Team

We are the TikTok Frontend Technology Team, serving TikTok and multiple ByteDance product lines, focused on building frontend infrastructure and exploring community technologies.
