Introduction to WebRTC support in XProtect

WebRTC (Web Real-Time Communication) is an open standard for real-time peer-to-peer communication on IP networks. It lets audio and video communication and streaming work directly inside web pages, eliminating the need to install plugins or download native apps.

The WebRTC features currently supported in XProtect are:

Establishing a WebRTC session

A connection through WebRTC is established through a process called signaling. The WebRTC standard doesn’t recommend any specific way to implement signaling. XProtect offers signaling through either a RESTful API or a WebSocket API, both hosted by the API Gateway.

RESTful

The following API Gateway endpoints are used to establish WebRTC connections:

For more details about the WebRTC endpoints and methods, refer to the WebRTC RESTful API reference documentation.

The signaling between the client (for example, a browser) and the API Gateway is as follows:

WebRTC RESTful signaling sequence
  1. Make an HTTP POST request to /webRTC/session to start a session. The POST body should at least contain the Id of the camera from which the WebRTC session should stream video. For more details about the session options, refer to Session options. The response will be a session data object containing information about the session. The information relevant here is the session Id and an Offer SDP. The session Id must be used in all subsequent requests in that session.
  2. The client verifies the Offer SDP and generates an Answer SDP if it is acceptable. The client should then update the session with a PATCH request with the Answer SDP.
  3. Once SDPs have been exchanged, ICE candidates should be exchanged. The client should POST its ICE candidates to the endpoint /webRTC/iceCandidates/ and retrieve the server ICE candidates with a GET request to the same endpoint.
  4. After ICE candidates have been exchanged, WebRTC will try to establish a UDP connection if possible with any pair of the ICE candidates, and video will be streamed to the client.
  5. Communication with the WebRTC endpoints uses the same OAuth bearer token as all other XProtect RESTful endpoints. This bearer token will eventually expire, so the client should be sure to retrieve an updated bearer token and PATCH the session to update the token.
  6. Once the UDP session is closed, the WebRTC session in the API Gateway will also close. No further action is needed to close the session.
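The sequence above can be sketched as a set of request builders. This is a minimal sketch, not a reference client: the base URL, the session Id path suffixes, and the body keys `cameraId`, `answerSDP`, and `candidates` are assumptions made for illustration. Check the WebRTC RESTful API reference documentation for the actual schema.

```typescript
// One HTTP call of the signaling sequence; pass the fields to fetch() or any HTTP client.
interface SignalingRequest {
  method: "POST" | "PATCH" | "GET";
  url: string;
  headers: { [header: string]: string };
  body?: string;
}

// Every call carries the same OAuth bearer token as other XProtect RESTful endpoints.
function signalingRequest(
  method: "POST" | "PATCH" | "GET", url: string, token: string, payload?: object
): SignalingRequest {
  const req: SignalingRequest = {
    method: method,
    url: url,
    headers: { "Authorization": "Bearer " + token, "Content-Type": "application/json" },
  };
  if (payload !== undefined) req.body = JSON.stringify(payload);
  return req;
}

// Step 1: start a session. The key name "cameraId" is an assumption.
function startSession(base: string, token: string, cameraId: string): SignalingRequest {
  return signalingRequest("POST", base + "/webRTC/session", token, { cameraId: cameraId });
}

// Step 2: update the session with the Answer SDP (path suffix and key name are assumptions).
function sendAnswer(base: string, token: string, sessionId: string, answerSdp: string): SignalingRequest {
  return signalingRequest("PATCH", base + "/webRTC/session/" + sessionId, token, { answerSDP: answerSdp });
}

// Step 3: POST client candidates and GET server candidates on the same endpoint.
function postCandidates(base: string, token: string, sessionId: string, candidates: string[]): SignalingRequest {
  return signalingRequest("POST", base + "/webRTC/iceCandidates/" + sessionId, token, { candidates: candidates });
}

function getCandidates(base: string, token: string, sessionId: string): SignalingRequest {
  return signalingRequest("GET", base + "/webRTC/iceCandidates/" + sessionId, token);
}
```

In a browser, each returned object can be passed on as `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builders free of network calls makes the order of the steps easy to test in isolation.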

WebSocket

The WebSocket signaling is in compliance with the ONVIF WebRTC Specification. The following API Gateway entrypoint is used to connect to the WebSocket server:

/ws/webrtc/v1

The signaling between the client and the server is as follows:

WebRTC WebSocket signaling sequence
  1. The client registers with the server.
  2. The client sends a connect request. This request contains at minimum the Id of the camera. For more details about the session options, refer to Session options.
  3. The server will answer with a session id and a list of ICE servers.
  4. The server will send an invite request containing the session id and an Offer SDP.
  5. The client verifies the offer and generates an Answer SDP. The client then updates the server with an answer containing the Answer SDP.
  6. The connection is now open for trickles from either the client or the server. A trickle is a one-way message containing a new ICE candidate.
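The ordering of the steps above can be sketched as a small client-side state machine. The event and state names used here are assumptions derived from the wording of the steps. The actual wire format is defined by the ONVIF WebRTC Specification and is not reproduced in this sketch.

```typescript
type SignalingState = "idle" | "registered" | "connectSent" | "sessionCreated" | "invited" | "open";
type SignalingEvent =
  "registered" | "connectSent" | "connectResponse" | "inviteReceived" | "answerSent" | "trickle";

// Allowed transitions, one per step of the sequence.
const transitions: { [s in SignalingState]: { [e in SignalingEvent]?: SignalingState } } = {
  idle: { registered: "registered" },                 // step 1: client registers
  registered: { connectSent: "connectSent" },         // step 2: connect request with camera Id
  connectSent: { connectResponse: "sessionCreated" }, // step 3: session id and ICE servers
  sessionCreated: { inviteReceived: "invited" },      // step 4: invite with Offer SDP
  invited: { answerSent: "open" },                    // step 5: answer with Answer SDP
  open: { trickle: "open" },                          // step 6: trickles in either direction
};

// Returns the next state, or throws if the event is out of order.
function advance(state: SignalingState, ev: SignalingEvent): SignalingState {
  const next = transitions[state][ev];
  if (next === undefined) throw new Error("unexpected " + ev + " in state " + state);
  return next;
}
```

A client could call `advance` from its WebSocket message handler and treat a thrown error as a protocol violation, for example a trickle that arrives before the answer has been sent.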

ICE candidates

WebRTC clients use ICE candidates to negotiate the connection. The main parts of an ICE candidate are an address (IP address, DNS name, or mDNS name) and a port. The ICE candidates coming from a browser are generated by the WebRTC implementation in the browser and are sent to the API Gateway. At the same time, the API Gateway generates a list of ICE candidates for the session; these have to be retrieved from the server and given to the WebRTC implementation in the client. The WebRTC implementations will then try to establish a connection with each ICE candidate.

Trickle ICE

Trickle ICE is a method of improving the connection once it is made. The API Gateway fully supports trickle ICE, but it requires the clients to periodically poll the endpoint /webRTC/iceCandidates/ for new ICE candidates.
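When polling, the client needs to hand only previously unseen candidates to its WebRTC implementation. The following is a minimal sketch of that bookkeeping. It assumes that each poll returns the complete list of server candidates and that each candidate can be compared as a string; neither is stated in this document.

```typescript
// Returns the candidates from a poll that have not been seen before,
// and records them in `seen` so that the next poll skips them.
function takeNewCandidates(seen: { [candidate: string]: boolean }, polled: string[]): string[] {
  const fresh: string[] = [];
  for (const c of polled) {
    if (seen[c] !== true) {
      seen[c] = true;
      fresh.push(c);
    }
  }
  return fresh;
}
```

In a browser, a timer could GET the endpoint at an interval, pass the result through `takeNewCandidates`, and call `addIceCandidate` on the peer connection for each returned entry.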

Session options

The body of the initial POST request that starts a session includes the options for the session. Each option is described below.

Live video

To get a working WebRTC session, a camera Id is always required. No other parameters are required to get live video.

Video from specific stream

If the camera has multiple streams, you can get video from a specific stream by supplying the stream Id.

Playback of recorded video

Playback requires the timestamp (in ISO 8601 format) of the recorded video in the POST request. Playback itself has two options: skip gaps and speed.

Skip gaps
If enabled, the stream will skip gaps between video sequences.
Speed
Set speed to adjust the playback speed of recorded video.
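The options described above can be sketched as a small body builder. The key names `cameraId`, `streamId`, `playbackTime`, `skipGaps`, and `speed`, and the flat layout of the body, are assumptions for illustration; the actual schema, including any nesting of the playback options, is defined in the WebRTC RESTful API reference documentation.

```typescript
interface SessionOptions {
  cameraId: string;       // always required
  streamId?: string;      // a specific stream of the camera
  playbackTime?: string;  // ISO 8601; when present, recorded video is requested instead of live
  skipGaps?: boolean;     // playback only
  speed?: number;         // playback only
}

// Builds the JSON body for the initial POST and rejects inconsistent combinations.
function buildSessionOptions(opts: SessionOptions): string {
  const isPlayback = opts.playbackTime !== undefined;
  if (opts.playbackTime !== undefined && isNaN(Date.parse(opts.playbackTime))) {
    throw new Error("playbackTime is not a parseable timestamp: " + opts.playbackTime);
  }
  if (!isPlayback && (opts.skipGaps !== undefined || opts.speed !== undefined)) {
    throw new Error("skipGaps and speed only apply to playback");
  }
  const body: SessionOptions = { cameraId: opts.cameraId };
  if (opts.streamId !== undefined) body.streamId = opts.streamId;
  if (opts.playbackTime !== undefined) body.playbackTime = opts.playbackTime;
  if (opts.skipGaps !== undefined) body.skipGaps = opts.skipGaps;
  if (opts.speed !== undefined) body.speed = opts.speed;
  return JSON.stringify(body);
}
```

Rejecting `skipGaps` and `speed` without a playback timestamp is a design choice of this sketch, not documented server behavior.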

To get the time of recorded video, the rtpTimestamp in each frame must be inspected. This timestamp starts at 0 in the first frame. In the following frames, it is the time difference in milliseconds between the current frame and the first frame.
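As a minimal sketch of that calculation, the time of a frame can be derived by adding the rtpTimestamp to the time of the first frame. The sketch assumes that the first frame corresponds to the playback timestamp sent in the POST request, which this document does not state explicitly. The function name is hypothetical.

```typescript
// Adds the frame's rtpTimestamp (milliseconds since the first frame)
// to the assumed time of the first frame and returns an ISO 8601 string in UTC.
function frameTime(playbackStartIso: string, rtpTimestampMs: number): string {
  const startMs = Date.parse(playbackStartIso);
  if (isNaN(startMs)) throw new Error("invalid timestamp: " + playbackStartIso);
  return new Date(startMs + rtpTimestampMs).toISOString();
}
```

For example, with a playback start of 2024-05-01T12:00:00.000Z, a frame with an rtpTimestamp of 90500 maps to 2024-05-01T12:01:30.500Z.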

STUN and TURN servers

To help establish a connection through NATs, WebRTC uses STUN (Session Traversal Utilities for NAT) and/or TURN (Traversal Using Relays around NAT) servers.

STUN server
Used to discover the public IP address and port number of a device behind a NAT.
TURN server
Used to relay traffic between peers when a direct connection is not possible due to firewall or NAT restrictions.

For more information about STUN and TURN, see WebRTC API STUN and WebRTC API TURN.

Configure STUN and TURN server from client

The servers can be configured as Session options. In the POST request body sent to initiate a session, there is an iceServers key. If the value is an array of STUN and/or TURN servers, these will be used to generate ICE candidates for the session.
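A minimal sketch of such a request body follows. The `url` key mirrors the appsettings.production.json example in this document; the `cameraId`, `username`, and `credential` key names are assumptions, so verify them against the API reference before use.

```typescript
interface IceServer {
  url: string;          // for example "stun:mystun.zyx:3478"
  username?: string;    // assumed key name, for TURN servers that require credentials
  credential?: string;  // assumed key name
}

// Builds a session body with client-supplied ICE servers and rejects
// URLs that do not use a STUN or TURN scheme.
function sessionBodyWithIceServers(cameraId: string, servers: IceServer[]): string {
  for (const s of servers) {
    if (!/^(stun|stuns|turn|turns):/.test(s.url)) {
      throw new Error("not a STUN/TURN URL: " + s.url);
    }
  }
  return JSON.stringify({ cameraId: cameraId, iceServers: servers });
}
```

Because the servers are supplied per session, a client can pass TURN credentials this way without storing them on the server.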

Configure STUN server in appsettings.production.json

You can configure default STUN and TURN servers to be used in sessions that don’t supply servers through Session options. To do so, add URLs for STUN and TURN servers in the API Gateway appsettings.production.json file:

{
  "WebRTC": {
    "iceServers": [ 
      {"url": "stun:mystun.zyx:3478"}, 
      {"url": "turn:myturn.zyx:5349"} 
    ] 
  }
}

TURN servers that require username and credential
The appsettings.production.json file cannot be used for a TURN server that requires a username and credential, because the file is not a safe location to store credentials.

Updating the OAuth token

When the initial POST request is made to the endpoint /webRTC/session/, the OAuth token in the header is used to authenticate the user. When the OAuth token expires, the client has to retrieve a new token and PATCH the session with the new OAuth token.
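One way to avoid an expired token is to schedule the refresh slightly before expiry. The sketch below only computes the delay. It assumes the token response carries the standard OAuth 2.0 `expires_in` value in seconds; the safety margin is a hypothetical parameter chosen by the client.

```typescript
// Milliseconds to wait before retrieving a new token and PATCHing the session.
// Never negative: a token that is already inside the margin is refreshed immediately.
function refreshDelayMs(expiresInSeconds: number, marginSeconds: number): number {
  return Math.max(0, expiresInSeconds - marginSeconds) * 1000;
}
```

A client could pass the result to `setTimeout`, retrieve a new token in the callback, PATCH the session, and then schedule the next refresh from the new token's lifetime.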

PTZ commands

XProtect supports executing the following PTZ commands through a WebRTC data channel:

The commands can be executed by sending messages through a WebRTC data channel with the label commands using a sub-protocol named videoos-commands, described below.

You can use the RTCDataChannel.send() method to send messages from a browser. To learn more about browser support for WebRTC data channels, refer to RTCDataChannel.
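The following is a minimal sketch of opening the channel and sending a request. It assumes that the client creates the data channel; this document does not state which side creates it, and a client may instead need to wait for the `datachannel` event. The `PeerLike` and `ChannelLike` types are hypothetical structural stand-ins, so the sketch does not depend on DOM typings.

```typescript
// Minimal structural types; a browser RTCPeerConnection and RTCDataChannel
// should satisfy them.
interface ChannelLike { readyState: string; send(data: string): void; }
interface PeerLike {
  createDataChannel(label: string, init?: { protocol?: string }): ChannelLike;
}

// Label and sub-protocol names are taken from this document.
function openCommandChannel(pc: PeerLike): ChannelLike {
  return pc.createDataChannel("commands", { protocol: "videoos-commands" });
}

// Sends a request message; returns false if the channel is not open yet.
function sendCommand(channel: ChannelLike, method: string, params?: object): boolean {
  if (channel.readyState !== "open") return false;
  const msg: { ApiVersion: string; type: string; method: string; params?: object } =
    { ApiVersion: "1.0", type: "request", method: method };
  if (params !== undefined) msg.params = params;
  channel.send(JSON.stringify(msg));
  return true;
}
```

Checking `readyState` before sending avoids the exception that `send()` raises while the channel is still connecting.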

videoos-commands protocol message format

The message requests and responses are JSON formatted strings.

Request

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "...",
  "params": {} 
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request" for request messages
method | string | ["ptzMoveStart", "ptzMoveStop", "ptzMove", "getPresets", "goToPresets", "setAux"]
params | object | optional, command parameters

Response with success

{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "",
  "success": true,
  "data": []
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "response" for response messages
method | string | ["ptzMoveStart", "ptzMoveStop", "ptzMove", "getPresets", "goToPresets", "setAux"]
success | boolean | true if command succeeded
data | array or null | optional, response data or null

Response with error

{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "",
  "success": false,
  "error": {
    "code": 1,
    "message": ""
  }
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "response" for response messages
method | string | ["ptzMoveStart", "ptzMoveStop", "ptzMove", "getPresets", "goToPresets", "setAux"]
success | boolean | false if command failed
error | object | Error code and message
error.code | integer | Error code
error.message | string | Error message
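A client can tell the two response shapes apart by the success flag. The following is a minimal sketch of such a handler; the function name and the fallback error text are hypothetical.

```typescript
interface CommandResponse {
  ApiVersion: string;
  type: string;
  method: string;
  success: boolean;
  data?: unknown[] | null;
  error?: { code: number; message: string };
}

// Returns the response data (an empty array if none), or throws on a failed command.
function handleResponse(raw: string): unknown[] {
  const msg = JSON.parse(raw) as CommandResponse;
  if (msg.type !== "response") throw new Error("not a response message");
  if (!msg.success) {
    const err = msg.error || { code: -1, message: "unknown error" };
    throw new Error(msg.method + " failed (" + err.code + "): " + err.message);
  }
  return msg.data || [];
}
```

The handler would typically be called from the data channel's `message` event with the event's string payload.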

Messages

Move start

Ask the camera to move continuously in a specific direction.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMoveStart",
  "params": {
    "pan": 1,
    "tilt": 0,
    "zoom": -1,
    "panSpeed": 0.5,
    "tiltSpeed": 0.5,
    "zoomSpeed": 0.5
  }
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "ptzMoveStart"
params | object | command parameters
params.pan | integer | [-1, 0, 1]: left if -1, right if 1
params.tilt | integer | [-1, 0, 1]: up if -1, down if 1
params.zoom | integer | [-1, 0, 1]: out if -1, in if 1
params.panSpeed | number | decimal in the range [0, 1]
params.tiltSpeed | number | decimal in the range [0, 1]
params.zoomSpeed | number | decimal in the range [0, 1]
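The following sketch builds a ptzMoveStart message and keeps the speed values inside the documented [0, 1] range. Using one speed for all three axes is a simplification of this sketch, and the helper names are hypothetical.

```typescript
type Axis = -1 | 0 | 1;

// Keeps a speed inside the documented range [0, 1].
function clampSpeed(v: number): number {
  return Math.min(1, Math.max(0, v));
}

// Builds the JSON string for a continuous move request.
function ptzMoveStart(pan: Axis, tilt: Axis, zoom: Axis, speed: number): string {
  const s = clampSpeed(speed);
  return JSON.stringify({
    ApiVersion: "1.0",
    type: "request",
    method: "ptzMoveStart",
    params: { pan: pan, tilt: tilt, zoom: zoom, panSpeed: s, tiltSpeed: s, zoomSpeed: s },
  });
}

// The matching stop message takes no parameters.
const PTZ_MOVE_STOP = JSON.stringify({ ApiVersion: "1.0", type: "request", method: "ptzMoveStop" });
```

A typical use is press-and-hold control: send the result of `ptzMoveStart` when a direction button is pressed and `PTZ_MOVE_STOP` when it is released.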

Move stop

Ask the camera to stop all ongoing camera movement.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMoveStop"
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "ptzMoveStop"

Move

Ask the camera to move home or one step in a specific direction.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMove",
  "params": {
    "direction": "..."
  }
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "ptzMove"
params | object | command parameters
params.direction | string | ["home", "upLeft", "up", "upRight", "left", "right", "downLeft", "down", "downRight", "zoomIn", "zoomOut"]

Get presets

Ask the camera to return defined presets.

Request
{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "getPresets"
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "getPresets"
Response
{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "getPresets",
  "success": true,
  "data": [
    {
      "presetName": "...",
      "presetId": "..."
    }
  ]
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "response"
method | string | "getPresets"
data | array | response data
data[].presetName | string | preset name
data[].presetId | string | "format": "uuid" (GUID)

Goto presets

Ask the camera to move to the specified preset.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "goToPresets",
  "params": {
    "presetId": "..."
  }
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "goToPresets"
params | object | command parameters
params.presetId | string | "format": "uuid" (GUID)
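The getPresets response and the goToPresets request are typically used together: look up a preset by name, then send its Id. The following is a minimal sketch of that step; the function name is hypothetical.

```typescript
interface Preset { presetName: string; presetId: string; }

// Finds a preset by name in a getPresets response and returns the matching
// goToPresets request as a JSON string, or null if no preset has that name.
function goToPresetByName(getPresetsResponse: string, wantedName: string): string | null {
  const msg = JSON.parse(getPresetsResponse) as { success: boolean; data?: Preset[] | null };
  const presets: Preset[] = msg.success && msg.data ? msg.data : [];
  for (const p of presets) {
    if (p.presetName === wantedName) {
      return JSON.stringify({
        ApiVersion: "1.0",
        type: "request",
        method: "goToPresets",
        params: { presetId: p.presetId },
      });
    }
  }
  return null;
}
```

Returning null instead of throwing lets the caller decide how to report an unknown preset name.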

Set Aux

Ask the camera to activate or deactivate an AUX output.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "setAux",
  "params": {
    "on": true,
    "auxNumber": 3
  }
}
Key | Type | Description or schema
ApiVersion | string | API version
type | string | "request"
method | string | "setAux"
params | object | command parameters
params.on | boolean | new AUX output state
params.auxNumber | integer | AUX output number

Limitations and workarounds

WebRTC connection on a local network uses mDNS

To prevent private IP addresses from leaking from a local network when running WebRTC applications, modern browsers by default send mDNS (multicast DNS) addresses as ICE candidates to the signaling server.
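When troubleshooting, it can help to check whether a browser is sending mDNS candidates. The sketch below inspects an ICE candidate string, in which the connection address is the fifth space-separated field, and reports whether that address is an mDNS name ending in `.local`. The function names are hypothetical.

```typescript
// Extracts the connection address from a candidate string of the form
// "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...".
function candidateAddress(candidate: string): string | null {
  const fields = candidate.trim().split(/\s+/);
  return fields.length >= 6 ? fields[4] : null;
}

// True if the candidate's address is an mDNS name rather than an IP address.
function isMdnsCandidate(candidate: string): boolean {
  const addr = candidateAddress(candidate);
  return addr !== null && /\.local$/i.test(addr);
}
```

Logging the result for each candidate from the peer connection's `icecandidate` event shows whether the browser is hiding its local IP address.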

API Gateway support for mDNS

The signaling server running in the API Gateway supports resolving mDNS addresses when running on a Windows version with native support for mDNS. Native support for mDNS was introduced in Windows version 1809 (October 2018). It is available in any recently updated Windows Server 2019 or Windows 10 installation, and in all Windows Server 2022 and Windows 11 installations.

WebRTC connections across routers in a local network

mDNS relies on multicast which by default will not pass through routers. This means that in enterprise environments, mDNS will fail in many cases:

The signaling server running in the API Gateway supports a workaround for connections across routers on a local network. The signaling server attempts to get the client’s local IP address from the X-Forwarded-For and Remote_Addr headers in the HTTP request and uses it to add an ICE candidate with a higher priority than the ICE candidate with the mDNS address. This does not work in all cases; on some networks, X-Forwarded-For is removed and Remote_Addr does not contain the local IP address of the client.

Disable browser mDNS support

As a last resort, you can try disabling browser mDNS support to force the browser to reveal the local IP network address in WebRTC connections.

In Chromium-based browsers, mDNS support can be disabled by opening chrome://flags or edge://flags and setting Anonymize local IPs exposed by WebRTC to Disabled.