Introduction to WebRTC support in XProtect

WebRTC (Web Real-Time Communication) is an open standard for real-time peer-to-peer communication on IP networks. It lets audio and video communication and streaming work inside web pages through direct peer-to-peer connections, eliminating the need to install plugins or download native apps.

The WebRTC features currently supported in XProtect are:

H.264 video streams
Live streaming and playback
Sending PTZ commands to the camera
STUN and TURN

Establishing a WebRTC session

A connection through WebRTC is established through a process called signaling. The WebRTC standard doesn’t recommend any specific way to implement signaling. XProtect offers signaling through either a RESTful API or a WebSocket API, both hosted by the API Gateway.

RESTful

The following API Gateway endpoints are used to establish WebRTC connections:

/webRTC/session/
/webRTC/iceCandidates/

For more details about the WebRTC endpoints and methods, refer to the WebRTC RESTful API reference documentation.

The signaling between the client (for example, a browser) and the API Gateway is as follows:

Make an HTTP POST request to /webRTC/session to start a session. The POST body should at least contain the Id of the camera from which the WebRTC session should stream video. For more details about the session options, refer to Session options. The response will be a session data object containing information about the session. The information relevant here is the session Id and an Offer SDP. The session Id must be used in all subsequent requests in that session.
The client verifies the Offer SDP and generates an Answer SDP if it is acceptable. The client should then update the session with a PATCH request with the Answer SDP.
Once SDPs have been exchanged, ICE candidates should be exchanged. The client should POST its ICE candidates to the endpoint /webRTC/iceCandidates/ and retrieve the server ICE candidates with a GET request to the same endpoint.
After ICE candidates have been exchanged, WebRTC will try to establish a UDP connection if possible with any pair of the ICE candidates, and video will be streamed to the client.
Communication with the WebRTC endpoints uses the same OAuth bearer token as all other XProtect RESTful endpoints. This bearer token will eventually expire, so the client should be sure to retrieve an updated bearer token and PATCH the session to update the token.
Once the UDP session is closed, the WebRTC session in the API Gateway will also close. No further action is needed to close the session.
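
The request sequence above can be sketched as follows. This is a minimal sketch that only builds the requests and does not send them. The JSON field names `cameraId` and `answerSDP`, the placement of the session Id in the URL path, and the example base URL are assumptions for illustration; check them against the WebRTC RESTful API reference.

```python
import json
import urllib.request


def build_request(base_url, path, method, token, body=None):
    """Build (but do not send) a JSON request carrying the OAuth bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    return urllib.request.Request(base_url + path, data=data, method=method, headers=headers)


def start_session(base_url, token, camera_id):
    # Step 1: POST the camera Id. "cameraId" is an assumed field name.
    return build_request(base_url, "/webRTC/session", "POST", token, {"cameraId": camera_id})


def send_answer(base_url, token, session_id, answer_sdp):
    # Step 2: PATCH the session with the Answer SDP.
    # The path layout and "answerSDP" are assumed, not taken from the reference.
    path = f"/webRTC/session/{session_id}"
    return build_request(base_url, path, "PATCH", token, {"answerSDP": answer_sdp})


def get_server_candidates(base_url, token, session_id):
    # Step 3: GET the server's ICE candidates for this session (path layout assumed).
    return build_request(base_url, f"/webRTC/iceCandidates/{session_id}", "GET", token)
```

Each returned object can be passed to `urllib.request.urlopen()` to perform the call; the session Id and Offer SDP would then be read from the JSON response of the first request.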

WebSocket

The WebSocket signaling complies with the ONVIF WebRTC Specification. The following API Gateway endpoint is used to connect to the WebSocket server:

/ws/webrtc/v1

The signaling between the client and the server is as follows:

The client registers with the server.
The client sends a connect request. This request contains at minimum the Id of the camera. For more details about the session options, refer to Session options.
The server will answer with a session id and a list of ICE servers.
The server will send an invite request containing the session id and an Offer SDP.
The client verifies the offer and generates an Answer SDP. The client then updates the server with an answer containing the Answer SDP.
The connection is now open for trickles from either the client or the server. A trickle is a one-way message containing a new ICE candidate.
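
The ordering of these steps can be modeled as a small state machine, which helps a client reject messages that arrive out of sequence. This is a sketch only: the state and event names are hypothetical and are not taken from the ONVIF WebRTC Specification or the API Gateway.

```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    REGISTERED = auto()
    CONNECTING = auto()
    SESSION_ASSIGNED = auto()
    INVITED = auto()
    OPEN = auto()  # answer sent; trickles may flow in either direction


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.DISCONNECTED, "register"): State.REGISTERED,
    (State.REGISTERED, "connect_sent"): State.CONNECTING,
    (State.CONNECTING, "session_received"): State.SESSION_ASSIGNED,
    (State.SESSION_ASSIGNED, "invite_received"): State.INVITED,
    (State.INVITED, "answer_sent"): State.OPEN,
    (State.OPEN, "trickle"): State.OPEN,
}


def advance(state, event):
    """Return the next state, or raise ValueError if the event is out of order."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"unexpected event {event!r} in state {state.name}") from None
```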

ICE candidates

WebRTC clients use ICE candidates to negotiate the connection. The main parts of an ICE candidate are an address (IP address, DNS name, or mDNS name) and a port. The ICE candidates coming from a browser are generated by the WebRTC implementation in the browser and are sent to the API Gateway. At the same time, the API Gateway generates a list of ICE candidates for the session; these have to be retrieved from the server and given to the WebRTC implementation in the client. The WebRTC implementations will then try to establish a connection with each ICE candidate.
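
To make the address and port parts concrete, the sketch below parses the leading fields of a candidate line in the standard SDP `candidate` attribute layout (foundation, component, transport, priority, address, port, `typ`, type). The `IceCandidate` type and the helper names are hypothetical, and the sample values in the usage note are made up.

```python
from typing import NamedTuple


class IceCandidate(NamedTuple):
    transport: str
    priority: int
    address: str
    port: int
    kind: str  # candidate type, for example "host" or "srflx"


def parse_candidate(line: str) -> IceCandidate:
    """Parse the leading fields of an SDP candidate attribute."""
    body = line.strip()
    # Accept both "candidate:..." and "a=candidate:..." forms.
    if body.startswith("a="):
        body = body[2:]
    if not body.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")
    fields = body[len("candidate:"):].split()
    # Layout: foundation component transport priority address port "typ" type ...
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError("malformed ICE candidate line")
    return IceCandidate(fields[2].lower(), int(fields[3]), fields[4], int(fields[5]), fields[7])


def is_mdns(candidate: IceCandidate) -> bool:
    # mDNS names end in ".local" and stand in for a hidden local IP address.
    return candidate.address.endswith(".local")
```

For example, `parse_candidate("candidate:1 1 udp 2113937151 10.0.0.5 50000 typ host")` yields the address `10.0.0.5` and the port `50000`.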

Trickle ICE

Trickle ICE lets peers exchange additional ICE candidates incrementally as they are discovered, which can improve the connection once it is made. The API Gateway fully supports trickle ICE, but it requires the clients to periodically poll the endpoint /webRTC/iceCandidates/ for new ICE candidates.
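
A polling loop for new server candidates might look like the sketch below. It assumes the caller supplies a `fetch` function that returns the current list of server candidates (for example, by sending a GET request to /webRTC/iceCandidates/); the function name, the fixed number of rounds, and the interval are hypothetical choices.

```python
import time


def poll_new_candidates(fetch, on_candidate, rounds, interval_s=1.0, sleep=time.sleep):
    """Poll `fetch` a fixed number of times and report each candidate only once.

    `fetch` returns the server's current candidate list; `on_candidate` is
    called for every candidate that has not been seen in an earlier round.
    """
    seen = set()
    for i in range(rounds):
        for candidate in fetch():
            if candidate not in seen:
                seen.add(candidate)
                on_candidate(candidate)
        if i < rounds - 1:
            sleep(interval_s)  # wait before the next poll
    return seen
```

In a real client, `on_candidate` would hand each new candidate to the WebRTC implementation, and polling would typically stop once the connection is stable.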

Session options

The body in the initial POST that starts a session includes options for the session. Each parameter is described below.

Live video

To get a working WebRTC session, a camera Id is always required. No other parameters are required to get live video.

Video from specific stream

If the camera has multiple streams, you can get video from a specific stream by supplying the stream Id.

Playback of recorded video

Playback requires the timestamp (in ISO 8601 format) of the recorded video in the POST request. Playback by itself has two options, skip gaps and speed.

Skip gaps: If enabled, the stream will skip gaps between video sequences.
Speed: Set speed to adjust the playback speed of recorded video.

To get the time of recorded video, the rtpTimestamp in each frame must be inspected. This timestamp starts at 0 in the first frame. In the following frames, it will be the time difference between the current frame and the first frame in milliseconds.
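
As a worked example of the rtpTimestamp arithmetic, the sketch below adds a frame's millisecond offset to the time of the first frame. It assumes the time of the first frame is known (for instance, that it matches the playback time sent in the POST request); the function name is hypothetical.

```python
from datetime import datetime, timedelta


def frame_time(first_frame_iso: str, rtp_timestamp_ms: int) -> datetime:
    """Estimate a frame's recording time from its offset to the first frame."""
    start = datetime.fromisoformat(first_frame_iso)
    return start + timedelta(milliseconds=rtp_timestamp_ms)
```

With a first frame at `2023-05-01T12:00:00+00:00`, a frame whose rtpTimestamp is 1500 maps to `2023-05-01T12:00:01.500000+00:00`.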

STUN and TURN servers

To help establish a connection through NATs, WebRTC uses STUN (Session Traversal Utilities for NAT) and/or TURN (Traversal Using Relays around NAT) servers.

STUN server: Used to discover the public IP address and port number of a device behind a NAT.
TURN server: Used to relay traffic between peers when a direct connection is not possible due to firewall or NAT restrictions.

For more information about STUN and TURN, see WebRTC API STUN and WebRTC API TURN.

Configure STUN and TURN servers from client

The servers can be configured as Session options. In the POST request body sent to initiate a session, there is an iceServers key. If the value is an array of STUN and/or TURN servers, these will be used to generate ICE candidates for the session.
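
A request body with client-supplied servers could be assembled as in the sketch below. Only the `iceServers` key comes from the text above; the `url` key is borrowed from the appsettings.production.json example later in this document, and `cameraId`, `username`, and `credential` are assumed key names that should be verified against the API reference.

```python
def ice_server(url, username=None, credential=None):
    """Describe one STUN or TURN server entry for the iceServers array."""
    if not url.startswith(("stun:", "stuns:", "turn:", "turns:")):
        raise ValueError(f"unsupported ICE server URL: {url}")
    entry = {"url": url}
    if username is not None:
        entry["username"] = username  # assumed key name
    if credential is not None:
        entry["credential"] = credential  # assumed key name
    return entry


def session_body(camera_id, ice_servers=()):
    """Build the POST body that starts a session, optionally with ICE servers."""
    body = {"cameraId": camera_id}  # "cameraId" is an assumed field name
    if ice_servers:
        body["iceServers"] = list(ice_servers)
    return body
```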

Configure STUN and TURN servers in appsettings.production.json

You can configure default STUN and TURN servers to be used in sessions that don’t supply servers through Session options. To do so, add URLs for STUN and TURN servers in the API Gateway appsettings.production.json file:

{
  "WebRTC": {
    "iceServers": [ 
      {"url": "stun:mystun.zyx:3478"}, 
      {"url": "turn:myturn.zyx:5349"} 
    ] 
  }
}

TURN servers that require username and credential
The appsettings.production.json file cannot be used for a TURN server that requires username and credential as the file is not a safe location to store credentials.

Updating the OAuth token

When the initial POST request is made to the endpoint /webRTC/session/, the OAuth token in the header is used to authenticate the user. When the OAuth token expires, the client has to retrieve a new token and PATCH the session with the new OAuth token.
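
One way to decide when to renew the token is to track its lifetime on the client. The sketch below assumes the identity provider reports the lifetime in seconds (as the `expires_in` field of an OAuth 2.0 token response does); the `TokenTracker` class and the 60-second safety margin are hypothetical.

```python
import time


class TokenTracker:
    """Track when a bearer token should be renewed, with a safety margin."""

    def __init__(self, expires_in_s, margin_s=60.0, now=time.monotonic):
        self._now = now
        self._margin = margin_s
        self._deadline = now() + expires_in_s

    def needs_refresh(self):
        # True once the remaining lifetime is within the safety margin.
        return self._now() >= self._deadline - self._margin

    def renewed(self, expires_in_s):
        # Call after retrieving a new token and PATCHing the session with it.
        self._deadline = self._now() + expires_in_s
```

A client could check `needs_refresh()` on a timer and, when it returns `True`, retrieve a new token, PATCH the session, and call `renewed()`.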

PTZ commands

XProtect supports executing the following PTZ commands through a WebRTC data channel:

Move start
Move stop
Move
Get presets
Goto presets
Set Aux

The commands can be executed by sending messages through a WebRTC data channel with the label commands using a sub-protocol named videoos-commands, described below.

You can use the RTCDataChannel.send() method to send messages from a browser. To learn more about browser support for WebRTC data channels, refer to RTCDataChannel.

videoos-commands protocol message format

The message requests and responses are JSON formatted strings.

Request

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "...",
  "params": {} 
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"` for request messages
`method`	string	[`"ptzMoveStart"`, `"ptzMoveStop"`, `"ptzMove"`, `"getPresets"`, `"goToPresets"`, `"setAux"`]
`params`	object	optional, command parameters
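
A request in this format can be produced with a small helper such as the sketch below. The helper name is hypothetical; the keys and method names are the ones listed in the table above. The resulting string is what a browser client would pass to RTCDataChannel.send().

```python
import json

METHODS = {"ptzMoveStart", "ptzMoveStop", "ptzMove", "getPresets", "goToPresets", "setAux"}


def command_message(method, params=None):
    """Serialize a videoos-commands request as a JSON string."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    message = {"ApiVersion": "1.0", "type": "request", "method": method}
    if params is not None:
        message["params"] = params  # optional command parameters
    return json.dumps(message)
```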

Response with success

{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "",
  "success": true,
  "data": []
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"response"` for response messages
`method`	string	[`"ptzMoveStart"`, `"ptzMoveStop"`, `"ptzMove"`, `"getPresets"`, `"goToPresets"`, `"setAux"`]
`success`	boolean	`true` if command succeeded
`data`	array or null	optional, response data or `null`

Response with error

{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "",
  "success": false,
  "error": {
    "code": 1,
    "message": ""
  }
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"response"` for response messages
`method`	string	[`"ptzMoveStart"`, `"ptzMoveStop"`, `"ptzMove"`, `"getPresets"`, `"goToPresets"`, `"setAux"`]
`success`	boolean	`false` if command failed
`error`	object	Error code and message
`code`	integer	Error code
`message`	string	Error message
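
Both response shapes can be handled in one place, as in the sketch below, which returns the `data` of a successful response and raises an exception for a failed one. The `CommandError` class and the function name are hypothetical.

```python
import json


class CommandError(Exception):
    """Raised when a videoos-commands response reports a failure."""

    def __init__(self, code, message):
        super().__init__(f"command failed ({code}): {message}")
        self.code = code


def parse_response(raw):
    """Return the `data` of a successful response or raise CommandError."""
    message = json.loads(raw)
    if message.get("type") != "response":
        raise ValueError("not a response message")
    if message.get("success") is True:
        return message.get("data")  # may be None when there is no data
    error = message.get("error") or {}
    raise CommandError(error.get("code"), error.get("message", ""))
```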

Messages

Move start

Ask the camera to move continuously in a specific direction.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMoveStart",
  "params": {
    "pan": 1,
    "tilt": 0,
    "zoom": -1,
    "panSpeed": 0.5,
    "tiltSpeed": 0.5,
    "zoomSpeed": 0.5
  }
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"ptzMoveStart"`
`params`	object	command parameters
`pan`	integer	[`-1`, `0`, `1`]: left if `-1`, right if `1`
`tilt`	integer	[`-1`, `0`, `1`]: up if `-1`, down if `1`
`zoom`	integer	[`-1`, `0`, `1`]: out if `-1`, in if `1`
`panSpeed`	number	decimal in the range [`0`, `1`]
`tiltSpeed`	number	decimal in the range [`0`, `1`]
`zoomSpeed`	number	decimal in the range [`0`, `1`]
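
Because the parameters have narrow value ranges, a client may want to validate them before sending. The sketch below checks the ranges from the table above and assembles the `params` object; the function name and its default values are hypothetical.

```python
def move_start_params(pan=0, tilt=0, zoom=0, pan_speed=0.5, tilt_speed=0.5, zoom_speed=0.5):
    """Validate and assemble the `params` object for ptzMoveStart."""
    directions = {"pan": pan, "tilt": tilt, "zoom": zoom}
    speeds = {"panSpeed": pan_speed, "tiltSpeed": tilt_speed, "zoomSpeed": zoom_speed}
    for name, value in directions.items():
        if value not in (-1, 0, 1):
            raise ValueError(f"{name} must be -1, 0 or 1, got {value!r}")
    for name, value in speeds.items():
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be within [0, 1], got {value!r}")
    return {**directions, **speeds}
```

Calling `move_start_params(pan=1, zoom=-1)` reproduces the `params` object of the example request above.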

Move stop

Ask the camera to stop all ongoing camera movement.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMoveStop"
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"ptzMoveStop"`

Move

Ask the camera to move home or one step in a specific direction.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "ptzMove",
  "params": {
    "direction": "..."
  }
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"ptzMove"`
`params`	object	command parameters
`direction`	string	[`"home"`, `"upLeft"`, `"up"`, `"upRight"`, `"left"`, `"right"`, `"downLeft"`, `"down"`, `"downRight"`, `"zoomIn"`, `"zoomOut"`]
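
A user interface typically has to translate pointer or key input into one of these direction strings. The sketch below maps a horizontal and vertical offset to a direction, assuming screen coordinates in which y grows downward; the helper is hypothetical and not part of the protocol.

```python
DIRECTIONS = {"home", "upLeft", "up", "upRight", "left", "right",
              "downLeft", "down", "downRight", "zoomIn", "zoomOut"}


def direction_for(dx, dy):
    """Map a screen-space offset to a ptzMove direction, or None for no movement."""
    vertical = "up" if dy < 0 else "down" if dy > 0 else ""
    horizontal = "left" if dx < 0 else "right" if dx > 0 else ""
    if vertical and horizontal:
        # Combine into the camel-case diagonal names, for example "upLeft".
        return vertical + horizontal.capitalize()
    return vertical or horizontal or None
```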

Get presets

Ask the camera to return defined presets.

Request

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "getPresets"
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"getPresets"`

Response

{
  "ApiVersion": "1.0",
  "type": "response",
  "method": "getPresets",
  "success": true,
  "data": [
    {
      "presetName": "...",
      "presetId": "..."
    }
  ]
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"response"`
`method`	string	`"getPresets"`
`data`	array	response data
`presetName`	string	preset name
`presetId`	string	`"format": "uuid"` (GUID)

Goto presets

Ask the camera to move to a preset.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "goToPresets",
  "params": {
    "presetId": "..."
  }
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"goToPresets"`
`params`	object	command parameters
`presetId`	string	`"format": "uuid"` (GUID)
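
The two preset messages are usually combined: the client first requests the presets and then uses a returned `presetId` in a goToPresets request. The sketch below looks up a preset by name in a getPresets response and builds the follow-up request; the function name is hypothetical.

```python
import json


def go_to_preset_message(presets_response, preset_name):
    """Build a goToPresets request for the named preset in a getPresets response."""
    response = json.loads(presets_response)
    for preset in response.get("data") or []:
        if preset.get("presetName") == preset_name:
            return json.dumps({
                "ApiVersion": "1.0",
                "type": "request",
                "method": "goToPresets",
                "params": {"presetId": preset["presetId"]},
            })
    raise LookupError(f"no preset named {preset_name!r}")
```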

Set Aux

Ask the camera to activate or deactivate an AUX output.

{
  "ApiVersion": "1.0",
  "type": "request",
  "method": "setAux",
  "params": {
    "on": true,
    "auxNumber": 3
  }
}

Key	Type	Description or schema
`ApiVersion`	string	API version
`type`	string	`"request"`
`method`	string	`"setAux"`
`params`	object	command parameters
`on`	boolean	new AUX output state
`auxNumber`	integer	AUX output number

Limitations and workarounds

WebRTC connection on a local network uses mDNS

To prevent private IP addresses from leaking from a local network when running WebRTC applications, modern browsers by default send mDNS (multicast DNS) addresses as ICE candidates to the signaling server.

API Gateway support for mDNS

The signaling server running in the API Gateway supports resolving mDNS addresses when running on a Windows version with native support for mDNS. Native support for mDNS was introduced in Windows version 1809 (October 2018) and is available in any recently updated Windows Server 2019 or Windows 10 installation, and in all Windows Server 2022 and Windows 11 installations.

WebRTC connections across routers in a local network

mDNS relies on multicast, which by default does not pass through routers. This means that in enterprise environments, mDNS will fail in many cases:

mDNS over wired Ethernet works on the same local network segment, but in more complex network setups (most enterprise environments), mDNS will fail.
mDNS over Wi-Fi works only on simple network configurations (as for wired networks). In configurations with Wi-Fi extenders or mesh networks, mDNS will likely fail.

The signaling server running in the API Gateway supports a workaround for connections across routers on a local network. The signaling server will attempt to get the client’s local IP network address from X-Forwarded-For and Remote_Addr headers in the HTTP request and use that to add an ICE candidate with higher priority than the ICE candidate with the mDNS address. This will not work in all cases; on some networks, X-Forwarded-For is removed and Remote_Addr will not contain the local IP address of the client.
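
The general idea of this workaround can be illustrated with the sketch below, which picks a non-public client address from an X-Forwarded-For value or, failing that, from the remote address. It is an illustration of the technique only and not the API Gateway's implementation; the function name and the left-most-entry convention are assumptions.

```python
import ipaddress


def client_local_ip(x_forwarded_for, remote_addr):
    """Return a non-public client address from the header or socket address, if any."""
    candidates = []
    if x_forwarded_for:
        # The left-most entry is conventionally the original client.
        candidates.append(x_forwarded_for.split(",")[0].strip())
    if remote_addr:
        candidates.append(remote_addr)
    for value in candidates:
        try:
            address = ipaddress.ip_address(value)
        except ValueError:
            continue  # skip values that are not IP addresses
        # is_private also covers loopback and other reserved ranges.
        if address.is_private:
            return str(address)
    return None
```

When neither value yields a local address, the function returns `None`, which corresponds to the cases described above where the workaround does not help.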

Disable browser mDNS support

As a last resort, you can try disabling browser mDNS support to force the browser to reveal the local IP network address in WebRTC connections.

In Chromium-based browsers, mDNS support can be disabled by opening chrome://flags or edge://flags and setting Anonymize local IPs exposed by WebRTC to Disabled.