Management: Managed STUNner dataplane

@rg0now rg0now released this 03 Oct 16:34

We are proud to present STUNner v0.16.0, the next major release of the STUNner Kubernetes media gateway for WebRTC from l7mp.io.

News

STUNner v0.16.0 is a major feature release and marks an important step towards STUNner reaching v1.0 and becoming generally available for production use.

Gateway API v0.8.0 support

STUNner uses the Gateway API, the upcoming Kubernetes API slated to replace the venerable Ingress API, to let users customize the way WebRTC media enters the cluster. The Gateway API is progressing towards general availability at breakneck speed and STUNner needs to keep pace. In this release STUNner has been updated to the latest Gateway API version, v0.8.0. The unfortunate consequence is that the API versions in some of the Gateway API resources have to be updated. We have used this opportunity to clean up some of the unclear terminology around STUNner as well, especially the inconsistent use of protocol names. So far, a plain transport protocol name, say, UDP or TCP, has meant "TURN over the said transport". To differentiate between "pure UDP" and "TURN over UDP" listeners, the latter has been renamed to "TURN-UDP", and similarly for TCP, TLS and DTLS.

The particular rules are as follows:

  • The Gateway and GatewayClass API resources have to be updated from v1alpha2 to v1beta1, UDPRoutes remain at v1alpha2, and GatewayConfigs remain at v1alpha1. The old API versions are still accepted, but Kubernetes will silently rewrite API versions in the background. Our understanding is that this may cause problems in certain cases (e.g., with ArgoCD), so we recommend that users bump the API version on the corresponding resources (GatewayClass and Gateway) during the upgrade. This is as simple as changing the first line of your GatewayClass and Gateway YAMLs from apiVersion: gateway.networking.k8s.io/v1alpha2 to apiVersion: gateway.networking.k8s.io/v1beta1.
  • The protocol field in Gateway listener specifications has to be updated by adding the TURN- prefix, so UDP becomes TURN-UDP, TCP becomes TURN-TCP, etc. Again, the old protocol names are still accepted for compatibility, but we will remove support in the next release.
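
As a rough sketch of what the two rules amount to, an upgraded GatewayClass and Gateway pair might look like the following. The resource names, the namespace, the listener port and the parametersRef target are illustrative assumptions, not values taken from this release, and the controllerName should be checked against your own installation. Only the apiVersion and the TURN- prefixed protocol name reflect the rules above.

```yaml
# Hypothetical names throughout; adapt to your own deployment.
apiVersion: gateway.networking.k8s.io/v1beta1   # was: gateway.networking.k8s.io/v1alpha2
kind: GatewayClass
metadata:
  name: stunner-gatewayclass
spec:
  controllerName: "stunner.l7mp.io/gateway-operator"   # assumed; verify in your install
  parametersRef:
    group: "stunner.l7mp.io"
    kind: GatewayConfig          # GatewayConfigs themselves remain at v1alpha1
    name: stunner-gatewayconfig
    namespace: stunner
---
apiVersion: gateway.networking.k8s.io/v1beta1   # was: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP         # was: UDP
```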

Managed dataplane

In the early days of STUNner we decided to make it the responsibility of the user to provision the dataplane for STUNner. That is, you had to separately helm-install the control plane, i.e., the stunner-gateway-operator, and the dataplane(s), that is, the stunnerd pods that actually process traffic. If STUNner was to be run in multiple namespaces, you had to provision a separate dataplane for each namespace. As STUNner matured, this operational model became more and more cumbersome and error-prone.

This release introduces the managed dataplane mode, a completely new way to operate STUNner. In this mode the operator automatically provisions a separate dataplane for each Gateway, i.e., a separate stunnerd Deployment, plus the usual LoadBalancer service to expose the gateway to clients. This substantially simplifies the installation and configuration of common STUNner use cases and makes complex setups much easier to operate. Since the managed mode is still experimental, we decided to ship v0.16.0 with the legacy dataplane mode as the default. This means that you do not have to make any changes at this point, but managed mode is expected to become the default in the next release. We ask you to experiment with this feature and help us stabilize the managed mode by filing bug reports.

STUNner dataplane API

One of the recurring criticisms related to STUNner has always been the slow reaction to control plane updates. This was the consequence of the way the control plane (the gateway operator) and the dataplane (the stunnerd daemons) interact: the dataplane configuration is rendered by the operator into a Kubernetes ConfigMap, which is then mapped into the file system of the stunnerd pods as a regular configuration file that stunnerd watches for changes. Unfortunately, it may take Kubernetes up to a minute to push the new config file into the pod's filesystem, which was too slow for certain use cases.

The new release comes with an experimental implementation of the STUNner Configuration Discovery Service (CDS), which lets stunnerd autodiscover its own configuration via a dedicated WebSocket connection. This makes control plane updates quasi real-time. CDS is available only in the managed mode and, correspondingly, it is switched off by default. However, installing the gateway operator in the managed mode enables the CDS service automatically, so if you are affected by slow control plane updates you should see immediate improvements once you switch to the bleeding edge.

Standalone TURN server

We have received a number of complaints that STUNner is difficult to deploy as a public TURN server (this is called the "headless model"). Supporting this mode would make it possible to operate, and dynamically scale, a fleet of public TURN servers fully in Kubernetes.

One reason for the difficulty was that it used to be very hard to provision a public IP for STUNner pods. With the new release this has become much simpler in the managed dataplane mode: just set hostNetwork: true in the Dataplane CR serving as a template for provisioning the stunnerd pods and your dataplane should be instantly re-deployed over a public IP. The other reason was a crucial assumption in the way STUNner filters clients' access to peers: namely, STUNner would allow access only to peers located within the same Kubernetes cluster. This made it difficult to forward to peers that are outside the cluster, as is the case when STUNner is used as a standalone TURN server. In this release we introduce the StaticService API, which provides a Kubernetes-friendly way to control the IP ranges into which peer connections are accepted.
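
To make the StaticService idea more concrete, here is a minimal sketch of how a set of out-of-cluster peer addresses might be declared and then used as a route backend. The resource names, the namespace and the IP prefixes are made up for illustration, and the field layout (spec.prefixes, and the group/kind pair in backendRefs) is our reading of the StaticService CRD documentation, so please verify it against the reference docs shipped with this release.

```yaml
# Hypothetical example: names, namespace and prefixes are placeholders.
apiVersion: stunner.l7mp.io/v1alpha1
kind: StaticService
metadata:
  name: static-svc
  namespace: stunner
spec:
  # IP addresses or prefixes that clients are allowed to reach as peers
  prefixes:
    - "192.0.2.10"
    - "198.51.100.0/24"
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: static-route
  namespace: stunner
spec:
  parentRefs:
    - name: udp-gateway          # assumed Gateway name
  rules:
    - backendRefs:
        # Point at the StaticService instead of a regular Kubernetes Service
        - group: stunner.l7mp.io
          kind: StaticService
          name: static-svc
```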

Cross-namespace route binding

The initial releases of STUNner included a number of simplifications that made STUNner much easier for us to implement. One of these simplifications was the assumption that Gateways and UDPRoutes must both exist in the same Kubernetes namespace to be able to attach. With STUNner gradually maturing towards general availability, this limitation has become more and more restrictive. In this release we implemented the full route attachment machinery from the Kubernetes Gateway API, which unlocks a number of exciting new persona-based deployment models.
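
As an illustration of cross-namespace attachment, the sketch below shows a Gateway in one namespace that admits routes from any namespace, and a UDPRoute in another namespace that attaches to it. The allowedRoutes and parentRefs fields come from the upstream Gateway API; the resource names, the namespaces and the backend Service are hypothetical.

```yaml
# Hypothetical setup: the Gateway lives in "stunner", the route in "media".
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
      allowedRoutes:
        namespaces:
          from: All              # Gateway API values: Same, All or Selector
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: media-route
  namespace: media
spec:
  parentRefs:
    - name: udp-gateway
      namespace: stunner         # explicit reference to the Gateway's namespace
  rules:
    - backendRefs:
        - name: media-server     # assumed Service in the route's own namespace
```

Using from: Selector with a label selector instead of from: All would restrict attachment to selected namespaces, which is closer to what a cluster operator would typically want in a shared cluster.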

Mediasoup tutorial

The mediasoup project is a popular WebRTC media server distribution that enables developers to build group chats, one-to-many broadcasts, and real-time streaming applications. This release comes with a new tutorial for setting up STUNner with mediasoup.

Minor updates

Apart from the major updates, this release also comes with the usual assortment of documentation updates, tests and CI/CD improvements across the board.

feature: Implement and integrate a Config Discovery Service (CDS)
feature: Implement a StaticService CRD
feature: Implement route attachment, fixes #30
feature: Managed dataplane (#35)
feature: Use Gateway Addresses as rendered Service's external IPs (#33)
fix: Handle concurrent writes in the CDS server
fix: Make sure service-port names of LB Services are unique, fixes #19
fix: Make sure STUNner listens on all interfaces
fix: Merge LB Service metadata instead of rewriting it
fix: Protocol name disambiguation, fixes #28
fix: Report correct resource names in reconcilers
fix: Set default Namespace in ParentRef from UDPRoute.Namespace
fix: Streamline TURN URI parsing
chore: Upgrade to sigs.k8s.io/gateway-api v0.8.0
doc: Document that cross-namespace route bindings are now supported
doc: Document the StaticService CRD
doc: Warn that UDPRoutes ignore the service-port in backend Services
doc: Add mediasoup demo #103

Enjoy STUNner and don't forget to support us!