WebRTC requires a server

Andrew Hutton

The first phase of standardization has not yet been completed, but WebRTC (Web Real-Time Communication) is already causing a stir in the market for web-based real-time communication. As the initiator of the project, Google estimates that more than 800 companies are currently investing in the development of real-time applications based on WebRTC. Time for a look at its history, technology and standards.

At the end of 2010, a Google team working on Hangouts started work on WebRTC. It recognized that web-based real-time communication was only realistic if protocols and APIs were standardized across all browsers. The project was also meant to promote a convergence of the different philosophies of the web industry and the traditional communications industry.

The Google team therefore decided to invite the relevant standardization experts from the IETF (Internet Engineering Task Force) and the W3C (World Wide Web Consortium) to a workshop at their headquarters in Mountain View to "define the field". The response from the industry was probably more positive than expected - and there were various reasons for this: As early as 2010, many communications companies tried to implement unified communications services in the cloud and provide customers with services via web browsers.

Without standardization, however, this was only possible using proprietary plug-ins. On the one hand, these were not easy to implement - consider the different plug-ins needed for the various browsers and platforms - and on the other hand, many corporate IT departments rejected them or at least met them with little enthusiasm. Another factor that came up during the IETF 80 meeting in March 2011 was the shorter innovation cycles the telecommunications industry needs in order to keep pace with the web world.

The Session Initiation Protocol (SIP) developed by the IETF was successful and was widely used in IP phones, many softphones and, within the framework of the 3GPP standards (3rd Generation Partnership Project), even in mobile communications. Nevertheless, SIP did not enable telecommunications providers to keep up with the pace of innovation known from the web industry. To this day, the mobile industry remains chained to the telecommunications model when it comes to innovation. Innovation is so cumbersome in telecommunications because new services in this sector first require the standardization bodies to develop new standards and the providers to implement them.

This process takes time and means that all providers get the same services at the same time and can then differentiate themselves only through price. WebRTC, however, does not dwell on the standardization of signaling protocols, but focuses directly on real-time media handling on the web.

Why is WebRTC so important?

It has often been said that WebRTC will revolutionize both consumer and corporate communications, including cellular communications. That is a big promise for a technology that essentially consists of just a real-time media stack with an attached JavaScript API.

The reason for this predicted potential is simply that with WebRTC, real-time communication, and especially real-time video conferencing, is becoming a new tool for millions of web application developers. Its APIs ensure that web developers no longer need to master the complex functionality of real-time communication themselves. Instead, they get an API in a language they understand that enables them to embed voice, video and data in any application they are working on.

Thanks to these functions in the browser, real-time communication can be integrated into any web application - regardless of whether it is a complex business process application or a social media application. This is known as contextual communication, and it means that real-time communication is related to what you are doing and does not require a specific separate tool.


The architecture of WebRTC

The essence of the WebRTC architecture is that audio, video and other data are sent and received between WebRTC clients and their peers in real time. This distinguishes WebRTC from, for example, Dynamic Adaptive Streaming over HTTP (DASH), which works well for IP TV broadcasting because the latencies it introduces do not play a major role there. DASH is simply not true real-time communication between several people as known from the good old telephone.

With WebRTC, audio and video use standard codecs and are securely transmitted via DTLS-SRTP (Datagram Transport Layer Security, Secure Real-Time Transport Protocol). The client-side application script (JavaScript) calls an API specified by W3C, which instructs the browser to transfer audio or video between the local input / output devices (microphone, loudspeaker or camera / display) and a remote endpoint via IETF-defined protocols. However, there are other factors to consider, such as signaling, connectivity testing, NAT traversal, security, and codecs. Figure 1 shows the basic architecture.

The figure shows the "signaling", i.e. setting up, ringing, accepting or ending a call, between client and server. Only through signaling via a server can participants who are in different networks find each other at all. Since WebRTC follows a web model in which only the basic building blocks are standardized, the actual signaling protocol for WebRTC is not specified and the WebRTC application developers decide on the design themselves.

The fact that the signaling between the WebRTC client and server is not specified has caused much discussion. Many interpreted this property as a weakness. In fact, however, it is probably a strength and a conscious design decision of the IETF working group, because the signaling protocol is left to the application developer, who can then make it as simple or as complex as necessary. Why should a simple consumer communication application be burdened with the same complexity as professional, secure business software? In the latter, for example, the media stream is restored almost without interruption when the device changes from Wi-Fi to 4G / 3G, because the client has already signaled alternative WebRTC media channels to the server.

Forgoing standardization at the signaling level enables innovation at a speed that is otherwise only seen on the web. And the industry can finally free itself from the usual slow innovation cycles of traditional telecommunications.
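To illustrate how little a simple signaling design needs, here is a minimal sketch of an application-defined message format and an in-memory relay that stands in for the signaling server. All names (`SignalMessage`, `SignalingRelay`, the `kind` values) are hypothetical and not part of any WebRTC standard; a real application would carry these messages over a transport of its choice and fill the payloads with the session descriptions and candidates supplied by the browser.

```typescript
// Hypothetical, application-defined signaling messages. WebRTC does not
// specify this format; only the SDP and candidate payloads come from the browser.
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string }
  | { kind: "hangup"; from: string; to: string };

type Deliver = (message: SignalMessage) => void;

// In-memory stand-in for a signaling server: it only knows who is registered
// and forwards each message to its addressee. Media never passes through it.
class SignalingRelay {
  private readonly clients = new Map<string, Deliver>();

  register(id: string, deliver: Deliver): void {
    this.clients.set(id, deliver);
  }

  unregister(id: string): void {
    this.clients.delete(id);
  }

  // Returns true if the addressee was registered and the message was forwarded.
  send(message: SignalMessage): boolean {
    const deliver = this.clients.get(message.to);
    if (deliver === undefined) {
      return false;
    }
    deliver(message);
    return true;
  }
}

// Usage: Alice calls Bob, Bob answers, and a message to an unknown user fails.
const relay = new SignalingRelay();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];
relay.register("alice", (m) => aliceInbox.push(m));
relay.register("bob", (m) => bobInbox.push(m));

relay.send({ kind: "offer", from: "alice", to: "bob", sdp: "<offer SDP>" });
relay.send({ kind: "answer", from: "bob", to: "alice", sdp: "<answer SDP>" });
const reachedCarol = relay.send({ kind: "hangup", from: "alice", to: "carol" });
```

A business application could extend the same message type with further variants, for example to announce alternative media paths in advance, without any change to the browser.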

Another issue is the media level: WebRTC can create direct media connections between browsers. This function must behave the same way in all browsers and therefore requires standardization - a task for the IETF. The fact that WebRTC allows direct connections between browsers without a plug-in is new to the web and was previously unthinkable. However, it comes with security implications that the standardization bodies are addressing.

Another component of the WebRTC architecture to be standardized is the browser API. It is a prerequisite for WebRTC applications to be able to work independently of the browser and is the responsibility of the W3C's WebRTC working group.

Next step: security

Internet-specific standardization is all about security and data protection. This has always been the greatest concern of the IETF and W3C in connection with WebRTC. As mentioned, WebRTC is the first technology to support direct browser-to-browser connections. In addition, WebRTC applications may require access to the microphone, the camera or, in the case of screen sharing applications, even the screen. It doesn't take much imagination to see that misuse of these functions by a malicious web application would have devastating consequences.

This is one of the reasons WebRTC standardization took so long. The browser providers had to be able to trust that a reliable security solution was available. In order to prevent malicious web servers from hacking a browser and sending unwanted media to other browsers, a special procedure has been implemented: the sender browser must first obtain the consent of the recipient browser before it can transmit data. This procedure is documented in RFC 7675.
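The consent rule can be sketched as a small piece of bookkeeping. The following is an illustration, not browser code: the class and function names are hypothetical, and the timing values (consent expiring 30 seconds after the last valid response, checks roughly every 5 seconds with the interval randomized between 0.8 and 1.2 times that period) are taken from RFC 7675 as understood here and should be checked against the RFC before being relied upon.

```typescript
// Timing values as described in RFC 7675 (assumed here; verify against the RFC).
const CONSENT_TIMEOUT_MS = 30000; // consent expires 30 s after the last valid response
const CONSENT_CHECK_BASE_MS = 5000; // base period between consent checks

// Hypothetical helper that tracks whether the remote peer still consents.
class ConsentTracker {
  private lastConsentAt: number | null = null;

  // Record that a valid consent response arrived at time nowMs.
  consentReceived(nowMs: number): void {
    this.lastConsentAt = nowMs;
  }

  // Media and data may only be sent while consent is fresh.
  maySend(nowMs: number): boolean {
    return this.lastConsentAt !== null && nowMs - this.lastConsentAt < CONSENT_TIMEOUT_MS;
  }
}

// Delay until the next consent check, randomized between 0.8 and 1.2 times
// the base period. `random` is injectable so the function can be tested.
function nextConsentCheckDelayMs(random: () => number = Math.random): number {
  const min = (CONSENT_CHECK_BASE_MS * 4) / 5; // 0.8 x base period
  const max = (CONSENT_CHECK_BASE_MS * 6) / 5; // 1.2 x base period
  return min + (max - min) * random();
}
```

The point of the design is that the sender, not the receiver, carries the burden: once responses stop arriving, the sender has to stop transmitting on its own.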

After a trust relationship has been established between two WebRTC participants, the resulting media stream still has to be securely encrypted. For this purpose, WebRTC media (audio, video and data) are encrypted using DTLS-SRTP by default. The increasing spread of encryption in communication is a hotly debated topic, not least at government level. In the WebRTC standardization bodies, however, the decision in favor of general encryption was never in dispute. Experience with SIP had shown how much complexity optional encryption causes, which made an always-on policy for WebRTC an easy decision. So there will be no unencrypted WebRTC media streams.
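The always-on policy is visible in the session descriptions that WebRTC endpoints exchange. As a rough sketch, the hypothetical helper below inspects an SDP text and reports whether every audio or video section uses the `UDP/TLS/RTP/SAVPF` transport profile and whether a certificate fingerprint attribute is present. It is plain string inspection for illustration, assuming well-formed SDP, and not a security check.

```typescript
// Returns true if the SDP contains at least one audio/video media section,
// all such sections use the DTLS-SRTP profile, and a fingerprint is present.
// Hypothetical helper for illustration only.
function offersOnlyDtlsSrtp(sdp: string): boolean {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);

  // Media lines have the form: m=<media> <port> <proto> <fmt> ...
  const mediaLines = lines.filter((line) => /^m=(audio|video) /.test(line));

  // The fingerprint may appear at session level or inside a media section.
  const hasFingerprint = lines.some((line) => line.startsWith("a=fingerprint:"));

  const allSecure = mediaLines.every((line) => line.split(" ")[2] === "UDP/TLS/RTP/SAVPF");

  return mediaLines.length > 0 && hasFingerprint && allSecure;
}
```

A description containing a media line such as `m=audio 49170 RTP/AVP 0`, as used by unencrypted SIP calls, would make the function return false.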

WebRTC goes mobile

At the beginning of the WebRTC project, both IETF and W3C put browser interoperability first. It soon became apparent, however, that the real goal was to make WebRTC mobile-ready. And in the mobile world, native applications, not web applications, are still predominant. Furthermore, Apple, with its significant market share, did not participate in the WebRTC initiative. Many therefore feared that this could slow down the WebRTC implementation.

However, these concerns turned out to be unfounded. In truth, Apple's absence has little or no impact on the success of WebRTC in mobile applications. It is much more likely that a number of applications that work with WebRTC are already in use on any given mobile device.

Google recognized early on that WebRTC would not be successful without a robust solution for mobile devices, whether iOS or Android-based. That is why the company committed itself heavily to these platforms. Many WebRTC implementations, at least those of the open source browsers, are freely accessible. This enables developers to implement a WebRTC stack with little effort outside of a browser, for example in a smartphone app.

The lack of support from Apple did not hinder the spread of WebRTC on mobile devices. Rather, rumors in the WebRTC standards community suggest that Apple is gradually changing its stance, so the company may well move closer to WebRTC in 2016.

Standardization and implementation today

More than five years have passed since the start of the WebRTC project, and the WebRTC 1.0 standards have not yet been finalized. In the web world, however, nobody waits for standards to be ratified. Instead, an agile development model is followed in which testing, implementation and standardization run side by side and thus enable a high rate of innovation. Unify, for example, has been involved in the WebRTC process for several years. The experience gained from prototype work and from the Circuit collaboration cloud, which has been available since October 2014, flows back into the standardization process.

The clear leader in browser support for WebRTC is Chrome, closely followed by Firefox and Opera, which are doing everything they can to keep up with Google. WebRTC 1.0 should be completed in a few months. The first standard-compliant implementations in the browsers are expected by the end of 2016.

Microsoft initially struggled to make a clear decision about its WebRTC strategy. The takeover of Skype, just as WebRTC was gradually picking up speed, made the decision even more difficult, as Skype does not rely on open standards. In the meantime, however, Microsoft has made it clear that the company would like to be perceived as part of the WebRTC initiative, even if it takes a slightly different route than the other browser providers.

ORTC and cPaaS

What is ORTC and what does it have to do with WebRTC?

ORTC (Object Real-Time Communications) is a W3C community group that disagrees with various aspects of the WebRTC solution as proposed by the main WebRTC working group. Above all, as a community group, ORTC is not subject to the W3C rules that govern the creation of a standards-track specification. Microsoft therefore quickly decided to support the work of ORTC and to implement ORTC in its own Edge browser.

Some of the results of the ORTC efforts have now been incorporated into the W3C's mainstream WebRTC API. However, it is not possible to use WebRTC applications on an ORTC-based browser such as Edge without an additional JavaScript intermediate layer that emulates the WebRTC API. Fortunately, this intermediate layer is currently being worked on.

ORTC and WebRTC are identical at the media level. Therefore, ORTC and WebRTC implementations should be interoperable. Currently, however, there are various compatibility issues at the media level between Edge and other WebRTC browsers, particularly for video. As soon as the current work on WebRTC 1.0 has been completed, WebRTC Next Version (WebRTC NV) and ORTC can be expected to continue to converge.

What is WebRTC PaaS and a collaborative PaaS?

The W3C's WebRTC API makes it easy for a sufficiently competent web developer to write an application that can establish a peer-to-peer audio, video, or data connection between browsers. Google's WebRTC project enables such an application to be extended to mobile platforms. However, it is much more complex to develop robust video conferencing applications that work for large numbers of participants. Furthermore, since no signaling stack is built into WebRTC, special knowledge is necessary to develop applications that interwork with conventional enterprise systems or the public switched telephone network (PSTN).
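A little arithmetic shows why large conferences are harder than a two-party call. The sketch below compares two topologies that the article does not name explicitly and that are assumed here for illustration: a full mesh, in which every participant connects directly to every other participant, and a hypothetical central media server to which each participant keeps a single connection. Function and field names are illustrative.

```typescript
interface TopologyCost {
  connections: number; // total media connections in the conference
  uploadsPerClient: number; // outgoing streams each participant has to send
}

// Full mesh: every participant connects directly to every other one.
// Assumes participants >= 1.
function fullMesh(participants: number): TopologyCost {
  return {
    connections: (participants * (participants - 1)) / 2,
    uploadsPerClient: participants - 1,
  };
}

// Central media server (assumed): one connection and one upload per participant.
// Assumes participants >= 1.
function centralServer(participants: number): TopologyCost {
  return { connections: participants, uploadsPerClient: 1 };
}

// Compare both topologies for a two-party call and a ten-party conference.
const meshForTwo = fullMesh(2); // 1 connection, 1 upload each
const meshForTen = fullMesh(10); // 45 connections, 9 uploads each
const serverForTen = centralServer(10); // 10 connections, 1 upload each
```

With ten participants, the mesh requires every client to encode and send nine streams, which is where a server-based design, and the operational effort that comes with it, enters the picture.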

One way to get around these obstacles and implement WebRTC applications is to use a WebRTC platform and API developed by a provider who has this capability and makes this service available to others. Such a service is called WebRTC Platform as a Service (PaaS) and is already available from various providers.

Using WebRTC PaaS, an application developer can take advantage of an API that is more abstract than the W3C's WebRTC API. Complex elements such as the signaling system required for traditional PSTN interworking are hidden. The PaaS provider also provides scalable, robust cloud services that are necessary to set up reliable multi-party conference systems.

If you also want to relate WebRTC communication to team collaboration, documents or even business events from ERP, CRM and IoT systems, you need more than a WebRTC PaaS. A team collaboration cloud must provide a data model that ensures this business context. This can be found, for example, in Circuit and its collaborative PaaS (cPaaS) with its conversation-based approach. A WebRTC video conference, for example, takes place with the same group of participants who also chat together on a specific topic or exchange documents.


WebRTC undoubtedly plays a key role in the communications industry, including the mobile world. The first phase of WebRTC standardization has yet to be completed, but it is already having an irreversible impact on hundreds of companies that offer their customers WebRTC applications. WebRTC has accelerated the trend towards cloud communication services such as Unified Communications as a Service (UCaaS) or made modern team collaboration possible in the first place, and the trend is still increasing.

The WebRTC 1.0 standardization should be completed in 2016. The convergence between WebRTC and ORTC (Google and Microsoft) will then in all probability increase and result in a thoroughly standardized WebRTC technology for all browsers and platforms.

The influence of WebRTC is already considerable. In the coming years, it will be seen how WebRTC will continue to transform the corporate and mobile communications industries. (ane)

Andrew Hutton
is Head of Standardization at Unify. In this role, he leads Unify's involvement in IETF and W3C organizations that further develop standards such as WebRTC.
