Imagine you want to send a handwritten letter to one of your friends. It starts with simply writing the letter, but a lot of work happens in between before it reaches the right destination.
That work involves four major steps:
- Collecting the letter and validating both the addresses.
- Filtering and distributing the letter as per the pin code.
- Sending the letter by finding the exact location such as the house number.
- If the address is wrong, returning the letter to the sender.
You might be wondering why I am explaining how a letter gets delivered. It is because WebRTC works in much the same way.
WebRTC (Web Real-Time Communication) is an open-source project that lets you add real-time audio and video communication capabilities to your application.
It allows voice, video, and generic data to be sent between peers to build up powerful audio and video communication solutions.
WebRTC provides plugin-free APIs that are natively supported in desktop browsers and on mobile devices. It builds on a number of established protocols and standards to achieve this.
WebRTC is peer-to-peer: media is transferred directly between two clients, in contrast to the client-server model.
Let's understand WebRTC by comparing it with the same process of sending a letter.
Signalling i.e. Collecting Letters
Just as we need to know a person's address before sending him a letter, the same is required in the case of WebRTC. Before connecting directly to another client, we need the address of the receiver.
Although it is easy to find a person's address from his pin code, identifying the location (the IP address) of another computer in order to start two-way communication is still one of the major challenges in peer-to-peer communication.
It is similar to finding a house without knowing the street name or the house number. In simple terms, when you visit a web page by entering its URL, the browser first discovers the server's address via DNS (Domain Name System) and then sends an HTTP request to that server, which responds with the web page.
Peers (clients or computers) are not listed in DNS, so one peer cannot simply look up another and fetch video directly from that computer. This creates the problem of how to exchange audio and video streams directly, without relaying the media through an external server. Welcome to WebRTC!
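To make the idea of signalling a little more concrete, here is a minimal sketch of a relay that passes messages between peers who only know each other's IDs. Everything in it (the `SignalingHub` class, the message shape, the peer names) is hypothetical and lives in memory; a real application would put something like this behind WebSockets or another transport of its choice.

```typescript
// A signalling message: who sent it, who it is for, and an opaque payload
// (in practice an offer, an answer, or an ICE candidate).
interface SignalMessage {
  from: string;
  to: string;
  kind: "offer" | "answer" | "candidate";
  payload: string;
}

type SignalHandler = (msg: SignalMessage) => void;

// Hypothetical in-memory relay standing in for a real signalling server.
class SignalingHub {
  private readonly peers = new Map<string, SignalHandler>();

  // A peer registers under an ID that the other side already knows.
  join(peerId: string, onMessage: SignalHandler): void {
    this.peers.set(peerId, onMessage);
  }

  // Forward a message to its recipient; returns false if the peer is unknown.
  send(msg: SignalMessage): boolean {
    const handler = this.peers.get(msg.to);
    if (!handler) return false;
    handler(msg);
    return true;
  }
}

// Usage: Alice sends an offer to Bob through the hub.
const hub = new SignalingHub();
const bobInbox: SignalMessage[] = [];
hub.join("alice", () => {});
hub.join("bob", (msg) => {
  bobInbox.push(msg);
});

const delivered = hub.send({ from: "alice", to: "bob", kind: "offer", payload: "<SDP offer text>" });
const undeliverable = hub.send({ from: "alice", to: "carol", kind: "offer", payload: "<SDP offer text>" });
console.log(delivered, undeliverable, bobInbox.length); // true false 1
```

Note that the hub never touches audio or video. It only helps the two peers find each other, much like the post office collecting the letter and checking the addresses.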
STUN and TURN Servers i.e. Filter as per location
Let's get back to the same example of sending and receiving letters. Whenever we want to send or receive a letter, it is aided by a post office. Based on the details of the receiver, it is sent to the destination.
The same is true when sending and receiving packets over the internet. A mobile device or a computer usually sits behind a firewall and a NAT (Network Address Translation) device, which is why it does not have a static public IP address of its own. At the edge of the private network, the NAT device translates private IP addresses to public IP addresses and vice versa.
NAT lets many devices share one public address and, as a side effect, hides them from the outside world, which adds some security. As you can see in the figure below, the private IP address 192.168.0.1 is actually 220.127.116.11 to the outer world. The NAT device uses mapping tables to route traffic between the private and the public realm.
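The mapping table can be pictured with a toy model like the one below. It is only a sketch: the `NatTable` class, its method names, and the starting port 40000 are my own assumptions for illustration, and real NAT devices also track protocol, timeouts, and the remote endpoint.

```typescript
// Toy model of a NAT mapping table: each private "ip:port" is given a port
// on the NAT's single public address.
class NatTable {
  private readonly publicIp: string;
  private nextPort: number;
  private readonly outbound = new Map<string, number>(); // private endpoint -> public port
  private readonly inbound = new Map<number, string>(); // public port -> private endpoint

  constructor(publicIp: string, firstPort: number = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
  }

  // Outgoing packet: rewrite the private source into a public one,
  // creating a mapping the first time this endpoint is seen.
  translateOut(privateIp: string, privatePort: number): string {
    const key = `${privateIp}:${privatePort}`;
    let port = this.outbound.get(key);
    if (port === undefined) {
      port = this.nextPort++;
      this.outbound.set(key, port);
      this.inbound.set(port, key);
    }
    return `${this.publicIp}:${port}`;
  }

  // Incoming packet: find the private endpoint, if a mapping exists.
  translateIn(publicPort: number): string | undefined {
    return this.inbound.get(publicPort);
  }
}

const nat = new NatTable("220.127.116.11");
const seenOutside = nat.translateOut("192.168.0.1", 5000);
console.log(seenOutside);            // 220.127.116.11:40000
console.log(nat.translateIn(40000)); // 192.168.0.1:5000
console.log(nat.translateIn(40001)); // undefined
```

The last line is the interesting one for WebRTC: a packet arriving on a port with no mapping has nowhere to go, which is why a peer cannot simply be dialled from the outside.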
Given this scenario, I cannot directly obtain my friend's IP address to send him audio and video data. And how does my friend know which IP address to send his audio and video to?
This is where signalling comes in, and it is also where we need STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) servers.
WebRTC deliberately does not implement or specify signalling, to avoid redundancy and to let applications reuse existing technologies.
STUN servers are hosted on the public internet and are responsible for replying to an incoming request with the public IP:PORT that the request arrived from. In simple terms, a computer uses a STUN server to discover its own IP:PORT from a public perspective.
This enables WebRTC to set up a direct connection with another peer behind a NAT. STUN servers are lightweight and can handle a large number of requests at the same time.
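For the curious, here is roughly what that exchange looks like on the wire. This sketch builds the 20-byte header of a STUN Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute value, following the layout in RFC 5389. The function names are mine, the sample bytes are hand-constructed to encode the example address used above, and no network traffic is sent.

```typescript
const MAGIC_COOKIE = 0x2112a442; // fixed value defined by the STUN specification

// Build the 20-byte header of a STUN Binding Request (no attributes).
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) throw new Error("transaction ID must be 12 bytes");
  const msg = new Uint8Array(20);
  const view = new DataView(msg.buffer);
  view.setUint16(0, 0x0001);       // message type: Binding Request
  view.setUint16(2, 0);            // message length: no attributes follow
  view.setUint32(4, MAGIC_COOKIE); // magic cookie
  msg.set(transactionId, 8);       // 96-bit transaction ID
  return msg;
}

// Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute (8 bytes).
// The port and address are XOR-ed with the magic cookie on the wire.
function decodeXorMappedAddress(value: Uint8Array): { ip: string; port: number } {
  if (value.length < 8) throw new Error("attribute value too short");
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  if (view.getUint8(1) !== 0x01) throw new Error("expected an IPv4 address family");
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const addr = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [24, 16, 8, 0].map((shift) => (addr >>> shift) & 0xff).join(".");
  return { ip, port };
}

// All-zero transaction ID for the demo only; real clients use random bytes.
const request = buildBindingRequest(new Uint8Array(12));
// Hand-built sample value that encodes 220.127.116.11:40000.
const sample = new Uint8Array([0x00, 0x01, 0xbd, 0x52, 0xfd, 0x6d, 0xd0, 0x49]);
console.log(request.length);                 // 20
console.log(decodeXorMappedAddress(sample)); // { ip: '220.127.116.11', port: 40000 }
```

In a browser you never write this yourself, because WebRTC performs the STUN exchange for you. The point is only that the server's answer is tiny, which is why STUN servers are so cheap to run.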
WebRTC first tries to set up direct communication between peers over UDP, and if that fails, it tries TCP. If that also fails, TURN servers can be used as a fallback to relay data between the endpoints.
In simple terms, TURN is used to relay audio, video, and data streams between peers. TURN servers are deployed on the public internet, which is why peers can reach them even from behind firewalls and proxies. Because all the media passes through them, TURN servers consume a lot of bandwidth.
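In application code, STUN and TURN servers usually show up as nothing more than a configuration object. The sketch below mirrors the shape of the browser's `RTCConfiguration`; the two small interfaces are local copies so the snippet also type-checks outside a browser, and the host names and credentials are placeholders, not real servers.

```typescript
// Minimal local copies of the browser's RTCIceServer / RTCConfiguration shapes.
interface IceServerEntry {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy?: "all" | "relay";
}

// Hypothetical hosts and credentials: replace them with your own deployment.
const peerConfig: PeerConfig = {
  iceServers: [
    // STUN: used to discover the public IP:PORT.
    { urls: "stun:stun.example.org:3478" },
    // TURN: relay fallback, offered over both UDP and TCP.
    {
      urls: [
        "turn:turn.example.org:3478?transport=udp",
        "turn:turn.example.org:3478?transport=tcp",
      ],
      username: "demo-user",
      credential: "demo-secret",
    },
  ],
  iceTransportPolicy: "all", // "relay" would force all traffic through TURN
};

// In a browser this object would be passed as: new RTCPeerConnection(peerConfig)
console.log(peerConfig.iceServers.length); // 2
```

Keeping the policy at "all" lets the direct path win whenever it works, so the bandwidth-hungry TURN relay is only used when nothing else gets through.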
SDP and ICE candidates i.e., Finding the exact location
SDP stands for Session Description Protocol. It is used to describe media communication sessions. It does not carry any media itself; instead, it describes the audio and video codecs, network details, and other device information that peers exchange. It is simply a text-based profile with all the information about the user's device.
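Since SDP is plain text, it is easy to peek inside. The sketch below pulls the media sections and codec names out of a small, hand-written SDP sample. The `listMedia` helper is hypothetical and handles only `m=` and `a=rtpmap` lines; real session descriptions produced by a browser are far longer.

```typescript
interface MediaSection {
  kind: string;     // "audio", "video", "application", ...
  codecs: string[]; // codec names taken from a=rtpmap lines
}

// Collect each media section and the codecs declared for it.
function listMedia(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | undefined;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      current = { kind: line.slice(2).split(" ")[0] ?? "", codecs: [] };
      sections.push(current);
    } else if (line.startsWith("a=rtpmap:") && current) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const encoding = line.split(" ")[1] ?? "";
      current.codecs.push(encoding.split("/")[0] ?? "");
    }
  }
  return sections;
}

// Hand-written sample, trimmed down to a few representative lines.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

console.log(JSON.stringify(listMedia(sampleSdp)));
// [{"kind":"audio","codecs":["opus"]},{"kind":"video","codecs":["VP8"]}]
```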
The exchange starts when one peer sends an "offer", which is an SDP description, to the other peer over the signalling channel (for instance SIP). After that, the initiator waits for the "answer" from the receiver connected to the same channel.
Apart from this, a single computer might have multiple network interfaces attached to it, and therefore multiple private addresses (wired, Wi-Fi, mobile data, etc.). To address this problem, peers generate ICE candidates, each of which describes one way the device can be reached on the network.
ICE stands for Interactive Connectivity Establishment, a technique for NAT (Network Address Translation) traversal. Each candidate provides a combination of IP address, port, and transport protocol to use. By trying these combinations, WebRTC finds the best working path to a particular peer in the network.
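An ICE candidate is also just a line of text. The sketch below parses three made-up candidate lines and orders them by their priority field. The `parseCandidate` helper and the sample values are assumptions for illustration, and sorting single candidates is a simplification: real ICE ranks pairs of local and remote candidates and then tests them for connectivity.

```typescript
interface ParsedCandidate {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  ip: string;
  port: number;
  type: string; // "host", "srflx" (learned via STUN) or "relay" (allocated on TURN)
}

// candidate:<foundation> <component> <transport> <priority> <ip> <port> typ <type> ...
function parseCandidate(line: string): ParsedCandidate {
  const fields = line.replace(/^a=/, "").replace(/^candidate:/, "").split(" ");
  const [foundation = "", component = "", transport = "", priority = "", ip = "", port = "", typ = "", type = ""] = fields;
  if (typ !== "typ" || type === "") throw new Error(`not an ICE candidate line: ${line}`);
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    ip,
    port: Number(port),
    type,
  };
}

// Made-up candidates for one device: a TURN relay, a local address,
// and the public address discovered through STUN.
const rawCandidates = [
  "candidate:3 1 udp 41885439 203.0.113.10 3478 typ relay raddr 220.127.116.11 rport 40000",
  "candidate:1 1 udp 2122260223 192.168.0.1 5000 typ host",
  "candidate:2 1 udp 1686052607 220.127.116.11 40000 typ srflx raddr 192.168.0.1 rport 5000",
];

// Higher priority first.
const ranked = rawCandidates.map(parseCandidate).sort((a, b) => b.priority - a.priority);
console.log(ranked.map((c) => `${c.type} ${c.ip}:${c.port}`).join(" | "));
// host 192.168.0.1:5000 | srflx 220.127.116.11:40000 | relay 203.0.113.10:3478
```

The ordering matches the fallback story told earlier: the direct local address is preferred, then the STUN-discovered public address, and the TURN relay comes last.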
End to end representation
The diagram below is an end-to-end representation of everything that has been explained so far.
WebRTC and Use Cases
WebRTC serves several use cases, and many companies depend on it for their daily functioning. These use cases include:
- Online Education
- Video Conferencing
- Live audio streaming
- Health care
- One-way conversational devices
- Team collaboration and communication
These use cases will continue to grow, and WebRTC is likely to become gradually more prevalent across all sectors. The future of WebRTC looks set to expand both in scale and in the range of its use cases.
Final words
WebRTC is one of the fastest-growing technologies, with very well-written standards and protocols. It brings new thinking to media communication while still supporting older technology stacks such as SIP, SDP, and RTP.
There are many libraries available that provide a complete abstraction of the WebRTC APIs. There are also plenty of WebRTC-as-a-service providers, such as the ZujoNow realtime communication SDK. All of these options are worth considering for faster adoption.
I hope this article helps you to understand WebRTC in detail.
Thank you for reading.