What is live streaming?

Streaming is the method of data transmission used when someone watches video on the Internet. It delivers a video file a little bit at a time, often from a remote storage location. Because only a few seconds of the file are transmitted at a time, client devices do not have to download the entire video before starting to play it.

Live streaming is when the streamed video is sent over the Internet in real time, without first being recorded and stored. Today, TV broadcasts, video game streams, and social media video can all be live-streamed.

Think about the difference between regular streaming and live streaming as the difference between an actor reciting a memorized monologue and improvising a speech. In the former, the content is created beforehand, stored, and then relayed to the audience. In the latter, the audience receives the content in the same moment that the actor creates it – just like in live streaming.

The term live streaming usually refers to broadcast live streams: one-to-many connections that go out to multiple users at once. Videoconferencing technologies like Skype, FaceTime, and Google Hangouts Meet are a different case: they work on real-time communication (RTC) protocols rather than the protocols used by one-to-many live stream broadcasts.

How does live streaming work on a technical level?

These are the main steps that take place behind the scenes in a live stream:

  • Video capture
  • Compression
  • Encoding
  • Segmentation
  • Content delivery network (CDN) distribution
  • CDN caching
  • Decoding
  • Video playback

Video capture

Live streaming starts with raw video data: the visual information captured by a camera. Within the computing device to which the camera is attached, this visual information is represented as digital data – in other words, 1s and 0s at the deepest level.
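
To get a sense of how much raw data a camera produces, the size of uncompressed video can be estimated with simple arithmetic. The sketch below assumes 24 bits per pixel (8 bits each for red, green, and blue) and a 1080p source at 30 frames per second. These figures and the function name `raw_bitrate_bps` are illustrative assumptions, not values from any particular camera or library.

```python
def raw_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Return the bits per second needed to carry uncompressed video."""
    return width * height * bits_per_pixel * fps


# Assumed example source: 1920x1080 pixels, 30 frames per second.
bps = raw_bitrate_bps(1920, 1080, 30)
print(f"{bps / 1e9:.2f} Gbit/s")  # prints "1.49 Gbit/s"
```

At roughly 1.5 gigabits per second under these assumptions, raw video is far too large to send to viewers as-is.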

Compression and encoding

Next, the raw video data is compressed and encoded. The data is compressed by removing redundant visual information. For example, if the first frame of the video displays a person talking against a grey background, the grey background does not need to be encoded again for subsequent frames that have the same background.

Think of video compression as being like adding a new piece of furniture to a living room. It is not necessary to buy entirely new furniture each time a new chair or side table is added. Instead, it is possible to keep the room layout roughly the same and change out just one piece at a time, occasionally making larger rearrangements as necessary. Similarly, not every frame of a video stream needs to be encoded in full – just the parts that change from frame to frame, such as the movement of a person’s mouth.
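
The idea of storing only what changes between frames can be sketched in a few lines. This is a deliberately simplified toy, assuming each frame is a flat list of pixel brightness values of equal length. Real codecs use far more sophisticated techniques, and the names `encode_delta` and `apply_delta` are hypothetical.

```python
from typing import List, Tuple

Frame = List[int]              # one brightness value per pixel (toy representation)
Delta = List[Tuple[int, int]]  # (pixel index, new value) for each changed pixel


def encode_delta(previous: Frame, current: Frame) -> Delta:
    """Record only the pixels that differ from the previous frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]


def apply_delta(previous: Frame, delta: Delta) -> Frame:
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


frame1 = [128] * 12       # a uniform grey background
frame2 = list(frame1)
frame2[5] = 40            # two pixels change, e.g. a moving mouth
frame2[6] = 42

delta = encode_delta(frame1, frame2)
print(delta)  # [(5, 40), (6, 42)]
assert apply_delta(frame1, delta) == frame2
```

Here 12 pixel values are replaced by 2 change records, and the unchanged grey background is never sent a second time.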

“Encoding” refers to the process of converting data into a new format. Live streaming video data is encoded into an interpretable digital format that a wide variety of devices recognize. Common video encoding standards include:

  • H.264
  • H.265
  • VP9
  • AV1

Segmentation

Video includes a lot of digital information, which is why it takes longer to download a video file than to download a short PDF or an image. Because it would not be practical to send all the video data out over the Internet at once, streaming video is divided into smaller segments a few seconds in length.
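
As a rough illustration of segmentation, the sketch below splits a sequence of frames into fixed-length segments. The 30 frames-per-second rate, the 4-second segment length, and the function name `segment_frames` are assumptions chosen for the example. Real streaming systems typically cut segments from encoded video at keyframe boundaries rather than from a plain list.

```python
from typing import List, Sequence


def segment_frames(frames: Sequence, fps: int, segment_seconds: int) -> List[list]:
    """Split frames into consecutive segments of at most segment_seconds each."""
    per_segment = fps * segment_seconds
    return [list(frames[i:i + per_segment]) for i in range(0, len(frames), per_segment)]


frames = list(range(300))  # stand-in for 10 seconds of video at 30 fps
segments = segment_frames(frames, fps=30, segment_seconds=4)
print([len(s) for s in segments])  # [120, 120, 60]
```

The final segment is shorter because 10 seconds does not divide evenly into 4-second pieces.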

CDN distribution and caching

Once the live stream has been compressed, encoded, and segmented (all of which takes only a few seconds), it needs to be made available to the dozens or millions of viewers who want to watch it. To maintain high quality with minimal latency while serving the stream to multiple viewers in different locations, the stream should be distributed through a CDN.

A CDN is a distributed network of servers that cache and serve content on behalf of an origin server. Using a CDN results in faster performance, because user requests no longer have to travel all the way to the origin server but can instead be handled by a nearby CDN server. Handling requests and delivering content in this manner also reduces the origin server’s workload. Finally, CDNs make it possible to efficiently serve content to users around the world because their servers are located all over the world instead of clustered in a single geographic area.

A CDN will also cache – temporarily save – each segment of the live stream, so most viewers will get the live stream from the CDN cache instead of from the origin server. This actually makes the live stream closer to real time, even though the cached data is a few seconds behind, because it cuts down on round-trip time (RTT) to and from the origin server.
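
The effect of caching on origin load can be shown with a minimal sketch of a single CDN server. The `EdgeCache` class, the `origin` function, and the segment name are hypothetical. Real CDNs also expire and evict cached items, which this example leaves out.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy CDN server: serves segments from memory, asking the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many times the origin had to be contacted

    def get(self, segment_name: str) -> bytes:
        if segment_name not in self._store:
            # Cache miss: one trip to the origin, then keep a copy.
            self.origin_requests += 1
            self._store[segment_name] = self._fetch(segment_name)
        return self._store[segment_name]


def origin(segment_name: str) -> bytes:
    """Stand-in for the origin server that produces the live stream."""
    return f"video data for {segment_name}".encode()


edge = EdgeCache(origin)
for _ in range(1000):          # 1,000 viewers request the same segment
    edge.get("segment-0001")
print(edge.origin_requests)    # 1
```

Only the first viewer's request reaches the origin. The other 999 are answered from the cache.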

Decoding and video playback

The CDN sends the live stream out to all the users who are watching it. Each user’s device receives, decodes, and decompresses the segmented video data. Finally, a media player on the user’s device – either a dedicated app or a video player within the browser – interprets the data as visual information, and the video plays.
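
The receiving side can be sketched as a loop that decompresses each segment as it arrives, so playback can begin before later segments exist. In this sketch, Python's general-purpose `zlib` module stands in for a real video decoder, and the function name `play_stream` and the sample data are assumptions for illustration only.

```python
import zlib
from typing import Iterable, Iterator


def play_stream(segments: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress segments one at a time, yielding each as soon as it is ready."""
    for compressed in segments:
        yield zlib.decompress(compressed)


# Stand-in for two segments produced by the broadcaster.
source = [b"frame data A" * 100, b"frame data B" * 100]
delivered = [zlib.compress(chunk) for chunk in source]

played = list(play_stream(delivered))
assert played == source  # the viewer recovers exactly what was sent
```

Because `play_stream` is a generator, the first segment is available to the player without waiting for the rest of the stream.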