This is part one of a three-part series analysing the impact of Adaptive Streaming, specifically MPEG-DASH, on Quality of Experience (QoE).
Other posts in this series:
- Part 2. The second part outlines the methods employed to conduct the experiment.
- Part 3. This post describes the results and conclusions that can be derived from our findings.
Real-time entertainment services such as high-quality video streaming currently account for more than 60% of Internet traffic, e.g., in North America’s fixed access networks, as shown in Figure 1. Interestingly, these services are all delivered over-the-top (OTT) of the existing networking infrastructure using the Hypertext Transfer Protocol (HTTP), which resulted in the standardization of MPEG Dynamic Adaptive Streaming over HTTP (DASH). The MPEG-DASH standard enables smooth multimedia streaming to heterogeneous devices and commonly assumes the use of HTTP URLs to identify the segments available to clients.
Figure: Internet TV vs. Traditional TV in 2010. Areas most important to an overall TV experience: timing control, ease of use. Source: Cisco IBSG Youth Survey, Cisco IBSG Youth Focus Group Sessions, 2010
In the first part of this blog post we focus on the Quality of Experience (QoE) of DASH-based services. We provide a general definition of QoE and identify the parameters that matter for media services based on MPEG-DASH. The second part of the blog post comprises results of a QoE evaluation of different adaptation logics proposed in the research literature and also one commercially available implementation from Bitmovin.
QoE for Adaptive Streaming with DASH
I. Quality of Experience
The term Quality of Experience (QoE) can be seen as an evolution from the term Quality of Service (QoS), both defined by the ITU-T in P.10/G.100. QoS is defined as the “totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service” whereas QoE is defined as “the overall acceptability of an application or service, as perceived subjectively by the end-user”. Although this definition was largely used (but not necessarily agreed upon), one could easily understand that acceptability is only one aspect of quality, as one may accept a service – depending on the context – but not necessarily be happy or satisfied. Therefore, the COST Action IC1003 – QUALINET goes a step beyond and defines QoE as “the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state”.
The QUALINET white paper goes even further and defines influence factors as “any characteristic of a user, system, service, application, or context whose actual state or setting may have influence on the Quality of Experience for the user” which are grouped into human, system, and context influence factors. Additionally, features of QoE are provided depending on the level of direct perception, interaction, the usage situation, and service. A QoE feature is thus defined as “a perceivable, recognized and namable characteristic of the individual’s experience of a service which contributes to its quality”.
As the definitions above are very generic, we describe next what they mean for DASH-based, high-quality video streaming services.
II. QoE parameters for DASH
Different application domains may have different requirements in terms of QoE. Therefore, there is a need to provide specializations of a generally agreed definition of QoE (see above) pertaining to the respective application domain, taking into account its requirements formulated by means of influence factors and features of QoE. Consequently, an application-specific QoE definition can be provided by selecting the influence factors and features of QoE reflecting the requirements of the application domain and incorporating them into the generally agreed definition of QoE.
For DASH-based adaptive streaming services the main QoE influence factors can be described as
- initial/start-up delay,
- buffer underruns also known as stalls,
- quality switches and
- media throughput.
Initial/start-up delay
The initial or start-up delay is the time between the service/content request and the start of the actual playout. It typically comprises processing time at both the server and the client, network time for sending the MPD request and receiving the first segments, and initial buffering time before the playout starts. In general, the start-up delay should be low, but how low depends on the use case. For example, the QoE of live streams or short movie clips is more sensitive to start-up delay than that of full-length video-on-demand content.
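The components listed above can be added up to estimate the start-up delay. The following is a minimal sketch, assuming the components are sequential and do not overlap; the class name, field names, and timing values are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass
class StartupTimings:
    """Hypothetical per-session timings, all in seconds."""
    server_processing: float   # time the server needs to handle the requests
    client_processing: float   # e.g., parsing the MPD on the client
    mpd_round_trip: float      # network time to request and receive the MPD
    first_segments: float      # network time to receive the first segments
    initial_buffering: float   # extra wait before the playout starts


def startup_delay(t: StartupTimings) -> float:
    """Sum the components between content request and start of playout."""
    return (t.server_processing + t.client_processing
            + t.mpd_round_trip + t.first_segments + t.initial_buffering)


# Illustrative values only, not measurements.
timings = StartupTimings(0.05, 0.10, 0.20, 0.80, 0.50)
print(f"start-up delay: {startup_delay(timings):.2f} s")
```

In a real player some of these steps may overlap, so the sum should be read as an upper bound under the stated assumption.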
Buffer underruns / stalls
A stall occurs when the video/picture freezes. This is typically due to a buffer underrun, and playback resumes once enough segments have been re-buffered. In practice, users experiencing stalls usually report a very low QoE and, thus, stalls should be avoided completely, even if this means increasing the start-up delay.
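To make the relation between buffer level and stalls concrete, here is a minimal sketch of a client buffer simulation. It assumes segments are downloaded one after another, the buffer has no upper limit, and playback resumes after a configurable number of re-buffered segments; the function name, parameters, and download times are hypothetical.

```python
def simulate_stalls(download_times, segment_len=2.0,
                    startup_segments=2, resume_segments=1):
    """Return (stall_count, total_stall_seconds) for sequential downloads.

    download_times[i] is the time in seconds needed to fetch segment i.
    """
    buffer_s = 0.0     # seconds of media currently in the buffer
    playing = False    # False during start-up and while stalled
    started = False    # True once the initial playout has begun
    stalls = 0
    stall_time = 0.0

    for dl in download_times:
        if playing:
            if dl > buffer_s:
                # The buffer runs dry before the segment arrives: a stall.
                stall_time += dl - buffer_s
                buffer_s = 0.0
                playing = False
                stalls += 1
            else:
                buffer_s -= dl
        elif started:
            # Still stalled: the whole download time is frozen playback.
            stall_time += dl
        buffer_s += segment_len
        needed = (resume_segments if started else startup_segments) * segment_len
        if not playing and buffer_s >= needed:
            playing = True
            started = True
    return stalls, stall_time


# Illustrative download times: the fourth segment arrives late.
print(simulate_stalls([1.0, 1.0, 1.5, 5.0, 1.0, 1.0]))
```

Raising `startup_segments` in this sketch increases the start-up delay but leaves more buffered media to absorb a slow download, which mirrors the trade-off described above.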
Quality switches
Under changing network conditions, quality switches occur to avoid buffer underruns (and thus stalls) and to maintain smooth video playback. However, if switches happen too often (e.g., every second) or with a high amplitude (e.g., from a very high quality to a very low quality representation), they may negatively impact the QoE.
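Switching frequency and amplitude can be derived from the sequence of representations a client selected. The sketch below assumes the amplitude of a switch is the absolute bitrate difference between two consecutive segments; the function name and the bitrate values are hypothetical.

```python
def switch_stats(bitrates_kbps):
    """Return (switch_count, mean_amplitude_kbps, max_amplitude_kbps)."""
    # Compare each segment's bitrate with the next one; keep only real switches.
    amplitudes = [abs(nxt - cur)
                  for cur, nxt in zip(bitrates_kbps, bitrates_kbps[1:])
                  if nxt != cur]
    if not amplitudes:
        return 0, 0.0, 0
    return len(amplitudes), sum(amplitudes) / len(amplitudes), max(amplitudes)


# Illustrative bitrates (kbit/s) of the representation chosen per segment.
selected = [4000, 4000, 2500, 2500, 500, 2500, 4000]
count, mean_amp, max_amp = switch_stats(selected)
print(count, mean_amp, max_amp)
```

Other amplitude measures are possible, for example the number of quality levels skipped, and may correlate differently with perceived quality.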
Media throughput
Finally, the overall media throughput at the client is measured in media bits per second, and a higher media throughput usually means higher QoE. However, it should never be used alone but always in conjunction with the above metrics, as we will see in the experiment results.
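As an illustration, media throughput can be computed from the bitrates of the received segments. This sketch assumes one particular definition, namely total media bits divided by the wall-clock session time including stalls; the function name and all values are hypothetical.

```python
def media_throughput_bps(segment_bitrates_bps, segment_len_s, session_s):
    """Media bits received per second of wall-clock session time."""
    media_bits = sum(b * segment_len_s for b in segment_bitrates_bps)
    return media_bits / session_s


# Five 2-second segments: three at 4 Mbit/s, two at 1 Mbit/s.
bitrates = [4_000_000] * 3 + [1_000_000] * 2

smooth = media_throughput_bps(bitrates, 2.0, session_s=10.0)   # no stalls
stalled = media_throughput_bps(bitrates, 2.0, session_s=14.0)  # 4 s of stalls
print(smooth, stalled)
```

The number alone hides how the bits were delivered: a session with frequent quality switches can reach the same value as a steady one, which is why it needs to be read together with the other metrics.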
Overview and Summary
The above-mentioned parameters focus on the context, specifically on delivery and device characteristics. However, QoE is about the users consuming content and services. Therefore, it is important to understand how the content is provided for DASH-based adaptive streaming services, as this directly influences the QoE. In particular, this concerns how many different representations are available, in which qualities (incl. bitrate, resolution, etc.), and the actual segment length (e.g., 2s vs. 10s). Additional parameters are the available languages and the existence of subtitles, closed captions, or any other means that help users with impairments consume the content more conveniently. In this series we focus on the context parameters and different segment lengths, and assume a broad range of different representations is available from which the client can select.
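The effect of the content provisioning choices can be sketched with a small example that counts the segments resulting from different segment lengths. The set of representations, the content duration, and all names below are hypothetical and not taken from the experiment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded version of the content (hypothetical values below)."""
    bitrate_kbps: int
    width: int
    height: int


def segment_count(duration_s, segment_len_s):
    """Number of segments per representation; the last one may be shorter."""
    return math.ceil(duration_s / segment_len_s)


representations = [
    Representation(500, 640, 360),
    Representation(2500, 1280, 720),
    Representation(4000, 1920, 1080),
]

duration_s = 600  # a 10-minute clip
for seg_len in (2, 10):
    per_rep = segment_count(duration_s, seg_len)
    total = per_rep * len(representations)
    print(f"{seg_len}s segments: {per_rep} per representation, {total} in total")
```

Shorter segments give the client more opportunities to switch representation, at the cost of more requests per minute of content.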
The second part of this blog post will provide results of a QoE evaluation of different adaptation logics proposed in the research literature and also one commercially available implementation from Bitmovin.