Evaluating ML-based video streaming in real-world environments

Machine learning in networking, and for video streaming in particular, suffers from a reproducibility crisis. Proposed ML-based algorithms that worked well in research environments have failed to deliver in the real world, or have delivered only modest improvements over conventional solutions. A central problem is that we do not fully understand what makes a network environment difficult for a streaming algorithm. This makes it nearly impossible to evaluate algorithms properly before they are deployed, or even to reason about whether an ML-based algorithm can be expected to bring benefits in a given environment.

Furthermore, because we lack an understanding of what a ‘good’ ML algorithm really needs, each new proposal features a new set of gimmicks, such as novel input features that were not used or collected for earlier algorithms. This makes comparing algorithms even harder. For example, the Pensieve algorithm takes only video information as input, while the more recent Fugu algorithm also includes TCP information. In the test environment of Fugu, Pensieve did not perform as well as it did in the environment it was originally tested in, and it also performed much worse than Fugu. What role do the additional inputs of Fugu play in this? We cannot tell: the TCP information used by Fugu was not collected by Pensieve, so we cannot port Fugu back to the environment Pensieve was tested in. In a similar vein, we have recently found evidence that using even more fine-grained packet information can improve network predictions, but we cannot validate this hypothesis in the environment of either Pensieve or Fugu, as neither collects packets.

In this project, we aim to collect the real-world data needed to answer these questions. We will set up streaming applications that capture information on every available layer, from the network layer (packets), through the transport layer (TCP info), to the application layer (video info). With this “full picture” of information, we will analyze which aspects of networks really matter for existing models, and leverage these insights to improve our own models.
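To make the idea of a cross-layer record more concrete, here is a minimal sketch of how measurements from the three layers might be joined per video chunk. All class names, field names, and the interval-based matching rule are illustrative assumptions, not the format used by Pensieve, Fugu, or the streaming client to be extended in this project.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PacketRecord:
    """Network layer: one captured packet (hypothetical fields)."""
    timestamp: float  # seconds since session start
    size_bytes: int


@dataclass
class TcpInfoSample:
    """Transport layer: one sample of TCP statistics (hypothetical fields)."""
    timestamp: float
    rtt_ms: float
    cwnd_packets: int


@dataclass
class ChunkRecord:
    """Application layer: one downloaded video chunk plus the lower-layer
    measurements observed while it was being downloaded."""
    start: float  # download start, seconds since session start
    end: float    # download end, seconds since session start
    size_bytes: int
    bitrate_kbps: int
    packets: List[PacketRecord] = field(default_factory=list)
    tcp_samples: List[TcpInfoSample] = field(default_factory=list)


def attach_layers(chunks, packets, tcp_samples):
    """Assign each packet and TCP sample to the chunk whose download
    interval [start, end) contains its timestamp."""
    for chunk in chunks:
        chunk.packets = [
            p for p in packets if chunk.start <= p.timestamp < chunk.end
        ]
        chunk.tcp_samples = [
            s for s in tcp_samples if chunk.start <= s.timestamp < chunk.end
        ]
    return chunks
```

Joining on the chunk download interval is only one possible design choice. It keeps the application-level view (chunks) as the primary key, so that models using only video information and models using additional TCP or packet information could, in principle, be trained and compared on the same records.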

This thesis can be roughly separated into the following steps:

  • Research: Learn about existing algorithms for video streaming, in particular what information they rely on. Also learn about data collection in networks, both regarding theory and methods.
  • Data Collection: Extend an existing video streaming client/server to be able to collect a broad range of data. Deploy it in virtual machines across the globe.
  • Analysis and Beyond: Use the collected data to better understand what makes networks difficult for streaming algorithms, and develop algorithms that leverage this understanding.