Adaptive Network Traffic Modeling
Abstract
The Internet is a decentralized and constantly growing control system that has become an integral part of the lives of over 5.3 billion people. With this scale comes a vast array of applications. Performant and robust control of applications like video streaming requires modeling network traffic, such as estimating whether the network is congested or how long it will take to transmit data. The complexity of this modeling problem has steadily increased over time: the model space is ever-growing with each new network algorithm and application, while observable signals have remained largely unchanged. It has become extremely difficult, and perhaps intractable, to model network traffic from first principles, and research has increasingly turned to machine learning (ML) to learn models from data.
This dissertation explores the opportunities and challenges of using ML for network traffic modeling and additionally investigates how advances in programmable networking may provide better signals.
First, we study learning over time. We present Memento, a sample selection system for updating ML models with a focus on tail performance while avoiding unnecessary retraining. The key insight behind Memento is that smart data selection is crucial both to maintain representative training data and to decide when retraining with the selected data is beneficial.
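To make the idea concrete, the following sketch shows one way such a selection loop could look. It is a minimal Python illustration, not the algorithm from the thesis: the class name, the loss-tail heuristic, the capacity, and the retraining threshold are all assumptions.

# Hypothetical Memento-style sample memory (illustration only; the names,
# the loss-tail heuristic, and the thresholds are assumptions).
import numpy as np

class SampleMemory:
    def __init__(self, capacity=10000, tail_quantile=0.9, retrain_threshold=0.1):
        self.capacity = capacity              # max samples kept for retraining
        self.tail_quantile = tail_quantile    # part of the loss tail to protect
        self.retrain_threshold = retrain_threshold
        self.features = None
        self.losses = None

    def insert(self, new_features, new_losses):
        # Merge the new batch, then evict while protecting tail samples so the
        # memory stays representative of rare, hard cases.
        if self.features is None:
            self.features, self.losses = new_features, new_losses
        else:
            self.features = np.vstack([self.features, new_features])
            self.losses = np.concatenate([self.losses, new_losses])
        if len(self.losses) > self.capacity:
            cutoff = np.quantile(self.losses, self.tail_quantile)
            tail = np.flatnonzero(self.losses >= cutoff)
            rest = np.flatnonzero(self.losses < cutoff)
            n_rest = max(self.capacity - len(tail), 0)
            keep = np.concatenate([tail, np.random.choice(rest, n_rest, replace=False)])
            self.features, self.losses = self.features[keep], self.losses[keep]

    def should_retrain(self, previous_losses):
        # Retrain only if the tail of the stored loss distribution shifted noticeably.
        old = np.quantile(previous_losses, self.tail_quantile)
        new = np.quantile(self.losses, self.tail_quantile)
        return abs(new - old) > self.retrain_threshold * max(abs(old), 1e-9)

In this toy version, "smart selection" amounts to protecting the loss tail during eviction and retraining only on a measurable tail shift; the selection and retraining criteria in the thesis differ.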
Second, we investigate learning over space, i.e., the generalization of models to other network environments and tasks, and present the Network Traffic Transformer (NTT). NTT is a pre-trained Transformer-based model that can be efficiently fine-tuned to different networks and prediction tasks.
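As a rough illustration of this pre-train/fine-tune pattern, the sketch below builds a small Transformer encoder over per-packet features with a swappable prediction head. It is a minimal PyTorch sketch under assumed feature and layer sizes; it is not the actual NTT architecture.

# Hypothetical NTT-style model (illustration only; feature set, sizes, and the
# delay-prediction head are assumptions, not the thesis' architecture).
import torch
import torch.nn as nn

class TrafficTransformer(nn.Module):
    def __init__(self, n_features=3, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        # Per-packet features -> token embeddings (positional encoding omitted for brevity).
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)   # shared, pre-trained part
        self.head = nn.Linear(d_model, 1)                       # task head, e.g. next-packet delay

    def forward(self, packets):                                 # packets: (batch, seq_len, n_features)
        hidden = self.encoder(self.embed(packets))
        return self.head(hidden[:, -1])                         # predict from the last position

model = TrafficTransformer()
# Pre-train encoder + head on a generic task, then fine-tune on a new network
# or task by swapping the head and optionally freezing the shared encoder.
model.head = nn.Linear(64, 1)
for param in model.encoder.parameters():
    param.requires_grad = False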
Third, we study the underlying problem of learning latent network state common to many prediction tasks. Through in-depth analysis and comparison of several ML-based models for video streaming, we gain important insights into modeling strategies and model generalizability.
Finally, we explore the potential of programmable networks to enhance observable signals by programmatically processing all packets in the network, albeit with limited computational resources. We present FitNets, which makes the most of constrained programmability with hardware-software co-design: FitNets learns accurate distributions of network traffic features in the control plane, enabled by efficient model scoring in the data plane.
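The split between the two planes can be illustrated with a minimal sketch: the control plane fits a coarse distribution over a traffic feature, and the data plane scores each packet with nothing more than a bin lookup. The histogram model, bin count, and function names below are assumptions for illustration, not the design in the thesis.

# Hypothetical FitNets-style split (illustration only): the control plane fits a
# feature distribution; the data plane approximates per-packet scoring with a
# small lookup table, mimicking the limited arithmetic of programmable switches.
import numpy as np

def fit_control_plane(samples, n_bins=16):
    # Fit a coarse distribution (here: a histogram) over a traffic feature,
    # e.g. packet inter-arrival times, and precompute per-bin log-probabilities.
    counts, edges = np.histogram(samples, bins=n_bins)
    probs = (counts + 1) / (counts.sum() + n_bins)   # Laplace smoothing
    return edges, np.log(probs)

def score_data_plane(value, edges, log_probs):
    # Per-packet scoring reduced to one range match (bin lookup) and one table
    # read -- operations a constrained data plane can realistically support.
    bin_idx = np.clip(np.searchsorted(edges, value) - 1, 0, len(log_probs) - 1)
    return log_probs[bin_idx]

# Example: fit on observed inter-arrival times, then score new packets cheaply.
rng = np.random.default_rng(0)
observed = rng.exponential(scale=1.0, size=10000)
edges, log_probs = fit_control_plane(observed)
print(score_data_plane(0.5, edges, log_probs))

The design point this toy captures is the division of labor: fitting stays in software where resources are plentiful, while the per-packet fast path only performs lookups.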
BibTeX
@PHDTHESIS{dietmüller2024adaptive,
copyright = {In Copyright - Non-Commercial Use Permitted},
year = {2024},
type = {Doctoral Thesis},
author = {Dietmüller, Alexander},
size = {149 p.},
abstract = {The Internet is a decentralized and constantly growing control system that has become an integral part of the lives of over 5.3 billion people. With this scale comes a vast array of applications. Performant and robust control of applications like video streaming requires modeling network traffic, such as estimating whether the network is congested or how long it will take to transmit data. The complexity of this modeling problem has steadily increased over time: the model space is ever-growing with each new network algorithm and application, while observable signals have remained largely unchanged. It has become extremely difficult, and perhaps intractable, to model network traffic from first principles, and research has increasingly turned to machine learning (ML) to learn models from data. This dissertation explores the opportunities and challenges of using ML for network traffic modeling and additionally investigates how advances in programmable networking may provide better signals. First, we study learning over time. We present Memento, a sample selection system for updating ML models with a focus on tail performance while avoiding unnecessary retraining. The key insight behind Memento is that a smart data selection is crucial to maintain representative training data and to decide when retraining models with the selected data is beneficial. Second, we investigate learning over space, the generalization of models to other network environments and tasks, and present a Network Traffic Transformer (NTT). NTT is a pre-trained Transformer-based model that can be efficiently fine-tuned to different networks and prediction tasks. Third, we study the underlying problem of learning latent network state common to many prediction tasks. Through in-depth analysis and comparison of several ML-based models for video streaming, we gain important insights into modeling strategies and model generalizability. Finally, we explore the potential of programmable networks to enhance observable signals by programmatically processing all packets in the network, albeit with limited computational resources. We present FitNets, which makes the most of constrained programmability with hardware-software co-design: FitNets learns accurate distributions of network traffic features in the control plane, enabled by efficient model scoring in the data plane.},
keywords = {COMPUTER NETWORKS; Network Traffic Analysis; MACHINE LEARNING (ARTIFICIAL INTELLIGENCE)},
language = {en},
address = {Zurich},
publisher = {ETH Zurich},
DOI = {10.3929/ethz-b-000698662},
title = {Adaptive Network Traffic Modeling},
school = {ETH Zurich}
}
Research Collection: 20.500.11850/698662