Probabilistic modelling of internet routes
Network operators are often interested in verifying the configuration of their network is correct, that is, it behaves the way they intend it to (e.g. no blackholes, no forwarding loops, no route leaks, …). However, the behaviour of the network depends on how it is connected to the rest of the internet (so-called routing input). Indeed, to form the internet, networks (Autonomous Systems) exchange thousands of BGP messages to establish routes to connect with each other. Therefore, from the viewpoint of one network operator, being able to predict what type of routes you can expect from your neighbours may yield great advantages. State-of-the-art verification tools are typically agnostic to routing input dynamics and either constrain the set of routing inputs [1] or allow arbitrary routing inputs [2][3]. The latter generally makes them unrealistic or unscalable. You will contribute in an effort to improve this.
This thesis aims to better understand the “laws of motion” of the internet in order to predict changes at a local level. More precisely, we wish to estimate a distribution over all the possible routing inputs our network could be exposed to based on historical data. In a previous Masters thesis, some initial results were achieved in this direction, showing that some environments were much more likely than others. We wish to take the scope of that work to the next level both in terms of methods used and in performance (precision and training/inference time).
For the probabilistic modelling, we envision either traditional black box probabilistic modelling (Gaussian Processes, variational inference techniques, deep learning) or the use of domain knowledge to explicitly define a parametric model of the internet that explains better the dynamics of message exchange. The latter method is inspired by network inference techniques [4]. Inference of model probabilities can then be done exactly or by random sampling.
You will work in close collaboration with two PhD students, with weekly meetings and continuous support.
Milestones
- After familiarising yourself with the background information (we will provide references to get you started), you will get to know your tools with initial experiments.
- Getting to know the data by replicating some results.
- Formally framing the problem. What are you trying to model and with which assumptions?
- Come up with good evaluation metrics, given the objectives of capturing route diversity, generalisation, training and inference complexity, …
- Explore modelling ideas presented above.
- Enhance the framework to enable training and evaluation different models.
- Implement at least one black box model and one based on domain knowledge.
- Improve one of the models.
- Depending on your initial experiments, you should explore the most promising avenues more sophisticatedly and perhaps develop your own ideas.
- You should rigorously evaluate your model on a variety of networks.
Requirements
- Some knowledge of probabilistic modelling.
- Coding abilities are also welcome as implementation and evaluation are key to this thesis.
References
- S. Steffen, T. Gehr, P. Tsankov, L. Vanbever, and M. Vechev, “NetDice: Probabilistic Verification of Network Configurations”, SIGCOMM ‘20.
- R. Beckett, A. Gupta, R. Mahajan, and D. Walker, “Minesweeper: A General Approach to Network Configuration Verification” SIGCOMM ‘17.
- D. Wang, P. Zhang, and A. Gember-Jacobson, “Expresso: Comprehensively Reasoning About External Routes Using Symbolic Simulation,” SIGCOMM ‘24
- W. Mühlbauer, A. Feldmann, O. Maennel, M. Roughan, and S. Uhlig, “Building an AS-Topology Model that Captures Route Diversity”.