Generating representative, live network traffic out of millions of code repositories
Abstract
In theory, any network operator, developer, or vendor should have access to large amounts of live network traffic for testing their solutions. In practice, though, that is not the case. Network actors instead have to use packet traces or synthetic traffic, which is highly suboptimal: today’s generated traffic is unrealistic. We propose a system for generating live application traffic leveraging massive codebases such as GitHub.
Our key observation is that many repositories have now become “orchestrable” thanks to the rise of container technologies. To showcase the practicality of the approach, we iterate through >293k GitHub repositories and manage to capture >74k traces containing meaningful and diverse network traffic. Based on this first success, we outline the design of a system, Dynamo, which analyzes these traces to select and orchestrate open-source projects to automatically generate live application traffic matching a user’s specification.
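As a rough illustration of the "orchestrable" observation, the sketch below checks whether a locally cloned repository ships a container orchestration file. It rests on an assumption not stated in the abstract: that a repository counts as orchestrable when it contains a Docker Compose file under one of the common file names. The names `COMPOSE_NAMES`, `find_compose_files`, and `is_orchestrable` are hypothetical and are not taken from Dynamo.

```python
from pathlib import Path

# Assumption: a repository is treated as "orchestrable" if it contains a
# Compose file with one of these common names. The paper may use other criteria.
COMPOSE_NAMES = (
    "docker-compose.yml",
    "docker-compose.yaml",
    "compose.yml",
    "compose.yaml",
)


def find_compose_files(repo_root):
    """Return every Compose file found anywhere under repo_root, sorted by path."""
    root = Path(repo_root)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.name in COMPOSE_NAMES
    )


def is_orchestrable(repo_root):
    """Return True if the checkout contains at least one Compose file."""
    return bool(find_compose_files(repo_root))
```

A crawler could apply `is_orchestrable` to each cloned repository and only attempt to start and capture traffic from those that pass; the capture step itself is outside this sketch.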
BibTeX
@INPROCEEDINGS{buehler2022generating,
isbn = {978-1-4503-9899-2},
doi = {10.1145/3563766.3564084},
year = {2022},
month = nov,
booktitle = {HotNets '22: Proceedings of the 21st ACM Workshop on Hot Topics in Networks},
type = {Conference Paper},
author = {Bühler, Tobias and Schmid, Roland and Lutz, Sandro and Vanbever, Laurent},
size = {7 p.},
abstract = {In theory, any network operator, developer, or vendor should have access to large amounts of live network traffic for testing their solutions. In practice, though, that is not the case. Network actors instead have to use packet traces or synthetic traffic, which is highly suboptimal: today's generated traffic is unrealistic. We propose a system for generating live application traffic leveraging massive codebases such as GitHub. Our key observation is that many repositories have now become "orchestrable" thanks to the rise of container technologies. To showcase the practicality of the approach, we iterate through >293k GitHub repositories and manage to capture >74k traces containing meaningful and diverse network traffic. Based on this first success, we outline the design of a system, Dynamo, which analyzes these traces to select and orchestrate open-source projects to automatically generate live application traffic matching a user's specification.},
keywords = {traffic generation; traffic analysis; network virtualization},
language = {en},
address = {New York, NY},
publisher = {Association for Computing Machinery},
title = {Generating representative, live network traffic out of millions of code repositories},
pages = {3563766},
note = {21st ACM Workshop on Hot Topics in Networks (HotNets 2022); Conference Location: Austin, TX, USA; Conference Date: November 14-15, 2022}
}
Research Collection: 20.500.11850/589729
Slide Sources: https://gitlab.ethz.ch/projects/41218