USENIX NSDI 2023. Boston, MA, USA (April 2023).
Monitoring where traffic enters and leaves a network is a routine task for network operators. To scale to Tbps of traffic, large Internet Service Providers (ISPs) mainly rely on traffic sampling for such global monitoring. Sampling, however, either provides only a sparse view or generates unreasonable overhead. While sampling can be tailored and optimized to specific contexts, this coverage–overhead trade-off is unavoidable.
Rather than optimizing sampling, we propose to "magnify" the sampling coverage by complementing it with mirroring. Magnifier enhances the global network view using a two-step approach: based on sampling data, it first infers traffic ingress and egress points using a heuristic; it then uses mirroring to validate these inferences efficiently. The key idea behind Magnifier is to use negative mirroring rules, i.e., to monitor where traffic should not go. We implement Magnifier on commercial routers and demonstrate that it indeed enhances the global network view with negligible traffic overhead. Finally, we observe that monitoring based on our heuristics also enables detecting other events, such as certain failures and DDoS attacks.
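The negative-mirroring idea can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function names, the per-prefix granularity, and the router identifiers are all assumptions made for the example. The point is that once ingress points are inferred from samples, mirroring is installed only where traffic should *not* enter, so any mirrored packet directly disproves an inference.

```python
# Illustrative sketch of Magnifier's "negative mirroring" idea.
# All names and the per-prefix granularity are hypothetical.

def infer_active_ingresses(samples):
    """Heuristic step: assume a prefix enters the network at the
    routers where its sampled packets were observed entering."""
    active = {}
    for prefix, router in samples:
        active.setdefault(prefix, set()).add(router)
    return active

def negative_mirror_rules(all_routers, active):
    """Validation step: mirror a prefix only on the routers where it
    should NOT enter. Mirrored traffic then signals a wrong inference,
    while correct inferences cost (almost) no mirrored traffic."""
    return {prefix: all_routers - routers
            for prefix, routers in active.items()}

# Tiny example: two prefixes sampled at two of three border routers.
samples = [("10.0.0.0/8", "r1"), ("10.0.0.0/8", "r1"), ("192.0.2.0/24", "r2")]
rules = negative_mirror_rules({"r1", "r2", "r3"},
                              infer_active_ingresses(samples))
# rules["10.0.0.0/8"] is {"r2", "r3"}: mirror 10.0.0.0/8 everywhere
# except its inferred ingress r1.
```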
HotCarbon 2022. La Jolla, California, USA (July 2022).
Today, the ICT industry has a massive carbon footprint (a few percent of worldwide emissions) and one of the fastest growth rates. The Internet accounts for a large part of that footprint while also being energy-inefficient, i.e., the total energy cost per byte transmitted is very high. Thankfully, there are many ways to improve on the current status; we discuss two relatively unexplored directions in this paper.
Putting network devices to “sleep,” i.e., turning them off, is known to be an efficient way to save energy; we argue that harvesting this potential requires new routing protocols, better suited to devices that switch on and off frequently, and a revised hardware/software co-design. Moreover, we can reduce the embodied carbon footprint by using networking hardware longer, and we argue that this could even be beneficial for reliability! We sketch our first ideas in these directions and outline practical challenges that we (as a community) need to address to make the Internet more sustainable.
Journal of Systems Research. Volume 1 (November 2021).
When designing their performance evaluations, networking researchers often encounter questions such as: How long should a run be? How many runs to perform? How to account for the variability across multiple runs? Despite their best intentions, researchers often answer these questions differently, thus impairing the reproducibility of their evaluations and decreasing the confidence in their results. To support networking researchers, we propose a systematic methodology that streamlines the design and analysis of performance evaluations. Our methodology first identifies the temporal characteristics of variability sources in networking experiments, and then applies rigorous statistical methods to derive performance results with quantifiable confidence, in spite of the inherent variability. We implement this methodology in a software framework called TriScale.
For each performance metric, TriScale computes a variability score that estimates, with a desired confidence, how similar the results would be if the evaluation were repeated; in other words, TriScale quantifies the reproducibility of the performance evaluation. We apply TriScale to four diverse use cases (congestion control, wireless embedded systems, failure detection, video streaming), demonstrating that TriScale helps generalize and strengthen previously published results. Improving the standards of reproducibility in networking is a crucial and complex challenge; with TriScale, we make an important contribution to this endeavor by providing a rational and statistically sound experimental methodology.
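A simplified sketch can illustrate the flavor of such a variability score. The code below is an assumption-laden toy, not TriScale's actual procedure: it uses a plain empirical percentile spread normalized by the median, together with a standard order-statistics bound (1 - p^N) to decide how many runs are needed before an extreme sample bounds a chosen percentile with the desired one-sided confidence. Function names and defaults are hypothetical.

```python
import math

def min_runs(percentile, confidence):
    """Smallest N such that the sample extreme bounds the given
    percentile p with one-sided confidence: 1 - p**N >= confidence."""
    p = max(percentile, 1 - percentile)
    return math.ceil(math.log(1 - confidence) / math.log(p))

def variability_score(run_metrics, percentile=0.75, confidence=0.75):
    """Toy variability score: spread between the empirical (1 - p) and
    p percentiles of the per-run metrics, relative to the median.
    A score near 0 means repeated runs agree closely."""
    n = len(run_metrics)
    if n < min_runs(percentile, confidence):
        raise ValueError("too few runs for the requested confidence")
    xs = sorted(run_metrics)
    lo = xs[math.floor((1 - percentile) * (n - 1))]
    hi = xs[math.ceil(percentile * (n - 1))]
    median = xs[(n - 1) // 2]
    return (hi - lo) / median

# Bounding the 75th percentile with 75% confidence needs 5 runs:
# 1 - 0.75**5 = 0.763 >= 0.75.
assert min_runs(0.75, 0.75) == 5
```

The design choice worth noting is the use of nonparametric order statistics: no distributional assumption is made about the per-run metrics, which matters because networking experiments rarely yield normally distributed results.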