Learning distributions to detect anomalies using all the network traffic

Anomaly detection is an essential building block of many applications, including DDoS detection, root cause analysis, traffic estimation, and change detection. A vital part of detecting anomalies is establishing a sense of normality, e.g., by learning distributions for various features from benign traffic. Learning these distributions in the control plane requires coping with the limited visibility of sampling; learning distributions in the data plane requires relying on simplistic techniques because of hardware constraints.

We propose a novel data- and control-plane co-design for learning distributions: in the control plane, we search for candidate distributions with Bayesian optimization; in the data plane, we evaluate how well each distribution matches all observed traffic, without missing rare events. The aggregated evaluation results are fed back to the control plane to guide the optimization and learn accurate distributions. Our key insight is that while learning and optimization are infeasible in the data plane, evaluating distributions is feasible and leverages data plane strengths. We confirm the feasibility of our approach with a preliminary evaluation.
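The propose, evaluate, and feed-back loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the feature is a single non-negative value per packet, candidates are exponential distributions parameterized by a hypothetical `rate`, the data plane is emulated by per-bin counters over hypothetical `BIN_EDGES`, the mismatch score is an L1 distance between observed and expected bin fractions, and plain random search stands in for Bayesian optimization to keep the example dependency-free.

```python
import math
import random

# Hypothetical bin edges for the monitored feature (assumption, not from the paper).
BIN_EDGES = [0.0, 0.5, 1.0, 2.0, 4.0, math.inf]


def expected_bin_probs(rate):
    """Probability mass that an Exponential(rate) candidate assigns to each bin."""
    cdf = [1.0 - math.exp(-rate * e) if math.isfinite(e) else 1.0 for e in BIN_EDGES]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def data_plane_evaluate(traffic, rate):
    """Emulated data plane: count every observation per bin (no sampling),
    then return one aggregated mismatch score for the candidate."""
    counts = [0] * (len(BIN_EDGES) - 1)
    for x in traffic:
        for i in range(len(counts)):
            if BIN_EDGES[i] <= x < BIN_EDGES[i + 1]:
                counts[i] += 1
                break
    total = len(traffic)
    expected = expected_bin_probs(rate)
    # L1 distance between observed and expected bin fractions (assumed metric).
    return sum(abs(c / total - p) for c, p in zip(counts, expected))


def control_plane_search(traffic, rounds=100, seed=0):
    """Emulated control plane: propose candidates and keep the best-scoring one.
    Random search is a stand-in here for Bayesian optimization."""
    rng = random.Random(seed)
    best_rate, best_score = None, math.inf
    for _ in range(rounds):
        rate = rng.uniform(0.1, 5.0)                 # propose a candidate
        score = data_plane_evaluate(traffic, rate)   # aggregated feedback
        if score < best_score:
            best_rate, best_score = rate, score
    return best_rate, best_score


if __name__ == "__main__":
    gen = random.Random(1)
    benign = [gen.expovariate(1.5) for _ in range(5000)]  # synthetic benign traffic
    print(control_plane_search(benign))
```

In this sketch the control plane only ever sees one score per candidate, which mirrors the division of labor in the text: the expensive search stays in the control plane, while the per-packet work is limited to counter updates.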

Research Areas: Data-Driven Networking; Network Analysis and Reasoning