Machine Learning-based Detection of C&C Channels with a Focus on the Locked Shields Cyber Defense Exercise

Authors: Nicolas Känzig, Roland Meier, Luca Gambazzi, Vincent Lenders, and Laurent Vanbever
2019 11th International Conference on Cyber Conflict (CyCon)

Abstract

The diversity of applications and devices in enterprise networks combined with large traffic volumes makes it inherently challenging to quickly identify malicious traffic. When incidents occur, emergency response teams often lose precious time reverse-engineering the network topology and configuration before they can focus on malicious activities and digital forensics. In this paper, we present a system that quickly and reliably identifies Command and Control (C&C) channels without prior network knowledge. The key idea is to train a classifier using network traffic from attacks that happened in the past and use it to identify C&C connections in the current traffic of other networks. Specifically, we leverage the fact that - while benign traffic differs - malicious traffic bears similarities across networks (e.g., devices participating in a botnet act in a similar manner irrespective of their location). To ensure performance and scalability, we use a random forest classifier based on a set of computationally efficient features tailored to the detection of C&C traffic. In order to prevent attackers from outwitting our classifier, we tune the model parameters to maximize robustness. We measure high resilience against possible attacks - e.g., attempts to camouflage C&C flows as benign traffic - and against packet loss during inference. We have implemented our approach and we show its practicality on a real use case: Locked Shields, the world’s largest cyber defense exercise. In Locked Shields, defenders have limited resources to protect a large, heterogeneous network against unknown attacks. Using recorded datasets (from 2017 and 2018) from a participating team, we show that our classifier is able to identify C&C channels with 99% precision and over 90% recall in near real time and with realistic resource requirements. If the team had used our system in 2018, it would have discovered 10 out of 12 C&C servers in the first hours of the exercise.

People

Dr. Roland Meier
PhD student
2017—2022

BibTeX

@INPROCEEDINGS{kanzig2019machine,
	isbn = {978-9949-9904-5-0},
	doi = {10.23919/CYCON.2019.8756814},
	year = {2019},
	month = jul,
	booktitle = {2019 11th International Conference on Cyber Conflict (CyCon)},
	type = {Conference Paper},
	author = {Känzig, Nicolas and Meier, Roland and Gambazzi, Luca and Lenders, Vincent and Vanbever, Laurent},
	size = {19 p.},
	keywords = {Malware; Botnets; Machine learning; Digital forensics; Locked Shields; Network defense},
	language = {en},
	address = {Piscataway, NJ},
	publisher = {IEEE},
	title = {Machine Learning-based Detection of C\&C Channels with a Focus on the Locked Shields Cyber Defense Exercise},
	pages = {8756814},
	note = {11th International Conference on Cyber Conflict (CyCon 2019); Conference Location: Tallinn, Estonia; Conference Date: May 28-31, 2019}
}

Research Collection: 20.500.11850/355030