Statistical models for the characterization, identification and mitigation of distributed attacks in data networks
Abstract
The thesis focuses on statistical approaches to model, mitigate, and prevent distributed network attacks.
When dealing with distributed network attacks (and, more generally, with cyber-security problems), three
fundamental phases/issues emerge distinctly.
The first issue concerns the threat propagation across the network, which entails an "avalanche" effect,
with the number of infected nodes increasing exponentially as time elapses.
The second issue regards the design of proper mitigation strategies (e.g., threat detection, attacker
identification) aimed at containing the propagation phenomenon. Finally (and this is the third issue), it is
also desirable to act on the system infrastructure to ensure a conservative design by adding some controlled
degree of redundancy, in order to face those cases where the attacker has not yet been defeated.
The contributions of the present thesis address the aforementioned relevant issues, namely, propagation,
mitigation and prevention of distributed network attacks. A brief summary of the main contributions is
reported below.
The first contribution concerns the adoption of Kendall’s birth-and-death process as an analytical model for
threat propagation. Such a model exhibits two main properties: i) it is a stochastic model (a desirable
requirement to embody the complexity of real-world networks) whereas many models are purely
deterministic; ii) it is able to capture the essential features of threat propagation through a few parameters
with a clear physical meaning. By exploiting the remarkable properties of Kendall’s model, the exact
solution for the optimal resource allocation problem (namely, the optimal mitigation policy) has been
provided both for perfectly known parameters and for unknown parameters (with the latter case
being solved through a Maximum-Likelihood estimator).
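
As a rough illustration of the propagation model, the sketch below simulates a linear birth-and-death process with constant rates and recovers the rates by maximum likelihood. It is a minimal sketch under stated assumptions, not the estimator derived in the thesis: the rates `lam` (infection) and `mu` (curing) are assumed constant in time, the process is assumed to be observed continuously, and all function names are hypothetical. Under these assumptions the standard maximum-likelihood estimates are the number of births (deaths) divided by the time-integrated population size.

```python
import random


def simulate_birth_death(lam, mu, n0, t_max, rng):
    """Gillespie simulation of a linear birth-and-death process.

    Assumption: each of the n infected nodes independently infects a new
    node at constant rate lam and is cured at constant rate mu.
    Returns (births, deaths, exposure, n_final), where exposure is the
    integral of the population size over [0, t_max].
    """
    t, n = 0.0, n0
    births = deaths = 0
    exposure = 0.0
    while n > 0:
        total_rate = (lam + mu) * n
        dt = rng.expovariate(total_rate)  # waiting time to the next event
        if t + dt > t_max:
            exposure += n * (t_max - t)  # no further event before t_max
            break
        t += dt
        exposure += n * dt
        if rng.random() < lam / (lam + mu):
            n += 1  # birth: one more infected node
            births += 1
        else:
            n -= 1  # death: one infected node is cured
            deaths += 1
    return births, deaths, exposure, n


def ml_estimates(births, deaths, exposure):
    """Maximum-likelihood estimates of (lam, mu) from a continuously observed path."""
    return births / exposure, deaths / exposure


if __name__ == "__main__":
    rng = random.Random(0)
    b, d, s, n_final = simulate_birth_death(1.0, 0.5, 50, 3.0, rng)
    lam_hat, mu_hat = ml_estimates(b, d, s)
    print(f"final size: {n_final}, lam_hat: {lam_hat:.3f}, mu_hat: {mu_hat:.3f}")
```

With `lam > mu` the expected number of infected nodes grows as `n0 * exp((lam - mu) * t)`, which is the exponential "avalanche" behaviour described above.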
The second contribution pertains to the formalization of a novel kind of randomized Distributed Denial of
Service (DDoS) attack. In particular, a botnet (a network of malicious entities) is able to emulate some
normal traffic by picking messages from a dictionary of admissible requests. Such a model makes it possible to
quantify the botnet “learning ability”, and to ascertain the real nature of users (normal or bot) via an
indicator referred to as MIR (Message Innovation Rate). Exploiting the considered model, an algorithm that
identifies a botnet (possibly) hidden in the network has been devised. The results are then
extended to the case of a multi-cluster environment, where different botnets are concurrently present in
the network, and an algorithm to identify the different clusters is conceived.
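
The abstract does not define the MIR formally. The sketch below therefore relies on an assumed, simplified reading: the MIR of a group of users is the number of distinct messages the group produces per unit time. Under that assumption, bots that draw from a shared emulation dictionary add few new messages when their streams are merged, whereas independent normal users keep adding new ones. The function names and the toy request strings are hypothetical, and this is not the identification algorithm of the thesis.

```python
def message_innovation_rate(messages, duration):
    """Distinct messages observed per unit time (assumed, simplified MIR)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return len(set(messages)) / duration


def joint_innovation_rate(streams, duration):
    """Innovation rate of the merged traffic of several users."""
    merged = [message for stream in streams for message in stream]
    return message_innovation_rate(merged, duration)


if __name__ == "__main__":
    # Toy traffic observed over 4 time units (hypothetical requests).
    bot_a = ["/a", "/b", "/c", "/a"]
    bot_b = ["/b", "/c", "/a", "/c"]   # same dictionary as bot_a
    user = ["/x", "/y", "/z", "/w"]    # independent normal user

    print(joint_innovation_rate([bot_a, bot_b], 4))  # merging adds no new messages
    print(joint_innovation_rate([bot_a, user], 4))   # merging adds all of user's messages
```

In this toy case the merged rate of the two bots equals the rate of a single bot, while merging a bot with the normal user yields the sum of their individual rates. A gap of this kind is what a MIR-based test could exploit.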
The third contribution concerns the formalization of the network resilience problem and the consequent
design of a prevention strategy. Two statistical frameworks are proposed to model the high availability
requirements of network infrastructures, namely, the Stochastic Reward Network (SRN), and the Universal
Generating Function (UGF) frameworks. In particular, since dealing with multi-dimensional quantities is
crucial in the network environment, an extension of the classic UGF framework, called Multi-dimensional UGF
(MUGF), is devised. [edited by author]
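
To make the UGF idea concrete, the sketch below represents the performance distribution of a component as a mapping from performance level to probability and combines two components through a composition operator, as in the classic UGF technique. The multi-dimensional variant shown here simply uses tuples as performance levels. That representation is an assumption made for illustration, since the abstract does not give the MUGF definition, and all names and numbers are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compose(u1, u2, op):
    """Combine two UGFs given as {performance: probability} dicts.

    op maps the two component performances to the system performance,
    e.g. a sum for parallel capacity or min for a series connection.
    """
    out = defaultdict(float)
    for (g1, p1), (g2, p2) in product(u1.items(), u2.items()):
        out[op(g1, g2)] += p1 * p2
    return dict(out)


def availability(u, demand_met):
    """Probability that the system performance satisfies the demand."""
    return sum(p for g, p in u.items() if demand_met(g))


def vec_add(a, b):
    """Element-wise sum of two performance vectors (tuples)."""
    return tuple(x + y for x, y in zip(a, b))


if __name__ == "__main__":
    # Scalar case: two redundant servers, each delivering capacity 1 w.p. 0.9.
    server = {1: 0.9, 0: 0.1}
    pair = compose(server, server, lambda a, b: a + b)
    print(availability(pair, lambda g: g >= 1))

    # Vector case (assumed MUGF-style representation): two performance
    # dimensions per node, both of which must meet the demand.
    node = {(10, 4): 0.95, (0, 0): 0.05}
    pair2 = compose(node, node, vec_add)
    print(availability(pair2, lambda g: g[0] >= 10 and g[1] >= 4))
```

Adding a redundant component raises the probability of meeting the demand, which is the kind of trade-off a redundancy-based prevention strategy has to quantify.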