A robust domain partitioning intrusion detection method

MWITONDI, Kassim, SAID, Raed A and ZARGARI, Shahrzad (2019). A robust domain partitioning intrusion detection method. Journal of Information Security and Applications, 48.

PDF
Mwitondi-RobustDomainPartitioning(AM).pdf - Accepted Version
Restricted to Repository staff only until 16 July 2020.
Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://www.sciencedirect.com/science/article/pii/...
Link to published version: https://doi.org/10.1016/j.jisa.2019.102360

Abstract

The capacity of data mining algorithms to learn rules from data is influenced by, inter alia, the random nature of training and test data as well as by the diversity of domain partitioning models. Isolating normal from malicious data traffic across networks is one regular task that is naturally affected by that randomness and diversity. We propose a robust algorithm, Sample-Measure-Assess (SMA), that detects intrusion based on rules learnt from multiple samples. We adapt data obtained from a set of simulations, capturing data attributes identifiable by number of bytes, destination and source of packets, protocol and nature of data flows (normal and abnormal) as well as IP addresses. A fixed sample of 82,332 observations on 27 variables was drawn from a superset of 2.54 million observations on 49 variables, and multiple samples were then repeatedly extracted from the former and used to train and test multiple versions of classifiers via the algorithm. With two class labels, binary and multi-class, the dataset presents a classic example of masked and spurious groupings, making an ideal case for concept learning. The algorithm learns a model for the underlying distributions of the samples and it provides mechanics for model assessment. These settings account for our method’s novelty, i.e., its ability to learn concept rules from highly masked to highly spurious cases while observing model robustness. A comparative analysis of Random Forests and individually grown trees shows that we can circumvent the former’s dependence on multicollinearity of the trees and their individual strength in the forest by proceeding from dimensional reduction to classification using individual trees. Given data of similar structure, the algorithm can order the models in terms of optimality, which means our work can contribute towards understanding the concept of normal and malicious flows across tools. The algorithm yields results that are less sensitive to violated distributional assumptions and, hence, it yields robust parameters and provides a generalisation that can be monitored and adapted to specific low levels of variability. We discuss its potential for deployment with other classifiers and potential for extension into other applications, simply by adapting the objectives to specific conditions.
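The abstract does not give the steps of SMA, so the following is only a minimal sketch of the general idea it describes: draw repeated samples from a fixed dataset, train and test a classifier on each, and summarise how the test error varies. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`sample_measure_assess`, `fit_stump`, `make_toy_flows`), the synthetic two-feature data, the 70/30 split, and the use of a depth-one tree in place of the trees and Random Forests studied in the paper.

```python
import random
import statistics


def fit_stump(rows):
    """Fit a depth-one tree: one feature, one threshold, chosen to minimise training error."""
    best = None
    n_features = len(rows[0][0])
    for j in range(n_features):
        for x, _ in rows:
            t = x[j]
            for sign in (1, -1):
                errors = sum(
                    1 for xi, yi in rows
                    if (1 if sign * (xi[j] - t) > 0 else 0) != yi
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda x: 1 if sign * (x[j] - t) > 0 else 0


def error_rate(model, rows):
    """Proportion of rows the model misclassifies."""
    return sum(1 for x, y in rows if model(x) != y) / len(rows)


def sample_measure_assess(data, n_repeats=20, sample_size=200, train_frac=0.7, seed=0):
    """Hypothetical illustration: repeated sampling, per-sample test error, summary statistics."""
    rng = random.Random(seed)
    test_errors = []
    for _ in range(n_repeats):
        # Sample: draw a random subset without replacement and split it.
        sample = rng.sample(data, sample_size)
        cut = int(train_frac * sample_size)
        train, test = sample[:cut], sample[cut:]
        # Measure: train on one part, record the error on the held-out part.
        model = fit_stump(train)
        test_errors.append(error_rate(model, test))
    # Assess: low spread across samples suggests a more robust model.
    return {
        "mean_error": statistics.mean(test_errors),
        "sd_error": statistics.stdev(test_errors),
        "errors": test_errors,
    }


def make_toy_flows(n, seed=1):
    """Synthetic stand-in for labelled flows: (features, label), label 1 = abnormal."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(3.0 if label else 0.0, 1.0)  # separates the classes
        noise = rng.gauss(0.0, 1.0)                          # carries no signal
        data.append(((informative, noise), label))
    return data


if __name__ == "__main__":
    summary = sample_measure_assess(make_toy_flows(1000))
    print(f"mean test error: {summary['mean_error']:.3f}")
    print(f"sd of test error: {summary['sd_error']:.3f}")
```

Running the same loop with a different classifier in place of `fit_stump` would give a second set of error summaries, which is one simple way to compare models by both accuracy and stability across samples.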

Item Type: Article
Identification Number: https://doi.org/10.1016/j.jisa.2019.102360
SWORD Depositor: Symplectic Elements
Depositing User: Symplectic Elements
Date Deposited: 18 Jul 2019 13:41
Last Modified: 19 Jul 2019 13:45
URI: http://shura.shu.ac.uk/id/eprint/24877
