Toward Constructing a Balanced Intrusion Detection Dataset
Keywords: Imbalanced dataset classification, SMOTE, CICIDS2017 dataset, Random Forest, Naïve Bayesian, Multilayer Perceptron
Many Intrusion Detection Systems (IDS) have been proposed over the past decade. Most datasets used for intrusion detection suffer from a class imbalance problem, which limits classifier performance on minority classes. This paper presents a novel class imbalance processing technique for large-scale multiclass datasets, referred to as BMCD. The algorithm adapts the Synthetic Minority Over-sampling Technique (SMOTE) to multiclass datasets in order to improve the detection rate of minority classes while remaining efficient. In this work, we combine five individual CICIDS2017 datasets into one multiclass dataset that contains several types of attacks. To evaluate the algorithm, several machine learning algorithms are applied to the combined dataset with and without BMCD. The experimental results show that BMCD provides an effective solution to imbalanced intrusion detection and outperforms state-of-the-art intrusion detection methods.
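To make the idea of applying SMOTE to a multiclass dataset concrete, the following is a minimal sketch under stated assumptions. It is not the BMCD algorithm, whose details are not given in this abstract. It shows plain SMOTE-style interpolation applied separately to each class that falls below a target size. The function name `smote_multiclass`, its parameters, and the toy feature vectors are hypothetical and chosen only for illustration. The brute-force neighbour search would not scale to a dataset of CICIDS2017's size.

```python
import math
import random
from collections import Counter, defaultdict


def smote_multiclass(X, y, k=5, target=None, seed=0):
    """Oversample every class below `target` samples by SMOTE-style interpolation.

    X: list of feature vectors (lists of floats); y: list of class labels.
    target: desired per-class count; defaults to the majority-class size.
    NOTE: hypothetical helper for illustration, not the paper's BMCD method.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    if target is None:
        target = max(len(rows) for rows in by_class.values())

    # Keep all original samples, then append synthetic ones.
    X_out, y_out = [list(r) for r in X], list(y)
    for label, rows in by_class.items():
        deficit = target - len(rows)
        if deficit <= 0 or len(rows) < 2:
            continue  # large enough already, or no neighbour to interpolate with
        for _ in range(deficit):
            base = rng.choice(rows)
            # k nearest same-class neighbours of the base point (brute force).
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: math.dist(base, r),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed values): one majority class and two minority classes.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2], [9.3, 0.9]]
y = ["BENIGN"] * 6 + ["Bot"] * 2 + ["DDoS"] * 3

X_bal, y_bal = smote_multiclass(X, y, k=2)
print(Counter(y_bal))
```

Each synthetic sample lies on the line segment between a minority sample and one of its same-class nearest neighbours, so no class borrows features from another. Balancing every class up to the majority size is only one possible policy; a different per-class `target` may be preferable when the majority class is very large.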