Toward Constructing a Balanced Intrusion Detection Dataset


  • Amer Abulmajeed Abdulrahman Alsameraee Informatics Institute for Post Graduate Studies
  • Mahmood Khalel Ibrahem Al-Nahrain University



Imbalanced dataset classification, SMOTE, CICIDS2017 dataset, Random Forest, Naïve Bayesian, Multilayer Perceptron


Several Intrusion Detection Systems (IDS) have been proposed in the current decade. Most datasets used for intrusion detection suffer from a class imbalance problem, which limits classifier performance on minority classes. This paper presents a novel class imbalance processing technique for large-scale multiclass datasets, referred to as BMCD. The algorithm adapts the Synthetic Minority Over-Sampling Technique (SMOTE) to multiclass datasets in order to improve the detection rate of minority classes while remaining efficient. In this work, five individual CICIDS2017 datasets are combined into one multiclass dataset that contains several types of attacks. To evaluate the algorithm, several machine learning algorithms are applied to the combined dataset with and without BMCD. The experimental results indicate that BMCD provides an effective solution to imbalanced intrusion detection and outperforms state-of-the-art intrusion detection methods.
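
To make the idea of applying SMOTE to a multiclass dataset concrete, the following is a minimal sketch of one plausible reading: every minority class is oversampled, by interpolating between same-class neighbours, until it matches the size of the largest class. This is not the BMCD algorithm itself, whose details are not given in the abstract. The function name smote_multiclass, the parameter k, the balancing target, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def smote_multiclass(X, y, k=5, seed=0):
    """Hypothetical helper: SMOTE-style oversampling applied per class.

    Each class smaller than the largest one receives synthetic samples
    until the sizes match. Classes with a single sample are left as-is
    because interpolation needs at least two points.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(list(row))
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = [list(r) for r in X], list(y)
    for label, rows in by_class.items():
        needed = target - len(rows)
        if needed == 0 or len(rows) < 2:
            continue
        for _ in range(needed):
            base = rng.choice(rows)
            # k nearest same-class neighbours by squared Euclidean distance
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position on the segment base -> other
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy data (invented): one majority class and two minority classes.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.1],
     [5.0, 5.0], [5.2, 5.1], [5.1, 5.3],
     [9.0, 1.0], [9.5, 1.5]]
y = ["benign"] * 6 + ["dos"] * 3 + ["bot"] * 2

X_bal, y_bal = smote_multiclass(X, y, k=2)
print(Counter(y), Counter(y_bal))
```

In this sketch the original samples are kept unchanged and synthetic ones are appended, so a classifier can be trained on the data with and without balancing for comparison. Balancing every class up to the majority size is only one possible target, and on a dataset as large as the combined CICIDS2017 data a capped target may be more practical.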