Research Article | Open Access
Volume 74 | Issue 5 | Year 2026 | Article Id. IJETT-V74I5P130 | DOI: https://doi.org/10.14445/22315381/IJETT-V74I5P130
Two-Stage Ensemble Machine Learning for Network Intrusion Detection
Jimson A. Olaybar, Patrick D. Cerna
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 17 Dec 2025 | 13 Jan 2026 | 11 Mar 2026 | 30 May 2026 |
Citation:
Jimson A. Olaybar, Patrick D. Cerna, "Two-Stage Ensemble Machine Learning for Network Intrusion Detection," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 5, pp. 484-494, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I5P130
Abstract
This paper introduces a two-stage ensemble machine learning architecture for network Intrusion Detection Systems (IDS), designed to improve detection accuracy and classification reliability in an increasingly complex cyber environment. The proposed model operates in two phases. Stage A performs binary classification, separating benign traffic from malicious activity with a calibrated stacking ensemble of Random Forest, Gradient Boosting, and XGBoost classifiers. Stage B performs multi-class attack categorization with a Random Forest classifier trained only on attack samples. The system was evaluated on the CIC-IDS2017 dataset, which contains more than 2.8 million network traffic records covering varied attack scenarios. Preprocessing consisted of feature normalization, filling in of missing values, and selection of 78 flow-based numerical features. In the experiments, the two-stage ensemble achieved 99.92% accuracy in binary classification and 99.83% accuracy in multi-class classification across 14 attack types. The model reached a near-optimal ROC-AUC (0.99987) and mitigated class-imbalance bias through probability estimation and threshold optimization. In comparison, the proposed system outperformed existing ensemble and deep learning methods in both accuracy and computational efficiency. The results point to the potential of multi-level ensemble learning for improving IDS performance in modern network infrastructures. Future work will focus on adaptive learning for unknown threats and on integration with real-time network defenses.
Keywords
Ensemble Learning, Intrusion Detection, Machine Learning, Network Security, Random Forest.
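The two-stage flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: `stage_a_proba` and `stage_b_label` are hypothetical stand-ins for the calibrated stacking ensemble and the attack-only Random Forest, the toy flows and labels are invented for illustration, and maximizing F1 is an assumed criterion for threshold optimization, since the abstract does not state which one was used.

```python
from typing import Callable, List, Sequence

BENIGN = "BENIGN"  # label emitted when Stage A does not flag a flow


def best_f1_threshold(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Return the attack-probability cutoff that maximizes F1 on validation data.

    Assumption: F1 is the optimization target; the paper may use another one.
    labels use 1 for attack and 0 for benign.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(probs)):  # every observed score is a candidate cutoff
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:  # strict comparison keeps the lowest cutoff on ties
            best_t, best_f1 = t, f1
    return best_t


def two_stage_predict(
    flows: Sequence[dict],
    stage_a_proba: Callable[[dict], float],  # hypothetical: P(attack) from Stage A
    stage_b_label: Callable[[dict], str],    # hypothetical: attack type from Stage B
    threshold: float,
) -> List[str]:
    """Route each flow through Stage A, and only flagged flows through Stage B."""
    predictions = []
    for flow in flows:
        if stage_a_proba(flow) >= threshold:
            predictions.append(stage_b_label(flow))  # Stage B sees attacks only
        else:
            predictions.append(BENIGN)
    return predictions


# Toy validation scores and labels (invented) used to tune the Stage A cutoff.
val_probs = [0.1, 0.4, 0.35, 0.8]
val_labels = [0, 0, 1, 1]
threshold = best_f1_threshold(val_probs, val_labels)

# Toy flows with a precomputed score; real flows would carry the 78 features.
flows = [
    {"score": 0.05, "pkts": 3},
    {"score": 0.90, "pkts": 500},
    {"score": 0.60, "pkts": 10},
]
predicted = two_stage_predict(
    flows,
    stage_a_proba=lambda f: f["score"],
    stage_b_label=lambda f: "DDoS" if f["pkts"] > 100 else "PortScan",
    threshold=threshold,
)
```

Because Stage B only ever receives flows that Stage A has flagged, it can be trained on attack samples alone, as the abstract describes; benign flows that Stage A misclassifies as attacks would still receive an attack label in this sketch.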