Elaris Computing Nexus
Received On : 30 May 2025
Revised On : 06 July 2025
Accepted On : 24 July 2025
Published On : 05 August 2025
Volume 01, 2025
Pages : 108-118
Machine learning-based classifiers face a challenge in cybersecurity: multiclass situations are difficult to detect, so it is hard to decide what the threat is. This study proposes a new probability calibration method that combines class-based and global normalization strategies. The goal is to make predicted probabilities more reliable without changing classification accuracy. The method is evaluated on three recent and diverse cybersecurity datasets: EMBER2024, CICAPT-IIoT2024, and UGRansome2024, which together cover malware, IIoT attacks, and ransomware scenarios. Several standard classifiers (Logistic Regression, Random Forest, Support Vector Machine, and XGBoost) were evaluated before and after applying the calibration approach. Across all datasets, the method consistently improved key calibration measures: Log Loss, Brier Score, and Expected Calibration Error (ECE). It did not reduce Accuracy or F1 scores, and may have raised them slightly. On the EMBER2024 dataset, ECE fell from 0.148 to 0.041, a substantial improvement indicating that the predicted probabilities are considerably more in line with the observed outcomes. The study's visualizations also show that the calibrated predictions assign more appropriate confidence to the correct classes, which reduces both overconfidence and underconfidence. These findings matter for cybersecurity because precise probability estimates help prioritize risks, cut down on false alarms, and support better decisions. Together, the numerical results and the visual analysis make a solid case that this calibration approach is both reliable and useful for detecting threats across multiple classes.
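For readers unfamiliar with the three calibration measures named above, the following is a minimal sketch of how they are commonly computed for multiclass predictions. It is not the paper's CGC method or its evaluation code. The function names, the equal-width 10-bin scheme for ECE, and the toy probabilities are all assumptions made for illustration; the abstract does not state which binning or Brier Score variant the study used.

```python
import numpy as np

def multiclass_log_loss(probs, labels, eps=1e-12):
    """Mean negative log of the probability assigned to the true class."""
    p_true = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(p_true, eps, 1.0))))

def multiclass_brier(probs, labels):
    """Mean squared distance between probability vectors and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def expected_calibration_error(probs, labels, n_bins=10):
    """Top-label ECE with equal-width confidence bins (an assumed, common variant)."""
    conf = probs.max(axis=1)                    # confidence of the predicted class
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            # weight each bin's |accuracy - confidence| gap by its share of samples
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)

# Toy, made-up predictions for four samples over three classes.
toy_probs = np.array([
    [0.85, 0.10, 0.05],
    [0.20, 0.65, 0.15],
    [0.30, 0.25, 0.45],
    [0.55, 0.35, 0.10],
])
toy_labels = np.array([0, 1, 2, 1])

print("log loss:", multiclass_log_loss(toy_probs, toy_labels))
print("brier   :", multiclass_brier(toy_probs, toy_labels))
print("ece     :", expected_calibration_error(toy_probs, toy_labels))
```

Lower values are better for all three measures. A calibration step that only rescales probabilities without changing the top-ranked class would leave Accuracy and F1 unchanged while moving these numbers.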
Probability Calibration, Multiclass Classification, Cybersecurity, EMBER2024, IIoT, Ransomware, Log Loss, Brier Score, Expected Calibration Error.
The author reviewed the results and approved the final version of the manuscript.
The author(s) received no financial support for the research, authorship, and/or publication of this article.
No funding was received to assist with the preparation of this manuscript.
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
The datasets used in this study are publicly available and can be accessed through the following sources: EMBER2024 (https://emberdataset.com), CICAPT-IIoT2024 (https://www.unb.ca/cic/datasets/iiot.html), and UGRansome2024 (https://www.gti.ssr.upm.es/datasets/UGRansome). These datasets contain comprehensive cybersecurity features and labeled attack classes, which were used to train and evaluate the baseline classifiers and the proposed Class-Global Calibration (CGC) framework.
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Corresponding Author
Open Access: This article is licensed under a Creative Commons Attribution-NoDerivs license, a more restrictive license. It allows redistribution of the material, commercially or non-commercially, but users cannot make any changes to the original, i.e., no derivatives of the original work. To view a copy of this license, visit: https://creativecommons.org/licenses/by-nc-nd/4.0/
Abdulhaq Abildtrup, “Reliable Cybersecurity Threat Detection through Probability Calibration in Multiclass Classification”, Elaris Computing Nexus, pp. 108-118, 2025, doi: 10.65148/ECN/2025011.
© 2025 Abdulhaq Abildtrup. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.