What question did this study set out to answer?

This research aims to introduce the TRUST Lab dataset, designed to enhance the evaluation of intrusion detection systems for IoT and edge environments.

May 9, 2026 · Open Access

TRUSTLab dataset: a real-world CICFlowMeter dataset for IoT/edge intrusion detection

Key Points

Introduces the TRUST Lab dataset, designed to improve the evaluation of intrusion detection systems (IDS) for IoT and edge environments.
Generated in an operational testbed replicating enterprise-grade services.
Includes 15 attack families, processed into 16 single-class files totaling approximately 4.6 million bi-flows.
Statistical analyses confirm the presence of discriminative signals without requiring payload inspection.
A baseline binary classifier achieved a ROC-AUC of 0.9676 and a recall of 0.95, supporting the dataset's utility for lightweight, edge-oriented IDS evaluation.
A multiclass benchmark reported per-family precision, recall, and F1-scores, with the main residual confusion concentrated in low-and-slow and HTTP-based vectors.
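
For readers less familiar with the two headline metrics, the following is a minimal sketch of how recall and ROC-AUC can be computed for a binary attack/benign classifier. It is not the authors' evaluation code: the function names, the toy labels, and the scores are all assumptions made for illustration, and the pairwise ROC-AUC shown here is only practical for small samples.

```python
def recall(y_true, y_pred):
    """Fraction of attack flows (label 1) that were flagged as attacks."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Probability that a random attack flow outscores a random benign flow.

    Ties count as half a win. This pairwise form is O(n^2), so it suits
    small illustrative samples rather than millions of flows.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one flow of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 0 = benign, 1 = attack.
y_true = [0, 0, 1, 1, 1]
scores = [0.1, 0.6, 0.4, 0.8, 0.9]            # classifier attack scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

toy_recall = recall(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)
```

Recall depends on the chosen decision threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the two figures are reported together.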

Abstract

Introduction: Intrusion Detection Systems (IDS) for Internet of Things (IoT) and edge environments require datasets with unambiguous labels, yet existing datasets often mix benign and malicious traffic within the same capture window, producing ambiguous flow labels that may distort model evaluation.

Methods: This work introduces the TRUST Lab dataset, a flow-based traffic collection generated in an operational testbed reproducing enterprise-grade services and modern application interfaces. The dataset follows a single-class session policy, whereby each capture contains exclusively benign traffic or a single attack family, preventing temporal overlap and ensuring label integrity at the bi-flow level. The dataset includes 15 attack families spanning volumetric flooding, reconnaissance, application-layer exploits, protocol manipulation, evasive techniques, and persistence vectors. Traffic was processed into 16 single-class files totaling approximately 4.6 million bi-flows with 80 features per flow.

Results: Comprehensive statistical analyses confirm the presence of discriminative signals without requiring payload inspection. A baseline binary classifier achieved an Area Under the Receiver Operating Characteristic Curve (ROC-AUC) of 0.9676 and a recall of 0.95, supporting the dataset’s utility for lightweight, edge-oriented IDS evaluation. The multiclass benchmark further reported per-family precision, recall, and F1-scores, with the main residual confusion concentrated in low-and-slow and HTTP-based vectors.

Discussion: By enforcing session-level class separation and preserving bi-flow label integrity, TRUST Lab provides a reproducible dataset for evaluating IDS models in IoT and edge environments. The dataset is publicly available to support further research.
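
One practical consequence of the single-class session policy is that a flow's label can be taken from the file it came from, with no per-flow timestamp matching. The sketch below illustrates that idea under stated assumptions: the family names, the two-column CSV excerpts, and the helper `load_single_class` are hypothetical, and the real files are described as carrying 80 features per flow.

```python
import csv
import io


def load_single_class(csv_text, family):
    """Parse one single-class capture and tag every flow with its family."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["family"] = family  # the label comes from the file, not the flow
        row["is_attack"] = 0 if family == "benign" else 1
        rows.append(row)
    return rows


# Hypothetical in-memory excerpts standing in for two of the single-class files.
captures = {
    "benign": "Flow Duration,Tot Fwd Pkts\n1200,4\n900,3\n",
    "syn_flood": "Flow Duration,Tot Fwd Pkts\n15,1\n",
}

flows = []
for family, text in captures.items():
    flows.extend(load_single_class(text, family))
```

Because every capture holds exactly one class, merging the files yields a labeled table for both the binary task (`is_attack`) and the multiclass task (`family`) without any ambiguity from overlapping capture windows.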
