Cross-Dataset IDS Evaluation — Wahyu Ikbal Maulana

PENS

POLITEKNIK ELEKTRONIKA NEGERI SURABAYA

FINAL PROJECT PROPOSAL

ML · IDS · IoT · IIoT

Researcher: Wahyu Ikbal Maulana
3323600056 · 3 SDT B Applied Data Science

SUPERVISORS
Ferry Astika Saputra
Tita Karlita, S.Kom, M.Kom

01 / 09

Why this research exists · 5 interconnected reasons

1

New Dataset

CIC IIoT 2025 Released

Latest IIoT benchmark — synchronized sensor time-series and network traffic. 50 attack types, 7 categories, 40 industrial devices. More representative than its predecessor.

unb.ca/cic · IIoT-Dataset-2025

2

Threat Landscape

Problem

820K Attacks / Day

Up 46% from the previous year (ORDR). The IoT ecosystem keeps expanding — more devices, wider attack surface, older datasets no longer representative enough.

ordr.net · IoT Security Statistics

3

Dataset Bias

Problem

97.7% Class Imbalance

Attack traffic makes up 97.7% of CICIoT2023. A model that labels every flow as an attack already reaches about 97.7% accuracy, so a reported 99% can mostly reflect the majority class rather than real learning.

CICIoT2023 · class analysis
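To make the imbalance point concrete, here is a minimal sketch in plain Python. It builds a toy label vector that mirrors the 97.7% / 2.3% split quoted above (the 1,000 samples are synthetic, not drawn from CICIoT2023) and scores a trivial "always attack" predictor. Balanced accuracy, the mean of per-class recall, is used here as one possible imbalance-aware metric; the proposal itself does not prescribe it.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall for each class that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Synthetic labels mirroring the 97.7% attack share (assumption: 1,000 samples).
y_true = ["attack"] * 977 + ["benign"] * 23
# Trivial baseline: predict the majority class for every sample.
y_pred = ["attack"] * len(y_true)

baseline_acc = accuracy(y_true, y_pred)
baseline_bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={baseline_acc:.3f} balanced_accuracy={baseline_bal_acc:.3f}")
```

The baseline scores high on plain accuracy while never detecting a single benign flow, which is why per-class metrics matter for this dataset.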

4

ML Implementation

Problem

Dataset → IDS Model

CIC IIoT 2025 is relevant as an ML benchmark for IDS, but IoT security data is complex: multi-class performance typically falls well below binary classification.

Feature Selection · IDS · Benchmark

5

Generalization

Cross-Dataset Eval

A model that performs well on one dataset may not generalize to another domain. Cross-dataset evaluation tests whether the model has learned general attack patterns rather than memorizing its training data.

Cross-domain · Generalization · IDS

02 / 09

Objectives

Research Goals

1

Evaluate the cross-domain generalization capability of ML-based IDS between IoT and IIoT environments.

2

Quantify the generalization gap between in-dataset and cross-dataset evaluation scenarios.

3

Identify robust network traffic features that remain effective across different datasets.

4

Establish a cross-dataset benchmark using Decision Tree, Random Forest, and XGBoost.
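One way to quantify the generalization gap from goal 2 is sketched below. It assumes the gap is defined as the in-dataset score minus the cross-dataset score for the same metric, reported both as an absolute and a relative drop; this definition and the function name `generalization_gap` are illustrative assumptions, and the scores are placeholders rather than experimental results.

```python
def generalization_gap(in_dataset_score, cross_dataset_score):
    """Return (absolute drop, relative drop) between the two evaluation scenarios."""
    absolute_drop = in_dataset_score - cross_dataset_score
    # Relative drop is undefined when the in-dataset score is zero.
    if in_dataset_score > 0:
        relative_drop = absolute_drop / in_dataset_score
    else:
        relative_drop = float("nan")
    return absolute_drop, relative_drop

# Placeholder macro-F1 values chosen only to exercise the function.
# They are NOT measurements from either dataset.
in_score, cross_score = 0.90, 0.60
abs_drop, rel_drop = generalization_gap(in_score, cross_score)
print(f"absolute drop={abs_drop:.2f}, relative drop={rel_drop:.1%}")
```

Reporting the relative drop alongside the absolute one makes models with different in-dataset baselines easier to compare.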

Contributions

Research Benefits

Provides a baseline benchmark for the newly released DataSense (CIC IIoT 2025) dataset.

Supports the development of more reliable and generalizable IDS models for real-world IIoT deployment.

Reduces the risk of overestimating IDS performance caused by single-dataset evaluation.

Contributes empirical evidence on the impact of domain shift between IoT and IIoT datasets.

Serves as a reference for future research on cross-dataset intrusion detection evaluation.

03 / 09

Attribute	CICIoT 2023	CIC IIoT 2025 · DataSense
Domain	Consumer IoT · smart home / campus	Industrial IoT · factory / plant floor
Devices	105 heterogeneous IoT devices	40 IIoT + OT industrial devices
Modality	Network traffic (flow / packet)	Sensor time-series + network traffic
Attacks	33 types · 7 categories	50 types · 7 categories
Features	~48 features per flow	Multi-modal: sensor + network features
Labels	Benign + 33 attack classes	Benign + 50 attack types
Challenge	97.7% class imbalance	Domain shift from IoT to IIoT

CICIoT 2023: unb.ca/cic/datasets/iotdataset-2023.html
CIC IIoT 2025: unb.ca/cic/datasets/iiot-dataset-2025.html

04 / 09

Feature Engineering

Raw traffic → variance threshold → correlation filter → 48-feature subset. PCA for dimensionality reduction before classification. Focus on computational efficiency.

INPUT: Raw packet capture

SELECTION: Variance + Correlation

REDUCTION: PCA · Subset

CLASSIFIER: Random Forest · SVM
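The selection stage of this pipeline (variance threshold, then correlation filter) can be sketched in plain Python as follows. The thresholds `var_min` and `corr_max`, the greedy keep-first strategy, and all feature names and values are assumptions made for illustration; the PCA step is left out to keep the example short.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, var_min=1e-3, corr_max=0.95):
    """columns maps feature name -> list of values. Returns kept names in order."""
    # Step 1: drop near-constant features.
    varied = [name for name, vals in columns.items() if pvariance(vals) > var_min]
    # Step 2: keep a feature only if it is not highly correlated
    # with any feature that has already been kept.
    kept = []
    for name in varied:
        if all(abs(pearson(columns[name], columns[k])) < corr_max for k in kept):
            kept.append(name)
    return kept

# Hypothetical flow features; names and values are invented for this example.
toy = {
    "flow_duration": [1.0, 2.0, 3.0, 4.0, 5.0],
    "packet_count": [2.0, 4.0, 6.0, 8.0, 10.0],  # duplicates flow_duration
    "protocol_flag": [1.0, 1.0, 1.0, 1.0, 1.0],  # constant
    "mean_payload": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = select_features(toy)
print(selected)
```

The greedy filter depends on column order: of two highly correlated features, the one listed first survives.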

Ensemble Learning

Multiple base classifiers (RF, DT, SVM, NB) → majority voting or stacking → ensemble decision. Class imbalance handled with SMOTE or class weighting.

BASE: RF · DT · SVM · NB

COMBINE: Voting · Stacking

IMBALANCE: SMOTE · Class weight

EVAL: Precision · Recall · F1
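Two of the ideas on this slide, hard majority voting and class weighting, are small enough to sketch without any library. The predictions below are invented, and the weighting formula `n_samples / (n_classes * class_count)` is the common "balanced" heuristic, assumed here as one reasonable choice; stacking and SMOTE are not covered.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one label list per classifier, all of equal length."""
    voted = []
    for sample_votes in zip(*predictions):
        # On a tie, most_common keeps first-seen order,
        # so the earliest classifier in the list wins.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}

# Invented outputs of three hypothetical base classifiers on three samples.
rf_pred = ["attack", "benign", "attack"]
dt_pred = ["attack", "attack", "benign"]
nb_pred = ["benign", "benign", "attack"]
ensemble = majority_vote([rf_pred, dt_pred, nb_pred])

# Invented training labels with a 9:1 imbalance.
weights = balanced_class_weights(["attack"] * 9 + ["benign"] * 1)
print(ensemble, weights)
```

The rare class receives the larger weight, so each of its misclassifications costs more during training.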

Deep Learning

Normalized flow → LSTM/CNN for sequential detection. Autoencoder for anomaly detection on encrypted or unlabeled IoT traffic.

INPUT: Time-series network flow

MODEL: LSTM · CNN · Autoencoder

TASK: Anomaly · Classification

EDGE: Encrypted traffic

Cross-Validation

Stratified k-fold CV with per-class metrics. Some studies add cross-dataset testing to validate generalization to other environments.

EVAL: Stratified k-fold CV

METRICS: Precision · Recall · F1

EXTRA: Cross-dataset validation

REPORT: Confusion matrix · ROC
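The idea behind stratified k-fold splitting can be shown with a short plain-Python sketch: each class's samples are dealt round-robin across the folds so that every fold keeps roughly the original class ratio. This is a simplified illustration with no shuffling and invented labels, not a replacement for a library splitter.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, preserving class ratios."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal this class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Invented labels with a 3:1 attack-to-benign ratio.
labels = ["attack"] * 12 + ["benign"] * 4
folds = stratified_folds(labels, k=4)

for i, test_idx in enumerate(folds):
    # Everything outside the current fold forms the training split.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(i, len(train_idx), Counter(labels[j] for j in test_idx))
```

Every test fold here holds three attack samples and one benign sample, so per-class metrics can be computed in each fold.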

05 / 09

06 / 09

System design diagram for the cross-dataset evaluation pipeline

Train on CICIoT2023, then align features and test on CIC IIoT 2025 to measure the generalization gap.
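One possible reading of the "align features" step is sketched below: keep only the feature names both datasets share, in a fixed order, and standardize the target data with statistics fitted on the source data alone. The column names, values, and helper names are hypothetical; neither dataset's real schema is assumed, and the classifier itself is omitted.

```python
from statistics import mean, pstdev

def align_features(source_cols, target_cols):
    """Return the feature names present in both datasets, in source order."""
    target_set = set(target_cols)
    return [name for name in source_cols if name in target_set]

def fit_scaler(rows, features):
    """Compute per-feature (mean, std) on the source rows only."""
    stats = {}
    for f in features:
        values = [row[f] for row in rows]
        stats[f] = (mean(values), pstdev(values) or 1.0)  # guard against zero std
    return stats

def transform(rows, features, stats):
    """Project rows onto the shared features and standardize them."""
    return [[(row[f] - stats[f][0]) / stats[f][1] for f in features] for row in rows]

# Hypothetical rows: the source stands in for CICIoT2023, the target for CIC IIoT 2025.
source_rows = [
    {"flow_duration": 1.0, "packet_count": 10.0, "ttl": 64.0},
    {"flow_duration": 3.0, "packet_count": 30.0, "ttl": 64.0},
]
target_rows = [{"packet_count": 20.0, "flow_duration": 2.0, "sensor_temp": 71.5}]

shared = align_features(list(source_rows[0]), list(target_rows[0]))
stats = fit_scaler(source_rows, shared)  # fitted on the source only
X_source = transform(source_rows, shared, stats)
X_target = transform(target_rows, shared, stats)
print(shared, X_source, X_target)
```

Fitting the scaler on the source only keeps information about the target dataset out of training, which is what makes the cross-dataset score an honest estimate.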

07 / 09

The final result is a three-part answer to the research problem.

Generalization Gap

Measures how far performance drops when the model moves from in-dataset testing to cross-dataset testing.

Robust Feature Analysis

Identifies which features stay useful across both datasets and which ones are too dataset-specific.

Baseline Recommendation

Gives a practical starting point for the best model-feature combination to carry forward.
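A simple way to approach the robust feature analysis is to compare the top-ranked features from each dataset and measure their overlap. The sketch below assumes per-feature importance scores are already available (for example from a tree-based model) and uses the Jaccard index as the overlap measure; the scores and feature names are invented, and the proposal does not commit to this particular method.

```python
def top_k(importances, k):
    """Names of the k features with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def robust_features(imp_a, imp_b, k):
    """Return (features in both top-k lists, Jaccard overlap of the two lists)."""
    top_a, top_b = top_k(imp_a, k), top_k(imp_b, k)
    shared = [f for f in top_a if f in top_b]
    jaccard = len(shared) / len(set(top_a) | set(top_b))
    return shared, jaccard

# Invented importance scores; NOT results from either dataset.
imp_iot = {"flow_duration": 0.40, "packet_count": 0.30, "ttl": 0.20, "header_len": 0.10}
imp_iiot = {"packet_count": 0.35, "iat_mean": 0.30, "flow_duration": 0.25, "ttl": 0.10}

shared_top, overlap = robust_features(imp_iot, imp_iiot, k=3)
print(shared_top, overlap)
```

Features that rank highly in only one dataset are candidates for being dataset-specific, while the shared ones are candidates for a robust subset.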

01 · Mar - Apr

Understand the data

Exploration, cleaning, and a quick look at both datasets before any modeling starts.

02 · Apr - May

Prepare features

Normalization, imbalance handling, and selecting features that are worth keeping.

03 · Jun - Jul

Train and compare

Build DT, RF, and XGBoost models, then test them in-dataset and cross-dataset.

04 · Aug - Oct

Analyze and write

Measure the generalization gap, decide the baseline, and finish the final report and defense.

Thank You