detection_of_IoT_botnet_attacks_N_BaIoT

Donated on 3/18/2018

This dataset addresses the lack of public botnet datasets, especially for the IoT. It provides *real* traffic data gathered from 9 commercial IoT devices authentically infected by Mirai and BASHLITE.

Dataset Characteristics

Multivariate, Sequential

Subject Area

Computer Science

Associated Tasks

Classification, Clustering

Feature Type

Real

# Instances

7062606

# Features

-

Dataset Information

Additional Information

(a) Attribute being predicted:
-- Originally we aimed at distinguishing between benign and malicious traffic data by means of anomaly detection techniques.
-- However, as the malicious data can be divided into 10 attacks carried out by 2 botnets, the dataset can also be used for multi-class classification: 10 classes of attacks, plus 1 class of 'benign'.

(b) The study's results:
-- For each of the 9 IoT devices we trained and optimized a deep autoencoder on 2/3 of its benign data (i.e., the training set of each device). This was done to capture normal network traffic patterns.
-- The test data of each device comprised the remaining 1/3 of benign data plus all the malicious data. On each test set we applied the respective trained (deep) autoencoder as an anomaly detector. The detection of anomalies (i.e., the cyberattacks launched from each of the above IoT devices) concluded with 100% TPR.
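The per-device evaluation protocol in (b) can be sketched as follows. This is a minimal illustration, not the study's code. The function names are hypothetical, and the split is assumed to keep the records in their original order (the page does not say whether they were shuffled). The anomaly scores are assumed to come from an already-trained detector; the study used a deep autoencoder, which is not reproduced here. The threshold is a free parameter.

```python
def split_benign(benign_rows):
    """Split one device's benign records into 2/3 train and 1/3 test.

    Assumption: records are kept in their original order.
    """
    rows = list(benign_rows)
    cut = 2 * len(rows) // 3
    return rows[:cut], rows[cut:]


def build_test_set(benign_test, malicious_rows):
    """Combine held-out benign data with all malicious data.

    Returns (rows, labels), where label 0 = benign and 1 = malicious.
    """
    rows = list(benign_test) + list(malicious_rows)
    labels = [0] * len(benign_test) + [1] * len(malicious_rows)
    return rows, labels


def detection_rates(scores, labels, threshold):
    """Compute (TPR, FPR) when every score above `threshold` is flagged.

    `scores` are anomaly scores from any detector (hypothetical input),
    aligned one-to-one with `labels`.
    """
    flagged = [s > threshold for s in scores]
    tp = sum(f and y == 1 for f, y in zip(flagged, labels))
    fp = sum(f and y == 0 for f, y in zip(flagged, labels))
    positives = sum(labels)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else float("nan")
    fpr = fp / negatives if negatives else float("nan")
    return tpr, fpr
```

A detector would be fitted on the first part returned by `split_benign` only, then scored on the rows returned by `build_test_set`. A TPR of 1.0 from `detection_rates` corresponds to the 100% TPR reported above.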

Has Missing Values?

No

Variables Table

Variable Name | Role | Type | Demographic | Description | Units | Missing Values

(115 variables in total. The rows shown on the page list only "no" under Missing Values; all other fields are empty.)

Additional Variable Information

-- The following describes each of the feature headers:

* Stream aggregation:
H: Stats summarizing the recent traffic from this packet's host (IP)
HH: Stats summarizing the recent traffic going from this packet's host (IP) to the packet's destination host.
HpHp: Stats summarizing the recent traffic going from this packet's host+port (IP) to the packet's destination host+port. Example: 192.168.4.2:1242 -> 192.168.4.12:80
HH_jit: Stats summarizing the jitter of the traffic going from this packet's host (IP) to the packet's destination host.

* Time-frame (the decay factor Lambda used in the damped window): How much recent history of the stream is captured in these statistics
L5, L3, L1, ...

* The statistics extracted from the packet stream:
weight: The weight of the stream (can be viewed as the number of items observed in recent history)
mean: ...
std: ...
radius: The root squared sum of the two streams' variances
magnitude: The root squared sum of the two streams' means
cov: an approximated covariance between two streams
pcc: an approximated correlation coefficient between two streams
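To make the damped-window statistics more concrete, here is a minimal sketch of how weight, mean, and std can be maintained incrementally, and how magnitude and radius combine two streams. It rests on several assumptions not stated on this page: the decay between two observations is taken to be 2^(-Lambda * dt), the class and function names are hypothetical, and "root squared sum" is read as the square root of the sum of squares. It is an illustration, not the feature extractor used to build the dataset.

```python
import math


class DampedStats:
    """Incremental statistics of one stream over a damped window."""

    def __init__(self, lam):
        self.lam = lam       # decay factor Lambda (e.g. 5, 3, 1)
        self.w = 0.0         # weight: decayed count of observations
        self.ls = 0.0        # decayed linear sum of values
        self.ss = 0.0        # decayed sum of squared values
        self.t_last = None   # timestamp of the previous observation

    def update(self, x, t):
        """Decay the old state by the elapsed time, then add value x."""
        if self.t_last is not None:
            # Assumed decay function: 2^(-Lambda * elapsed time).
            decay = 2.0 ** (-self.lam * (t - self.t_last))
            self.w *= decay
            self.ls *= decay
            self.ss *= decay
        self.t_last = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    @property
    def mean(self):
        return self.ls / self.w if self.w else 0.0

    @property
    def var(self):
        return abs(self.ss / self.w - self.mean ** 2) if self.w else 0.0

    @property
    def std(self):
        return math.sqrt(self.var)


def magnitude(a, b):
    """Root of the sum of the two streams' squared means."""
    return math.sqrt(a.mean ** 2 + b.mean ** 2)


def radius(a, b):
    """Root of the sum of the two streams' squared variances."""
    return math.sqrt(a.var ** 2 + b.var ** 2)
```

Under this reading, a larger Lambda makes old observations fade faster, so the statistics summarize a shorter stretch of recent history.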

Creators

Yair Meidan

Michael Bohadana

Yael Mathov

Yisroel Mirsky

Dominik Breitenbacher

Asaf Shabtai
