detection_of_IoT_botnet_attacks_N_BaIoT

Donated on 3/18/2018

This dataset addresses the lack of public botnet datasets, especially for the IoT. It provides *real* traffic data, gathered from 9 commercial IoT devices authentically infected by Mirai and BASHLITE.

Dataset Characteristics

Multivariate, Sequential

Subject Area

Computer Science

Associated Tasks

Classification, Clustering

Feature Type

Real

# Instances

7062606

# Features

115

Dataset Information

Additional Information

(a) Attribute being predicted:
-- Originally we aimed at distinguishing between benign and malicious traffic data by means of anomaly detection techniques.
-- However, as the malicious data can be divided into 10 attacks carried out by 2 botnets, the dataset can also be used for multi-class classification: 10 classes of attacks, plus 1 class of 'benign'.

(b) The study's results:
-- For each of the 9 IoT devices we trained and optimized a deep autoencoder on 2/3 of its benign data (i.e., the training set of each device). This was done to capture normal network traffic patterns.
-- The test data of each device comprised the remaining 1/3 of benign data plus all the malicious data. On each test set we applied the respective trained (deep) autoencoder as an anomaly detector. The detection of anomalies (i.e., the cyberattacks launched from each of the above IoT devices) concluded with 100% TPR.
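The evaluation protocol described in (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the function names are hypothetical, and a simple per-feature z-score detector stands in for the deep autoencoder used in the study. The threshold rule (maximum training score) is likewise an assumption, not the study's procedure. Only the 2/3 benign training split, the test set of 1/3 benign plus all malicious data, and the TPR measurement come from the description above.

```python
import math
import random


def split_benign(benign_rows):
    """Use the first 2/3 of the benign rows for training, the rest for testing."""
    cut = (2 * len(benign_rows)) // 3
    return benign_rows[:cut], benign_rows[cut:]


def fit_baseline(train_rows):
    """Per-feature mean and standard deviation of the benign training rows."""
    n = len(train_rows)
    dims = len(train_rows[0])
    means = [sum(r[d] for r in train_rows) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((r[d] - means[d]) ** 2 for r in train_rows) / n) or 1.0
        for d in range(dims)
    ]
    return means, stds


def anomaly_score(row, means, stds):
    """Root-mean-square z-score; a stand-in for autoencoder reconstruction error."""
    total = sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, stds))
    return math.sqrt(total / len(row))


def evaluate(benign_test, malicious, means, stds, threshold):
    """Return (TPR, FPR) when rows scoring above the threshold are flagged."""
    tp = sum(anomaly_score(r, means, stds) > threshold for r in malicious)
    fp = sum(anomaly_score(r, means, stds) > threshold for r in benign_test)
    return tp / len(malicious), fp / len(benign_test)


# Synthetic stand-in data (assumption): 4 features, well-separated classes.
rng = random.Random(0)
benign = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(300)]
malicious = [[rng.gauss(8, 1) for _ in range(4)] for _ in range(100)]

train, benign_test = split_benign(benign)
means, stds = fit_baseline(train)
# Assumed threshold rule: the largest score seen on the benign training set.
threshold = max(anomaly_score(r, means, stds) for r in train)
tpr, fpr = evaluate(benign_test, malicious, means, stds, threshold)
print(f"TPR={tpr:.3f} FPR={fpr:.3f}")
```

With real data, the same split and scoring would be repeated once per device, since each device has its own benign training set and its own detector.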

Has Missing Values?

No

Introductory Paper

N-BaIoT—Network-Based Detection of IoT Botnet Attacks Using Deep Autoencoders

By Yair Meidan, Michael Bohadana, Yael Mathov, Yisroel Mirsky, Dominik Breitenbacher, A. Shabtai, Y. Elovici. 2018

Published in IEEE Pervasive Computing

Variables Table

Variable Name	Role	Type	Description	Units	Missing Values
					no
					no
					no
					no
					no
					no
					no
					no
					no
					no

Showing the first 10 of 115 variables.

Additional Variable Information

-- The following describes each of the feature headers:

* Stream aggregation:
H: Stats summarizing the recent traffic from this packet's host (IP)
HH: Stats summarizing the recent traffic going from this packet's host (IP) to the packet's destination host.
HpHp: Stats summarizing the recent traffic going from this packet's host+port (IP) to the packet's destination host+port. Example: 192.168.4.2:1242 -> 192.168.4.12:80
HH_jit: Stats summarizing the jitter of the traffic going from this packet's host (IP) to the packet's destination host.

* Time-frame (the decay factor Lambda used in the damped window): how much recent history of the stream is captured in these statistics
L5, L3, L1, ...

* The statistics extracted from the packet stream:
weight: The weight of the stream (can be viewed as the number of items observed in recent history)
mean: ...
std: ...
radius: The root squared sum of the two streams' variances
magnitude: The root squared sum of the two streams' means
cov: an approximated covariance between two streams
pcc: an approximated correlation coefficient between two streams
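To make the damped-window statistics more concrete, here is a minimal Python sketch of how weight, mean, and std could be maintained incrementally for one stream, and how magnitude and radius combine two streams. The class and function names are hypothetical, and the decay form 2^(-Lambda * elapsed time) is an assumption; the description above states only that a decay factor Lambda controls how much recent history is captured. The Lambda values used here are illustrative and are not claimed to match the L5, L3, L1 time-frames.

```python
import math


class DampedStat:
    """Exponentially damped running statistics of a one-dimensional stream."""

    def __init__(self, decay_lambda):
        self.decay_lambda = decay_lambda
        self.weight = 0.0        # damped count of observed items
        self.linear_sum = 0.0    # damped sum of values
        self.squared_sum = 0.0   # damped sum of squared values
        self.last_time = None

    def update(self, value, timestamp):
        """Decay the old sums by the elapsed time, then add the new value."""
        if self.last_time is not None:
            # Assumed decay form: 2 ** (-lambda * elapsed_time).
            factor = 2.0 ** (-self.decay_lambda * (timestamp - self.last_time))
            self.weight *= factor
            self.linear_sum *= factor
            self.squared_sum *= factor
        self.last_time = timestamp
        self.weight += 1.0
        self.linear_sum += value
        self.squared_sum += value * value

    def mean(self):
        return self.linear_sum / self.weight

    def variance(self):
        return abs(self.squared_sum / self.weight - self.mean() ** 2)

    def std(self):
        return math.sqrt(self.variance())


def magnitude(a, b):
    """Root squared sum of the two streams' means."""
    return math.sqrt(a.mean() ** 2 + b.mean() ** 2)


def radius(a, b):
    """Root squared sum of the two streams' variances."""
    return math.sqrt(a.variance() ** 2 + b.variance() ** 2)


# With lambda = 0 nothing decays, so these are ordinary statistics.
plain = DampedStat(0.0)
plain.update(10.0, timestamp=0.0)
plain.update(20.0, timestamp=1.0)

# With lambda = 1 the first packet counts half as much one second later.
damped = DampedStat(1.0)
damped.update(10.0, timestamp=0.0)
damped.update(20.0, timestamp=1.0)

print(plain.weight, plain.mean(), plain.std())
print(damped.weight, damped.mean())
```

A larger Lambda forgets old packets faster, so the statistics reflect a shorter window of recent history; keeping several Lambda values side by side gives one view of the stream per time-frame.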

Dataset Files

File	Size
Philips_B120N10_Baby_Monitor/benign_traffic.csv	204.4 MB
Danmini_Doorbell/mirai_attacks.rar	177.9 MB
Philips_B120N10_Baby_Monitor/mirai_attacks.rar	166.4 MB
SimpleHome_XCS7_1003_WHT_Security_Camera/mirai_attacks.rar	163.2 MB
Ecobee_Thermostat/mirai_attacks.rar	162.7 MB

Showing the first 5 of 27 files.

Download (1.7 GB)

Creators

Yair Meidan

Michael Bohadana

Yael Mathov

Yisroel Mirsky

Dominik Breitenbacher

Asaf Shabtai

DOI

10.24432/C5RC8J

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.