Kitsune Network Attack
Donated on 10/15/2019
A cybersecurity dataset containing nine different network attacks on a commercial IP-based surveillance system and an IoT network. The dataset includes reconnaissance, MitM, DoS, and botnet attacks.
Dataset Characteristics
Multivariate, Sequential, Time-Series
Subject Area
Computer Science
Associated Tasks
Classification, Clustering, Causal-Discovery
Feature Type
Real
# Instances
27170754
# Features
7
Dataset Information
Additional Information
==== Overview ====

There are 9 network capture datasets in total, listed below. Viol. is the security violation (C: Confidentiality, I: Integrity, A: Availability).

No. | Attack Type | Attack Name | Tool | Viol. | Description: The attacker... |
---|---|---|---|---|---|
1 | Recon. | OS Scan | Nmap | C | scans the network for hosts, and their operating systems, to reveal possible vulnerabilities. |
2 | Recon. | Fuzzing | SFuzz | C | searches for vulnerabilities in the camera's web servers by sending random commands to their cgis. |
3 | Man in the Middle | Video Injection | Video Jack | C,I | injects a recorded video clip into a live video stream. |
4 | Man in the Middle | ARP MitM | Ettercap | C | intercepts all LAN traffic via an ARP poisoning attack. |
5 | Man in the Middle | Active Wiretap | R.PI 3B | C | intercepts all LAN traffic via an active wiretap (network bridge) covertly installed on an exposed cable. |
6 | Denial of Service | SSDP Flood | Saddam | A | overloads the DVR by causing cameras to spam the server with UPnP advertisements. |
7 | Denial of Service | SYN DoS | Hping3 | A | disables a camera's video stream by overloading its web server. |
8 | Denial of Service | SSL Reneg. | THC | A | disables a camera's video stream by sending many SSL renegotiation packets to the camera. |
9 | Botnet Malware | Mirai | Telnet | C,I | infects IoT devices with the Mirai malware by exploiting default credentials, and then scans the network for new vulnerable victims. |

For more details on the attacks themselves, please refer to our paper.

==== Data Organization ====

For each attack (network capture) above we provide (1) a csv of the features used in our paper, where each row is a network packet, (2) the corresponding labels [benign, malicious], and (3) the original network capture in truncated pcap format.

- Each attack dataset is located in a separate directory.
- Each directory contains three files:
  - <Attack>_pcap.pcapng : A raw pcap capture of the original N packets. The packets have been truncated to 200 bytes for privacy reasons.
  - <Attack>_dataset.csv : An N-by-M matrix of M-sized feature vectors, each describing the packet and the context of that packet's channel (see our paper for details).
  - <Attack>_labels.csv : An N-by-1 vector of 0-1 values which indicate whether each packet in <Attack>_pcap.pcapng (and <Attack>_dataset.csv) is malicious ('1') or not ('0'). For the Man in the Middle attacks, all packets which have passed through the MitM are marked as '1'.
- Every attack dataset begins with benign traffic, and then at some point (1) the attacker connects to the network and (2) initiates the given attack.
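Because row i of the feature file and row i of the label file describe the same packet, the two files can be read in lockstep. The sketch below is one hedged way to do that with the Python standard library only. The function names are hypothetical, and it assumes both files are headerless with the label in the last field of each row; check your copy of the files before relying on that. It runs on a small synthetic capture, not on the real data.

```python
import csv
import gzip
import os
import tempfile
from itertools import zip_longest


def open_text(path):
    """Open a CSV as text, transparently handling .gz files."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def iter_labeled_packets(dataset_path, labels_path):
    """Yield (features, label) pairs, one per packet, in capture order.

    Assumption: no header rows, and the label is the last field of
    each row in the labels file.
    """
    with open_text(dataset_path) as f_x, open_text(labels_path) as f_y:
        for feat_row, label_row in zip_longest(csv.reader(f_x), csv.reader(f_y)):
            if feat_row is None or label_row is None:
                raise ValueError("feature and label files differ in row count")
            yield [float(v) for v in feat_row], int(label_row[-1])


def summarize(labels):
    """Return (n_benign, n_malicious, index of first malicious packet or None)."""
    n_benign = n_malicious = 0
    first_malicious = None
    for i, y in enumerate(labels):
        if y == 1:
            n_malicious += 1
            if first_malicious is None:
                first_malicious = i
        else:
            n_benign += 1
    return n_benign, n_malicious, first_malicious


def write_toy_capture(directory):
    """Write a 4-packet synthetic capture (3 made-up features per packet)."""
    x_path = os.path.join(directory, "Toy_dataset.csv.gz")
    y_path = os.path.join(directory, "Toy_labels.csv")
    rows = [[0.0, 1.0, 2.0], [1.0, 1.5, 2.5], [9.0, 8.0, 7.0], [9.5, 8.5, 7.5]]
    with gzip.open(x_path, "wt", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(y_path, "w", newline="") as f:
        csv.writer(f).writerows([[0], [0], [1], [1]])  # benign first, then attack
    return x_path, y_path


with tempfile.TemporaryDirectory() as tmp:
    x_path, y_path = write_toy_capture(tmp)
    pairs = list(iter_labeled_packets(x_path, y_path))
    print(summarize(label for _, label in pairs))  # (2, 2, 2)
```

Streaming row by row keeps memory use flat, which matters when a single compressed feature file is several gigabytes. The index of the first malicious packet gives a rough split point between the benign prefix and the attack portion of a capture.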
Has Missing Values?
No
Variable Information
=== The features in the csv files ===

Each row in the csv is a captured packet, in chronological order. For a deeper explanation, please see our paper. In general, each row (feature vector) consists of recent (temporal) statistics which describe the context of the packet's channel and its communicating parties.

Whenever a packet arrives, we extract a behavioral snapshot of the hosts and protocols which communicated the given packet. The snapshot consists of 115 traffic statistics capturing a small temporal window into: (1) the packet's sender in general, and (2) the traffic between the packet's sender and receiver. Specifically, the statistics summarize all of the traffic...

- ...originating from this packet's source MAC and IP address (denoted SrcMAC-IP).
- ...originating from this packet's source IP (denoted SrcIP).
- ...sent between this packet's source and destination IPs (denoted Channel).
- ...sent between this packet's source and destination TCP/UDP Socket (denoted Socket).

A total of 23 features (capturing the above) can be extracted from a single time window λ (see Table II of the paper). The feature extractor (FE) extracts the same set of features from a total of five time-damped windows of approximately 100ms, 500ms, 1.5sec, 10sec, and 1min into the past (λ = 5, 3, 1, 0.1, 0.01), thus totaling 23 × 5 = 115 features.

We note that not every packet applies to every channel type (e.g., there is no socket if the packet does not contain a TCP or UDP datagram). In these cases, these features are zeroed. Thus, the final feature vector x, which the FE passes to the feature mapper (FM), is always a member of R^n, where n = 115.

The feature extraction code (pcap to csv) is available at: https://github.com/ymirsky/Kitsune-py
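To make the idea of a time-damped window concrete, here is a minimal sketch of a one-dimensional damped statistic. It is not the authors' extractor. It assumes the decay factor 2^(-λ·Δt) and the (weight, linear sum, squared sum) state from my reading of the Kitsune paper, which this page does not spell out, and the class name `DampedStat` is hypothetical. For the real implementation, see the Kitsune-py repository linked above.

```python
import math

# Decay rates listed in the description, one per damped window.
LAMBDAS = (5, 3, 1, 0.1, 0.01)
FEATURES_PER_WINDOW = 23  # per the description; 23 * 5 = 115 in total


class DampedStat:
    """Damped running statistic kept as (weight, linear sum, squared sum).

    Assumption: old observations decay by 2 ** (-lam * dt), where dt is
    the time in seconds since the previous update.
    """

    def __init__(self, lam):
        self.lam = lam
        self.w = 0.0    # damped count of observations
        self.ls = 0.0   # damped sum of values
        self.ss = 0.0   # damped sum of squared values
        self.t_last = None

    def insert(self, x, t):
        """Decay the stored state to time t, then add observation x."""
        if self.t_last is not None:
            gamma = 2.0 ** (-self.lam * (t - self.t_last))
            self.w *= gamma
            self.ls *= gamma
            self.ss *= gamma
        self.t_last = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def mean(self):
        return self.ls / self.w if self.w else 0.0

    def std(self):
        if not self.w:
            return 0.0
        variance = abs(self.ss / self.w - self.mean() ** 2)
        return math.sqrt(variance)


# Feed the same made-up (timestamp, packet size) stream to one statistic per lambda.
stream = [(0.0, 60), (0.1, 1500), (0.2, 60), (2.0, 1500)]
stats = [DampedStat(lam) for lam in LAMBDAS]
for t, size in stream:
    for s in stats:
        s.insert(size, t)

for s in stats:
    print(f"lambda={s.lam}: weight={s.w:.3f} mean={s.mean():.1f} std={s.std():.1f}")
print(len(LAMBDAS) * FEATURES_PER_WINDOW)  # 115
```

A large λ forgets quickly, so its weight stays close to 1 after a quiet gap, while a small λ keeps a longer memory and its weight approaches the plain packet count. Each update touches only three numbers, so no packet history needs to be stored.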
Dataset Files
File | Size |
---|---|
ssdp_flood/SSDP Flood_dataset.csv.gz | 3 GB |
arp_mitm/ARP MitM_dataset.csv.gz | 2.2 GB |
video_injection/Video Injection_dataset.csv.gz | 2.1 GB |
syn_dos/SYN DoS_dataset.csv.gz | 2 GB |
active_wiretap/Active Wiretap_dataset.csv.gz | 1.9 GB |
The first 5 of 28 files are listed above.
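The listed paths appear to follow a single pattern: a directory named after the attack in lowercase with underscores, containing a file that keeps the attack name as written. The hypothetical helper below encodes that pattern. It has been checked only against the five rows shown in the table, so treat it as an assumption for the remaining files.

```python
def dataset_path(attack_name):
    """Build '<dir>/<Attack>_dataset.csv.gz' for an attack name.

    Assumption: the directory is the attack name lowercased with spaces
    replaced by underscores, as in the five listed rows.
    """
    directory = attack_name.lower().replace(" ", "_")
    return f"{directory}/{attack_name}_dataset.csv.gz"


# Attack names taken from the five rows of the file table.
LISTED_ATTACKS = ["SSDP Flood", "ARP MitM", "Video Injection", "SYN DoS", "Active Wiretap"]

for name in LISTED_ATTACKS:
    print(dataset_path(name))
```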
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    kitsune_network_attack = fetch_ucirepo(id=516)

    # data (as pandas dataframes)
    X = kitsune_network_attack.data.features
    y = kitsune_network_attack.data.targets

    # metadata
    print(kitsune_network_attack.metadata)

    # variable information
    print(kitsune_network_attack.variables)
Kitsune Network Attack [Dataset]. (2019). UCI Machine Learning Repository. https://doi.org/10.24432/C5D90Q.
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.