Shell Commands Used by Participants of Hands-on Cybersecurity Training

External

Linked on 8/24/2023

We present a dataset of 21459 shell commands from 275 participants who attended cybersecurity training and solved assignments in the Linux terminal. Each record contains a command with its arguments and metadata, such as a timestamp, working directory, and host identification in the emulated training infrastructure. The commands were captured in Bash, ZSH, and Metasploit shells. The data are stored as JSON records and were collected using an open-source logging toolset and two open-source interactive learning environments. Researchers and developers may freely use the dataset, or deploy the learning environments with the logging toolset to generate their own data in the same format.
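As a rough illustration of working with records of this kind, the sketch below parses a few JSON records and counts how often each tool is invoked. The sample records and the field names (`cmd`, `wd`, `hostname`, `timestamp`) are assumptions made for this example, not the documented schema; consult the associated paper for the actual field names before adapting it.

```python
import json
from collections import Counter

# Hypothetical sample records. The field names ("timestamp", "cmd", "wd",
# "hostname") are assumptions for illustration, not the documented schema.
SAMPLE = '''
[
  {"timestamp": "2021-03-01T10:00:00", "cmd": "nmap -sV 10.1.26.9", "wd": "/root", "hostname": "attacker"},
  {"timestamp": "2021-03-01T10:01:30", "cmd": "ls -la", "wd": "/root", "hostname": "attacker"},
  {"timestamp": "2021-03-01T10:02:10", "cmd": "nmap 10.1.26.9", "wd": "/home/user", "hostname": "attacker"}
]
'''


def tool_name(command: str) -> str:
    """Return the first whitespace-separated token of a command line."""
    parts = command.split()
    return parts[0] if parts else ""


def count_tools(records):
    """Count how often each tool (first token of the command) appears."""
    return Counter(tool_name(r.get("cmd", "")) for r in records)


records = json.loads(SAMPLE)
counts = count_tools(records)
print(counts.most_common())  # [('nmap', 2), ('ls', 1)]
```

Taking the first token as the tool name is a deliberate simplification: it ignores pipes, `sudo` prefixes, and shell built-ins, which a real analysis would likely need to handle.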

Dataset Characteristics

Sequential

Subject Area

Computer Science

Associated Tasks

Classification, Regression, Clustering, Other

Feature Type

Categorical, Integer

# Instances

21459

# Features

9

Dataset Information

Who funded the creation of the dataset?

This research was supported by ERDF project CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence (No. CZ.02.1.01/0.0/0.0/16_019/0000822).

What do the instances in this dataset represent?

Linux shell commands submitted by trainees.

Does the dataset contain data that might be considered sensitive in any way?

No. The data are anonymous and cannot be linked to specific individuals.

Was there any data preprocessing performed?

Missing and invalid records were removed.

Additional Information

A detailed description and documentation of the dataset are available in the associated open-access journal paper: Valdemar Švábenský, Jan Vykopal, Pavel Seda, Pavel Čeleda. Dataset of Shell Commands Used by Participants of Hands-on Cybersecurity Training. Data in Brief (Elsevier), 2021. https://doi.org/10.1016/j.dib.2021.107398

Has Missing Values?

No

Introductory Paper

Dataset of shell commands used by participants of hands-on cybersecurity training

By Valdemar Švábenský, Jan Vykopal, Pavel Seda, Pavel Čeleda. 2021

Published in Data in Brief

Citations/Acknowledgements

If you use this dataset, please follow the acknowledgment policy on the original dataset website.

Creators

Valdemar Švábenský

valdemar@mail.muni.cz

Masaryk University

Jan Vykopal

vykopal@fi.muni.cz

Masaryk University

Pavel Seda

seda@fi.muni.cz

Masaryk University

Pavel Čeleda

celeda@fi.muni.cz

Masaryk University
