Center for Machine Learning and Intelligent Systems
About  Citation Policy  Donate a Data Set  Contact

Repository Web            Google
View ALL Data Sets

× Check out the beta version of the new UCI Machine Learning Repository we are currently testing! Contact us if you have any issues, questions, or concerns. Click here to try out the new site.

Detect Malware Types Data Set
Download: Data Folder, Data Set Description

Abstract: Provide a short description of your data set (less than 200 characters).

Data Set Characteristics:  

Multivariate, Time-Series, Text

Number of Instances:




Attribute Characteristics:


Number of Attributes:


Date Donated


Associated Tasks:


Missing Values?


Number of Web Hits:



Provide the names, email addresses, institutions, and other contact information of the donors and creators of the data set.

Data Set Information:

This study seeks to obtain data which will help to address machine learning based malware research gaps. The specific objective of this study is to build a benchmark dataset for Windows operating system API calls of various malware. This is the first study to undertake metamorphic malware to build sequential API calls. It is hoped that this research will contribute to a deeper understanding of how metamorphic malware change their behavior (i.e. API calls) by adding meaningless opcodes with their own dissembler/assembler parts. In our research, we have translated the families produced by each of the software into 8 main malware families: Trojan, Backdoor, Downloader, Worms, Spyware Adware, Dropper, Virus. Table 1 shows the number of malware belonging to malware families in our data set. As you can see in the table, the number of samples of other malware families except AdWare is quite close to each other. There is such a difference because we don't find too much of malware from the adware malware family.

Attribute Information:

Various Windows API calls

Relevant Papers:

Provide references to papers that have cited this data set in the past (if any).

Citation Request:

AF. Yazı, FÖ Çatak, E. Gül, Classification of Metamorphic Malware with Deep Learning (LSTM), IEEE Signal Processing and Applications Conference, 2019.

Catak, FÖ., Yazi, AF., A Benchmark API Call Dataset for Windows PE Malware Classification, [Web Link], 2019.

Supported By:

 In Collaboration With:

About  ||  Citation Policy  ||  Donation Policy  ||  Contact  ||  CML