REJAFADA

Donated on 8/14/2023

REJAFADA (Retrieval of Jar Files Applied to Dynamic Analysis) is intended as a benchmark for evaluating the quality of Jar malware detection.

Dataset Characteristics

Multivariate

Subject Area

Computer Science

Associated Tasks

Classification

Feature Type

Integer

# Instances

1996

# Features

6825

Dataset Information

Additional Information

REJAFADA (Retrieval of Jar Files Applied to Dynamic Analysis) is a dataset for classifying files with the Jar extension as benign or malware. It comprises 998 malware Jar files and 998 benign Jar files. Because both classes contain the same number of samples, the dataset is well suited to machine learning: a classifier biased toward one class does not gain an inflated success rate.

The malicious Jar files were extracted from VirusShare, a repository of malware samples that provides security researchers, incident responders, forensic analysts, and the morbidly curious with access to samples of live malicious code. The benign Jar files were collected from application repositories such as Java2s.com and findar.com. All benign files were audited by VirusTotal, so their benign status was attested by the world's main commercial antivirus products. The VirusTotal results for both the benign and the malware Jar files are available at the REJAFADA web address ¹.

The features are obtained through dynamic analysis of suspicious files. In this methodology, the malware is executed in order to intentionally infect a Java Virtual Machine installed on Windows 7, which is audited in real time (dynamically) by the Cuckoo Sandbox.

1. REJAFADA (A Retrieval of Jar Files Applied to Dynamic Analysis). Available at: https://github.com/rewema/rejafada. Accessed June 2018.
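The class-balance argument can be made concrete with a short calculation. The sketch below is illustrative only and is not part of the dataset or the paper: the helper name `majority_baseline_accuracy` is hypothetical, and the skewed 900/100 split is an invented comparison case. It shows that a classifier which always predicts the most frequent class scores only 50% on the 998/998 split, while the same trivial classifier would look far better on an imbalanced set.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most common label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Class counts as stated in the dataset description: 998 malware, 998 benign.
balanced = ["M"] * 998 + ["B"] * 998

# Hypothetical imbalanced split, used only for comparison.
skewed = ["M"] * 100 + ["B"] * 900

print(majority_baseline_accuracy(balanced))  # 0.5
print(majority_baseline_accuracy(skewed))    # 0.9
```

On the balanced data, any accuracy meaningfully above 0.5 therefore reflects information in the features rather than a preference for one class.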

Has Missing Values?

No

Introductory Paper

Next Generation Antivirus Applied to Jar Malware Detection based on Runtime Behaviors using Neural Networks

By Ricardo P Pinheiro, Sidney M. L. Lima, Sérgio M. M. Fernandes, E. D. Q. Albuquerque, S. Medeiros, Danilo Souza, T. Monteiro, Petrônio Lopes, Rafael Lima, Jemerson Oliveira, Sthéfano Silva. 2019

Published in International Conference on Computer Supported Cooperative Work in Design

Variable Information

1) Application name
2) Class (M = malware, B = benign)
3) Input attributes (3-6826)

The groups of features are detailed below.

- Features related to virtual machines.
- Features related to malware.
- Features related to backdoors.
- Features related to banking threats (Trojan horses).
- Features related to Bitcoin.
- Features related to bots (machines that perform automatic network tasks, malicious or not, without the knowledge of their owners).
- Features related to browsers.
- Features related to firewalls.
- Features related to cloud computing.
- Features related to DDoS (Distributed Denial of Service) attacks.
- Features related to attempts to disable features of Windows 7 OS and other utilities.
- Features associated with Windows 7 OS network traffic in PCAP format.
- Features related to DNS servers (Domain Name System, the servers responsible for translating URL addresses into IP addresses).
- Features related to native Windows 7 OS programs.
- Features related to Windows 7 OS boot.
- Features related to the Windows 7 OS registry (Regedit).
- Features related to the use of sandboxes. The digital forensics examines whether the file tries to detect, through the presence of their own files, whether any of the sandboxes Cuckoo, Joe, Anubis, Sunbelt, ThreatTrack / GFI / CW, or Fortinet is in use.
- Features related to antivirus. Checks whether the file under investigation looks for registry keys, in Regedit, belonging to Chinese antivirus products.
- Features related to ransomware (a type of malware that uses encryption to leave the victim's files unusable and then demands a ransom in exchange for restoring normal use of the files; the ransom is usually paid in a non-traceable way, such as bitcoins).
- Features related to exploits, malware that attempts to exploit known or unpatched vulnerabilities, faults, or defects in the system or in one or more of its components in order to cause unforeseen instability and behavior in both hardware and software.
- Features related to infostealers, malicious programs that collect confidential information from the affected computer.
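The column layout described above (application name, class label, then integer attributes) can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the source does not describe the file format inside REJAFADA.zip, so the comma delimiter, the absence of a header row, the helper name `parse_rows`, and the three-feature sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io


def parse_rows(text, delimiter=","):
    """Split rows into names, class labels (M/B), and integer feature vectors.

    Assumes (not confirmed by the source) one sample per line, no header row,
    and the column order: application name, class, input attributes.
    """
    names, labels, features = [], [], []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        name, label = row[0], row[1].strip()
        if label not in ("M", "B"):
            raise ValueError(f"unexpected class label: {label!r}")
        names.append(name)
        labels.append(label)
        features.append([int(value) for value in row[2:]])
    return names, labels, features


# Invented rows with three attributes each; the real data has thousands.
SAMPLE = "sample_a.jar,M,1,0,1\nsample_b.jar,B,0,0,1\n"

names, labels, features = parse_rows(SAMPLE)
print(labels)       # ['M', 'B']
print(features[0])  # [1, 0, 1]
```

If the real file does carry a header row or uses a different delimiter, the loop would need to skip the first row or pass another `delimiter` value.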

Dataset Files

File            Size
REJAFADA.zip    365.8 KB


Creators

Ricardo Pinheiro

Sidney M. L. de Lima

sidney.lima@ufpe.br

Federal University of Pernambuco

Sérgio Murilo

Edison Albuquerque

Danilo Souza

Thyago Monteiro

Petrônio Lopes

Rafael Lima

Jemerson Oliveira

Sthéfano Silva
