Browse Datasets

Drug_induced_Autoimmunity_Prediction

This dataset comprises molecular descriptors generated using RDKit, specifically curated for the study of drug-induced autoimmunity through ensemble machine learning approaches. It is divided into a training set and a testing set, containing numerical features that represent molecular properties and structural characteristics of drugs. The dataset supports predictive modeling tasks aimed at identifying potential autoimmune risks associated with drug candidates. These molecular descriptors include physicochemical properties, providing a comprehensive foundation for machine learning analysis. The dataset facilitates the development of interpretable models for drug toxicity prediction, contributing to advancements in computational toxicology and drug safety assessment.

PIRvision_FoG_presence_detection

The PIRvision dataset contains occupancy detection data collected from a Synchronized Low-Energy Electronically-chopped Passive Infra-Red sensing node in residential and office environments. Each observation represents 4 seconds of recorded human activity within the sensor Field-of-View (FoV).

Lattice-physics (PWR fuel assembly neutronics simulation results)

This dataset encompasses lattice-physics parameters—the infinite multiplication factor (k-inf) and the pin power peaking factor (PPPF)—modeled as functions of variations in fuel pin enrichments for the NuScale US600 fuel assembly type C-01 (NFAC-01) [NuScale FSAR]. These critical parameters were computed using the MCNP6 code, a Monte Carlo-based tool for nuclear reactor criticality simulations. Fuel pin enrichments were uniformly sampled within the range of 0.7–5.0 weight percent (w/o) U-235 to generate the dataset. The dataset contains 39 features, each representing the enrichment of a specific fuel rod in a one-eighth symmetry of the NFAC assembly. The outputs of interest are the k-inf and PPPF values associated with these enrichments.
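The input sampling described above can be sketched in a few lines. This is only an illustration of drawing 39 enrichments uniformly from the stated 0.7 to 5.0 w/o range; the function name `sample_enrichments`, the seed, and the sample count are hypothetical, and the sketch does not run MCNP6 or reproduce the dataset's actual k-inf and PPPF values.

```python
import random

N_PINS = 39                        # features per sample, as stated in the description
ENRICH_MIN, ENRICH_MAX = 0.7, 5.0  # weight percent U-235, as stated in the description


def sample_enrichments(n_samples, seed=0):
    """Draw n_samples enrichment vectors, each pin sampled uniformly and independently.

    Hypothetical helper: the dataset authors' actual sampling code is not shown in the source.
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(ENRICH_MIN, ENRICH_MAX) for _ in range(N_PINS)]
        for _ in range(n_samples)
    ]


samples = sample_enrichments(5)
```

Each row of `samples` would correspond to one input vector; the k-inf and PPPF outputs for a row come from the reactor simulation, not from this sketch.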

Gas sensor array low-concentration

This dataset contains 6 gas responses collected by a sensor array consisting of 10 metal oxide semiconductor sensors, with gas concentrations at the ppb level (below the minimum detection limit of the sensors).

Twitter Geospatial Data

Seven days of geo-tagged Tweet data from the United States with exact GPS location and timestamp.

CAN-MIRGU

A Comprehensive CAN Bus Attack Dataset from Moving Vehicles for Intrusion Detection System Evaluation. This dataset includes CAN bus attacks collected from a modern automobile equipped with autonomous driving capabilities, operating in real-world driving scenarios. The dataset encompasses physically verified attacks to enhance the comparison and validation of in-vehicle network Intrusion Detection Systems.

Assessing Mathematics Learning in Higher Education

MathE is a mathematical platform developed under the MathE project (mathe.pixel-online.org). The dataset has 9546 answers to questions on the mathematical topics taught in higher education. The file has eight features: Student ID, Student Country, Question ID, Type of answer (correct or incorrect), Question level (basic or advanced), Math Topic, Math Subtopic, and Question Keywords. The question level was set by the professor who submitted the question. The data was collected from February 2019 to December 2023.

Turkish Crowdfunding Startups

This dataset contains data on crowdfunding campaigns in Turkey, collected in 2022. It covers more than 1500 projects on 6 different platforms and includes characteristics such as crowdfunding projects, project descriptions, targeted and raised funds, campaign durations, and number of backers. The dataset is particularly useful for training natural language processing (NLP) and machine learning models, and it serves as a reference point for studies on the characteristics of successful crowdfunding campaigns for entrepreneurs, investors, and researchers in Turkey.

Synthetic Circle Data Set

This dataset comprises 10000 two-dimensional points arranged into 100 circles, each containing 100 points. It was designed to evaluate clustering algorithms, such as k-means, by providing a clear and structured clustering challenge.
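A dataset with the same shape can be generated with a short sketch. The layout here is an assumption, not the published one: circle centers are placed on a 10 × 10 grid, every circle has the same radius, and points lie on the circumference at random angles. The function name `make_circles` and its parameters are hypothetical.

```python
import math
import random


def make_circles(n_circles=100, points_per_circle=100, radius=1.0, spacing=4.0, seed=0):
    """Return (points, labels) for n_circles circles laid out on a square grid.

    Assumed layout: the source does not specify centers, radii, or noise.
    """
    side = math.isqrt(n_circles)
    if side * side != n_circles:
        raise ValueError("n_circles must be a perfect square for the grid layout")
    rng = random.Random(seed)
    points, labels = [], []
    for c in range(n_circles):
        # Grid position of this circle's center.
        cx, cy = (c % side) * spacing, (c // side) * spacing
        for _ in range(points_per_circle):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            points.append((cx + radius * math.cos(theta), cy + radius * math.sin(theta)))
            labels.append(c)
    return points, labels


points, labels = make_circles()
```

The `labels` list gives a ground-truth assignment against which the output of a clustering algorithm such as k-means could be compared.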

Micro Gas Turbine Electrical Energy Prediction

This dataset consists of measurements of electrical power corresponding to an input control signal over time, collected from a 3-kilowatt commercial micro gas turbine.
