
ApisTox
Donated on 4/23/2024
ApisTox is a dataset on the toxicity of pesticides to honey bees (Apis mellifera). It combines data from existing sources such as ECOTOX and PPDB into a single consistent, curated collection that is more extensive than those earlier datasets. ApisTox includes toxicity levels for chemicals, details such as the time of their publication in the literature, and identifiers linking them to external chemical databases. The dataset may serve as a tool for environmental and agricultural research, and can also support the development of policies and practices aimed at minimizing harm to bee populations. Finally, ApisTox offers a resource for benchmarking molecular property prediction methods on agrochemical compounds, which is relevant to both environmental science and cheminformatics. Code used to produce the dataset is available at https://github.com/j-adamczyk/apis_tox_dataset
Dataset Characteristics
Other
Subject Area
Biology
Associated Tasks
Classification
Feature Type
-
# Instances
1035
# Features
13
Dataset Information
Has Missing Values?
No
Introductory Paper
By Jakub Adamczyk, Jakub Poziemski, Paweł Siedlecki. 2024
Published in arXiv
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
name | Feature | Categorical | | | no |
CID | Feature | Integer | | | no |
CAS | Feature | Categorical | | | no |
SMILES | Feature | Categorical | | | no |
source | Feature | Categorical | | | no |
year | Feature | Integer | | | no |
toxicity_type | Feature | Categorical | | | no |
herbicide | Feature | Binary | | | no |
fungicide | Feature | Binary | | | no |
insecticide | Feature | Binary | | | no |
Showing the first 10 of 13 variables.
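The types listed in the table can be turned into a simple per-record check. The following is a minimal sketch, not part of the dataset's own tooling: the function name `validate_record` is hypothetical, the mapping of "Categorical" to `str` and "Binary" to an integer 0 or 1 is an assumption about how the values are encoded, and the example record holds placeholder values only, not real data from ApisTox.

```python
# Column names and assumed Python types, taken from the variables table above.
# Assumption: "Categorical" columns are strings, "Binary" columns are ints 0/1.
EXPECTED_TYPES = {
    "name": str,
    "CID": int,
    "CAS": str,
    "SMILES": str,
    "source": str,
    "year": int,
    "toxicity_type": str,
    "herbicide": int,
    "fungicide": int,
    "insecticide": int,
}
BINARY_COLUMNS = ("herbicide", "fungicide", "insecticide")


def validate_record(record):
    """Return a list of problems found in one record (empty list means OK)."""
    problems = []
    for column, expected in EXPECTED_TYPES.items():
        value = record.get(column)
        if value is None:
            problems.append(f"{column}: missing")
        elif not isinstance(value, expected):
            problems.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
        elif column in BINARY_COLUMNS and value not in (0, 1):
            problems.append(f"{column}: expected 0 or 1, got {value}")
    return problems


# Placeholder record for illustration only; these are NOT values from the dataset.
placeholder = {
    "name": "placeholder",
    "CID": 0,
    "CAS": "placeholder",
    "SMILES": "C",
    "source": "placeholder",
    "year": 2000,
    "toxicity_type": "placeholder",
    "herbicide": 0,
    "fungicide": 1,
    "insecticide": 0,
}
print(validate_record(placeholder))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a row at once.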
Additional Variable Information
Class Labels
Binary labels ("label" column, using EPA methodology):
0 - non-toxic
1 - toxic

Ternary level ("ppdb_level" column, using PPDB methodology):
0 - non-toxic
1 - moderately toxic
2 - highly toxic
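For reporting, the integer codes above can be mapped back to their names. This is a small sketch that assumes the codes are stored as plain integers; the helper name `describe_labels` is hypothetical.

```python
# Code-to-name mappings copied from the class label description above.
BINARY_LABELS = {0: "non-toxic", 1: "toxic"}
PPDB_LEVELS = {0: "non-toxic", 1: "moderately toxic", 2: "highly toxic"}


def describe_labels(label, ppdb_level):
    """Translate the integer codes of one row into readable names.

    Raises KeyError if a code falls outside the documented values.
    """
    return {
        "label": BINARY_LABELS[label],
        "ppdb_level": PPDB_LEVELS[ppdb_level],
    }


print(describe_labels(1, 2))
```

Note that the two columns follow different methodologies (EPA and PPDB), so they should be treated as separate targets rather than derived from one another.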
Dataset Files
File | Size |
---|---|
dataset_final.csv | 131.9 KB |
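If the file has been downloaded, the class balance can be checked without extra dependencies. The sketch below assumes that dataset_final.csv sits in the working directory, has a header row, and contains the "label" column described above; the function name `count_labels` is hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def count_labels(lines, column="label"):
    """Count the values of one column in CSV text that has a header row.

    `lines` is any iterable of text lines, such as an open file.
    Values are counted as strings, exactly as they appear in the file.
    """
    reader = csv.DictReader(lines)
    return Counter(row[column] for row in reader)


# Assumption: the file was downloaded to the current working directory.
csv_path = Path("dataset_final.csv")
if csv_path.exists():
    with csv_path.open(newline="", encoding="utf-8") as handle:
        print(count_labels(handle))
```

Passing `column="ppdb_level"` would tally the ternary labels in the same way.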
pip install ucimlrepo
    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    apistox = fetch_ucirepo(id=995)

    # data (as pandas dataframes)
    X = apistox.data.features
    y = apistox.data.targets

    # metadata
    print(apistox.metadata)

    # variable information
    print(apistox.variables)
Adamczyk, J., Poziemski, J., & Siedlecki, P. (2024). ApisTox [Dataset]. UCI Machine Learning Repository. https://doi.org/10.5281/zenodo.11062076.
Creators
Jakub Adamczyk
jadamczy@agh.edu.pl
AGH University of Krakow
Jakub Poziemski
Institute of Biochemistry and Biophysics Polish Academy of Sciences
Paweł Siedlecki
Institute of Biochemistry and Biophysics Polish Academy of Sciences
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.