DGP2 - The Second Data Generation Program
Generates application domains from user-specified parameters, including the number of features and the proportion of positive to negative examples.
Property | Value |
---|---|
Dataset Characteristics | Data-Generator |
Subject Area | Other |
Associated Tasks | - |
Feature Type | Real |
# Instances | - |
# Features | - |
Dataset Information
Additional Information
DGP/2 is an improved version of DGP. It accepts additional parameters and automatically sets the standard deviation parameter, which is difficult for a user to choose by hand. Specifically, DGP/2 lets the user vary the number of instances, the number of features, the range of feature values, the number of peaks, the desired percentage of positive instances, and a radius around the peaks within which those instances fall. The radius controls instance density and determines the standard deviation of the normal distribution function.
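To make the parameters above concrete, here is a minimal Python sketch of a generator in the same spirit. It is not a port of DGP-2.c: the function name `generate_domain`, the rule that the radius equals three standard deviations, the clamping of values to the feature range, and the uniform sampling of negative instances are all assumptions made for illustration only.

```python
import random


def generate_domain(n_instances, n_features, value_range, n_peaks,
                    pct_positive, radius, seed=None):
    """Hypothetical sketch of a peak-based data generator (not DGP-2.c)."""
    rng = random.Random(seed)
    lo, hi = value_range

    # Place each peak uniformly at random inside the feature range.
    peaks = [[rng.uniform(lo, hi) for _ in range(n_features)]
             for _ in range(n_peaks)]

    # Assumption: derive the standard deviation from the radius so that
    # roughly 99.7% of draws per feature land within `radius` of the peak.
    sigma = radius / 3.0

    n_positive = round(n_instances * pct_positive / 100)
    data = []

    # Positive instances: normally distributed around a randomly chosen peak,
    # clamped to the allowed feature range (clamping is an assumption).
    for _ in range(n_positive):
        peak = rng.choice(peaks)
        point = [min(hi, max(lo, rng.gauss(center, sigma))) for center in peak]
        data.append((point, 1))

    # Negative instances: uniform over the feature range (an assumption;
    # the description does not say how negatives are placed).
    for _ in range(n_instances - n_positive):
        point = [rng.uniform(lo, hi) for _ in range(n_features)]
        data.append((point, 0))

    rng.shuffle(data)
    return data


if __name__ == "__main__":
    sample = generate_domain(n_instances=10, n_features=2,
                             value_range=(0.0, 10.0), n_peaks=2,
                             pct_positive=30, radius=1.0, seed=0)
    for features, label in sample:
        print(label, [round(v, 2) for v in features])
```

In this sketch a smaller `radius` yields a smaller standard deviation, so positive instances cluster more tightly around the peaks, which is the density effect the description attributes to the radius parameter.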
Has Missing Values?
No
Dataset Files
File | Size |
---|---|
DGP-2.c | 35.1 KB |
DGP-2.names | 28.3 KB |
Index | 105 Bytes |
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
dgp2_the_second_data_generation_program = fetch_ucirepo(id=35)

# data (as pandas dataframes)
X = dgp2_the_second_data_generation_program.data.features
y = dgp2_the_second_data_generation_program.data.targets

# metadata
print(dgp2_the_second_data_generation_program.metadata)

# variable information
print(dgp2_the_second_data_generation_program.variables)
Benedict, P. (1990). DGP2 - The Second Data Generation Program [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5P02V.
Creators
Powell Benedict
DOI
10.24432/C5P02V
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.