Center for Machine Learning and Intelligent Systems


17 Data Sets


1. Anuran Calls (MFCCs): Acoustic features extracted from syllables of anuran (frog) calls, including the family, genus, and species labels (multilabel).

2. Breath Metabolomics: Breath analysis is a pivotal method for biological phenotyping. In a pilot study, 100 experiments with four subjects were performed to study the reproducibility of this technique.

3. Cervical Cancer Behavior Risk: The dataset contains 19 attributes regarding cervical cancer (ca cervix) behavior risk. The class label, ca_cervix, takes the values 1 and 0, indicating respondents with and without cervical cancer, respectively.

4. Diabetes 130-US hospitals for years 1999-2008: This data has been prepared to analyze factors related to readmission as well as other outcomes pertaining to patients with diabetes.

5. Drug Review Dataset (Drugs.com): The dataset provides patient reviews of specific drugs along with related conditions and a 10-star patient rating reflecting overall patient satisfaction.

6. Epileptic Seizure Recognition: This dataset is a pre-processed and re-structured/reshaped version of a very commonly used dataset featuring epileptic seizure detection.

7. Estimation of obesity levels based on eating habits and physical condition: This dataset includes data for the estimation of obesity levels in individuals from Mexico, Peru, and Colombia, based on their eating habits and physical condition.

8. Exasens: This repository introduces a novel dataset for the classification of 4 groups of respiratory diseases: Chronic Obstructive Pulmonary Disease (COPD), asthma, infected, and Healthy Controls (HC).

9. Exasens: This repository introduces a novel dataset for the classification of 4 groups of respiratory diseases: Chronic Obstructive Pulmonary Disease (COPD), asthma, infected, and Healthy Controls (HC).

10. gene expression cancer RNA-Seq: This collection of data is part of the RNA-Seq (HiSeq) PANCAN data set. It is a random extraction of gene expressions from patients with different types of tumor: BRCA, KIRC, COAD, LUAD, and PRAD.

11. HCV data: The data set contains laboratory values of blood donors and Hepatitis C patients, together with demographic values such as age.

12. Heart failure clinical records: This dataset contains the medical records of 299 patients who had heart failure, collected during their follow-up period, where each patient profile has 13 clinical features.

13. KEGG Metabolic Reaction Network (Undirected): KEGG metabolic pathways modeled as an undirected reaction network. A variety of graphical features is presented.

14. KEGG Metabolic Relation Network (Directed): KEGG metabolic pathways modeled as a directed relation network. A variety of graphical features is presented.

15. Mice Protein Expression: Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning.

16. seeds: Measurements of geometrical properties of kernels belonging to three different varieties of wheat. A soft X-ray technique and the GRAINS package were used to construct all seven real-valued attributes.

17. Tamilnadu Electricity Board Hourly Readings: This data can be used to reduce the load profile stored in the database to fewer parameters.

