Center for Machine Learning and Intelligent Systems


10 Data Sets



1. Anuran Calls (MFCCs): Acoustic features extracted from syllables of anuran (frog) calls, including the family, the genus, and the species labels (multilabel).

2. Diabetes 130-US hospitals for years 1999-2008: This data has been prepared to analyze factors related to readmission as well as other outcomes pertaining to patients with diabetes.

3. Drug Review Dataset (Drugs.com): The dataset provides patient reviews on specific drugs along with related conditions and a 10-star patient rating reflecting overall patient satisfaction.

4. Epileptic Seizure Recognition: This dataset is a pre-processed and re-structured/reshaped version of a very commonly used dataset featuring epileptic seizure detection.

5. gene expression cancer RNA-Seq: This collection of data is part of the RNA-Seq (HiSeq) PANCAN data set. It is a random extraction of gene expressions of patients having different types of tumor: BRCA, KIRC, COAD, LUAD and PRAD.

6. KEGG Metabolic Reaction Network (Undirected): KEGG Metabolic pathways modeled as an undirected reaction network. A variety of graphical features is presented.

7. KEGG Metabolic Relation Network (Directed): KEGG Metabolic pathways modeled as a directed relation network. A variety of graphical features is presented.

8. Mice Protein Expression: Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning.

9. seeds: Measurements of geometrical properties of kernels belonging to three different varieties of wheat. A soft X-ray technique and the GRAINS package were used to construct all seven real-valued attributes.

10. Tamilnadu Electricity Board Hourly Readings: This data can be used to show that the load profile can be represented effectively with fewer parameters, reducing what is held in the database.

