Center for Machine Learning and Intelligent Systems


13 Data Sets

1. EEG Eye State: The data set consists of 14 EEG values and a value indicating the eye state.

2. Covertype: Forest CoverType dataset

3. Mushroom: From the Audubon Society Field Guide; mushrooms described in terms of physical characteristics; classification: poisonous or edible

4. Diabetic Retinopathy Debrecen Data Set: This dataset contains features extracted from the Messidor image set to predict whether an image contains signs of diabetic retinopathy or not.

5. Thyroid Disease: 10 separate databases from the Garvan Institute

6. Cardiotocography: The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC) features on cardiotocograms classified by expert obstetricians.

7. Diabetes 130-US hospitals for years 1999-2008: This data has been prepared to analyze factors related to readmission as well as other outcomes pertaining to patients with diabetes.

8. Mice Protein Expression: Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning.

9. Anuran Calls (MFCCs): Acoustic features extracted from syllables of anuran (frog) calls, including the family, genus, and species labels (multilabel).

10. Parkinson Speech Dataset with Multiple Types of Sound Recordings: The training data belong to 20 Parkinson's Disease (PD) patients and 20 healthy subjects. Multiple types of sound recordings (26) were taken from all subjects.

11. EEG Steady-State Visual Evoked Potential Signals: This database consists of 30 subjects performing Brain Computer Interface for Steady State Visual Evoked Potentials (BCI-SSVEP).

12. KEGG Metabolic Relation Network (Directed): KEGG metabolic pathways modeled as a directed relation network. A variety of graphical features are presented.

13. KEGG Metabolic Reaction Network (Undirected): KEGG metabolic pathways modeled as an undirected reaction network. A variety of graphical features are presented.
