Center for Machine Learning and Intelligent Systems


31 Data Sets

| Name | Data Types | Default Task | Attribute Types | # Instances | # Attributes | Year |
|---|---|---|---|---|---|---|
| Activity Recognition from Single Chest-Mounted Accelerometer | Univariate, Sequential, Time-Series | Classification, Clustering | Real | | | 2014 |
| Air quality | Multivariate, Time-Series | Regression | Real | 9358 | 15 | 2016 |
| Australian Sign Language signs (High Quality) | Multivariate, Time-Series | Classification | Real | 2565 | 22 | 2002 |
| Bag of Words | Text | Clustering | Integer | 8000000 | 100000 | 2008 |
| Chronic_Kidney_Disease | Multivariate | Classification | Real | 400 | 25 | 2015 |
| CMU Face Images | Image | Classification | Integer | 640 | | 1999 |
| Connectionist Bench (Vowel Recognition - Deterding Data) | | Classification | Real | 528 | 10 | |
| Corel Image Features | Multivariate | | Real | 68040 | 89 | 1999 |
| Dexter | Multivariate | Classification | Integer | 2600 | 20000 | 2008 |
| DGP2 - The Second Data Generation Program | Data-Generator | | Real | | | |
| Facebook Comment Volume Dataset | Multivariate | Regression | Integer, Real | 40949 | 54 | 2016 |
| Geographical Original of Music | Multivariate | Classification, Regression | Real | 1059 | 68 | 2014 |
| Gesture Phase Segmentation | Multivariate, Sequential, Time-Series | Classification, Clustering | Real | 9900 | 50 | 2014 |
| Hill-Valley | Sequential | Classification | Real | 606 | 101 | 2008 |
| Image Segmentation | Multivariate | Classification | Real | 2310 | 19 | 1990 |
| Japanese Vowels | Multivariate, Time-Series | Classification | Real | 640 | 12 | |
| Libras Movement | Multivariate, Sequential | Classification, Clustering | Real | 360 | 91 | 2009 |
| Madelon | Multivariate | Classification | Real | 4400 | 500 | 2008 |
| QSAR biodegradation | Multivariate | Classification | Integer, Real | 1055 | 41 | 2013 |
| Record Linkage Comparison Patterns | Multivariate | Classification | Real | 5749132 | 12 | 2011 |
| seismic-bumps | Multivariate | Classification | Real | 2584 | 19 | 2013 |
| Sentence Classification | Text | Classification | Integer | | | 2014 |
| Spoken Arabic Digit | Multivariate, Time-Series | Classification | Real | 8800 | 13 | 2010 |
| Statlog (Image Segmentation) | Multivariate | Classification | Real | 2310 | 19 | 1990 |
| Statlog (Vehicle Silhouettes) | Multivariate | Classification | Integer | 946 | 18 | |
| StoneFlakes | Multivariate | Classification, Clustering, Causal-Discovery | Real | 79 | | 2014 |
| Synthetic Control Chart Time Series | Time-Series | Classification, Clustering | Real | 600 | | 1999 |
| Tennis Major Tournament Match Statistics | Multivariate | Classification, Regression, Clustering | Integer, Real | 127 | 42 | 2014 |
| User Identification From Walking Activity | Univariate, Sequential, Time-Series | Classification, Clustering | Real | | | 2014 |
| USPTO Algorithm Challenge, run by NASA-Harvard Tournament Lab and TopCoder Problem: Pat | Domain-Theory | Classification | Integer | 306 | | 2013 |
| YearPredictionMSD | Multivariate | Regression | Real | 515345 | 90 | 2011 |
