Center for Machine Learning and Intelligent Systems

Browse Through:

Default Task

Classification (66)
Regression (17)
Clustering (20)
Other (9)

Attribute Type

Categorical (15)
Numerical (46)
Mixed (15)

Data Type

Multivariate (73)
Univariate (6)
Sequential (9)
Time-Series (18)
Text (9)
Domain-Theory (3)
Other (1)

Area

Life Sciences (21)
Physical Sciences (9)
CS / Engineering (30)
Social Sciences (6)
Business (6)
Game (2)
Other (14)

# Attributes - Undo

Less than 10 (90)
10 to 100 (186)
Greater than 100 (68)

# Instances

Less than 100 (8)
100 to 1000 (39)
Greater than 1000 (41)

Format Type

Matrix (68)
Non-Matrix (22)

90 Data Sets

| Name | Data Types | Default Task | Attribute Types | # Instances | Year |
|---|---|---|---|---|---|
| 3D Road Network (North Jutland, Denmark) | Sequential, Text | Regression, Clustering | Real | 434874 | 2013 |
| AAAI 2013 Accepted Papers | Multivariate | Clustering | | 150 | 2014 |
| AAAI 2014 Accepted Papers | Multivariate | Clustering | | 399 | 2014 |
| Abalone | Multivariate | Classification | Categorical, Integer, Real | 4177 | 1995 |
| Activity Recognition system based on Multisensor data fusion (AReM) | Multivariate, Sequential, Time-Series | Classification | Real | 42240 | 2016 |
| Acute Inflammations | Multivariate | Classification | Categorical, Integer | 120 | 2009 |
| Airfoil Self-Noise | Multivariate | Regression | Real | 1503 | 2014 |
| Artificial Characters | Multivariate | Classification | Categorical, Integer, Real | 6000 | 1992 |
| Auto MPG | Multivariate | Regression | Categorical, Real | 398 | 1993 |
| Bach Chorales | Univariate, Time-Series | | Categorical, Integer | 100 | |
| Badges | Univariate, Text | Classification | | 294 | 1994 |
| Balance Scale | Multivariate | Classification | Categorical | 625 | 1994 |
| Balloons | Multivariate | Classification | Categorical | 16 | |
| banknote authentication | Multivariate | Classification | Real | 1372 | 2013 |
| BLOGGER | Multivariate | Classification | | 100 | 2013 |
| Blood Transfusion Service Center | Multivariate | Classification | Real | 748 | 2008 |
| Breast Cancer | Multivariate | Classification | Categorical | 286 | 1988 |
| CalIt2 Building People Counts | Multivariate, Time-Series | | Categorical, Integer | 10080 | 2006 |
| Car Evaluation | Multivariate | Classification | Categorical | 1728 | 1997 |
| Challenger USA Space Shuttle O-Ring | Multivariate | Regression | Integer | 23 | 1993 |
| Character Trajectories | Time-Series | Classification, Clustering | Real | 2858 | 2008 |
| Chess (King-Rook vs. King) | Multivariate | Classification | Categorical, Integer | 28056 | 1994 |
| chestnut – LARVIC | | Classification, Clustering | | 1451 | 2017 |
| Combined Cycle Power Plant | Multivariate | Regression | Real | 9568 | 2014 |
| Computer Hardware | Multivariate | Regression | Integer | 209 | 1987 |
| Concrete Compressive Strength | Multivariate | Regression | Real | 1030 | 2007 |
| Connectionist Bench (Nettalk Corpus) | Multivariate | | Categorical | 20008 | |
| Contraceptive Method Choice | Multivariate | Classification | Categorical, Integer | 1473 | 1997 |
| Cuff-Less Blood Pressure Estimation | Multivariate | Classification, Regression | Real | 12000 | 2015 |
| Daphnet Freezing of Gait | Multivariate, Time-Series | Classification | Real | 237 | 2013 |
| Dataset for ADL Recognition with Wrist-worn Accelerometer | Multivariate, Time-Series | Classification, Clustering | | | 2014 |
| Dodgers Loop Sensor | Multivariate, Time-Series | | Categorical, Integer | 50400 | 2006 |
| Eco-hotel | Text | | | 401 | 2017 |
| Ecoli | Multivariate | Classification | Real | 336 | 1996 |
| EEG Database | Multivariate, Time-Series | | Categorical, Integer, Real | 122 | 1999 |
| EMG dataset in Lower Limb | Multivariate, Time-Series | | Real | 132 | 2014 |
| EMG Physical Action Data Set | Time-Series | Classification | Real | 10000 | 2011 |
| Energy efficiency | Multivariate | Classification, Regression | Integer, Real | 768 | 2012 |
| Haberman's Survival | Multivariate | Classification | Integer | 306 | 1999 |
| Hayes-Roth | Multivariate | Classification | Categorical | 160 | 1989 |
| HIV-1 protease cleavage | Multivariate | Classification | Categorical | 6590 | 2015 |
| HTRU2 | Multivariate | Classification, Clustering | Real | 17898 | 2017 |
| Improved Spiral Test Using Digitized Graphics Tablet for Monitoring Parkinson’s Disease | Multivariate | Classification, Regression, Clustering | Real | 40 | 2016 |
| Individual household electric power consumption | Multivariate, Time-Series | Regression, Clustering | Real | 2075259 | 2012 |
| Indoor User Movement Prediction from RSS data | Multivariate, Sequential, Time-Series | Classification | Real | 13197 | 2016 |
| Iris | Multivariate | Classification | Real | 150 | 1988 |
| ISTANBUL STOCK EXCHANGE | Multivariate, Univariate, Time-Series | Classification, Regression | Real | 536 | 2013 |
| LED Display Domain | Multivariate, Data-Generator | Classification | Categorical | | 1988 |
| Lenses | Multivariate | Classification | Categorical | 24 | 1990 |
| Liver Disorders | Multivariate | | Categorical, Integer, Real | 345 | 1990 |
| Localization Data for Person Activity | Univariate, Sequential, Time-Series | Classification | Real | 164860 | 2010 |
| Machine Learning based ZZAlpha Ltd. Stock Recommendations 2012-2014 | Sequential, Time-Series | Classification | Real | 314080 | 2015 |
| Mammographic Mass | Multivariate | Classification | Integer | 961 | 2007 |
| Mechanical Analysis | Multivariate | Classification | Categorical, Integer, Real | 209 | 1990 |
| MONK's Problems | Multivariate | Classification | Categorical | 432 | 1992 |
| News Aggregator | Multivariate | Classification, Clustering | | 422937 | 2016 |
| Nursery | Multivariate | Classification | Categorical | 12960 | 1997 |
| NYSK | Multivariate, Sequential, Text | Clustering | | 10421 | 2013 |
| Occupancy Detection | Multivariate, Time-Series | Classification | Real | 20560 | 2016 |
| Online Retail | Multivariate, Sequential, Time-Series | Classification, Clustering | Integer, Real | 541909 | 2015 |
| Parkinson Disease Spiral Drawings Using Digitized Graphics Tablet | Multivariate | Classification, Regression, Clustering | Integer | 77 | 2017 |
| Perfume Data | Univariate, Domain-Theory | Classification, Clustering | Integer | 560 | 2014 |
| Physicochemical Properties of Protein Tertiary Structure | Multivariate | Regression | Real | 45730 | 2013 |
| Pima Indians Diabetes | Multivariate | Classification | Integer, Real | 768 | 1990 |
| Post-Operative Patient | Multivariate | Classification | Categorical, Integer | 90 | 1993 |
| QtyT40I10D100K | Sequential | | Integer | 3960456 | 2012 |
| Qualitative_Bankruptcy | Multivariate | Classification | | 250 | 2014 |
| Reuters-21578 Text Categorization Collection | Text | Classification | Categorical | 21578 | 1997 |
| seeds | Multivariate | Classification, Clustering | Real | 210 | 2012 |
| User Knowledge Modeling Data (Students' Knowledge Levels on DC Electrical Machines) | Multivariate | Classification | Real | 403 | 2013 |
| Servo | Multivariate | Regression | Categorical, Integer | 167 | 1993 |
| Shuttle Landing Control | Multivariate | Classification | Categorical | 15 | 1988 |
| Skin Segmentation | Univariate | Classification | Real | 245057 | 2012 |
| Statlog (Shuttle) | Multivariate | Classification | Integer | 58000 | |
| StoneFlakes | Multivariate | Classification, Clustering, Causal-Discovery | Real | 79 | 2014 |
| Syskill and Webert Web Page Ratings | Multivariate, Text | Classification | Categorical | 332 | 1998 |
| Tamilnadu Electricity Board Hourly Readings | Multivariate | Classification, Regression, Clustering | Real | 45781 | 2013 |
| Taxi Service Trajectory - Prediction Challenge, ECML PKDD 2015 | Multivariate, Sequential, Time-Series, Domain-Theory | Clustering, Causal-Discovery | Real | 1710671 | 2015 |
| Teaching Assistant Evaluation | Multivariate | Classification | Categorical, Integer | 151 | 1997 |
| Tic-Tac-Toe Endgame | Multivariate | Classification | Categorical | 958 | 1991 |
| Twitter Data set for Arabic Sentiment Analysis | Text | Classification | | 2000 | 2014 |
| User Knowledge Modeling | Multivariate | Classification, Clustering | Integer | 403 | 2013 |
| USPTO Algorithm Challenge, run by NASA-Harvard Tournament Lab and TopCoder Problem: Pat | Domain-Theory | Classification | Integer | 306 | 2013 |
| Vertebral Column | Multivariate | Classification | Real | 310 | 2011 |
| Wholesale customers | Multivariate | Classification, Clustering | Integer | 440 | 2014 |
| Wilt | Multivariate | Classification | | 4889 | 2014 |
| Yacht Hydrodynamics | Multivariate | Regression | Real | 308 | 2013 |
| Yeast | Multivariate | Classification | Real | 1484 | 1996 |
| YouTube Comedy Slam Preference Data | Text | Classification | | 1138562 | 2012 |
| YouTube Spam Collection | Text | Classification | | 1956 | 2017 |
