Center for Machine Learning and Intelligent Systems


99 Data Sets


| Name | Data Types | Default Task | Attribute Types | # Instances | # Attributes | Year |
|---|---|---|---|---|---|---|
| ICU | Multivariate, Time-Series | | Real | | | |
| Quadruped Mammals | Multivariate, Data-Generator | Classification | Real | | 72 | 1992 |
| PubChem Bioassay Data | Multivariate | Classification | Integer, Real | | | 2011 |
| Early biomarkers of Parkinson’s disease based on natural connected speech Data Set | Multivariate | Classification | Real | | | 2018 |
| Lung Cancer | Multivariate | Classification | Integer | 32 | 56 | 1992 |
| Multi-view Brain Networks | Multivariate | Classification, Clustering | Integer | 70 | 70 | 2020 |
| Cervical Cancer Behavior Risk | Multivariate, Univariate | Classification, Clustering | Integer | 72 | 19 | 2019 |
| Caesarian Section Classification Dataset | Univariate | Classification | Integer | 80 | | 2018 |
| Immunotherapy Dataset | Univariate | Classification | Integer, Real | 90 | | 2018 |
| Cryotherapy Dataset | Univariate | Classification | Integer, Real | 90 | | 2018 |
| Fertility | Multivariate | Classification, Regression | Real | 100 | 10 | 2013 |
| Breath Metabolomics | Multivariate, Time-Series | Classification, Clustering | Real | 104 | 1656 | 2019 |
| Autistic Spectrum Disorder Screening Data for Adolescent | Multivariate | Classification | Integer | 104 | 21 | 2017 |
| Breast Tissue | Multivariate | Classification | Real | 106 | 10 | 2010 |
| Breast Cancer Coimbra | Multivariate | Classification | Integer | 116 | 10 | 2018 |
| LSVT Voice Rehabilitation | Multivariate | Classification | Real | 126 | 309 | 2014 |
| Early biomarkers of Parkinson’s disease based on natural connected speech | Multivariate | Classification, Regression | Integer, Real | 130 | 65 | 2017 |
| Horton General Hospital | Multivariate, Time-Series | Causal-Discovery | Integer | 139 | | 2019 |
| Somerville Happiness Survey | | Classification | Integer | 143 | | 2018 |
| Iris | Multivariate | Classification | Real | 150 | | 1988 |
| HCC Survival | Multivariate | Classification | Integer, Real | 165 | 49 | 2017 |
| Divorce Predictors data set | Multivariate, Univariate | Classification | Integer | 170 | 54 | 2019 |
| Divorce Predictors data set | Multivariate, Univariate | Classification | Integer | 170 | 54 | 2019 |
| Bone marrow transplant: children | Multivariate | Classification, Regression | Integer, Real | 187 | 39 | 2020 |
| Amphibians | Multivariate | Classification | Integer, Real | 189 | 23 | 2020 |
| Parkinsons | Multivariate | Classification | Real | 197 | 23 | 2008 |
| Breast Cancer Wisconsin (Prognostic) | Multivariate | Classification, Regression | Real | 198 | 34 | 1995 |
| Risk Factor prediction of Chronic Kidney Disease | Multivariate | Classification, Regression | Real | 202 | 29 | 2021 |
| seeds | Multivariate | Classification, Clustering | Real | 210 | | 2012 |
| Daphnet Freezing of Gait | Multivariate, Time-Series | Classification | Real | 237 | | 2013 |
| Algerian Forest Fires Dataset | Multivariate | Classification, Regression | Real | 244 | 12 | 2019 |
| SPECTF Heart | Multivariate | Classification | Integer | 267 | 44 | 2001 |
| Quality Assessment of Digital Colposcopies | Multivariate | Classification | Real | 287 | 69 | 2017 |
| Autistic Spectrum Disorder Screening Data for Children | Multivariate | Classification | Integer | 292 | 21 | 2017 |
| Heart failure clinical records | Multivariate | Classification, Regression, Clustering | Integer, Real | 299 | 13 | 2020 |
| Abscisic Acid Signaling Network | Multivariate | Causal-Discovery | Integer | 300 | 43 | 2008 |
| extention of Z-Alizadeh sani dataset | | Classification | Integer, Real | 303 | 59 | 2017 |
| Z-Alizadeh Sani | | Classification | Integer, Real | 303 | 56 | 2017 |
| Haberman's Survival | Multivariate | Classification | Integer | 306 | | 1999 |
| Ecoli | Multivariate | Classification | Real | 336 | | 1996 |
| Exasens | Multivariate | Classification, Clustering | Integer | 399 | | 2020 |
| Exasens | Multivariate | Classification, Clustering | Integer | 399 | | 2020 |
| Refractive errors | Multivariate | Classification | Integer | 467 | 79 | 2020 |
| Thoracic Surgery Data | Multivariate | Classification | Integer, Real | 470 | 17 | 2013 |
| Demospongiae | Multivariate | Classification | Integer | 503 | | 2010 |
| Hungarian Chickenpox Cases | Time-Series | Regression | Real | 521 | 20 | 2021 |
| Breast Cancer Wisconsin (Diagnostic) | Multivariate | Classification | Real | 569 | 32 | 1995 |
| ILPD (Indian Liver Patient Dataset) | Multivariate | Classification | Integer, Real | 583 | 10 | 2012 |
| Shoulder Implant X-Ray Manufacturer Classification | Multivariate | Classification | Real | 597 | | 2020 |
| Shoulder Implant X-Ray Manufacturer Classification | Multivariate | Classification | Real | 597 | | 2020 |
| HCV data | Multivariate | Classification, Clustering | Integer, Real | 615 | 14 | 2020 |
| Breast Cancer Wisconsin (Original) | Multivariate | Classification | Integer | 699 | 10 | 1992 |
| gene expression cancer RNA-Seq | Multivariate | Classification, Clustering | Real | 801 | 20531 | 2016 |
| Cervical cancer (Risk Factors) | Multivariate | Classification | Integer, Real | 858 | 36 | 2017 |
| Raisin Dataset | Multivariate | Classification | Integer, Real | 900 | | 2021 |
| Arcene | Multivariate | Classification | Real | 900 | 10000 | 2008 |
| MicroMass | Multivariate | Classification | Real | 931 | 1300 | 2013 |
| Mammographic Mass | Multivariate | Classification | Integer | 961 | | 2007 |
| Parkinson Speech Dataset with Multiple Types of Sound Recordings | Multivariate | Classification, Regression | Integer, Real | 1040 | 26 | 2014 |
| QSAR fish bioconcentration factor (BCF) | Multivariate | Regression | Integer, Real | 1056 | | 2019 |
| Mice Protein Expression | Multivariate | Classification, Clustering | Real | 1080 | 82 | 2015 |
| Diabetic Retinopathy Debrecen Data Set | Multivariate | Classification | Integer, Real | 1151 | 20 | 2014 |
| Hepatitis C Virus (HCV) for Egyptian patients | Multivariate | Classification | Integer, Real | 1385 | 29 | 2019 |
| Yeast | Multivariate | Classification | Real | 1484 | | 1996 |
| One-hundred plant species leaves data set | | Classification | Real | 1600 | 64 | 2012 |
| Myocardial infarction complications | Multivariate | Classification | Real | 1700 | 124 | 2020 |
| Dorothea | Multivariate | Classification | Integer | 1950 | 100000 | 2008 |
| Estimation of obesity levels based on eating habits and physical condition | Multivariate | Classification, Regression, Clustering | Integer | 2111 | 17 | 2019 |
| Cardiotocography | Multivariate | Classification | Real | 2126 | 23 | 2010 |
| sEMG for Basic Hand movements | Time-Series | Classification | Real | 3000 | 2500 | 2014 |
| Simulated Falls and Daily Living Activities Data Set | Time-Series | Classification | Integer | 3060 | 138 | 2018 |
| Activity recognition using wearable physiological measurements | Multivariate | Classification | Real | 4480 | 533 | 2019 |
| chipseq | Sequential | Classification | Integer | 4960 | | 2018 |
| Parkinsons Telemonitoring | Multivariate | Regression | Integer, Real | 5875 | 26 | 2009 |
| Anuran Calls (MFCCs) | Multivariate | Classification, Clustering | Real | 7195 | 22 | 2017 |
| EEG Steady-State Visual Evoked Potential Signals | Multivariate, Time-Series | Classification, Regression | Integer | 9200 | 16 | 2018 |
| Smartphone-Based Recognition of Human Activities and Postural Transitions | Multivariate, Time-Series | Classification | Real | 10929 | 561 | 2015 |
| Epileptic Seizure Recognition | Multivariate, Time-Series | Classification, Clustering | Integer, Real | 11500 | 179 | 2017 |
| Cuff-Less Blood Pressure Estimation | Multivariate | Classification, Regression | Real | 12000 | | 2015 |
| EEG Eye State | Multivariate, Sequential, Time-Series | Classification | Integer, Real | 14980 | 15 | 2013 |
| p53 Mutants | Multivariate | Classification | Real | 16772 | 5409 | 2010 |
| EMG data for gestures | Time-Series | Classification | Real | 30000 | | 2019 |
| Physicochemical Properties of Protein Tertiary Structure | Multivariate | Regression | Real | 45730 | | 2013 |
| Tamilnadu Electricity Board Hourly Readings | Multivariate | Classification, Regression, Clustering | Real | 45781 | | 2013 |
| KEGG Metabolic Relation Network (Directed) | Multivariate, Univariate, Text | Classification, Regression, Clustering | Integer, Real | 53414 | 24 | 2011 |
| Secondary Mushroom Dataset | Univariate | Classification | Real | 61069 | 21 | 2021 |
| KEGG Metabolic Reaction Network (Undirected) | Multivariate, Univariate, Text | Classification, Regression, Clustering | Integer, Real | 65554 | 29 | 2011 |
| Activity recognition with healthy older people using a batteryless wearable sensor | Sequential | Classification | Real | 75128 | | 2016 |
| Influenza outbreak event prediction via Twitter data | Multivariate | Classification | Integer, Real | 75840 | 525 | 2020 |
| Diabetes 130-US hospitals for years 1999-2008 | Multivariate | Classification, Clustering | Integer | 100000 | 55 | 2014 |
| Sepsis survival minimal clinical records | Multivariate | Classification | Integer | 110341 | | 2020 |
| Reuters RCV1 RCV2 Multilingual, Multiview Text Categorization Test collection | Multivariate | Classification | Real | 111740 | | 2013 |
| Simulated data for survival modelling | Multivariate, Time-Series | Regression | Integer, Real | 120000 | 25 | 2018 |
| 9mers from cullpdb | Sequential | Classification, Regression | Real | 158716 | | 2021 |
| Localization Data for Person Activity | Univariate, Sequential, Time-Series | Classification | Real | 164860 | | 2010 |
| Drug Review Dataset (Drugs.com) | Multivariate, Text | Classification, Regression, Clustering | Integer | 215063 | | 2018 |
| Bar Crawl: Detecting Heavy Drinking | Multivariate, Time-Series | Classification, Regression | Real | 14057567 | | 2020 |
| Bar Crawl: Detecting Heavy Drinking | Multivariate, Time-Series | Classification, Regression | Real | 14057567 | | 2020 |
| KASANDR | Multivariate | Causal-Discovery | Integer | 17764280 | 2158859 | 2017 |
