Center for Machine Learning and Intelligent Systems
About  Citation Policy  Donate a Data Set  Contact



Check out the beta version of the new UCI Machine Learning Repository we are currently testing! Contact us if you have any issues, questions, or concerns.

Welcome to the UC Irvine Machine Learning Repository!

We currently maintain 622 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians.

Latest News:
09-24-2018: Welcome to the new Repository admins Dheeru Dua and Efi Karra Taniskidou!
04-04-2013: Welcome to the new Repository admins Kevin Bache and Moshe Lichman!
03-01-2010: Note from donor regarding Netflix data
10-16-2009: Two new data sets have been added.
09-14-2009: Several data sets have been added.
03-24-2008: New data sets have been added!
06-25-2007: Two new data sets have been added: UJI Pen Characters, MAGIC Gamma Telescope


Featured Data Set:  Gisette

Task: Classification
Data Type: Multivariate
# Attributes: 5000
# Instances: 13500

GISETTE is a handwritten digit recognition problem. The task is to separate the highly confusable digits '4' and '9'. This data set is one of five from the NIPS 2003 feature selection challenge.
Newest Data Sets:
06-05-2021: Average Localization Error (ALE) in sensor node localization process in WSNs
05-25-2021: 9mers from cullpdb
05-18-2021: TamilSentiMix
05-02-2021: Accelerometer
04-21-2021: Synchronous Machine Data Set
04-20-2021: Pedal Me Bicycle Deliveries
04-20-2021: Wikipedia Math Essentials
04-14-2021: Turkish Headlines Dataset
04-11-2021: Secondary Mushroom Dataset
04-03-2021: Power consumption of Tetouan city
Most Popular Data Sets (hits since 2007):
5261388: Iris
2740593: Adult
2209209: Dry Bean Dataset
2150573: Heart Disease
2142500: Wine
2131308: Wine Quality
2029072: Bank Marketing
1966461: Rice (Cammeo and Osmancik)
1950838: Breast Cancer Wisconsin (Diagnostic)
1731710: Car Evaluation
1559429: Raisin Dataset
1415913: Abalone
