Center for Machine Learning and Intelligent Systems
About  Citation Policy  Donate a Data Set  Contact



Welcome to the UC Irvine Machine Learning Repository!

We currently maintain 497 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians.

Latest News:
09-24-2018: Welcome to the new Repository admins Dheeru Dua and Efi Karra Taniskidou!
04-04-2013: Welcome to the new Repository admins Kevin Bache and Moshe Lichman!
03-01-2010: Note from donor regarding Netflix data
10-16-2009: Two new data sets have been added.
09-14-2009: Several data sets have been added.
03-24-2008: New data sets have been added!
06-25-2007: Two new data sets have been added: UJI Pen Characters, MAGIC Gamma Telescope


Featured Data Set:  NSF Research Award Abstracts 1990-2003

Data Type: Text
Number of Instances: 129,000

This data set consists of (a) 129,000 abstracts describing NSF awards for basic research, (b) bag-of-word data files extracted from the abstracts, (c) a list of words used for indexing the bag-of-word
Newest Data Sets:
02-24-2020: Bar Crawl: Detecting Heavy Drinking
02-18-2020: Bias correction of numerical prediction model temperature forecast
12-24-2019: A study of Asian Religious and Biblical Texts
12-05-2019: Real-time Election Results: Portugal 2019
11-27-2019: QSAR fish bioconcentration factor (BCF)
10-16-2019: Kitsune Network Attack Dataset
10-11-2019: QSAR Bioconcentration classes dataset
10-06-2019: WISDM Smartphone and Smartwatch Activity and Biometrics Dataset
10-01-2019: QSAR oral toxicity
10-01-2019: QSAR androgen receptor
09-30-2019: Hepatitis C Virus (HCV) for Egyptian patients
09-23-2019: QSAR fish toxicity
Most Popular Data Sets (hits since 2007):
3347862: Iris
1838766: Adult
1418898: Wine
1260258: Breast Cancer Wisconsin (Diagnostic)
1233423: Heart Disease
1229115: Wine Quality
1204537: Bank Marketing
1187768: Car Evaluation
988530: Human Activity Recognition Using Smartphones
937304: Abalone
885251: Forest Fires
669991: Student Performance
