Center for Machine Learning and Intelligent Systems
About  Citation Policy  Donate a Data Set  Contact


View ALL Data Sets

Welcome to the UC Irvine Machine Learning Repository!

We currently maintain 481 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians.

Latest News:
09-24-2018: Welcome to the new Repository admins Dheeru Dua and Efi Karra Taniskidou!
04-04-2013: Welcome to the new Repository admins Kevin Bache and Moshe Lichman!
03-01-2010: Note from donor regarding Netflix data
10-16-2009: Two new data sets have been added.
09-14-2009: Several data sets have been added.
03-24-2008: New data sets have been added!
06-25-2007: Two new data sets have been added: UJI Pen Characters, MAGIC Gamma Telescope


Featured Data Set:  OPPORTUNITY Activity Recognition

Task: Classification
Data Type: Multivariate, Time-Series
# Attributes: 242
# Instances: 2551

The OPPORTUNITY Dataset for Human Activity Recognition from Wearable, Object, and Ambient Sensors was devised to benchmark human activity recognition algorithms (classification, automatic data segmentation, sensor fusion, feature extraction, etc.).

Newest Data Sets:
07-30-2019: PPG-DaLiA
07-24-2019: Divorce Predictors data set
07-22-2019: Alcohol QCM Sensor Dataset
07-14-2019: Incident management process enriched event log
06-30-2019: Wave Energy Converters
06-22-2019: Query Analytics Workloads Dataset
06-17-2019: Opinion Corpus for Lebanese Arabic Reviews (OCLAR)
05-07-2019: Metro Interstate Traffic Volume
04-22-2019: Facebook Live Sellers in Thailand
04-15-2019: Gas sensor array temperature modulation
04-14-2019: Rice Leaf Diseases
04-10-2019: Parkinson Dataset with replicated acoustic features

Most Popular Data Sets (hits since 2007):
2830810: Iris
1578140: Adult
1224565: Wine
1034146: Car Evaluation
1016958: Wine Quality
1006413: Heart Disease
994417: Breast Cancer Wisconsin (Diagnostic)
976942: Bank Marketing
882592: Human Activity Recognition Using Smartphones
821986: Abalone
784203: Forest Fires
551230: Poker Hand
