Welcome to the UC Irvine Machine Learning Repository
We currently maintain 634 datasets as a service to the machine learning community. Here, you can donate and find datasets used by millions of people all around the world!
Popular Datasets
Iris
A small classic dataset from Fisher, 1936. One of the earliest datasets used for evaluation of classification methodologies.
Heart Disease
Four databases: Cleveland, Hungary, Switzerland, and VA Long Beach
Adult
Predict whether income exceeds $50K/yr based on census data. Also known as "Census Income" dataset.
Dry Bean Dataset
Images of 13,611 grains of 7 different registered dry beans were taken with a high-resolution camera. A total of 16 features (12 dimensions and 4 shape forms) were obtained from the grains.
Diabetes
This diabetes dataset is from AIM '94
Wine
Using chemical analysis, determine the origin of wines
New Datasets
TamilSentiMix
We created a gold standard Tamil-English code-switched, sentiment-annotated corpus containing 15,744 comment posts from YouTube.
Non verbal tourists data
This dataset contains information about the non-verbal preferences of tourists
Power consumption of Tetouan city
This dataset covers the power consumption of three different distribution networks of Tetouan city, which is located in northern Morocco.
Secondary Mushroom Dataset
Dataset of simulated mushrooms for binary classification into edible and poisonous.
Accelerometer
Accelerometer data from vibrations of a cooler fan with weights on its blades. It can be used for predictions, classification and other tasks that require vibration analysis, especially in engines.
Pedal Me Bicycle Deliveries
A dataset of weekly bicycle package deliveries by Pedal Me in London during 2020 and 2021. Nodes in the graph represent geographical units and edges are proximity-based mutual adjacency relationships.

