Center for Machine Learning and Intelligent Systems

12 Data Sets

1. Ozone Level Detection: Two ground ozone level data sets are included in this collection. One is the eight hour peak set, the other is the one hour peak set. The data were collected from 1998 to 2004 in the Houston, Galveston and Brazoria area.

2. Waveform Database Generator (Version 1): CART book's waveform domains

3. Waveform Database Generator (Version 2): CART book's waveform domains

4. Glass Identification: From USA Forensic Science Service; 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K)

5. Ionosphere: Classification of radar returns from the ionosphere

6. HEPMASS: The search for exotic particles requires sorting through a large number of collisions to find the events of interest. This data set challenges one to detect a new particle of unknown mass.

7. Wine: Using chemical analysis to determine the origin of wines

8. Statlog (Landsat Satellite): Multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood

9. Connectionist Bench (Sonar, Mines vs. Rocks): The task is to train a network to discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock.

10. MAGIC Gamma Telescope: Data are Monte Carlo (MC) generated to simulate registration of high energy gamma particles in an atmospheric Cherenkov telescope

11. MiniBooNE particle identification: This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background).

12. Climate Model Simulation Crashes: Given Latin hypercube samples of 18 climate model input parameter values, predict climate model simulation crashes and determine the parameter value combinations that cause the failures.
