Center for Machine Learning and Intelligent Systems
13 Data Sets


1. Leaf: This dataset consists of a collection of shape and texture features extracted from digital images of leaf specimens originating from a total of 40 different plant species.

2. MHEALTH Dataset: The MHEALTH (Mobile Health) dataset is devised to benchmark techniques dealing with human behavior analysis based on multimodal body sensing.

3. Mesothelioma’s disease data set: The Mesothelioma’s disease data set was prepared at Dicle University Faculty of Medicine in Turkey. It contains data on 324 mesothelioma patients, and every sample has 34 features.

4. GPS Trajectories: The dataset was fed by an Android app called Go!Track, which is available on the Google Play Store (https://play.google.com/store/apps/details?id=com.go.router).

5. Concrete Slump Test: Concrete is a highly complex material. The slump flow of concrete is determined not only by the water content but also by the other concrete ingredients.

6. Planning Relax: The dataset concerns the classification of two mental states from recorded EEG signals: Planning (during imagination of a motor act) and Relax.

7. Optical Interconnection Network: This dataset contains 640 performance measurements from a simulation of a 2-Dimensional Multiprocessor Optical Interconnection Network.

8. Behavior of the urban traffic of the city of Sao Paulo in Brazil: The database was created from records of the behavior of urban traffic in the city of Sao Paulo, Brazil.

9. Paper Reviews: This sentiment analysis data set contains scientific paper reviews from an international conference on computing and informatics. The task is to predict the orientation or the evaluation of a review.

10. CSM (Conventional and Social Media Movies) Dataset 2014 and 2015: The dataset has 12 features, categorized as conventional features, collected from movie databases on the Web, and social media features (YouTube, Twitter).

11. Dresses_Attribute_Sales: This dataset contains attributes of dresses and their recommendations according to their sales. Sales are monitored on alternate days.

12. Student Academics Performance: The dataset is intended for predicting the end-of-semester percentage from different social, economic, and academic attributes.

13. Restaurant & consumer data: The dataset was obtained from a recommender system prototype. The task was to generate a top-n list of restaurants according to the consumer preferences.

