Center for Machine Learning and Intelligent Systems




22 Data Sets



1. Air Quality: Contains the responses of a gas multisensor device deployed in the field in an Italian city. Hourly response averages are recorded along with gas concentration references from a certified analyzer.

2. Amazon Access Samples: Amazon's InfoSec is getting smarter about the way Access data is leveraged. This is an anonymized sample of access provisioned within the company.

3. Beijing Multi-Site Air-Quality Data: This hourly data set considers 6 main air pollutants and 6 relevant meteorological variables at multiple sites in Beijing.

4. Beijing PM2.5 Data: This hourly data set contains the PM2.5 data from the US Embassy in Beijing. Meteorological data from Beijing Capital International Airport are also included.

5. Buzz in social media: This data set contains examples of buzz events from two different social networks: Twitter, and Tom's Hardware, a forum network focusing on new technology with more conservative dynamics.

6. CNNpred: CNN-based stock market prediction using a diverse set of variables: This data set contains several daily features of the S&P 500, NASDAQ Composite, Dow Jones Industrial Average, RUSSELL 2000, and NYSE Composite from 2010 to 2017.

7. Condition monitoring of hydraulic systems: The data set addresses the condition assessment of a hydraulic test rig based on multi-sensor data. Four fault types are superimposed with several severity grades, impeding selective quantification.

8. Educational Process Mining (EPM): A Learning Analytics Data Set: The Educational Process Mining data set is built from recordings of 115 subjects' activities, captured through a logging application while they learned with an educational simulator.

9. EEG Steady-State Visual Evoked Potential Signals: This database consists of 30 subjects performing Brain-Computer Interface for Steady-State Visual Evoked Potentials (BCI-SSVEP).

10. Gas Sensor Array Drift Dataset at Different Concentrations: This archive contains 13910 measurements from 16 chemical sensors exposed to 6 different gases at various concentration levels.

11. Gas sensor array temperature modulation: A chemical detection platform composed of 14 temperature-modulated metal oxide (MOX) gas sensors was exposed for 3 weeks to mixtures of carbon monoxide and humid synthetic air in a gas chamber.

12. GNFUV Unmanned Surface Vehicles Sensor Data: The data set contains four sets of mobile sensor readings (humidity, temperature) corresponding to a swarm of four Unmanned Surface Vehicles (USVs) in a test bed in Athens, Greece.

13. GNFUV Unmanned Surface Vehicles Sensor Data Set 2: The data set contains eight (2x4) sets of mobile sensor readings (humidity, temperature) corresponding to a swarm of four Unmanned Surface Vehicles (USVs) in a test bed in Athens, Greece.

14. Metro Interstate Traffic Volume: Hourly Minneapolis-St Paul, MN traffic volume for westbound I-94. Includes weather and holiday features from 2012 to 2018.

15. News Popularity in Multiple Social Media Platforms: Large data set of news items and their respective social feedback on multiple platforms: Facebook, Google+ and LinkedIn.

16. Online Retail II: A real online retail transaction data set spanning two years.

17. PM2.5 Data of Five Chinese Cities: This hourly data set contains the PM2.5 data for Beijing, Shanghai, Guangzhou, Chengdu and Shenyang. Meteorological data for each city are also included.

18. PPG-DaLiA: PPG-DaLiA contains data from 15 subjects wearing physiological and motion sensors, providing a PPG dataset for motion compensation and heart rate estimation in Daily Life Activities.

19. Real-time Election Results: Portugal 2019: Data set of the real-time election results of the 2019 Portuguese Parliamentary Election.

20. SML2010: This data set was collected from a monitoring system mounted in a domotic house. It corresponds to approximately 40 days of monitoring data.

21. UJIIndoorLoc-Mag: UJIIndoorLoc-Mag is an indoor localization database for testing indoor positioning systems that rely on variations in Earth's magnetic field.

22. WESAD (Wearable Stress and Affect Detection): WESAD contains data from 15 subjects recorded during a stress-affect lab study while they wore physiological and motion sensors.

