1. NoisyOffice: Corpus intended for cleaning (or binarization) and enhancement of noisy grayscale printed text images using supervised learning methods. Noisy images and their corresponding ground truth are provided.
2. ElectricityLoadDiagrams20112014: This data set contains the electricity consumption of 370 points/clients.
3. GPS Trajectories: The dataset was fed by an Android app called Go!Track. The app is available at the Google Play Store (https://play.google.com/store/apps/details?id=com.go.router).
4. Alcohol QCM Sensor Dataset: Five different QCM gas sensors are used, and five different gas measurements (1-octanol, 1-propanol, 2-butanol, 2-propanol and 1-isobutanol) are conducted with each of these sensors.
5. Fertility: 100 volunteers provided a semen sample, analyzed according to the WHO 2010 criteria. Sperm concentration is related to socio-demographic data, environmental factors, health status, and life habits.
6. Behavior of the urban traffic of the city of Sao Paulo in Brazil: The database was created from records of the behavior of urban traffic in the city of Sao Paulo, Brazil.
7. Paper Reviews: This sentiment analysis data set contains scientific paper reviews from an international conference on computing and informatics. The task is to predict the orientation or the evaluation of a review.
8. CSM (Conventional and Social Media Movies) Dataset 2014 and 2015: 12 features categorized as conventional and social media features. Includes both conventional features, collected from movie databases on the Web, and social media features (YouTube, Twitter).
9. Lab Test: This data set consists of ALT, AST, urea, glucose, and creatine kinase laboratory values of patients. Creatine kinase values have been converted according to general reference values.