1. Breast Cancer Wisconsin (Prognostic): Prognostic Wisconsin Breast Cancer Database 2. Algerian Forest Fires Dataset : The dataset includes 244 instances that regroup a data of two regions of Algeria. 3. Hungarian Chickenpox Cases: A spatio-temporal dataset of weekly chickenpox cases from Hungary. The dataset consists of a county-level adjacency matrix and time series of the county-level reported cases between 2005 and 2015. 4. GPS Trajectories: The dataset has been feed by Android app called Go!Track. It is available at Goolge Play Store(https://play.google.com/store/apps/details?id=com.go.router). 5. Risk Factor prediction of Chronic Kidney Disease: Chronic kidney disease (CKD) is an increasing medical issue that declines the productivity of renal capacities and subsequently damages the kidneys. 6. Water Quality Prediction: The goal is to predict the spatio-temporal water quality in terms of the “power of hydrogen (pH)” value for the next day based on the historical data of water measurement indices. 7. Stock portfolio performance: The data set of performances of weighted scoring stock portfolios are obtained with mixture design from the US stock market historical database. 8. Forest Fires: This is a difficult regression task, where the aim is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data (see details at: http://www.dsi.uminho.pt/~pcortez/forestfires). 9. Concrete Slump Test: Concrete is a highly complex material. The slump flow of concrete is not only determined by the water content, but that is also influenced by other concrete ingredients. 10. Fertility: 100 volunteers provide a semen sample analyzed according to the WHO 2010 criteria. Sperm concentration are related to socio-demographic data, environmental factors, health status, and life habits 11. Heart failure clinical records: This dataset contains the medical records of 299 patients who had heart failure, collected during their follow-up period, where each patient profile has 13 clinical features. 12. South German Credit: 700 good and 300 bad credits with 20 predictor variables. Data from 1973 to 1975. Stratified sample from actual credits with bad credits heavily oversampled. A cost matrix can be used. 13. Tennis Major Tournament Match Statistics: This is a collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows. 14. Bone marrow transplant: children: The data set describes pediatric patients with several hematologic diseases, who were subject to the unmanipulated allogeneic unrelated donor hematopoietic stem cell transplantation. 15. South German Credit (UPDATE): 700 good and 300 bad credits with 20 predictor variables. Data from 1973 to 1975. Stratified sample from actual credits with bad credits heavily oversampled. A cost matrix can be used. 16. Early biomarkers of Parkinsons disease based on natural connected speech: Predict a pattern of neurodegeneration in the dataset of speech features obtained from patients with early untreated Parkinson’s disease and patients at high risk developing Parkinson’s disease. 17. Optical Interconnection Network : This dataset contains 640 performance measurements from a simulation of 2-Dimensional Multiprocessor Optical Interconnection Network. 18. Behavior of the urban traffic of the city of Sao Paulo in Brazil: The database was created with records of behavior of the urban traffic of the city of Sao Paulo in Brazil. 19. Student Performance: Predict student performance in secondary education (high school). 20. Facebook metrics: Facebook performance metrics of a renowned cosmetic's brand Facebook page. 21. Las Vegas Strip: This dataset includes quantitative and categorical features from online reviews from 21 hotels located in Las Vegas Strip, extracted from TripAdvisor (http://www.tripadvisor.com). 22. Paper Reviews: This sentiment analysis data set contains scientific paper reviews from an international conference on computing and informatics. The task is to predict the orientation or the evaluation of a review. 23. CSM (Conventional and Social Media Movies) Dataset 2014 and 2015: 12 features categorized as conventional and social media features. Both conventional features, collected from movies databases on Web as well as social media features(YouTube,Twitter). |