1. Algerian Forest Fires Dataset: The dataset includes 244 instances that regroup data from two regions of Algeria.
2. Auto MPG: Revised from the CMU StatLib library; the data concerns city-cycle fuel consumption.
3. Automobile: From 1985 Ward's Automotive Yearbook
4. Average Localization Error (ALE) in sensor node localization process in WSNs: This data set can be used to test any regression-based machine learning algorithm. You can predict the ALE variable using four features.
5. Bone marrow transplant: children: The data set describes pediatric patients with several hematologic diseases, who were subject to the unmanipulated allogeneic unrelated donor hematopoietic stem cell transplantation.
6. Breast Cancer Wisconsin (Prognostic): Prognostic Wisconsin Breast Cancer Database
7. Computer Hardware: Relative CPU Performance Data, described in terms of its cycle time, memory size, etc.
8. Concrete Slump Test: Concrete is a highly complex material. The slump flow of concrete is determined not only by the water content but also by the other concrete ingredients.
9. DrivFace: The DrivFace dataset contains image sequences of subjects while driving in real scenarios. It is composed of 606 samples of 640×480 pixels, acquired over different days from 4 drivers with several facial features.
10. Early biomarkers of Parkinson’s disease based on natural connected speech: Predict a pattern of neurodegeneration in the dataset of speech features obtained from patients with early untreated Parkinson’s disease and patients at high risk of developing Parkinson’s disease.
11. Energy efficiency: This study looked into assessing the heating load and cooling load requirements of buildings (that is, energy efficiency) as a function of building parameters.
12. Facebook metrics: Facebook performance metrics of a renowned cosmetics brand's Facebook page.
13. Forest Fires: This is a difficult regression task, where the aim is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data (see details at: http://www.dsi.uminho.pt/~pcortez/forestfires).
14. Gas sensor array exposed to turbulent gas mixtures: A chemical detection platform composed of 8 chemoresistive gas sensors was exposed to turbulent gas mixtures generated naturally in a wind tunnel. The acquired time series of the sensors are provided.
15. Heart failure clinical records: This dataset contains the medical records of 299 patients who had heart failure, collected during their follow-up period, where each patient profile has 13 clinical features.
16. ISTANBUL STOCK EXCHANGE: The data set includes returns of the Istanbul Stock Exchange along with seven other international indices (SP, DAX, FTSE, NIKKEI, BOVESPA, MSCE_EU, MSCI_EM) from Jun 5, 2009 to Feb 22, 2011.
17. Optical Interconnection Network: This dataset contains 640 performance measurements from a simulation of a 2-Dimensional Multiprocessor Optical Interconnection Network.
18. QSAR aquatic toxicity: Data set containing values for 8 attributes (molecular descriptors) of 546 chemicals used to predict quantitative acute aquatic toxicity towards Daphnia magna.
19. QSAR Bioconcentration classes dataset: Dataset of manually-curated Bioconcentration factor (BCF, fish) and mechanistic classes for QSAR modeling.
20. QSAR fish toxicity: Data set containing values for 6 attributes (molecular descriptors) of 908 chemicals used to predict quantitative acute aquatic toxicity towards the fish Pimephales promelas (fathead minnow).
21. Real estate valuation data set: The “real estate valuation” task is a regression problem. The historical market data set of real estate valuations was collected from Sindian Dist., New Taipei City, Taiwan.
22. Residential Building Data Set: Data set includes construction cost, sale prices, project variables, and economic variables corresponding to real estate single-family residential apartments in Tehran, Iran.
23. Risk Factor prediction of Chronic Kidney Disease: Chronic kidney disease (CKD) is a growing medical problem that reduces renal function and subsequently damages the kidneys.
24. Servo: Data was from a simulation of a servo system
25. South German Credit: 700 good and 300 bad credits with 20 predictor variables. Data from 1973 to 1975. Stratified sample from actual credits with bad credits heavily oversampled. A cost matrix can be used.
26. South German Credit (UPDATE): 700 good and 300 bad credits with 20 predictor variables. Data from 1973 to 1975. Stratified sample from actual credits with bad credits heavily oversampled. A cost matrix can be used.
27. Stock portfolio performance: The data set of performances of weighted scoring stock portfolios was obtained with a mixture design from the US stock market historical database.
28. Student Performance: Predict student performance in secondary education (high school).
29. Synchronous Machine Data Set: Synchronous motors (SMs) are AC motors with constant speed. An SM dataset is obtained from a real experimental setup. The task is to create strong models to estimate the excitation current of the SM.
30. Synchronous Machine Data Set: Synchronous motors (SMs) are AC motors with constant speed. An SM dataset is obtained from a real experimental setup. The task is to create strong models to estimate the excitation current of the SM.
31. Tennis Major Tournament Match Statistics: This is a collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.
32. Twin gas sensor arrays: 5 replicates of an 8-MOX gas sensor array were exposed to different gas conditions (4 volatiles at 10 concentration levels each).
33. Water Quality Prediction: The goal is to predict the spatio-temporal water quality in terms of the “power of hydrogen (pH)” value for the next day based on the historical data of water measurement indices.
34. wiki4HE: Survey of faculty members from two Spanish universities on teaching uses of Wikipedia
35. Yacht Hydrodynamics: Delft data set, used to predict the hydrodynamic performance of sailing yachts from dimensions and velocity.