1. Zoo: Artificial, 7 classes of animals
2. Thyroid Disease: 10 separate databases from Garvan Institute
3. Thoracic Surgery Data: The data address a classification problem concerning post-operative life expectancy in lung cancer patients: class 1 - death within one year after surgery, class 2 - survival.
4. Statlog (Heart): This dataset is a heart disease database similar to a database already present in the repository (Heart Disease databases) but in a slightly different form
5. SPECTF Heart: Data on cardiac Single Photon Emission Computed Tomography (SPECT) images. Each patient is classified into one of two categories: normal and abnormal.
6. SPECT Heart: Data on cardiac Single Photon Emission Computed Tomography (SPECT) images. Each patient is classified into one of two categories: normal and abnormal.
7. Soybean (Small): Michalski's famous soybean disease database
8. Soybean (Large): Michalski's famous soybean disease database
9. Quality Assessment of Digital Colposcopies: This dataset explores the subjective quality assessment of digital colposcopies.
10. Quadruped Mammals: The file animals.c is a data generator of structured instances representing quadruped animals
11. Primary Tumor: From Ljubljana Oncology Institute
12. Parkinsons: Oxford Parkinson's Disease Detection Dataset
13. Parkinson Speech Dataset with Multiple Types of Sound Recordings: The training data come from 20 Parkinson's Disease (PD) patients and 20 healthy subjects. Multiple types of sound recordings (26) are taken from all subjects.
14. Mushroom: From Audubon Society Field Guide; mushrooms described in terms of physical characteristics; classification: poisonous or edible
15. Mice Protein Expression: Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning.
16. Lymphography: This lymphography domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. (Restricted access)
17. Lung Cancer: Lung cancer data; no attribute definitions
18. KEGG Metabolic Relation Network (Directed): KEGG Metabolic pathways modeled as a directed relation network. A variety of graphical features is presented.
19. KEGG Metabolic Reaction Network (Undirected): KEGG Metabolic pathways modeled as an undirected reaction network. A variety of graphical features is presented.
20. ILPD (Indian Liver Patient Dataset): This data set contains 10 variables that are age, gender, total Bilirubin, direct Bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT and Alkphos.
21. Horse Colic: Well documented attributes; 368 instances with 28 attributes (continuous, discrete, and nominal); 30% missing values
22. Hepatitis: From G.Gong: CMU; Mostly Boolean or numeric-valued attribute types; Includes cost data (donated by Peter Turney)
23. Heart Disease: 4 databases: Cleveland, Hungary, Switzerland, and the VA Long Beach
24. HCC Survival: The Hepatocellular Carcinoma (HCC) dataset was collected at a university hospital in Portugal. It contains real clinical data of 165 patients diagnosed with HCC.
25. Forest type mapping: Multi-temporal remote sensing data of a forested area in Japan. The goal is to map different forest types using spectral data.
26. Fertility: 100 volunteers provide a semen sample analyzed according to the WHO 2010 criteria. Sperm concentration is related to socio-demographic data, environmental factors, health status, and life habits
27. EEG Steady-State Visual Evoked Potential Signals: This database consists of 30 subjects performing Brain Computer Interface for Steady State Visual Evoked Potentials (BCI-SSVEP).
28. EEG Eye State: The data set consists of 14 EEG values and a value indicating the eye state.
29. Echocardiogram: Data for classifying if patients will survive for at least one year after a heart attack
30. Early biomarkers of Parkinson’s disease based on natural connected speech: Predict a pattern of neurodegeneration in the dataset of speech features obtained from patients with early untreated Parkinson’s disease and patients at high risk of developing Parkinson’s disease.
31. Diabetic Retinopathy Debrecen Data Set: This dataset contains features extracted from the Messidor image set to predict whether an image contains signs of diabetic retinopathy or not.
32. Diabetes 130-US hospitals for years 1999-2008: This data has been prepared to analyze factors related to readmission as well as other outcomes pertaining to patients with diabetes.
33. Dermatology: The aim for this dataset is to determine the type of Erythemato-Squamous Disease.
34. Covertype: Forest CoverType dataset
35. Cervical cancer (Risk Factors): This dataset focuses on the prediction of indicators/diagnosis of cervical cancer. The features cover demographic information, habits, and historic medical records.
36. Cardiotocography: The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC) features on cardiotocograms classified by expert obstetricians.
37. Breast Tissue: Dataset with electrical impedance measurements of freshly excised tissue samples from the breast.
38. Breast Cancer Wisconsin (Prognostic): Prognostic Wisconsin Breast Cancer Database
39. Breast Cancer Wisconsin (Original): Original Wisconsin Breast Cancer Database
40. Breast Cancer Wisconsin (Diagnostic): Diagnostic Wisconsin Breast Cancer Database
41. Breast Cancer Coimbra: Clinical features were observed or measured for 64 patients with breast cancer and 52 healthy controls.
42. Autistic Spectrum Disorder Screening Data for Children: Autism screening data for children, suitable for classification and predictive tasks
43. Autistic Spectrum Disorder Screening Data for Adolescent: Autism screening data for adolescents. This dataset is related to classification and predictive tasks.
44. Audiology (Standardized): Standardized version of the original audiology database
45. Anuran Calls (MFCCs): Acoustic features extracted from syllables of anuran (frog) calls, including the family, the genus, and the species labels (multilabel).