Browse Datasets
Breast Cancer Wisconsin (Diagnostic)
Diagnostic Wisconsin Breast Cancer Database.
Breast Cancer Wisconsin (Original)
Original Wisconsin Breast Cancer Database
Breast Cancer
This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. This is one of three domains provided by the Oncology Institute that have repeatedly appeared in the machine learning literature. (See also lymphography and primary-tumor.)
Credit Approval
This data concerns credit card applications; good mix of attributes
Lung Cancer
Lung cancer data; no attribute definitions
Cervical Cancer (Risk Factors)
This dataset focuses on the prediction of indicators/diagnosis of cervical cancer. The features cover demographic information, habits, and historic medical records.
Clickstream Data for Online Shopping
The dataset contains clickstream data from an online store offering clothing for pregnant women.
Differentiated Thyroid Cancer Recurrence
This data set contains 13 clinicopathologic features for predicting recurrence of well-differentiated thyroid cancer. The data set was collected over a period of 15 years, and each patient was followed for at least 10 years.
Glioma Grading Clinical and Mutation Features
Gliomas are the most common primary tumors of the brain. They can be graded as LGG (Lower-Grade Glioma) or GBM (Glioblastoma Multiforme) depending on histological/imaging criteria. Clinical and molecular/mutation factors are also crucial for the grading process. The molecular tests that help diagnose glioma patients accurately are expensive. In this dataset, the 20 most frequently mutated genes and 3 clinical features are considered from the TCGA-LGG and TCGA-GBM brain glioma projects. The prediction task is to determine whether a patient is LGG or GBM given clinical and molecular/mutation features. The main objective is to find the optimal subset of mutation genes and clinical features for the glioma grading process to improve performance and reduce costs.
TCGA Kidney Cancers
The TCGA Kidney Cancers Dataset is a bulk RNA-seq dataset that contains transcriptome profiles of patients diagnosed with three different subtypes of kidney cancer. This dataset can be used to predict the specific kidney cancer subtype from normalized transcriptome profile data, and it offers hands-on experience with large, sparse genomic data.