HCC Survival
Donated on 11/28/2017
The Hepatocellular Carcinoma (HCC) dataset was collected at a University Hospital in Portugal. It contains real clinical data from 165 patients diagnosed with HCC.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Integer, Real
# Instances
165
# Features
49
Dataset Information
Additional Information
The HCC dataset was obtained at a University Hospital in Portugal and contains demographic, risk-factor, laboratory, and overall survival features of 165 real patients diagnosed with HCC. The dataset contains 49 features selected according to the EASL-EORTC (European Association for the Study of the Liver - European Organisation for Research and Treatment of Cancer) Clinical Practice Guidelines, which are the current state of the art in the management of HCC. This is a heterogeneous dataset, with 23 quantitative variables and 26 qualitative variables. Overall, missing data represent 10.22% of the whole dataset, and only eight patients (4.85%) have complete information in all fields. The target variable is survival at 1 year, encoded as a binary variable: 0 (dies) and 1 (lives). A certain degree of class imbalance is also present (63 cases labeled "dies" and 102 labeled "lives"). A detailed description of the HCC dataset (feature type/scale, range, mean/mode, and missing data percentages) is provided in Santos et al., "A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients," Journal of Biomedical Informatics, 58, 49-59, 2015.
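The summary figures above can be recomputed from any tabular copy of the data. The following is a minimal sketch, assuming the records are held in a pandas DataFrame with missing entries stored as NaN and the target in a column named `Class`. The function name `summarize` and the toy frame are illustrative and not part of the dataset.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame, target: str = "Class") -> dict:
    """Return missing-data and class-balance statistics for a DataFrame."""
    features = df.drop(columns=[target])
    return {
        # Share of all feature cells that are missing.
        "missing_pct": 100 * features.isna().to_numpy().mean(),
        # Share of rows with no missing feature at all.
        "complete_rows_pct": 100 * features.notna().all(axis=1).mean(),
        # Number of rows per class label.
        "class_counts": df[target].value_counts().to_dict(),
    }


# Toy frame with made-up values, only to show the call.
toy = pd.DataFrame(
    {
        "Age": [67, 62, np.nan, 77],
        "INR": [1.53, np.nan, 0.96, 0.95],
        "Class": [1, 1, 0, 0],
    }
)
stats = summarize(toy)
print(stats)

# Arithmetic check of two percentages implied by the counts in the text.
complete_pct = round(100 * 8 / 165, 2)   # eight complete cases out of 165
dies_pct = round(100 * 63 / 165, 2)      # 63 "dies" cases out of 165
print(complete_pct, dies_pct)
```

Applied to the real data, `missing_pct` and `complete_rows_pct` should land close to the 10.22% and 4.85% quoted above, provided missing entries were parsed as NaN.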
Has Missing Values?
Yes
Additional Variable Information
Gender: nominal
Symptoms: nominal
Alcohol: nominal
Hepatitis B Surface Antigen: nominal
Hepatitis B e Antigen: nominal
Hepatitis B Core Antibody: nominal
Hepatitis C Virus Antibody: nominal
Cirrhosis: nominal
Endemic Countries: nominal
Smoking: nominal
Diabetes: nominal
Obesity: nominal
Hemochromatosis: nominal
Arterial Hypertension: nominal
Chronic Renal Insufficiency: nominal
Human Immunodeficiency Virus: nominal
Nonalcoholic Steatohepatitis: nominal
Esophageal Varices: nominal
Splenomegaly: nominal
Portal Hypertension: nominal
Portal Vein Thrombosis: nominal
Liver Metastasis: nominal
Radiological Hallmark: nominal
Age at diagnosis: integer
Grams of Alcohol per day: continuous
Packs of cigarettes per year: continuous
Performance Status: ordinal
Encephalopathy degree: ordinal
Ascites degree: ordinal
International Normalised Ratio: continuous
Alpha-Fetoprotein (ng/mL): continuous
Haemoglobin (g/dL): continuous
Mean Corpuscular Volume (fL): continuous
Leukocytes (G/L): continuous
Platelets (G/L): continuous
Albumin (mg/dL): continuous
Total Bilirubin (mg/dL): continuous
Alanine transaminase (U/L): continuous
Aspartate transaminase (U/L): continuous
Gamma glutamyl transferase (U/L): continuous
Alkaline phosphatase (U/L): continuous
Total Proteins (g/dL): continuous
Creatinine (mg/dL): continuous
Number of Nodules: integer
Major dimension of nodule (cm): continuous
Direct Bilirubin (mg/dL): continuous
Iron (mcg/dL): continuous
Oxygen Saturation (%): continuous
Ferritin (ng/mL): continuous
Class: nominal (1 if patient survives, 0 if patient died)
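Because the features mix nominal, ordinal, and continuous scales, preprocessing usually treats each group differently. The following is a minimal sketch of one simple baseline, assuming a pandas DataFrame whose columns have been split into two hypothetical name lists, `categorical` and `continuous`, and that every column has at least one observed value. It fills categorical gaps with the column mode and continuous gaps with the column median. This is not the method used by the dataset authors, and the column names and values below are made up for illustration.

```python
import numpy as np
import pandas as pd


def impute_by_type(df, categorical, continuous):
    """Fill NaNs with the mode (categorical) or median (continuous)."""
    out = df.copy()
    for col in categorical:
        # mode() ignores NaN; take the first mode if there are ties.
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    for col in continuous:
        # median() also skips NaN by default.
        out[col] = out[col].fillna(out[col].median())
    return out


# Toy frame with made-up values, only to show the call.
toy = pd.DataFrame(
    {
        "Gender": [1, 1, np.nan, 0],
        "Ascites": [1, np.nan, 2, 1],
        "Albumin": [3.4, np.nan, 3.0, 4.1],
    }
)
filled = impute_by_type(
    toy, categorical=["Gender", "Ascites"], continuous=["Albumin"]
)
print(filled)
```

The function works on a copy, so the original frame keeps its missing entries and can be reused with a different imputation strategy.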
Dataset Files
File | Size |
---|---|
hcc-survival/hcc-data.txt | 22.2 KB |
hcc-survival/hcc-description.txt | 8.2 KB |
__MACOSX/hcc-survival/._hcc-description.txt | 428 Bytes |
__MACOSX/hcc-survival/._hcc-data.txt | 239 Bytes |
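When working from the downloaded archive rather than the Python package, the data file can be read directly. The following is a sketch, assuming `hcc-data.txt` is comma-separated with no header row, marks missing values with `?`, and stores the class label in the last column. These are assumptions to verify against `hcc-description.txt`. The helper name `load_hcc` is hypothetical, and the sample text is made up and shortened to four columns.

```python
import io

import pandas as pd


def load_hcc(source):
    """Read HCC records from a path or file-like object.

    Assumes comma-separated values, no header, '?' for missing entries,
    and the class label in the last column.
    """
    df = pd.read_csv(source, header=None, na_values="?")
    X = df.iloc[:, :-1]                 # all columns except the last
    y = df.iloc[:, -1].rename("Class")  # last column as the target
    return X, y


# Made-up sample standing in for the real file.
sample = io.StringIO("1,0,67,1\n0,?,62,1\n1,1,?,0\n")
X, y = load_hcc(sample)
print(X.shape, int(X.isna().sum().sum()), y.tolist())
```

With the archive extracted, the same call would take the path instead, for example `load_hcc("hcc-survival/hcc-data.txt")`.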
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    hcc_survival = fetch_ucirepo(id=423)

    # data (as pandas dataframes)
    X = hcc_survival.data.features
    y = hcc_survival.data.targets

    # metadata
    print(hcc_survival.metadata)

    # variable information
    print(hcc_survival.variables)
Santos, M., Abreu, P., Garcia-Laencina, P., Simao, A., & Carvalho, A. (2015). HCC Survival [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5TS4S.
Creators
Miriam Santos
Pedro Abreu
Pedro Garcia-Laencina
Adelia Simao
Armando Carvalho
DOI
10.24432/C5TS4S
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.