Cirrhosis Patient Survival Prediction
Donated on 9/11/2023
Use 17 clinical features to predict the survival state of patients with liver cirrhosis. The survival states are 0 = D (death), 1 = C (censored), and 2 = CL (censored due to liver transplantation).
Dataset Characteristics
Tabular
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Real, Categorical
# Instances
418
# Features
17
Dataset Information
For what purpose was the dataset created?
Cirrhosis results from prolonged liver damage, leading to extensive scarring, often due to conditions like hepatitis or chronic alcohol consumption. The data provided is sourced from a Mayo Clinic study on primary biliary cirrhosis (PBC) of the liver carried out from 1974 to 1984.
Who funded the creation of the dataset?
Mayo Clinic
What do the instances in this dataset represent?
People
Does the dataset contain data that might be considered sensitive in any way?
Gender, Age
Was there any data preprocessing performed?
1. Drop all rows where the Drug column has a missing value (NA).
2. Impute the remaining missing values with the mean.
3. One-hot encode all categorical attributes.
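The three preprocessing steps can be sketched with pandas roughly as follows. This is a minimal illustration, not the dataset creators' actual script: the sample values are made up, the helper name `preprocess` is hypothetical, and computing the mean after the rows are dropped is an assumption, since the description does not say which order was used.

```python
import pandas as pd

# Tiny made-up sample using column names from the variables table;
# the values are illustrative and not taken from cirrhosis.csv.
raw = pd.DataFrame({
    "Drug": ["D-penicillamine", "Placebo", None, "Placebo"],
    "Sex": ["F", "M", "F", "F"],
    "Bilirubin": [14.5, 1.1, 0.7, 1.4],
    "Cholesterol": [261.0, None, 176.0, 302.0],
})

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the three listed steps."""
    # Step 1: drop rows with no recorded Drug value.
    out = df.dropna(subset=["Drug"]).copy()
    # Step 2: fill remaining numeric gaps with the column mean
    # (assumption: the mean is taken over the rows that survive step 1).
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())
    # Step 3: one-hot encode every non-numeric column.
    categorical_cols = [c for c in out.columns if c not in numeric_cols]
    return pd.get_dummies(out, columns=categorical_cols)

clean = preprocess(raw)
print(clean)
```

In this sample the third row is removed because its Drug value is missing, and the missing Cholesterol value in the second row is replaced by the mean of the two remaining observed values.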
Additional Information
Between 1974 and 1984, 424 PBC patients referred to the Mayo Clinic qualified for the randomized placebo-controlled trial of the drug D-penicillamine. The first 312 of these patients took part in the trial and have largely complete data. The remaining 112 patients did not join the clinical trial but agreed to have basic measurements recorded and to be followed for survival. Six of these patients were lost to follow-up soon after diagnosis, leaving data for 106 of these individuals in addition to the 312 who were part of the randomized trial.
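The cohort counts in this paragraph can be checked against the instance count reported for the dataset. The variable names below are chosen for illustration only.

```python
# Counts quoted in the paragraph above.
referred = 424        # PBC patients who qualified for the trial
randomized = 312      # patients who took part in the randomized trial
lost_to_follow_up = 6 # non-trial patients who could not be traced

not_randomized = referred - randomized            # patients outside the trial
followed = not_randomized - lost_to_follow_up     # non-trial patients with data
total_instances = randomized + followed           # rows expected in the dataset

print(not_randomized, followed, total_instances)
```

The total matches the 418 instances listed in the dataset characteristics.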
Has Missing Values?
Yes (symbol: NA)
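A quick way to see where the NA marker occurs is to count it per column. The sketch below uses only the standard library and a made-up three-row excerpt; it assumes a comma-separated file with a header row, which may not match the layout of cirrhosis.csv exactly, and the helper name `count_missing` is hypothetical.

```python
import csv
import io

# Made-up excerpt in the style of the dataset; values are illustrative only.
sample = io.StringIO(
    "ID,Drug,Ascites,Cholesterol\n"
    "1,D-penicillamine,Y,261\n"
    "2,Placebo,N,NA\n"
    "3,NA,NA,NA\n"
)

def count_missing(handle, marker="NA"):
    """Return {column name: number of cells equal to the missing-value marker}."""
    reader = csv.DictReader(handle)
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value == marker:
                counts[name] += 1
    return counts

missing = count_missing(sample)
print(missing)
```

To run the same check on the real file, pass an open file handle instead of the in-memory sample.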
Introductory Paper
By E. Dickson, P. Grambsch, T. Fleming, L. Fisher, A. Langworthy. 1989
Published in Hepatology
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
ID | ID | Integer | | unique identifier | | no |
N_Days | Other | Integer | | number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986 | | no |
Status | Target | Categorical | | status of the patient: C (censored), CL (censored due to liver transplantation), or D (death) | | no |
Drug | Feature | Categorical | | type of drug: D-penicillamine or placebo | | yes |
Age | Feature | Integer | Age | age | days | no |
Sex | Feature | Categorical | Sex | M (male) or F (female) | | no |
Ascites | Feature | Categorical | | presence of ascites: N (No) or Y (Yes) | | yes |
Hepatomegaly | Feature | Categorical | | presence of hepatomegaly: N (No) or Y (Yes) | | yes |
Spiders | Feature | Categorical | | presence of spiders: N (No) or Y (Yes) | | yes |
Edema | Feature | Categorical | | presence of edema: N (no edema and no diuretic therapy for edema), S (edema present without diuretics, or edema resolved by diuretics), or Y (edema despite diuretic therapy) | | no |
Additional Variable Information
1. ID: unique identifier
2. N_Days: number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986
3. Status: status of the patient C (censored), CL (censored due to liver transplantation), or D (death)
4. Drug: type of drug D-penicillamine or placebo
5. Age: age in [days]
6. Sex: M (male) or F (female)
7. Ascites: presence of ascites N (No) or Y (Yes)
8. Hepatomegaly: presence of hepatomegaly N (No) or Y (Yes)
9. Spiders: presence of spiders N (No) or Y (Yes)
10. Edema: presence of edema N (no edema and no diuretic therapy for edema), S (edema present without diuretics, or edema resolved by diuretics), or Y (edema despite diuretic therapy)
11. Bilirubin: serum bilirubin in [mg/dl]
12. Cholesterol: serum cholesterol in [mg/dl]
13. Albumin: albumin in [gm/dl]
14. Copper: urine copper in [ug/day]
15. Alk_Phos: alkaline phosphatase in [U/liter]
16. SGOT: SGOT in [U/ml]
17. Triglycerides: triglycerides in [mg/dl]
18. Platelets: platelets per cubic [ml/1000]
19. Prothrombin: prothrombin time in seconds [s]
20. Stage: histologic stage of disease (1, 2, 3, or 4)
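Because Age and N_Days are both recorded in days, it is often convenient to convert them to years before inspecting them. The sketch below is one way to do that; the 365.25-day year is an assumption (an average that accounts for leap years), and the patient values are made up for illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

def days_to_years(days: float) -> float:
    """Convert a duration in days to years."""
    return days / DAYS_PER_YEAR

# Hypothetical patient record, not taken from the dataset.
age_days = 21464
n_days = 400

age_years = days_to_years(age_days)
follow_up_years = days_to_years(n_days)

print(f"Age: {age_years:.1f} years, follow-up: {follow_up_years:.2f} years")
```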
Class Labels
Status: status of the patient 0 = D (death), 1 = C (censored), 2 = CL (censored due to liver transplantation)
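The class label mapping can be written as a small lookup table. This is a minimal sketch; the names `STATUS_TO_LABEL`, `LABEL_TO_STATUS`, and `encode_status` are hypothetical, and the input list is made up.

```python
# Mapping taken from the class labels above.
STATUS_TO_LABEL = {"D": 0, "C": 1, "CL": 2}
# Reverse mapping for turning predictions back into status codes.
LABEL_TO_STATUS = {label: status for status, label in STATUS_TO_LABEL.items()}

def encode_status(values):
    """Map status codes (D, C, CL) to integer class labels."""
    return [STATUS_TO_LABEL[value] for value in values]

statuses = ["D", "C", "CL", "C"]  # made-up example values
labels = encode_status(statuses)
print(labels)
```

An unknown status code raises a `KeyError`, which surfaces unexpected values instead of silently mislabeling them.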
Dataset Files
File | Size |
---|---|
cirrhosis.csv | 31.1 KB |
pip install ucimlrepo

from ucimlrepo import fetch_ucirepo

# fetch dataset
cirrhosis_patient_survival_prediction = fetch_ucirepo(id=878)

# data (as pandas dataframes)
X = cirrhosis_patient_survival_prediction.data.features
y = cirrhosis_patient_survival_prediction.data.targets

# metadata
print(cirrhosis_patient_survival_prediction.metadata)

# variable information
print(cirrhosis_patient_survival_prediction.variables)
Dickson, E., Grambsch, P., Fleming, T., Fisher, L., & Langworthy, A. (1989). Cirrhosis Patient Survival Prediction [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5R02G.
Creators
E. Dickson
P. Grambsch
T. Fleming
L. Fisher
A. Langworthy
DOI
10.24432/C5R02G
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.