Cirrhosis Patient Survival Prediction

Donated on 9/11/2023

Use 17 clinical features to predict the survival state of patients with liver cirrhosis. The survival states are 0 = D (death), 1 = C (censored), and 2 = CL (censored due to liver transplantation).

Dataset Characteristics

Tabular

Subject Area

Health and Medicine

Associated Tasks

Classification

Feature Type

Real, Categorical

# Instances

418

# Features

17

Dataset Information

For what purpose was the dataset created?

Cirrhosis results from prolonged liver damage, leading to extensive scarring, often due to conditions like hepatitis or chronic alcohol consumption. The data provided is sourced from a Mayo Clinic study on primary biliary cirrhosis (PBC) of the liver carried out from 1974 to 1984.

Who funded the creation of the dataset?

Mayo Clinic

What do the instances in this dataset represent?

People

Does the dataset contain data that might be considered sensitive in any way?

Gender, Age

Was there any data preprocessing performed?

1. Drop all rows where the Drug column has a missing value (NA)
2. Impute the remaining missing values with the mean
3. One-hot encode all categorical attributes
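
The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used to prepare the dataset: the helper name `preprocess`, the toy rows, and the choice of which columns count as numeric or categorical are all made up for the example.

```python
from statistics import mean

NA = "NA"  # missing-value symbol used by the dataset


def preprocess(rows, numeric_cols, categorical_cols):
    """Hypothetical helper: drop rows lacking Drug, mean-impute, one-hot encode."""
    # Step 1: drop rows where Drug is missing.
    kept = [dict(r) for r in rows if r["Drug"] != NA]

    # Step 2: replace missing numeric values with the column mean,
    # computed over the observed values in the kept rows.
    for col in numeric_cols:
        observed = [float(r[col]) for r in kept if r[col] != NA]
        col_mean = mean(observed)
        for r in kept:
            r[col] = col_mean if r[col] == NA else float(r[col])

    # Step 3: one-hot encode each categorical column.
    # Any NA left in a categorical column would become its own level here.
    for col in categorical_cols:
        levels = sorted({r[col] for r in kept})
        for r in kept:
            value = r.pop(col)
            for level in levels:
                r[f"{col}_{level}"] = 1 if value == level else 0
    return kept


# Toy rows for illustration only; these are not records from cirrhosis.csv.
toy_rows = [
    {"Drug": "Placebo", "Sex": "F", "Bilirubin": "1.0"},
    {"Drug": "D-penicillamine", "Sex": "M", "Bilirubin": "NA"},
    {"Drug": "NA", "Sex": "F", "Bilirubin": "2.0"},
    {"Drug": "Placebo", "Sex": "F", "Bilirubin": "3.0"},
]
clean = preprocess(toy_rows, numeric_cols=["Bilirubin"], categorical_cols=["Drug", "Sex"])
```

In this sketch the mean is computed after the Drug filter, so dropped rows do not influence the imputed values. Whether the original preprocessing used the same order of operations is not stated.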

Additional Information

Between 1974 and 1984, 424 PBC patients referred to the Mayo Clinic qualified for the randomized placebo-controlled trial of the drug D-penicillamine. The first 312 of these patients took part in the trial and have largely complete data. The remaining 112 patients did not join the clinical trial but consented to having basic measurements recorded and to being followed for survival. Six of these patients were lost to follow-up shortly after diagnosis, so the data cover 106 of these individuals in addition to the 312 who were part of the randomized trial.
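
As a quick consistency check, the cohort counts in the paragraph above can be reproduced with simple arithmetic. The variable names are illustrative; the numbers come directly from the text.

```python
# Counts taken from the paragraph above.
eligible = 424            # PBC patients who qualified for the trial
randomized = 312          # patients who took part in the randomized trial
lost_to_follow_up = 6     # non-trial patients untraceable soon after diagnosis

non_trial = eligible - randomized                    # consented to basic measurements only
non_trial_with_data = non_trial - lost_to_follow_up  # non-trial patients with usable data
total_records = randomized + non_trial_with_data     # should match the instance count

print(non_trial, non_trial_with_data, total_records)  # 112 106 418
```

The total of 418 agrees with the number of instances listed for the dataset.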

Has Missing Values?

Yes (symbol: NA)
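
Because missing entries are written as the literal string NA, a reader needs to map that symbol to a null value when loading the file. The sketch below uses only the standard `csv` module; the helper name `read_rows` and the two-row toy excerpt are assumptions for illustration and do not reproduce real records or the real column order.

```python
import csv
import io


def read_rows(handle, na_symbol="NA"):
    """Hypothetical helper: read CSV rows, mapping the NA symbol to None."""
    reader = csv.DictReader(handle)
    return [
        {key: (None if value == na_symbol else value) for key, value in row.items()}
        for row in reader
    ]


# Toy excerpt for illustration only; not actual rows from cirrhosis.csv.
toy_csv = io.StringIO(
    "ID,Drug,Ascites,Cholesterol\n"
    "1,Placebo,N,NA\n"
    "2,NA,NA,250\n"
)
rows = read_rows(toy_csv)

# Count missing entries per column.
missing_per_column = {col: sum(row[col] is None for row in rows) for col in rows[0]}
```

For the real file, the same helper could be called on `open("cirrhosis.csv", newline="")`, assuming the file has a header row.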

Introductory Paper

Prognosis in primary biliary cirrhosis: Model for decision making

By E. Dickson, P. Grambsch, T. Fleming, L. Fisher, A. Langworthy. 1989

Published in Hepatology

Variables Table

Variable Name | Role | Type | Demographic | Description | Units | Missing Values
ID | ID | Integer | | unique identifier | | no
N_Days | Other | Integer | | number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986 | | no
Status | Target | Categorical | | status of the patient: C (censored), CL (censored due to liver tx), or D (death) | | no
Drug | Feature | Categorical | | type of drug: D-penicillamine or placebo | | yes
Age | Feature | Integer | Age | age | days | no
Sex | Feature | Categorical | Sex | M (male) or F (female) | | no
Ascites | Feature | Categorical | | presence of ascites: N (No) or Y (Yes) | | yes
Hepatomegaly | Feature | Categorical | | presence of hepatomegaly: N (No) or Y (Yes) | | yes
Spiders | Feature | Categorical | | presence of spiders: N (No) or Y (Yes) | | yes
Edema | Feature | Categorical | | presence of edema: N (no edema and no diuretic therapy for edema), S (edema present without diuretics, or edema resolved by diuretics), or Y (edema despite diuretic therapy) | | no


Additional Variable Information

1. ID: unique identifier
2. N_Days: number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986
3. Status: status of the patient C (censored), CL (censored due to liver tx), or D (death)
4. Drug: type of drug D-penicillamine or placebo
5. Age: age in [days]
6. Sex: M (male) or F (female)
7. Ascites: presence of ascites N (No) or Y (Yes)
8. Hepatomegaly: presence of hepatomegaly N (No) or Y (Yes)
9. Spiders: presence of spiders N (No) or Y (Yes)
10. Edema: presence of edema N (no edema and no diuretic therapy for edema), S (edema present without diuretics, or edema resolved by diuretics), or Y (edema despite diuretic therapy)
11. Bilirubin: serum bilirubin in [mg/dl]
12. Cholesterol: serum cholesterol in [mg/dl]
13. Albumin: albumin in [gm/dl]
14. Copper: urine copper in [ug/day]
15. Alk_Phos: alkaline phosphatase in [U/liter]
16. SGOT: SGOT in [U/ml]
17. Triglycerides: triglycerides in [mg/dl]
18. Platelets: platelets per cubic [ml/1000]
19. Prothrombin: prothrombin time in seconds [s]
20. Stage: histologic stage of disease (1, 2, 3, or 4)
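
Age and N_Days are both recorded in days, which is easy to misread as years. A small conversion sketch follows; the helper name `days_to_years` and the 365.25 days-per-year constant are assumptions chosen for the example, and the input value is illustrative rather than taken from the data.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def days_to_years(days):
    """Hypothetical helper: convert a duration in days to years."""
    return days / DAYS_PER_YEAR


# Illustrative value, not a record from the dataset.
age_years = days_to_years(7305)
print(age_years)  # 20.0
```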

Class Labels

Status: status of the patient 0 = D (death), 1 = C (censored), 2 = CL (censored due to liver transplantation)
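
The variable descriptions give Status as the letter codes C, CL, and D, while the class labels are the integers 0 to 2. A minimal sketch of the mapping between the two is shown here; the names `STATUS_TO_LABEL` and `encode_status` are hypothetical.

```python
# Mapping taken from the class-label definition above.
STATUS_TO_LABEL = {"D": 0, "C": 1, "CL": 2}


def encode_status(status):
    """Hypothetical helper: map a Status code to its integer class label."""
    return STATUS_TO_LABEL[status.strip().upper()]


labels = [encode_status(s) for s in ["C", "D", "CL"]]
print(labels)  # [1, 0, 2]
```

An unknown code raises `KeyError` in this sketch, which makes unexpected values visible instead of silently mislabeling them.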

Dataset Files

File | Size
cirrhosis.csv | 31.1 KB

Creators

E. Dickson

P. Grambsch

T. Fleming

L. Fisher

A. Langworthy

License
