ILPD (Indian Liver Patient Dataset)

Donated on 5/20/2012

Deaths from liver cirrhosis continue to rise, driven by increasing alcohol consumption, chronic hepatitis infections, and obesity-related liver disease. Despite this high mortality, liver disease does not affect all sub-populations equally. Early detection of pathology is a key determinant of patient outcomes, yet female patients appear to be marginalized when it comes to early diagnosis of liver pathology. The dataset comprises 583 patient records collected from the northeast of Andhra Pradesh, India. The prediction task is to determine whether a patient suffers from liver disease based on several biochemical markers, including albumin and enzymes involved in metabolism.
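One way to check whether a classifier misses liver disease more often in one gender than in the other is to compute recall separately per group. The sketch below is only an illustration: the function name `recall_by_group`, the label encoding (1 = liver disease, 0 = no liver disease), and the toy labels are all assumptions made for this example and are not taken from the dataset.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups, positive=1):
    """Return {group: recall}, where recall = TP / (TP + FN) for the positive class."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth != positive:
            continue  # recall only considers patients who truly have the disease
        if pred == positive:
            true_pos[group] += 1
        else:
            false_neg[group] += 1
    return {
        group: true_pos[group] / (true_pos[group] + false_neg[group])
        for group in set(true_pos) | set(false_neg)
    }


# Made-up toy labels for illustration only; NOT values from the dataset.
y_true = [1, 1, 1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"]

recalls = recall_by_group(y_true, y_pred, groups)
for group, value in sorted(recalls.items()):
    print(f"{group}: recall = {value:.2f}")
```

A lower recall for one group means that more of its true liver disease cases go undetected, which is the kind of disparity in early diagnosis described above.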

Dataset Characteristics


Subject Area

Life Science

Associated Tasks

Classification

Feature Type

Integer, Real

# Instances

583

# Features

10

Dataset Information

What do the instances in this dataset represent?

Medical patients

Does the dataset contain data that might be considered sensitive in any way?

Yes. The data contains information about the age and gender of the patients.

Was there any data preprocessing performed?

Any patient whose age exceeded 89 is listed as being of age "90".
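The top-coding rule above can be sketched in a few lines. This is an illustration of the rule as stated, not the donors' actual preprocessing code; the names `AGE_CAP` and `top_code_age` are hypothetical.

```python
AGE_CAP = 90  # hypothetical constant: the value recorded for any age above 89


def top_code_age(age, cap=AGE_CAP):
    """Return the age as it would be recorded: ages above 89 become 90."""
    return min(age, cap)


for raw_age in (45, 89, 90, 97):
    print(raw_age, "->", top_code_age(raw_age))
```

One consequence for analysis is that a recorded age of 90 means "90 or older", so it should be treated as a censored value rather than an exact age.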

Additional Information

This data set contains records of 416 patients diagnosed with liver disease and 167 patients without liver disease. This information is contained in the class label named 'Selector'. There are 10 variables per patient: age, gender, total Bilirubin, direct Bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT and Alkphos. Of the 583 patient records, 441 are male, and 142 are female.

The current dataset has been used to study:
- differences between US and Indian patients who suffer from liver diseases.
- gender-based disparities in predicting liver disease, as previous studies have found that biochemical markers do not have the same effectiveness for male and female patients.
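The class and gender counts stated in this section imply a noticeable imbalance, which matters when judging a classifier's accuracy. The short sketch below only redoes the arithmetic from those stated counts; the variable and function names are chosen for this example.

```python
# Counts as stated in the dataset description.
class_counts = {"liver disease": 416, "no liver disease": 167}
gender_counts = {"male": 441, "female": 142}

total = sum(class_counts.values())
# Both breakdowns should cover the same 583 records.
assert total == sum(gender_counts.values()) == 583


def shares(counts):
    """Convert raw counts into fractions of the total."""
    n = sum(counts.values())
    return {name: count / n for name, count in counts.items()}


class_shares = shares(class_counts)
gender_shares = shares(gender_counts)

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(class_counts.values()) / total

print({name: round(share, 3) for name, share in class_shares.items()})
print({name: round(share, 3) for name, share in gender_shares.items()})
print(round(majority_baseline, 3))
```

Because roughly 71% of the records belong to the liver disease class, plain accuracy can look good for a model that rarely predicts the negative class, so per-class and per-gender metrics are more informative here.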

Has Missing Values?


Variables Table

| Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
| --- | --- | --- | --- | --- | --- | --- |
| Age | Feature | Integer | Age | Age of the patient. Any patient whose age exceeded 89 is listed as being of age "90". | years | no |
| Gender | Feature | Binary | Gender | Gender of the patient | | no |
| TB | Feature | Continuous | | Total Bilirubin | | no |
| DB | Feature | Continuous | | Direct Bilirubin | | no |
| Alkphos | Feature | Integer | | Alkaline Phosphatase | | no |
| Sgpt | Feature | Integer | | Alanine Aminotransferase | | no |
| Sgot | Feature | Integer | | Aspartate Aminotransferase | | no |
| TP | Feature | Continuous | | Total Proteins | | no |
| A/G Ratio | Feature | Continuous | | Albumin and Globulin Ratio | | no |

1 citation


Liver disease diagnosis, Classification, health, health equity


Bendi Ramana

N. Venkateswarlu


