
HCV data
Donated on 6/9/2020
The data set contains laboratory values for blood donors and Hepatitis C patients, along with demographic values such as age and sex.
Dataset Characteristics
Multivariate
Subject Area
Life Science
Associated Tasks
Classification, Clustering
Feature Type
Integer, Real
# Instances
615
# Features
11
Dataset Information
What do the instances in this dataset represent?
Instances are patients
Additional Information
The target attribute for classification is Category: blood donors vs. Hepatitis C patients, with the latter split by disease progression ('just' Hepatitis C, Fibrosis, Cirrhosis).
Has Missing Values?
Yes
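Because some laboratory columns contain missing values, it can help to count the gaps before modelling. The sketch below uses a small toy frame with made-up numbers (not taken from the HCV data) and fills gaps with the column median. Median imputation is an assumption chosen for illustration, not a method prescribed by the dataset creators.

```python
import numpy as np
import pandas as pd

# Toy rows with made-up values, NOT taken from the HCV data.
toy = pd.DataFrame({
    "ALB": [40.0, np.nan, 44.0],
    "CHOL": [5.0, 6.0, np.nan],
    "CREA": [80.0, 90.0, 100.0],
})

# Count missing entries per column.
missing_per_column = toy.isna().sum()

# Fill each gap with the median of its column (NaN is skipped by default).
imputed = toy.fillna(toy.median(numeric_only=True))

print(missing_per_column)
print(imputed)
```

The same two calls should apply unchanged to a features frame loaded from the repository, although whether median imputation is appropriate depends on the analysis.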
Introductory Paper
By Georg F. Hoffmann, A. Bietenbeck, R. Lichtinghagen, F. Klawonn. 2018
Published in Journal of Laboratory and Precision Medicine
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
ID | ID | Integer | | Patient ID | | no |
Age | Feature | Integer | Age | | years | no |
Sex | Feature | Binary | Sex | | | no |
ALB | Feature | Continuous | | | | yes |
ALP | Feature | Continuous | | | | yes |
AST | Feature | Continuous | | | | yes |
BIL | Feature | Continuous | | | | no |
CHE | Feature | Continuous | | | | no |
CHOL | Feature | Continuous | | | | yes |
CREA | Feature | Continuous | | | | no |
Showing 10 of 13 variables.
Additional Variable Information
All attributes except Category and Sex are numerical. The laboratory data are attributes 5-14.
1) X (Patient ID/No.)
2) Category (diagnosis) (values: '0=Blood Donor', '0s=suspect Blood Donor', '1=Hepatitis', '2=Fibrosis', '3=Cirrhosis')
3) Age (in years)
4) Sex (f,m)
5) ALB
6) ALP
7) ALT
8) AST
9) BIL
10) CHE
11) CHOL
12) CREA
13) GGT
14) PROT
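The Category values combine a short code and a name in one string, so splitting them is often a useful first step. The sketch below assumes the labels appear in the data exactly as written in the description above. The helper names `split_category` and `is_hcv` are hypothetical, and treating suspect blood donors as non-HCV is an assumption that may not suit every analysis.

```python
# Category labels as listed in the variable description.
CATEGORY_LABELS = [
    "0=Blood Donor",
    "0s=suspect Blood Donor",
    "1=Hepatitis",
    "2=Fibrosis",
    "3=Cirrhosis",
]


def split_category(label):
    """Split a label such as '1=Hepatitis' into ('1', 'Hepatitis')."""
    code, _, name = label.partition("=")
    return code.strip(), name.strip()


def is_hcv(label):
    """Return True for Hepatitis, Fibrosis, and Cirrhosis.

    Assumption: both '0' and '0s' (suspect donor) count as non-HCV.
    """
    code, _ = split_category(label)
    return code in {"1", "2", "3"}


for label in CATEGORY_LABELS:
    print(split_category(label), is_hcv(label))
```

With pandas, a binary target could then be derived with something like `y["Category"].map(is_hcv)`, assuming the target column is named Category.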
Lichtinghagen, Ralf, Klawonn, Frank, and Hoffmann, Georg. (2020). HCV data. UCI Machine Learning Repository. https://doi.org/10.24432/C5D612.
@misc{misc_hcv_data_571,
  author       = {Lichtinghagen, Ralf and Klawonn, Frank and Hoffmann, Georg},
  title        = {{HCV data}},
  year         = {2020},
  howpublished = {UCI Machine Learning Repository},
  note         = {{DOI}: https://doi.org/10.24432/C5D612}
}
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
hcv_data = fetch_ucirepo(id=571)

# data (as pandas dataframes)
X = hcv_data.data.features
y = hcv_data.data.targets

# metadata
print(hcv_data.metadata)

# variable information
print(hcv_data.variables)
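Since Sex is stored as text (f,m) while the other features are numerical, many estimators will need it encoded first. The following is a minimal sketch on a toy frame with made-up rows. It assumes the features frame has a column named Sex holding 'f' and 'm', and the helper name `encode_sex` is hypothetical.

```python
import pandas as pd


def encode_sex(features):
    """Return a copy of the frame with Sex mapped from 'f'/'m' to 0/1."""
    out = features.copy()
    out["Sex"] = out["Sex"].map({"f": 0, "m": 1})
    return out


# Toy rows with made-up values, NOT taken from the HCV data.
toy_X = pd.DataFrame({"Age": [32, 45], "Sex": ["m", "f"]})
encoded = encode_sex(toy_X)
print(encoded)
```

The function copies its input, so the original frame keeps its text labels. Any value other than 'f' or 'm' would become NaN under this mapping.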
Creators
Ralf Lichtinghagen
lichtinghagen.ralf@mh-hannover.de
Institute of Clinical Chemistry; Medical University Hannover (MHH)
Frank Klawonn
frank.klawonn@helmholtz-hzi.de
Helmholtz Centre for Infection Research
Georg Hoffmann
georg.hoffmann@trillium.de
Trillium GmbH
DOI
10.24432/C5D612
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.