HCV data
Donated on 6/9/2020
The dataset contains laboratory values for blood donors and Hepatitis C patients, together with demographic values such as age and sex.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification, Clustering
Feature Type
Integer, Real
# Instances
615
# Features
12
Dataset Information
What do the instances in this dataset represent?
Instances are patients
Additional Information
The target attribute for classification is Category. It separates blood donors from Hepatitis C patients and, for the patients, records the stage of disease progression: Hepatitis C only, Fibrosis, or Cirrhosis.
Has Missing Values?
Yes
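Because some laboratory values are missing, most classifiers need the gaps handled before training. The following is a minimal sketch of median imputation for a single column. It assumes the column has already been loaded into a Python list with `None` marking each missing entry; the helper name `impute_median` and the numbers are made up for illustration and are not taken from the dataset.

```python
from statistics import median


def impute_median(column):
    """Return a copy of `column` with None entries replaced by the median
    of the observed (non-None) values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Made-up values for illustration only (not real dataset rows).
alb = [38.5, None, 46.9, 43.2]
print(impute_median(alb))
```

Median imputation is only one option; dropping incomplete rows or using a model that tolerates missing values may suit a given analysis better.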
Introductory Paper
By Georg F. Hoffmann, A. Bietenbeck, R. Lichtinghagen, F. Klawonn. 2018
Published in Journal of Laboratory and Precision Medicine
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
ID | ID | Integer | | Patient ID | | no |
Age | Feature | Integer | Age | | years | no |
Sex | Feature | Binary | Sex | | | no |
ALB | Feature | Continuous | | | | yes |
ALP | Feature | Continuous | | | | yes |
AST | Feature | Continuous | | | | yes |
BIL | Feature | Continuous | | | | no |
CHE | Feature | Continuous | | | | no |
CHOL | Feature | Continuous | | | | yes |
CREA | Feature | Continuous | | | | no |
10 of 14 variables shown.
Additional Variable Information
All attributes except Category and Sex are numerical. The laboratory data are attributes 5-14.
1) X (Patient ID/No.)
2) Category (diagnosis) (values: '0=Blood Donor', '0s=suspect Blood Donor', '1=Hepatitis', '2=Fibrosis', '3=Cirrhosis')
3) Age (in years)
4) Sex (f,m)
5) ALB
6) ALP
7) ALT
8) AST
9) BIL
10) CHE
11) CHOL
12) CREA
13) GGT
14) PROT
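The Category values combine a short code and a readable name in one string, so it can help to split them before modelling. The sketch below parses the five labels listed above and derives a binary target. The function names are hypothetical, and treating '0s=suspect Blood Donor' as a non-patient is an assumption made here for illustration; excluding those rows may be the better choice for some analyses.

```python
def parse_category(label):
    """Split a label such as '2=Fibrosis' into (code, name)."""
    code, _, name = label.partition("=")
    return code.strip(), name.strip()


def is_hepatitis_c(label):
    """Binary target: True for codes 1, 2 and 3 (Hepatitis, Fibrosis,
    Cirrhosis). Assumption: '0' and '0s' both count as non-patients."""
    code, _ = parse_category(label)
    return code in {"1", "2", "3"}


# The five label strings given in the attribute list.
LABELS = [
    "0=Blood Donor",
    "0s=suspect Blood Donor",
    "1=Hepatitis",
    "2=Fibrosis",
    "3=Cirrhosis",
]

for label in LABELS:
    print(parse_category(label), is_hepatitis_c(label))
```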
Dataset Files
File | Size |
---|---|
hcvdat0.csv | 45.1 KB |
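For readers who download the CSV file directly, the following sketch reads it with the Python standard library and converts the laboratory attributes to floats. Several things are assumed rather than confirmed by this page: that the header row uses the attribute names from the list above (the real file may name the first column differently), that missing values appear as `NA` or as empty fields, and the helper names `to_float` and `read_hcv`. The sample row is made up so the block runs without the file.

```python
import csv
import io

# Laboratory attributes 5-14 from the attribute list.
LAB_COLUMNS = ["ALB", "ALP", "ALT", "AST", "BIL",
               "CHE", "CHOL", "CREA", "GGT", "PROT"]


def to_float(text):
    """Convert a CSV field to float; treat '' and 'NA' as missing (assumed)."""
    text = text.strip()
    return None if text in {"", "NA"} else float(text)


def read_hcv(handle):
    """Read rows from an open CSV handle into a list of dicts."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        row["Age"] = int(raw["Age"])
        for name in LAB_COLUMNS:
            row[name] = to_float(raw[name])
        rows.append(row)
    return rows


# Made-up sample with the assumed header; not a real dataset row.
SAMPLE = (
    "X,Category,Age,Sex,ALB,ALP,ALT,AST,BIL,CHE,CHOL,CREA,GGT,PROT\n"
    "1,0=Blood Donor,40,m,40.1,60.2,20.5,25.3,8.1,7.5,NA,90.0,15.2,70.3\n"
)

rows = read_hcv(io.StringIO(SAMPLE))
print(rows[0]["Category"], rows[0]["Age"], rows[0]["CHOL"])
```

To read the actual file, the same function can be called on a handle opened with `open("hcvdat0.csv", newline="")`, after checking that the header matches the assumed column names.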
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    hcv_data = fetch_ucirepo(id=571)

    # data (as pandas dataframes)
    X = hcv_data.data.features
    y = hcv_data.data.targets

    # metadata
    print(hcv_data.metadata)

    # variable information
    print(hcv_data.variables)
Lichtinghagen, R., Klawonn, F., & Hoffmann, G. (2020). HCV data [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5D612.
Creators
Ralf Lichtinghagen
lichtinghagen.ralf@mh-hannover.de
Institute of Clinical Chemistry; Medical University Hannover (MHH)
Frank Klawonn
frank.klawonn@helmholtz-hzi.de
Helmholtz Centre for Infection Research
Georg Hoffmann
georg.hoffmann@trillium.de
Trillium GmbH
DOI
10.24432/C5D612
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows sharing and adaptation of the dataset for any purpose, provided that appropriate credit is given.