Cervical Cancer (Risk Factors)
Donated on 3/2/2017
This dataset focuses on the prediction of indicators/diagnosis of cervical cancer. The features cover demographic information, habits, and historic medical records.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Integer, Real
# Instances
858
# Features
36
Dataset Information
Additional Information
The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela. It comprises demographic information, habits, and historic medical records of 858 patients. Several patients declined to answer some of the questions because of privacy concerns, so those fields are recorded as missing values.
Has Missing Values?
Yes
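Because unanswered questions show up as missing values, it helps to measure how much of each column is missing before modelling. The sketch below assumes the raw file marks unanswered fields with a placeholder string such as `?` (an assumption to verify against the CSV); the helper names `coerce_numeric` and `missing_fraction` and the toy rows are hypothetical and not part of the dataset.

```python
import pandas as pd


def coerce_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Convert every column to numeric, turning unparseable entries into NaN.

    Note: this maps ANY non-numeric string to NaN, not only the assumed
    "?" placeholder, so inspect the raw values first.
    """
    return df.apply(pd.to_numeric, errors="coerce")


def missing_fraction(df: pd.DataFrame) -> pd.Series:
    """Fraction of missing entries per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)


# Toy rows for illustration only; these are not values from the dataset.
toy = pd.DataFrame(
    {
        "Age": [18, 34, 52, 46],
        "Smokes": ["0", "?", "1", "?"],
        "IUD": ["0", "0", "?", "1"],
    }
)

numeric = coerce_numeric(toy)
summary = missing_fraction(numeric)
print(summary)
```

Columns with a high missing fraction are candidates for dropping or imputing; which option is appropriate depends on the task.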
Introductory Paper
By Kelwin Fernandes, Jaime S. Cardoso, Jessica C. Fernandes. 2017
Published in Iberian Conference on Pattern Recognition and Image Analysis
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
Age | Feature | Integer | Age | | | no |
Number of sexual partners | Feature | Continuous | Other | | | yes |
First sexual intercourse | Feature | Continuous | | | | yes |
Num of pregnancies | Feature | Continuous | | | | yes |
Smokes | Feature | Continuous | | | | yes |
Smokes (years) | Feature | Continuous | | | | yes |
Smokes (packs/year) | Feature | Continuous | | | | yes |
Hormonal Contraceptives | Feature | Continuous | | | | yes |
Hormonal Contraceptives (years) | Feature | Continuous | | | | yes |
IUD | Feature | Continuous | | | | yes |
Showing the first 10 of 36 variables.
Additional Variable Information
(int) Age
(int) Number of sexual partners
(int) First sexual intercourse (age)
(int) Num of pregnancies
(bool) Smokes
(bool) Smokes (years)
(bool) Smokes (packs/year)
(bool) Hormonal Contraceptives
(int) Hormonal Contraceptives (years)
(bool) IUD
(int) IUD (years)
(bool) STDs
(int) STDs (number)
(bool) STDs:condylomatosis
(bool) STDs:cervical condylomatosis
(bool) STDs:vaginal condylomatosis
(bool) STDs:vulvo-perineal condylomatosis
(bool) STDs:syphilis
(bool) STDs:pelvic inflammatory disease
(bool) STDs:genital herpes
(bool) STDs:molluscum contagiosum
(bool) STDs:AIDS
(bool) STDs:HIV
(bool) STDs:Hepatitis B
(bool) STDs:HPV
(int) STDs: Number of diagnosis
(int) STDs: Time since first diagnosis
(int) STDs: Time since last diagnosis
(bool) Dx:Cancer
(bool) Dx:CIN
(bool) Dx:HPV
(bool) Dx
(bool) Hinselmann: target variable
(bool) Schiller: target variable
(bool) Cytology: target variable
(bool) Biopsy: target variable
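The last four entries are marked as target variables, so a single flat table has to be split into features and targets before training. The following is a minimal sketch of that split, assuming the column headers match the names listed here (the headers in the actual file may be spelled differently and should be checked); `split_features_targets` and the toy rows are hypothetical.

```python
import pandas as pd

# Target names as they appear in the variable list of this page.
# Assumption: the real column headers match; verify before use.
TARGET_COLUMNS = ["Hinselmann", "Schiller", "Cytology", "Biopsy"]


def split_features_targets(df, target_columns=TARGET_COLUMNS):
    """Return (features, targets), with targets cast to bool."""
    absent = [c for c in target_columns if c not in df.columns]
    if absent:
        raise KeyError(f"target columns not found: {absent}")
    targets = df[list(target_columns)].astype(bool)
    features = df.drop(columns=list(target_columns))
    return features, targets


# Toy rows for illustration only; these are not values from the dataset.
toy = pd.DataFrame(
    {
        "Age": [18, 34, 52],
        "Smokes": [0, 1, 0],
        "Hinselmann": [0, 0, 1],
        "Schiller": [0, 1, 1],
        "Cytology": [0, 0, 0],
        "Biopsy": [0, 1, 0],
    }
)

features, targets = split_features_targets(toy)
print(features.columns.tolist())
print(targets.dtypes)
```

Raising on absent columns, rather than silently skipping them, makes a header mismatch visible immediately.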
Dataset Files
File | Size |
---|---|
risk_factors_cervical_cancer.csv | 99.7 KB |
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    cervical_cancer_risk_factors = fetch_ucirepo(id=383)

    # data (as pandas dataframes)
    X = cervical_cancer_risk_factors.data.features
    y = cervical_cancer_risk_factors.data.targets

    # metadata
    print(cervical_cancer_risk_factors.metadata)

    # variable information
    print(cervical_cancer_risk_factors.variables)
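With the targets loaded as a DataFrame, a common first check for a classification task is how often each target is positive. The sketch below runs on a small stand-in for `y` so that it works without network access; the helper `positive_rate`, the toy values, and the column names (taken from the variable list on this page) are assumptions for illustration.

```python
import pandas as pd


def positive_rate(y: pd.DataFrame, column: str) -> float:
    """Share of rows where the given binary target equals 1."""
    return float(y[column].astype(int).mean())


# Stand-in for `y`; toy values, not taken from the dataset.
toy_y = pd.DataFrame(
    {
        "Hinselmann": [0, 0, 1, 0],
        "Schiller": [0, 1, 1, 0],
        "Cytology": [0, 0, 0, 0],
        "Biopsy": [0, 0, 1, 0],
    }
)

rates = {col: positive_rate(toy_y, col) for col in toy_y.columns}
for col, rate in rates.items():
    print(f"{col}: {rate:.2f}")
```

On the real data, the same call would be `positive_rate(y, "Biopsy")`, provided the column name matches the fetched frame.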
Fernandes, K., Cardoso, J., & Fernandes, J. (2017). Cervical Cancer (Risk Factors) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5Z310.
Creators
Kelwin Fernandes
Jaime Cardoso
Jessica Fernandes
DOI
10.24432/C5Z310
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.