Breast Cancer

Donated on 7/10/1988

Breast Cancer Data (Restricted Access)

Dataset Characteristics

Multivariate

Subject Area

Health and Medicine

Associated Tasks

Classification

Feature Type

Categorical

# Instances

286

# Features

9

Dataset Information

Additional Information

This is one of three domains provided by the Oncology Institute that have repeatedly appeared in the machine learning literature. (See also lymphography and primary-tumor.) The data set includes 201 instances of one class (no-recurrence-events) and 85 instances of the other (recurrence-events). The instances are described by 9 attributes, some of which are linear and some of which are nominal.
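The class counts above imply a majority-class baseline, which is a useful reference point when judging any classifier trained on this data. A minimal sketch in Python, using only the counts stated in the description (the variable names are illustrative, not part of the dataset):

```python
# Class counts as stated in the dataset description.
class_counts = {"no-recurrence-events": 201, "recurrence-events": 85}

total = sum(class_counts.values())  # 286 instances in all
majority_class = max(class_counts, key=class_counts.get)

# Accuracy of a trivial classifier that always predicts the majority class.
baseline_accuracy = class_counts[majority_class] / total

print(f"{majority_class}: {baseline_accuracy:.1%} of {total} instances")
```

Because the classes are imbalanced at roughly 70% to 30%, a model needs to beat about 70.3% accuracy before it can be said to have learned anything beyond the class prior.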

Has Missing Values?

Yes

Variables Table

| Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
| --- | --- | --- | --- | --- | --- | --- |
| Class | Target | Binary | | no-recurrence-events, recurrence-events | | no |
| age | Feature | Categorical | Age | 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99 | years | no |
| menopause | Feature | Categorical | | lt40, ge40, premeno | | no |
| tumor-size | Feature | Categorical | | 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59 | | no |
| inv-nodes | Feature | Categorical | | 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39 | | no |
| node-caps | Feature | Binary | | yes, no | | yes |
| deg-malig | Feature | Integer | | 1, 2, 3 | | no |
| breast | Feature | Binary | | left, right | | no |
| breast-quad | Feature | Categorical | | left-up, left-low, right-up, right-low, central | | yes |
| irradiat | Feature | Binary | | yes, no | | no |
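The table can be turned into a simple record check. The sketch below copies the allowed values from the table and flags anything outside them. Two things are assumptions rather than facts from this page: that records arrive as dictionaries of strings, and that a missing value is written as `?`. The `validate` function and the example record are hypothetical.

```python
# Allowed values per variable, taken from the variables table.
ALLOWED = {
    "Class": {"no-recurrence-events", "recurrence-events"},
    "age": {f"{lo}-{lo + 9}" for lo in range(10, 100, 10)},       # 10-19 ... 90-99
    "menopause": {"lt40", "ge40", "premeno"},
    "tumor-size": {f"{lo}-{lo + 4}" for lo in range(0, 60, 5)},   # 0-4 ... 55-59
    # 0-2 ... 33-35, plus the wider final bin 36-39
    "inv-nodes": {f"{lo}-{lo + 2}" for lo in range(0, 36, 3)} | {"36-39"},
    "node-caps": {"yes", "no"},
    "deg-malig": {"1", "2", "3"},
    "breast": {"left", "right"},
    "breast-quad": {"left-up", "left-low", "right-up", "right-low", "central"},
    "irradiat": {"yes", "no"},
}

# Variables the table marks as having missing values.
MAY_BE_MISSING = {"node-caps", "breast-quad"}

# Assumption: the placeholder used for a missing value.
MISSING = "?"


def validate(record):
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    for name, allowed in ALLOWED.items():
        if name not in record:
            problems.append(f"{name}: field absent")
            continue
        value = record[name]
        if value == MISSING:
            if name not in MAY_BE_MISSING:
                problems.append(f"{name}: missing value not expected")
        elif value not in allowed:
            problems.append(f"{name}: unexpected value {value!r}")
    return problems


# Hypothetical record, for illustration only.
example = {
    "Class": "no-recurrence-events", "age": "30-39", "menopause": "premeno",
    "tumor-size": "30-34", "inv-nodes": "0-2", "node-caps": "?",
    "deg-malig": "3", "breast": "left", "breast-quad": "left-low",
    "irradiat": "no",
}
print(validate(example))
```

Keeping the allowed values in one dictionary means the same structure can later drive encoding or one-hot column generation.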

Additional Variable Information

1. Class: no-recurrence-events, recurrence-events
2. age: 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99
3. menopause: lt40, ge40, premeno
4. tumor-size: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59
5. inv-nodes: 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39
6. node-caps: yes, no
7. deg-malig: 1, 2, 3
8. breast: left, right
9. breast-quad: left-up, left-low, right-up, right-low, central
10. irradiat: yes, no
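The age, tumor-size, and inv-nodes values are ordered range bins, so one option for models that expect numeric input is to map each bin to its midpoint. This is a common preprocessing choice, not something the dataset page prescribes, and the helper name `bin_midpoint` is hypothetical.

```python
def bin_midpoint(label):
    """Convert a range label such as '30-39' to the midpoint of its bounds."""
    low, high = label.split("-")
    return (int(low) + int(high)) / 2


# Midpoints preserve the ordering of the bins.
for label in ["0-4", "30-39", "36-39"]:
    print(label, bin_midpoint(label))
```

Midpoint encoding keeps the ordering of the bins but treats the spacing between them as meaningful. One-hot encoding avoids that assumption at the cost of more columns.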

Class Labels

no-recurrence-events, recurrence-events

Keywords

cancer, health

Creators

Matjaz Zwitter

Milan Soklic
