Thyroid Disease

Donated on 12/31/1986

10 separate databases from Garavan Institute

Dataset Characteristics: Multivariate, Domain-Theory
Subject Area: Health and Medicine
Associated Tasks: Classification
Feature Type: Categorical, Real
Number of Instances: 7200
Number of Features: 5

Dataset Information

Additional Information

# From Garavan Institute
# Documentation: as given by Ross Quinlan
# 6 databases from the Garavan Institute in Sydney, Australia
# Approximately the following for each database:
    ** 2800 training (data) instances and 972 test instances
    ** Plenty of missing data
    ** 29 or so attributes, either Boolean or continuously-valued
# 2 additional databases, also from Ross Quinlan, are also here
    ** Hypothyroid.data and sick-euthyroid.data
    ** Quinlan believes that these databases have been corrupted
    ** Their format is highly similar to the other databases
# 1 more database of 9172 instances that cover 20 classes, and a related domain theory
# Another thyroid database from Stefan Aeberhard
    ** 3 classes, 215 instances, 5 attributes
    ** No missing values
# A Thyroid database suited for training ANNs
    ** 3 classes
    ** 3772 training instances, 3428 testing instances
    ** Includes cost data (donated by Peter Turney)
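As a quick consistency check on the counts quoted in the description, the sketch below adds up the training and test sizes for two of the databases. The dictionary keys are illustrative labels chosen for this example, not file names from the repository, and the Garavan figures are only approximate according to the description.

```python
# Split sizes as quoted in the description above.
# The keys are hypothetical labels used only in this sketch.
splits = {
    "garavan_each": {"train": 2800, "test": 972},   # "approximately", per database
    "ann": {"train": 3772, "test": 3428},
}

# Total number of instances per database (training plus test).
totals = {name: s["train"] + s["test"] for name, s in splits.items()}

for name, total in totals.items():
    print(f"{name}: {total} instances")
```

The ANN total comes to 7200, which matches the instance count of 7200 given in the dataset summary, so that headline figure appears to describe the ANN database rather than the collection as a whole.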

Has Missing Values?

No

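Variables Table

| Variable Name | Role | Type | Description | Units | Missing Values |
| --- | --- | --- | --- | --- | --- |
| Class | Target | Categorical | | | no |
| Attribute1 | Feature | Integer | | | no |
| Attribute2 | Feature | Continuous | | | no |
| Attribute3 | Feature | Continuous | | | no |
| Attribute4 | Feature | Continuous | | | no |
| Attribute5 | Feature | Continuous | | | no |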
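The variables table can be turned into a typed record for loading. The following is a minimal sketch, assuming rows are comma-separated and that columns appear in the order of the table (class first, then the five attributes). Neither assumption is stated on this page, and the names `ThyroidRecord` and `parse_records` are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ThyroidRecord:
    cls: str           # Class (target, categorical)
    attribute1: int    # Attribute1 (integer feature)
    attribute2: float  # Attribute2..Attribute5 (continuous features)
    attribute3: float
    attribute4: float
    attribute5: float


def parse_records(text):
    """Parse comma-separated rows; column order is an assumption."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        records.append(
            ThyroidRecord(row[0].strip(), int(row[1]), *(float(v) for v in row[2:6]))
        )
    return records


# Made-up rows for illustration only; not taken from the dataset.
sample = "1,100,10.0,2.0,1.0,2.5\n2,120,20.5,4.1,0.3,-0.2\n"
records = parse_records(sample)
print(records[0])
```

If the real file uses a different delimiter or column order, only `parse_records` needs to change.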


Dataset Files

| File | Size |
| --- | --- |
| thyroid0387.data | 754.5 KB |
| ann-train.data | 258.5 KB |
| allhypo.data | 237.5 KB |
| allbp.data | 236.8 KB |
| ann-test.data | 235.5 KB |

Showing 5 of 39 files.
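The description notes that the Garavan databases contain plenty of missing data, so it can be useful to count missing entries per column after downloading a file. The sketch below assumes comma-separated rows and a `?` marker for missing values. Both are assumptions that this page does not confirm, and `count_missing` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

MISSING = "?"  # assumed missing-value marker; not confirmed by this page


def count_missing(lines):
    """Return a Counter mapping column index to number of missing entries."""
    counts = Counter()
    for row in csv.reader(lines):
        for idx, value in enumerate(row):
            if value.strip() == MISSING:
                counts[idx] += 1
    return counts


# Made-up rows for illustration only; not taken from the dataset.
sample = io.StringIO("41,F,?,1.3\n23,?,?,2.0\n46,M,0.9,?\n")
missing = count_missing(sample)
print(dict(missing))
```

For a downloaded file, an open file handle can be passed in place of the `io.StringIO` sample, since `csv.reader` accepts any iterable of lines.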

Papers Citing this Dataset

Support vector machine with quantile hyper-spheres for pattern classification

By Maoxiang Chu, Xiaoping Liu, Rongfen Gong, Jie Zhao. 2019

Published in PLoS ONE.

Extreme Value Theory for Open Set Classification -- GPD and GEV Classifiers

By Edoardo Vignotto, Sebastian Engelke. 2018

Published in ArXiv.

DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN

By Swee Lim, Yi Loo, Ngoc-Trung Tran, Ngai-Man Cheung, Gemma Roig, Yuval Elovici. 2018

Published in ArXiv.

Entity Attribute Value Style Modeling Approach for Archetype Based Data

By Shivani Batra, Shelly Sachdeva, Subhash Bhalla. 2018

Published in Information.

Credit Card Fraud Detection in e-Commerce: An Outlier Detection Approach

By Utkarsh Porwal, Smruthi Mukund. 2018

Published in ArXiv.

Showing 5 of 18 citing papers.


Creators

Ross Quinlan
