Thyroid Disease
Donated on 12/31/1986
10 separate databases from Garavan Institute
Dataset Characteristics
Multivariate, Domain-Theory
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Categorical, Real
# Instances
7200
# Features
5
Dataset Information
Additional Information
# From Garavan Institute
# Documentation: as given by Ross Quinlan
# 6 databases from the Garavan Institute in Sydney, Australia
# Approximately the following for each database:
** 2800 training (data) instances and 972 test instances
** Plenty of missing data
** 29 or so attributes, either Boolean or continuously-valued
# 2 additional databases, also from Ross Quinlan, are also here
** Hypothyroid.data and sick-euthyroid.data
** Quinlan believes that these databases have been corrupted
** Their format is highly similar to the other databases
# 1 more database of 9172 instances that cover 20 classes, and a related domain theory
# Another thyroid database from Stefan Aeberhard
** 3 classes, 215 instances, 5 attributes
** No missing values
# A Thyroid database suited for training ANNs
** 3 classes
** 3772 training instances, 3428 testing instances
** Includes cost data (donated by Peter Turney)
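The instance counts quoted in the description can be cross-checked with a little arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the Garavan figures are the approximate per-database counts, so that sum is indicative rather than exact.

```python
# Instance counts copied from the dataset description above.
ann_train = 3772        # ANN database, training instances
ann_test = 3428         # ANN database, testing instances
garavan_train = 2800    # approximate, per Garavan database
garavan_test = 972      # approximate, per Garavan database

# Total size of the ANN database (train + test).
ann_total = ann_train + ann_test

# Approximate total size of one Garavan database (train + test).
garavan_total = garavan_train + garavan_test

print("ANN total:", ann_total)
print("Garavan total (approx.):", garavan_total)
```

The ANN total works out to 7200, which is the figure listed under "# Instances", and the approximate Garavan total of 3772 equals the ANN training-set size.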
Has Missing Values?
No
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
Class | Target | Categorical | | | no |
Attribute1 | Feature | Integer | | | no |
Attribute2 | Feature | Continuous | | | no |
Attribute3 | Feature | Continuous | | | no |
Attribute4 | Feature | Continuous | | | no |
Attribute5 | Feature | Continuous | | | no |
Dataset Files
File | Size |
---|---|
thyroid0387.data | 754.5 KB |
ann-train.data | 258.5 KB |
allhypo.data | 237.5 KB |
allbp.data | 236.8 KB |
ann-test.data | 235.5 KB |
Showing 5 of 39 files.
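Since the description mentions plenty of missing data in the Garavan databases, it can help to count missing entries per column before modelling. The following is a minimal sketch, not the repository's own tooling. It assumes a file is comma-separated and marks missing values with `?`; neither assumption is stated on this page, so check both against the actual file. The function name `count_missing` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

MISSING = "?"  # assumption: marker used for missing values in the .data files


def count_missing(lines, missing=MISSING):
    """Return (row_count, Counter of column index -> number of missing values)."""
    rows = 0
    missing_by_col = Counter()
    for record in csv.reader(lines):
        if not record:  # skip blank lines
            continue
        rows += 1
        for idx, value in enumerate(record):
            if value.strip() == missing:
                missing_by_col[idx] += 1
    return rows, missing_by_col


# Made-up sample rows for illustration only; not taken from the real files.
sample = io.StringIO("41,F,f,1.3,?\n23,?,f,4.1,?\n\n")
rows, missing = count_missing(sample)
print(rows, sorted(missing.items()))
```

To run it on a downloaded file, pass an open file handle instead of the sample, for example one obtained with `open("allhypo.data", newline="")`.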
Papers Citing this Dataset
By Maoxiang Chu, Xiaoping Liu, Rongfen Gong, Jie Zhao. 2019
Published in PloS one.
By Edoardo Vignotto, Sebastian Engelke. 2018
Published in ArXiv.
By Swee Lim, Yi Loo, Ngoc-Trung Tran, Ngai-Man Cheung, Gemma Roig, Yuval Elovici. 2018
Published in ArXiv.
By Shivani Batra, Shelly Sachdeva, Subhash Bhalla. 2018
Published in Information.
By Utkarsh Porwal, Smruthi Mukund. 2018
Published in ArXiv.
Showing 5 of 18 papers.
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
thyroid_disease = fetch_ucirepo(id=102)

# data (as pandas dataframes)
X = thyroid_disease.data.features
y = thyroid_disease.data.targets

# metadata
print(thyroid_disease.metadata)

# variable information
print(thyroid_disease.variables)
Quinlan, R. (1986). Thyroid Disease [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5D010.
Creators
Ross Quinlan
DOI
10.24432/C5D010
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.