Diabetic Retinopathy Debrecen
Donated on 11/2/2014
This dataset contains features extracted from the Messidor image set to predict whether an image contains signs of diabetic retinopathy or not.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Integer, Real
# Instances
1151
# Features
19
Dataset Information
What do the instances in this dataset represent?
Medical patients
Additional Information
All features represent either a detected lesion, a descriptive feature of an anatomical part, or an image-level descriptor. The underlying image analysis and feature extraction method, as well as our classification technique, is described in Antal and Hajdu, Knowledge-Based Systems, 2014. The image set (Messidor) is available at http://messidor.crihan.fr/index-en.php.
Has Missing Values?
No
Introductory Paper
By B. Antal, A. Hajdu. 2014
Published in Knowledge-Based Systems
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
quality | Feature | Binary | The binary result of quality assessment: 0 = bad quality, 1 = sufficient quality. | | no |
pre_screening | Feature | Binary | The binary result of pre-screening, where 1 indicates severe retinal abnormality and 0 its lack. | | no |
ma1 | Feature | Integer | ma1 - ma6 contain the results of MA detection. Each feature value stands for the number of MAs found at the confidence levels alpha = 0.5, ..., 1, respectively. | | no |
ma2 | Feature | Integer | | | no |
ma3 | Feature | Integer | | | no |
ma4 | Feature | Integer | | | no |
ma5 | Feature | Integer | | | no |
ma6 | Feature | Integer | | | no |
exudate1 | Feature | Continuous | exudate1 - exudate8 contain the same information as ma1 - ma6, but for exudates. However, as exudates are represented by a set of points rather than the number of pixels constructing the lesions, these features are normalized by dividing the number of lesions by the diameter of the ROI to compensate for different image sizes. | | no |
exudate2 | Feature | Continuous | | | no |
Additional Variable Information
0) The binary result of quality assessment: 0 = bad quality, 1 = sufficient quality.
1) The binary result of pre-screening, where 1 indicates severe retinal abnormality and 0 its lack.
2-7) The results of microaneurysm (MA) detection. Each feature value stands for the number of MAs found at the confidence levels alpha = 0.5, ..., 1, respectively.
8-15) The same information as 2-7), but for exudates. However, as exudates are represented by a set of points rather than the number of pixels constructing the lesions, these features are normalized by dividing the number of lesions by the diameter of the ROI to compensate for different image sizes.
16) The Euclidean distance between the center of the macula and the center of the optic disc, which provides important information regarding the patient’s condition. This feature is also normalized by the diameter of the ROI.
17) The diameter of the optic disc.
18) The binary result of the AM/FM-based classification.
19) Class label: 1 = contains signs of DR (accumulative label for the Messidor classes 1, 2, 3), 0 = no signs of DR.
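The column layout, the ROI normalization, and the label accumulation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' extraction code: the names given to columns 16-19 are hypothetical (the variables table only names columns up to the exudate features), the helper functions are invented for this example, and the assumption that Messidor grades run from 0 to 3 is stated in the code.

```python
def column_names():
    """Build 20 column names following the index list 0-19."""
    names = ["quality", "pre_screening"]            # indices 0-1
    names += [f"ma{i}" for i in range(1, 7)]        # indices 2-7
    names += [f"exudate{i}" for i in range(1, 9)]   # indices 8-15
    names += [
        "macula_opticdisc_distance",  # index 16 (hypothetical name)
        "opticdisc_diameter",         # index 17 (hypothetical name)
        "am_fm_classification",       # index 18 (hypothetical name)
        "class_label",                # index 19 (hypothetical name)
    ]
    return names


def normalize_by_roi(value, roi_diameter):
    """Divide a raw measurement by the ROI diameter, as described for
    the exudate counts and the macula/optic disc distance."""
    if roi_diameter <= 0:
        raise ValueError("ROI diameter must be positive")
    return value / roi_diameter


def binary_dr_label(messidor_grade):
    """Collapse a Messidor grade into the binary class label.

    Assumption: grades are integers 0-3, with 0 meaning no signs of DR.
    """
    if messidor_grade not in (0, 1, 2, 3):
        raise ValueError("expected a Messidor grade between 0 and 3")
    return 1 if messidor_grade >= 1 else 0


names = column_names()
print(len(names), names[2], names[15])
print(normalize_by_roi(10, 4))
print([binary_dr_label(g) for g in (0, 1, 2, 3)])
```

The published features are already normalized, so `normalize_by_roi` only illustrates the arithmetic; it should not be applied to the dataset values a second time.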
Dataset Files
File | Size |
---|---|
messidor_features.arff | 114.5 KB |
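For readers who download the ARFF file directly, the following is a minimal sketch of reading a dense, all-numeric ARFF file with only the Python standard library. The function `read_arff_rows` is a hypothetical helper written for this example, and it assumes unquoted attribute names without spaces and comma-separated numeric data rows. The sample text is made up for illustration and is not taken from messidor_features.arff.

```python
import io


def read_arff_rows(lines):
    """Return (attribute_names, rows) from a dense, numeric ARFF source.

    Simplifying assumptions: attribute names contain no spaces or quotes,
    and every data value parses as a number.
    """
    names, rows, in_data = [], [], False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blanks and comments
            continue
        if in_data:
            rows.append([float(v) for v in line.split(",")])
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])
        elif line.lower() == "@data":
            in_data = True
    return names, rows


# Made-up sample with three columns, for illustration only.
sample = io.StringIO(
    "% illustrative header, not the real file\n"
    "@relation demo\n"
    "@attribute quality numeric\n"
    "@attribute ma1 numeric\n"
    "@attribute Class {0,1}\n"
    "@data\n"
    "1,22,0\n"
    "1,24,1\n"
)
names, rows = read_arff_rows(sample)
print(names)
print(rows)
```

With the real file, the same function could be called as `read_arff_rows(open("messidor_features.arff"))`, assuming the file follows the simplifications above. If SciPy is installed, `scipy.io.arff.loadarff` is a more robust alternative.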
pip install ucimlrepo
    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    diabetic_retinopathy_debrecen = fetch_ucirepo(id=329)

    # data (as pandas dataframes)
    X = diabetic_retinopathy_debrecen.data.features
    y = diabetic_retinopathy_debrecen.data.targets

    # metadata
    print(diabetic_retinopathy_debrecen.metadata)

    # variable information
    print(diabetic_retinopathy_debrecen.variables)
Antal, B. & Hajdu, A. (2014). Diabetic Retinopathy Debrecen [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5XP4P.
Creators
Balint Antal
Andras Hajdu
DOI
10.24432/C5XP4P
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.