Thoracic Surgery Data
Donated on 11/12/2013
The data addresses a classification problem concerning post-operative life expectancy in lung cancer patients: class 1 is death within one year after surgery, class 2 is survival.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Integer, Real
# Instances
470
# Features
16
Dataset Information
What do the instances in this dataset represent?
Individual patients
Additional Information
The data was collected retrospectively at Wroclaw Thoracic Surgery Centre for patients who underwent major lung resections for primary lung cancer in the years 2007 to 2011. The Centre is associated with the Department of Thoracic Surgery of the Medical University of Wroclaw and Lower-Silesian Centre for Pulmonary Diseases, Poland, while the research database constitutes a part of the National Lung Cancer Registry, administered by the Institute of Tuberculosis and Pulmonary Diseases in Warsaw, Poland.
Has Missing Values?
No
Introductory Paper
By Maciej Zięba, Jakub M. Tomczak, M. Lubicz, J. Swiatek. 2014
Published in Applied Soft Computing
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
DGN | Feature | Categorical | | Diagnosis - specific combination of ICD-10 codes for primary and secondary as well as multiple tumours, if any | | no |
PRE4 | Feature | Continuous | | Forced vital capacity - FVC (numeric) | | no |
PRE5 | Feature | Continuous | | Volume that has been exhaled at the end of the first second of forced expiration - FEV1 (numeric) | | no |
PRE6 | Feature | Categorical | | Performance status - Zubrod scale (PRZ2,PRZ1,PRZ0) | | no |
PRE7 | Feature | Binary | | Pain before surgery (T,F) | | no |
PRE8 | Feature | Binary | | Haemoptysis before surgery (T,F) | | no |
PRE9 | Feature | Binary | | Dyspnoea before surgery (T,F) | | no |
PRE10 | Feature | Binary | | Cough before surgery (T,F) | | no |
PRE11 | Feature | Binary | | Weakness before surgery (T,F) | | no |
PRE14 | Feature | Categorical | | T in clinical TNM - size of the original tumour, from OC11 (smallest) to OC14 (largest) (OC11,OC14,OC12,OC13) | | no |
Additional Variable Information
1. DGN: Diagnosis - specific combination of ICD-10 codes for primary and secondary as well as multiple tumours, if any (DGN3,DGN2,DGN4,DGN6,DGN5,DGN8,DGN1)
2. PRE4: Forced vital capacity - FVC (numeric)
3. PRE5: Volume that has been exhaled at the end of the first second of forced expiration - FEV1 (numeric)
4. PRE6: Performance status - Zubrod scale (PRZ2,PRZ1,PRZ0)
5. PRE7: Pain before surgery (T,F)
6. PRE8: Haemoptysis before surgery (T,F)
7. PRE9: Dyspnoea before surgery (T,F)
8. PRE10: Cough before surgery (T,F)
9. PRE11: Weakness before surgery (T,F)
10. PRE14: T in clinical TNM - size of the original tumour, from OC11 (smallest) to OC14 (largest) (OC11,OC14,OC12,OC13)
11. PRE17: Type 2 DM - diabetes mellitus (T,F)
12. PRE19: MI up to 6 months (T,F)
13. PRE25: PAD - peripheral arterial diseases (T,F)
14. PRE30: Smoking (T,F)
15. PRE32: Asthma (T,F)
16. AGE: Age at surgery (numeric)
17. Risk1Y: 1-year survival period - (T)rue if the patient died within one year (T,F)

Class Distribution: the class value (Risk1Y) is binary valued.

Attribute | T | F |
---|---|---|
Risk1Y | 70 | 400 |

Summary Statistics

Binary Attributes Distribution (number of instances per value):

Attribute | T | F |
---|---|---|
PRE7 | 31 | 439 |
PRE8 | 68 | 402 |
PRE9 | 31 | 439 |
PRE10 | 323 | 147 |
PRE11 | 78 | 392 |
PRE17 | 35 | 435 |
PRE19 | 2 | 468 |
PRE25 | 8 | 462 |
PRE30 | 386 | 84 |
PRE32 | 368 | 2 |

Nominal Attributes Distribution:

Attribute | Value | Number of Instances |
---|---|---|
DGN | DGN3 | 349 |
DGN | DGN2 | 52 |
DGN | DGN4 | 47 |
DGN | DGN6 | 4 |
DGN | DGN5 | 15 |
DGN | DGN8 | 2 |
DGN | DGN1 | 1 |
PRE6 | PRZ2 | 27 |
PRE6 | PRZ1 | 313 |
PRE6 | PRZ0 | 130 |
PRE14 | OC11 | 177 |
PRE14 | OC14 | 17 |
PRE14 | OC12 | 257 |
PRE14 | OC13 | 19 |

Numeric Attributes Statistics:

Attribute | Min | Max | Mean | SD |
---|---|---|---|---|
PRE4 | 1.4 | 6.3 | 3.3 | 0.9 |
PRE5 | 0.96 | 86.3 | 4.6 | 11.8 |
AGE | 21 | 87 | 52.5 | 8.7 |
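The reported class distribution (70 patients with Risk1Y = T out of 470) means the classes are strongly imbalanced. The short calculation below is only an illustration of what those counts imply: a trivial classifier that always predicts survival would already reach high accuracy, so accuracy alone is probably a poor yardstick for models trained on this data. The variable names are chosen for this sketch and do not come from the dataset.

```python
# Class counts as reported in the class distribution for Risk1Y.
died_within_year = 70   # Risk1Y = T
survived = 400          # all other patients

total = died_within_year + survived

# Share of patients in the positive (died within one year) class.
positive_rate = died_within_year / total

# Accuracy of a trivial classifier that always predicts survival.
majority_baseline_accuracy = survived / total

# How many survivors there are for each patient who died.
imbalance_ratio = survived / died_within_year

print(f"instances: {total}")
print(f"positive rate: {positive_rate:.3f}")
print(f"majority baseline accuracy: {majority_baseline_accuracy:.3f}")
print(f"imbalance ratio: {imbalance_ratio:.2f} to 1")
```

Metrics that account for the minority class, such as recall on Risk1Y = T, may therefore be more informative than raw accuracy.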
Dataset Files
File | Size |
---|---|
ThoraricSurgery.arff | 23.7 KB |
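The file is distributed in ARFF format. As a rough sketch of how such a file is laid out, the snippet below parses a small dense ARFF text using only the Python standard library. The helper `read_simple_arff` and the sample text are hypothetical: the sample rows are made up for illustration, and the header of the real file is not shown on this page, so its exact attribute declarations are an assumption. For real use, an established reader such as `scipy.io.arff.loadarff` is likely a better choice.

```python
import csv

def read_simple_arff(text):
    """Parse a simple dense ARFF document into (attribute_names, rows).

    Handles only @relation, @attribute and @data with comma-separated rows.
    Sparse rows, attribute names containing spaces and date types are not
    supported. All values are returned as strings.
    """
    names = []
    rows = []
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("%"):  # blank line or ARFF comment
            continue
        if in_data:
            values = next(csv.reader([line], skipinitialspace=True))
            rows.append(dict(zip(names, values)))
        elif line.lower().startswith("@attribute"):
            # "@attribute NAME TYPE" -> keep NAME only
            names.append(line.split(None, 2)[1])
        elif line.lower().startswith("@data"):
            in_data = True
    return names, rows

# Made-up sample in ARFF syntax; not taken from the real file.
SAMPLE = """\
% illustrative sample only
@relation sample
@attribute PRE4 numeric
@attribute PRE7 {T,F}
@attribute AGE numeric
@attribute Risk1Y {T,F}
@data
2.88,F,60,F
3.40,T,51,T
"""

names, rows = read_simple_arff(SAMPLE)
print(names)
print(rows[0])
```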
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
thoracic_surgery_data = fetch_ucirepo(id=277)

# data (as pandas dataframes)
X = thoracic_surgery_data.data.features
y = thoracic_surgery_data.data.targets

# metadata
print(thoracic_surgery_data.metadata)

# variable information
print(thoracic_surgery_data.variables)
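Most learning algorithms need the T/F flags and the coded categories turned into numbers. The sketch below encodes one record, assuming the values arrive as the strings listed in the variable descriptions (what the loader actually returns was not checked here). The function `encode_record` and the sample record are hypothetical. The tumour-size order follows the description of PRE14; treating PRZ0 < PRZ1 < PRZ2 as the Zubrod order is an assumption based on the code names. DGN has no stated order, so it is left as a string.

```python
# Tumour size order is stated in the PRE14 description (OC11 smallest, OC14 largest).
TUMOUR_SIZE_ORDER = {"OC11": 1, "OC12": 2, "OC13": 3, "OC14": 4}
# Assumption: the numeric suffix of the PRE6 codes is the Zubrod score.
ZUBROD_ORDER = {"PRZ0": 0, "PRZ1": 1, "PRZ2": 2}

BINARY_COLUMNS = {"PRE7", "PRE8", "PRE9", "PRE10", "PRE11",
                  "PRE17", "PRE19", "PRE25", "PRE30", "PRE32", "Risk1Y"}
NUMERIC_COLUMNS = {"PRE4", "PRE5", "AGE"}

def encode_record(record):
    """Return a copy of `record` with binary, ordinal and numeric fields as numbers."""
    encoded = {}
    for name, value in record.items():
        if name in BINARY_COLUMNS:
            if value not in ("T", "F"):
                raise ValueError(f"unexpected value {value!r} for {name}")
            encoded[name] = 1 if value == "T" else 0
        elif name == "PRE6":
            encoded[name] = ZUBROD_ORDER[value]
        elif name == "PRE14":
            encoded[name] = TUMOUR_SIZE_ORDER[value]
        elif name in NUMERIC_COLUMNS:
            encoded[name] = float(value)
        else:
            # Nominal fields without a stated order (e.g. DGN) stay as strings.
            encoded[name] = value
    return encoded

# Made-up record for illustration; not a row from the dataset.
sample = {"DGN": "DGN3", "PRE4": "2.88", "PRE5": "2.16", "PRE6": "PRZ1",
          "PRE7": "F", "PRE10": "T", "PRE14": "OC12", "PRE30": "T",
          "AGE": "60", "Risk1Y": "F"}

encoded = encode_record(sample)
print(encoded)
```

Keeping DGN as a string leaves the choice of nominal encoding (for example one-hot) to the modelling step.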
Lubicz, M., Pawelczyk, K., Rzechonek, A., & Kolodziej, J. (2014). Thoracic Surgery Data [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5Z60N.
Creators
Marek Lubicz
Konrad Pawelczyk
Adam Rzechonek
Jerzy Kolodziej
DOI
10.24432/C5Z60N
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.