Lung Cancer
Donated on 4/30/1992
Lung cancer data; no attribute definitions
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Integer
# Instances
32
# Features
56
Dataset Information
Additional Information
This data was used by Hong and Yang to illustrate the power of the optimal discriminant plane even in ill-posed settings. Applying the KNN method in the resulting plane gave 77% accuracy. However, these results are strongly biased (see Aeberhard's second ref. above, or email stefan@coral.cs.jcu.edu.au).

Results obtained by Aeberhard et al. are: RDA 62.5%, KNN 53.1%, Opt. Disc. Plane 59.4%.

The data describe 3 types of pathological lung cancers. The authors give no information on the individual variables nor on where the data was originally used.

Notes:
- In the original data, 4 values for the fifth attribute were -1. These values have been changed to ? (unknown). (*)
- In the original data, 1 value for the 39th attribute was 4. This value has been changed to ? (unknown). (*)
Has Missing Values?
Yes
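Because unknown values are encoded as `?`, a reader parsing the raw file has to map that token to a missing-value marker before converting fields to integers. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-separated with the class label in the first column. The rows in `SAMPLE` are hypothetical toy records for illustration, not real data from lung-cancer.data, and the helper names `parse_rows` and `missing_counts` are made up for this example.

```python
import csv
import io

# Hypothetical toy rows in an assumed comma-separated layout
# (class label first, then features). These are NOT real records.
SAMPLE = """1,0,3,0,?,0
2,0,2,3,2,1
3,0,3,3,?,2
"""


def parse_rows(text):
    """Parse comma-separated rows, mapping '?' to None and other fields to int."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        rows.append(
            [None if field.strip() == "?" else int(field) for field in record]
        )
    return rows


def missing_counts(rows):
    """Count missing (None) entries per column index."""
    n_cols = len(rows[0])
    return [sum(1 for row in rows if row[col] is None) for col in range(n_cols)]


rows = parse_rows(SAMPLE)
print(missing_counts(rows))
```

To use this on the actual file, one could pass the contents of lung-cancer.data to `parse_rows` in place of `SAMPLE`, provided the file matches the assumed layout.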
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
class | Target | Categorical | | | no |
Attribute1 | Feature | Categorical | | | no |
Attribute2 | Feature | Categorical | | | no |
Attribute3 | Feature | Categorical | | | no |
Attribute4 | Feature | Categorical | | | yes |
Attribute5 | Feature | Categorical | | | no |
Attribute6 | Feature | Categorical | | | no |
Attribute7 | Feature | Categorical | | | no |
Attribute8 | Feature | Categorical | | | no |
Attribute9 | Feature | Categorical | | | no |
Showing the first 10 of 57 variables (the class label plus 56 features).
Additional Variable Information
Attribute 1 is the class label. All predictive attributes are nominal, taking on integer values 0-3.
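The layout described above (class label first, nominal features in the range 0-3) can be checked with a few lines of Python. This is a sketch under stated assumptions: the rows below are hypothetical toy records, with `None` standing in for a missing value, and `split_label_features` and `out_of_range` are illustrative helper names rather than part of any library.

```python
from collections import Counter

# Hypothetical toy rows (class label first, then features).
# None marks a missing value. These are NOT real records.
rows = [
    [1, 0, 3, 0, None, 0],
    [2, 0, 2, 3, 2, 1],
    [3, 0, 3, 3, None, 2],
    [1, 1, 2, 0, 1, 3],
]


def split_label_features(rows):
    """Separate the first column (class label) from the remaining features."""
    y = [row[0] for row in rows]
    X = [row[1:] for row in rows]
    return X, y


def out_of_range(X, low=0, high=3):
    """Return (row, col) positions of non-missing values outside [low, high]."""
    return [
        (i, j)
        for i, row in enumerate(X)
        for j, value in enumerate(row)
        if value is not None and not (low <= value <= high)
    ]


X, y = split_label_features(rows)
print(Counter(y))        # class frequencies in the toy sample
print(out_of_range(X))   # an empty list means every feature value is within 0-3
```

Running the range check before modelling is a cheap way to catch stray codes such as the -1 and 4 values mentioned in the notes.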
Dataset Files
File | Size |
---|---|
lung-cancer.data | 3.6 KB |
lung-cancer.names | 2.1 KB |
Index | 126 Bytes |
Papers Citing this Dataset
By Zhiguo Zhou, Zhi-Jie Zhou, Hongxia Hao, Shulong Li, Xi Chen, You Zhang, Michael Folkert, Jing Wang. 2017
Published in ArXiv.
By Mitra Montazeri, Mahdieh Baghshah, Ahmad Enhesari. 2015
Published in J. Basic Appl. Sci. Res, 2013. 3(10): p. 134-140.
By Kwetishe Danjuma. 2015
Published in IJCSI International Journal of Computer Science Issues, Volume 12, Issue 2, March 2015.
By M. Reddy, L. Reddy. 2010
Published in International Journal of Computer Science Issues, IJCSI, Vol. 7, Issue 1, No. 1, January 2010, http://ijcsi.org/articles/Dimensionality-Reduction-An-Empirical-Study-on-the-Usability-of-IFE-CF-(Independent-Feature-Elimination-by-C-Correlation-and-F-Correlation)-Measures.php.
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
lung_cancer = fetch_ucirepo(id=62)

# data (as pandas dataframes)
X = lung_cancer.data.features
y = lung_cancer.data.targets

# metadata
print(lung_cancer.metadata)

# variable information
print(lung_cancer.variables)
Hong, Z. & Yang, J. (1991). Lung Cancer [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C57596.
Creators
Z.Q. Hong
J.Y. Yang
DOI
10.24432/C57596
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.