Census Income
Donated on 4/30/1996
Predict whether income exceeds $50K/yr based on census data. Also known as the Adult dataset.
Dataset Characteristics
Multivariate
Subject Area
Social Science
Associated Tasks
Classification
Feature Type
Categorical, Integer
# Instances
48842
# Features
14
Dataset Information
Additional Information
Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)). The prediction task is to determine whether a person makes over $50K a year.
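The extraction conditions can be read as a row filter. The following is a minimal sketch, assuming each raw census record is a dict keyed by the variable names used in the condition (AAGE, AGI, AFNLWGT, HRSWK). The helper name `is_clean_record` and the sample records are hypothetical and are not part of the original extraction code.

```python
def is_clean_record(record):
    """Return True if a raw census record satisfies the extraction conditions.

    Assumes `record` is a dict with numeric AAGE, AGI, AFNLWGT and HRSWK
    fields (a hypothetical layout, for illustration only).
    """
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Made-up records, for illustration only.
sample = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},
    {"AAGE": 15, "AGI": 500, "AFNLWGT": 1200, "HRSWK": 10},   # fails AAGE > 16
    {"AAGE": 52, "AGI": 30000, "AFNLWGT": 2000, "HRSWK": 0},  # fails HRSWK > 0
]
clean = [r for r in sample if is_clean_record(r)]
print(len(clean))
```

All four conditions are joined with a logical AND, so a record is dropped as soon as any one of them fails.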
Has Missing Values?
Yes
Variables Table
Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
---|---|---|---|---|---|---|
age | Feature | Integer | Age | N/A | | no |
workclass | Feature | Categorical | Income | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked. | | yes |
fnlwgt | Feature | Integer | | | | no |
education | Feature | Categorical | Education Level | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool. | | no |
education-num | Feature | Integer | Education Level | | | no |
marital-status | Feature | Categorical | Other | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse. | | no |
occupation | Feature | Categorical | Other | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces. | | yes |
relationship | Feature | Categorical | Other | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried. | | no |
race | Feature | Categorical | Race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black. | | no |
sex | Feature | Binary | Sex | Female, Male. | | no |
Additional Variable Information
Class labels: >50K, <=50K.
Listing of attributes:
age: continuous.
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
fnlwgt: continuous.
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
education-num: continuous.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
sex: Female, Male.
capital-gain: continuous.
capital-loss: continuous.
hours-per-week: continuous.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
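A raw record can be mapped onto the attributes listed above. The sketch below assumes a comma-separated line whose fields follow the listed attribute order and end with the income label, and it assumes missing values are written as `?`. The names `COLUMNS`, `INTEGER_COLUMNS` and `parse_record`, and the sample line, are illustrative only.

```python
import csv

# Attribute order as listed above, with the income label assumed to come last.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]
# Attributes described as "continuous" in the listing.
INTEGER_COLUMNS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}


def parse_record(line):
    """Parse one comma-separated line into a dict keyed by attribute name."""
    # skipinitialspace drops the blank that follows each comma.
    fields = next(csv.reader([line], skipinitialspace=True))
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = {}
    for name, raw in zip(COLUMNS, fields):
        value = raw.strip()
        if value == "?":  # assumed missing-value marker
            record[name] = None
        elif name in INTEGER_COLUMNS:
            record[name] = int(value)
        else:
            record[name] = value
    return record


# Made-up line, for illustration only.
line = ("45, ?, 120000, HS-grad, 9, Divorced, ?, Unmarried, White, Female, "
        "0, 0, 38, United-States, <=50K")
row = parse_record(line)
print(row["age"], row["workclass"], row["income"])
```

Representing missing entries as `None` keeps them distinct from the legitimate category values, so later steps can decide whether to drop or impute them.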
Dataset Files
File | Size |
---|---|
adult.data | 4MB |
adult.test | 2MB |
adult.names | 5.2KB |
old.adult.names | 4.3KB |
Index | 140B |
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    census_income = fetch_ucirepo(id=20)

    # data (as pandas dataframes)
    X = census_income.data.features
    y = census_income.data.targets

    # metadata
    print(census_income.metadata)

    # variable information
    print(census_income.variables)
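For classification, the income target is commonly mapped to a binary label. The sketch below assumes the target strings are `>50K` and `<=50K`, possibly with surrounding whitespace or a trailing period (an assumption about how labels may appear across files). `to_binary_label` is a hypothetical helper and is not part of `ucimlrepo`.

```python
def to_binary_label(raw):
    """Map an income label to 1 (>50K) or 0 (<=50K)."""
    # Tolerate surrounding whitespace and a trailing period (assumed variants).
    label = raw.strip().rstrip(".")
    if label == ">50K":
        return 1
    if label == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {raw!r}")


# Made-up label values, for illustration only.
labels = [">50K", "<=50K", " <=50K.", ">50K."]
print([to_binary_label(v) for v in labels])
```

Raising on an unrecognised string, rather than defaulting to 0, makes unexpected label variants visible early.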
Kohavi, R. (1996). Census Income [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5GP7S.
Creators
Ron Kohavi
DOI
10.24432/C5GP7S
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.