Census Income

Donated on 4/30/1996

Predict whether income exceeds $50K/yr based on census data. Also known as the Adult dataset.

Dataset Characteristics

Multivariate

Subject Area

Social Science

Associated Tasks

Classification

Feature Type

Categorical, Integer

# Instances

48842

# Features

14

Dataset Information

Additional Information

Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)). The prediction task is to determine whether a person makes over 50K a year.
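The extraction conditions above can be read as a simple record filter. The following is a minimal sketch, assuming each raw census record is a Python dict keyed by the field names used in the condition (AAGE, AGI, AFNLWGT, HRSWK); the sample records are hypothetical and only illustrate the logic, not the original extraction code.

```python
def passes_extraction_filter(record: dict) -> bool:
    """Return True when a record satisfies all four stated conditions."""
    return (
        record["AAGE"] > 16        # age above 16
        and record["AGI"] > 100    # AGI above 100
        and record["AFNLWGT"] > 1  # final weight above 1
        and record["HRSWK"] > 0    # at least some hours worked per week
    )


# Hypothetical records, for illustration only.
sample_records = [
    {"AAGE": 39, "AGI": 25000, "AFNLWGT": 77516, "HRSWK": 40},  # kept
    {"AAGE": 16, "AGI": 25000, "AFNLWGT": 50000, "HRSWK": 20},  # fails AAGE > 16
    {"AAGE": 45, "AGI": 30000, "AFNLWGT": 60000, "HRSWK": 0},   # fails HRSWK > 0
]

kept = [r for r in sample_records if passes_extraction_filter(r)]
```

All four comparisons are strict, so a record sitting exactly on a threshold (for example AAGE equal to 16) is excluded.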

Has Missing Values?

Yes

Variables Table

Variable Name | Role | Type | Demographic | Description | Units | Missing Values
age | Feature | Integer | Age | N/A | | no
workclass | Feature | Categorical | Income | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked. | | yes
fnlwgt | Feature | Integer | | | | no
education | Feature | Categorical | Education Level | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool. | | no
education-num | Feature | Integer | Education Level | | | no
marital-status | Feature | Categorical | Other | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse. | | no
occupation | Feature | Categorical | Other | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces. | | yes
relationship | Feature | Categorical | Other | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried. | | no
race | Feature | Categorical | Race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black. | | no
sex | Feature | Binary | Sex | Female, Male. | | no
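The table marks workclass and occupation as containing missing values. A minimal sketch of counting them follows, assuming rows are dicts keyed by variable name and that a missing entry is written as the string "?" (an assumption; this page does not state the marker). The helper name and sample rows are hypothetical.

```python
MISSING_MARKER = "?"  # assumption: not stated on this page


def count_missing(rows, columns):
    """Count, per column, how many rows hold the missing-value marker."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            if str(row.get(column, "")).strip() == MISSING_MARKER:
                counts[column] += 1
    return counts


# Hypothetical rows, for illustration only.
rows = [
    {"age": 39, "workclass": "State-gov", "occupation": "Adm-clerical"},
    {"age": 54, "workclass": "?", "occupation": "?"},
    {"age": 28, "workclass": "Private", "occupation": "?"},
]

missing = count_missing(rows, ["workclass", "occupation"])
```

Whether to drop such rows or treat the marker as its own category depends on the model being trained; the sketch only reports the counts.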

Additional Variable Information

Class labels: >50K, <=50K.

Listing of attributes:

age: continuous.
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
fnlwgt: continuous.
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
education-num: continuous.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
sex: Female, Male.
capital-gain: continuous.
capital-loss: continuous.
hours-per-week: continuous.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
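The attribute listing maps naturally onto a typed record. The sketch below assumes the data rows are comma-separated, carry no header line, follow the attribute order given above, and end with the class label; the column name "income" for that label and the sample rows are hypothetical choices made for illustration.

```python
import csv
import io

# Attribute names in the order listed above; "income" is an assumed name
# for the trailing class label (>50K or <=50K).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Attributes described as "continuous" in the listing.
CONTINUOUS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}


def parse_rows(text):
    """Parse comma-separated rows into dicts, converting continuous fields to int."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, fields))
        for name in CONTINUOUS:
            record[name] = int(record[name])
        records.append(record)
    return records


# Hypothetical rows, for illustration only.
SAMPLE = """\
41, Private, 150000, Bachelors, 13, Never-married, Sales, Not-in-family, White, Female, 0, 0, 40, United-States, <=50K

52, Self-emp-inc, 120000, Masters, 14, Married-civ-spouse, Exec-managerial, Husband, White, Male, 5000, 0, 50, United-States, >50K
"""

records = parse_rows(SAMPLE)
```

Categorical values are kept as the strings shown in the listing, including their original spellings, so they can be matched against the data without translation.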

Dataset Files

File | Size
adult.data | 3.8 MB
adult.test | 1.9 MB
adult.names | 5.1 KB
old.adult.names | 4.2 KB
Index | 140 Bytes

Papers Citing this Dataset

The What-If Tool: Interactive Probing of Machine Learning Models

By James Wexler, Mahima Pushkarna, Tolga Bolukbasi, Martin Wattenberg, Fernanda Viegas, Jimbo Wilson. 2019

Published in ArXiv.

Paired-Consistency: An Example-Based Model-Agnostic Approach to Fairness Regularization in Machine Learning

By Yair Horesh, Noa Haas, Elhanan Mishraky, Yehezkel Resheff, Shir Lador. 2019

Published in ArXiv.

Distributed generation of privacy preserving data with user customization

By Xiao Chen, Thomas Navidi, Stefano Ermon, Ram Rajagopal. 2019

Published in ArXiv.

Information-Theoretic Privacy through Chaos Synchronization and Optimal Additive Noise

By Carlos Murguia, Iman Shames, Farhad Farokhi, Dragan Nesic. 2019

Published in ArXiv.

Automated Data Slicing for Model Validation: A Big data - AI Integration Approach

By Yeounoh Chung, Tim Kraska, Neoklis Polyzotis, Ki Tae, Steven Whang. 2018

Published in IEEE Transactions on Knowledge and Data Engineering.

Creators

Ron Kohavi
