IPUMS Census Database
Donated on 11/8/1999
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990.
Dataset Characteristics
Multivariate
Subject Area
Social Science
Associated Tasks
-
Feature Type
Categorical, Integer
# Instances
256932
# Features
-
Dataset Information
Additional Information
The original source for this data set is the IPUMS project (Ruggles & Sobek, 1997). The IPUMS project is a large collection of federal census data with standardized coding schemes that make comparisons across time easy. The data is an unweighted 1 in 100 sample of responses from the Los Angeles -- Long Beach area for the years 1970, 1980, and 1990. The household and individual records were flattened into a single table, and we used all variables that were available for all three years. When there was more than one version of a variable, such as for race, we used the most general. For occupation and industry we used the 1950 basis.
Note that PUMS data is based on cluster samples, i.e. samples are made of households or dwellings, each of which may contribute multiple individuals. Individuals from the same household are therefore not independent. Ruggles (1995) considers this issue further and discusses its effect (along with the effects of stratification) on standard errors.
The variable schltype appears to have different coding values across the years 1970, 1980, and 1990.
There are two versions of this data set:
1. The Small Data Set: contains a 1 in 1000 sample of the Los Angeles and Long Beach area. It was formed by sampling from the large data set.
2. The Large Data Set: contains a 1 in 100 sample of the Los Angeles and Long Beach area.
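Because individuals from the same household are not independent, a random row-level train/test split can place members of one household on both sides. The following is a minimal sketch of a household-level split using only the Python standard library. The column names `hh_id` and `age` and the function `split_by_household` are hypothetical and are not taken from the data set's documentation; check ipums.la.names for the variable that actually identifies a household.

```python
import random
from collections import defaultdict


def split_by_household(records, household_key, test_fraction=0.2, seed=0):
    """Assign whole households (not individual rows) to train or test."""
    # Group person records by their household identifier.
    households = defaultdict(list)
    for rec in records:
        households[rec[household_key]].append(rec)

    # Shuffle the household ids reproducibly, then cut at the household level.
    ids = sorted(households)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))

    test = [r for hid in ids[:n_test] for r in households[hid]]
    train = [r for hid in ids[n_test:] for r in households[hid]]
    return train, test


# Toy records; "hh_id" and "age" are hypothetical column names.
people = [
    {"hh_id": 1, "age": 34}, {"hh_id": 1, "age": 36}, {"hh_id": 1, "age": 5},
    {"hh_id": 2, "age": 61}, {"hh_id": 2, "age": 59},
    {"hh_id": 3, "age": 27},
    {"hh_id": 4, "age": 45}, {"hh_id": 4, "age": 12},
    {"hh_id": 5, "age": 70},
]
train, test = split_by_household(people, "hh_id", test_fraction=0.4, seed=1)
```

Splitting on the household key keeps every member of a household in the same partition, so held-out estimates are not inflated by near-duplicate rows. It does not address the effect of clustering on standard errors, which Ruggles (1995) discusses.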
Has Missing Values?
No
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
 | | | | | no |
Showing 10 of 61 variables.
Additional Variable Information
Please see ipums.la.names
Dataset Files
File | Size |
---|---|
ipums.la.97.gz.old | 1.7 MB |
ipums.la.names | 26.1 KB |
ipums.data.html | 6.5 KB |
codebook | 4 KB |
ipums.html | 1.9 KB |
Showing 5 of 11 files.
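The main data file in the listing is gzip-compressed. As a rough sketch of how such a file could be streamed without unpacking it to disk first, the example below uses Python's standard `gzip` and `csv` modules. It assumes the decompressed content is comma-delimited text, which the file listing does not confirm; the helper `read_gzipped_rows` and the demo file are hypothetical, and the demo writes its own tiny stand-in file rather than touching the real data.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_rows(path, delimiter=","):
    """Yield rows from a gzip-compressed delimited text file."""
    # Mode "rt" decompresses and decodes to text on the fly.
    with gzip.open(path, mode="rt", newline="") as handle:
        yield from csv.reader(handle, delimiter=delimiter)


# Self-contained demo: write a tiny stand-in file, then read it back.
# The values below are made up and are not rows from the data set.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.gz")
with gzip.open(demo_path, mode="wt", newline="") as handle:
    csv.writer(handle).writerows([["1970", "34", "2"], ["1980", "51", "1"]])

rows = list(read_gzipped_rows(demo_path))
```

If the real file turns out to use a different delimiter, pass it through the `delimiter` argument; ipums.la.names is the place to confirm the column order.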
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo
# fetch dataset
ipums_census_database = fetch_ucirepo(id=127)
# data (as pandas dataframes)
X = ipums_census_database.data.features
y = ipums_census_database.data.targets
# metadata
print(ipums_census_database.metadata)
# variable information
print(ipums_census_database.variables)
Ruggles, S. & Sobek, M. (1997). IPUMS Census Database [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5BG63.
Creators
Steven Ruggles
Matthew Sobek
DOI
10.24432/C5BG63
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.