IPUMS Database

Data Type



This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be consistent across years.


Original Owner

Historical Census Projects
University of Minnesota
614 Social Sciences
267 19th Avenue South
Minneapolis, MN 55455


Donor

Stephen Bay
Department of Information and Computer Science,
University of California, Irvine
Irvine, CA 92697
Date Donated: November 9, 1999

Data Characteristics

The original source for this data set is the IPUMS project (Ruggles and Sobek, 1997). The IPUMS project is a large collection of federal census data with coding schemes standardized to make comparisons across time easy.

The data is an unweighted 1 in 100 sample of responses from the Los Angeles -- Long Beach area for the years 1970, 1980, and 1990. The household and individual records were flattened into a single table, and we kept only the variables that were available for all three years. Where a variable had more than one version, as race does, we used the most general one. For occupation and industry we used the 1950 basis.
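As a rough illustration of what flattening household and individual records into one table means, the sketch below copies each household's fields onto every person in it. The field names (hh_id, ownership, age, sex) and the records are made up for the example and are not the actual IPUMS layout or the procedure used to build this data set.

```python
# Hypothetical records: field names and values are illustrative only,
# not the real IPUMS variables.
households = {
    1: {"year": 1990, "ownership": "owned"},
    2: {"year": 1990, "ownership": "rented"},
}
persons = [
    {"hh_id": 1, "age": 34, "sex": "F"},
    {"hh_id": 1, "age": 36, "sex": "M"},
    {"hh_id": 2, "age": 71, "sex": "F"},
]


def flatten(households, persons):
    """Return one row per person with the household's fields copied in."""
    rows = []
    for person in persons:
        row = dict(households[person["hh_id"]])  # household-level fields
        row.update(person)                       # person-level fields
        rows.append(row)
    return rows


flat = flatten(households, persons)
```

Each row of the result describes one individual, so household-level values are repeated for every member of the same household.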

Note that PUMS data is based on cluster samples: the sampling unit is a household or dwelling, which may contribute several individuals. Individuals from the same household are therefore not independent observations. Ruggles (1995) considers this issue further and discusses its effect (along with the effects of stratification) on standard errors.
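To make the effect of clustering concrete, the sketch below compares a naive standard error of a mean, which treats every individual as independent, with a cluster-aware one that treats households as the sampling units. It uses a common Taylor-linearization approximation, which is an assumption of this example and not necessarily the method of Ruggles (1995). Stratification is ignored, and the values and household ids are made up.

```python
import math
from collections import defaultdict


def naive_se(values):
    """Standard error of the mean, treating every observation as independent."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var / n)


def cluster_se(values, cluster_ids):
    """Standard error of the mean with clusters as the sampling units.

    Sums residuals (value - overall mean) within each cluster, then uses
    the spread of those cluster totals (Taylor linearization).
    """
    n = len(values)
    mean = sum(values) / n
    resid_totals = defaultdict(float)
    for v, c in zip(values, cluster_ids):
        resid_totals[c] += v - mean
    m = len(resid_totals)  # number of clusters
    var = (m / (m - 1)) * sum(e ** 2 for e in resid_totals.values()) / n ** 2
    return math.sqrt(var)


# Made-up data: members of the same household have similar values.
values = [10, 11, 10, 30, 31, 29, 50, 52]
household = [1, 1, 1, 2, 2, 2, 3, 3]

se_naive = naive_se(values)
se_cluster = cluster_se(values, household)
```

On this made-up data the cluster-aware estimate is larger than the naive one, which is the usual direction when members of a household resemble each other. When every individual is its own cluster, the two functions agree.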

Other Relevant Information

The variable schltype appears to have different coding values across the years 1970, 1980, and 1990.
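One way to inspect such a discrepancy is to list the codes a variable takes in each year and see which codes are not shared by all years. The sketch below assumes each record is a mapping with a year field named "year" (a hypothetical name). The schltype values shown are invented for the example and are not the real codes.

```python
from collections import defaultdict


def codes_by_year(rows, variable, year_field="year"):
    """Collect the set of distinct codes a variable takes in each year."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[year_field]].add(row[variable])
    return dict(seen)


# Invented records; real schltype codes will differ.
rows = [
    {"year": 1970, "schltype": 1},
    {"year": 1970, "schltype": 2},
    {"year": 1980, "schltype": 1},
    {"year": 1980, "schltype": 3},
    {"year": 1990, "schltype": 1},
    {"year": 1990, "schltype": 4},
]

by_year = codes_by_year(rows, "schltype")
shared = set.intersection(*by_year.values())
year_specific = {year: codes - shared for year, codes in by_year.items()}
```

A non-empty entry in year_specific flags codes that appear in some years but not in all of them, which is worth checking against the IPUMS documentation before comparing the variable across years.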

Data Format

There are two versions of this data set.

The Small Data Set

The small data set contains a 1 in 1000 sample of the Los Angeles and Long Beach area. It was formed by sampling from the large data set.

The Large Data Set

The large data set contains a 1 in 100 sample of the Los Angeles and Long Beach area.
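Going from a 1 in 100 sample to a 1 in 1000 sample means keeping roughly one record in ten. The sketch below assumes simple random sampling of rows without replacement. The source does not say how the small data set was actually drawn (for example, by individual or by household), so treat this as an illustration only. The function name and the seed are hypothetical.

```python
import random


def subsample(rows, fraction=0.1, seed=0):
    """Draw a simple random sample of rows without replacement."""
    rng = random.Random(seed)          # fixed seed for a repeatable draw
    k = round(len(rows) * fraction)    # number of rows to keep
    return rng.sample(rows, k)


# Stand-in for the large data set: 1000 placeholder records.
large = list(range(1000))
small = subsample(large)
```

Sampling individual rows breaks up households. To keep households intact, one would sample household identifiers instead and keep all of their members.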

Past Usage

S. D. Bay and M. J. Pazzani. (1999) "Detecting Group Differences: Mining Contrast Sets". submitted.

Acknowledgements, Copyright Information, and Availability

Reproduced here is the original IPUMS citation and use documentation.

All persons are granted a limited license to use and distribute this documentation and the accompanying data, subject to the following conditions: 

In addition, we request that users send us a copy of any publications, research reports, or educational material making use of the data or documentation. Printed matter should be sent to: 

Historical Census Projects 
University of Minnesota 
614 Social Sciences 
267 19th Avenue South 
Minneapolis, MN 55455
Send all electronic material to ipums@hist.umn.edu

References and Further Information

The IPUMS home page contains additional documentation and data.

The United States Census Bureau Web Site.

S. Ruggles. (1995). "Sample Designs and Sampling Errors". Historical Methods. Volume 28. Number 1. Pages 40 - 46.

The UCI KDD Archive
Information and Computer Science
University of California, Irvine
Irvine, CA 92697-3425
Last modified: November 9, 1999.