Abstract

Virtually all quantitative microdata used by social scientists derive from samples that incorporate clustering, stratification, and weighting adjustments (Kish 1965, 1992). Such data can yield standard error estimates that differ dramatically from those derived from a simple random sample of the same size. Researchers using historical U.S. census microdata, however, usually apply methods designed for simple random samples. The resulting p values and confidence intervals could be inaccurate and could lead to erroneous research conclusions. Because U.S. census microdata samples are among the most widely used sources for social science and policy research, the need for reliable standard error estimation is critical. We evaluate the historical microdata samples of the Integrated Public Use Microdata Series (IPUMS) project from 1850 to 1950 in order to determine (1) the impact of sample design on standard error estimates, and (2) how to apply modern standard error estimation software to historical census samples. We exploit a unique new data source from the 1880 census to validate our methods for standard error estimation, and then we apply this approach to the 1850–1870 and 1900–1950 decennial censuses. We conclude that Taylor series estimation can be used effectively with the historical decennial census microdata samples and should be applied in research analyses that have the potential for substantial clustering effects.

References

Davern, M., Jones, A., Lepkowski, J., Davidson, G., & Blewett, L.A. (2007). Estimating Standard Errors for Regression Coefficients Using the Current Population Survey’s Public Use File. Inquiry, 44, 211–24.
Dippo, C.S., & Wolter, K.M. (1984). A Comparison of Variance Estimators Using the Taylor Series Approximation. ASA Proceedings of the Section on Survey Research Methods (pp. 112–21). Arlington, VA: American Statistical Association.
Goeken, R., Nguyen, C., Ruggles, S., & Sargent, W.L. (2003). The 1880 United States Population Database. Historical Methods, 36(4), 27–34.
Graubard, B.I., & Korn, E.L. (1996). Survey Inference for Subpopulations. American Journal of Epidemiology, 144, 102–106.
Hammer, H., Shin, H.-C., & Porcellini, L.E. (2003). A Comparison of Taylor Series and JK1 Resampling Methods for Variance Estimation. Proceedings of the Hawaii International Conference on Statistics (pp. 1–9). Honolulu, HI.
Hansen, M.H., Hurwitz, W., & Madow, W. (1953). Sample Survey Methods and Theory. New York: Wiley and Sons.
Kalton, G. (2002). Model in the Practice of Survey Sampling (Revisited). Journal of Official Statistics, 18, 129–54.
Kish, L. (1965). Survey Sampling. New York: Wiley and Sons.
Kish, L. (1992). Weighting for Unequal Pi. Journal of Official Statistics, 8, 183–200.
Kish, L., & Frankel, M.R. (1974). Inference From Complex Samples. Journal of the Royal Statistical Society Series B, 36, 1–37.
Korn, E.L., & Graubard, B.I. (1995). Examples of Differing Weighted and Unweighted Estimates From a Sample Survey. American Statistician, 49, 291–95. doi:10.2307/2684203
Korn, E.L. (1999). Analysis of Health Surveys. New York: Wiley.
Krewski, D., & Rao, J.N.K. (1981). Inference From Stratified Samples: Properties of Linearization, Jackknife and Balanced Repeated Replication Methods. Annals of Statistics, 9, 1010–19. doi:10.1214/aos/1176345580
Little, R.J.A. (2003). To Model or Not To Model? Competing Modes of Inference for Finite Population Sampling (Working Paper 4). The University of Michigan Department of Biostatistics Working Paper Series. Available online at http://www.bepress.com/umichbiostat/paper4.
Magnuson, D.L. (1995). The Making of a Modern Census: The United States Census of Population, 1790–1940 (Ph.D. dissertation). Department of History, University of Minnesota.
Magnuson, D.L., & King, M.L. (1995). Enumeration Procedures. Historical Methods, 28, 27–32.
Rosenfeld, M.J. (2006). Young Adulthood as a Factor in Social Change in the United States. Population and Development Review, 43, 617–29.
Ruggles, S. (2007). The Decline of Intergenerational Coresidence in the United States. American Sociological Review, 72, 962–89. doi:10.1177/000312240707200606
Ruggles, S., & Brower, S. (2003). The Measurement of Family and Household Composition in the United States, 1850–1999. Population and Development Review, 29, 73–101. doi:10.1111/j.1728-4457.2003.00073.x
Ruggles, S., & Menard, R.R. (1995). Public Use Microdata Sample of the 1880 United States Census of Population: User’s Guide and Technical Documentation. Minneapolis: Social History Research Laboratory.
Ruggles, S., Sobek, M., Alexander, T., Fitch, C.A., Goeken, R., Hall, P.K., King, M., & Ronnander, C. (2004). Integrated Public Use Microdata Series: Version 3.0. Minneapolis, MN: Minnesota Population Center.
Rust, K. (1985). Variance Estimation for Complex Estimators in Sample Surveys. Journal of Official Statistics, 1, 381–97.
Documentation for SAS Version 8. (1999). Cary, NC: SAS Institute, Inc.
Schwartz, C.R., & Mare, R.D. (2005). Trends in Educational Assortative Mating, 1940–2003. Demography, 42, 621–46. doi:10.1353/dem.2005.0036
Short, S.E., Goldscheider, F.K., & Torr, B.M. (2006). Less Help for Mother: The Decline in Coresidential Female Support for the Mothers of Young Children, 1880–2000. Demography, 43, 617–29. doi:10.1353/dem.2006.0038
Correctly and Easily Compute Statistics for Complex Sampling. (2003). Chicago: SPSS Inc.
Reference Manual. (2001). College Station, TX: STATA Press.
Verma, V. (1993). Sampling Errors in Household Surveys. New York: U.N. Statistics Division, United Nations.
Weng, S.S., Zhang, F., & Cohen, M.P. (1995). Variance Estimates Comparison by Statistical Software. ASA Proceedings of the Section on Survey Research Methods (pp. 333–38). Arlington, VA: American Statistical Association.
Wolter, K.M. (1985). Introduction to Variance Estimation. New York: Springer-Verlag.