The quality of the decennial census of the United States is compromised by population undercount, which often misses immigrants and racial/ethnic minorities, thereby diminishing federal resources allocated to such groups. Using a modified version of demographic analysis and informed by the latest contributions of emigration scholarship, this research estimates net undercount for the 1990 census relative to the 2000 census by age, sex, year-of-entry, and place-of-birth cohorts. Ordinary least squares estimates suggest that males, recent arrivals, and cohorts aged 15–44 had higher relative net undercount for 1990 compared with 2000. Much higher relative net undercount was found for cohorts from Africa, Latin America, and the Caribbean (excluding Cuba and Puerto Rico) who were ineligible for amnesty under the Immigration Reform and Control Act of 1986 (i.e., those fitting the profile of an undocumented immigrant). Larger implications of these findings suggest that the political climate in which a person is embedded—particularly for persons who may feel threatened or marginalized by the government and/or the public—affects that person’s willingness to respond to the census.
Although often viewed with suspicion (including by former president George W. Bush; Hillygus et al. 2006), the decennial U.S. census is constitutionally mandated and its results have significant consequences. Census results are used, among other purposes, to reapportion the House of Representatives, adjust the Electoral College (and, hence, the voting power for presidential races), draw a variety of district lines, and address political and social disparities for minorities. The census is not perfect, however, and certain populations, such as non-Hispanic whites, have less net undercount, and have even been overcounted (Mule 2012; Mulry 2007), in comparison with minorities such as blacks or Hispanics. The differential undercount of specific populations curtails the political power of undercounted groups and potentially cuts the federal resources that would otherwise benefit them. With so much at stake, differential undercount between populations has been a focus of heated debate since the 1990 census (Anderson and Fienberg 1999; Freedman 1991; Hillygus et al. 2006; Skerry 2000; Walashek and Swanson 2006).
This article begins by summarizing the literature on undercount that informs the hypotheses of this study (listed in Table 1). Previous studies of undercount largely fall into two schools: studies on the magnitude (and select distributions) of undercount, and studies on the causes of undercount. Scholarship on the magnitude of undercount has found higher net undercounts for Hispanics, blacks, males, and younger persons (Mulry 2007; Robinson et al. 1993, 2002). Scholarship on the underlying causes of undercount has identified physical impediments to reaching people (de la Puente 1992; Mahler 1993), interviewer/instrument effects (Couper et al. 1998; Groves and Couper 1998; Iversen et al. 1999), political motivations (Hillygus et al. 2006; Swanson and Walashek 2011), and privacy/confidentiality concerns (Couper et al. 1998; Hillygus et al. 2006; Singer et al. 1993) as causes of undercount.
Combining some of the strengths of both schools of undercount scholarship, this research produces estimates of relative net undercount for the 1990 census vis-à-vis the 2000 census for foreign-born cohorts while offering some insight regarding the dynamics that lie behind undercount. The study of the 1990 and 2000 censuses is only partially based on the data that are available at the time of the writing of this research: namely, in the availability of the public-use microdata for 1990 and 2000 (and the lack of microdata for the 2010 census) and the recent scholarship on emigration estimates. More importantly, the political climate in which the 1990 census was situated makes the 1990 census unique given that this census and the attendant population counts were undoubtedly the most affected by the recent passage of the Immigration Reform and Control Act of 1986 (IRCA). As displayed in Table 1 (which summarizes the hypotheses and results of the study), males, cohorts aged 15–44, and recent entrants are found to have higher relative net undercount for the 1990 census vis-à-vis the 2000 census. The strongest predictors of relative net undercount for 1990 vis-à-vis 2000, however, are the characteristics that are associated with the stereotypical profile of the undocumented immigrant: namely, Latin American, Caribbean (sans Cuba), and African cohorts who are ineligible for amnesty under IRCA. This finding is consistent with the ideas that socioeconomic minorities are disproportionately undercounted and that persons suspicious of the government and/or the public are less likely to respond accurately to the census.
For decades, scholars have attempted to estimate the error of the decennial census through independent estimates of the size of the population. The primary measure of census error is net undercount (hereafter referred to as “undercount”), which is the net of omissions, duplications, and other erroneous inclusions/exclusions of the census. Undercount is primarily estimated through two methods. The first method uses post-census surveys and a capture-recapture methodology (e.g., the Post-Enumeration Survey (PES)). These surveys estimated high undercoverage for blacks, Hispanics, American Indians, and younger cohorts (Hogan 1993; Mulry 2007; Mulry and Spencer 1993), although these studies are sensitive to correlation, matching/processing, and sampling error (Breiman 1994; Freedman 1991; Mulry and Spencer 1993). Other survey-based studies have found that mail response of census forms varies by age, sex, education, and race/ethnicity (Couper et al. 1998; Singer et al. 1993).
A second popular method to measure undercount is demographic analysis (DA), which estimates the size of the population by using administrative data to the greatest extent possible. The purest form of DA estimates a population’s size by summing estimates of births and immigrants, while subtracting estimates of deaths and emigrants. Like the post-census survey studies, DA for the 1990 census finds higher undercounts for blacks and males, although high undercount by age groups is most pronounced for males aged 18–49 (vs. younger cohorts in general) (Robinson 2001; Robinson et al. 1993). Across nearly all demographic characteristics, both post-census surveys and DA estimate higher net undercount for the 1990 census than for the 2000 census (Mulry 2007; Robinson 2001). Drawing from these studies and assuming that the results apply to the foreign-born population specifically, one can hypothesize that persons aged 15–44 (Hypothesis 1) and males (Hypothesis 2) are more likely to have higher relative undercount for 1990 than for 2000.
The quality of a DA study is naturally limited by the quality of the data components available. Although birth and death data have generally been complete since 1960 (McDevitt et al. 2001; U.S. Census Bureau 1988), the quality of migration data is notoriously questionable. Emigration data in particular have been lacking because traditional “residual” methods produce unusable emigration estimates (Ahmed and Robinson 1994; Mulder et al. 2002). Because of the difficulties in estimating emigration rates, the Census Bureau DA evaluation of the 2000 census drew from emigration rates calculated for the 1980s (which themselves contained many unusable estimates) to estimate emigration for the 1990s (Mulder et al. 2001; Robinson 2001). One problem with the use of 10-year-old emigration rates is that the patterns of migration had shifted in the 1990s: the passage of migration legislation around 1990 (described later herein) shifted Mexican migration patterns toward permanent settlement in place of circular migration (Massey et al. 2002).
Qualitative studies are rich in information that enhances one’s understanding of undercount, highlighting the roles that social forces play in census coverage. The political context and the motivations of interested parties to conduct an accurate census are fundamental contributors to census coverage. In light of the differential undercount of urban minorities in the 1990 census and the federal resources at stake, various public groups as well as private interest groups pushed for the adjustment of the 1990 census. The campaigns for adjustment were ultimately turned down by the Supreme Court, leading political forces to allocate unprecedented levels of funding to public outreach and civic mobilization campaigns to raise awareness for the 2000 census (Hillygus et al. 2006). Aided by the political and economic resources behind the 2000 census, estimates of net undercounts decreased from 4.99 % in 1990 to 0.71 % in 2000 for Hispanics, and from 4.57 % to 1.84 % for blacks. In contrast, non-Hispanic whites had net undercounts of 0.68 % in 1990 and net overcounts of 1.13 % in 2000 (Mulry 2007).
A number of issues have affected census response in 1990 for minorities and less-affluent individuals. The highly technical and “encyclopedic” terminology used for the census has been described as “confusing” and has led to omissions and misreporting for less-educated or less-literate persons (Garcia 1992; Iversen et al. 1999; Rodriguez and Hagan 1991). Perpetually mobile minority populations were disproportionately missed because of their lack of a “usual residence” on which enumeration is based (Mahler 1993; Nguyen 1996; Rodriguez and Hagan 1991). Persons in unconventional housing, such as boarders and residents of (often illegal and clandestine) multi-unit households, have been difficult to locate and hence have been missed by the census (Mahler 1993; Velasco 1992). Other foreign-born persons were simply uninformed of the census and believed that it did not apply to them (Rodriguez and Hagan 1991). These results suggest that those who are less “rooted” in the United States are more likely to be undercounted on the census. One can thus hypothesize that among the foreign-born, more recent U.S. entrants are more likely to be undercounted in the 1990 census relative to the 2000 census (Hypothesis 3).
Aside from the “logistical” difficulties of census enumeration (as described earlier), social resistance against the census is another difficulty that the Census Bureau has had to manage in conducting its decennial census. Public sentiment that the census is overly intrusive, for example, affects the public’s willingness to respond fully and truthfully to the census because of concerns of privacy (Fay et al. 1991; Hillygus et al. 2006; Singer et al. 1993). Others are concerned about breaches of census confidentiality, expressing fear that the information provided in the census can lead to the monitoring of public assistance and housing violations (Couper et al. 1998; Garcia 1992; Mahler 1993; Rodriguez and Hagan 1991; Sullivan 1990). Additionally, some undocumented immigrants, largely Hispanics, have provided inaccurate data for fear of government reprisal regarding legal status issues (Montoya 1991; Romero 1992; Velasco 1992). Negative feelings about the government in general also affect the quality of census data because some persons who are cynical about the government’s actions and the census process itself have (at times deliberately) provided false information (Iversen et al. 1999).
The Current Study
Because census counts are affected by opinions on privacy, confidentiality, and the government in general, it can be inferred that the political climate in which one is embedded affects one’s responsiveness to the census, particularly if one feels threatened by the government or the public. For the 1990 census, feelings of threat were especially acute for undocumented persons.
In 1986, the U.S. government passed IRCA, which profoundly affected immigrants’ lives in two contrasting ways: on the one hand, paths to legal residence and citizenship were offered to many undocumented immigrants arriving before 1982; on the other hand, negative sanctions were applied to persons who knowingly hired undocumented immigrants. Those who became eligible for legal residence (amnesty) included seasonal agricultural workers and undocumented immigrants who could verify continuous presence (aside from “brief, casual, and innocent” absences) in the United States from January 1, 1982. Later undocumented arrivals became unable to legally work in the United States and became ineligible for many forms of federal assistance. Furthermore, extra funding was channeled into regulatory institutions that were charged with enforcing these new workplace restrictions, and funding for border control was boosted by more than $400 million for both 1987 and 1988. Within the next few years, other pieces of legislation affecting immigrants were passed, including the Immigration Act of 1990 (which tightened border security and increased sanctions on existing restrictions) and Proposition 187 in California (which prevented undocumented immigrants from receiving many forms of state-funded assistance). If these pieces of legislation can serve as an indicator of the political climate under which undocumented immigrants were living, one could surmise that immigrants (particularly those arriving after 1981) were living in tense political times.
Because of IRCA’s stringent policies toward undocumented migrants entering after 1981, it is hypothesized that pre-IRCA entrants who were ineligible for amnesty (entrants between 1982 and 1986) and post-IRCA entrants (1987–1990) have higher net undercount in 1990 (relative to 2000) than those arriving prior to 1982 (Hypotheses 4 and 5, respectively).
Additionally, it is hypothesized that persons from regions whose inhabitants fit the stereotypical image of the undocumented immigrant are more likely to be undercounted in 1990 (relative to 2000). Based on Passel et al. (2004) and census counts, higher proportions of Mexicans, Central/South Americans, and Africans who entered prior to 1990 were undocumented in 2000 (see Online Resource 1), and persons from these same regions were presumably also more highly undocumented in 1990. Although IRCA directly affected undocumented persons’ ability to work, a more intangible outcome of “anti-illegal-immigrant legislation” may have been the creation of a hostile political and social environment against persons who fit the stereotypical profile of the “illegal immigrant.” That is, public rhetoric through political channels and the media may have stigmatized groups that are perceived to be undocumented and thus altered the behavior of persons who are less receptive to immigrants, leading some stereotyped “illegal immigrants” to feel threatened by the government or public. This dynamic may affect “other Caribbeans” (excluding Cubans) who share similarities with the highly undocumented cohorts, whether through African ancestry (e.g., Haitians) and/or a Spanish native tongue (e.g., Dominicans). Hypothesis 6, thus, is that cohorts from Mexico, Central/South America, Africa, and “other Caribbean Islands” (hereafter referred to as “undocumented regions”) are undercounted at higher rates in 1990 than in 2000.
The closest approximation of undocumented cohorts can be operationalized as those from undocumented regions that have entered at times that would make one ineligible for amnesty under IRCA. The central hypothesis of this study, therefore, is that persons from Mexico, Central/South America, Africa, and other Caribbean Islands who also entered in periods for which IRCA did not grant amnesty have higher net undercount in 1990 than in 2000 (Hypothesis 7).
Methodologically, this research produces a data set of relative net undercount estimates by cohorts defined by age, sex, year of entry, and place of origin. Using a modified demographic analysis, counts for each respective cohort in the 1990 census are carried forward into 2000, and counts for said cohorts for the 2000 census are carried backward to 1990 using death and emigration estimates. The differences between the expected and observed counts are used to construct a proxy for the relative net undercount of the 1990 census vis-à-vis the 2000 census (akin to the error of closure between the two censuses). Rather than using Census Bureau–based estimates of emigration, this research will draw on recent scholarship on emigration rates (Passel et al. 2006; Schwabish 2009; Van Hook et al. 2006) to estimate emigration inputs. Van Hook et al. (2006) and Passel et al. (2006) used matched Current Population Survey (CPS) data to solve for emigration rates when cases were observed at an initial time and not observed in a subsequent time by assuming that cases are not matched because of death, internal migration, emigration, and “other reasons.” Schwabish (2009) estimated emigration rates through the use of matched Social Security data where cases that lacked data for two consecutive years were assumed to have either died or emigrated. For the sake of brevity, the emigration estimates produced using the aforementioned sources will be referred to as the “Van Hook,” “Passel,” and “Schwabish” estimates.
Problematic for all three alternate emigration estimates is the assumption that (non)sampling errors are minimal. Van Hook and Passel estimates are sensitive to the assumption that the components comprising nonmatches (i.e., death rates, internal migration, and “other non-follow-up reasons”) are accurately measured. Schwabish estimates rest on the assumption that a two-year disappearance from tax filing is due to emigration or death, which overstates emigration for mothers and others who leave the work force. Despite the problems associated with these assumptions, these estimates can be used to produce a range of possible emigration estimates, and they offer improvements on the Census Bureau’s emigration component in a number of ways: the years for which the emigration rates are calculated more closely coincide with the years of interest; adjustments are made for circular migration; the new emigration rates do not omit cohorts from regions that produce unusable emigration rates (e.g., Mexico and El Salvador); and the new emigration rates are produced by greater demographic detail, including years of residence in the United States. These emigration rates allow for the estimation of emigration for specific cohorts: in this case, emigration is specified for each cohort defined by specific combinations of age, sex, year of entry, and region of origin. Of most value to this study, these estimates allow for the construction of a data set that can be used to test hypotheses that explain relative net undercount as a function of cohort characteristics, capturing the magnitude of relative undercount in a multivariate framework.
Data and Methods
Data on population counts, deaths, and emigration rates are used in both forecasting and backcasting methodologies to compare the relative counts across the 1990 and 2000 censuses. Total population counts are drawn from the Census Bureau’s American Factfinder tool, which summarizes population counts from the decennial census. Population counts by demographic detail are estimated using the Integrated Public Use Microdata Samples (IPUMS) (Ruggles et al. 2010), which feature weighted microdata for the decennial censuses. Deaths are estimated using the National Center for Health Statistics (NCHS) Multiple Cause of Death File for 1990, which provides complete person-level microdata for deaths. Emigration estimates draw from recent studies on emigration rates (Passel et al. 2006; Schwabish 2009; Van Hook et al. 2006).
Counts for the foreign-born population for both 1990 and 2000 (using IPUMS) are summed by sex, age, year-of-entry, and place-of-birth cohorts. Because year of entry is available only for select intervals in the 1990 IPUMS data, it is assumed that the numbers of foreign-born persons within each year-of-entry interval are distributed evenly into single years of entry. For census 2000, counts of entrants in 1990 are spline-smoothed downward and spread across the surrounding years to deal with overreporting on round numbers (in this case, entry in 1990). The resultant smoothed counts for 1990 entrants are then divided by 4 because it is assumed that one-fourth of those entering in 1990 were present for the 1990 census day (i.e., for April 1, 1990, or one-fourth of the way into 1990). Place of origin is derived from the “place of birth” item. Origins for survived cohorts from “unknown” origins in 1990 (4.2 % of the weighted sample) are imputed using a multinomial logistic regression model. Although reporting for year of entry is inconsistent for circular migrants (Redstone and Massey 2004), changes in year-of-entry reporting are not of concern to this study because such reporting inconsistencies will work against the central hypothesis and therefore set a more rigorous threshold for the central hypothesis.
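The two count adjustments described above (even allocation of an interval count to single years of entry, and the one-fourth share of 1990 entrants assumed present on census day) can be illustrated with a minimal sketch. The function names and the example counts are hypothetical and are not drawn from the study's data; the spline smoothing step is not reproduced.

```python
def spread_interval_evenly(count, first_year, last_year):
    """Split an interval count evenly across its single years of entry."""
    years = list(range(first_year, last_year + 1))
    per_year = count / len(years)
    return {year: per_year for year in years}


def present_on_census_day(smoothed_1990_entrants, share_of_year=0.25):
    """Keep the share of 1990 entrants assumed present by April 1, 1990.

    The default of 0.25 mirrors the text's assumption that census day falls
    one-fourth of the way into the year.
    """
    return smoothed_1990_entrants * share_of_year


# Hypothetical counts, for illustration only.
by_year = spread_interval_evenly(300.0, 1982, 1984)
on_census_day = present_on_census_day(400.0)
```

The even split is the simplest allocation consistent with interval-only reporting; any within-interval trend in arrivals is ignored under this assumption.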
Deaths are estimated using life table single-year survivorship ratios (Kintner 2004), which are based on population and death counts. Total population counts for 1990 by sex and select age groups are gathered from American Factfinder and distributed into single years of age by using the single-year distributions tabulated with IPUMS. Deaths by age and sex are drawn from the NCHS Multiple Cause of Death files for census year 1990 (i.e., April 1, 1990–March 31, 1991). Population and death counts are then used to calculate survivorship ratios (using the life table), and deaths are estimated by progressively applying the survivorship ratios over the span of the decade to those at risk of dying for a given year.
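As a rough illustration of how survivorship ratios might be derived and applied, the sketch below builds a simple single-year life table from population and death counts and then survives a cohort forward year by year. The construction (converting central death rates to probabilities under an assumption that deaths are spread evenly within each year of age) is a standard textbook shortcut and is an assumption here; it is not necessarily the procedure in Kintner (2004). All names and numbers are hypothetical.

```python
def survivorship_ratios(population, deaths):
    """Return S_x = L_(x+1) / L_x for single ages x = 0 .. n-2.

    population[x] and deaths[x] are counts at single year of age x.
    """
    m = [d / p for d, p in zip(deaths, population)]   # central death rates
    q = [mx / (1.0 + 0.5 * mx) for mx in m]           # assumed: deaths uniform within age
    l = [1.0]                                         # survivors to exact age x (radix 1)
    for qx in q:
        l.append(l[-1] * (1.0 - qx))
    big_l = [(l[x] + l[x + 1]) / 2.0 for x in range(len(q))]  # person-years lived
    return [big_l[x + 1] / big_l[x] for x in range(len(big_l) - 1)]


def survive_forward(count, start_age, ratios, years):
    """Apply one survivorship ratio per year to those still at risk."""
    for t in range(years):
        count *= ratios[start_age + t]
    return count


# Hypothetical inputs, for illustration only.
ratios = survivorship_ratios([1000.0, 1000.0, 1000.0], [10.0, 20.0, 30.0])
survivors = survive_forward(100.0, 0, [0.9, 0.8], years=2)
```

Deaths over a period are then the difference between the starting count and the survivors.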
Emigration rates are drawn from Van Hook et al. (2006), Passel et al. (2006), and Schwabish (2009), as described earlier and presented in Table 2. Emigration rates are provided by four characteristics: years of residence, age, sex, and place of origin. Emigration for a year is first estimated by applying the emigration rates by years of residence in the United States to cohorts at risk of emigration. The forecast and backcast emigration estimates for a single year for a year-of-entry cohort are estimated as follows:
where years_US represents a specific number of years that a person has resided in the United States, Count represents the number of persons at risk of emigration, and Rate represents an emigration rate.
For forecasts, baseline counts begin with the 1990 census. A one-year forecast begins with the subtraction of emigrants using Eqs. (1a), (2a), and (3). Deaths are then estimated for those remaining by using survivorship ratios. Backcasts begin with the counts of the 2000 census. A one-year backcast first estimates deaths by reverse-surviving the estimated population using survivorship ratios. Emigration is subsequently estimated using Eqs. (1b), (2b), and (3). Following the one-year forecasts and backcasts, the ages and years of residence in the United States for those remaining are adjusted, and the entire process repeats nine times in order to estimate the expected population for the subsequent census. (See Online Resource 1 for emigration estimates.)
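The order of operations in the forecast and backcast can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's equations: a single emigration rate is applied per year (the adjustments by age, sex, and origin are collapsed into that rate), and the backcast is assumed to invert the forecast exactly by dividing by the survivorship ratio and then by one minus the emigration rate. Function names and inputs are hypothetical.

```python
def forecast_one_year(count, emig_rate, surv_ratio):
    """Forward step: subtract emigrants first, then apply survivorship."""
    emigrants = count * emig_rate
    return (count - emigrants) * surv_ratio


def backcast_one_year(count, emig_rate, surv_ratio):
    """Backward step: reverse-survive first, then restore emigrants."""
    alive_before_deaths = count / surv_ratio
    return alive_before_deaths / (1.0 - emig_rate)  # assumed inverse of the forward step


def forecast_decade(count_1990, emig_rates, surv_ratios):
    """Carry a 1990 count forward through ten one-year steps."""
    count = count_1990
    for rate, ratio in zip(emig_rates, surv_ratios):
        count = forecast_one_year(count, rate, ratio)
    return count


def backcast_decade(count_2000, emig_rates, surv_ratios):
    """Carry a 2000 count backward, using the yearly inputs in reverse order."""
    count = count_2000
    for rate, ratio in zip(reversed(emig_rates), reversed(surv_ratios)):
        count = backcast_one_year(count, rate, ratio)
    return count


# Hypothetical yearly inputs: rates indexed by successive years of residence.
rates = [0.05, 0.04, 0.04, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01]
surv = [0.995] * 10
expected_2000 = forecast_decade(1000.0, rates, surv)
expected_1990 = backcast_decade(expected_2000, rates, surv)
```

Each list holds one entry per year, so advancing the index corresponds to the adjustment of ages and years of residence between steps.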
The left side of Eq. (4) is the relative net undercount (or the “error of closure”) of the 1990 census vis-à-vis the 2000 census (i.e., the difference in the nets of the undercounts and overcounts/duplications across the two censuses). This figure is multiplied by 100 and divided by the averages of the forecasted, backcasted, and observed estimates/counts to estimate the percentage relative net (PRN) undercount of the 1990 vis-à-vis 2000 census. PRN undercount will serve as the main dependent variable; positive values can be interpreted as “percentage lower counts for the 1990 census (or, conversely, percentage higher counts for the 2000 census) for a cohort,” and negative values can be interpreted as “percentage lower counts for the 2000 census (or percentage higher counts for the 1990 census) for a cohort.” If census 2000 were to have complete coverage, for example, PRN undercount would be interpreted as the net undercount in 1990.
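One possible reading of this verbal description is sketched below. The way the forecast and backcast shortfalls are combined, and the use of a single four-way average as the denominator, are assumptions made for illustration and may differ from Eq. (4); only the sign convention (positive values indicating lower 1990 counts) and the scaling by 100 are taken from the text. All names and numbers are hypothetical.

```python
def prn_undercount(obs_1990, obs_2000, forecast_2000, backcast_1990):
    """Percentage relative net undercount of 1990 vis-a-vis 2000 (assumed form)."""
    # Assumed error of closure: mean of the forecast shortfall in 2000 and
    # the backcast excess in 1990.
    closure = ((obs_2000 - forecast_2000) + (backcast_1990 - obs_1990)) / 2.0
    # Assumed base: average of the forecasted, backcasted, and observed counts.
    base = (obs_1990 + obs_2000 + forecast_2000 + backcast_1990) / 4.0
    return 100.0 * closure / base


# Hypothetical cohort: the 2000 count exceeds what the 1990 count implies.
example = prn_undercount(obs_1990=900.0, obs_2000=1000.0,
                         forecast_2000=850.0, backcast_1990=1050.0)
```

Under this reading, a cohort whose observed counts match both projections has a PRN undercount of zero.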
The final product of the aforementioned procedures is a data set that includes PRN undercount by cohorts of unique age (in 16 five-year categories), sex (2 categories), place of origin (9 categories), and year-of-entry values (10 categories), producing a total of 2,320 unique cohorts. Ordinary least squares (OLS) regression with robust standard errors (to adjust for the few outliers that presumably result from matching errors) is used to test effects that the independent variables have on percentage relative net undercount. OLS regression is a good fit because of its simplicity of interpretation and because the dependent variable and residuals are normally distributed. Individual cohorts are treated as cases, and sampling weights weight the data by the sizes of the cohorts while maintaining the sample size of 2,320. For the sake of brevity, the terms “relative undercount” or “PRN undercount” will refer to the “PRN undercount for the 1990 vis-à-vis 2000 census.”
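The weighting scheme can be illustrated with a minimal sketch in which cohort sizes are rescaled so that the weights sum to the number of cohorts, and a weighted least squares slope is then computed for a single predictor. This is an assumed, simplified stand-in for the full multivariate model: robust standard errors are not computed, and the function names and toy data are hypothetical.

```python
def normalized_weights(cohort_sizes):
    """Rescale cohort sizes so the weights sum to the number of cases."""
    n = len(cohort_sizes)
    total = sum(cohort_sizes)
    return [n * size / total for size in cohort_sizes]


def weighted_slope(x, y, w):
    """Weighted least squares slope of y on a single predictor x."""
    w_sum = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return s_xy / s_xx


# Toy data: x is a dummy (e.g., male = 1), y is PRN undercount in percent.
x = [0, 0, 1, 1]
y = [10.0, 20.0, 30.0, 50.0]
sizes = [100, 300, 200, 200]
w = normalized_weights(sizes)
slope = weighted_slope(x, y, w)
```

With a dummy predictor, the weighted slope equals the difference between the size-weighted group means, which is how the dummy coefficients in the models can be read.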
Table 3 displays three estimates (defined by the emigration component used) of the PRN undercount for the 1990 vis-à-vis 2000 census (or simply, the “PRN/relative undercount”) for foreign-born persons by the parameters under study. In general, the Van Hook estimates feature higher undercount, and the Schwabish estimates show lower relative undercount. The very high Van Hook relative undercount estimates suggest that the emigration rates for Van Hook are overstated. Schwabish estimates of relative undercount are generally low and feature some peculiarities, including negative relative undercounts for the youngest cohorts (–3.4 %) and higher relative undercount for males than females (10.9 % vs. 10.5 %). These differences may be a product of the Social Security data employed by Schwabish, which are selective on persons who may be less likely to emigrate (i.e., those legally in the United States), thereby lowering emigration rates (particularly for those who are more likely to be undocumented). Additionally, many voluntary exits from the labor force for childbearing women are falsely coded as exits from the country. Van Hook’s and Passel’s use of the CPS is more inclusive and captures emigration for both undocumented and documented cohorts, although slight differences between these methodologies produce higher emigration (and thus higher relative undercount) for Van Hook. Combining the three estimates under the average estimate attenuates some of the variation across estimates and between variable categories. Despite the variation across estimates, the sets of relative undercount are highly correlated (see Online Resource 1).
Age has a parabolic relationship with PRN undercount across all estimates: cohorts of the middle age ranges tend to have high relative undercount. Year of entry is strongly related to relative undercount: cohorts ineligible for amnesty under IRCA have much higher relative undercount than those eligible for amnesty. On average, the relative undercount estimates for entrants in 1980–1981 (14.3 %) are just over one-half the size of the estimates for 1982–1984 (25.3 %), demonstrating that relative undercount is not linearly distributed by years of residence. The “undocumented regions,” namely Africa (averaging 27.5 % relative undercount), Central/South America (13.9 %), Mexico (33.3 %), and other Caribbean Islands (22.1 %), have higher relative undercount. These patterns parallel the results of previous undercount studies (albeit at much higher magnitudes) of the total population for the 1990 census: DA had estimated that black males (8.13 %) and males aged 18–49 (2.31 % to 3.47 %) were highly undercounted (Robinson 2001), while the PES had estimated that blacks (4.57 %) and Hispanics (4.99 %) were highly undercounted (Mulry 2007).
Figure 1 displays relative undercount estimates by year and region of origin, where regions bearing similar patterns or low counts are grouped (see Online Resource 1 for corresponding estimates). The Asia/Canada/Cuba/Europe/Oceania group depicts lower PRN undercount, which generally increases for progressively recent entrants. The Africa/Mexico/other Caribbean group tends to feature the highest relative undercount values, although their values dramatically increase for cohorts that are ineligible for amnesty under IRCA (in the shaded areas in Fig. 1). This bivariate result suggests that the higher relative undercount estimates for cohorts ineligible for amnesty may largely be driven by these four region-of-origin cohorts in particular, consistent with the story that those fitting the profile of the undocumented immigrant are those who are more likely to be relatively undercounted on the 1990 (vis-à-vis the 2000) census.
The pattern for Central/South Americans, however, is unique in that relative undercount is high for the earliest arrivals and declines into the early 1980s, only to rise steeply for cohorts that are ineligible for amnesty under IRCA. Drawing from the central ideas proposed in the hypotheses, one may speculate that the declining relative undercount rates ending in the early 1980s may be due to the high numbers of (documented) asylees and refugees arriving from Central/South America to flee from the military dictatorships and civil wars that were prevalent throughout Latin America in the 1970s. The high relative undercount for Central/South Americans arriving after 1981, however, may stem from the higher probability that the members of these cohorts are undocumented.
Table 4 displays the results of the OLS regression models that predict PRN undercount. Model 1 includes the basic control variables. Age has a very small linear effect on relative undercount, although this relationship is arguably negligible when considering its small size and the inclusion of other variables in the model (i.e., ages 15–44 and years of residence). An expected age effect is found for cohorts aged 15–44, which tend to have 8.76 % higher relative undercount, supporting Hypothesis 1 (H1) (i.e., cohorts aged 15–44 having higher relative undercount; see Table 1). Male cohorts also tend to have around 9.07 % higher relative undercount than female cohorts, confirming Hypothesis 2 (H2). Model 1 demonstrates that cohorts who have been in the United States for longer periods of time are less relatively undercounted by 0.76 % for an extra year of residence, confirming Hypothesis 3 (H3). All these relationships are robust under the following model specifications.
Model 2 adds dummy variables for two cohorts that are ineligible for IRCA amnesty: one arriving before and the other after the passage of IRCA. As hypothesized, persons arriving prior to IRCA but who were ineligible for amnesty have 6.13 % higher PRN undercount, confirming Hypothesis 4 (H4) in this model. Arrival after the passage of IRCA (also making one ineligible for amnesty) has an even stronger effect (9.16 %) on relative undercount, supporting Hypothesis 5 (H5).
Model 3 adds dummy variables for regions of origin, omitting Canada, Europe, and Oceania (e.g., Australia and New Zealand). Although African, Mexican, and other Caribbean cohorts feature higher PRN undercount (by 7.75 %, 15.12 %, and 5.93 %, respectively), Central/South American cohorts feature lower PRN undercount (–3.26 %). Using this model, Hypothesis 6 (H6) is supported for Africans, Mexicans, and other Caribbeans, although not for Central/South Americans. Cohorts from Asia and Cuba have lower relative undercount than the omitted categories, which may reflect their lower probabilities of being undocumented compared with the omitted categories (drawing from Passel et al. (2004); see Online Resource 1).
Model 4 controls for the slope dummy variables for years of residence by place of origin and does not address any of the hypotheses directly, although it provides an illuminating contrast with Model 5. The interaction coefficients identify relative undercount by years of residence in the United States that are specific to each place-of-origin cohort. In Model 4, a majority of the theorized relationships from the previous models continue to hold, although the dummy variable for Africa loses its positive relationship with PRN undercount. Slope dummy variables for years of residence in the United States are significant for Asians (positive) and Mexicans (negative). However, neither of these relationships—nor many other hypothesized relationships—remain significant under the full model (Model 5), which adds interactions between place of birth and entry in periods that make one ineligible for amnesty under IRCA (the central independent variables under study).
Model 5 demonstrates that several of the interactions between origins from undocumented regions and entry during periods of IRCA amnesty ineligibility (1982–1990) are significant and often dwarf the other relationships in the model. These coefficients identify the relationship between relative undercount and IRCA amnesty ineligibility for specific region-of-origin cohorts. Africans, Central/South Americans, Mexicans, and other Caribbeans who entered after IRCA have higher relative undercounts of 22.99 %, 30.01 %, 41.91 %, and 39.01 %, respectively. Central/South Americans and Mexicans arriving prior to IRCA who are ineligible for amnesty (1982–1986) also feature higher relative undercount (10.51 % and 16.13 %, respectively). These results support Hypothesis 7 (H7), demonstrating higher relative undercount for cohorts that are likely to be viewed as undocumented. The inclusion of these interaction effects eliminates the significance of many other variables and challenges their attendant hypotheses (as summarized in Table 1): specifically, IRCA ineligibility (associated with H4 and H5) and origin from undocumented regions (H6) no longer explain relative undercount as they had in the previous models. Rather, these effects are accounted for by the interactions of these two concepts. This suggests that neither persons from undocumented regions (with the exception of Mexicans) nor arrivals who are ineligible for IRCA amnesty necessarily have high relative undercount. Rather, persons from undocumented regions who were also ineligible for amnesty are those who have high relative undercount.
To test the robustness of the main findings, the following sensitivity analyses make three cumulative adjustments to retest the central hypothesis under a worst-case scenario. First, one can challenge the accuracy of the death estimates for Latinos based on literature on the so-called Hispanic mortality paradox (Palloni and Arias 2004) by decreasing death rates for Latinos vis-à-vis the total population (informed by U.S. Census Bureau (2012)). Second, one can challenge the assumption that circular migrants were not temporarily absent for the 1990 census, only to return to be counted on the 2000 census (thus artificially inflating relative undercount). To adjust for this scenario, cohorts from Latin America and other Caribbean Islands entering between 1987 and 1990 are “reverse-survived/emigrated” three years from 1990, and the estimated emigrants who were not in the country for the 1990 census are assigned annual return probabilities of 4 % throughout the 1990s (Massey et al. 2002) and are added to the expected counts for said cohorts. Third, one can challenge the assumption that the average emigration estimates are accurate by rerunning the analyses with data sets that use different emigration inputs.
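As a rough illustration of the second adjustment, the sketch below shows how a constant 4 % annual return probability accumulates over the 1990s and how the resulting returnees could be added to a cohort's expected count. The ten-year horizon, the function names, and the cohort sizes are hypothetical and chosen only for illustration; mortality and repeat emigration among returnees are ignored, so this is not the procedure used to produce Table 5.

```python
def cumulative_return_share(annual_prob: float, years: int) -> float:
    """Share of absent migrants expected to return within `years`,
    assuming a constant and independent annual return probability."""
    return 1.0 - (1.0 - annual_prob) ** years


def adjusted_expected_count(expected: float, absent_emigrants: float,
                            annual_prob: float = 0.04, years: int = 10) -> float:
    """Add estimated returnees to a cohort's expected 2000 count.

    `absent_emigrants` stands for the reverse-survived emigrants who were
    out of the country for the 1990 census (a hypothetical input here).
    """
    returnees = absent_emigrants * cumulative_return_share(annual_prob, years)
    return expected + returnees


# Hypothetical cohort: 10,000 expected persons and 1,000 absent emigrants.
share = cumulative_return_share(0.04, 10)
adjusted = adjusted_expected_count(10_000, 1_000)
print(f"cumulative return share: {share:.3f}")
print(f"adjusted expected count: {adjusted:.0f}")
```

Under these assumptions, roughly one-third of the absent emigrants return by 2000, which raises the expected count and therefore lowers the estimated relative undercount for the cohort.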
Table 5 displays the coefficients corresponding to the central hypothesis (H7) based on the full regression model (Model 5 in Table 4) under six different scenarios to test the robustness of the findings. Model 1 displays the coefficients from Model 5 in Table 4 to serve as a baseline comparison. Model 2 adjusts deaths upward for Latin Americans, and Model 3 further adjusts expected counts for return migration. Under both scenarios, the expected relationships decrease in size and significance, but all coefficients remain significant, although the coefficient for Central/South Americans entering in 1982–1986 retains significance only when a one-tailed test is used. Model 4 uses the average of the Passel and Schwabish PRN estimates to see whether the relationships still hold when not using the (high) Van Hook estimates, and loses significance only for other Caribbeans entering in 1982–1986. Not surprisingly, Van Hook estimates (Model 5) produce significant relationships for all variables of interest, with the exception of Africans entering in 1982–1986. Interestingly, Model 5 also produces a significant positive relationship for post-IRCA Asians, although this finding is unique to the Van Hook estimates. Like the Van Hook estimates, the Passel estimates (Model 6) produce significant relationships for all variables of interest besides Africans entering in 1982–1986. The Schwabish estimates (Model 7), however, retain significance only for post-IRCA Mexicans and other Caribbeans. Thus, omitting the Schwabish estimates from “average” relative undercount estimates would further support the central hypothesis; again, the Schwabish methodology omits (and likely understates) emigration for undocumented migrants, which depresses their relative undercount estimates.
In short, the findings that support the central hypothesis are robust, and the main relationships of the full model (Model 5 in Table 4) retain their significance over the majority of the models.
This research explores relationships between cohort characteristics and percentage relative net undercount of the foreign-born population for the 1990 vis-à-vis 2000 census (“PRN/relative undercount,” for short). A variation of demographic analysis was implemented to estimate the extent of relative undercount by age, sex, year-of-entry, and place-of-origin cohorts. Multivariate regression models then explore some of the correlates of relative undercount, demonstrating that males and persons aged 15–44 tend to feature higher relative undercount, whereas additional years of residence in the United States decrease relative undercount.
The cohorts with the highest PRN undercount are those that fit the stereotypical image of the “undocumented immigrant”—namely, cohorts from Africa, Central/South America, Mexico, and the Caribbean (excluding Cuba), who also entered the country in periods that made them ineligible for amnesty through IRCA. Under the full model, the magnitude of these interaction effects generally surpasses that of the effects found for the other variables, squelching many to nonsignificance (including the aforementioned noninteracted region and period-of-entry dummy variables). That is, relative undercount is explained neither by entry in relation to the IRCA amnesty period nor by origin from “undocumented regions,” but rather by the interactions of these variables.
Many explanations of undercount that have been offered in previous research can help to uncover some of the drivers of PRN undercount estimated in this study. At a very basic level, the 2000 census featured much lower net undercount than did the 1990 census, which would naturally produce high relative undercount for all cohorts under study. The distinct patterns of the magnitude of relative undercount between different cohorts can also be addressed using previous literature. For example, previous literature has noted that many of the foreign-born were simply uninformed of the census and felt that it was not applicable to them (Rodriguez and Hagan 1991). This relates to the finding that cohorts less “rooted” in the United States (operationalized as those who have lived in the United States for fewer years) tend to be relatively undercounted. Additionally, relative undercount may be a reflection of recent immigration from poor countries, given that persons living in irregular housing (Mahler 1993; Velasco 1992) and/or with limited English skills (Garcia 1992) were undercounted in 1990. In some ways, this can be reflected in the main relationships found in this study because cohorts from “undocumented regions” (who are more likely to be poor) who are ineligible for amnesty under IRCA have higher relative undercount. However, such an interpretation of the results does not account for the lack of a corresponding linear relationship between relative undercount and year of entry from poor countries (particularly for Central/South Americans for whom the relationship is in the opposite direction). Instead, the effects are specific to cohorts that entered between 1982 and 1990, which suggests that other dynamics may have stronger influences on relative undercount than recent entry in itself.
Tying this research to literature on the effect of privacy and confidentiality concerns on census response (particularly for the undocumented population; see Montoya 1991; Romero 1992; Velasco 1992), I suggest that the political context in which a census is embedded affects census counts. For the 1990 census, undocumented persons were more likely to feel threatened by the political environment because the recent passage of IRCA arguably created a climate of fear and distrust for undocumented populations. IRCA may have led undocumented persons to feel threatened by the government (and thereby the census) as well as by xenophobic populations who had become more exposed to anti-illegal-immigrant rhetoric. This climate may affect not only highly undocumented cohorts but also those who share similarities with the stereotypical image of the “illegal immigrant.” Other Caribbeans, for example, may have African roots (e.g., Haitians and Jamaicans) and/or a Spanish native tongue (e.g., Dominicans), which may lead persons to treat them as if they were members of highly undocumented cohorts. Cubans may be buffered from such dynamics through social enclaves and federal assistance programs, not to mention the economic and psychological security of receiving relatively instantaneous permanent legal status and preferential treatment among migrants.
Methodologically, this research offers some extensions that can be made to current demographic analysis. Alternate emigration estimates by cohort characteristics (Passel et al. 2006; Schwabish 2009; Van Hook et al. 2006) allow this research to use DA to set up a regression analysis. Future research can produce updated emigration estimates by using recent data and by drawing from different sources to apply to different cohorts (e.g., Social Security emigration estimates for documented cohorts, and adjusted CPS estimates for other cohorts). An important and relatively simple adjustment to make to Schwabish estimates is to adjust emigration rates for probabilities of childbearing or exits from the work force to prevent false-positive identification of emigration. This research’s approach to DA, along with modified Schwabish estimates of emigration, can be adopted by scholars—including scholars at the Office of Immigration Statistics (i.e., Hoefer et al. 2011)—who wish to estimate the size and characteristics of the enumerated undocumented population. Specifically, the “documented” foreign-born population can be survived and emigrated, and the differences between the estimated documented foreign-born population and the total enumerated foreign-born population could be used as an estimate of the undocumented population. Further specifications for such a study may also draw on well-developed methodologies such as the methodology employed by Passel et al. (2004).
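The residual logic described above can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions: a single cohort, one intercensal period, and survival and emigration probabilities that are independent of each other. All function names and numeric inputs are hypothetical and are not estimates from this study.

```python
def project_documented(base_count: float, survival_prob: float,
                       emigration_prob: float) -> float:
    """Survive and emigrate a documented cohort over one period."""
    return base_count * survival_prob * (1.0 - emigration_prob)


def residual_undocumented(total_enumerated: float,
                          documented_projected: float) -> float:
    """Enumerated undocumented population as a residual:
    total enumerated foreign-born minus projected documented foreign-born."""
    return total_enumerated - documented_projected


# Hypothetical cohort: 100,000 documented persons at the start of the period,
# 95 % survival, 10 % emigration, and 120,000 persons enumerated at the end.
documented = project_documented(100_000, 0.95, 0.10)
undocumented = residual_undocumented(120_000, documented)
print(f"projected documented: {documented:.0f}")
print(f"residual undocumented: {undocumented:.0f}")
```

Because the residual inherits every error in the survival and emigration inputs, the quality of the emigration estimates matters as much here as it does for the relative undercount estimates.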
Larger implications of this study suggest that respondents’ perceptions of the data collector can affect data quality. In this case, groups that feel negatively targeted may grow more suspicious of the government’s intentions and thus of its operations, including the census, which underscores the need for the Census Bureau to remain neutral and nonthreatening in the information that it gathers. Future studies may demonstrate that many populations that feel discriminated against or targeted, including Hispanics, blacks, South/West Asians, and Middle Easterners (although some of the latter two categories would be prompted to identify as “white”), continue to be undercounted and underrepresented. A natural extension of this study is to apply some of the ideas proposed here to the 2010 census, which I was unable to do owing to the current unavailability of microdata. Future studies may also further explore how various indicators of political climate and/or sentiments toward the government affect participation in other governmental institutions (such as the educational system); minority participation in higher education (or another outcome) may be related to feelings of threat or general cynicism toward the government and thus toward the government’s institutions.
On a practical level, this research has implications for the political outcomes of minority populations, including those of legal residents. Population counts are used to inform congressional and electoral power, as well as funding for educational programs, road construction, and other federal programs. Preliminary studies of undercount for the 2010 census demonstrate that blacks (2.1 %), Hispanics (1.5 %), and Native Americans (on reservations, 4.9 %) continue to be undercounted relative to Asians (0 %) and non-Hispanic whites (overcounted by 0.8 %) (Mule 2012). Ironically, many of those in need of federal support (that is, cohorts that may be at an economic disadvantage or feel discriminated against) are those who are more likely to be deprived of appropriate levels of funding as a result of their undercount.
Thanks to David Swanson, Vanesa Estrada-Correa, Augustine Kposowa, J. Gregg Robinson, Robert Bozick, Trey Miller, Megan Beckett, Peter Brownell, Tori Velkoff, Michael Rendall, and Demography’s anonymous reviewers.
Puerto Ricans are excluded in this analysis.
Correlation error of a capture-recapture study refers to the tendency for persons who are missed in census enumeration (the “capture” phase) also to be missed in the recapture phase.
Ages 15–44 (vs. 18–49) are selected based on preliminary explorations of the data.
By the end of 2000, an estimated 2.7 million immigrants were granted legal permanent residence through IRCA legislation (U.S. Immigration and Naturalization Service 2002). In comparison, Passel and Woodrow (1987) estimated that 2,093,000 undocumented residents over the age of 13 were captured in the Current Population Survey in April 1983. Although Passel and Woodrow (1987) provided an imperfect measure of all undocumented immigrants eligible for amnesty, it is reasonable to assume that a majority of eligible immigrants were legalized under IRCA.
Any one absence could not exceed 45 days, and the total absences could not exceed the sum of 180 days.
Van Hook et al. (2006) and Passel et al. (2006) (referred to here as “Van Hook” and “Passel,” respectively) assumed that total non-follow-up (NFU) in the CPS was equal to the sum of NFU resulting from emigration, internal migration, death, and “other reasons.” Internal migration was estimated using the “residence one year ago” item on the CPS. Deaths were estimated using the National Health Interview Survey (Van Hook) or life tables (Passel). For Van Hook, NFU for “other reasons” for native-born persons was solved for by subtracting NFU for deaths and internal migration (emigration was assumed to be 0), and this rate was assumed to be identical for foreign-born persons. For Passel, NFU for “other reasons” was a function of matching-processing error and was estimated using a multivariate model. Emigration rates were thus solved for as the remaining probability that makes up total NFU, which was then adjusted using estimates of circular migration (based on Massey et al. (2002)).
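The accounting identity in this note can be illustrated with a short sketch that follows the Van Hook variant, in which the “other reasons” rate is solved for among the native-born and then carried over to the foreign-born. The function names and all rates are hypothetical, and the subsequent adjustment for circular migration is omitted.

```python
def other_nfu_from_native(native_total: float, native_internal: float,
                          native_death: float) -> float:
    """NFU for "other reasons" among the native-born, whose emigration
    is assumed to be zero."""
    return native_total - native_internal - native_death


def emigration_rate(fb_total: float, fb_internal: float,
                    fb_death: float, other: float) -> float:
    """Emigration as the remainder of total foreign-born NFU."""
    return fb_total - fb_internal - fb_death - other


# Hypothetical annual rates, for illustration only.
other = other_nfu_from_native(native_total=0.20, native_internal=0.12,
                              native_death=0.01)
emigration = emigration_rate(fb_total=0.25, fb_internal=0.13,
                             fb_death=0.01, other=other)
print(f"other-reasons NFU: {other:.3f}")
print(f"implied emigration rate: {emigration:.3f}")
```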
Schwabish (2009) used three linked administrative data sources provided by the Social Security Administration (i.e., the Detailed Earnings Records, Numerical Identification System, and Master Beneficiary Record) to identify foreign-born work histories. Workers who reported earnings for at least one year and subsequently had at least two years of non-employment were assumed to have died or emigrated from the country. Although Schwabish (2009) estimated emigration rates for the documented foreign-born, the total emigration rate did not differ between the documented and undocumented populations when one controls for emigration rates by year of entry. Using separate emigration rates for undocumented residents and total emigration rates by year of entry (Passel et al. 2006), assuming that the counted undocumented population in 1990 was 2,176,000 (Woodrow-Lafield 1995) and that the number of undocumented persons who entered before 1980 is negligible, I find that the rate of emigration for undocumented persons is nearly identical to that of all emigrants (2.1 vs. 2.08; results not shown).
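The identification rule described in this note can be sketched as follows. The sketch assumes that the two or more years of non-employment must run to the end of the observed record; that reading, the function name, and the toy earnings histories are illustrative assumptions rather than details taken from Schwabish (2009).

```python
from typing import Sequence


def flagged_as_exit(earnings_by_year: Sequence[float], min_gap: int = 2) -> bool:
    """Flag a worker as having died or emigrated when the last year with
    positive earnings is followed by at least `min_gap` years without
    earnings at the end of the record."""
    last_worked = None
    for index, earnings in enumerate(earnings_by_year):
        if earnings > 0:
            last_worked = index
    if last_worked is None:
        # Never reported earnings, so the rule does not apply.
        return False
    trailing_years = len(earnings_by_year) - 1 - last_worked
    return trailing_years >= min_gap


# Hypothetical annual earnings histories.
print(flagged_as_exit([21_000, 23_500, 0, 0]))   # flagged: two trailing years without earnings
print(flagged_as_exit([21_000, 0, 0, 18_000]))   # not flagged: the worker returned to work
```

A rule of this kind also flags anyone who leaves paid work for reasons such as childbearing, which is the source of false-positive emigration that a modified estimate would need to correct.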
Because circular migrants are more likely to report a more recent year of entry upon their return to the United States (Redstone and Massey 2004), circular migrants (who are also more likely to be undocumented) are more likely to report a year of entry more recent than 1990 on the 2000 census. This will result in deflated counts in 2000 relative to 1990 for the cohorts of interest and work against the central hypothesis. Sensitivity analyses that follow account for circular migrants who were absent for the 1990 census but returned for the 2000 census.
Cubans who establish permanent residence in another country lose all rights to reside in Cuba (see Cuba’s Ley No. 989/1961), and repatriation is restricted to select minors and persons over the age of 60 (with a few exceptions). Because of these limitations, it is assumed that only the proportions of Cuban cohorts who did not naturalize (drawing from U.S. Immigration and Naturalization Service 1997, 2002) who entered after 1960 are at risk of emigration. That is, the ratio of naturalizing Cubans to all Cuban entrants (by decade) is multiplied by the sizes of the corresponding Cuban cohorts to estimate the Cubans at risk of emigration. Additionally, it is assumed that 1980 entrants are at risk of emigration as a result of the repatriation of the 1980 “Marielitos.”
This approach is not negatively affected by cohorts with few or zero members.
A total of 18 projected cohorts are not matched on all parameters for the corresponding observed cohorts. Because nonmatches are exclusive to older cohorts, nonmatches are presumably a product of age misreporting, and ages are adjusted such that matches could be made.
When these dummy variables and their interaction effects are grouped to replicate Model 5, the error sum of squares is not significantly decreased in comparison with the larger model that includes separate (nongrouped) dummy and interaction variables.