A person’s racial or ethnic self-identification can change over time and across contexts, which is a component of population change not usually considered in studies that use race and ethnicity as variables. To facilitate incorporation of this aspect of population change, we show patterns and directions of individual-level race and Hispanic response change throughout the United States and among all federally recognized race/ethnic groups. We use internal U.S. Census Bureau data from the 2000 and 2010 censuses in which responses have been linked at the individual level (N = 162 million). Approximately 9.8 million people (6.1 %) in our data have a different race and/or Hispanic-origin response in 2010 than they did in 2000. Race response change was especially common among those reported as American Indian, Alaska Native, Native Hawaiian, Other Pacific Islander, in a multiple-race response group, or Hispanic. People reported as non-Hispanic white, black, or Asian in 2000 usually had the same response in 2010 (3 %, 6 %, and 9 % of responses changed, respectively). Hispanic/non-Hispanic ethnicity responses were also usually consistent (13 % and 1 %, respectively, changed). We found a variety of response change patterns, which we detail. In many race/Hispanic response groups, we see population churn in the form of large countervailing flows of response changes that are hidden in cross-sectional data. We find that response changes happen across ages, sexes, regions, and response modes, with interesting variation across racial/ethnic categories. Researchers should address the implications of race and Hispanic-origin response change when designing analyses and interpreting results.
Racial and ethnic groups1 are not inherent divisions of society. The definition of each group and concepts of the “typical” member vary over time and place, and are affected by political regimes, intergroup relations, and personal interactions (Barth 1969; Haney López 1996). Relatedly, people sometimes change their sense of which race(s) or ethnicity best describe them. Although many researchers understand that race and ethnicity are social constructions, acknowledgment that some individuals change their race and/or ethnicity response is not usually meaningfully incorporated into analyses of the social world. Instead, analyses are designed and described as if each person is a permanent member of a racial/ethnic group (e.g., Adamczyk et al. 2016; Damaske and Frech 2016; Elliott 2015; Krivo et al. 2015; Lopoo and London 2016; Mehta et al. 2013).
In this study, we use linked data from the 2000 and 2010 censuses (N = 162 million; not nationally representative) to provide a dramatic expansion of information about individual-level changes in race and Hispanic-origin responses.2 To what extent do race and/or Hispanic-origin responses change? We use descriptive statistics and data visualization to show the extent of individual-level response stability and change across the decade. Is change more common to/from some racial/ethnic groups than others? We show the 20 most common detailed changes and three case studies showing transitions between common combinations of racial/ethnic groups. Does the propensity to change responses vary by characteristics of the individual? We provide rates of individual-level race and/or Hispanic response change by age and sex for 12 race/Hispanic categories. And to what extent do race and Hispanic-origin response changes affect research findings? We calculate response change rates for a variety of aggregated race and Hispanic-origin categories.
We are not the first to study response change. Using the limited data available, other scholars have raised important questions and given helpful insights (e.g., DeFina and Hannon 2016; Guo et al. 2014; Harris and Sim 2002; Liebler and Ortyl 2014; Loveman and Muniz 2007; Saperstein and Gullickson 2013; Saperstein and Penner 2012, 2014). Previous empirical studies have been constrained in their ability to give information about all types of people in the United States. In quantitative studies, sample sizes have inhibited investigation of dynamics within smaller response groups (particularly American Indians, Pacific Islanders, and double minorities; e.g., DeFina and Hannon 2016; Saperstein and Penner 2012). Cohort-specific studies have focused attention on a limited range of ages (e.g., Harris and Sim 2002; Saperstein and Penner 2014). Analyses of cross-sectional data have revealed net changes but not flow or churn (e.g., Liebler and Ortyl 2014; Perez and Hirschman 2009). Publicly available linked census data are from a century ago (Loveman and Muniz 2007; Saperstein and Gullickson 2013). Qualitative studies necessarily focus on particular populations (e.g., Rockquemore 1998; Sturm 2011). We advance knowledge of response change using much larger and more diverse data than has been available to other researchers. We show detailed flow information and include all federally defined race and Hispanic-origin groups, and people of all ages, in the modern era.
Differences across response categories in the extent and types of response change may reflect key cross-category differences in how race and ethnicity are socially constructed. We give empirical evidence of age-, sex-, and group-specific patterns that have not yet been accounted for in sociological theories of race.
Response change can affect any analysis in which racial/ethnic groups are compared with one another. We present three related examples. First, if a highly educated person changes her response from X to Y, then group Y’s mean education rises and group X’s falls; statistics show a change in the measured attributes of both groups yet no individual gained (or lost) education. Second, group-specific cross-sectional data on a characteristic such as income reflect the income of people who have stable identification with the group as well as the income of those who are newly identified group members. Third, the extent to which two groups are (measured as) residentially segregated reflects not only the locations of people with stable responses but also any location-specific processes that increase or decrease the chances a person will give a particular response. Our work highlights the extent to which the assumption of universal response stability is untenable, as well as areas where the assumption holds relatively well.
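The first example can be made concrete with a small worked calculation. The sketch below uses invented years-of-schooling values for two hypothetical groups, X and Y (none of these numbers come from the census data), and moves one highly educated person from X to Y without changing anyone's education.

```python
from statistics import mean

# Hypothetical years of schooling; illustrative values only, not census data.
group_x = [10, 12, 12, 14, 18]   # the person with 18 years will change response
group_y = [11, 12, 13]

mean_x_before = mean(group_x)
mean_y_before = mean(group_y)

# The same person is now reported in group Y; no one's schooling changed.
mover = 18
group_x_after = [v for v in group_x if v != mover]
group_y_after = group_y + [mover]

mean_x_after = mean(group_x_after)
mean_y_after = mean(group_y_after)

print(f"Group X mean: {mean_x_before:.2f} -> {mean_x_after:.2f}")
print(f"Group Y mean: {mean_y_before:.2f} -> {mean_y_after:.2f}")
```

In this toy setup the mover's schooling is above the mean of both groups, which is what makes X's mean fall and Y's mean rise at the same time; a mover whose value lay between the two means would shift the group means differently.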
Extent of Response Change and Response Groups Most Affected
Overall Rates of Race/Hispanic Response Change
To evaluate decennial data quality, the Census Bureau conducts postcensus reinterview studies (usually by phone) with a sample of people in households (Dusch and Meier 2012; Singer and Ennis 2003; U.S. Census Bureau 1993).3 Census reinterview studies found an overall rate of race response change of 4 % in 1990 (U.S. Census Bureau 1993: table 4.6), 8 % in 2000 (Singer and Ennis 2003: table E.24), and 6 % in 2010 (Dusch and Meier 2012: table 8). Using a study of adolescents in the 1990s (National Longitudinal Study of Adolescent Health (Add Health), 1994–1995), Harris and Sim (2002) showed that 12 % reported a different race at home versus at school. In a panel of General Social Survey respondents surveyed in 2008, 2010, and 2012, 5 % changed responses between white, black, and another race response (DeFina and Hannon 2016).
Hispanic response change has been less common. Hispanic (yes/no) responses changed for 1 % of those reinterviewed in 1990 (U.S. Census Bureau 1993: table D.1), 2 % in 2000 (Singer and Ennis 2003: table E.8), and 1 % in 2010 (Dusch and Meier 2012: table 8). A comparison of Census 2000 responses to Current Population Survey (CPS) responses in nearby months revealed that 3 % of people were reported as Hispanic in one of these studies but not the other4 (del Pinal and Schmidley 2005:5; also see Alba and Islam 2009 and Eschbach and Gómez 1998).
Prior research on smaller or demographically limited samples shows that the extent of response change varies substantially by race response group, with notable stability in the white, black, Asian, and Hispanic response categories (Bentley et al. 2003; Brown et al. 2006; del Pinal and Schmidley 2005; Doyle and Kao 2007; Singer and Ennis 2003). For example, 2 % to 3 % of people who reported white, black, or Asian in Census 2000 gave a different response in the Census Quality Survey (CQS; Bentley et al. 2003:28). Although these groups have high levels of response stability, some people who self-report these single-race groups have mixed racial heritage (Bratter 2007; Liebler 2016) or mixed Hispanic and non-Hispanic heritage (Emeka and Vallejo 2011; Miyawaki 2016). Because these are large groups, even if a small proportion of people change responses, the number of changes can be substantial.
Race Response Instability
American Indian, Pacific Islander, and multiracial responses exhibit much greater instability. People of mixed heritage (a group including many with American Indian and/or Pacific Islander heritage) sometimes have a dynamic or border-straddling identity that is not easily brought into standard race categories (Rockquemore 1998; Root 1996); their outward self-presentation may be at odds with their family heritage and/or internal personal identities (Khanna and Johnson 2010) or be interpreted inconsistently by others (Porter et al. 2016). As feelings and experiences vary across time and context, people of mixed heritage may shift between marking both/all of their affiliated groups and marking only one.
About one quarter (22 % to 28 %) of people reported as American Indian, Pacific Islander, or Some Other Race (SOR) in Census 2000 had a different response in the CQS (Bentley et al. 2003:30). Of people reported as non-Hispanic multiple-race in Census 2000, about 60 % were reported as non-Hispanic single-race in the CQS (Bentley et al. 2003: tables 10 and 11). Even higher race response change rates have been found among people reported as Hispanic. A different race response was reported in the 2000 CPS than in Census 2000 for 13 % of those reported as Hispanic white in Census 2000, 45 % of those reported as Hispanic black, and 78 % of those reported as Hispanic American Indian (del Pinal and Schmidley 2005: table 13). See Roth (2012) for related qualitative research.
Instability in American Indian responses: Despite specific tribal and federal legal definitions of who is considered American Indian (Robertson 2013; Snipp 2003; Thornton 1997), there has been a large net increase over the past half century in identification as American Indian on the census (Eschbach 1993, 1995; Eschbach et al. 1998; Harris 1994; Liebler and Ortyl 2014; Passel 1976, 1997; Passel and Berman 1986) and in daily life (Fitzgerald 2007; Nagel 1996; Sturm 2011). We include non-Hispanic American Indians in one of our case studies. Rather than reporting net change as in prior research, we show both in-flows and out-flows to/from each race/Hispanic response group (also see Liebler et al. 2016).
Instability in Pacific Islander responses: Census Bureau Content Reinterview studies reveal “medium to high” levels of inconsistency in responses to the Pacific Islander category (Dusch and Meier 2012: table 27; Singer and Ennis 2003). In the CPS comparison, 28 % of those reported as non-Hispanic Pacific Islander in Census 2000 had a different response in the CPS (del Pinal and Schmidley 2005: table 11). With our very large data set, we are able to include Pacific Islanders in all parts of our analyses (including a case study), providing some of the first insights into response change within the Pacific Islander category.
Instability in multiracial responses: The social and legal history of the United States likely impacts which response change patterns are more common. The “black” category has been defined relatively strictly (Davis 2001; Haney López 1996) and may constrain people with black heritage to virtually always include a black response, even if they sometimes report additional races (Doyle and Kao 2007; Gullickson and Morning 2011; Guo et al. 2014; Harris and Sim 2002). Cross-sectional data show that multiple-race reports dominate among those who have mixed Asian or Pacific Islander heritage (Gullickson and Morning 2011; Hixson et al. 2012; Liebler 2016; Spickard 2001).
Instability in race responses when Hispanic origin is reported: Several factors might heighten race response instability among those who report Hispanic origins. First, the race question does not include a Hispanic response option, so people who view their race as Hispanic may be relatively uncommitted to a different race response (Compton et al. 2012; Rodríguez 2000). Second, people who identify with Latin American terms, such as mulatto or mestizo (Golash-Boza and Darity 2008), may not see their identity captured in U.S. racial categories. Third, like all immigrants, foreign-born people who identify as Hispanic might change their identities and race/Hispanic responses as they become more integrated into U.S. society (Landale and Oropesa 2002; Mowen and Stansfield 2016; Roth 2012; Waters 1999). Fourth, questionnaire design changes (discussed later) may influence some to report one (or more) of the federally defined race groups (Humes et al. 2011; Stokes et al. 2011). We give empirical evidence of patterns in race response stability and change among those reported as Hispanic and those reported as non-Hispanic.
Response Change From One Single Race to Another
Some response changes are from one single-race response to another, although little is known about these types of response changes. The limited data available show that adolescents of mixed heritage (e.g., Harris and Sim 2002) have made this response change and that thousands of people have changed to a single-race American Indian response from another (unknown) single-race response (Eschbach et al. 1998; Harris 1994; Liebler and Ortyl 2014; Liebler et al. 2016; Passel 1976, 1997). Some single-race-to-single-race response changes are likely a reflection of identity awakenings (Fitzgerald 2007; Sturm 2011), whereas others might reflect a change in context, reference group orientation, socioeconomic status, or a “chameleon change” experience (Miville et al. 2005; also see Kana’iaupuni and Liebler 2005; Liebler 2010; Stokes-Brown 2012). Our information about population churning between single-race groups is an important contribution.
Characteristics of People With Unstable Race/Hispanic Responses
Our data are not well suited for parsing the reasons for race and Hispanic response change and stability. In support of future efforts, however, we present summary information about the respondent characteristics available in our data: sex, age, location, and enumeration mode. Each characteristic might be associated with unstable responses. Some evidence suggests that women are socialized to have more complex identities than men (see Root 1998), and (among college freshmen) women are more likely to report multiple races than men (Davenport 2016), so women may be more likely to change their race/Hispanic responses. Younger people go through stages of personal identity development (Erikson 1980) and may change responses as a result. Also, older children may have parent-reported responses in 2000 but self-reported responses in 2010. The West region5 has higher levels of interracial marriage and multiple-race reporting than elsewhere in the United States (Jones and Bullock 2012; Wright et al. 2003), perhaps setting the stage for more response change in that region. Finally, the presence of an enumerator6 may influence which response is provided, potentially causing a different response in one year than another (Khanna 2004; Wilkinson 2011).
There are many other reasons why a person’s race and/or Hispanic origin response might change, including questionnaire design (Lavrakas et al. 2005; Snipp 2003; Stokes et al. 2011), situational identities (Harris and Sim 2002; Rodríguez 2000; Root 1996), difference in who fills out the form (Sweet 1994), change in self-understanding through change in circumstance or location (Eschbach 1993; Kana’iaupuni and Liebler 2005; Root 1998), and differences in post-enumeration procedures.
To What Extent Do These Changes Affect Research Findings?
Researchers often rely on the assumption that a racial or ethnic group includes the same individuals at each time point, except for differences due to births, migration, and deaths. Most data resources do not include measures of race/ethnicity at multiple time points. We show rates of response change in various aggregations of racial/ethnic categories to help researchers understand the extent to which response change might be affecting their data and results.
We used internal U.S. Census Bureau data from the 2000 and 2010 censuses, linked by the Census Bureau’s Center for Administrative Records Research and Applications (CARRA). CARRA used probabilistic record linkage techniques and name, sex, date of birth, and address to assign each person (where possible) in each data set an anonymized Protected Identification Key (PIK; see Wagner and Layne 2014). The PIKs were then used to link each person’s Census 2000 record to his or her 2010 census record.
By definition, linked data include only those who were present in both data sets. We cannot include Census 2000 respondents who died or left the country by 2010, new immigrants who arrived after 2000, children born after Census 2000, or people who were present but not enumerated in Census 2000 and/or the 2010 census (Mule 2012; U.S. Census Bureau 2003). The process of linking the data excludes everyone who does not have a Social Security number (SSN) or an individual Taxpayer Identification Number (TIN); see Wagner and Layne (2014) and Bond et al. (2013). Compared with those in the linked data, people in Census 2000 who could not be linked to 2010 had an older age distribution and were disproportionately reported as non-Hispanic black or Hispanic. Approximately 200 million people (199,917,723) were present, enumerated, and assigned a unique PIK in the full-count decennial censuses of 2000 and 2010, accounting for 81 % of the people with unique PIKs in 2000.
To minimize response changes due to differences in how the information was gathered, we excluded cases in which (1) the person lived in group quarters, because this information is often drawn from local administrative records (Chun and Gan 2014) (6,845,302 people); (2) information was collected from a neighbor or other proxy respondent (Porter et al. 2016) (4,868,556 people); or (3) the race or Hispanic origin was imputed or edited by the U.S. Census Bureau (21,144,912 people). Remaining responses were very likely given by the individual or by someone else in the household. (For more about who fills out census forms, see Sweet 1994.)
To minimize the chances of a false match, we excluded cases in which (1) the person’s age difference between the two censuses was less than eight years or more than 12 years (5,410,733 people); (2) all age information in a year was imputed (3,994,504 people); (3) the person’s sex did not match between the two censuses (1,232,272 people); or (4) sex information in a year was imputed (3,885,179 people). Despite these exclusions, it is likely that in some remaining cases, PIKs were not assigned to the correct person. Based on Layne et al. (2014), we anticipate that 0.2 % to 1.2 % of the cases in our data are false matches. False matches disproportionately affect rates for rare events—or in this case, numerically small groups (Hemenway 1997). See Table S1 in Online Resource 1 for our related calculations.
Some changes to the race and Hispanic questions and instructions occurred between 2000 and 2010 (see Fig. 1). The changes, detailed by Humes et al. (2011:2), were intended to increase reporting within the five Office of Management and Budget (OMB) race categories, decrease item nonresponse, and increase detailed race/ethnicity reporting. For example, the 2010 instruction, “For this census, Hispanic origins are not races,” was intended to encourage reporting in one of the five federally defined race groups as opposed to providing a response outside these groups (e.g., Mexican) that was then recorded as SOR. Experimental evidence suggests that the changes to the questions and instructions had the intended effects (Stokes et al. 2011). To minimize effects of questionnaire differences, we excluded people from households who returned an Alternative Questionnaire Experiment census form in 2010 (347,301 people; see Compton et al. 2012). Other effects of questionnaire changes remain in the data.
We also excluded cases in which the person was listed as SOR and at least one other race in 2000 (1,903,447 people). During the process of making the Census 2000 race write-in entries consistent between the enumerator-filled questionnaire and the mailout-mailback questionnaire, a processing error caused approximately 1 million cases to be permanently recoded as SOR multiracial (see U.S. Census Bureau 2007: data note 5). We excluded some cases for multiple reasons. We study the full set of people remaining after our exclusions, leaving us with 161,700,185 people.
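The exclusion counts reported above can be checked against the starting and final totals. The sketch below assumes that the 199,917,723 linked records are the starting universe and that the nine criteria listed in the text are the only exclusions; the short labels are ours, not Census Bureau variable names.

```python
# Totals as reported in the text.
linked_total = 199_917_723      # people with a unique PIK in both censuses
analysis_total = 161_700_185    # people remaining after all exclusions

# Exclusion counts as reported in the text (labels are informal shorthand).
exclusions = {
    "group quarters": 6_845_302,
    "proxy respondent": 4_868_556,
    "race/Hispanic origin imputed or edited": 21_144_912,
    "age difference outside 8-12 years": 5_410_733,
    "all age information imputed": 3_994_504,
    "sex mismatch": 1_232_272,
    "sex imputed": 3_885_179,
    "Alternative Questionnaire Experiment form": 347_301,
    "SOR plus another race in 2000": 1_903_447,
}

sum_of_reasons = sum(exclusions.values())
net_excluded = linked_total - analysis_total

# A person excluded for k reasons is counted k times in sum_of_reasons but
# only once in net_excluded, so the surplus counts the extra reasons.
extra_reason_count = sum_of_reasons - net_excluded

print(f"sum of per-reason counts: {sum_of_reasons:,}")
print(f"net people excluded:      {net_excluded:,}")
print(f"extra reason counts:      {extra_reason_count:,}")
```

Under these assumptions the per-reason counts sum to more than the net drop, which is what one would expect when some cases are excluded for multiple reasons.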
Write-in responses were categorized into federally defined race groups using slightly different protocols in 2000 and 2010. We corrected for this by applying the 2010 coding scheme to write-in responses given in 2000. Also, coding procedures for write-in lines changed from 2000 to 2010 for those who wrote more than two race responses on one write-in line. This may have a small impact on our results.
Our data are not nationally representative and should not be interpreted as such. Because these are total U.S. population data (not sample data), there are no weights. We show in Table 1 the distribution of race and Hispanic responses in the 162 million cases in our analysis data and compare them with parallel numbers for the full 2000 and 2010 population data. For example, 64 % of those reported as non-Hispanic white in 2000 are in our study data, but only 20 % of those reported as SOR Hispanic in 2010 are included. Liebler et al. (2014) applied response change rates from the linked data to the full population in Census 2000 and estimated that 8.3 % of the total population in 2000 was reported as a different race and/or ethnicity in 2010.
These data are uniquely well suited to study cross-time changes in individuals’ race and Hispanic responses. We have millions of responses that describe the same individual at two points in time; thus, we can observe response changes directly as opposed to using inference (e.g., cohort component analysis). Our data cover more than one-half of all people in the United States at the time, with a density that allows disaggregation into the many federally defined race/Hispanic-origin categories, as opposed to a study of only the largest groups. In addition, because these are the most recent U.S. decennial census data—data often used to study programs, policies, and American life—response changes in these data are worth understanding in and of themselves.
Extent of Response Change and Response Groups Most Affected
Overall Rates of Race/Hispanic Response Change
To what extent, and in which racial/ethnic groups, did individuals’ race and/or Hispanic responses change? Of the 161,700,185 people in our data, 6.1 % (9,782,918 people) had a different race and/or Hispanic response in 2010 than they did in 2000. Figure 2 displays a visual cross-tabulation, or “heat map,” of all response changes in our data.7 Each cell is darkened in accordance with the number of people with that combination of responses (stable responses are on the diagonals of each quadrant).8
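The cross-tabulation behind a heat map of this kind, and the overall change rate, can be sketched in a few lines. The records, category labels, and counts below are hypothetical stand-ins for linked (2000, 2010) response pairs; only the two totals in `reported_rate` come from the text.

```python
from collections import Counter

# Hypothetical linked records: (response in 2000, response in 2010).
# Labels and counts are illustrative, not taken from the census files.
records = (
    [("NH white", "NH white")] * 90
    + [("NH white", "NH white-AIAN")] * 2
    + [("NH AIAN", "NH white")] * 3
    + [("NH AIAN", "NH AIAN")] * 4
    + [("H SOR", "H white")] * 1
)

# Each distinct (2000, 2010) pair is one cell of the cross-tabulation.
crosstab = Counter(records)

# Stable responses sit on the diagonal; every other cell is a response change.
changed = sum(n for (r2000, r2010), n in crosstab.items() if r2000 != r2010)
toy_rate = changed / len(records)

# The same ratio computed from the totals reported in the text.
reported_rate = 9_782_918 / 161_700_185

print(f"toy change rate:      {toy_rate:.1%}")
print(f"reported change rate: {reported_rate:.1%}")
```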
Response changes spanned the full variety of race and Hispanic-origin groups. Shaded boxes are found throughout the figure, and many denote a large number of people. For example, thousands of people who were reported as single-race non-Hispanic Asian in 2000 (row 4) were reported in 2010 as (1) a different non-Hispanic single race (columns 1, 2, 3, 5, and 6 of the top-left quadrant), (2) non-Hispanic and multiple races (the remaining columns in the top-left quadrant), or (3) Hispanic Asian (column 4 in the top-right quadrant). Many people’s responses changed from one single race to another single race (first six rows/columns of each quadrant). Also, ethnicity responses changed for thousands of people; the top-right and bottom-left quadrants are both well populated with shaded boxes.
In Fig. 3 we give more detail about the identification flows into and out of each single-race and two-race response group, and into and out of the Hispanic ethnicity group. Each row in the figure includes all people who had a particular race/Hispanic response combination in 2000, 2010, or both. The charts in Fig. 3 can be seen as Venn diagrams of overlapping rectangles describing columns C, D, and E. For example, the center section shows the proportion of people who had that particular race/Hispanic response in both years: column D / (column C + column D + column E). See Table S1 in Online Resource 1 for estimates of the effect of false matches on these numbers. In most groups, the size of the population who left (column C) is similar to the size of the population who joined (column E). In other words, response churning is mostly hidden in cross-sectional comparisons of the 2000 to 2010 data.
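The relationship between the three columns and what a cross-sectional comparison would show can be sketched as follows. The function name and the counts are hypothetical; only the roles of columns C (left), D (stayed), and E (joined) follow the description above.

```python
def flow_summary(left, stayed, joined):
    """Summarize one response group from counts of leavers (column C),
    stayers (column D), and joiners (column E)."""
    ever_in_group = left + stayed + joined
    return {
        "share_stayed": stayed / ever_in_group,
        "share_left": left / ever_in_group,
        "share_joined": joined / ever_in_group,
        "count_2000": left + stayed,    # cross-sectional group size in 2000
        "count_2010": stayed + joined,  # cross-sectional group size in 2010
        "net_change": joined - left,
    }

# Hypothetical counts for one small response group (not from the paper).
summary = flow_summary(left=400, stayed=200, joined=410)
for name, value in summary.items():
    print(name, round(value, 3))
```

In this invented case the group appears to grow by only 10 people between the two cross-sections, even though 810 of the 1,010 people ever reported in it either left or joined.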
The large center bars for the single-race non-Hispanic white, black, and Asian response groups in Fig. 3 show that they were largely stable groups: 3 %, 6 %, and 9 % of these responses changed, respectively. The non-Hispanic and Hispanic response groups (regardless of race responses) also had substantially stable sets of incumbents: respectively, 1 % and 13 % of these responses changed. These same groups were usually found to be relatively stable in the short-term census follow-up studies (Bentley et al. 2003; del Pinal and Schmidley 2005; Singer and Ennis 2003), among adolescents and young adults (Brown et al. 2006; Doyle and Kao 2007; Saperstein and Penner 2014), and in assumptions by neighbors (Porter et al. 2016). This finding may be evidence that the socially constructed boundaries of the non-Hispanic white, black, and Asian groups, as well as the Hispanic/non-Hispanic boundary, are relatively well defined.
In all other racial/ethnic groups shown in Fig. 3, the number of people who left or joined is large relative to the number of people who stayed in the group. We find very substantial population churning among those reported as American Indian, Pacific Islander, and/or multiracial, and in terms of race responses among those reported as Hispanic. The double-minority response groups (rarely included in other studies) have the highest levels of response change. We conclude that to understand response stability and change, researchers need to study the full diversity of heritages, not just the larger groups.
Three Case Studies of Response Churning
When people had a different race/Hispanic response in 2010 than 2000, which specific groups did they leave/join? We answer this with three case studies, chosen because they encompass the most common response changes (discussed later) and because they include race groups that have extensive response change but small sample sizes in other studies: American Indians and Pacific Islanders. Our case studies are the first to show detailed response changes among double-minorities or people of all ages in the modern era.
In our first case study, shown in Table 2, we focus on Hispanic, white, and/or SOR responses in 2000 and/or 2010. The rows/columns list specific race/ethnicity groups, and the table shows 16 of the 8,064 cells depicted in Fig. 2. Hispanic race responses have most often been white or SOR; for more about race responses of people reporting Hispanic origins, see Golash-Boza and Darity (2008), Logan (2003), Miyawaki (2016), and Tafoya (2004).
The cells on the diagonal show that non-Hispanic white responses are quite stable whereas non-Hispanic SOR responses are not. Two situations in Table 2 do not follow the pattern of (generally) offsetting flows seen elsewhere in this table and in Fig. 3. First, more people changed responses from Hispanic SOR to Hispanic white (2,380,183) than the reverse (1,243,630). This imbalance is consistent with the 2010 questionnaire text intended to encourage a federally recognized race response (“For this census, Hispanic origins are not races”; see Stokes et al. 2011). Second, more people changed responses from non-Hispanic white to Hispanic white (710,019) than the reverse (417,855), perhaps because the inclusive word “origin” was added to the Hispanic-origin question instructions (see Stokes et al. 2011).
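The two imbalances can be expressed as net flows and flow ratios. The sketch below uses the four gross flows reported in the text; the helper name `net_and_ratio` is ours, and the calculation covers only each pair of categories, not every flow into or out of a group.

```python
# Gross flows between response categories, as reported in the text for Table 2.
flows = {
    ("Hispanic SOR", "Hispanic white"): 2_380_183,
    ("Hispanic white", "Hispanic SOR"): 1_243_630,
    ("non-Hispanic white", "Hispanic white"): 710_019,
    ("Hispanic white", "non-Hispanic white"): 417_855,
}

def net_and_ratio(origin, destination):
    """Return (net flow, forward/reverse ratio) for one pair of categories."""
    forward = flows[(origin, destination)]
    reverse = flows[(destination, origin)]
    return forward - reverse, forward / reverse

sor_net, sor_ratio = net_and_ratio("Hispanic SOR", "Hispanic white")
eth_net, eth_ratio = net_and_ratio("non-Hispanic white", "Hispanic white")

print(f"Hispanic SOR -> Hispanic white: net {sor_net:,}, ratio {sor_ratio:.2f}")
print(f"non-Hispanic white -> Hispanic white: net {eth_net:,}, ratio {eth_ratio:.2f}")
```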
In our second case study, we highlight response changes among people who were reported as non-Hispanic black, American Indian, and/or white in 2000 and/or 2010; see Table 3. These are the most long-standing U.S. populations, with centuries of interracial unions and many people of a variety of mixed descents (Brooks 2002; Katz 1986; Nash 1974; Perdue 2003).
Table 3 shows substantial response change from a single-race to a different single-race response between white and American Indian, as was suspected but not proven in prior research (Eschbach et al. 1998; Liebler and Ortyl 2014). See Liebler et al. (2016) for further analysis. Some response churning occurred between single-race white and single-race black responses, as found in historical data by Saperstein and Gullickson (2013) and Loveman and Muniz (2007).
Many people in Table 3 were reported as one race in one census and an additional race in the other census. Among those with white-black responses in 2000 (416,956), there were 90,086 reported as single-race black in 2010 and 35,837 reported as single-race white. This distribution of single-race responses shows relatively more white responses than found among adolescents who reported white-black multiracial in the 1990s (Doyle and Kao 2007; Harris and Sim 2002). White–American Indian and black–American Indian multiracial responses have a strong tendency toward white or black single-race responses when the responses change.
In our third case study (shown in Table 4), we highlight non-Hispanic Asian, Pacific Islander, and/or white responses. These three groups have a long, intertwined history, especially in what is now the western United States (Williams-León and Nakashima 2001). Those reported as Pacific Islander have high rates of response change (as shown in Fig. 3), yet the small total group size has excluded them from virtually all previous studies. Our research provides unique information about response changes affecting the Pacific Islander response group.
Of 152,640 people reported as non-Hispanic single-race Pacific Islander in 2000, approximately twice as many were reported as multiple-race white–Pacific Islander or Asian–Pacific Islander in 2010 as were reported as single-race white or Asian. This tendency to keep the Pacific Islander designation but add or remove an additional response impacts the effect of various category aggregations shown in upcoming Fig. 6. In our data, people whose responses changed between non-Hispanic white-Asian and a single race were usually reported as non-Hispanic white. Adolescents in the Add Health data, in contrast, more often collapsed a white-Asian response to single-race Asian (Doyle and Kao 2007; Harris and Sim 2002).
The three case studies reveal some general patterns. First, race and/or Hispanic response changes were in many different directions: between different single races, adding and dropping races, and changing whether Hispanic origins were reported. Second, in our data with people of all ages, some single-race responses were more common than others when a response changed from one race to two or vice versa; the largest of the groups was favored, except among white-black responses. Third, we continue to see similarly sized, countervailing flows between specific race/Hispanic-origin response categories. This churning is hidden in cross-sectional data.
Top 20 Response Changes
Next we show the 20 most common race response changes in our data (Fig. 4 and Online Resource 1, Table S2). The two most common response changes (seen earlier in Table 2) were changing from a Hispanic SOR response to a Hispanic white response, and the reverse (see Stokes et al. 2011). These two most common changes make up 37 % of race/Hispanic response changes in our data.9
At least three other patterns among the most common response moves are notable. First, many of these response moves involved a modification to the response, not a complete change; this is the case in 14 of the top 20 most common moves (ranks 3–7, 9, 11, 12, 14, and 16–20), suggesting that most response changes were not errors or false links. See Rockquemore (1998) and Root (1996) for qualitative research about fluid multiracial identities.
Second, many common response changes involved a change between a non-Hispanic single-race white response (the majority group) and a minority group response, often a two-race response that included white (ranks 3–7 and 14). Some of these response moves were anticipated by previous research about adolescents (Doyle and Kao 2007; Harris and Sim 2002).
Third, a number of people changed from one single-race response to another, most commonly between a majority group response and a non-Hispanic American Indian response (ranks 8 and 10) or a non-Hispanic black response (ranks 13 and 15). In our data, more of these moves involved leaving the majority group. That these people were ever reported as non-Hispanic white raises the possibility that their response (and perhaps their identity) might be “optional” and without social costs, as has been shown for ancestry responses among people who identify as non-Hispanic white (Gans 1979; Waters 1990). Optional identities have been thought to be a special case of white privilege and not available to those whose physical appearance generates socially enforced race labels imposed by others. Thus, these single-race to single-race response change patterns are a new and relatively unexpected finding.
Characteristics of People Whose Race and/or Hispanic Responses Changed
Relative Representation in the Top 20
Full-count censuses gather limited information about each person. For each of the top 20 response changes, we show the proportions who (in 2000) were children, were women, or lived in the West, and the proportion who used the mail response mode in both years (also see Online Resource 1, Table S2). The top two rows in Fig. 4 show the averages for all people in our data and all those whose response changed.10 People whose responses changed were more commonly children, living in the West, and/or using other response modes besides mail in at least one census (e.g., nonresponse follow-up by an enumerator). Women were not overrepresented among response changers. Our data are unique in their broad scope, so we are able to see that there were also millions of adults, people in other regions, and/or people who responded to the censuses by mail among those whose race and/or Hispanic response changed across the decade in our data.
Important variations in patterns are evident across the different response changes. Children predominated among those whose reports changed from black to white-black, or vice versa (ranks 12 and 18), while adults predominated among those reported as combinations of white and American Indian (ranks 5, 6, 8, 10, 16, and 19).
Those sometimes reported as Asian (ranks 17 and 20) or Hispanic SOR (ranks 1, 2, and 11) were more often in the West, while those sometimes reported as black (ranks 12, 13, 15, and 18) were mostly in non-West regions. Those whose responses changed between Hispanic white and non-Hispanic white (ranks 3 and 4) were less often in the West.
The experience of interacting with an enumerator can influence a person’s response (Wilkinson 2011), and people whose responses were through the mail in both years do not have this potential source of response change. Among those in Fig. 4, using the mail response mode in both censuses was most common among people who were sometimes reported as white-Asian (ranks 7, 14, 17, and 20). Involvement of an enumerator at least once was most common among people reported as Hispanic SOR in 2010 but not 2000 (rank 2) and those with single-race to single-race response changes (ranks 8, 10, 13, and 15).
Does Race/Hispanic Response Churning Affect Social Science Researchers?
Extent of Response Change, by Age and Sex
To assist others in understanding how response changes may be affecting their data, we use Fig. 5 to display the rates of race and/or Hispanic response change among people in our linked data by age group, sex, and within 12 relatively large race/ethnicity categories (5 age groups × 2 sexes × 12 race/ethnicity categories = 120 subpopulations; all variables are as of 2000); associated numbers are in Online Resource 1, Table S3. Each horizontal bar represents one of the subpopulations, sums to 100 %, and shows the percentage of those who had (1) the same response in both censuses, (2) a different race response, (3) a different Hispanic response, or (4) different race and Hispanic responses in 2010 than 2000.
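Readers with their own linked or panel data may wish to build a comparable tabulation. The following is a minimal sketch, assuming each linked record carries a race response and a Hispanic response for both time points plus the grouping variables as of the first time point. All field names, labels, and toy records are hypothetical, and this is not the code used to produce Fig. 5.

```python
from collections import Counter, defaultdict

OUTCOMES = ("same", "race only", "Hispanic only", "both")


def change_type(rec):
    """Classify one linked record into one of the four outcomes."""
    race_changed = rec["race_2000"] != rec["race_2010"]
    hisp_changed = rec["hisp_2000"] != rec["hisp_2010"]
    if race_changed and hisp_changed:
        return "both"
    if race_changed:
        return "race only"
    if hisp_changed:
        return "Hispanic only"
    return "same"


def shares_by_subgroup(records):
    """Percentage of each outcome within each (age group, sex, category) cell."""
    counts = defaultdict(Counter)
    for rec in records:
        key = (rec["age_group"], rec["sex"], rec["category_2000"])
        counts[key][change_type(rec)] += 1
    shares = {}
    for key, cell in counts.items():
        total = sum(cell.values())
        shares[key] = {o: 100.0 * cell[o] / total for o in OUTCOMES}
    return shares


# Toy records in a single hypothetical subgroup, one per outcome.
base = {"age_group": "0-17", "sex": "F", "category_2000": "group 1",
        "race_2000": "A", "hisp_2000": False}
records = [
    {**base, "race_2010": "A", "hisp_2010": False},  # same response
    {**base, "race_2010": "B", "hisp_2010": False},  # race response differs
    {**base, "race_2010": "A", "hisp_2010": True},   # Hispanic response differs
    {**base, "race_2010": "B", "hisp_2010": True},   # both differ
]

shares = shares_by_subgroup(records)
for key, cell in shares.items():
    print(key, cell)
```

Each cell's four percentages sum to 100, matching the layout of one horizontal bar. With real data, very small cells would need to be suppressed or pooled before interpretation.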
The rate of response change among those reported as single-race non-Hispanic white, black, or Asian was low across all age and sex groups with a tendency for more response change among children. We find higher levels of response change but fairly little variation by age or by sex among people reported (in 2000) as non-Hispanic American Indian, Pacific Islander, white–American Indian, another non-Hispanic response, or a Hispanic race response that was neither white nor SOR. Theories about reasons for response change for these groups should not rest heavily on age or gender dynamics.
Other groups in Fig. 5 do show age and/or sex gradients in race response changes. Young people had a lower rate of race response change than older people among those reported in 2000 as non-Hispanic white-black, non-Hispanic white-Asian, or Hispanic SOR. Increases in white-black and white-Asian interracial unions have perhaps allowed the younger generation to be relatively comfortable with (and stable in) a multiracial identity (see Korgen 1998). Older people reported as Hispanic white in 2000 had more stable race/Hispanic responses than did younger people in the same response category; perhaps the reasons for choosing a white race response are clearer to older people who identify as Hispanic (see Dowling 2014 and Vargas 2015).
Extent of Response Change When Categories Are Combined for Analysis
Some analysis strategies implicitly assume that responses do not change. These include, for example, traditional race-/ethnicity-specific life tables or residential segregation measures that assume responses are unaffected by neighborhood composition. Researchers using these methods might wish to reduce cross-category response change by aggregating categories. We use Fig. 6 (and Table S4 in Online Resource 1) to illustrate the extent of response churning across various aggregations of response categories.
Various aggregations of white responses show about the same (high) level of stability of individuals in the response category, whether including or excluding white Hispanic responses and/or multiple-race part-white responses. Different aggregations of black responses and (to a lesser extent) Asian responses also contain about the same proportion of stable race/Hispanic responses.
The four strategies for coding Hispanic responses give different levels of response stability. A coding strategy that divides Hispanic responses into groups based on the race response in 2000 would have a substantial proportion of different people in 2010. Because relatively few people add or drop the Hispanic designation but relatively many with Hispanic responses had a different race response from one census to the other (Fig. 3), a recode of all Hispanic responses into a single group (ignoring the race responses) would include most of the same individuals in 2000 as in 2010.
No such simple coding solution exists for increasing the consistency of individuals in the American Indian, Pacific Islander, or multiple-race groups. About 53 % of those reported as non-Hispanic single-race American Indian in 2000 were reported as the same race/Hispanic origin in 2010. Because a relatively high proportion of people in our data who were reported as American Indian in one census have an entirely non–American Indian response in the other, making the coding scheme very inclusive does not reduce population churning; in fact, stable responses decline to only 40 % of all responses. Broadly aggregating Pacific Islander responses increases the stability of the studied group (to 57 % stable responses) because people reported as Pacific Islander in one census are more often adding or dropping other race responses.
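How a coding scheme affects measured stability can be checked directly in linked data. The following is a minimal sketch, assuming stability is defined as the share of people in a coded group at the first time point who are in the same coded group at the second; that definition, the race labels "X" and "Y", the toy pairs, and the function names are all hypothetical and may differ from the construction of Fig. 6.

```python
def stable_share(pairs, in_group):
    """Share of people coded into the group at time 1 who are also in it at time 2."""
    starters = [(first, second) for first, second in pairs if in_group(first)]
    if not starters:
        return None  # no one was in the group at time 1
    stayed = sum(1 for _, second in starters if in_group(second))
    return stayed / len(starters)


# Each response is the set of races reported; toy values only.
X = frozenset({"X"})
Y = frozenset({"Y"})
XY = frozenset({"X", "Y"})

pairs = [
    (X, X), (X, XY), (X, XY), (X, Y),
    (XY, X), (XY, Y),
]


def narrow(response):
    """Narrow coding: race X reported alone."""
    return response == X


def broad(response):
    """Broad coding: race X reported alone or in combination."""
    return "X" in response


narrow_share = stable_share(pairs, narrow)
broad_share = stable_share(pairs, broad)
print(narrow_share, broad_share)
```

In this toy input, most changes add or drop a second race while retaining X, so the broad coding yields a higher stable share than the narrow coding, as in the Pacific Islander pattern described above. If most changers instead moved to a response without X at all, broadening the code would not help and could lower the stable share, because the broader group also takes in people with less stable multiple-race responses.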
Summary and Conclusions
We investigated person-level changes in race and/or Hispanic origin responses using remarkable data: information about 162 million people whose responses in Census 2000 were linked to their responses in the 2010 census. We were not the first to notice that people’s race and Hispanic-origin responses can and do change, but our data allowed us to expand substantially on prior knowledge by studying the modern era and including all federally defined race and Hispanic-origin groups throughout the nation, including those of all ages.
To what extent do individuals’ race and/or Hispanic-origin responses change? We found that 6.1 % of people in our data had a different race and/or Hispanic-origin response in 2010 than in 2000.11 Race and/or Hispanic-origin responses changed in a wide variety of ways, in patterns anticipated by previous research on adolescents, people living a century ago, and particular racial/ethnic subgroups, and by short-term census follow-up studies. Responses changed in some ways anticipated by substantial previous research (e.g., adding or taking away a race response) and other ways that have not been well studied (e.g., from one single race to another and high rates of change if reported as a double minority). Inflows to each race/Hispanic group were in most cases similar in size to the outflows; cross-sectional views of these data show a small net change. The most imbalanced response change flows may be uneven because of questionnaire design changes. Theoretical explanations for response change should take into account response churning (countervailing flows of response changes) as opposed to focusing on only one direction of response change.
Is change more common to/from some racial/ethnic groups than others? The extent of response change varies by racial/ethnic group. Those reported as single-race non-Hispanic white, black, or Asian showed response stability between 2000 and 2010, with only 3 % to 9 % of people in these groups changing responses. Hispanic/non-Hispanic responses (disregarding the race response) were also particularly stable (13 % and 1 % response change, respectively). However, we found extensive population churning among those reported as American Indian, Pacific Islander, or multiple races (response groups usually excluded from other studies). Most people reported as Hispanic in our data had a different race response in 2010 than 2000 (more than 50 % race response change in all Hispanic race groups).
Does the propensity to change responses vary by characteristics of the individual? Response changes happen throughout our society: among males and females, children and adults, in all regions, and across response modes. At the same time, response changes were relatively common among children, those living in the West in 2000, and/or those who used a nonmail response mode in one or both years. We found variation in these patterns across the 20 most common response changes. For example, children were overrepresented among those changing responses between non-Hispanic white-black and single-race black, whereas adults were overrepresented among those with combinations of non-Hispanic white and/or American Indian responses.
Do these changes affect research findings? Analysts who use race and/or Hispanic-origin data need to take into account the possibility of response changes, especially when working with data on smaller race groups and/or race responses among people reported as Hispanic. To assist, we showed response change rates for 120 subgroups defined by age, sex, race, and Hispanic origin. We also calculated the extent to which response change rates are sensitive to various aggregations of the 126 possible combinations of race and Hispanic-origin categories. Analysts’ coding decisions can notably increase or reduce response change rates in the Hispanic, American Indian, and Pacific Islander groups. If Hispanic were a response option in a future combined race/Hispanic census question (Compton et al. 2012), our results suggest that this change may reduce instability in race responses within the Hispanic group (by eliminating the request for a race response). The white, black, and Asian response groups show about the same levels of response change across age and sex groups and across different aggregation schemes.
Like all research, our work has limitations. Although we apply strict case selection rules, a small proportion of response changes are likely due to different individuals filling out the form, faulty data links, or post-enumeration processing issues. Some people may have provided erroneous information, either by mistake or on purpose. We also are limited in the conclusions we can draw from these data: we have only two data points, we do not know who filled out the form, and census responses are not equivalent to identities. Our data overrepresent people reported as non-Hispanic white (the most stable response group) and underrepresent other response groups.
For almost all race response groups, response instability is an important factor to consider in analyses. When deciding on the number and types of questions to ask about race and ethnicity, study designers need to recognize that these are complex concepts (Page 2015) that cannot be captured well by a single, simple, one-time question. Multiple questions (about different aspects of the concepts) and repeated measures are likely to be increasingly necessary as the United States becomes even more diverse in terms of racial/ethnic groups, immigrant generations, countries of immigration, and descendants of interracial unions.
When a survey has multiple questions about race and ethnicity (e.g., race, ethnicity, ancestry, tribe, parent’s birthplace, or skin tone; see Roth 2016), researchers can combine and compare answers, find previously hidden subpopulations, and apply the measure(s) most appropriate to the topic of study. For example, although their race responses often change, people reported as Hispanic white have a different socioeconomic and geographic profile than people reported as Hispanic SOR (Logan 2003).
Repeated measures—at each wave in panel data—allow practical and theoretical understanding of how the respondent’s current race/ethnicity, past race/ethnicity, and response stability are related to other factors, such as their health, residential location, or educational attainment. Response change means that race-/ethnicity-specific population sizes and characteristics can change for other reasons besides birth, death, migration, and individuals’ achievements (such as a completed degree). When using cross-sectional data, researchers should caution their readers that the given race/ethnicity response is effective for that point in time (and that measurement strategy), and may or may not be the same as in the past, in the future, or when assessed with different measures. Although not a current practice with race and ethnicity data, this same caveat holds for most other measured characteristics, such as education, location, and marital status.
At a conceptual level, our results highlight an oft-stated declaration: race and ethnicity are complex, multifaceted constructs. People are constantly experiencing and negotiating their racial and ethnic identities in interactions with people and institutions, and in personal, local, national, and historical context. Some racial and ethnic identities cannot be effectively translated to a census or survey questionnaire fixed-category format. Given the many forces urging instability in responses, the fact that we did find response stability (93.9 % of race/ethnicity responses did not change) is a testament to the power of social norms and racial ideology in directing these responses.
1. We use the terms “race,” “ethnicity,” and “Hispanic origin” in congruence with the federal statistical guidelines used to collect the data (Office of Management and Budget (OMB) 1997). Federally defined race categories are white, black or African American (“black” here), American Indian or Alaska Native (“American Indian” here), Asian, and Native Hawaiian or Other Pacific Islander (“Pacific Islander” here) (OMB 1997). The Census Bureau also uses a residual category called Some Other Race (“SOR” here). The two federally defined ethnicity categories are Hispanic and non-Hispanic. The ethnicity question is separate from the race question; see Fig. 1.
2. We study all response changes in the same way but acknowledge that each change has its own meaning and reasons. For example, adding or dropping a second race response could reflect a different identity phenomenon than switching responses from one single race to another.
3. Population churning, that is, countervailing flows into and out of a response category, is (at most) minimally discussed in these reports.
4. This comparison is limited by differences in response mode (mail vs. phone) and question format (multiple race responses invited vs. one response invited).
5. The West region includes Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, and Wyoming.
6. Enumerators are involved when the household does not return the mailed census form; when the address is in an area that consists of mostly seasonal homes; and in some extremely rural areas, such as western American Indian reservations, Alaska Native areas, and rural Maine (Fallica et al. 2012; Walker et al. 2012).
7. The 63 race response categories (six race groups alone and in each combination) are not labeled in Fig. 2 but are in the same order as Fig. 3 and Census 2000 Summary File 1 (see U.S. Census Bureau 2007:6–1 to 6–3).
8. Recall that our case selection criteria exclude people whose 2000 data list them as multiple-race including SOR (62 race/Hispanic response categories); those 62 empty rows are not shown.
9. Two other common patterns (ranked 9th and 11th) also show race response changes by people who were consistently identified as Hispanic.
10. The vertical line in each column of Fig. 4 marks the average among response changers, as shown in the second row above the table.
11. When group-specific response change rates are applied to the full Census 2000 population, the estimated rate of response change increases to 8.3 % (Liebler et al. 2014).