Will the rise of genetic ancestry tests (GATs) change how Americans respond to questions about race and ancestry on censuses and surveys? To provide an answer, we draw on a unique study of more than 100,000 U.S. adults that inquired about respondents' race, ancestry, and genealogical knowledge. We find that people in our sample who have taken a GAT, compared with those who have not, are more likely to self-identify as multiracial and are particularly likely to select three or more races. This difference in multiple-race reporting stems from three factors: (1) people who identify as multiracial are more likely to take GATs; (2) GAT takers are more likely to report multiple regions of ancestral origin; and (3) GAT takers more frequently translate reported ancestral diversity into multiracial self-identification. Our results imply that Americans will select three or more races at higher rates in future demographic data collection, with marked increases in multiple-race reporting among middle-aged adults. We also present experimental evidence that asking questions about ancestry before racial identification moderates some of these GAT-linked reporting differences. Demographers should consider how the meaning of U.S. race data may be changing as more Americans are exposed to information from GATs.
People would ask me . . . what is your nationality? And I would always answer “Hispanic.” But when I got my . . . [genetic ancestry test] results, it was a shocker: I'm everything! I'm from all nations. I . . . look at forms, now, and wonder, “what do I mark?”—Livie, from AncestryDNA (2016) advertisement
Genetic ancestry tests (GATs) offer consumers new types of information about their family ancestry. By 2019, Americans with access to genetic ancestry information included more than 26 million people who participated in direct-to-consumer genetic testing (Regalado 2019), as well as any biological relatives with whom they shared their results (Foeman et al. 2015; Rubanovich et al. 2021). The recent exponential growth of GAT sales (Keshavan 2016) has prompted scholars across disciplines to weigh the consequences, ranging from questioning the validity of test results (Bolnick et al. 2007; Jobling et al. 2016) to considering how the industry's growth is likely to affect Americans' conceptions of race (Phelan et al. 2014; Roth et al. 2020). Yet, demographers have been largely absent from these conversations despite the potential implications of widespread ancestry testing on racial and ethnic reporting in surveys and censuses.
Does taking a GAT change how American adults respond to race and ancestry questions on demographic questionnaires? To answer this question, we draw on unique data that include information about the self-reported race and ancestry of more than 100,000 U.S. adults who were registered as potential bone marrow donors with the National Marrow Donor Program (NMDP). The survey also asked how much respondents knew about their family ancestry and how they came by that knowledge. These features, along with question order randomization, allow us to contrast the race and ancestry responses of people who reported taking a GAT with those of people who had not taken a GAT at the time of the survey.
We find that GAT takers, compared with nontakers, were more likely to report multiple races in part because GAT takers in our sample were more likely to translate awareness of mixed ancestry into multiracial identification. The patterns are similar regardless of whether respondents were asked about taking a GAT before they reported their race and ancestry or after, suggesting an enduring difference in responses for GAT takers. Overall, our results suggest that many GAT takers treat genetic estimates of their geographic origins as the correct answer to survey questions about both their ancestry and racial identification. This implies a shift from racial identification being based on personal experience and family socialization toward being informed by a distant and more abstract conception of ancestry. We expect that our findings are harbingers of greater changes to come in both the conceptualization and reporting of race and ancestry as more Americans are exposed to genetic ancestry information.
Race and Genetic Ancestry Testing
GAT services are provided by two types of firms: genealogy companies, such as AncestryDNA, and health-focused genetic-testing companies, such as 23andMe. Nevertheless, the user experience is similar: customers purchase a kit, provide a saliva sample, and several weeks later receive a report of regions around the world from which their ancestors ostensibly originated. For autosomal admixture tests, the results typically include specific ancestry percentages, such as “30% Scandinavian” or “65% sub-Saharan African,” with increasing specificity of purported percentages down to the country level (e.g., “Swedish” or “Japanese”) as databases have expanded. These estimates of genetic ancestry are based on a process of probabilistic assignment informed by genetic markers that are differentially distributed across reference populations drawn partly from the companies' customer database (Jobling et al. 2016).
Research has questioned the validity of these techniques, including how GATs operationalize historical origins and their conflation with contemporary racial identities (Bolnick et al. 2007; Lee et al. 2009; Royal et al. 2010; Weiss and Long 2009). The production of GAT results is poorly understood by the general population. Even those who were taught how to interpret GATs struggle to articulate exactly what they measure (Bobkowski et al. 2020). Yet, GATs continue to be marketed as a way to access authoritative evidence about both personal identity and ethnic community membership (Putman and Cole 2020), including designations such as “Native American” (Walajahi et al. 2019). They also tend to be pursued because of this perceived authority; for example, one survey of GAT takers found people were motivated to take a genetic ancestry test to prove whether family stories were true (Roth and Lyon 2018).
Relatively little is known about who is more or less likely to participate in genetic ancestry testing. People who engage in any form of genealogy are more likely to be women, older, and more highly educated than the general U.S. population, and some research suggests that demographic patterns in genealogical interest extend to the subpopulation who specifically take GATs (Horowitz et al. 2019). Indeed, many people report they take GATs as part of a larger project of genealogical research, and many people who have taken a GAT report taking more than one (Roth and Lyon 2018).
Do GATs change conceptions of race? Existing research supports two somewhat contradictory expectations about the relationship between GATs and conceptions of race. On one hand, previous research suggests that any contact with genetic information, including through GATs, may contribute to biologically essentialist perspectives on race. Science studies scholars have found that contemporary genetic and biomedical research continues to conflate the concepts of race, ancestry, and geographic origins in ways that imply racial categories are byproducts of genetic diversity (Benjamin 2009; Fujimura and Rajagopalan 2011; Fullwiley 2011, 2014; Roberts 2011). This flexible and ambiguous definition of human “populations” may serve to consolidate the authority of genetic research rather than undermining it (Panofsky and Bliss 2017), allowing essentialized beliefs about racial difference to persist among academics, policy-makers and laypeople alike. In this context, scholars have warned that people who receive GAT results will be more likely to believe that race is genetically determined (Duster 2011, 2015; Nelson 2008), with increasing racial essentialism being most likely among people with less understanding of the science behind their results (Roth et al. 2020).
On the other hand, studies focused specifically on whether people incorporate GAT results into their racial identity suggest that evidence of genetic ancestry might not influence identification as much as a genetic determinism perspective expects. About one in five GAT takers in Roth and Lyon's (2018) survey reported changing how they identified by race after receiving GAT results. The respondents who maintained their pre-test racial identity did so for a number of reasons, including rejecting the accuracy of the genetic test. But the vast majority appeared to treat genetic ancestry as one among many options from which they could choose to build their racial and ethnic identities (Lawton and Foeman 2017; Panofsky and Donovan 2019; Roth and Ivemark 2018; Shim et al. 2018). In this sense, the influence of GATs could be relatively insignificant compared with the multitude of life experiences that influence a person's racial identity. However, the likelihood of changing racial identity based on GAT results also may vary by race; for example, people who identified as White or Asian before taking a GAT were more likely to report changing their racial identity than those who identified as Black or Hispanic before taking a GAT (Roth and Ivemark 2018). This pattern is consistent with historical norms of racial classification, such as the “one-drop rule” (Davis 2010), that permitted some racial categories to encompass more ancestral heterogeneity than others.
These bodies of scholarship address the influence of genetics on societal-level conceptions of race and on individual-level identity development. However, few studies offer direct evidence about whether GATs affect reporting on demographic surveys. In one exception, Foeman et al. (2015) compared pre- and post-GAT responses to a 2010 census-style race question for 43 college students and found that 13 (30%) changed their racial identification, all by adding a category compared with their pre-test responses. Roth and Ivemark (2018) also found that 14% of their sample reported both changing their racial identities after seeing GAT results and marking new race responses on the 2010 U.S. Census. Given the recent popularity of GATs, even a relatively small effect on responses to demographic surveys would be amplified in population-level statistics. Further, some GAT marketing specifically encourages the idea that test results can change how people respond to race questions on official forms (e.g., the testimonial from Livie quoted earlier). For large-scale data-collection efforts, such as the 2020 U.S. Census, demographers must seriously consider the potential influence of GATs on racial identification and their implications for the conceptualization of race and ancestry.
In this study, we ask whether GAT takers respond to race and ancestry questions on demographic questionnaires differently than people who have not taken a GAT. We also explore whether the differences we find between GAT takers and non-GAT takers reflect selection into GAT taking by other demographic characteristics as well as the association between GATs and other forms of genealogical research. Finally, we examine whether question order, such as responding to ancestry or race questions first, contributes to reporting differences between GAT takers and non-GAT takers.
Data and Methods
We draw on unique data that include information about the self-reported race and ancestry of more than 100,000 U.S. adults who were registered as potential bone marrow donors with the National Marrow Donor Program (NMDP). All registered NMDP donors with valid email addresses were invited to participate in a survey about race, ancestry, and genetics between May and July 2015. Of the nearly 2 million invitees, 20% opened the email, and 5% (n = 109,830) completed the survey. This response rate is normal for email-based surveys and does not indicate low data quality (see Fan and Yan 2010), but it does mean that generalization beyond this sample should be undertaken with caution. Our analyses are restricted to the 100,855 respondents (92% of the full sample) who completed all questions about race, ancestry, genealogical knowledge, and demographics.
The survey was designed to examine the relationship between measures of genetic ancestry and self-reported race and ancestry measures (for more details, see Horowitz et al. 2019). Here, we focus on responses to racial self-identification and self-reported ancestry.
Race and Ancestry Measures
Racial self-identification was collected using a combined question format recently tested by the U.S. Census Bureau in which the “Hispanic or Latino” response was offered alongside other federally recognized race categories. Respondents could select multiple responses, and 11.5% did so. This frequency of multiple-race reporting is considerably higher than the 2% to 3% typically found in nationally representative surveys that use a separate question approach to measure race and Hispanic origin, but it is in line with estimates of 10% to 13% found for various versions of the combined question in the 2015 National Content Test (Mathews et al. 2017). It also is possible that potential marrow donors who identify as multiracial were more likely to participate in a study aimed at improving outcomes for transplant matching (Bergstrom et al. 2012; Shay 2010). However, the frequency of multiple-race reporting varied by question order and previous exposure to GATs, as we discuss later in the article.
Respondents also were asked to report their ancestral origins. They were offered a list of geographic regions, such as Eastern Europe, Middle East, or Northern Africa, and were instructed to “select as many categories from the list below as needed to fully describe the origins of your family” (see Figure A1 in the online appendix for full question wording and response options). Overall, 55% of respondents selected more than one ancestry. Respondents also could report unknown ancestry. About one in six (17%) respondents selected “I do not know some, or all, of my family origins” alone or in conjunction with other ancestries. Our indicator of whether respondents selected multiple ancestries does not include these Unknown responses. People who identified as Black only or American Indian only, or who selected multiple races, listed Unknown ancestry more frequently than other respondents. GAT takers in our sample were less likely to list Unknown ancestry than non-GAT takers (12% vs. 17%, p < .001).
We create a count of what we call race-unique ancestries by mapping respondents' reported ancestries to the race response(s) that would be expected based on the Office of Management and Budget's geographic origin and ancestry definitions for U.S. official racial categories (Office of Management and Budget [OMB] 1997). In line with previous research (Gullickson 2016), we consider a respondent as reporting one race-unique ancestry if they selected one or more responses within a particular race-ancestry category (see Table A1 in the online appendix for ancestry-race correspondence based on OMB definitions). Note that Caribbean ancestry is not counted for the purposes of this measure because the OMB does not assign people from the Caribbean to a racial category. Similarly, Unknown ancestry cannot be matched to a specific racial category and therefore is not counted as a race-unique ancestry. That is, reporting Caribbean or Unknown does not add or subtract from a person's number of race-unique ancestries, and people who reported only Caribbean or Unknown ancestry are counted as having no race-unique ancestries. Overall, about 23% of the sample reported multiple race-unique ancestries, which is double the proportion that selected multiple responses for racial identification.
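The counting rule described above can be illustrated with a minimal sketch. The region-to-race mapping below is a hypothetical stand-in for the correspondence in Table A1 (only loosely modeled on the OMB definitions), and the function and variable names are assumptions introduced for illustration, not the authors' code.

```python
# Illustrative (hypothetical) ancestry-to-race correspondence; the actual
# mapping used in the article is given in its online appendix (Table A1).
ANCESTRY_TO_RACE = {
    "Eastern Europe": "White",
    "Scandinavia": "White",
    "Middle East": "White",
    "Northern Africa": "White",
    "Sub-Saharan Africa": "Black",
    "African American": "Black",
    "East Asia": "Asian",
}

# Responses that neither add to nor subtract from the count.
UNCOUNTED = {"Caribbean", "Unknown"}


def count_race_unique_ancestries(selected):
    """Count distinct race-ancestry categories among the selected ancestries.

    Several ancestries mapping to the same race category count once.
    """
    categories = set()
    for ancestry in selected:
        if ancestry in UNCOUNTED:
            continue  # Caribbean and Unknown are not assigned to a category
        categories.add(ANCESTRY_TO_RACE[ancestry])
    return len(categories)


def has_multiple_race_unique_ancestries(selected):
    """Indicator for reporting two or more race-unique ancestries."""
    return count_race_unique_ancestries(selected) >= 2
```

Under this sketch, a respondent who selects Eastern Europe and Scandinavia has one race-unique ancestry, whereas a respondent who selects only Caribbean or Unknown has none.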
The genealogy section of the survey asked respondents to report how much they knew about their ancestry on each side of their family and whether they had undertaken any specific efforts to research such information. Respondents could indicate that they engaged in any (or multiple) of the following knowledge-seeking activities: talking with family members, looking at family documents, using an ancestry website, sending away for official documents, searching records in a library or archive, or taking a GAT. Overall, 5% of our sample indicated that they had taken a GAT (n = 5,461).
Respondents also were offered an open-ended response allowing them to list any other research they had done. Some volunteered that although they had not taken a GAT, a close family member (e.g., sibling or parent) had (n = 264). We include these people in our indicator of GAT taking because they had access to genetic ancestry information and because how they were counted does not affect our conclusions. However, as we discuss later, this distinction likely is relevant for future research. After accounting for open-ended responses, we find that 6% of the sample reported no previous genealogical research (n = 5,957).
We use these knowledge-seeking questions to compare people who report similar amounts of genealogical research. There are multiple ways to account for the amount of research, based on reported activities beyond taking a GAT, and our results are consistent across several coding schemes. For descriptive illustration, we use a three-category variable that ranks research activities based on the implied level of engagement. For example, talking to a family member about family history is common and typically involves low effort, whereas searching records in a library takes more explicit motivation and planning. People who reported doing nothing or only talking with family are coded as doing “very little research”; people who reported using a genealogical website or studying family documents are coded as doing “some research” (regardless of whether they reported talking to family); and people who reported going to a library or requesting official documents are coded as doing “a lot of research” (regardless of any other reported research options). Table 1 compares the research distribution of GAT and non-GAT takers. In line with previous studies suggesting that people take GATs to confirm their genealogical knowledge, GAT takers in our sample are underrepresented in “very little” and “some research” compared with non-GAT takers, and overrepresented in “a lot of research.”
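The three-category coding can be expressed as a short rule applied in order of decreasing effort. The following is a sketch only: the activity labels are hypothetical names for the survey's response options, and GAT taking is set aside because the variable is based on activities beyond taking a GAT.

```python
# Hypothetical labels for the survey's knowledge-seeking activities.
HIGH_EFFORT = {"library_or_archive", "official_documents"}
MODERATE_EFFORT = {"ancestry_website", "family_documents"}


def research_level(activities):
    """Rank a respondent's genealogical research by implied engagement.

    The highest-effort activity reported determines the category,
    regardless of any lower-effort activities also reported.
    """
    reported = set(activities) - {"gat"}  # GAT taking is analyzed separately
    if reported & HIGH_EFFORT:
        return "a lot of research"
    if reported & MODERATE_EFFORT:
        return "some research"
    # Nothing reported, or only talking with family members.
    return "very little research"
```

For example, a respondent who talked with family and searched library records is coded as doing "a lot of research," whereas one who only talked with family is coded as doing "very little research."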
Other Demographic Characteristics
Our data also include self-reported nativity, educational attainment, region of residence, and age group (18–24, 25–34, 35–44, 45–54, and 55–64). Respondents' sex (female/male) comes from NMDP enrollment forms. Like the NMDP registry, our sample includes only ages 18–64 because people over age 65 are not eligible to be bone marrow donors. Respondents aged 25–44 are somewhat overrepresented in our sample, which is relatively common in online surveys (Börkan 2010); female respondents and people with advanced degrees are also somewhat overrepresented. However, the regional distribution of respondents is very similar to contemporary estimates of the population aged 18–64 from the American Community Survey (see Table A2 in the online appendix for a comparison between our sample and the 2015 American Community Survey sample). Controlling for these characteristics in multiple regression models allows us to hold constant some factors related to selection into GAT taking as well as some of the ways our respondents differ from nationally representative samples.
Several aspects of the survey varied between respondents: (1) whether they received a (randomly assigned) long or short recruitment email; (2) whether they responded to the first recruitment message or only after one or more reminders; and (3) the survey question order. We control for recruitment differences in all models and leverage the randomized ordering in our analyses.
Assignment to survey conditions was randomized across two dimensions related to race and ancestry reporting and genealogical knowledge. This experimental design offers stronger evidence of the influence of GAT taking on patterns of reporting compared with an otherwise similar cross-sectional survey.
The first randomization dimension varied whether participants were primed to think about genealogical research before reporting race and ancestry. In the knowledge prime condition, respondents were asked about their genealogical research, including whether they had taken a GAT, before they answered questions about race or ancestry. In the unprimed condition, respondents answered race and ancestry questions before they answered genealogical research questions.
Differences in reporting between the knowledge prime and unprimed conditions speak to whether GATs shape race and ancestry responses. Reporting differences between GAT takers and non-GAT takers only in the primed condition would suggest that GAT takers respond to race and ancestry questions differently when reminded of GATs. However, if reporting differences in the unprimed condition are equal to or greater than reporting differences in the primed condition, there could be an independent effect of GAT taking on race and ancestry responses (because the GAT takers and nontakers would have responded differently even before being asked about GATs).
The second randomization dimension varied whether respondents saw race questions before ancestry questions (the race before ancestry condition) or whether they saw ancestry questions before race questions (the ancestry before race condition). In part, this randomization averages priming effects of ancestry responses on race responses, and vice versa, when we consider the full sample. As we discuss later, these between-condition reporting differences also offer insight into survey-design best practices in the age of GATs.
We generally present descriptive frequencies to demonstrate differences between GAT takers and non-GAT takers for ease of interpretation. However, given other factors likely relate to both GAT taking and race/ancestry reporting, we estimate logistic regressions with controls for sex, age, education, region, and genealogical research behaviors, along with survey condition and recruitment method. All logistic regression results are similar in direction, magnitude, and statistical significance to reported descriptive differences.
We also extend our logistic regressions with propensity score matching and conclude that neither observed nor unobserved selectivity is likely to be driving our results (see the online appendix for details). Such pseudo-causal inference methods can help to rule out the possibility of selectivity or spurious correlation (i.e., a third variable causing both treatment and outcome). However, these models and sensitivity analyses cannot differentiate the direction of the causal arrow—that is, whether GAT taking causes changes in racial identification or whether differences in racial identification motivate some people to take GATs more than others. Our propensity score analysis is therefore best understood as a robustness check rather than providing clear causal evidence. We return to this point in the Discussion.
Given our large sample size, we are wary of overstating a finding's substantive significance by relying solely on statistical significance tests. Throughout, we focus on frequencies that are both statistically significant and of meaningful magnitude. Our sample size is helpful in that it affords us the statistical power to assess patterns for subpopulations, such as people who report three or more races (n = 1,208), which are often overlooked in nationally representative survey research. This feature, along with the built-in survey experiment, allows for a unique analysis of differential patterns of race and ancestry reporting.
We first present race reporting differences between GAT takers and non-GAT takers overall and in key experimental conditions of interest. Next, we examine ancestry reporting differences between GAT takers and non-GAT takers, as well as ancestry-race correspondences, to explore whether GAT takers exhibit distinct patterns in how they report information across the race and ancestry questions. These combined reporting patterns are especially important to consider for demographic surveys that ask questions about both race and ancestry, such as the annual American Community Survey.
Differences in Race Reporting
We find that GAT takers are significantly more likely to self-identify with multiple races than are non-GAT takers, including in survey conditions that most closely correspond to existing national data collection. Most surveys ask race questions before ancestry questions (if they ask the latter at all) and do not ask about genealogical research; therefore, the experimental condition in which respondents saw race before ancestry and were unprimed by genealogy questions most closely replicates traditional questionnaire designs.
Figure 1 shows the difference in rates of selecting various race responses in the race before ancestry and unprimed condition between respondents who had taken a GAT and those who had not. The figure depicts several notable differences in racial identification: in this condition, compared with non-GAT takers, GAT takers were significantly less likely to identify only as Hispanic/Latino (−2.4 percentage point difference; p < .001) or Asian (−1.6 percentage point difference; p < .01) and were significantly more likely to select three or more races (3.4 percentage point difference; p < .001). These descriptive patterns reflect both who is most likely to take GATs and whether GAT takers respond differently after receiving their results. For example, GAT takers' lower rates of identifying as Asian alone are consistent with higher rates of expressed disinterest in genetic ancestry testing among self-identified Asian Americans (see Horowitz et al. 2019). We present these descriptive results to provide a sense of the overall magnitude of race-reporting differences between GAT takers and non-GAT takers on standard demographic questionnaires.
Differences in rates of reporting multiple races extend beyond the race before ancestry and unprimed condition. Across our entire sample, the likelihood of selecting multiple races for self-identification differed significantly by whether the respondent had taken a genetic ancestry test. About 1 in 7 GAT takers selected multiple races, compared with 1 in 10 non-GAT takers (14% vs. 11%; p < .001).
The comparatively high rate of multiple-race reporting among GAT takers is robust to modeling. Table 2 shows results from a logistic regression predicting multiple-race reporting by whether a respondent had taken a GAT (Model 1), several demographic and survey administration variables (included in Models 2, 3, and 4), and interactions between taking a GAT and whether the respondent was assigned to the ancestry before race versus race before ancestry or knowledge prime versus unprimed experimental conditions (added in Models 3 and 4, respectively). Regardless of which set of covariates we include, the estimated association between GAT taking and reporting multiple races remains similar in size and significance. The logit estimates indicate that GAT takers have at least 30% greater odds of reporting multiple races than non-GAT takers. This finding is consistent with the propensity score matching results, which help account for selection into GAT taking: the average treatment effect for taking a GAT on multiracial reporting is estimated at a 2.7 percentage point (or 25%) difference (see Table A4, online appendix).
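As a rough check on how these magnitudes relate to one another, the unadjusted odds ratio can be computed from the rounded full-sample percentages reported earlier (14% vs. 11%). This is a back-of-the-envelope sketch, not a reproduction of the Table 2 models; the function names are introduced for illustration, and the inputs are rounded figures, so the result is approximate.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_comparison):
    """Unadjusted odds ratio for two reported proportions."""
    return odds(p_group) / odds(p_comparison)


# Rounded shares reporting multiple races: GAT takers vs. non-GAT takers.
unadjusted_or = odds_ratio(0.14, 0.11)

# A 2.7 percentage point difference described as a 25% difference implies
# a comparison rate of about 2.7 / 0.25 percentage points.
implied_baseline_pp = 2.7 / 0.25

print(f"Unadjusted odds ratio: {unadjusted_or:.2f}")
print(f"Implied comparison rate: {implied_baseline_pp:.1f}%")
```

The unadjusted ratio of roughly 1.3 is in line with the statement that GAT takers have at least 30% greater odds of reporting multiple races.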
We also consider results for the knowledge prime conditions to determine whether rates of reporting multiple races are higher for respondents who were first asked about their genealogical research. Respondents in knowledge prime conditions were, on average, slightly more likely to select multiple races than respondents in unprimed conditions. However, this association is not significantly different for GAT takers and non-GAT takers in our sample: the difference in multiple-race reporting among GAT takers by knowledge condition is relatively small (15% vs. 13.5%, p = .079) and similar in magnitude to the between-condition difference for non-GAT takers (11.9% vs. 10.8%; see also the nonsignificant interaction term in Table 2, Model 4). This suggests that GAT takers' race responses are not more sensitive to priming about their genealogical knowledge than are the responses of non-GAT takers. When combined with our propensity score approach to accounting for selectivity into GAT taking, these results are consistent with the observed reporting differences reflecting actual changes in how respondents answer race questions after taking a GAT.
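The comparison of priming gaps can be laid out as simple arithmetic on the percentages just reported. This sketch is descriptive only; the formal test in the article is the interaction term in Table 2, Model 4, and the variable names here are assumptions.

```python
def priming_gap(primed_pct, unprimed_pct):
    """Percentage point gap in multiple-race reporting, primed minus unprimed."""
    return round(primed_pct - unprimed_pct, 1)


# Reported multiple-race rates by knowledge condition (percentages).
gat_gap = priming_gap(15.0, 13.5)
non_gat_gap = priming_gap(11.9, 10.8)

# If GAT takers were more sensitive to priming, this would be large.
difference_in_gaps = round(gat_gap - non_gat_gap, 1)

print(f"GAT takers: {gat_gap} points; non-GAT takers: {non_gat_gap} points")
print(f"Difference in gaps: {difference_in_gaps} points")
```

The two gaps differ by well under one percentage point, which matches the description of the between-condition differences as similar in magnitude.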
Differences in Ancestry Reporting
We find that GAT takers not only selected more ancestry responses than non-GAT takers overall but also were more likely to translate awareness of multiple ancestries into multiracial self-identification. This difference is especially prominent among respondents who had done the least additional genealogical research. After also considering specific race-ancestry combinations being reported in our sample, we interpret these patterns to suggest that GAT takers rely on their test results to answer survey questions about both their ancestry and race.
From Ancestry Awareness to Race Reporting
GAT takers in our sample were significantly more likely than non-GAT takers to select more than one geographic region to describe their family origins: about two-thirds of GAT takers reported multiple ancestries, compared with just over one-half of non-GAT takers (66% vs. 54%; p < .001). GAT takers also were more likely to list multiple race-unique ancestries, although the difference is relatively small (25% vs. 23%; p < .01). Figure 2 shows the GAT–no GAT differences in reporting multiple ancestries, multiple race-unique ancestries, and multiple races.
Notably, the GAT–no GAT difference in rates of reporting multiple race-unique ancestries is much smaller than the difference in rates of reporting multiple races. Figure 3 provides an explanation: among people who reported multiple race-unique ancestries, GAT takers were significantly more likely than non-GAT takers to also report multiple races (45% vs. 37%; p < .001). These results indicate that GAT takers not only are more likely to report mixed ancestry but also translate that awareness into multiracial identification at a higher rate than non-GAT takers.
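To see how the two steps combine, the share of each group reporting both multiple race-unique ancestries and multiple races can be approximated by multiplying the rounded percentages above. This is an illustrative calculation under the assumption that the rounded figures are close enough to the underlying rates; it is not a result reported in the article.

```python
def joint_share(p_multi_ancestry, p_multi_race_given_ancestry):
    """Share reporting multiple race-unique ancestries AND multiple races."""
    return p_multi_ancestry * p_multi_race_given_ancestry


# Rounded inputs: P(multiple race-unique ancestries) and
# P(multiple races | multiple race-unique ancestries).
gat_joint = joint_share(0.25, 0.45)
non_gat_joint = joint_share(0.23, 0.37)

print(f"GAT takers: {gat_joint:.1%}; non-GAT takers: {non_gat_joint:.1%}")
```

The small gap in ancestry reporting (25% vs. 23%) widens once the higher translation rate (45% vs. 37%) is applied, giving approximately 11% versus 9%.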
GATs and Other Genealogical Research
To what extent are these race and ancestry reporting patterns associated with GATs specifically, compared with engaging in genealogy more broadly? Logistic regressions described earlier indicate that the difference between GAT takers and non-GAT takers remains when we control for each type of research activity. Further descriptive analyses indicate that GAT–no GAT reporting differences are largest among people who had otherwise done the least research.
Among people who had done very little or some genealogical research, GAT takers were significantly more likely to report multiple ancestries, multiple race-unique ancestries, and multiple races (see Figure 4, panels a and b). However, among people who reported putting the most effort into genealogy, we find that GAT takers were more likely to report multiple ancestries, less likely to report multiple race-unique ancestries, and equally likely to select multiple races compared with non-GAT takers (Figure 4, panel c). Two distinct pathways could produce these reporting patterns, separately or in combination: (1) people who believe they have multiracial ancestry are more likely to engage in considerable research to substantiate that belief, and (2) people who take GATs as an introduction (or shortcut) to genealogical research are more likely to select multiple races than otherwise similar people who have not taken a GAT. Patterns in both genealogical research and multiple-race reporting also vary by age, which has important implications for where to expect “growth” in the multiracial population, as we discuss later.
Adding and Dropping Ancestries
Although we cannot establish causality, evidence from how respondents combined their race and ancestry reports suggests that GAT takers treat their test results as the correct answer to questions about both ancestry and racial identification, and thus might be making different conceptual links between ancestry and race than non-GAT takers.
We might expect GAT takers to draw on their results when responding to demographic questions about ancestry. Indeed, we find that certain ancestries—those often highlighted in genetic test results—were more frequently reported by GAT takers than by non-GAT takers. For example, all respondents who had taken GATs were significantly more likely to report sub-Saharan African and Scandinavian ancestry (see Table 3). These overall patterns are similar regardless of how the respondents reported their race and are generally consistent regardless of how much research respondents reported (not shown).
How respondents combined race and ancestry responses helps illustrate whether people who take GATs treat these two concepts as more closely linked. For example, tracing ancestry to the original peoples of sub-Saharan Africa is the official definition of the “Black or African American” racial category, but many descendants of former slaves know little about their ancestors' pre-slavery geographic origins (Nelson 2008). To acknowledge this, we offered “sub-Saharan Africa” and “African American” among our ancestry responses. Nearly all respondents who selected African American ancestry were U.S.-born (97%). In contrast, the sub-Saharan Africa response resonated most with two types of respondents: (1) foreign-born people who identified as Black, and (2) GAT takers (see Figure A3, online appendix). The sub-Saharan ancestry reporting difference between GAT takers and non-GAT takers is particularly striking among Black-identified respondents: 56% reported sub-Saharan African ancestry if they had taken a GAT, compared with 13% among non-GAT takers (see Table 3, panel A). This pattern is consistent with GAT takers being exposed to new ways to describe their ancestry and incorporating that information when responding to demographic surveys.
Similarly, among respondents who identified as Hispanic, GAT takers were significantly more likely to report Southern European (i.e., Spanish) and/or American Indian ancestry (see Table 3, panels C and D). This pattern was accompanied by a somewhat lower rate of listing Central or South American ancestry (67% vs. 71%, p = .057). These race-ancestry reporting combinations for self-identified Hispanic Americans, like those for self-identified Black Americans, seem to reflect ancestry understandings at different time scales (i.e., recent relatives vs. distant lineage), with GATs making distant ancestry salient (see Zerubavel 2012).
That said, we find evidence that GAT takers may have both added ancestries (or races) they had not reported previously and dropped some responses. For example, among respondents who identified as White, GAT takers were significantly less likely to report American Indian ancestry than non-GAT takers (14% vs. 16%); this is the only such drop in ancestry reporting for GAT takers across all race and ancestry responses in Table 3. This pattern runs counter to recent increases in American Indian ancestry and race reporting among White Americans (Liebler et al. 2016; Nagel 1995); it also suggests that although some may seek GATs to support otherwise tenuous claims to American Indian identity (Roth and Ivemark 2018), others may stop reporting such claims when their GAT results do not support that conclusion.
Whether respondents added Scandinavian, switched to sub-Saharan African or Southern European, or dropped American Indian, the ancestry-reporting patterns in our sample suggest that GAT takers see their test results as providing answers to demographic questions about ancestry. Further, when combined with our finding that GAT takers translated awareness of multiple race-unique ancestries into selecting multiple races at a higher rate than non-GAT takers, these reporting patterns suggest many GAT takers see their test results as providing information about both ancestry and racial identification.
GAT takers in our sample were significantly more likely to report multiple ancestries and multiple races than people who had not taken a GAT. These differences are not solely explained by GAT takers reporting diverse ancestry: we find that GAT takers were more likely to report multiple races even when we compare among people reporting multiple race-unique ancestries. We also show that GAT takers reported specific combinations of ancestry and races at different rates than non-GAT takers, suggesting that they may both add and drop categories as they translate their test results into answers on demographic questionnaires.
We now turn our discussion to a subset of respondents who identified as multiracial: those who selected three or more races. We focus on this reporting pattern because we expect it will produce the most obvious GAT-related shift in race responses and because we find it is the most sensitive to question order. We also discuss broader implications of our findings for how the multiracial population is conceptualized and measured and for how future research should account for causality and context in GAT taking and race reporting.
The Sensitivity of Reporting Three or More Races
Our outcome of interest thus far has been whether respondents selected two or more races because this reflects standard reporting practice for the U.S. multiracial population (see Jones and Bullock 2012). However, the difference in multiple-race reporting between GAT takers and non-GAT takers in our sample is even more pronounced among people who selected three or more races: 3.1% of GAT takers selected three or more races, compared with 1.5% of non-GAT takers (difference p < .001). Similarly, in logistic regression, GAT takers have about twice the odds of selecting three or more races as non-GAT takers, all other measured factors equal (see Table 4). Although selecting three or more races is rare in absolute percentage point terms, the relative difference suggests that this is an important subpopulation to watch as access to genetic ancestry information grows.
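To see how the descriptive gap relates to the regression estimate, the two unadjusted percentages reported above can be expressed as a percentage-point difference, a ratio of proportions, and an odds ratio. The sketch below is illustrative only: the function name and dictionary keys are our own, the inputs are the rounded percentages from the text, and the calculation does not reproduce the covariate-adjusted model in Table 4.

```python
def compare_proportions(p_group: float, p_reference: float) -> dict:
    """Summarize the gap between two proportions on three scales."""
    odds_group = p_group / (1 - p_group)
    odds_reference = p_reference / (1 - p_reference)
    return {
        # Absolute gap, expressed in percentage points
        "pp_difference": (p_group - p_reference) * 100,
        # Relative gap as a ratio of proportions
        "ratio": p_group / p_reference,
        # Unadjusted odds ratio (no covariates)
        "odds_ratio": odds_group / odds_reference,
    }


# Rounded shares selecting three or more races, as reported in the text
summary = compare_proportions(p_group=0.031, p_reference=0.015)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, the absolute gap is 1.6 percentage points, while both the ratio of proportions and the unadjusted odds ratio are close to 2. For rare outcomes the two relative measures nearly coincide, which is why a small absolute gap can correspond to roughly doubled odds.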
Examining the selection of three or more races by survey condition suggests that question order moderates the effect of GATs on racial identification. Overall, 1.9% of respondents in the race before ancestry conditions selected three or more races, compared with 1.3% of respondents in ancestry before race conditions. However, this difference is much greater among GAT takers: 4.7% of GAT takers selected three or more races when answering race first, compared with just 2% of GAT takers when answering ancestry first (see Figure 5, top panel). A significant interaction between GAT taking and question order also remains in the presence of regression controls (see Table 4). Descriptive differences are similar for reporting two or more races (see Figure 5, bottom panel), but our logistic regression predicting the reporting of two or more races does not indicate a statistically significant interaction between GAT taking and survey condition (see Table 2). These patterns suggest that the biggest GAT-related reporting difference we find—reporting three or more races—also is the most sensitive to survey design.
Notably, race reporting appears to be more sensitive to question order than ancestry reporting (see Figure A4, online appendix). For GAT takers and non-GAT takers alike, we find that differences in rates of reporting two or more races between the race before ancestry and ancestry before race conditions (4.4 and 2.9 percentage point differences for GAT and non-GAT takers, respectively) are larger than between-condition differences in rates of reporting multiple ancestries (1.4 and 0.6 percentage points, respectively) or multiple race-unique ancestries (1.6 and 0.7 percentage points, respectively). The relatively similar rates of multiple ancestry and multiple race-unique ancestry reporting by condition suggest that respondents did not simply provide more information in response to whichever question they saw first (that is, more ancestry responses when an ancestry question came first and more race responses when a race question came first). Rather, question order mainly affected race reporting.
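The comparison above can be made concrete by dividing the between-condition gap in race reporting by the corresponding gap in each ancestry measure. This is a minimal sketch using only the percentage-point differences quoted in the text; the dictionary labels and the helper function are hypothetical names chosen for illustration.

```python
# Between-condition gaps in percentage points, as reported in the text.
# Outcome and group labels are illustrative, not variable names from the study.
gaps = {
    "two_or_more_races": {"gat": 4.4, "non_gat": 2.9},
    "multiple_ancestries": {"gat": 1.4, "non_gat": 0.6},
    "multiple_race_unique_ancestries": {"gat": 1.6, "non_gat": 0.7},
}


def order_sensitivity_ratio(baseline: str, group: str) -> float:
    """Race-reporting gap divided by the gap for a baseline ancestry outcome."""
    return gaps["two_or_more_races"][group] / gaps[baseline][group]


for group in ("gat", "non_gat"):
    for baseline in ("multiple_ancestries", "multiple_race_unique_ancestries"):
        ratio = order_sensitivity_ratio(baseline, group)
        print(f"{group}, races vs. {baseline}: {ratio:.2f}x")
```

In every pairing the race-reporting gap is more than 2.5 times the corresponding ancestry gap, which is consistent with the reading that question order matters far more for race responses than for ancestry responses.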
Taken together, these results imply that asking about racial identification before ancestry (or asking for only racial identification, as in most surveys) would yield the highest rates of multiracial reporting by GAT takers. Rates of reporting three or more races might be especially high with the standard question order. Conversely, asking about ancestry first would yield lower rates of multiracial identification overall, and most markedly among GAT takers.
The Multiracial Measurement Gap
Our results suggest that higher rates of multiple-race reporting by GAT takers stem from their increased likelihood of identifying their race in line with more distant ancestries. However, this pattern runs counter to past experiences with multiracial identification in the United States. When the “mark one or more” instruction was added to official racial identification questions, there were initial concerns that the political power of minority populations would be diluted because so many African Americans, in particular, could claim mixed ancestry (Williams 2006). However, when the Census 2000 returns showed that less than 3% of Americans reported multiple races, such fears receded. Demographers know this multiracial measurement gap well: not all Americans who are aware of multiracial ancestry select multiple races on surveys (Goldstein and Morning 2000).
Research finds that American adults generally report multiple races for self-identification at significantly higher rates when known “mixing” occurred recently in their family tree, such as with parents or grandparents, rather than great-grandparents or earlier ancestors (Morning and Saperstein 2018). Yet, our evidence suggests the multiracial measurement gap is narrower among GAT takers for the opposite reason. They are not only more likely than non-GAT takers to report ancestry distant from their personal experience (as when U.S.-born GAT takers who identify as Black report sub-Saharan African ancestry) but also more likely to incorporate awareness of mixed-race ancestry into their racial identification.
Previous research using a national probability sample also found that compared with older cohorts, younger adults report multiracial ancestry at higher rates and are more likely to identify with multiple races, conditional on that awareness (Johfre and Saperstein 2019). Our sample echoes this pattern, with younger respondents reporting multiple ancestries, multiple race-unique ancestries, and multiple races at higher rates than older respondents. However, our results point to striking GAT–no GAT differences among older respondents (see Figure 6). Although younger GAT takers have the highest rates of multiple reporting for both race and ancestry overall, among people aware of multiple race-unique ancestries, the largest GAT-related difference in multiple-race reporting in our sample is among 55- to 64-year-olds (14.5 percentage points). These age patterns remain consistent among people who engaged in similar amounts of genealogical research (see Table A7, online appendix).
These age patterns suggest that the expected increases in multiple-race reporting among younger Americans that follow from ever-greater interracial coupling (Alba et al. 2018) will be accompanied by increased rates of multiple-race reporting among middle-aged Americans, who are the most likely to have taken GATs (Horowitz et al. 2019). Thus, we expect that the influence of GATs will be most evident in demographic data collection in two ways: (1) an overall increase in selecting three or more races; and (2) increases in multiple-race reporting among middle-aged or older adults, who appear particularly likely to translate awareness of mixed ancestry into multiracial reporting in the context of GATs.
Causality and Context
A causal interpretation of our results, whereby people who take GATs are more likely to self-identify as multiracial, is consistent with both our question order experiment and propensity score analysis. However, our sample is not representative of the U.S. adult population, and we did not survey the same individuals before and after they received their GAT results. Thus, generalization about the causal effects of GAT taking on race and ancestry reporting based solely on our evidence remains speculative.
Part of the reporting difference likely is explained by the greater likelihood of taking GATs among people with mixed ancestry or a sense of “ancestral uncertainty” (Horowitz et al. 2019). This causal direction is opposite the earlier proposed interpretation, but we see the two as complementary rather than contradictory. We expect that GAT taking and multiracial reporting are mutually constituted: one's racial identity (or assumptions about one's ancestry) prompts interest in genetic testing, and the results received can change how one subsequently reports ancestry and racial identification on surveys.
One way to account for selection into GAT taking would be to randomly assign a representative sample of Americans to receive GAT results. This randomized controlled trial–type design, comparing pre- and post-GAT race and ancestry responses for both a treatment group (that receives GATs) and a control group (that does not), could provide causal evidence about whether GATs affect reporting. However, it would not shed light on reciprocal causation (what leads people to take GATs) or address whether people who purchase a GAT respond differently than people offered free tests or who receive GAT information by seeing results from biological relatives. Thus, such studies would sacrifice some external validity in exchange for clearer unidirectional causality.
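One simple way to summarize such a design is a difference-in-differences comparison: the pre-to-post change in multiple-race reporting in the treatment group minus the corresponding change in the control group. The sketch below is a hypothetical illustration, not an analysis from this study. The class name, function name, and all four rates are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class GroupRates:
    """Share of a group reporting multiple races before and after the study period."""
    pre: float
    post: float


def difference_in_differences(treatment: GroupRates, control: GroupRates) -> float:
    """Change in the treatment group minus change in the control group."""
    return (treatment.post - treatment.pre) - (control.post - control.pre)


# HYPOTHETICAL rates for illustration only; these are not estimates from any data.
treatment = GroupRates(pre=0.10, post=0.14)  # randomly assigned to receive GAT results
control = GroupRates(pre=0.10, post=0.11)    # no GAT results provided

effect = difference_in_differences(treatment, control)
print(f"Estimated effect: {effect * 100:.1f} percentage points")
```

Subtracting the control group's change removes shifts in reporting that would have occurred anyway, such as general response churn, so the remainder can be attributed to receiving GAT results under random assignment.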
Indeed, we expect that the context in which people receive GAT results affects their likelihood of changing racial identification. As noted earlier, previous research found that people who identified as White before seeking a GAT were more likely to report a subsequent change in racial identification (Roth and Ivemark 2018). A similar interview study found that Black and Latina women perceived GATs as “just information” without changing how they identified, even when their results indicated mixed ancestry (Shim et al. 2018). Importantly, the women Shim and colleagues interviewed did not purchase a GAT; they were offered ancestry results in return for participating in a long-standing health cohort study. Future research that allows for a range of interest in receiving genetic information could help better disentangle selection and context: perhaps people who do not actively pursue GATs also are less likely to incorporate the results into their racial identity regardless of how they previously identified.
We offer additional evidence along these lines. In our sample, the 264 respondents who volunteered that they had seen a relative's GAT results have reporting patterns that do not fully align with either GAT takers or non-GAT takers. These respondents reported multiple ancestries at rates similar to those of non-GAT takers, and their rates of reporting at least two races fell roughly in between those of GAT takers and non-GAT takers, but they reported multiple race-unique ancestries and selected three or more races at lower rates than either group (see Table A8 in the online appendix). Although speculative, these descriptive results suggest future research should account for the context in which GAT information is received. For example, existing longitudinal studies could incorporate repeated measures of race along with a module about exposure to GAT results (and beliefs about genetics more broadly; see, e.g., Phelan et al. 2014; Roth et al. 2020).
We also expect the reporting context to matter. Previous interview studies asked open-ended questions about whether people's conception of their race changed after receiving GAT results. Our survey, in contrast, asked closed-ended demographic questions as part of a health-related study. The same people who downplay, or even disavow (Panofsky and Donovan 2019), their GAT results for identity purposes might nevertheless provide GAT-related information in response to race questions when they perceive a benefit or when they believe the context calls for “scientific” responses. Future research on whether people incorporate genetic ancestry information into race reporting on job applications, general social surveys, or vital records will be crucial to understanding the scope and implications of GAT-linked reporting changes in the United States.
We provide important evidence that people who have taken a GAT are significantly more likely to select multiple races for self-identification on a survey compared with people who have not taken a GAT, even among people who report the same number of race-unique ancestries. The difference in multiracial reporting for GAT takers reflects a greater likelihood of selecting at least two races and is particularly pronounced for rates of selecting three or more races. We also demonstrate that asking about ancestry before race helps moderate the differential responses among GAT takers in our sample. Future research that accounts for the causal and contextual factors discussed earlier will further clarify when and for whom GATs are most likely to prompt changes in survey reporting.
What can demographers do in the meantime? Most importantly, be aware that GATs may be changing how Americans report their race and ancestry. However, differences in population counts are only one possible result; there also could be GAT-related changes in associations between racial identification and other outcomes. Such shifts will be particularly consequential for studies of inequality if, for example, predominantly older, better-off Americans who previously identified only as White (and are overrepresented among GAT takers) augment their racial identification to include one or more minority categories. Thus, in addition to accounting for general response “churn” (Liebler et al. 2017), researchers conducting trend analyses should consider whether GAT-linked changes in racial identification could contribute to shifts in other observed average characteristics. Continued net mobility out of the White category (Perez and Hirschman 2009) also has substantive implications for ongoing debates about projecting a majority-minority society (see, e.g., Mora and Rodríguez-Muñiz 2017).
The contemporary justification for collecting data on race in federal statistics is to monitor racial discrimination and enforce antidiscrimination efforts in everything from housing to political representation (Wallman et al. 2000). Thus, a shift to distant ancestry being reported as part of one's racial identity would necessitate greater caution when self-identification is used to study disparate treatment. Unless racial discrimination is activated by one's genetic lineage (which may not be visually obvious), having GAT results reported as race responses would further diminish the data's utility for studies of discrimination, exacerbating concerns about treating self-identification and racialized appearance as interchangeable (Penner and Saperstein 2015; Telles and Lim 1998). Should this development be deemed undesirable, future research could consider how best to collect data that are more directly relevant to the stated purpose of monitoring contemporary racial discrimination.
If public response to Senator Elizabeth Warren's recent genetic results and past American Indian race reporting are any guide (Kaplan 2019; Khalid 2018), the patterns of multiracial identification we find for GAT takers will generate considerable controversy should they prove representative. Some may celebrate increased multiple-race reporting based on knowledge of distant ancestry as a breakdown of rigid norms of racial classification, while others may chafe at people claiming racial identification that does not reflect their current experiences with prejudice or unequal treatment. Either way, the change would offer further evidence on the social construction of race, an ongoing process that demographic questionnaires both reflect and help to reproduce.
We live in an age of genetic ancestry testing. As more Americans seek GATs, they may contribute to shifting conceptions of race, ancestry, and the link between them. These new understandings will be translated into responses on surveys and official statistics. Although recent projections suggest that GAT market growth may be slowing (Molla 2020), the number of people exposed to GAT results extends beyond direct consumers. Information about genetic ancestry is now part of everyday conversations about race and identity among family, friends, and the broader public. It is vital that demographers consider these shifts when designing questions and interpreting results from current and future population surveys.
This work was supported by the National Institutes of Health (grant number R21HG008041-01). We are grateful to Martin Maiers and the National Marrow Donor Program for assistance with data collection, and to Anna Boch for research assistance. We also thank Jeremy Freese, Nicholas Jones, Zhenchao Qian, Florencia Torche, and the Stanford Migration, Ethnicity, Race and Nation workshop for their comments and suggestions. A previous version of this paper was presented at the 2019 annual meeting of the Population Association of America.
A replication package, including programming code and the de-identified data necessary to conduct all analyses presented here, is available upon request.
Sharing GAT results with biological relatives appears to be common; in one study, 80% of GAT takers reported they planned to share their results with family members (Rubanovich et al. 2021).
One alternate coding scheme separated respondents who reported no research from those who reported one or two research activities and those who reported three or more (regardless of the type of activity). The other three-category alternative that we considered for descriptive presentation divided respondents’ research reports into groupings of zero to one, two to three, and four or more activities. Regression models include indicators for the full set of research activities as controls.
Results of a logistic regression show that White GAT takers are 20% less likely than White non-GAT takers to report American Indian ancestry, net of demographic and survey administration controls (see Table A6, online appendix).