In this response to Frank’s comments regarding our recently published article in Demography (Guo et al. 2014), we also address the literature she cites to support her comments. We reiterate a central point already made in our original article. We do not try to explain racial identity. Our objective is to examine racial classification in U.S. social surveys. The general concept of racial identity is a much more complex subject involving many more historical, cultural, political, social, and ancestral factors than we address here. To be sure, a long history of work has attempted to link race to genetics in order to justify racial discrimination and race-based social stratification. However, advances in molecular genetics have afforded opportunities for natural and social scientists to study race using new tools and perspectives that do not necessarily serve to reify race but instead attempt to better explain its many dimensions and possible implications for medical advances and social life.
Molecular genetics studies in the last few decades (Cavalli-Sforza 2007; Jobling et al. 2004; Rotimi 2004; Royal et al. 2010) provide the basis for ancestral analysis, or the analysis of the geographic origin of human populations via genetic data. Starting about 100,000 years ago, anatomically modern humans migrated out of East Africa and spread gradually to South Asia, Australia, Europe, East Asia, and eventually the Americas. All people living today are direct descendants of these earlier humans. The populations living in different parts of the world today exhibit a small number of genetic differences for the following reasons. The migrating groups were likely small, consisting of a few dozen individuals at most, and they possessed only a subset of the alleles of the parent population on the African continent. The smaller a migrant group or a founder population is, the larger the genetic disparity from the parent population. Besides migration, mutation and genetic drift also play a role. Occasionally, natural selection could induce a sweeping genetic change. Furthermore, the reproductive isolation among populations caused by geographical barriers ensures that any differences arising from migration, mutation, genetic drift, and natural selection are maintained. Over time, genetic differences across geographically separated populations would solidify into structured differences across populations. This synopsis of human migration and genetic diversification is supported by many recent studies (e.g., Li et al. 2008; Rosenberg et al. 2002) reporting that the main genetic clusters occur among Europeans/West Asians, sub-Saharan Africans, and East Asians/Pacific Islanders/American Indians.
In what follows, we will refer to the preceding synopsis of human migration and resulting genetic diversification among human populations as we respond to Frank’s comments.
Comment 1: Frank states that our work is not new.
Our article describes the unique contributions of our research (see the Introduction and Background sections). Our empirical analysis that establishes the general correspondence between bio-ancestry and race classification has a number of important and unique features. Using two social science data sources (ROOM and Add Health) that are representative of the U.S. population, we conducted a close examination of mixed-race individuals, something that previous ancestry studies have not examined. The cross-replication in our study relied on both U.S. and worldwide data (HapMap and HDGP); previous studies have used either international samples or U.S. samples alone. This distinction is important because ancestry analysis necessarily involves parent populations and ancestral populations (as we elaborate later). Most importantly, ours is the first empirical study to systematically examine how bio-ancestry and sociocultural context interact to influence race classification. In fact, ours was the first study to utilize extensive social and genetic data for such a purpose.
Comment 2: Frank echoes a common critique of bio-ancestry analysis that our study supports the existence of discrete, genetically distinct populations (e.g., Bolnick et al. 2007).
We would like to emphasize that we do not support the distinct-race theory. We explicitly stated that the idea that “races are biologically distinct peoples with differential abilities and behaviors has long been discredited by the scientific community” (p. 142). Our empirical findings are consistent with our position, and we also presented the theoretical work of Kimura (1983) in support of our position (p. 146). However, as the preceding synopsis of human migration suggests, human migration, mutation, genetic drift, and natural selection could still lead to sufficient genetic differences that play a role in race classification in the contemporary United States.
Comment 3: The samples in HapMap and HDGP are not representative of the worldwide populations, and the samples are small.
From Africa, HapMap samples Yoruba individuals; HDGP samples Bantu, Biaka, Mandenka, Mbuti pygmy, Mozabite, San, and Yoruba individuals. As Duster (2011) noted, many groups in Africa are not sampled (e.g., Zulu individuals). It is true that resolving the fine-scale genetic structure of a population would be difficult when working with small samples (Royal et al. 2010). However, our purpose was not to examine fine-scale patterns of past migration and admixture. Rather, we explored whether a set of ancestry informative markers (AIMs) in HapMap (the parent population for our study) can be applied to contemporary populations in the United States. The replications in our analysis suggest that this strategy has worked despite the fact that HapMap and HDGP are not representative of the entire African population. One possibility is that a majority of African Americans originated from West and Central Africa (Rawley 1981; Thompson 1981). However, we agree that this issue needs to be revisited when more diverse African samples become available.
Comment 4: Frank alludes to an argument that ancestry analysis is limited by the complexity of ancestral populations (Duster 2011; Gabriel 2012). According to this argument, each individual’s DNA is from a large number of ancestors. Assuming each new generation emerges every 25 years, about 256 ancestors living about 200 years ago contributed to the DNA of each individual living today.
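The figure of 256 ancestors follows from simple doubling: each generation back doubles the nominal number of ancestors, and 200 years at 25 years per generation is 8 generations. The short sketch below reproduces that arithmetic. The function name is illustrative only, and the count is a nominal one that assumes no ancestor appears more than once in the pedigree.

```python
def nominal_ancestors(years_back, years_per_generation=25):
    """Nominal number of ancestors in the generation living `years_back` years ago.

    Assumes two distinct parents per person in every generation
    (that is, no ancestor is counted twice in the pedigree).
    """
    generations = years_back // years_per_generation
    return 2 ** generations


# 200 years / 25 years per generation = 8 generations; 2**8 = 256
print(nominal_ancestors(200))
```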
Our understanding of human migration and ancestry is still quite sketchy at the individual level. However, for our purposes, we do not need to know the entire ancestral history of each individual to know roughly to which genetic cluster he or she may belong. Also, for example, although the DNA from an East Asian individual is from an extremely large number of ancestors, ancestry analysis based on numerous AIMs would still classify this individual’s DNA as part of an East Asian genetic cluster if all or most of the contributing ancestors of this individual were from East Asia.
Comment 5: The vast majority of the AIMs are not population-specific. The absence or presence of a particular marker does not indicate the membership in a population (Bolnick et al. 2008; Duster 2011).
This is correct and is well known. This is precisely why we used 186 AIMs to establish individuals’ cluster memberships.
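To illustrate why many markers succeed where a single marker fails, the following sketch combines evidence across independent markers. It is a simplified illustration and not the procedure used in our analysis: the allele frequencies (0.6 in one cluster, 0.4 in the other), the independence of markers, the single allele observed per marker, and the equal prior odds are all hypothetical assumptions chosen for clarity.

```python
import math


def expected_log_odds(p_a, p_b, n_markers):
    """Expected summed log-likelihood ratio (cluster A vs. cluster B)
    for an individual drawn from cluster A.

    Hypothetical setup: every marker has allele frequency p_a in A and
    p_b in B, markers are independent, and one allele is observed per marker.
    """
    per_marker = (p_a * math.log(p_a / p_b)
                  + (1 - p_a) * math.log((1 - p_a) / (1 - p_b)))
    return n_markers * per_marker


def posterior_a(log_odds):
    """Posterior probability of cluster A given the log odds, with equal priors."""
    return 1 / (1 + math.exp(-log_odds))


single = posterior_a(expected_log_odds(0.6, 0.4, 1))    # barely better than a coin flip
many = posterior_a(expected_log_odds(0.6, 0.4, 186))    # very close to 1
print(f"1 marker: {single:.3f}; 186 markers: {many:.6f}")
```

Under these assumptions, no single marker is population-specific, yet the accumulated evidence from 186 such markers is, on average, nearly decisive.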
Comment 6: There is a concern (Duster 2011) that the parent populations in HapMap and HDGP are not comparable to the ancestral populations that are ancestors of the contemporary U.S. populations of Africans, Europeans, and East Asians.
The key question is whether extremely large population changes in regions where parent populations in HapMap and HDGP were sampled have rendered AIM-based ancestral analysis invalid. An AIM ancestral analysis searches for AIMs in a parent population and uses these AIMs to identify individuals in the contemporary United States who belong to a similar genetic cluster as the parent population. First, population changes between an ancestral population and a parent population in HapMap and HDGP must be extremely dramatic in order to invalidate the AIM analysis. As an illustrative example, even the Irish Potato Famine, which reduced the population by 20% to 25%, may not have been dramatic enough to change the genetic structure of the Irish population. Historians and historical demographers have estimated the demographic losses of the Atlantic slave trade during the sixteenth through eighteenth centuries (Fage 2002:245–291). The total number of men and women taken by the slave trade is estimated to be as high as 11,641,000. However, because these slaves were taken over a period of more than three centuries, with an annual average ranging from 19,000 to 61,000 from a population of 25,000,000 in West Africa, “it seems unlikely that the export slave trade would have had any dramatically adverse effect on the size of the population as a whole” (Fage 2002:263). Second, treating parent populations as ancestral populations, numerous AIM studies have been successful in using markers discovered in parent populations to identify individuals in the contemporary United States. Such replications would not have been possible if the parent populations in HapMap and HDGP were not at all comparable to the ancestral populations of the contemporary African, European, and East Asian populations in the United States.
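The annual figures quoted from Fage (2002) in the preceding paragraph can be expressed as a share of the West African population. The sketch below does only that arithmetic. It treats the population of 25,000,000 as constant, which is a simplifying assumption, and the variable and function names are illustrative.

```python
WEST_AFRICA_POPULATION = 25_000_000  # population figure quoted in the text
ANNUAL_LOW = 19_000                  # lower annual average quoted in the text
ANNUAL_HIGH = 61_000                 # upper annual average quoted in the text


def annual_share(annual_count, population=WEST_AFRICA_POPULATION):
    """Fraction of the population taken in one year (population held constant)."""
    return annual_count / population


low_share = annual_share(ANNUAL_LOW)
high_share = annual_share(ANNUAL_HIGH)
print(f"{low_share:.3%} to {high_share:.3%} of the population per year")
```

Under this simplification, the annual loss is well below 1% of the population in any year, far smaller than the 20% to 25% reduction cited for the Irish Potato Famine.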
Comment 7: Frank also alludes to Rotimi’s (2004) point that individuals may have membership in more than one biogeographical cluster and that the boundaries of these clusters are not distinct.
We neither disagree with this assessment nor consider it a critique of our work. In fact, consistent with this argument, our U.S. data show that mixed-race individuals tend to have membership in more than one cluster, and these individuals also lie outside cluster boundaries.
Comment 8: Frank states, “In the Guo et al. article, genetic contributions (in the form of ‘bio-ancestry estimates’) are represented as value-neutral genetic facts situated in a cultural context . . . . Marks (2013) would argue that the bio-genetic ancestry estimates Guo et al. presented would be more appropriately conceptualized as inherently biocultural facts imbued with values, ideologies, and meanings.” Frank asks us to recognize the social construction of the genetic ancestry estimates and to question their production and the interpretations.
We agree with Frank and Marks that ancestry studies are inherently influenced by social and cultural context. For example, value-free social policies or pharmaceutical drugs are non-existent. However, to continue with the example, that a pharmaceutical drug is partially socially and culturally produced does not automatically preclude its usefulness. Frank’s comment lacks specificity about the ways in which our analysis and interpretation are biased and the ways our analysis and interpretation could be improved when the related sociocultural context is addressed.
The ancestral clusters that correspond to European, African, and East Asian Americans represent merely one of the numerous possible levels of ancestral clusters. The focus of our ancestral analysis on these race classifications evidences historical, political, social, and cultural impacts. Because of these impacts, the information on the classifications is regularly collected in U.S. censuses, social surveys, school applications, social policy, health care, and so on. Shall we abandon the use of these classifications altogether? If it is a worthwhile cause, as Frank states, to “harness the promise of incorporating recent advances in human molecular genetics into social science research,” what specific strategies may be used to address sociocultural biases? We believe our work takes the first step in this direction by examining sociocultural effects on racial classifications.
Comment 9: Frank states that the most problematic aspect of our article is “the repeated use of the ‘biogeographical ancestry’ estimates as stand-ins for the biological/genetic component of race against which racial self-identities are assessed.” The main critique is that ancestry estimates have led to a “molecular reinscription of race” (Duster 2011; Fullwiley 2011) or that ancestry estimates equate race with ancestry (Gabriel 2012).
We agree that ancestry cannot be equated with race, which is much more complex than genetic ancestry. Demonstrating sociocultural influences of race classification was our most important objective; we went to great lengths to find such influences. However, as our own empirical work shows, consistent with a large body of ancestry studies summarized in the synopsis, race classification has an important bio-ancestral component. Without taking bio-ancestry into account, the sociocultural influences could be severely misestimated. For example, the estimated higher likelihood of changing self-reported race over time among mixed-race individuals is partially due to mixed ancestry. Without considering bio-ancestry, the higher likelihood would be erroneously attributed entirely to sociocultural influences.
One concern is that ancestral analysis seems to suggest an implausible direct link from DNA sequence to such a complex phenomenon as race. We agree that numerous intermediate biological and sociocultural mechanisms must exist between the two. However, positing a direct link between DNA and a human phenotype is a common simplification. For example, a harmful BRCA1 mutation is shown to increase a woman’s risk for breast cancer by about 50%, but stating this link does not deny the known and unknown intermediate mechanisms.
Our discussions about the nature of race in a number of undergraduate classes and in other public forums make clear that the general public more readily accepts the position that the genetic similarities across races are much more dominant than genetic differences and that the small number of genetic differences across races tend to be superficial, compared with the position that categorically dismisses any genetic differences between survey-classified races. The position that race is entirely socially constructed is not as easily accepted because the public receives information that humans are both social and biological beings. Indeed, every person forms his or her own evidence-based judgment. When their judgment is in conflict with ours, they do not necessarily abandon their own.
Work using bio-ancestry markers continues to forge ahead in biology and medicine. It is important that demographers and other social scientists are part of this growing area of research to ensure that the sociocultural nature of race is understood and incorporated into data collection and research designs. Including social scientists in discussions of race and genetics along with molecular geneticists is imperative. But the first step is to understand the contributions each discipline can bring to our understanding of the social world. Only then can we specify a pathway for integrating the approach of each discipline with the goal of reducing social stratification and health disparities.