Abstract
Sequence analysis is an established method used to study the complexity of family life courses. Although individual and societal characteristics have been linked with the complexity of family trajectories, social scientists have neglected the potential role of genetic factors in explaining variation in family transitions and events across the life course. We estimate the genetic contribution to sequence complexity and a wide range of family demographic behaviors using genomic relatedness–based, restricted maximum likelihood models with data from the U.S. Health and Retirement Study. This innovative methodological approach allows us to provide the first estimates of the heritability of composite life course outcomes—that is, sequence complexity. We demonstrate that a number of family demographic indicators (e.g., the age at first birth and first marriage) are heritable and provide evidence that composite metrics can be influenced by genetic factors. For example, our results show that 11% of the total variation in the complexity of differentiated family sequences is attributable to genetic influences. Moreover, we test whether this genetic contribution varies by social environment as indexed by birth cohort over a period of rapid changes in family norms during the twentieth century. Interestingly, we find evidence that the complexity of fertility and differentiated family trajectories decreased across cohorts, but we find no evidence that the heritability of the complexity of partnership trajectories changed across cohorts. Therefore, our results do not substantiate claims that lower normative constraints on family demographic behavior increase the role of genes.
Introduction
Scholars and the general public alike perceive contemporary life courses to be more complex, unstable, and unpredictable than those of the early and mid-twentieth century (Beck 1992, 2009; Sennet 2006; Walsh 2012). Common narratives in family sociology and demography surrounding more complex family trajectories revolve around the decline of the modern nuclear family and the pluralization of family forms (Bengtson 2001). Indeed, the United States experienced a decrease and postponement in marriage and marital fertility that coincided with an increase in nonmarital cohabitation and fertility as well as divorce and remarriage (Cherlin 2010). This increase in the number of family events and states that individuals experience over the course of their lives results in more complex family trajectories (Brückner and Mayer 2005). Increasing family life course complexity may have serious consequences for individuals and societies. For example, complexity generated by early nonmarital parenthood, serial cohabitation, and divorce is likely tightly intertwined with the production of social inequalities and their reproduction across generations (e.g., McLanahan and Percheski 2008).
Three theoretical narratives have been applied when studying the complexity of family events and transitions within individual life courses (see Van Winkle 2018). First, the second demographic transition (SDT) thesis is an ideational account that associates more complex family life courses with a shift from materialist to post-materialist values (Lesthaeghe 2014). Second, an increase in family life course complexity has been connected with increasing economic uncertainty following globalization and deindustrialization (Mills and Blossfeld 2013). Third, life course sociologists and welfare state scholars have argued that labor market and family policies are related to the complexity of family lives (Mayer 2009; Van Winkle 2019). The first two theoretical perspectives are generally invoked to account for change over time, whereas the latter is more commonly used to account for cross-national differences.
However, a fourth theoretical perspective on family demographic behavior is currently emerging: biodemography or genodemography (Conley 2016; Kohler et al. 2006; Mills and Tropf 2015). Biodemography is the study of family demographic behavior that incorporates economic and sociological theories with approaches from biology, especially behavioral and molecular genetics (Kohler et al. 2006). This approach explores genetic factors that influence the components of family life course complexity. Although the aforementioned theoretical perspectives are the most prominent in family demography and sociology, biodemographic research has gained importance in the last decades (for an introduction to sociogenomic research, see Conley and Fletcher 2017; for a review, see D'Onofrio and Lahey 2010).
In this study, we address two research questions. First, to what extent is family life course complexity heritable? The degree of heritability gives researchers insight into the potential importance of sociobiological explanations and the consequences of ignoring them. Discounting biological and genetic accounts may lead to grossly biased results, especially if genetic factors confound associations between social factors and life course complexity (D'Onofrio and Lahey 2010). Second, has the heritability of family life course complexity changed across U.S. birth cohorts? It has been argued in family demography that genetic influences on family demographic behavior will increase in societies with low levels of social constraints because genes can express themselves to a greater extent in contexts where variation in individual behavior is high and is less restricted by formal and informal social norms (Kohler et al. 2006; Udry 1996). In contrast, some have argued that in a context of increasing social inequality and economic uncertainty, the characteristics associated with family demographic behavior will be increasingly influenced by individuals' social backgrounds, and the importance of genes may decrease, to the extent that social background is distinct from genetic background (Adkins and Vaisey 2009).
We apply retrospective life history and molecular genetic data from the U.S. Health and Retirement Study (HRS) to a genomic relatedness–based, restricted maximum likelihood (GREML) model. Specifically, we first reconstruct the family life courses of HRS respondents as sequences. A nuanced metric developed in sequence analysis, the sequence complexity index, is calculated for each individual trajectory (Gabadinho et al. 2010). The sequence complexity index incorporates not only the number of life course transitions but also the degree of unpredictability across individual trajectories. Second, we estimate the heritability of sequence complexity using GREML models. Most heritability estimates for family demographic behavior have used ACE decomposition models that compare monozygotic and dizygotic twins (for heritability estimation with ACE models, see Boomsma et al. 2002). However, heritability estimates can also be determined among nonkin using GREML analyses, which use genetic similarity to decompose trait variance into additive genetic and environmental components (Domingue, Wedow et al. 2016; Yang et al. 2010, 2011).
Background
Family Demographic Trends in Twentieth-Century United States
In the following discussion, we briefly sketch trends in the occurrence of life course states and transitions that make up family life course complexity: forming and dissolving marital unions as well as entering parenthood. For the first half of the twentieth century, young adults entered marriage directly after leaving the parental home, but more recent cohorts tend to live independently or in nonmarital unions prior to marriage (Buchmann and Kriesi 2011). As the median age at first marriage increased from 22 in 1960 to 28 in 2010 for men and from 20 to 26 for women, the percentage of the U.S. population living in a marital union decreased from 69% to 54% for men and from 65% to 52% for women. One factor that contributes to the decrease in marriage is the increased prevalence of divorce. Between 1960 and 1980, the crude divorce rate in the United States more than doubled from 2.2 to 5.2 divorces per 1,000 persons but then fell to 3.6 by 2006 (Amato 2010). Higher rates of marital separation have also increased remarriage rates (Coleman et al. 2000).
Meanwhile, the decline of fertility in the United States to near-replacement rates has concerned both scholars and policymakers (Balbo et al. 2013). Lower total fertility rates may result from delayed entry into parenthood—the tempo effect—as well as smaller family sizes and higher rates of childlessness—the quantum effect (Morgan and Taylor 2006; Zeman et al. 2018). Women's median age at first birth rose from 23 to 25 between cohorts born in the 1940s and 1980s, respectively (Finer and Philbin 2014). Completed fertility is comparatively high in the United States: between 2.0 and 2.2 for women born in the 1960s (Zeman et al. 2018). After remaining low for cohorts born in the mid-twentieth century, childlessness among women born in the 1970s reached levels previously held by cohorts born during the 1920s: approximately 15% (Frejka 2017; Sobotka 2017). Men and women transitioned into parenthood directly following marriage for most of the twentieth century (Kiernan 2001). However, the percentage of nonmarital births among women under age 44 increased from 21% in the early 1980s to 28% in the early 1990s (Bumpass and Lu 2000).
The decoupling of marriage and parenthood has made research on the sequencing of events during the transition to adulthood more important. Although some scholars have used multistate event-history analysis or life table modeling to study variation in life course origins and destinations (e.g., Zeng et al. 2012), sequence analysis has emerged as a popular method to describe and visualize life course trajectories holistically (Aisenbrey and Fasang 2010). Sequence analysis commonly proceeds in three steps. First, each individual trajectory is operationalized as a sequence by aligning individual life course states in chronological order. Second, the dissimilarity of each sequence pair is estimated using distance measures that indicate the degree of difference between any two sequences. Finally, a clustering algorithm is applied to group sequences into distinct units that are maximally homogenous.
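The first two of these steps can be illustrated with a minimal sketch. The toy alphabet, the three example trajectories, and the choice of the Hamming distance are assumptions made here for illustration; applied studies often use other dissimilarity measures, such as optimal matching.

```python
from itertools import combinations

# Step 1: hypothetical yearly family states for three people, ages 18-23.
# S = single, M = married, MC = married with a child (toy alphabet).
sequences = {
    "a": ["S", "S", "M", "M", "MC", "MC"],
    "b": ["S", "S", "S", "M", "M", "MC"],
    "c": ["S", "S", "S", "S", "S", "S"],
}


def hamming(seq_x, seq_y):
    """Count the positions (ages) at which two equal-length sequences differ."""
    if len(seq_x) != len(seq_y):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(seq_x, seq_y))


# Step 2: dissimilarity for every pair of sequences.
distances = {
    (i, j): hamming(sequences[i], sequences[j])
    for i, j in combinations(sorted(sequences), 2)
}
```

In the third step, a pairwise dissimilarity table of this kind would be passed to a clustering routine, which is omitted from this sketch.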
An advantage of using sequence analysis over other methodological approaches in life course sociology is that it reduces complexity in two ways. First, sequence analysis identifies common life course pathways within a multitude of trajectories that may warrant further study. Second, rather than analyzing numerous point-in-time outcomes, such as the age at first birth, sequence analysis enables researchers to analyze the life courses that result from those outcomes, so-called process outcomes (Abbott 2005). A number of studies have used sequence analysis to identify family life course patterns and how they vary across countries and birth cohorts (e.g., Aisenbrey and Fasang 2017; Potârcă et al. 2013). These studies demonstrated that the transition to adulthood and family formation has shifted from an early, contracted, and simple pattern to a late, protracted, and complex pattern across birth cohorts in a wide range of countries (Billari and Liefbroer 2010).
Previous Research on Family Life Course Complexity
There is general agreement that complexity should be conceptualized in terms of life course differentiation (Van Winkle 2018; Van Winkle and Fasang 2017). Brückner and Mayer (2005) defined differentiation as an increase in the number of life course states experienced across the life course. Therefore, complexity has often been operationalized using a simple count measure of the number of life course states or transitions experienced across individuals' lives. However, complexity is also associated with an increase in life course uncertainty (Mills and Blossfeld 2013). Composite metrics developed in sequence analysis incorporate both the number of life course states as well as the degree of unpredictability (Elzinga and Liefbroer 2007; Gabadinho et al. 2010). Sequence-based complexity measures have the advantage that they can incorporate many life course states (i.e., the intersection of different life course dimensions) as well as some simple states.
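To make the idea of a composite metric concrete, the following sketch computes a complexity index that combines the number of transitions with the within-sequence entropy of the state distribution. The formula reflects our reading of the index proposed by Gabadinho et al. (2010): the geometric mean of the normalized transition count and the normalized entropy. The function name and example trajectories are hypothetical.

```python
import math
from collections import Counter


def complexity_index(sequence, alphabet_size):
    """Geometric mean of normalized transitions and normalized entropy.

    Assumed form: sqrt((transitions / (length - 1)) * (entropy / log(alphabet_size))).
    Returns 0 for a sequence that never changes state and 1 at the maximum.
    """
    length = len(sequence)
    if length < 2 or alphabet_size < 2:
        raise ValueError("need at least two positions and two possible states")
    # Number of state changes between adjacent positions.
    transitions = sum(a != b for a, b in zip(sequence, sequence[1:]))
    # Shannon entropy of the time spent in each state within this sequence.
    shares = [n / length for n in Counter(sequence).values()]
    entropy = sum(p * math.log(1 / p) for p in shares)
    max_entropy = math.log(alphabet_size)
    return math.sqrt((transitions / (length - 1)) * (entropy / max_entropy))


stable = complexity_index(["S", "S", "M", "M"], alphabet_size=3)
turbulent = complexity_index(["S", "M", "D", "M"], alphabet_size=3)
```

Under these assumptions, the trajectory with a single transition scores lower than the trajectory that changes state at every step, and a trajectory spent entirely in one state scores zero.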
However, little research has explored the complexity of family trajectories using sequence-based complexity metrics. The bulk of studies on life course complexity are interested in the differentiation of education-work-retirement trajectories (Biemann et al. 2011; Ciganda 2015; Riekhoff 2018). Although many studies have applied sequence and cluster analysis to family trajectories, Elzinga and Liefbroer (2007) first studied the complexity of early family life courses in a number of countries and birth cohorts. They found that average early family life course complexity increased only moderately across a small number of their study countries but that average complexity otherwise remained relatively stable. Van Winkle (2018) analyzed long-life family trajectories in a number of European countries and cohorts and concluded that although complexity increased across cohorts, cross-national variation was considerably larger.
The Heritability of Family Life Course Complexity
In this study, we estimate the heritability of family life course complexity. Heritability is a population-level characteristic and reflects the genetic component to the variation in a trait or phenotype: here, complexity. Specifically, (narrow sense) heritability is the proportion of phenotypic variance that is attributed to (additive) genetic variance (i.e., to common genetic variants within a population) as opposed to environmental variance (Domingue, Wedow et al. 2016). The strength of heritability gives researchers insight into the potential importance of sociobiological explanations and whether it is warranted to integrate them into sociological and demographic theory.
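This definition can be written compactly. The symbols are introduced here for illustration and assume a purely additive decomposition in which phenotypic variance is the sum of additive genetic variance and environmental variance, with no gene-environment covariance:

```latex
h^2 = \frac{\sigma_A^2}{\sigma_P^2} = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_E^2}
```

Here \sigma_A^2 denotes additive genetic variance, \sigma_E^2 environmental variance, and \sigma_P^2 total phenotypic variance. An estimate of h^2 = .11, for example, would mean that 11% of the total variation in a trait is attributable to additive genetic variation.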
During the twentieth century, twin and family studies were the gold standard for estimating the genetic component of a given trait (Boomsma et al. 2002; Neale and Cardon 1992). A common analytical approach, the ACE model, compares monozygotic and dizygotic twins to decompose variance into an additive genetic component (A), a component attributable to shared or common environmental factors (C) (e.g., family background), and an environmental component unique to each twin (E) (for a brief discussion, see Diewald et al. 2015). Although sociologists have been mainly interested in quantifying the effects of social and family background (C), twin studies have also demonstrated that many components of family life course complexity are heritable (A). The heritability of women's age at first birth has been estimated to be between 0 in Denmark and .3 in the United Kingdom. Thus, genetic influences account for up to 30% of the total variation of women's age at first birth in the United Kingdom but none in Denmark. The heritability of completed fertility has been estimated to be between .24 for Swedish women and .43 for Danish women and between .24 for Swedish men and .28 for Danish men (Kohler et al. 1999; Mills and Tropf 2015; Tropf, Barban et al. 2015). Johnson and colleagues (2004) estimated large heritability in the propensity to marry in the United States: .72 for women and .66 for men. Estimates for the propensity to divorce are similarly high, between .52 and .59 for both U.S. men and women (Jocklin et al. 1996; McGue and Lykken 1992).
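The logic of the twin comparison can be sketched with Falconer's approximation, a textbook shortcut rather than the structural equation models that ACE studies typically estimate. It assumes that monozygotic twins share all of their additive genetic variance, that dizygotic twins share half on average, and that both twin types share common environments equally. The function name and the twin correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate A, C, and E variance shares from twin-pair correlations.

    Assumes r_mz = A + C and r_dz = 0.5 * A + C (Falconer's approximation).
    """
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (A)
    c2 = 2 * r_dz - r_mz    # shared environment share (C)
    e2 = 1 - r_mz           # unique environment share (E), incl. measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations for a trait among MZ and DZ twin pairs.
estimates = falconer_ace(r_mz=0.6, r_dz=0.4)
```

With these illustrative correlations, roughly 40% of the variance would be assigned to A, 20% to C, and 40% to E.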
However, twin ACE models come with strong assumptions, which lead to biased heritability estimates if violated (Horwitz et al. 2003). For example, it is assumed that monozygotic twins do not mutually influence one another more than dizygotic twins and that dizygotic twins share an average of 50% of their genes (Conley et al. 2013). Further, heritability estimates from twin studies may not be generalizable if there is nonrandom genetic stratification; for example, genes associated with high fertility are more common among twins (Mills and Tropf 2015). Recent advances in molecular genetics and low-cost DNA sequencing methods have enabled researchers to base heritability estimates on true genetic similarity rather than assumed genetic similarity (Domingue, Wedow et al. 2016). Molecular geneticists, as opposed to quantitative behavioral geneticists, are primarily interested in isolating and locating genetic variants associated with a trait rather than estimating the heritability of a trait (Conley 2016). Genome-wide association studies (GWAS) are commonly used to locate single nucleotide polymorphisms (SNPs)—markers of genetic variation—which are associated with a trait. Complex traits, such as fertility or cognitive ability, are polygenic traits that are affected by multiple SNPs rather than a single gene. Following GWAS, polygenic scores (PGS) that estimate individuals' genetic risk for a certain trait are constructed. The variance that PGSs explain is often only a fraction of heritability estimates from twin models (for a discussion on the problem of missing heritability, see Eichler et al. 2010; Manolio et al. 2009).
Genome-wide complex trait analysis (GCTA) consists of several methodological approaches to estimate heritability based on all available SNPs. The GREML model uses genetic similarity to decompose the total variance of a trait into genetic and environmental components (Yang et al. 2010, 2011). GREML studies have been used to estimate the heritability of several anthropometric and some social demographic outcomes (Domingue, Wedow et al. 2016). Tropf and colleagues (Tropf, Stulp et al. 2015) estimated a .10 SNP heritability for completed fertility and a .15 SNP heritability for the age at first birth for a pooled sample of women from the United Kingdom and the Netherlands. SNP heritability estimates from GREML are generally larger than PGS heritability estimates but are still lower than twin-based estimates. A study by Yang and colleagues (2015) demonstrated that SNP heritability estimates for height and body mass index (BMI) from GREML using the entire genome with imputed SNPs, rather than a sample of SNPs, are more similar to those from twin studies; however, other reasons remain as to why twin-based heritability estimates still tend to be higher than GREML-based ones. For instance, twin models capture the effects of rare variants that are captured neither by SNP chips nor by imputation panels. Likewise, twin-based approaches also capture nonadditive effects (epistasis). Finally, twin-based estimates may be higher because of faulty assumptions in the modeling strategy. To our knowledge, no studies have estimated SNP heritability for family demographic outcomes aside from fertility, such as marriage or divorce. However, based on the large heritability found in twin studies, both life course states are likely to be heritable.
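The genetic similarity that GREML relies on is summarized in a genomic relationship matrix (GRM). The sketch below computes a simplified GRM from allele counts, following the standardization described by Yang and colleagues (2010) as we understand it. The function name and toy genotypes are hypothetical, the same formula is applied to diagonal and off-diagonal entries for simplicity, and the restricted maximum likelihood step itself is not shown.

```python
import math


def genomic_relationship_matrix(genotypes):
    """Simplified GRM; genotypes[j][i] is the allele count (0, 1, 2) of person j at SNP i."""
    n_people = len(genotypes)
    n_snps = len(genotypes[0])
    standardized = [[] for _ in range(n_people)]
    for i in range(n_snps):
        column = [person[i] for person in genotypes]
        p = sum(column) / (2 * n_people)  # sample allele frequency
        if p == 0.0 or p == 1.0:
            continue  # a SNP without variation carries no information
        scale = math.sqrt(2 * p * (1 - p))  # expected SD of the allele count
        for j, count in enumerate(column):
            standardized[j].append((count - 2 * p) / scale)
    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs available")
    # Entry (j, k): average product of standardized genotypes across SNPs.
    return [
        [sum(a * b for a, b in zip(standardized[j], standardized[k])) / used
         for k in range(n_people)]
        for j in range(n_people)
    ]


# Three hypothetical, unrelated people genotyped at two SNPs.
toy_genotypes = [[0, 2], [1, 1], [2, 0]]
grm = genomic_relationship_matrix(toy_genotypes)
```

GREML then treats the covariance of the trait across individuals as the GRM multiplied by a genetic variance plus an identity matrix multiplied by a residual variance. SNP heritability is the genetic variance divided by the sum of the two.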
Exactly how genetic factors influence social demographic traits remains largely unclear (for an overview of early biosocial models for fertility, see Udry 1996). Kohler and colleagues (2006) argued that genes affect fertility (1) directly through biological pathways, such as menarche; (2) indirectly through conscious life course decision-making, such as that based on knowledge about fecundity; and (3) indirectly through subconscious life course decision-making, such as that influenced by personality characteristics. Freese (2008) contended that genetic effects on sociological outcomes are mediated through the body, which he termed the phenotypic bottleneck. These biosocial pathways are displayed in Figure 1 with a stylized example. In panel a of Figure 1, genetic factors (G) partially determine the intermediate phenotype personality traits (Z), which affects the timing of first birth (Y). Educational attainment (X) has an independent direct effect on the age at first birth. In panel b, personality traits moderate the association between educational attainment and fertility, and personality traits simply precede educational attainment in panel c. Finally, panel d demonstrates how personality traits are confounded with education. Although our study does not address the interplay between educational attainment and the heritability of family demographic phenotypes, the biosocial models by Kohler and colleagues (2006) and Freese (2008) highlight the many pathways through which genes affect social outcomes.
Given the consistent findings for the relevance of genetic influences, family life course complexity is likely affected by genetic factors both directly (e.g., biologically determined fecundity) and indirectly (e.g., phenotypes that affect family demographic decisions). Therefore, we expect that
Hypothesis 1 (H1): The complexity of family trajectories will be significantly heritable.
We consider family complexity to be a higher-level—and thus also a highly polygenic—phenotype (e.g., Barghi et al. 2020). Because of family complexity's composite nature, genetic effects will run through comparatively lower-level phenotypes, such as the age at first birth, the number of children, and the propensity to marry and divorce. Based on the previous research discussed earlier, we think that all elements of family complexity are likely to be heritable and contribute to the heritability of family complexity. However, it is an empirical question as to whether specific aspects of family complexity, such as the age at first birth, drive the heritability of complexity to a greater degree than others, such as the transition to marriage.
The Case for Increasing Heritability of Family Life Course Complexity
The relevance of genetic factors for family life course complexity may vary across social contexts, such as across birth cohorts. Societal norms and institutions can change the heritability of social outcomes by influencing the relationship between intermediate phenotypes and outcomes. As an example outside of family demography, Domingue and colleagues (2016) found evidence that the heritability of smoking in the United States increased from roughly .13 to .32 between cohorts born 1939–1945 and 1947–1959. They concluded that the influence of genetic factors associated with nicotine addiction strengthened as evidence on the dangers of smoking emerged during the 1960s. Stated differently, the environmental change of expanded knowledge about the effects of smoking increased the effect of genetic factors, which in turn increased the heritability of smoking. As environmental forces increase or decrease the variance of behavioral outcomes, the relative importance of genes changes accordingly.
In family demography, the most common hypothesis is that the heritability of family demographic outcomes will be higher for cohorts transitioning to adulthood after the onset of the SDT. Udry (1996) proposed a multilevel biosocial model predicting that societal characteristics influence the relationships between genes and social outcomes. Specifically, the genetic influence on voluntary behavioral outcomes, such as entering parenthood and marriage, will increase in societies with low levels of social constraints. Udry (1996:335) argued that behavioral variation will increase in egalitarian and individualistic contexts and that genes will have more opportunities to express themselves as a result. According to the SDT thesis (Lesthaeghe 2014), more diverse and complex family demographic behavior is the result of a societal shift from materialist to post-materialist values.
Studies that estimated cohort differences in the heritability of family demographic behavior are limited to fertility but are generally in line with Udry's (1996) argument. Using historical twin data, Bras and colleagues (2013) found evidence for increasing heritability during the first demographic transition in the nineteenth century. They argued that genes for fertility became more important as the social control of women's fertility decreased. More recent studies using twin data have observed higher heritability for Danish cohorts transitioning to adulthood after the onset of the SDT for fertility motivation and fertility (Kohler et al. 1999, 2002; Rodgers et al. 2001). Similarly, Tropf and colleagues (Tropf, Barban et al. 2015) found that the heritability of women's age at first birth in the United Kingdom increased for cohorts characterized by liberalization and the sexual revolution but that the introduction of modern contraception and economic recessions decreased heritability shortly thereafter.
Aside from a value shift, a number of legal changes across U.S. states have coincided with more complex family demographic behavior. The U.S. Food and Drug Administration announced the pending approval of the first oral contraceptive in 1960. However, it was not until 1965 that birth control use became legal for married couples in all U.S. states. Unmarried women would have to wait until a U.S. Supreme Court ruling in 1972. In the following year, the Supreme Court granted women the right to have an abortion without excessive government restriction. Studies have demonstrated that the diffusion of oral contraceptives and abortion legalization delayed women's entry into parenthood and is associated with an overall decrease in completed fertility (Ananat et al. 2007; Ananat and Hungerman 2012; Guldi 2008). During the same period, divorce legislation was liberalized across the United States. California was the first U.S. state to pass unilateral no-fault divorce legislation in 1969, which allows one partner to petition for divorce on the grounds of irreconcilable differences. By the mid-1980s, nearly all U.S. states had legislated no-fault divorce, although many required mutual consent for no-fault divorce. Unsurprisingly, the diffusion of no-fault divorce legislation is associated with considerably higher divorce rates and, subsequently, remarriage rates (Nakonezny et al. 1995).
In contrast to the influence of the 1964 U.S. Surgeon General's Report on the heritability of smoking, it is unlikely that any single change in legislation influenced the heritability of family complexity in the United States. It is also unlikely that a shift toward post-materialism without the legal options for contraception or divorce will have a considerable impact on the heritability of family demographic behavior. We expect that
Hypothesis 2 (H2): The heritability of family trajectory complexity will increase across birth cohorts.
We expect H2 to hold especially for individuals born after the 1940s, who transitioned to adulthood after the onset of the SDT and the liberalization of contraceptive and divorce legislation. Once social structures governing union formation and dissolution and reproductive behavior loosen, individual preferences and capabilities (that may be more or less related to genes) are able to emerge more strongly. This hypothesis is guided by prior research finding that heritability increased for BMI as environmental (i.e., caloric) constraints ebbed (Conley et al. 2016).
The Case for Decreasing Heritability of Family Life Course Complexity
Perhaps the largest competitor of the SDT thesis has become Mills and Blossfeld's (2003, 2005, 2013) argument that increasing economic uncertainty—not a shift toward post-materialism—is the integral factor behind change in family demographic behavior. The general argument is that young adults who are facing economic uncertainty in a globalized world economy delay marriage and parenthood because of high costs and long commitments, thus resorting to other family models instead (e.g., cohabitation). Although the narrative of decreasing normative constraints and increasing heritability is the dominant theme in much research, some sociogenomic perspectives in social stratification link increasing social inequality to decreasing heritability (see Adkins and Vaisey 2009). It is surprising that the purported relationship between inequality and heritability has received so little attention in biodemographic theorizing because of the important role inequality plays in contemporary family demographic theory.
Links between economic uncertainty and delayed or forgone family formation can be traced to the sociological literature of the 1970s and 1980s. According to Easterlin (1975, 1976), delayed entry into parenthood is the result of a conflict between consumption aspirations and resources. Young adults are expected to enter parenthood after they achieve their desired standard of living formed during childhood. Poor labor market conditions make it more difficult for young adults to fulfill their consumption aspirations, which leads to postponed or forgone parenthood. Oppenheimer (1988) argued that delayed marriage is the consequence of increasing uncertainty in the marriage market. For example, women use men's labor market status to estimate their future socioeconomic attainment. However, as unstable, temporary, and low-wage employment as well as unemployment became more common, men needed more time to establish themselves on the labor market (Oppenheimer and Kalmijn 1995). In sum, rising economic insecurity is associated with more complex family life courses.
In a context of increasing social inequality and economic uncertainty, the characteristics associated with family demographic behavior are increasingly influenced by individuals' social backgrounds, and the importance of genes may decrease (Adkins and Vaisey 2009). As an example, assume that individuals wish to enter marriage and have two children by their early 20s rather than their early 30s but that they prioritize a certain standard of living beforehand. In a society where social background differences are low and allow early marriage and parenthood, genetic influences could be relatively high because equal social backgrounds will enable everyone to enter marriage and parenthood, and any variation will be due to genetic variation. Similarly, in a society where social background differences are low but do not facilitate early marriage and parenthood, any variation will be due to genes. In contrast to those scenarios, genetic influences will be relatively small in societies characterized by large social background differences.
Some evidence supports the inverse relationship between social inequality and heritability. For example, Floud and colleagues (1990) used historical data from the United Kingdom to demonstrate that genes became more important for height as social inequalities in terms of nutrition and sanitation lessened. In family demography, Tropf, Barban, and colleagues (2015) showed that the heritability of UK and Dutch women's age at first birth decreased for cohorts born during the 1950s, who experienced the oil and economic crises of the 1970s during their 20s. However, several studies have failed to find lower heritability in more unequal contexts (e.g., Colodro-Conde et al. 2015; Rimfeld et al. 2018). In line with the economic constraints framework, we expect that
Hypothesis 3 (H3): The heritability of family trajectory complexity will decrease across birth cohorts.
We expect that H3 will hold especially for individuals born after the 1950s, who transitioned to adulthood in times of economic turmoil.
Data and Methods
Sample and Sequence Definition
The HRS is a biennial panel study with prospective and retrospective data on family formation, as well as molecular genetic information from the respondents and their partners (HRS 2018). The original HRS cohort was first sampled in 1992 and consisted of men and women born in 1931–1941 at ages 51–61. A second study cohort, the Asset and Health Dynamics Among the Oldest Old (AHEAD), consisting of men and women born before 1924, was sampled the following year. The two study cohorts were interviewed simultaneously in 1998 when the third and fourth cohorts were introduced: the Children of the Great Depression born in 1924–1930 and the War Babies born in 1942–1947. Since then, a refreshment sample has been added every three waves (i.e., every six years). We restrict our analyses to respondents born between 1905 and 1969.
We use retrospective information in the HRS based on respondents’ self-reports on the number of children and the year of their birth, but we also include best-guess information on the year of children's birth prepared by the RAND Corporation, as well as the year of up to three marriages, the year of their dissolution, and the reason for their dissolution. This information allows us to create a data set that includes information on whether individuals experienced a transition and the age at which they experienced that transition.
We reconstruct individuals' family life courses as sequences from ages 18 to 45 using five sequence alphabets. The first sequence alphabet includes only information on fertility: at any given age, an individual can be childless (C0) or have any number of children (e.g., C1, C3, or C5). Rather than fertility, our second sequence alphabet focuses on detailed marital histories, categorizing individuals as single (S); in a first, second, or third marriage (M1, M2, or M3, respectively); divorced (D); or widowed (W). The other three sequence specifications combine union and fertility histories. The most differentiated alphabet simply combines the fertility and union sequences, generating states such as single with one child (SC1), a first marriage with two children (M1C2), divorced without children (DC0), or in a second marriage with five children (M2C5). We also generate sequences with a reduced alphabet differentiating only up to three children (e.g., M1C1, M1C2, M1C3+) and a simple alphabet differentiating only being married versus single and childless versus a parent. Because the HRS has not extensively collected cohabitation and residential histories, these histories cannot be included in the sequence alphabet. However, it is unlikely that the lack of information on parental home leaving and cohabitation will bias our results. Individuals left the parental home at relatively early ages, and cohabitation was not a common living arrangement for our study cohorts.
Calculating Sequence Complexity
The complexity index combines two normalized components: the number of transitions within a sequence, q(x), divided by the theoretical maximum number of transitions, qmax, and the longitudinal entropy of the sequence, h(x), divided by its theoretical maximum, hmax.
Longitudinal entropy is calculated from π, the proportion of occurrences of each state, i, of the sequence alphabet, s. Entropy within sequences is maximal when each state occurs an equal number of times. Complexity is minimal in sequences with only one state and maximal in sequences that contain every state element with equal durations and have the maximum number of transitions.
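The two normalized components described above can be computed directly from a sequence of yearly states. The following Python sketch is illustrative only and rests on assumptions that this section does not state: the two ratios are combined as a geometric mean (the form of the complexity index implemented in common sequence analysis software such as the TraMineR package for R), qmax is taken to be the sequence length minus 1, and hmax is taken to be the logarithm of the alphabet size. The function names and the example sequences are hypothetical.

```python
import math
from collections import Counter


def count_transitions(sequence):
    """Number of changes of state between adjacent positions, q(x)."""
    return sum(1 for prev, curr in zip(sequence, sequence[1:]) if prev != curr)


def longitudinal_entropy(sequence):
    """Shannon entropy of the within-sequence state proportions, h(x)."""
    length = len(sequence)
    proportions = [count / length for count in Counter(sequence).values()]
    return -sum(p * math.log(p) for p in proportions)


def complexity_index(sequence, alphabet_size):
    """Geometric mean of normalized transitions and normalized entropy (assumed form)."""
    q_max = len(sequence) - 1        # assumption: a change of state at every position
    h_max = math.log(alphabet_size)  # assumption: all states equally frequent
    transitions = count_transitions(sequence)
    if transitions == 0 or q_max == 0 or h_max == 0:
        return 0.0                   # a single-state sequence has minimal complexity
    return math.sqrt((transitions / q_max) * (longitudinal_entropy(sequence) / h_max))


# Hypothetical fertility sequences for ages 18 to 45 (28 yearly states).
stable = ["C0"] * 28
two_births = ["C0"] * 7 + ["C1"] * 3 + ["C2"] * 18
```

Under these assumptions, a respondent who remains childless throughout (`stable`) receives a complexity of 0, whereas a respondent with two births (`two_births`) receives a value strictly between 0 and 1.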
We construct five complexity indicators: the fertility complexity index (FCI), union complexity index (UCI), differentiated sequence complexity index (DSCI), reduced sequence complexity index (RSCI), and simple sequence complexity index (SSCI). Descriptive statistics for these indicators can be found in the online appendix. In addition, we provide information on how these indicators are correlated and how those correlations change across birth cohorts (see the online appendix). FCI and UCI are not correlated, except for a moderate positive correlation for cohorts born after 1955. DSCI is strongly correlated with FCI, nearly perfectly so for older cohorts, but is only moderately correlated with UCI. This pattern is driven by births outnumbering partnership transitions in a high-fertility context where divorce and remarriage are relatively uncommon. RSCI is slightly more strongly correlated with FCI than with UCI, whereas the reverse is true for SSCI.
Estimating Genetic Similarity
The genetic similarity, Ajk, between each pair of respondents j and k is estimated across N genetic markers, where xij and xik are the number of minor alleles at SNP i for individuals j and k, and p is the minor allele frequency at that SNP. The matrix A containing the genetic similarity estimates between all respondent pairs is called the genetic relationship matrix (GRM). We calculate the GRMs using all autosomal² SNPs with a minor allele frequency above 1% in the sample to ensure that genetic similarity is based on variants that are not rare within the population.³ We then prune the sample for cryptic relatedness (Ajk ≥ 0.025); that is, we remove one person from each pair of persons who are as genetically similar as second-degree cousins. GREML assumes that genetic similarity among unrelated individuals is independent of environmental similarity. Genetically similar individuals may nonetheless share similar environments, which could bias our heritability estimates, although violations of this assumption do not appear to substantially bias GREML estimates (Conley et al. 2014).
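To make the construction of the GRM and the relatedness pruning concrete, the following Python sketch computes pairwise genetic similarity from allele counts. It is a minimal illustration under stated assumptions, not the GCTA implementation: it uses the standard expression implied by the symbols defined above, in which each SNP contributes (xij − 2p)(xik − 2p) / [2p(1 − p)] and contributions are averaged over the N retained SNPs; it applies the same expression to the diagonal for simplicity, which may differ from what GCTA does; and it prunes with a simple greedy rule. The function names and toy genotypes are hypothetical.

```python
def genetic_relationship_matrix(genotypes, min_maf=0.01):
    """genotypes[j][i] is the allele count (0, 1, or 2) of individual j at SNP i."""
    n_people = len(genotypes)
    n_snps = len(genotypes[0])
    # Allele frequency of each SNP, estimated from the sample itself.
    freqs = [sum(person[i] for person in genotypes) / (2 * n_people)
             for i in range(n_snps)]
    # Keep only SNPs whose minor allele frequency exceeds the threshold.
    kept = [i for i in range(n_snps) if min(freqs[i], 1 - freqs[i]) > min_maf]
    if not kept:
        raise ValueError("no SNP passes the minor allele frequency filter")
    grm = [[0.0] * n_people for _ in range(n_people)]
    for j in range(n_people):
        for k in range(j, n_people):
            total = sum(
                (genotypes[j][i] - 2 * freqs[i]) * (genotypes[k][i] - 2 * freqs[i])
                / (2 * freqs[i] * (1 - freqs[i]))
                for i in kept
            )
            grm[j][k] = grm[k][j] = total / len(kept)
    return grm


def prune_related(grm, cutoff=0.025):
    """Greedy rule (an assumption): keep a person only if their similarity to
    every person already kept is below the cutoff."""
    kept = []
    for j in range(len(grm)):
        if all(grm[j][k] < cutoff for k in kept):
            kept.append(j)
    return kept


# Toy data: four individuals, three SNPs; individuals 0 and 1 are identical.
toy_genotypes = [[0, 1, 2], [0, 1, 2], [2, 1, 0], [1, 1, 1]]
toy_grm = genetic_relationship_matrix(toy_genotypes)
unrelated = prune_related(toy_grm)
```

With only three SNPs the similarity values are far noisier than with genome-wide data, so the toy example serves only to show the mechanics: one member of each pair at or above the cutoff is dropped before heritability is estimated.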
Estimating Heritability
GREML belongs to the class of mixed linear models, similar to the multilevel models commonly used in social demography and sociology (e.g., for cross-country and cross-cohort comparisons, see Van Winkle 2018; Van Winkle and Fasang 2017). Heritability is essentially an intraclass correlation coefficient of the kind commonly used in comparative sociological research. Rather than estimating country and cohort random effects to decompose the complexity variance attributable to differences across countries and cohorts, the random effect is used to decompose the variance into a genetic and an environmental component. The variance parameters in our models are constrained so that heritability estimates cannot be lower than 0 or larger than 1. Results from unconstrained models can be found in the online appendix.
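The intraclass-correlation reading of heritability can be illustrated with a short Python sketch. It assumes that the model yields two variance components, one genetic and one residual (environmental), and simply reports the genetic share of their sum; it does not perform the restricted maximum likelihood estimation itself. The function name and the numerical values are hypothetical.

```python
def variance_share(genetic_variance, residual_variance):
    """Share of total phenotypic variance attributed to the genetic component."""
    total = genetic_variance + residual_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return genetic_variance / total


# Hypothetical variance components for a standardized phenotype.
constrained_example = variance_share(0.17, 0.83)     # both components non-negative
unconstrained_example = variance_share(-0.05, 1.05)  # negative genetic component
```

When both components are non-negative, as in a constrained model, the share necessarily lies between 0 and 1. An unconstrained model can return a negative genetic component and hence a negative share, which is why results from both specifications can differ near the boundary.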
We estimate SNP heritability for a pooled HRS sample as well as heritability for an older (1909–1940) and a younger (1941–1969) birth cohort. Following the convention in the literature, we restrict our sample to persons of European ancestry. Our sample includes self-reported non-Hispanic Whites within plus/minus 1 standard deviation of the first and second principal components of all unrelated respondents (for more information, see Ware et al. 2018). All GREML models are adjusted for respondents' gender and birth year as well as 10 principal components to adjust for population stratification—that is, differences in minor allele frequencies that are attributable to ancestral differences. This is commonly called the chopsticks problem, which arises when subgroups have different allele frequencies that coincide with a phenotype that is culturally rather than biologically determined (Hamer and Sirota 2000). For example, persons of Asian and European ancestry will systematically differ on SNP frequencies that would coincide with the different probability of chopstick use between the two subgroups. This adjustment has been criticized given that it may correct for differences that are meaningful for the trait being studied, which could make our estimates conservative. All continuous phenotypes are standardized by gender and birth year before the analyses. We also performed analyses using unstandardized phenotypes to ensure that removing differences in variation across birth years does not affect our cohort comparisons. These findings were substantively similar to the results we report here. A potential issue for our analyses, especially for our cohort comparisons, is mortality bias. As a robustness check, we estimated the probability of surviving until 2006 when genetic data were collected, following Domingue et al. (2017), and included it as a covariate in our GREML models (see the online appendix). 
The results of these analyses were similar and led to the same substantive conclusions. We use GCTA⁴ to conduct all analyses (see Yang et al. 2011).
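The standardization of continuous phenotypes by gender and birth year can be sketched as follows. The sketch assumes that standardization means converting each value to a z score within its gender-by-birth-year cell; the section does not spell out the exact procedure, so this is one plausible reading rather than the procedure actually used. The function name and the records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_groups(records):
    """records: list of (gender, birth_year, value); returns z scores in input order."""
    groups = defaultdict(list)
    for gender, birth_year, value in records:
        groups[(gender, birth_year)].append(value)
    # Mean and (population) standard deviation of each gender-by-birth-year cell.
    stats = {key: (mean(values), pstdev(values)) for key, values in groups.items()}
    z_scores = []
    for gender, birth_year, value in records:
        group_mean, group_sd = stats[(gender, birth_year)]
        # A cell without variation carries no information; assign 0 (assumption).
        z_scores.append(0.0 if group_sd == 0 else (value - group_mean) / group_sd)
    return z_scores


# Hypothetical records: (gender, birth year, complexity value).
example = [("F", 1930, 1.0), ("F", 1930, 3.0), ("M", 1930, 5.0)]
z = standardize_within_groups(example)
```

Standardizing in this way removes differences in both the mean and the variance of a phenotype across birth years, which is why the comparison with unstandardized phenotypes serves as a useful check on the cohort results.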
Results
Trends in Sequence Complexity
Summary statistics for our sample are displayed in Figure 2. We present means (black lines) and coefficients of variation (gray lines) by birth year for all our sequence complexity indicators. Summary statistics on all other family demographic indicators can be found in the online appendix. FCI is highest and varies the most for cohorts born in the 1930s, reflecting the higher number of parity transitions as well as the variation in the age at first birth, birth spacing, and the number of children for cohorts born before 1940. In contrast, average UCI is highest for post-1950 cohorts, and UCI variation increases considerably across all cohorts, which matches the trend toward more variance in the propensity to marry and the number of marriages as well as the higher propensity to divorce.
DSCI, RSCI, and SSCI show differing trends. Average DSCI is highest and varies to a greater extent for older cohorts, which corresponds with trajectories consisting of a larger number of fertility transitions than union transitions. RSCI and SSCI give more weight to union transitions. Average RSCI and especially SSCI are higher for post-1950 cohorts, and their variances increase across birth cohorts. Therefore, we might observe different trends for heritability depending on the sequence alphabet: the heritability of FCI and DSCI may decrease as the variance in the complexity of those sequences decreases, whereas the heritability of UCI, RSCI, and SSCI may increase as the variance in the complexity of those sequences increases.
GREML Heritability Estimates for Complexity Across Birth Cohorts
SNP heritability estimates for our sequence complexity measures and their 95% confidence intervals are displayed in Figure 3. The average estimate is located on the far left of each panel followed by the estimate for the 1909–1940 and 1941–1969 birth cohorts. We estimate cohort heritability using large cohorts to ensure that we have a sufficient sample to identify moderately large heritability (N ≈ 5,000 for h² ≈ .2) (see Visscher et al. 2014). Ancillary power analyses show that a cohort heritability estimate must be between .3 and .4 to be statistically different from a cohort estimate of 0 (see the online appendix). We estimate approximate confidence intervals based on the estimated standard errors, which may be upwardly biased (see Schweiger et al. 2016). All heritability estimates and their standard errors, p values, and sample sizes can be found in Table 1.
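An approximate confidence interval based on a standard error can be sketched in Python as follows. The sketch assumes a symmetric normal-approximation interval (estimate plus or minus 1.96 standard errors); whether the reported intervals were computed in exactly this way is not stated here. The function names and the numerical inputs are hypothetical and are not estimates from Table 1.

```python
def approximate_ci(estimate, standard_error, z=1.96):
    """Symmetric normal-approximation interval around a heritability estimate."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


def excludes_zero(estimate, standard_error, z=1.96):
    """True if the approximate interval does not contain 0."""
    lower, upper = approximate_ci(estimate, standard_error, z)
    return lower > 0 or upper < 0


# Hypothetical estimate and standard error.
lower, upper = approximate_ci(0.20, 0.08)
```

Because the interval width depends only on the standard error, an upwardly biased standard error widens the interval and makes it harder to distinguish a modest heritability estimate from 0.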
As displayed on the far left of each panel in Figure 3, we find statistically significant SNP heritability for FCI as well as DSCI across all cohorts. Results from our adjusted GREML models indicate that roughly 17% of the variance in FCI and 12% of the variance in DSCI is attributable to common genetic factors. In addition, the average heritability estimate of .07 for RSCI approaches statistical significance (p ≤ .08). However, we find no evidence for nonzero heritability for UCI or SSCI. In sum, we find only partial support for our first hypothesis that the complexity of family trajectories will be heritable.
We find little evidence that the heritability of sequence complexity changed across U.S. cohorts. Our results do indicate that the nonzero heritability estimates for FCI and DSCI may be driven by our oldest cohorts. For example, just over 31% of FCI variance for individuals born between 1909 and 1940 is estimated to be attributed to genetic factors. For the 1941–1969 cohort, the heritability of FCI is estimated to be less than 1%. Similarly, nearly 20% of the DSCI variance for the older cohort can be accounted for by genes, but we find no heritability for the younger cohort. In addition, we do find that early heritability estimates of 11% to 13% for SSCI approach statistical significance (p ≤ .10). In sum, our results do not support our second hypothesis that the heritability of family trajectory complexity will increase across birth cohorts. Rather, our results tend to support our third hypothesis that heritability may decrease across cohorts, although we find no statistically significant differences between heritability estimates across cohorts.
GREML Heritability Estimates for Components of Complexity Across Birth Cohorts
SNP heritability estimates for select transition components of our sequence complexity measures and their 95% confidence intervals are displayed in Figure 4. Heritability estimates for select age components of complexity are presented in Figure 5. We present estimates for the transition to the first and second marriage, the first divorce, and the first three children because these transitions are highly correlated with our complexity indicators (see the online appendix). In addition, we present heritability estimates for the age at first marriage as well as the age at first and second birth. The age at transitions is important because the degree of uncertainty (i.e., longitudinal entropy) in sequences is affected by whether transitions occur early in the life course and lead to stability or occur late in the life course and indicate more turbulence. Similar to Figure 3, the average estimate is located on the far left of each panel followed by both cohort estimates. All estimates can also be found in Table 1.
Although we find statistically significant heritability estimates for the transition to parenthood and the transition to divorce, our results point to 0 heritability for the transition to the second child and the transition to first marriage. Roughly 20% of the variance in the transition to a first child is attributable to genetic factors for the 1909–1940 cohort. However, we find no evidence of heritability for our younger cohort. We also find nonzero heritability for the transition to a fourth child (see the online appendix), between 19% and 27%, but not for the transition to a second or third child. This finding is interesting because it indicates that estimates on the number of children may be driven by the transition to parenthood and the transition to high-parity births. We find similar results for the transition to divorce. On average, 9% of the variance in the transition to the first divorce is attributable to genetic factors; however, this ranges from more than 20% for the older birth cohort to nonsignificant heritability for the younger birth cohort. Whereas the heritability of the transition to a first marriage is not statistically significant, the transition to a second marriage is heritable.
The heritability estimates for the age at each family demographic transition are more often nonzero and more persistent than those for whether the transition occurs, for two reasons. First, the standard errors of the case-control GREML models presented here are larger than in standard GREML models for continuous phenotypes. Second, transitioning into marriage and parenthood is a near-universal event for our cohorts, but when those transitions occur varies considerably. On average, just under 15% of the age variation for those transitioning to parenthood is attributable to genetic factors. The cohort estimate is 27% for the older cohort but is not statistically significant for the younger cohort. In contrast to the transition to a second birth, the heritability estimate for the age at second birth increases from nearly 17% to 20% (p ≤ .08). Similarly, we find nonzero heritability for the age at first marriage for both birth cohorts, ranging from roughly 18% for the older to 22% for the younger cohort, although not for the age at the second marriage (see online supplement).
The results for these select family demographic transition and age indicators demonstrate that the heritability estimates for sequence complexity can be conceived as a weighted average of its components. For example, the heritability of FCI is nonzero for older cohorts for whom the heritability of the transition to parenthood and the age at that transition are also nonzero. For younger cohorts, the heritability of FCI becomes nonsignificant as the heritability of those same indicators reaches 0. The heritability of DSCI seems to be driven mostly by elements of fertility but is augmented by the age at marriage and the transition to divorce. Therefore, the heritability of complexity holistically captures fertility, union, and family life course processes.
Discussion
In this study, we apply a biodemographic approach to the study of family life course complexity using methods developed in molecular genetics. Specifically, we estimate the heritability of family life course complexity as well as a wide range of family demographic indicators using GCTA-GREML on data from the HRS. Based on findings from behavioral genetics research on twins, we hypothesize that family life course complexity and its components would be moderately heritable. Further, based on arguments by Udry (1996; see also Kohler et al. 2006) that heritability increases in contexts characterized by fewer social constraints, we hypothesize that the heritability of family life course complexity and its components would increase across birth cohorts in the United States. In contrast, we extend sociogenomic perspectives found in the social stratification literature to hypothesize that heritability may decrease as societies become more unequal and social backgrounds become more important (Adkins and Vaisey 2009). Although we indeed find moderate heritability for the complexity of fertility and differentiated sequences as well as other family demographic indicators (H1), we find no evidence for increasing heritability (H2) but rather some indication for decreasing heritability (H3).
Our results support micro-biosocial models suggesting that family demographic (e.g., Kohler et al. 2006; Udry 1996) and sociological (Freese 2008) outcomes are heritable. Given the composite nature of family life course complexity, which incorporates fertility, union formation, and union dissolution, it seems plausible that genetic factors influence complexity directly through biological mechanisms and indirectly through intermediate phenotypes, such as personality. Our average SNP heritability estimates for age at first birth are in line with those estimated by Tropf and colleagues (Tropf, Stulp et al. 2015) on a pooled sample of UK and Dutch women. As would be expected, SNP heritability estimates for marriage and divorce are somewhat lower than earlier estimates using twin data (Jocklin et al. 1996; Johnson et al. 2004, for marriage; McGue and Lykken 1992, for divorce). One contribution of this study is not only to update heritability estimates for several family demographic outcomes based on twin studies (e.g., the transition to marriage and divorce) but also to provide the first heritability estimates for family demographic transitions and the age at those transitions for the United States using molecular genetic data. In addition, we show that composite life course outcomes (i.e., family complexity) are heritable and demonstrate which family demographic behaviors drive the heritability of family complexity. For our older cohort, the transition to a first child and the age at first birth especially drive the heritability of family life course complexity.
However, our results do not validate Udry's (1996) hypothesis that the genetic influence on voluntary behavioral outcomes (e.g., entering parenthood and marriage) will increase in societies characterized by the SDT. We find no evidence that the heritability of the complexity of partnership and less-differentiated family trajectories changed over time, but we do find evidence for a decrease in the complexity of fertility and differentiated family trajectories. Therefore, our findings tend to lend support to a sociogenomic perspective in social stratification that links increasing social inequality to decreasing heritability (see Adkins and Vaisey 2009). In a context of increasing social inequality and economic uncertainty, the characteristics associated with family demographic behavior are increasingly influenced by individuals' social backgrounds (Mills and Blossfeld 2003, 2005, 2013). As social backgrounds (i.e., environmental factors) become increasingly relevant for family demographic behavior, the relative importance of genes begins to decrease. Although our heritability estimates tend to be smaller for our younger cohort, these differences are not statistically significant. We are able to demonstrate this trend for both composite metrics (the complexity of fertility and differentiated family trajectories) and their components (especially the transition to and age at first birth) in GREML models that are additionally adjusted for educational attainment (see the online appendix). In sum, our results are important for life course sociologists and family demographers. Not only do genes affect holistic life course outcomes, but they may also drive some of the cross-temporal changes that have been found (Baizan et al. 2002; Bras et al. 2010; Chaloupková 2010; Robette 2010; Van Winkle 2018).
The estimation of heritability with GREML and SNPs has limitations (for an overview, see Krishna Kumar et al. 2016). The GCTA-GREML approach is valid only for traits with genetic variants located throughout the entire genome that are weakly associated with that trait (Domingue, Wedow et al. 2016) because the GRM only approximates genome-wide genetic similarity if the sample of SNPs and their association with a trait serve as a reasonably good proxy for the entire genome. SNP heritability estimates may be biased if a genetic variant with an exceptionally large association is missing. However, it is unlikely that our heritability estimate for sequence complexity, which is by definition highly polygenic because of its composite nature, is biased in such a manner.
A further limitation that GCTA shares with other estimation methods for heritability is the “black box” nature of heritability. Without knowing the particular alleles associated with complexity and the phenotypic effects of those loci, it is not possible to follow the causal biological and social pathways connecting genetic factors with complexity. A possible first step would be to estimate chromosome-specific SNP heritability because molecular geneticists have identified specific chromosomes that are associated with phenotypes to a greater or lesser degree. Another approach would be to conduct a novel genome-wide association study on life course complexity to learn about the specific effect distribution across loci, although this approach would require much larger discovery samples. With sample sizes that are smaller—but that exceed what we have in this study—researchers could use bivariate GREML models to test the genetic correlation between two phenotypes, such as complexity and educational attainment (Deary et al. 2012; Lee et al. 2012). This path would allow scholars to inch toward untangling the causal mechanisms involved in the heritability of family life course complexity. Results from a GWAS could also inform the genetic correlations between life course complexity and other social/behavioral and health phenotypes through bivariate linkage disequilibrium score regression analysis (Bulik-Sullivan et al. 2015). For example, we find evidence that the genetic correlation between the age at first birth and educational attainment has increased across birth cohorts (see the online appendix), the heritability of educational attainment has increased (cf. Conley et al. 2016), and the heritability of the age at first birth has decreased. 
Such differentially shifting heritabilities of the components of life course complexity suggest that scholars should always decompose complex phenotypes into their endophenotypes to understand shifting genetic and environmental influences on complex phenotypes. That is, even if there is no change in the overall, complex, composite, or downstream phenotype, the relative contribution of its constituent phenotypes to its constant heritability may have shifted because of demographic or other societal changes, and such changes should be of interest. In addition, our particular finding has implications for theories that link cohort change in the heritability of family demographic behavior to educational attainment, such as Udry's (1996) and Kohler et al.'s (2006) hypothesis on behavioral constraints. That is, although social restraints of different sorts have been lifted for all these endophenotypes, they have not universally increased in their heritability, suggesting that a more nuanced theory is in order. This need for case-specific predictions about changing genetic penetrance by phenotype as constraints are lifted has been underscored by others and is evident from contrasting BMI (where caloric constraints were lifted and h² increased) with smoking (where informational, tax, and other constraints were imposed but h² also increased over the same period) (Conley et al. 2016; Domingue, Conley et al. 2016).
As mentioned earlier, to perform bivariate GREMLs and GWAS, researchers will need much larger data sets with detailed life course and molecular data. A possible limitation of our analyses is our restricted power to detect changes over time because of our relatively small sample sizes. Our ancillary power analyses demonstrate that our heritability estimates need to lie between .3 and .4 to identify differences that statistically diverge from a heritability estimate of 0. For fertility complexity, the heritability estimates for the oldest birth cohorts are at the lower end of this range. Future research with larger data sets should replicate these analyses.
What role can family, labor market, and welfare policy play in societies where social demographic outcomes are heritable? Herrnstein and Murray's (1994) deeply controversial book, The Bell Curve, argued that the United States had become a “genotocracy” (a term applied later by Conley and Fletcher 2017): a society where social stratification is based on genetic differences and social policy can only alleviate the consequences of those differences. Recently, Conley and Fletcher (2017) published a systematic critique of this perspective that stratification patterns in Western societies are genetically based and unalterable. Even if heritability is extremely high, it does not indicate genetic determinism. The impact of genetic factors on sociological and demographic outcomes is dependent on the social environment and must be contextualized. The reaction of individuals with similar genotypes likely varies across social contexts, such as when institutions affect the relationship between intermediate phenotypes and social processes. There is also increasing evidence in the area of epigenetics that environments trigger or suppress the manifestation of genetic factors at a molecular level (Landecker and Panofsky 2013). Our results in no way support arguments that stratification patterns are static because of genetic influences and cannot be altered by social policy.
Our results demonstrate that holistic life course outcomes can also be heritable and that life course sociologists should begin to incorporate biodemography into their research. A possible starting point is to estimate heritability for more life course outcomes that are of interest. For example, estimating the heritability of life course patterns derived from cluster analysis and the heritability of deviations from those patterns would be an obvious next step. Depending on the outcomes of those analyses, life course sociologists may need to add existing educational and fertility polygenic scores (PGS) to analyses that estimate the association between individual characteristics and life course patterns. This could give sociologists leverage on whether educational attainment is causally associated with the choice to follow a certain life course path or whether unobserved intermediate phenotypes are confounding our analyses. Additionally, the development of a PGS for sequence complexity itself would be a welcome addition to the biodemography literature. Such a score would allow the direct prediction of sequence complexity and help researchers determine whether genetic components bias the estimated effects of the social factors that are included in such prediction models. Ultimately, a PGS will allow for more gene-environment interaction research on sequence complexity, giving researchers a fuller picture of how nature and nurture codetermine our demographic life course histories.
Acknowledgments
We thank Anette Fasang, Elizabeth Thomson, Emanuela Struffolino, Tina Baier, Felix Tropf, Boriana Pratt, and members of the Princeton Biosociology Lab as well as the editors and two reviewers of Demography for insightful and constructive comments at various stages of the manuscript. We gratefully acknowledge funding from the International Strategy Office of the Humboldt University Berlin, the Humboldt University-Princeton Partnership, the Robert Wood Johnson Foundation, and Pioneer Award as well as funding from the European Research Council under the European Union's Horizon 2020 research and innovation program under Grant Agreement No. 681546 (FAMSIZEMATTERS).
Notes
Examples of nonadditive genetic effects are epistasis (i.e., interactions between genetic variants), dominance deviations, suppression of genetic variants through other genetic variants, and gene-environment interactions.
Autosomal chromosomes are non-sex chromosome pairs 1–22.
GREML assumes that the average effect size on the outcome of interest per standardized SNP is minute and normally distributed. The distribution assumption also implies that SNPs with low minor allele frequencies have larger effects. Therefore, rare alleles are removed to ensure that heritability estimates are not based on a rare variant found only in a select population (Yang et al. 2017:1307–1308).