## Abstract

Theoretical models of mortality selection have great utility in explaining otherwise puzzling phenomena. The most famous example may be the Black-White mortality crossover: at old ages, Blacks outlive Whites, presumably because few frail Blacks survive to old ages while some frail Whites do. Yet theoretical models of unidimensional heterogeneity, or frailty, do not speak to the most common empirical situation for mortality researchers: the case in which some important population heterogeneity is observed and some is not. I show that, when one dimension of heterogeneity is observed and another is unobserved, neither the observed nor the unobserved dimension need behave as classic frailty models predict. For example, in a multidimensional model, mortality selection can increase the proportion of survivors who are disadvantaged, or “frail,” and can lead Black survivors to be more frail than Whites, along some dimensions of disadvantage. Transferring theoretical results about unidimensional heterogeneity to settings with both observed and unobserved heterogeneity produces misleading inferences about mortality disparities. The unusually flexible behavior of individual dimensions of multidimensional heterogeneity creates previously unrecognized challenges for empirically testing selection models of disparities, such as models of mortality crossovers.

## Introduction

The classical mortality selection model is a triumph of formal demography. It starts from the premise that people vary systematically in mortality risk and derives the conclusion that cohorts are progressively reduced to a group of robust survivors. Models of mortality selection have been used to explain phenomena such as mortality crossovers, reversals in the sign of a disparity (e.g., Berkman et al. 1989; Dupre et al. 2006; Eberstein et al. 2008; Fenelon 2013; Guillot 2007; Hoffmann 2008; Huang and Wu 2010; Kestenbaum 1992; Lynch et al. 2003; Manton et al. 1979; Mohtashemi and Levins 2002; Nam 1995; Nam et al. 1978; Rogers 2002; Thornton 2004; Thornton and Nam 1968; Zeng and Vaupel 2003); mortality deceleration, the slowing of mortality’s rise with age (e.g., Beard 1959, 1971; Fukui et al. 1993; Horiuchi and Wilmoth 1997, 1998; Kannisto 1992; Lynch and Brown 2001; Lynch et al. 2003; Olshansky 1998; Steinsaltz and Wachter 2006; Thatcher et al. 1998; Vaupel et al. 1979; Vaupel and Yashin 1985); and mortality compression, the concentration of deaths into a small age range (e.g., Engelman et al. 2010; Kannisto 2000; Lynch and Brown 2001; Lynch et al. 2003).

But the classical mortality selection model does not speak to some of the most important questions in modern empirical mortality research, which concern the potential contribution of particular dimensions of heterogeneity when other important dimensions are unobserved. The classical model is unidimensional in that the heterogeneity on which mortality selection acts is captured by a single unobserved scalar fact about an individual—that is, in disciplinary jargon, whether the individual is “frail” or “robust.” All standard models of mortality selection are unidimensional in this sense, irrespective of other modeling choices.

The classic unidimensional model, in all its forms, developed at a time when old-age mortality data were limited, and creative theorizing made up for what could not yet be measured. Yet social science theories of stratification are multidimensional and intersectional, and substantive knowledge of health stratification suggests that there are many overlapping but distinct risk factors for mortality (Bowleg 2012). Increasingly, covariate-rich data sets allow some of these distinct dimensions of population heterogeneity to be measured and offer new opportunities to analyze how particular heterogeneities contribute to changing mortality disparities, rather than treating “frailty” as a black box. But because measured heterogeneity is always partial, work that foregrounds selection still needs to engage with unmeasured heterogeneity alongside measured covariates.

The need for a theory of multidimensional mortality selection is underscored by recent empirical analyses, which use unidimensional theory to ask multidimensional questions about the Black-White mortality crossover. The Black-White mortality crossover is the phenomenon that Black mortality exceeds White mortality at younger ages but falls below White mortality around age 85. The classical selection explanation for the crossover posits that Blacks as a group are subject to greater selective pressure than Whites because Blacks’ mortality is higher (e.g., Lynch et al. 2003; Nam 1995; Thornton and Nam 1968; Vaupel and Yashin 1985; Vaupel et al. 1979). Thus, old-age survivors include only the most robust members of the original Black cohort but a broader cross-section of the original White cohort. This long-standing theoretical explanation has increasingly been engaged by empirical studies (Berkman et al. 1989; Dupre et al. 2006; Sautter et al. 2012; Yao and Robert 2011) trying to identify which particular dimensions of heterogeneity might constitute this “frailty.” In current practice, such research draws theoretically on mortality selection models designed to compare full populations (e.g., Blacks vs. Whites) in the presence of unobserved heterogeneity but asks questions that rely on complex, nested comparisons (e.g., Blacks vs. Whites with and without stratifying on a consequential health risk). In this article, I show that the insights of the unidimensional selection model cannot be imported into the multidimensional setting and used to license the same kinds of predictions.

To address this gap between formal theory and empirical practice, I offer a model of the Black-White crossover in the presence of multiple dimensions of heterogeneity and investigate its behavior. This work builds on prior research into the behavior of covariates in survival models.^{1} Yashin and Manton made early advances in incorporating unobserved covariates into more empirically realistic survival analyses (Yashin and Manton 1997) and in estimating unobserved heterogeneity from survival models with an observed covariate and an assumed baseline distribution of the unobserved covariate (Yashin et al. 1985). More broadly, an early line of research promised to meld the theoretical precision of mortality selection modeling with the empirical richness of new longitudinal data. This tradition explored multidimensional models that focused on single-population phenomena, such as mortality deceleration (Manton et al. 1994, 1995; Manton and Woodbury 1983; Woodbury and Manton 1983), and more recently was picked up in theoretical work by Finkelstein and Esaulova (2008) and Finkelstein (2012). Finkelstein (2012) considered a two-dimensional frailty model of mortality deceleration and suggested an approach used in the analysis here: successively breaking populations into heterogeneous subpopulations defined by a single dimension of frailty and then breaking those subpopulations into homogeneous groups. But whereas studying mortality deceleration involves analyzing a single population, understanding mortality crossovers requires comparing selection processes unfolding in multiple populations (e.g., Blacks and Whites). Other analyses (Bretagnolle and Huber-Carol 1988; Henderson and Oman 1999; see discussion in Wienke 2010:127–130) that, like the current study, model multiple observed covariates in the presence of unobserved heterogeneity, focus on quantifying the bias in estimated covariate effects. 
Each of these strands of prior research forms the lineage for the current study, which asks a different set of questions: what happens to a mortality disparity—such as the Black-White disparity—when we incorporate a new covariate hypothesized to be part of the mortality selection process? Does the disparity change in a predictable way? In particular, how does the presence, absence, or timing of a mortality crossover change when the “frailty” that produces the crossover is partially adjusted for?

In short, I analyze whether the insights developed in unidimensional mortality selection theory can be extended to incorporate covariates representing partial measures of population heterogeneity. I show that, in general, they cannot. Multidimensional mortality selection and unidimensional mortality selection offer similar perspectives when all of the heterogeneity in a population is observed or when none of it is observed. But unidimensional heterogeneity models offer no clear guidance about a multidimensional reality in which some dimensions of heterogeneity are observed but others remain unobserved. Yet this is the most common situation for social scientists studying mortality with data sets that include social and biological covariates representing only some of the heterogeneity within each population. The fact that individual dimensions of “frailty” need not behave like frailty as a whole implies that when selection is occurring along multiple dimensions simultaneously, one cannot recover how it occurs along any one dimension (even qualitatively) without accounting for the other dimensions. This also implies that stratifying the crossover on observed heterogeneity offers quite limited information about the underlying selection processes. Nevertheless, I also show that it is possible to make some predictions about how the age at crossover responds to stratifying on key dimensions of heterogeneity, if certain assumptions can be made. This provides a direction for developing multidimensional selection theory.

I proceed by first presenting the core features of unidimensional mortality selection and then contrasting it with multidimensional mortality selection. For the multidimensional model, I outline two (alternative) predictions about how conditioning on an observed dimension of heterogeneity, in the presence of unobserved heterogeneity, should move the age at crossover. Neither prediction is supported: conditioning on partial measures of “frailty” has essentially unpredictable consequences without quite specific assumptions. I show that key facts about unidimensional heterogeneity do not hold for individual dimensions of partially observed multidimensional heterogeneity, highlighting some previously unrecognized theoretical possibilities, such as *frailty increases* (mortality selection can lead populations to become more frail as they age) and *frailty reversals* (mortality selection can lead Black survivors to be more frail than White survivors), that result from the intrinsic interactivity of multidimensional models.

Throughout, I adhere to the following terminological conventions. I consider two *populations*: Blacks and Whites. Populations may be stratified by one or two dimensions of heterogeneity, which may be unobserved or observed. The unobserved dimension of heterogeneity is always called *frailty* (in the unidimensional model) or *residual frailty* (in the multidimensional model), and the observed dimension of heterogeneity is called *exposure*. I call populations stratified by one dimension of heterogeneity *subpopulations*—for example, the subpopulation of robust Blacks or the subpopulation of exposed Whites. I call populations stratified by two dimensions of heterogeneity *groups*—for example, the group of exposed robust Blacks or the group of unexposed frail Whites. All populations, subpopulations, and groups are analyzed as closed cohorts.

## Mortality Selection With Unidimensional Heterogeneity

The classic model of mortality selection with unidimensional heterogeneity will serve as a baseline for the distinctive dynamics of mortality selection with multidimensional heterogeneity.

### Unidimensional Mortality Selection Model

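Each racial population is stratified by a single unobserved dimension of heterogeneity, *frailty*. I analyze frailty as a binary variable, which results in four internally homogeneous subpopulations defined by race *k* = {*b*, *w*} and frailty *j* = {*f*, *r*}. Binary frailty allows the article’s insights to be expressed in the simplest and the most direct way; I discuss the implications of this modeling choice after I introduce the multidimensional model. Frailty is unobserved. The subpopulations have proportional Gompertz hazards,

$$\mu_{k,j}(a) = \alpha_{k,j}\, e^{\beta a}, \tag{1}$$

with age *a* ≥ 0 and intercepts α_{k,j}. The subpopulation-specific intercepts are defined as

$$\alpha_{w,r} = \alpha, \qquad \alpha_{w,f} = \alpha f, \qquad \alpha_{b,r} = \alpha b, \qquad \alpha_{b,f} = \alpha b f. \tag{2}$$

Thus, conditional on frailty, Black subpopulations have higher mortality than White subpopulations in proportion *b* > 1 (the *Black mortality multiplier*); and conditional on race, frail subpopulations have higher mortality than robust subpopulations in proportion *f* > 1 (the *frail mortality multiplier*).^{2}

Aggregate mortality for each race, $\bar{\mu}_k(a)$, is the weighted average of the mortality of its frail and robust subpopulations,

$$\bar{\mu}_k(a) = \pi_k(a)\,\mu_{k,f}(a) + [1 - \pi_k(a)]\,\mu_{k,r}(a), \tag{3}$$

where 0 ≤ π_{k}(*a*) ≤ 1 is the proportion of race *k* that is frail, and 1 − π_{k}(*a*) is the proportion of race *k* that is robust at age *a*. The proportion frail, in turn, is given by

$$\pi_k(a) = \frac{1}{1 + \dfrac{1 - \pi(0)}{\pi(0)} \cdot \dfrac{S_{k,r}(a)}{S_{k,f}(a)}}, \tag{4}$$

where $[1 - \pi(0)]/\pi(0)$ is the ratio of robust to frail individuals at baseline (assumed to be the same for both races^{3}); and $S_{k,r}(a)/S_{k,f}(a)$ is the ratio of robust to frail survivors within each race at age *a*, where $S_{k,j}(a) = \exp\left(-\int_0^a \mu_{k,j}(u)\,du\right)$. Because the frail die more quickly than the robust, the survivorship ratio increases and the proportion frail (π_{k}(*a*)) decreases monotonically with age. Because Blacks always have higher mortality in each subpopulation but not necessarily in the aggregate, the crossover is an example of Simpson’s paradox (e.g., Hernán et al. 2011; Hutchinton et al. 2000).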
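The model just described can be sketched numerically. The following is a minimal illustration, not the article's own code: the parameter values (`ALPHA`, `BETA`, `B`, `F`, `PI0`) and all function names are hypothetical and were chosen only so that a crossover appears at a plausible age. It assumes the Gompertz form and multiplicative intercepts described in the text.

```python
import math

# Hypothetical parameter values chosen only for illustration (not from the article).
ALPHA = 1e-5   # Gompertz intercept of the robust White subpopulation
BETA = 0.1     # Gompertz slope shared by all subpopulations
B = 1.5        # Black mortality multiplier, b > 1
F = 10.0       # frail mortality multiplier, f > 1
PI0 = 0.9      # proportion frail at age 0, shared by both races


def intercept(race, frailty):
    """alpha_{k,j}: baseline intercept scaled by the applicable multipliers."""
    return ALPHA * (B if race == "b" else 1.0) * (F if frailty == "f" else 1.0)


def hazard(race, frailty, age):
    """Subpopulation hazard mu_{k,j}(a) = alpha_{k,j} * exp(beta * a)."""
    return intercept(race, frailty) * math.exp(BETA * age)


def cum_hazard(race, frailty, age):
    """Integrated Gompertz hazard from 0 to age, so S_{k,j}(a) = exp(-cum_hazard)."""
    return intercept(race, frailty) * (math.exp(BETA * age) - 1.0) / BETA


def proportion_frail(race, age):
    """pi_k(a), written with the survivorship ratio S_{k,r}(a) / S_{k,f}(a)."""
    surv_ratio = math.exp(cum_hazard(race, "f", age) - cum_hazard(race, "r", age))
    return 1.0 / (1.0 + (1.0 - PI0) / PI0 * surv_ratio)


def aggregate_hazard(race, age):
    """Aggregate race-specific mortality: frailty-weighted average of hazards."""
    p = proportion_frail(race, age)
    return p * hazard(race, "f", age) + (1.0 - p) * hazard(race, "r", age)


def first_crossover_age(max_age=110):
    """First integer age at which aggregate Black mortality falls below White."""
    for age in range(max_age + 1):
        if aggregate_hazard("b", age) < aggregate_hazard("w", age):
            return age
    return None


if __name__ == "__main__":
    print("first crossover age:", first_crossover_age())
    for age in (0, 40, 80):
        print(age, round(proportion_frail("b", age), 3), round(proportion_frail("w", age), 3))
```

With binary frailty, both populations eventually consist almost entirely of robust survivors, so a crossover found this way can later reverse; the sketch reports only the first age at which Black aggregate mortality drops below White aggregate mortality.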

### Four Facts About This Unidimensional Heterogeneity Model of Racial Disparities

The unidimensional frailty model of the Black-White mortality crossover just presented makes two key assumptions, from which two results follow.

First, by assumption, conditional on frailty, Black mortality exceeds White mortality at every age, μ_{b, j}(*a*) > μ_{w, j}(*a*).

Second, by assumption, conditional on race, the frail have higher mortality than the robust at every age, μ_{k, f} (*a*) > μ_{k,r} (*a*).

From the second assumption, it follows that, within each race, the proportion of survivors who are frail declines monotonically over age, π_{k}(*a* + 1) < π_{k}(*a*) (Vaupel and Yashin 1985).

From the first and second assumptions together, it follows that if Blacks and Whites have the same proportion frail at baseline, Blacks have a smaller proportion of frail survivors than do Whites at every subsequent age, π_{b}(*a*) < π_{w}(*a*), *a* > 0 (Vaupel et al. 1979). Thus, the racial difference in the frailty of survivors results from the interaction of the between-race disadvantage of Blacks and the within-race disadvantage of the frail.

These four generalizations provide a crucial point of comparison for the multidimensional model discussed later.^{4}
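Both results can be read off the log-odds of frailty among survivors. The following derivation is a sketch under the model's stated assumptions (binary frailty, a shared baseline proportion frail π(0), and proportional Gompertz hazards with shared slope β); the logit notation, and the use of α for the robust White intercept, are conveniences introduced here rather than the article's own presentation.

$$
\operatorname{logit}\,\pi_k(a) \;=\; \operatorname{logit}\,\pi(0) \;-\; \int_0^a \left[\mu_{k,f}(u) - \mu_{k,r}(u)\right] du
$$

Differentiating with respect to age gives, whenever $0 < \pi_k(a) < 1$,

$$
\frac{d\pi_k(a)}{da} \;=\; -\,\pi_k(a)\,[1 - \pi_k(a)]\,\left[\mu_{k,f}(a) - \mu_{k,r}(a)\right] \;<\; 0,
$$

which is the monotonic decline in the proportion frail. Substituting the Gompertz hazards, with Black intercepts equal to *b* times the corresponding White intercepts and frail intercepts equal to *f* times the corresponding robust intercepts, the racial gap in log-odds is

$$
\operatorname{logit}\,\pi_b(a) - \operatorname{logit}\,\pi_w(a) \;=\; -\,\frac{\alpha\,(b-1)(f-1)}{\beta}\left(e^{\beta a} - 1\right) \;<\; 0, \qquad a > 0 .
$$

The product $(b-1)(f-1)$ makes the interaction explicit: the gap vanishes if either the between-race disadvantage ($b = 1$) or the within-race disadvantage ($f = 1$) is absent.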

### Two Roles of Unidimensional Frailty

The interaction between the disadvantage of Blacks and the disadvantage of the frail is depicted visually in panel a of Fig. 1, which illustrates the functional relationships given in Eqs. (1)–(4) and provides a point of comparison for the multidimensional model introduced in the next section. Panel a shows that the Black mortality multiplier affects the mortality of robust and frail Blacks, whereas the frail mortality multiplier affects the mortality of frail Blacks and Whites. The mortality of both frail and robust Blacks affects the proportion of Black survivors who are frail, and each of those three terms, in turn, affects aggregate Black mortality; likewise for Whites. (Panel a of Fig. 1 omits the parameters that all subpopulations share: α, β, and π(0).)

Panel b of Fig. 1 zooms in on the part of panel a depicting the effects of the frail mortality multiplier, *f*, on aggregate race-specific mortality, $\bar{\mu}_k(a)$, with +/– marks indicating the sign of each effect.^{5} It shows that *f* plays two competing roles in aggregate mortality. On one hand, increasing *f* raises aggregate mortality by raising the mortality of the frail, $f \xrightarrow{+} \mu_{k,f}(a) \xrightarrow{+} \bar{\mu}_k(a)$. On the other hand, because *f* raises the mortality of the frail at each age, it also lowers aggregate mortality by reducing the proportion of the frail who survive to old age, $f \xrightarrow{+} \mu_{k,f}(a) \xrightarrow{-} \pi_k(a) \xrightarrow{+} \bar{\mu}_k(a)$. The functional relationships shown qualitatively here are given quantitatively in section 1 of the online appendix.^{6} These two roles of frailty interact to create the potential for a crossover.
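The two competing roles of *f* can be seen in a small numerical sketch. This is an illustration under assumed, hypothetical parameter values (`ALPHA`, `BETA`, `PI0`) and a hypothetical function name, not a reproduction of the online appendix: it compares aggregate mortality for one race under a lower and a higher frail mortality multiplier, at a young age and at an old age.

```python
import math

# Hypothetical values for illustration only (not from the article).
ALPHA, BETA, PI0 = 1e-5, 0.1, 0.9


def aggregate_hazard(f, age, b=1.0):
    """Return (aggregate hazard, proportion frail) for one race.

    f is the frail mortality multiplier; b=1.0 corresponds to Whites.
    """
    alpha_r = ALPHA * b                        # intercept of the robust
    growth = math.exp(BETA * age)
    # Excess cumulative hazard of the frail over the robust up to this age.
    excess = alpha_r * (f - 1.0) * (growth - 1.0) / BETA
    pi = 1.0 / (1.0 + (1.0 - PI0) / PI0 * math.exp(excess))
    # Weighted average of frail (f * mu_r) and robust (mu_r) hazards.
    return alpha_r * growth * (pi * f + (1.0 - pi)), pi


if __name__ == "__main__":
    for age in (40, 90):
        for f in (10.0, 20.0):
            mu, pi = aggregate_hazard(f, age)
            print(f"age={age} f={f}: aggregate hazard={mu:.6f}, proportion frail={pi:.6f}")
```

Under these assumed values, the direct path dominates at the younger age (a larger *f* raises aggregate mortality), whereas the compositional path dominates at the older age (a larger *f* has removed nearly all of the frail, lowering aggregate mortality).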

### The Black-White Mortality Crossover With Unidimensional Heterogeneity

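The Black-White difference in aggregate mortality can be decomposed into three terms:

$$\bar{\mu}_b(a) - \bar{\mu}_w(a) = \pi_b(a)\left[\mu_{b,f}(a) - \mu_{w,f}(a)\right] + \left[1 - \pi_b(a)\right]\left[\mu_{b,r}(a) - \mu_{w,r}(a)\right] + \left[\pi_b(a) - \pi_w(a)\right]\left[\mu_{w,f}(a) - \mu_{w,r}(a)\right]. \tag{5}$$

The key point is that the four facts about mortality with unidimensional heterogeneity fully determine the sign of all three terms.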

The first term is the Black-White difference in the mortality of the frail, μ_{b,f}(*a*) − μ_{w,f}(*a*), weighted by the proportion of the Black population that is frail, π_{b}(*a*). The second term is the Black-White difference in the mortality of the robust, μ_{b,r}(*a*) − μ_{w,r}(*a*), weighted by the proportion of the Black population that is robust, 1 − π_{b}(*a*). These two terms are always positive because Black mortality is always higher than White mortality, conditional on frailty.

The third term is the Black-White difference in the proportion frail, π_{b}(*a*) − π_{w}(*a*), weighted by the frail-robust difference in the mortality of Whites, μ_{w,f}(*a*) − μ_{w,r}(*a*). This term is always negative because its two factors have different signs: the Black-White difference in the proportion frail is always negative, whereas the frail-robust mortality difference among Whites is always positive.^{7} I call this third term the *frailty factor*. It represents the contribution of frailty-induced mortality selection to the racial difference in mortality. The frailty factor illuminates the dynamics of the multidimensional heterogeneity model presented in the next section.

Equation (5) highlights the tradeoff at the heart of the crossover in the unidimensional selection model: higher Black disaggregated mortality but lower frailty among Black survivors. This makes crossover dynamics with unidimensional heterogeneity qualitatively simple: the only question is whether and when the White-Black compositional difference will outweigh the Black mortality disadvantage at the subpopulation level.
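The three-term decomposition can be checked numerically. The sketch below uses hypothetical parameter values and function names (none are from the article) and assumes the Gompertz hazards and multiplicative intercepts of the unidimensional model. It computes the frail term, the robust term, and the frailty factor, and compares their sum with the directly computed Black-White difference in aggregate mortality.

```python
import math

# Hypothetical values for illustration only (not from the article).
ALPHA, BETA, B, F, PI0 = 1e-5, 0.1, 1.5, 10.0, 0.9


def components(race, age):
    """Return (mu_frail, mu_robust, proportion frail) for one race at one age."""
    alpha_r = ALPHA * (B if race == "b" else 1.0)
    growth = math.exp(BETA * age)
    mu_r = alpha_r * growth
    mu_f = F * mu_r
    excess = alpha_r * (F - 1.0) * (growth - 1.0) / BETA
    pi = 1.0 / (1.0 + (1.0 - PI0) / PI0 * math.exp(excess))
    return mu_f, mu_r, pi


def decompose(age):
    """Return the three terms of the decomposition and the direct difference."""
    mu_bf, mu_br, pi_b = components("b", age)
    mu_wf, mu_wr, pi_w = components("w", age)
    frail_term = pi_b * (mu_bf - mu_wf)               # always positive
    robust_term = (1.0 - pi_b) * (mu_br - mu_wr)      # always positive
    frailty_factor = (pi_b - pi_w) * (mu_wf - mu_wr)  # negative for age > 0
    direct = (pi_b * mu_bf + (1.0 - pi_b) * mu_br) - (pi_w * mu_wf + (1.0 - pi_w) * mu_wr)
    return frail_term, robust_term, frailty_factor, direct


if __name__ == "__main__":
    for age in (40, 80):
        frail, robust, factor, direct = decompose(age)
        print(age, frail, robust, factor, "sum:", frail + robust + factor, "direct:", direct)
```

A crossover corresponds to an age at which the negative frailty factor outweighs the two positive terms, so that the direct difference becomes negative.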

The first column of Table 1 summarizes these key features of mortality selection with unidimensional heterogeneity, which will serve as a point of comparison with the multidimensional model.

## Mortality Selection With Multidimensional Heterogeneity

The unidimensional mortality selection model is the central reference point for work on mortality crossovers. But it is the wrong reference point for recent empirical work on the Black-White mortality crossover, which is fundamentally multidimensional. These studies (Dupre et al. 2006; Sautter et al. 2012) ask what happens to the crossover when a particular dimension of heterogeneity is observed and other dimensions remain unobserved. They stratify on the observed dimension of heterogeneity and compare the ages at Black-White crossover of the resulting subpopulations with the age at crossover of the aggregate populations. To formalize the theory implicit in this practice, I propose a model of mortality selection with partially observed multidimensional heterogeneity, show that it behaves quite differently from the unidimensional model, and analyze the crossover age in the new model.

### Multidimensional Mortality Selection Model

To demonstrate that multidimensional selection models with partially observed heterogeneity exhibit intrinsically different behaviors than the classical unidimensional model, I present a multidimensional model that differs from it in only one respect: each racial population is crosscut by not one but two dimensions of fixed heterogeneity. The observed dimension of heterogeneity describes whether people suffered a deleterious *exposure*, such as tobacco exposure in utero, given that maternal smoking satisfies the model assumptions tolerably well in that it raises mortality, is fixed at birth, and is relatively evenly distributed by race (Curtin and Mathews 2016). The unobserved dimension of heterogeneity describes whether people are *residually frail* or residually robust.

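Crossing race with the two dimensions of heterogeneity yields eight internally homogeneous groups defined by race, *k* = {*b*, *w*}, observed exposure, *i* = {*t*, *n*}, and unobserved residual frailty, *j* = {*f*, *r*}. The groups have proportional Gompertz hazards,

$$\mu_{k,i,j}(a) = \alpha_{k,i,j}\, e^{\beta a},$$

with age *a* ≥ 0 and group-specific intercepts α_{k, i, j}. The intercepts are defined as

$$\alpha_{k,i,j} = \alpha \cdot b^{\mathbf{1}[k=b]} \cdot t^{\mathbf{1}[i=t]} \cdot (f^{*})^{\mathbf{1}[j=f]},$$

where $\mathbf{1}[\cdot]$ equals 1 when its condition holds and 0 otherwise; *b* > 1 is the Black mortality multiplier, as before; *f*^{∗} > 1 is the residual frailty mortality multiplier; and *t* > 1 is the exposure mortality multiplier. (The exposed groups are designated with *t* as in tobacco exposure, or treatment.) I assume that, at baseline, both unobserved residual frailty and observed exposure are distributed independently of race, although not necessarily independently of each other.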

The group-specific mortalities are analogous to the unidimensional model’s subpopulation-specific mortalities. In the multidimensional model, each set of subpopulations defined by one dimension of heterogeneity, aggregating over the other dimension (e.g., tobacco-exposed Whites, aggregated over residual frailty), is a separate instantiation of the unidimensional model.

If both dimensions of heterogeneity were observed, then the Black and White populations could be analyzed straightforwardly in terms of their component groups. If neither dimension of heterogeneity was observed, then the Black and White populations could be analyzed as having just one dimension of heterogeneity with four (rather than two) categories—that is, as a version of the classical unidimensional heterogeneity model. This multidimensional selection model speaks to a third situation at the heart of recent empirical work on the crossover: the case in which one dimension of heterogeneity is observed and the other is unobserved.

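Mortality for each subpopulation defined by race *k* and observed exposure *i*, $\bar{\mu}_{k,i}(a)$, is the weighted average of the residually frail and residually robust groups in the subpopulation,

$$\bar{\mu}_{k,i}(a) = \pi_{k,i}(a)\,\mu_{k,i,f}(a) + \left[1 - \pi_{k,i}(a)\right]\mu_{k,i,r}(a),$$

where π_{k, i}(*a*) is the proportion of frail members of the subpopulation with exposure *i*, and 1 − π_{k, i}(*a*) is the proportion robust.^{8} By assumption, $\bar{\mu}_{k,i}(a)$ is observed, but its component parts are not.

Aggregate mortality for each race *k*, $\bar{\mu}_k(a)$, is a weighted average of the subpopulation-specific mortalities, $\bar{\mu}_{k,i}(a)$:

$$\bar{\mu}_k(a) = T_k(a)\,\bar{\mu}_{k,t}(a) + \left[1 - T_k(a)\right]\bar{\mu}_{k,n}(a),$$

where T_{k}(*a*) is the proportion of each race that is exposed.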

By assumption, T_{k}(*a*) is observed, but its component parts are not. T_{k}(*a*) is defined at the population level, whereas π_{k, j}(*a*) is defined at the subpopulation level.^{9} The interaction between these two dimensions of heterogeneity drives the distinctive behavior of heterogeneity at the aggregate population level (as in T_{k}(*a*), Π_{k}(*a*)) compared with heterogeneity at the subpopulation level (as in π_{k, i}(*a*), τ_{k, j}(*a*)), which the following sections will elucidate.^{10}
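The quantities defined in this section can be computed in a few lines once group survivorship is tracked. The sketch below is illustrative only. The parameter values, the baseline joint distribution `P0`, and all names are hypothetical. It also assumes that the three multipliers combine multiplicatively in the group intercepts, which is an assumption of this sketch.

```python
import math

# Hypothetical values for illustration only (not from the article).
ALPHA, BETA = 1e-5, 0.1
B, T_MULT, F_STAR = 1.5, 2.0, 5.0   # b, t, and f* multipliers, all > 1
# Baseline joint distribution of (exposure, residual frailty), shared by both
# races; exposure and residual frailty are allowed to be correlated.
P0 = {("t", "f"): 0.3, ("t", "r"): 0.1, ("n", "f"): 0.2, ("n", "r"): 0.4}


def intercept(race, exposure, frailty):
    """Group intercept. ASSUMPTION: multipliers combine multiplicatively."""
    value = ALPHA
    if race == "b":
        value *= B
    if exposure == "t":
        value *= T_MULT
    if frailty == "f":
        value *= F_STAR
    return value


def hazard(race, exposure, frailty, age):
    """Group-specific Gompertz hazard."""
    return intercept(race, exposure, frailty) * math.exp(BETA * age)


def survivor_weights(race, age):
    """Baseline share times Gompertz survivorship for each group."""
    growth = (math.exp(BETA * age) - 1.0) / BETA
    return {g: p * math.exp(-intercept(race, g[0], g[1]) * growth) for g, p in P0.items()}


def summarize(race, age):
    """Return pi_{k,i}(a), subpopulation mortality, T_k(a), and aggregate mortality."""
    w = survivor_weights(race, age)
    out = {}
    for i in ("t", "n"):
        pi = w[(i, "f")] / (w[(i, "f")] + w[(i, "r")])   # proportion residually frail
        out["pi_" + i] = pi
        out["mu_" + i] = pi * hazard(race, i, "f", age) + (1.0 - pi) * hazard(race, i, "r", age)
    out["T"] = (w[("t", "f")] + w[("t", "r")]) / sum(w.values())   # proportion exposed
    out["mu"] = out["T"] * out["mu_t"] + (1.0 - out["T"]) * out["mu_n"]
    return out


if __name__ == "__main__":
    for age in (0, 60, 90):
        print(age, {k: round(v, 5) for k, v in summarize("b", age).items()})
```

In this sketch, only `mu_t`, `mu_n`, `T`, and `mu` correspond to quantities the text treats as observed; `pi_t` and `pi_n` are computed here because a simulation has access to the unobserved dimension.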

### A Note on Model Interpretation

Two key parametric forms dominate the frailty literature: binary (e.g., Lynch et al. 2003; Vaupel and Yashin 1985; Wrigley-Field 2014) and gamma-distributed (e.g., Gampe 2010; Horiuchi and Wilmoth 1998; Manton and Stallard 1981; Vaupel et al. 1979; Wienke et al. 2003) frailty. In general, researchers use gamma-distributed frailty when the point is to better match empirical reality because much consequential variation between individuals is continuous, and they often use binary frailty when the point is to produce conceptual insight about the nature of selection dynamics—for example, the stylized selection models in the classic work of Vaupel and Yashin (1985). The study here is in the latter tradition. I use a binary frailty model because my aims are pedagogical, and the core intuitions are easier to grasp in a simplified context. (The online appendix discusses some of the results using an alternative, gamma specification.)

In the model used here, it can seem natural to read both the frailty multiplier and the Black mortality multiplier as implying intrinsic, perhaps even genetic, individual disadvantages. Importantly, the model used here does not imply this—and indeed these are unrealistic interpretations of both racial disadvantages and of the stable inequalities captured by “frailty.”

Binary frailty can be thought of as a simplification of a context in which most consequential disadvantages are heavily clustered rather than independently distributed, so that the population crystallizes into sharply distinct advantaged and disadvantaged social groups along just a few dimensions. Similarly, the Black mortality multiplier is a mathematical construct that simplifies the vast complexity of racism in the United States. It makes one key assumption: being socially categorized as Black is disadvantageous for everyone who is categorized that way. This does not mean that all Blacks are disadvantaged *in total* compared with all Whites; indeed, if that were true, no crossover would be possible. Blacks may be disadvantaged by their race but also advantaged in ways represented as exposure and residual frailty, where Whites may be disadvantaged. But it does mean that all Blacks are disadvantaged relative to the mortality we should expect for them if they were treated as Whites are currently treated; in that sense, the model assumes that racial disadvantage is ubiquitous.^{11}

Nor does the Black mortality multiplier (a cohort-specific parameter) imply that racial disadvantage is ahistorical or unchanging. Both the magnitude and real social content of the individual-level disadvantage associated with being socially categorized as Black would be different for a cohort growing up under near-universal segregation enforced by legal and extralegal violence than for a cohort growing up under formal legal equality, mass incarceration, and racialized poverty. The Black mortality multiplier does imply that racial disadvantage does not attenuate over the life course for individuals. This assumption is not an empirical claim, but it is important for the pedagogical purpose I outline: it ensures that the model’s crossovers reflect selection rather than life course dynamics.^{12}

Thus, a model with binary frailty and a Black mortality multiplier is used here because it focuses attention on the multidimensional selection dynamics that produce the surprising patterns shown later, while simplifying away other complexities not essential to those patterns. For example, capturing Black disadvantage explicitly in a mortality multiplier, rather than implicitly in the frailty distribution, preserves in a clear and accessible fashion the distinction between Black-White inequality at the individual level (represented in that multiplier) and the disparity at the population level. This distinction is at the heart of most interpretations of the crossover.^{13} Additionally, much of the article compares aggregate populations with subpopulations that are defined by sharing a particular level of frailty along one dimension. This comparison is a tractable way to present results in a simple form when the population can be broken down into only a few subpopulations, such as the exposed and the non-exposed.^{14} However, a model in which both dimensions of heterogeneity are continuous has infinitely many subpopulations (or, if heterogeneity values are rounded, at least *very many* subpopulations). An analysis that statistically adjusts for the level of (for example) exposure must choose which exposure levels to use to define the subpopulations of interest. I bracket these issues by using a model with just two observed subpopulations and analyzing both. By treating both dimensions of heterogeneity as binary, I also highlight the mathematical symmetry between observed and unobserved dimensions.

### How Stratifying on Partially Observed Heterogeneity Might Change the Age at Crossover: Two Predictions

Identifying particular dimensions of heterogeneity that contribute to the aggregate Black-White crossover requires a testable prediction derived from a multidimensional selection model. A natural place to look for testable predictions is in the outcome that dominates research on mortality selection: the age at onset for some mortality selection artifact (e.g., Berkman et al. 1989; Dupre et al. 2006; Horiuchi and Wilmoth 1997, 1998; Lynch and Brown 2001; Lynch et al. 2003; Sautter et al. 2012; Yao and Robert 2011)—in this case, the crossover. Thus, to connect the multidimensional heterogeneity model to empirical research, I consider the question, What happens to the extent of racial disparities in mortality—and what happens to the age at crossover—when Black and White mortality are stratified on an observed dimension of heterogeneity (“uterine tobacco exposure”) while another dimension (“residual frailty”) goes unobserved?

The following two predictions do not follow directly from the unidimensional model, which is silent on multidimensional applications. Rather, they represent alternative attempts to generalize that model’s logic to the questions asked in empirical practice. I will assess how these predictions fare in describing the crossover under partial stratification.

**Prediction 1.** This prediction is used in recent empirical literature referencing only formal models of unidimensional, not multidimensional, heterogeneity. This empirical work on the Black-White mortality crossover (Dupre et al. 2006; Sautter et al. 2012) first presents a hypothesis that mortality selection in Black and White populations operates simultaneously on an observed dimension of heterogeneity (such as poverty, education, or religiosity) and an unobserved dimension of heterogeneity, residual frailty.^{15} It offers predictions about the ages at crossover in the aggregate and in subpopulations when the Black and White populations are stratified by the observed heterogeneity and tests these predictions in empirical data, concluding that the observed dimension is (in the case of poverty and religiosity) or is not (in the case of low education) a dimension of the heterogeneity that produces the crossover in the aggregate.

Both Sautter et al. (2012) and Dupre et al. (2006) used the criterion that a trait is “[a source] of heterogeneity in individual frailty that contribute[s] to the Black-White mortality crossover” (Sautter et al. 2012:1566) if two regression coefficients on mortality are statistically significant: the trait interacted with age and with race.^{16} They further seem to have taken this criterion as coextensive with the criterion that the observed trait is part of “frailty” (i.e., multidimensional heterogeneity) if and only if conditioning on the trait changes the age at crossover (in some direction). The Dupre/Sautter criterion, then, proposes testing a model with partially observed, multidimensional heterogeneity by conditioning on the observed dimension and assessing whether the age at crossover changes. This prediction is not derived from any formal model of multidimensional heterogeneity.

**Prediction 2.** In translating the unidimensional model into a multidimensional setting, one might also expect a more specific prediction to hold. If each dimension of multidimensional heterogeneity—such as uterine tobacco exposure and residual frailty, or low education and residual frailty—behaved like unidimensional frailty, then each dimension would have a predictable effect on Black-White disparities. Specifically, the tobacco-exposed would necessarily have higher mortality than the non-exposed, more surviving Whites would necessarily be exposed to tobacco than surviving Blacks, and tobacco exposure would necessarily raise aggregate White mortality relative to Black mortality. Stratifying on observed tobacco exposure would therefore raise Black mortality relative to White mortality, delaying the crossover to an older age: the aggregate population would necessarily reach crossover before the subpopulations. This would constitute a testable prediction of the multidimensional heterogeneity model.

In what follows, I show that neither the Dupre/Sautter prediction nor this more specific prediction about crossover order follows from the multidimensional heterogeneity model.

### Unexpected Behaviors of Multidimensional Heterogeneity: The Four Key Facts About Unidimensional Heterogeneity Do Not Apply

In the unidimensional model, I identified four key facts and a resulting decomposition for the Black-White mortality crossover in which all terms had known sign. None of these generalizations extend to the individual dimensions of multidimensional heterogeneity. That is, in a multidimensional heterogeneity scenario where only some dimensions of heterogeneity are observed, neither the observed nor the unobserved dimensions necessarily behave like unidimensional frailty.

The distinctive behaviors of the multidimensional model include phenomena that I label subpopulation race crossovers, frailty crossovers, frailty increases, and frailty reversals. The first two possibilities are straightforward extensions of the unidimensional model to the multidimensional context; the latter two are more surprising departures from unidimensional selection.

*Subpopulation Race Crossovers*

In the unidimensional model, conditional on frailty, *j*, Black mortality is always higher than White mortality, μ_{b, j}(*a*) > μ_{w, j}(*a*). By contrast, in the multidimensional model, the subpopulations can have their own race crossovers. Conditional on unobserved residual frailty, Black mortality can be either higher or lower than White mortality, $\bar{\mu}_{b,j}(a) \gtrless \bar{\mu}_{w,j}(a)$, at any given age. Analogously, conditional on observed exposure, Black mortality can be either higher or lower than White mortality, $\bar{\mu}_{b,i}(a) \gtrless \bar{\mu}_{w,i}(a)$. For example, Fig. 2 illustrates a cohort in which, in the exposed subpopulation, Black mortality is higher than White mortality before age 70 and after age 76 but is lower than White mortality in between. Figure 2 and all following numerical illustrations come from a large universe of simulations described and analyzed in section 3 of the online appendix; the specific parameter values for all illustrative figures are given in Table S3.2.

These Black-White subpopulation crossovers can occur because each subpopulation defined by stratifying on the observed dimension of heterogeneity instantiates the unidimensional heterogeneity model given in Eqs. (1)–(3).
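The mechanism can be sketched numerically for a single exposure stratum. The sketch below is illustrative only: it assumes a two-point residual frailty distribution and hazards that combine multiplicatively, which may differ from the article's Eqs. (1)–(3), and every parameter value (`BLACK_MULT`, `FRAIL_MULT`, `P_FRAIL`) is hypothetical. The stratum's own exposure multiplier is absorbed into the baseline, so "time" is measured in cumulative baseline hazard rather than age.

```python
import math

# Hypothetical values for illustration; not taken from the article.
BLACK_MULT = 1.5    # assumed Black hazard multiplier within the stratum
FRAIL_MULT = 20.0   # assumed hazard multiplier for the residually frail
P_FRAIL = 0.5       # assumed baseline share residually frail, equal across races


def mean_multiplier(race_mult, cum_base_hazard):
    """Average hazard multiplier among one race's survivors in the stratum."""
    h = race_mult * cum_base_hazard
    frail = P_FRAIL * math.exp(-h * FRAIL_MULT)      # surviving frail share
    robust = (1 - P_FRAIL) * math.exp(-h)            # surviving robust share
    share_frail = frail / (frail + robust)
    return race_mult * (1 + (FRAIL_MULT - 1) * share_frail)


# Grid of cumulative baseline hazard values from 0.00 to 2.00.
grid = [i / 100 for i in range(201)]

# Points at which stratum-level Black mortality is below White mortality.
black_lower = [h for h in grid
               if mean_multiplier(BLACK_MULT, h) < mean_multiplier(1.0, h)]

print("Black mortality below White for cumulative hazard in",
      (black_lower[0], black_lower[-1]))
```

Under these assumed values, Black mortality in the stratum starts above White mortality, falls below it for an interval, and then rises above it again once almost no frail members of either race survive, which is the qualitative pattern described for the exposed subpopulation in Fig. 2.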

*Frailty Crossovers*

In the unidimensional model, within each race, frail mortality is always higher than robust mortality, μ_{k, f} (*a*) > μ_{k, r}(*a*). In the multidimensional model, within each race, the residually frail subpopulation may have either higher or lower mortality than the residually robust subpopulation, μ_{k, f} (*a*) ≷ μ_{k, r}(*a*), at any given age. Similarly, the exposed subpopulation may have either higher or lower mortality than the non-exposed subpopulation, μ_{k, t}(*a*) ≷ μ_{k, n}(*a*). Frailty crossovers are exactly analogous to Black-White subpopulation crossovers. Figure 3 shows a frailty crossover for a cohort in which residually frail mortality falls below residually robust mortality for both Blacks (ages 60–73) and Whites (ages 71–84).

*Frailty Increases and Frailty Reversals*

In the unidimensional model, survivors are progressively less likely to be frail as the population ages, π_{k}(*a* + 1) < π_{k}(*a*) (the third fact about the unidimensional model). Furthermore, given equal baseline frailty across races, Black survivors are always less likely than White survivors to be frail after baseline, π_{b}(*a*) < π_{w}(*a*) (the fourth fact).

By contrast, in a multidimensional model, mortality selection can increase as well as decrease population-level residual frailty, ∏_{k}(*a* + 1) ≷ ∏_{k}(*a*), or population-level exposure, T_{k}(*a* + 1) ≷ T_{k}(*a*). I call this possibility a *frailty increase*. Furthermore, mortality selection can make Black survivors more or less likely than White survivors to be residually frail, ∏_{b}(*a*) ≷ ∏_{w}(*a*), or more or less likely to be exposed, T_{b}(*a*) ≷ T_{w}(*a*). I call this possibility that Black survivors become more disadvantaged than White survivors a *frailty reversal*. Frailty increases and frailty reversals violate the most important insights into mortality selection derived from the unidimensional model.^{17}

The formal conditions for frailty reversals are given in section 2 of the online appendix, but the intuition is straightforward. Just as unidimensional mortality selection creates a negative association between race and frailty among survivors, multidimensional mortality selection creates a negative association between tobacco exposure and residual frailty within each race. This negative association can become so strong that selecting *against* one of those dimensions of heterogeneity becomes selecting *for* the other. The dimension being selected *for* can thus increase over age (a frailty increase) or—because this selection is stronger among Blacks—can become more common among Blacks than among Whites (a frailty reversal). Whenever multidimensional selection leads mortality to select *for* some dimension of heterogeneity that raises mortality, that dimension is always the one with a weaker effect on mortality because selection for it is driven by complex associations created by selection against the stronger dimension. Thus, Blacks will always be more selected than Whites along the stronger dimension of heterogeneity but not necessarily along the other dimension.

To illustrate frailty increases and a frailty reversal, Fig. 4 shows the proportions of Black and White survivors that are residually frail in a simulated cohort. Frailty increases occur from ages 83 to 94 for Blacks and from ages 90 to 101 for Whites. These frailty increases result from frailty crossovers such that in the Black and White populations at these respective ages, the residually frail have lower mortality than the residually robust. Mortality selection at these ages therefore makes each population more residually frail.

Figure 4 also shows a frailty reversal that occurs from ages 86 to 97. Frailty reversals result from the interaction between the two dimensions of heterogeneity. In this cohort, exposure raises mortality a great deal at the individual level, whereas residual frailty raises mortality much less, *t* ≫ *f*^{∗}. Consequently, both races, and especially Blacks, are heavily selected against exposure. Furthermore, all subpopulations, and especially the exposed, are selected against residual frailty. But because comparatively fewer exposed Blacks than Whites survive, selection against residual frailty occurs predominantly among Whites. The interaction of selection against exposure and selection against residual frailty results in Blacks being less selected against residual frailty than Whites for an 11-year span. (Section 4 of the online appendix illustrates and analyzes a frailty reversal in a cohort simulated with an alternative, gamma-distributed frailty model and serves as a bridge between the results presented here and the gamma-Gompertz frailty literature.)
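A minimal simulation can reproduce all three phenomena qualitatively. It is a sketch under stated assumptions, not the article's simulation: hazards are assumed to combine multiplicatively on a Gompertz baseline, both dimensions are binary and independent at baseline, and every parameter value below is hypothetical rather than drawn from Table S3.2. The ages at which the phenomena appear are therefore artifacts of these choices and should not be compared with Fig. 4.

```python
import math

# All values are hypothetical choices for this sketch.
ALPHA, BETA = 1e-4, 0.09                   # Gompertz baseline: ALPHA * exp(BETA * age)
RACE_MULT = {"white": 1.0, "black": 1.5}   # assumed Black hazard multiplier
EXPOSED_MULT = 20.0                        # strong observed dimension
FRAIL_MULT = 1.5                           # weak unobserved (residual) dimension
P_EXPOSED = 0.5                            # baseline shares, independent of each
P_FRAIL = 0.5                              # other and equal across races


def cum_baseline_hazard(age):
    """Integrated Gompertz baseline hazard from age 0 to `age`."""
    return ALPHA / BETA * (math.exp(BETA * age) - 1.0)


def survivors(race, age):
    """Surviving share of the baseline cohort in each (exposed, frail) cell."""
    h = RACE_MULT[race] * cum_baseline_hazard(age)
    cells = {}
    for exposed in (0, 1):
        for frail in (0, 1):
            p0 = ((P_EXPOSED if exposed else 1 - P_EXPOSED)
                  * (P_FRAIL if frail else 1 - P_FRAIL))
            mult = ((EXPOSED_MULT if exposed else 1.0)
                    * (FRAIL_MULT if frail else 1.0))
            cells[(exposed, frail)] = p0 * math.exp(-h * mult)
    return cells


def prop_frail(race, age):
    """Proportion of survivors who are residually frail."""
    n = survivors(race, age)
    return (n[(0, 1)] + n[(1, 1)]) / sum(n.values())


def mean_multiplier(race, age, frail):
    """Average hazard multiplier among residually frail (1) or robust (0) survivors."""
    n = survivors(race, age)
    exposed, non_exposed = n[(1, frail)], n[(0, frail)]
    own = FRAIL_MULT if frail else 1.0
    return own * (EXPOSED_MULT * exposed + non_exposed) / (exposed + non_exposed)


ages = range(0, 91)
# Ages at which the residually frail have lower mortality than the robust.
frailty_crossover = {r: [a for a in ages
                         if mean_multiplier(r, a, 1) < mean_multiplier(r, a, 0)]
                     for r in RACE_MULT}
# Ages at which the proportion residually frail rises over the next year.
frailty_increase = {r: [a for a in ages[:-1]
                        if prop_frail(r, a + 1) > prop_frail(r, a)]
                    for r in RACE_MULT}
# Ages at which Black survivors are more residually frail than White survivors.
frailty_reversal = [a for a in ages
                    if prop_frail("black", a) > prop_frail("white", a)]

for race in RACE_MULT:
    print(race, "frailty crossover ages:", frailty_crossover[race])
    print(race, "frailty increase ages:", frailty_increase[race])
print("frailty reversal ages:", frailty_reversal)
```

The design mirrors the verbal argument: because the assumed exposure multiplier is far larger than the assumed residual frailty multiplier, the residually frail survivors are disproportionately non-exposed, and for a span of ages this compositional advantage outweighs their own frailty.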

Frailty increases and frailty reversals underscore just how much the multidimensional selection model differs from the unidimensional one. When only a single dimension of fixed heterogeneity raises mortality, we can be certain that it declines monotonically over age and that if Blacks and Whites start out with the same proportion frail, Blacks end up with fewer frail than Whites at each subsequent age. Neither of these core generalizations necessarily extends to each fixed dimension of heterogeneity that raises mortality when there is more than one. In the next section, I show that the interaction between dimensions of heterogeneity that drives these possibilities stems from a distinctive third role of frailty that is unique to the multidimensional model. As explained in section 2 of the online appendix, this third role of frailty not only forestalls the four key facts about unidimensional heterogeneity but also breaks the dependencies between those facts.

### Three Roles of Frailty in Multidimensional Mortality Selection

The four key facts about unidimensional heterogeneity do not extend to the multidimensional model because each dimension of heterogeneity in the latter plays three, rather than two, roles in determining population-level mortalities, $\bar{\mu}_{k}(a)$. Figure 5 and Table 1 represent heterogeneity’s three roles, focusing on unobserved residual frailty for convenience. (Analogous arguments apply to observed exposure’s three roles.) Figure 5 and Table 1 are based on the decomposition of population-level mortality given in Eq. (9).

In the unidimensional model, the frail mortality multiplier, *f*, plays two roles in population-level mortality for each race: it simultaneously increases population-level mortality by increasing the mortality of the frail subpopulation and reduces population-level mortality by reducing the proportion of frail survivors. In the multidimensional model, unobserved residual frailty, *j* (and by analogy, observed exposure, *i*), plays the same two roles. An increase in the residual frailty multiplier *f*^{*} increases population-level mortality by increasing the mortality of the residually frail within each subpopulation defined by observed exposure, $f^{*} \xrightarrow{+} \mu_{k,i,f}(a) \xrightarrow{+} \bar{\mu}_{k,i}(a) \xrightarrow{+} \bar{\mu}_{k}(a)$; it decreases population-level mortality by decreasing the proportion frail within each subpopulation defined by observed exposure, $f^{*} \xrightarrow{-} \pi_{k,i}(a) \xrightarrow{+} \bar{\mu}_{k,i}(a) \xrightarrow{+} \bar{\mu}_{k}(a)$. As in the unidimensional model, these two roles suffice to produce a Black-White crossover in aggregate mortalities.

The third role of heterogeneity, by contrast, is new and considerably more complex: residual frailty in the multidimensional model affects population-level mortality by changing the proportion of the population that is exposed, T_{k}(*a*). This means that the two dimensions of heterogeneity—unobserved residual frailty and observed exposure—interact. Even if the two dimensions of heterogeneity start out distributed independently of each other, they will become associated as the cohort ages: survivors who are disadvantaged along one dimension are unlikely to also be disadvantaged along the other because such multiply disadvantaged individuals are least likely to survive.^{18}
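The emergence of this association can be checked with a short calculation. The sketch below assumes, for illustration only, that the two dimensions are binary and that their hazard multipliers combine multiplicatively; the function name, its parameters, and the numerical values are hypothetical. Under those assumptions the odds ratio between exposure and residual frailty among survivors equals exp(−H(m_t − 1)(m_f − 1)), where H is the cumulative baseline hazard and m_t and m_f are the two multipliers, so it starts at 1 and falls steadily as the cohort ages.

```python
import math


def survivor_odds_ratio(cum_hazard, mult_exposed, mult_frail,
                        p_exposed=0.5, p_frail=0.5):
    """Odds ratio between exposure and residual frailty among survivors.

    Assumes independence at baseline and multiplicative hazards
    (assumptions of this sketch, not necessarily the article's model).
    """
    def surviving(exposed, frail):
        p0 = ((p_exposed if exposed else 1 - p_exposed)
              * (p_frail if frail else 1 - p_frail))
        mult = ((mult_exposed if exposed else 1.0)
                * (mult_frail if frail else 1.0))
        return p0 * math.exp(-cum_hazard * mult)

    return (surviving(1, 1) * surviving(0, 0)) / (surviving(1, 0) * surviving(0, 1))


# Hypothetical multipliers: a strong observed and a weak unobserved dimension.
for h in (0.0, 0.05, 0.1, 0.2):
    print(h, round(survivor_odds_ratio(h, 20.0, 1.5), 3))
```

An odds ratio below 1 means that exposed survivors are less likely than non-exposed survivors to be residually frail, even though the two traits were unrelated at baseline.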

The effect of residual frailty on population-level mortality via observed exposure composition is essentially unpredictable for two reasons. First, increasing the disadvantage associated with residual frailty, *f*^{*}, can either increase or decrease the proportion of survivors who are exposed, T_{k}(*a*). Insofar as the disadvantage associated with residual frailty increases the mortality of the exposed subpopulation, $\bar{\mu}_{k,t}(a)$, it will *decrease* the proportion of survivors who are exposed, $f^{*} \xrightarrow{+/-} \bar{\mu}_{k,t}(a) \xrightarrow{-} T_{k}(a)$. Insofar as the disadvantage associated with residual frailty increases the mortality of the non-exposed subpopulation, $\bar{\mu}_{k,n}(a)$, it will *increase* the proportion of survivors who are exposed, $f^{*} \xrightarrow{+/-} \bar{\mu}_{k,n}(a) \xrightarrow{+} T_{k}(a)$.^{19} When the total effect of the two paths from *f*^{∗} into T_{k}(*a*) is positive in one population at some age, the result can be a “frailty crossover” between the exposed and non-exposed and a “frailty” (i.e., observed exposure) increase in that population at that age. When the total effect of the two paths into T_{k}(*a*) is larger among Blacks than among Whites for some span of ages, the result can be a “frailty” (i.e., observed exposure) reversal. Specifically, a frailty reversal in observed exposure, in which Black survivors are more likely than White survivors to be exposed for some span of ages, occurs when the total effect of the paths into T_{k}, *cumulative* over all prior ages, is larger for Blacks than for Whites.
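The opposing signs of the two paths can be read off a continuous-age expression for the change in the proportion exposed. This is a sketch rather than a result quoted from the article: it treats age as continuous (the article compares ages $a$ and $a + 1$), assumes exposure is fixed within individuals, and introduces $N_{k,t}(a)$ and $N_{k,n}(a)$ as hypothetical symbols for the numbers of exposed and non-exposed survivors of race $k$.

$$
T_{k}(a) = \frac{N_{k,t}(a)}{N_{k,t}(a) + N_{k,n}(a)}, \qquad
\frac{dN_{k,t}(a)}{da} = -\bar{\mu}_{k,t}(a)\, N_{k,t}(a), \qquad
\frac{dN_{k,n}(a)}{da} = -\bar{\mu}_{k,n}(a)\, N_{k,n}(a)
$$

$$
\frac{dT_{k}(a)}{da} = -\,T_{k}(a)\,\bigl[1 - T_{k}(a)\bigr]\,\bigl[\bar{\mu}_{k,t}(a) - \bar{\mu}_{k,n}(a)\bigr]
$$

Anything that raises $\bar{\mu}_{k,t}(a)$ therefore pushes $T_{k}(a)$ down, and anything that raises $\bar{\mu}_{k,n}(a)$ pushes it up. The proportion exposed rises only while the exposed have lower mortality than the non-exposed, that is, during a frailty crossover along the exposure dimension.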

Second, increasing the proportion of survivors who are exposed, T_{k}(*a*), can either increase or decrease population-level mortality, $T_{k}(a) \xrightarrow{+/-} \bar{\mu}_{k}(a)$. Increasing T_{k}(*a*) will increase population-level mortality when the exposed have higher mortality than the non-exposed. Increasing T_{k}(*a*) will decrease population-level mortality when the exposed have lower mortality than the non-exposed—that is, after a “frailty” (i.e., exposure) crossover. Thus, absent precise quantitative knowledge of the model parameters, residual frailty’s third role has an unpredictable effect on aggregate mortality, $f^{*} \xrightarrow{+/-} T_{k}(a) \xrightarrow{+/-} \bar{\mu}_{k}(a)$.^{20}

In sum, in the multidimensional model, the various dimensions of heterogeneity within each population interact with one another, making it extremely difficult to relate any one observed dimension of heterogeneity to clean predictions about population-level mortality. If it is difficult to relate any given observed heterogeneity to aggregate mortality in any one population, then it is doubly difficult to relate an observed dimension of heterogeneity to mortality differences between populations. Next I show what this implies for empirical research: stratifying on an observed dimension of heterogeneity (while another dimension remains unobserved) can either increase or decrease the Black-White disparity in mortality, and the resulting subpopulations can reach a crossover either before or after the aggregate population.

### Decomposition of the Aggregate Crossover With Multidimensional Heterogeneity: Conditioning on Observed Heterogeneity Can Move the Age at Crossover in Either Direction

Equation (11) is exactly analogous to the decomposition of the Black-White mortality disparity with unidimensional heterogeneity given in Eq. (5) except that in the unidimensional case, the signs of all three terms were known *a priori* because they were determined by the key facts about unidimensional heterogeneity. By contrast, here those facts need not apply, and each of the three terms in Eq. (11) can be either positive or negative at any given age.

The first two terms of Eq. (11) are, respectively, the Black-White difference in the mortality of the exposed, $\bar{\mu}_{b,t}(a) - \bar{\mu}_{w,t}(a)$, weighted by the proportion of Blacks who are exposed, T_{b}(*a*), and the Black-White difference in the mortality of the non-exposed, $\bar{\mu}_{b,n}(a) - \bar{\mu}_{w,n}(a)$, weighted by the proportion of Blacks who are non-exposed, 1 − T_{b}(*a*). These two terms can be either positive or negative because a Black-White *subpopulation crossover* may or may not occur in each subpopulation defined by observed exposure.

The third term of Eq. (11) is the *frailty factor*, representing the contribution of the racial compositional difference in observed exposure to the racial difference in aggregate mortality. It is the Black-White difference in observed exposure, T_{b}(*a*) − T_{w}(*a*), weighted by the extent to which observed exposure is associated with higher mortality among Whites, $\bar{\mu}_{w,t}(a) - \bar{\mu}_{w,n}(a)$. This term can take either sign because each of its two factors can take either sign. The mortality difference will be positive ($\bar{\mu}_{w,t}(a) - \bar{\mu}_{w,n}(a) > 0$), as in the unidimensional case, if Whites have not had a *frailty crossover* along the observed exposure dimension and will be negative ($\bar{\mu}_{w,t}(a) - \bar{\mu}_{w,n}(a) < 0$) if they had a frailty crossover. And the compositional difference will be negative (T_{b}(*a*) − T_{w}(*a*) < 0), as in the unidimensional case, if there has not been a *frailty reversal* along the observed exposure dimension and will be positive (T_{b}(*a*) − T_{w}(*a*) > 0) if there has been a frailty reversal.
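Assembling the three terms as they are described in the text gives the following form. This is a reconstruction from the verbal description, not a transcription of the article's Eq. (11), and it assumes that population-level mortality in each race is the exposure-weighted average of the two subpopulation mortalities, $\bar{\mu}_{k}(a) = T_{k}(a)\,\bar{\mu}_{k,t}(a) + [1 - T_{k}(a)]\,\bar{\mu}_{k,n}(a)$.

$$
\bar{\mu}_{b}(a) - \bar{\mu}_{w}(a)
= T_{b}(a)\,\bigl[\bar{\mu}_{b,t}(a) - \bar{\mu}_{w,t}(a)\bigr]
+ \bigl[1 - T_{b}(a)\bigr]\,\bigl[\bar{\mu}_{b,n}(a) - \bar{\mu}_{w,n}(a)\bigr]
+ \bigl[T_{b}(a) - T_{w}(a)\bigr]\,\bigl[\bar{\mu}_{w,t}(a) - \bar{\mu}_{w,n}(a)\bigr]
$$

Expanding the right-hand side, the cross terms $T_{b}(a)\,\bar{\mu}_{w,t}(a)$ and $T_{b}(a)\,\bar{\mu}_{w,n}(a)$ cancel, leaving exactly the difference between the two weighted averages.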

When the frailty factor is negative, the Black-White mortality disparity is less positive, or more negative, in the aggregate than in the subpopulations. Thus, the aggregate can have a crossover when the subpopulations do not or can have a more extreme crossover than the subpopulations do. Conversely, when the frailty factor is positive, the Black-White mortality disparity is more positive, or less negative, in the aggregate than in the subpopulations. Thus, a crossover in aggregate mortality can be absent even when one or both of the subpopulations has a crossover.
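A small numerical check makes the role of the frailty factor concrete. The function below encodes the three terms as they are described in the text (a reconstruction, not a transcription of Eq. (11)) and assumes that aggregate mortality is the exposure-weighted average of subpopulation mortalities. The function names and all input values are hypothetical and chosen only so that Black mortality exceeds White mortality in both subpopulations.

```python
import math


def decompose_disparity(T_b, T_w, mu_bt, mu_bn, mu_wt, mu_wn):
    """Split the Black-White aggregate mortality gap into three terms."""
    exposed_term = T_b * (mu_bt - mu_wt)             # gap among the exposed
    non_exposed_term = (1 - T_b) * (mu_bn - mu_wn)   # gap among the non-exposed
    frailty_factor = (T_b - T_w) * (mu_wt - mu_wn)   # compositional difference
    return exposed_term, non_exposed_term, frailty_factor


def aggregate(T, mu_t, mu_n):
    """Population-level mortality as an exposure-weighted average (assumed form)."""
    return T * mu_t + (1 - T) * mu_n


# Hypothetical inputs at a single age.
T_b, T_w = 0.2, 0.5                 # proportions of survivors who are exposed
mu_bt, mu_bn = 0.100, 0.050         # Black mortality: exposed, non-exposed
mu_wt, mu_wn = 0.090, 0.045         # White mortality: exposed, non-exposed

terms = decompose_disparity(T_b, T_w, mu_bt, mu_bn, mu_wt, mu_wn)
gap = aggregate(T_b, mu_bt, mu_bn) - aggregate(T_w, mu_wt, mu_wn)

print("exposed term, non-exposed term, frailty factor:", terms)
print("aggregate Black-White gap:", gap)
```

With these inputs both subpopulation terms are positive, yet the negative frailty factor is large enough to make the aggregate gap negative: the aggregate shows a crossover that neither subpopulation has reached.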

Consequently, aggregate mortality can reach a crossover before both subpopulations, after both subpopulations, or in between the subpopulations. Stratifying Black and White mortality on any single dimension of heterogeneity therefore moves the crossover in an essentially unpredictable direction. These results are summarized in the third panel of Table 1. Figure 6 shows illustrative examples of all three scenarios, with solid vertical lines marking the onset of the aggregate crossover and dashed vertical lines marking the onset of the subpopulation crossovers.

Importantly, the crossover order—whether the aggregate populations reach a crossover at a younger or older age than the subpopulations—can change in response to very minor shifts in the model parameters. Cohorts that share most of their parameters can nevertheless vary in their crossover order, as shown in section 3 of the online appendix, which analyzes over 1.5 million simulated cohorts. Consequently, there is no obvious *a priori* prediction, absent strong assumptions about the latent model parameters, about whether stratifying on a single dimension of heterogeneity will increase or decrease the Black-White mortality disparity at any given age, or increase or decrease the age at crossover.

### Implications for Empirical Research

These results cast doubt on the two potential tests of the multidimensional heterogeneity model based on stratifying Black and White populations on an observed dimension of heterogeneity and comparing changes in the age at crossover to putative predictions based on the model.

One potential test was based on the Dupre/Sautter prediction that stratifying on a single dimension of multidimensional heterogeneity should move the crossover in some (unspecified) direction. This criterion is neither a necessary nor a sufficient condition for identifying dimensions of heterogeneity that contribute to the aggregate crossover. On one hand, the age at crossover will almost always shift in some direction when any trait associated with both race and mortality is controlled for. This is true regardless of whether that trait behaves like the *frailty* of a mortality selection model—that is, regardless of whether it approximates the model assumptions of being fixed in individuals and raising mortality at all ages. On the other hand, it is possible for the crossover to occur at the same age in the aggregate population and in a subpopulation, even if the trait does constitute a dimension of frailty. Such a confluence of crossovers requires only—in the language of Eq. (11)—that the frailty factor have a very similar magnitude to that of the other subpopulation’s contribution to aggregate mortality around the aggregate crossover age.^{21} Figure 7 shows an example of such a cohort. In this simulated cohort, the non-exposed subpopulation reaches a crossover 18 days earlier than the aggregate population, which is effectively simultaneous from the perspective of any real study of old-age mortality.^{22}

The second potential test of the model was based on the prediction that stratifying on the observed dimension of heterogeneity might necessarily increase the Black-White mortality disparity and delay the crossover. This would be true if individual dimensions of heterogeneity behaved like unidimensional heterogeneity. The results here cast doubt on this criterion as well. The preceding section shows that in the multidimensional context, aggregate and subpopulation crossovers can in fact occur in any order. Thus, empirically identifying particular dimensions of crossover-producing heterogeneity via such directional predictions similarly would not work.

The goal of identifying particular dimensions of heterogeneity that comprise a multidimensional analogue to “frailty” is an essential one for mortality research. But the results presented here highlight the dangers of pursuing it without the benefit of an explicit model of multidimensional mortality selection. Moreover, they suggest that the goal may be surprisingly difficult to achieve, or at least that it may require something other than the standard strategy of analyzing the age at which mortality selection artifacts begin—whether the crossover (Berkman et al. 1989; Dupre et al. 2006; Lynch et al. 2003; Sautter et al. 2012; Yao and Robert 2011) or mortality deceleration (e.g., Horiuchi and Wilmoth 1997, 1998; Lynch and Brown 2001; Lynch et al. 2003).

A question for further research is what other tests of the multidimensional mortality selection model might be possible. Section 3 of the online appendix shows that although the crossover order varies with even small parameter changes over a large swath of simulated parameter space, some predictions nevertheless are possible, contingent on particular combinations of parameter values. The very presence of subpopulation crossovers implies that residual frailty is consequential and relatively common at baseline: whatever the measured exposure contributes to the aggregate crossover, the unmeasured heterogeneity is sufficient to generate a crossover.

In general, when the proportions of disadvantaged members (e.g., the residually frail and the exposed) are small at baseline, the crossover order is more constrained (largely because the aggregate dynamics will be dominated by the large group of more advantaged survivors); when the disadvantaged categories are larger at baseline, frequently any order is possible, even when the other parameters are fixed. With other parameters held fixed, when baseline residual frailty is very high, it is relatively rare for the aggregate crossover to happen after both subpopulation crossovers when exposure is also very high at baseline, and it is relatively rare for the aggregate crossover to happen before both subpopulation crossovers when baseline exposure is low. (An aggregate crossover occurring between the two subpopulation crossovers is ubiquitous across the parameter space.)

The results here also imply that additional empirical tests of the multidimensional model may be possible in the special circumstance that the measured dimension of heterogeneity, *t*, can be assumed to represent a large portion of the total heterogeneity *f* = *t* + *f*^{∗} (although see the discussion in section 4 of the online appendix that complicates this criterion for models in which the dimensions are not equally consequential for each race). Such a scenario is presumably atypical in the case of covariates such as religious participation, which likely account for only a relatively small part of the stable heterogeneity in mortality risk within racial populations; but it might be reasonable in the case of a covariate such as a Charlson Comorbidity Index (Charlson et al. 1994), which summarizes a variety of chronic medical conditions that collectively strongly predict mortality. These results suggest that a good strategy for empirical researchers might be to focus on covariates structured to capture much of the variation in mortality risk, such as by amalgamating many other covariates into a total measure of observed risk, rather than focusing on single covariates whose effects on mortality are not overwhelmingly large.^{23} A covariate capturing much of the total heterogeneity licenses more predictions because it acts more like unidimensional heterogeneity. First, if *t* > *f*^{∗} in this model, then “frailty” reversals and frailty crossovers along the measured (*t*) dimension are impossible.^{24} Measuring the proportion exposed over age in each race would therefore potentially allow this model to be falsified, given the assumption that *t* > *f*^{∗}.^{25} Unfortunately, given the more typical scenario that *t* < *f*^{∗} (i.e., unmeasured heterogeneity is more consequential for individual-level mortality than measured heterogeneity), the prediction that follows is about the unmeasured residual frailty dimension and therefore is not directly empirically testable. A second test may be possible even if *t* < *f*^{∗}, if *t* is still “large.” Any crossover requires that the frailest White mortality exceed the most robust Black mortality. If *t* and *f*^{∗} are similar in magnitude, or more generally if *t* is large, then it is possible for *t* + *f*^{∗} = *f* > *b* while *f*^{∗} < *b* (even if *f*^{∗} > *t*). In this situation, the observed subpopulations defined by exposure would never reach a crossover even though the aggregate population might—an empirically testable conclusion.^{26}

These empirical predictions—and the absence of similar predictions for measured covariates whose effect on mortality is small compared with the effect of the heterogeneity that remains unmeasured—suggest the value of explicit theorizing about mortality disparities in the presence of multiple dimensions of heterogeneity. They also suggest that, wherever possible, we attend to more localized parameter spaces, some of which will be more revealing than others. To determine whether particular observed dimensions of heterogeneity contribute to a crossover, we should focus on dimensions that are highly consequential for mortality at the individual level. More specific predictions are possible as more assumptions are made to limit the simulation space to cohorts that better resemble actual U.S. cohorts (shown in section 3 of the online appendix). To the extent that this is a meaningful exercise in models that remain highly stylized, it suggests more room for developing fruitful predictions in the future.

## Conclusion

In this article, I analyzed the Black-White mortality crossover in the presence of multiple dimensions of heterogeneity within each race. The crossover represents a concrete example through which to understand the frequent circumstance that one dimension of heterogeneity is observed and another is theorized to exist but is unobserved. This situation is not captured by standard unidimensional heterogeneity models of mortality selection, but it is common in empirical research on the crossover (e.g., Berkman et al. 1989; Dupre et al. 2006; Sautter et al. 2012; Yao and Robert 2011) and in empirical studies of mortality broadly. It is also likely to become more frequent as more data sets with rich covariates and sufficient coverage of old ages become available. As I have shown, the most basic facts about the unidimensional theoretical model do not necessarily extend to this situation. Neither the observed nor the unobserved dimensions of heterogeneity necessarily behave as the classic, unidimensional models would predict: individual dimensions of frailty do not behave the same way as frailty in total.

The standard, unidimensional mortality selection model of the crossover is well summarized by four key facts that together allow the crossover to occur and make its dynamics qualitatively simple. In the multidimensional model, none of the four key facts need hold: Blacks may have either higher or lower mortality than Whites; the frail may have either higher or lower mortality than the robust; frailty may increase or decrease over age; and Black survivors may be either more likely or less likely than White survivors to be frail. Generalizations that apply to heterogeneity as a whole—including those that form the foundation of mortality selection’s account of the crossover—need not apply to each dimension of heterogeneity individually.

These possibilities arise because multidimensional mortality selection creates complex and essentially unpredictable—absent strong assumptions about parameter values—associations between the dimensions of heterogeneity. The four facts about unidimensional heterogeneity operate at the level of homogeneous subpopulations defined by race and frailty. But when heterogeneity is multidimensional, subpopulations defined by race and a single dimension of heterogeneity remain otherwise heterogeneous. These heterogeneous subpopulations change their composition over age, producing qualitatively complex behavior at the population level. Multidimensional mortality selection is complex because in it, the basic dynamic of unidimensional mortality selection occurs at many interacting levels simultaneously. The crossover with unidimensional heterogeneity is a straightforward example of Simpson’s paradox, occurring at a single level. But the crossover with multidimensional heterogeneity gives rise to Simpson’s paradoxes at several levels. Thus, the phenomena occurring at the surface level, population-level mortality rates, become less intuitive. Multidimensional populations are mixtures of unidimensional subpopulations, and the mixture need not behave like its ingredients.

These results have theoretical and practical consequences. Theoretically, they suggest that the intuitions derived from the long-standing tradition of unidimensional mortality selection theory do not apply to multidimensional mortality selection. These intuitions work well when frailty can be thought of as a cohesive whole, an amalgam of entirely unobserved traits. But when we want to get specific about “frailty” by identifying individual components of it, those intuitions can become deeply misleading. Even if dimensions of heterogeneity are independently distributed at birth, they become associated—with one another and with race—as a cohort ages because of their joint contribution to mortality. Any dimension of heterogeneity therefore carries information about all the others. Conditioning on any single dimension is not merely conditioning on a noisy measure of overall heterogeneity; it is conditioning *selectively* on whatever dimension was observed and, therefore, also on whatever dimension was not observed. The consequences of such selective stratification can be described only with a model that explicitly incorporates the joint distribution of each dimension of heterogeneity as it changes over age—whether the context is a mortality crossover or any other mortality trajectory.

Practically, the interaction between compositional changes in the observed and unobserved dimensions of heterogeneity produces unpredictable crossovers in the resulting subpopulations and aggregate populations. As a consequence, the crossover order is not an empirical confirmation or refutation of the general form of this model; the seemingly most natural way to test a multidimensional model may not work.

The implications of the results shown here extend well beyond the Black-White mortality crossover. This article joins several others in analyzing the consequences of selection along multiple dimensions simultaneously (Bretagnolle and Huber-Carol 1988; Finkelstein 2012; Henderson and Oman 1999; Manton et al. 1995). Collectively, this research motivates careful consideration of whether theoretical results about frailty as a whole extend to partially observed heterogeneity. The results that are new in this article concern the behavior of mortality disparities in a common research situation: when mortality is stratified by partial measures of heterogeneity.

The gap between theoretical work on mortality selection and recent empirical work on the crossover echoes a wider divergence between two traditions of demographic research in which the study of the crossover has traditionally been situated. Classical demography was an intellectually distinctive field that produced a series of models that excel at shifting perspectives between population aggregates and the individual-level status transitions that produce them. These models are able to reveal a great deal about population processes even without rich data; indeed, the striking creativity of formal demography in this era was presumably spurred by the need to wring as much information as possible from the limited data of the time. Classic mortality selection models, which interpret population-level patterns via theorized, unobserved frailty, are squarely in this tradition.

In contrast, much recent empirical work in demography can be characterized as part of a broader tradition of population studies drawing inspiration from myriad social sciences and engaging more substantively with social stratification. The sources of lifespan inequality are surely multiple and intersecting, and the advent of richer data sets allows some of these multiple heterogeneities to be measured. But recent work on the crossover, which sits uneasily between these traditions of formal and empirical demography, has tried to break open the black box of “frailty” without the benefit of any formal model of multidimensional selection.

Unidimensional frailty models are elegant and powerful tools for answering unidimensional questions, but multidimensional research questions need multidimensional theory. Explicitly multidimensional models of mortality crossovers can provide a first attempt to unite these two demographic traditions so that the substantive questions of recent empirical literature can be addressed with formal precision. Even as data sets grow ever richer, frailty remains an essential concept for mortality studies, as long as the heterogeneity that we do not measure remains as consequential as that which we do. The essential insight of all selection models—that observed associations, taken at face value, can mislead us about issues as fundamental as whether Blacks or Whites are truly disadvantaged in old age—remains powerful and necessary.

But knowing that selection against *something* must wholly or partially account for the crossover is only somewhat satisfying. Ultimately, we want to build and test theories about what constitutes frailty. The results here are a methodological step toward that goal, although they suggest that some plausible avenues of testing selection models are fraught with difficulty. As models are made more substantively realistic by incorporating multiple dimensions of heterogeneity, tests of them using the age at crossover will need to be based on far more specific, substantively grounded (and fallible) assumptions about unmeasured inequalities. Or perhaps we will need other strategies altogether.

## Acknowledgments

This article benefitted from extensive discussion with Felix Elwert, along with helpful comments from James Montgomery, Alberto Palloni, Jenna Nobles, James Walker, Erik Olin Wright, Kirkwood Adams, Jenny Conrad, Michal Engelman, Josh Goldstein, Kathryn Grace, Sarah Grey, Jeffrey Grigg, Paul Hanselman, Anna Haskins, Vida Maralani, Jude Mikal, Phyllis Moen, Michelle Niemann, Sarah Thomas, and Rob Warren; valuable feedback and generous assistance from Andreas Wienke; and valuable feedback from five anonymous reviewers, two editorial teams, and the copyeditor. Funding for this research was provided by the Robert Wood Johnson Foundation Health and Society Scholars; an NICHD training grant (T32 HD07014); graduate fellowships from the National Science Foundation, the Ford Foundation, and the Institute for Research on Poverty at the University of Wisconsin–Madison; and core grants to the Minnesota Population Center (P2C HD041023) at the University of Minnesota, to the Center for Demography and Ecology at the University of Wisconsin–Madison (P2C HD047873), and to the Center for Demography of Health and Aging (P30 AG017266) at the University of Wisconsin–Madison.

## Notes

^{1}

The multidimensional mortality selection considered here differs from the multivariate mortality selection analyzed elsewhere, as in “shared frailty models” (e.g., Guo and Rodriguez 1992; Henderson and Oman 1999; Vaupel 1988; Wienke 2010:131–160). The former deals with multiple independent variables; the latter, multiple (correlated) survival-time outcomes.

^{2}

In the classical mortality selection literature about crossovers, disadvantage has been operationalized in different ways, including greater mortality for Blacks than Whites at all levels of frailty, with Black and White frailty equal at baseline (Vaupel et al. 1979); greater mortality for Blacks than Whites specifically among the frail (Vaupel and Yashin 1985); and greater mortality for Blacks than Whites among the frail and a larger initial proportion of frailty among Blacks at baseline (Lynch et al. 2003). The model I employ is consistent with the approach of Vaupel et al. (1979), which offers the neatest fit with the general preference for proportional hazard models (by assuming Black disadvantage for all cohort members, not just the frail) and allows me to highlight how multidimensional selection produces flexible crossover results even without differences in Black and White initial frailty distributions. An alternative model specification is considered in section 4 of the online appendix.

^{3}

I assume the same baseline distributions of frailty in both populations in order to focus analysis cleanly on the dynamics of mortality selection, rather than other potential sources of racial difference in mortality. The main substantive points do not depend on this assumption. Nonetheless, if Black and White frailty composition differed at birth, some aspects of the presentation of results would differ, as I remark later in footnote 7.

^{4}

These two derivations from unidimensional crossover models (e.g., Vaupel et al. 1979) follow from the widely known fact that in a proportional hazards context with one unmeasured (e.g., frailty) and one measured (e.g., race) covariate, the unmeasured covariate leads to an underestimation of the effect of the measured covariate (see, e.g., Aalen 1988; Henderson and Oman 1999; Hougaard et al. 1994). The second assumption is the defining assumption of fixed-frailty models (Finkelstein 2012), which have wide application beyond the crossover. The first assumption is particular to crossover models. The great achievement of selection models of the crossover is to make this first assumption compatible with the existence of a crossover (Vaupel et al. 1979; Vaupel and Yashin 1985).

^{5}

The arrows in Fig. 1 represent multiplicative effects; thus, the overall sign of a path is the product of the signs on each arrow.

^{6}

Increasing the frailty multiplier can increase as well as decrease mortality in each racial population. The total effect of *f* on aggregate mortality in a population depends on which path dominates. Thus, there can be spans of ages at which population-level mortality would be lower with a larger frailty multiplier because fewer frail survivors would remain.

A Black-White crossover can occur regardless of the signs of the total effect of *f* on aggregate mortality in the Black and White populations. When the total effect of frailty on mortality is less positive, or more negative, for the White population than for the Black population at a given age, a crossover can occur. (Whether a crossover occurs additionally depends on whether the effect of frailty outweighs the Black mortality disadvantage at the subpopulation level, as suggested by panel a of Fig. 1.)
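The first point in this note, that a larger frailty multiplier can lower aggregate mortality at some ages, can be checked numerically. The following is a minimal sketch, not the article's simulation code. It assumes a two-point fixed-frailty population in which a share `p_frail` of the birth cohort has hazard *f* times the baseline; the function name, the parameter values, and the use of cumulative baseline hazard in place of age are illustrative assumptions.

```python
import math


def aggregate_multiplier(f, cum_hazard, p_frail=0.5):
    """Aggregate mortality relative to the robust baseline hazard.

    Assumed (hypothetical) setup: a share p_frail of the birth cohort has
    hazard f * mu0(a) and the rest has hazard mu0(a). cum_hazard is the
    cumulative baseline hazard accumulated by the age in question.
    """
    s_frail = math.exp(-f * cum_hazard)   # survivorship of the frail
    s_robust = math.exp(-cum_hazard)      # survivorship of the robust
    # Proportion of survivors who are frail (the selection path).
    pi = p_frail * s_frail / (p_frail * s_frail + (1 - p_frail) * s_robust)
    # Aggregate hazard is mu0 * [pi * f + (1 - pi)], i.e. mu0 * [1 + (f - 1) * pi].
    return 1 + (f - 1) * pi


# Compare a smaller and a larger frailty multiplier early and late in life.
for h in (0.1, 3.0):
    print(h, aggregate_multiplier(2, h), aggregate_multiplier(5, h))
```

Under these assumed values, the larger multiplier raises aggregate mortality when little baseline hazard has accumulated and lowers it later, because almost no frail survivors remain when *f* is large.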

^{7}

The former fact depends on the assumption that Blacks and Whites are equally likely to be frail at birth. If Blacks were more likely than Whites to be frail at birth, then this term would become negative only if the greater selection against frailty among Blacks eventually outweighed their initial excess frailty.

^{8}

The formula for π_{k, i}(*a*) in the multidimensional model is analogous to the formula for π_{k}(*a*) in the unidimensional model given in Eq. (4), replacing the subpopulation-level survivorships *S*_{k, j}(*a*) in Eq. (4) with the corresponding group-level survivorships *S*_{k, i, j}(*a*) for the *i*th (exposed or non-exposed) subpopulation.

^{9}

In the multidimensional model, I use uppercase Greek letters for composition defined at the population level: T_{k}(*a*), the (observed) proportion of each racial population that is exposed, aggregated over residual frailty; and Π_{k}(*a*), the (unobserved) proportion of each racial population that is residually frail, aggregated over tobacco exposure. I use lowercase Greek letters for composition defined at the subpopulation level: π_{k, i}(*a*), the (unobserved) proportion of each exposure subpopulation that is residually frail; and τ_{k, j}(*a*), the (unobserved) proportion of each residual frailty subpopulation that is tobacco-exposed.

^{10}

One could instead decompose population-level mortality into the aggregate proportion of each racial population that is residually frail (Π_{k}(*a*), unobserved) and the proportion of each of those subpopulations that is exposed (τ_{k, j}(*a*), unobserved). Regardless of how population-level mortality is decomposed, it reflects the distribution of each race along both dimensions of heterogeneity simultaneously.

^{11}

This formulation is not necessarily a socially coherent counterfactual: in reality, if the social treatment of people designated as Black were substantially altered, presumably so would the social treatment of people designated as White. (For example, White mortality might rise if Whites were no longer protected by racism from meritocratic competition from Blacks; or White mortality might fall if social welfare programs were not highly racialized and denigrated.) In causal terms, the assumption beneath this simplification is the stable unit treatment value assumption (SUTVA) (Morgan and Winship 2015:48–52).

^{12}

I have argued elsewhere (Wrigley-Field and Elwert 2017) that the crossover literature should more seriously attend to how selection dynamics interact with racial disadvantages that may shrink or grow over the life course. Incorporating dynamic frailty alongside multiple dimensions of frailty is mathematically complex.

^{13}

The alternative model in which Black-White inequality is implicit in the heterogeneity distribution, rather than explicit in a mortality multiplier, also allows for a conceptually neat distinction between individual inequality and population disparity. In that alternative model, though, this distinction can only be defined with a more complex counterfactual (Wrigley-Field and Elwert 2017) rather than with a single, simple parameter.

^{14}

Indeed, the empirical studies with which this article is in closest dialogue (Dupre et al. 2006; Sautter et al. 2012) used binary observed heterogeneity without committing to any particular specification of unobserved heterogeneity.

^{15}

The dimensions of heterogeneity explored in the empirical literature are traits that—unlike “frailty”—are acquired and lost by individuals over time. This extension of the classic mortality selection models to time-varying dimensions of heterogeneity can introduce significant complications (see Manton et al. 1994, 1995; Rogers 1992; Vaupel et al. 1988; Woodbury and Manton 1983; Wrigley-Field 2013) that are not considered in those studies or in this one. Here, I focus solely on how fixed dimensions of heterogeneity interact in the selection process.

^{16}

This criterion is explicit in Dupre et al. (2006:146):

> To investigate whether religious involvement operates as a source of heterogeneity, two conditions must be satisfied and are hypothesized separately. First, in accordance with prior research that shows that religious involvement is more protective for blacks, the following hypothesis must be true: the effect of religious involvement will have a greater impact among blacks on the risk of dying […]. Thus, blacks who attend services weekly or more will have a larger reduction in mortality than whites. Second, to support the claim that religion contributes to why hazard rates invert, the effect of religion must vary with age.

In Sautter et al. (2012), this criterion is implicit but undergirds the empirical analysis.

^{17}

It is well known that frailty can increase in populations in which individuals can newly acquire frailty during their lives; for one systematic exploration of population dynamics that can result from such dynamic frailty, see Vaupel et al. (1988). It is specifically in the context of frailty fixed in individuals that the frailty increases and frailty reversals illustrated here are deeply surprising.
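A frailty increase with traits fixed at birth can be reproduced in a small numerical sketch. This is an illustration under stated assumptions, not the article's model or simulation code. It assumes that exposure and residual frailty are independent at birth and that each multiplies a common baseline hazard; the function name and all parameter values are hypothetical and were chosen only to make the effect visible.

```python
import math


def share_residually_frail(cum_hazard, f_star=1.75, e=10.0,
                           p_exposed=0.99, q_frail=0.5):
    """Population-level share of survivors who are residually frail.

    Hypothetical setup: exposure (multiplier e) and residual frailty
    (multiplier f_star) are fixed at birth, independent of each other, and
    multiply a common baseline hazard. cum_hazard is the cumulative
    baseline hazard accumulated by the age in question.
    """
    def surv(frail_mult):
        # Survivorship pooled over exposed and non-exposed members.
        x = frail_mult * cum_hazard
        return p_exposed * math.exp(-e * x) + (1 - p_exposed) * math.exp(-x)

    frail = q_frail * surv(f_star)
    robust = (1 - q_frail) * surv(1.0)
    return frail / (frail + robust)


# Share residually frail at four increasing levels of cumulative hazard.
shares = [share_residually_frail(h) for h in (0.0, 0.4, 0.5, 3.0)]
print(shares)
```

With these assumed values, the share of survivors who are residually frail first falls, then rises over an interval, then falls again, although no individual ever acquires frailty. The rise occurs while the residually frail survivors are already mostly non-exposed and the residually robust survivors are still losing their exposed members quickly.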

^{18}

In the language of causal inference, the association occurs because mortality is a *collider* for its risk factors. Conditional on survival, those risk factors become associated (see Elwert and Winship 2015). In the classical mortality selection model of the crossover, mortality is a collider for race and frailty. In the multidimensional model, mortality is a collider for race, observed exposure, and residual frailty, producing three-way associations between them over age.
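The collider logic can be made concrete with a short calculation. The sketch below is illustrative only. It assumes two binary risk factors that are independent at birth and that multiply a common baseline hazard; the function name and the multipliers `e` and `f` are hypothetical.

```python
import math


def survivor_odds_ratio(cum_hazard, e=2.0, f=3.0):
    """Odds ratio between two binary risk factors among survivors.

    Assumes the factors are independent at birth and multiply a common
    baseline hazard (e and f are hypothetical multipliers). Under baseline
    independence the prevalences cancel out of the odds ratio, so only the
    four group survivorships are needed.
    """
    def surv(mult):
        # Survivorship of a group whose hazard is mult times the baseline.
        return math.exp(-mult * cum_hazard)

    # (both factors * neither factor) / (first only * second only)
    return (surv(e * f) * surv(1.0)) / (surv(e) * surv(f))


print(survivor_odds_ratio(0.0))  # no association at birth
print(survivor_odds_ratio(0.5))  # below 1: negative association among survivors
```

In this setup the odds ratio equals exp(−*H*(*e* − 1)(*f* − 1)), where *H* is the cumulative baseline hazard, so any two factors that both raise mortality multiplicatively become negatively associated among survivors as the cohort ages.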

^{19}

Whether the disadvantage associated with residual frailty, *f*^{∗}, has a larger effect on the mortality of the non-exposed or the exposed (and whether these effects have the same sign) depends on whether the increased mortality of the residually frail groups outweighs the increased selection of the residually frail groups in each exposure subpopulation. Both effects are greater among the exposed, making the total effects of their competing signs unpredictable *a priori*.

^{20}

Whether the effects of covariates have *a priori* predictable or unpredictable signs is determined by the level of aggregation, not the dimension of heterogeneity. The mortality penalty associated with residual frailty, *f*^{∗}, always has a negative effect on the proportion of survivors who are residually frail at the subpopulation level, $f^{*} \xrightarrow{-} \pi_{k,i}(a)$. But because the dimensions of heterogeneity interact at the population level, as illustrated in Fig. 5, *f*^{∗} can have either a negative or a positive effect on the proportion of survivors who are residually frail at the population level, $f^{*} \xrightarrow{+/-} \Pi_{k}(a)$. Hence, the effect of *f*^{∗} on population mortality via its effect on the *population-level* proportion residually frail could be positive or negative, $f^{*} \xrightarrow{+/-} \Pi_{k}(a) \xrightarrow{+/-} \bar{\mu}_{k}(a)$.

^{21}

Simulations show that the aggregate can cross simultaneously with either the exposed or the non-exposed subpopulation. Simultaneous crossovers are defined as crossovers occurring at the same survivorship of the robust non-exposed Whites (thus, the same age), to three decimal places. The simulation procedure is described in section 3 of the online appendix.

^{22}

One might suspect that the aggregate crossover is nearly simultaneous with the non-exposed crossover because by the time the aggregate crossover occurs, virtually all survivors are non-exposed. But this is not the case. At age 82, when the aggregate and non-exposed crossovers occur, 27% of Black survivors and 24% of White survivors are exposed. (Thus, there has been a “frailty” reversal in observed exposure.)

^{23}

Any such covariates would need to be studied in a setting in which, or operationalized such that, they are fixed in individuals. For example, chronic illnesses acquired by middle adult or early elderly ages might be used as a strong predictor of mortality at older ages.

^{24}

Frailty reversals and increases might still occur along the residual frailty dimension, complicating the interpretation of the observed associations between tobacco exposure and mortality. See also the discussion in section 4 of the online appendix, which qualifies this criterion for alternative models in which one dimension of heterogeneity is more consequential for Whites and the other is more consequential for Blacks.

^{25}

Because frailty reversals and frailty crossovers will not always occur along the dimension of heterogeneity that less strongly increases mortality, only the presence (not the absence) of these phenomena constitutes a test of this model.

^{26}

Because subpopulation crossovers will not always occur even in subpopulations whose frailest Whites have higher mortality than the most robust Blacks, only the presence (not the absence) of subpopulation crossovers constitutes a test of this model.

## Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.