## Abstract

When investigating relationships between education and health, one has to take age into account. Conditioning on age entails conditioning on surviving, which has been argued to lead to a potential selection bias. In this note, I argue that surviving should be considered as a necessary precondition for the relationships of interest and, therefore, not as a possible source of bias. I criticize models of health trajectories that do not condition on surviving.

## Introduction

When investigating relationships between education and health, one has to take age into account. Conditioning on age entails conditioning on surviving, which has been argued to introduce a potential selection bias that should be dealt with in some way (e.g., Beckett 2000; Chen et al. 2010; Kim and Durden 2007; Lynch 2003). In this note, I argue that surviving should be considered as a necessary precondition for the relationships of interest and, therefore, not as a possible source of bias.

I begin with a brief introduction of the conceptual framework and then consider mean health trajectories, which are defined conditional on surviving. I criticize the argument that references to selective mortality can help to explain differences of mean health trajectories of persons with different educational levels. I then consider age-specific changes of health and argue that these changes must also be defined conditional on surviving. Subsequently, I show that this condition is no hindrance to a causal interpretation of the relationship between education and health. Finally, I briefly consider growth curve models. If estimated in a temporally local way that allows conditioning on surviving, growth curve models are tools for modeling mean health trajectories. In contrast, results of hierarchical growth curve models are difficult to interpret and are potentially misleading because these models implicitly assume that all individual trajectories are defined for a common temporal domain. I end with a brief conclusion.

## Conceptual Framework

Let Ht* be a quantitative variable representing the health of a person at age t = 0, 1, 2, ... (measured in years). In subsequent notations, I assume that Ht* is a discrete variable that can assume only nonnegative values. Using L for the person’s length of life, one can start from Pr(Ht* = h | L ≥ t): that is, the probability of Ht* = h at age t conditional on having survived at least until that age. The variable X will be used to record the person’s level of education and will be assumed to be a time-constant variable conditional on L ≥ t0 (e.g., t0 = 30). The interest concerns how
$\Pr(H_t^* = h \mid L \ge t, X = x)$
for t ≥ t0 depends on age and education (and possibly on additional covariates). The quantities of interest are conditional on surviving because a person’s health is defined only while that person is alive.
To stress that surviving is a necessary precondition for values of Ht* to exist, I also use a variable Ht, which equals Ht* but can also take the value Ht = −1 when the person is already dead at the age t (L < t). In contrast with Ht*, Ht is defined for all possible ages, and the basic quantities of interest can be written as
$\Pr(H_t = h \mid H_t \ge 0, X = x) \qquad (1)$
without a reference to L. It often suffices to consider conditional mean values defined by
$\mathrm{E}(H_t \mid H_t \ge 0, X = x) = \sum_h h \, \Pr(H_t = h \mid H_t \ge 0, X = x).$
Using this formal framework for empirical research requires a historical embedding, which I assume to be given by a birth cohort, say, C.<sup>1</sup> Individual members will be referred to by i = 1, ..., N. Thus, one can think of values of the variables introduced earlier: hit, li, and xi, respectively. By definition, hit = −1 if li < t. Furthermore, one can define individual health trajectories:
$h_i := (h_{i,t_0}, \ldots, h_{i,t_m}),$
where tm is some fixed highest age. As mentioned earlier, the trajectories start from an age, t0, in which the level of education, X, is reached. To ease notations, I assume that li ≥ t0 for all members of C.
The reference to a particular birth cohort allows one to think of the probabilities introduced in Eq. (1) as frequencies defined by
$\Pr(H_t = h \mid H_t \ge 0, X = x) := \frac{\sum_i I[h_{it} = h, \, x_i = x]}{\sum_i I[h_{it} \ge 0, \, x_i = x]},$
where I[.] denotes the indicator function, and the summation is over all members of C.
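
As a minimal sketch of how these frequencies could be computed for a single age t, the following Python fragment codes deceased members as −1, as in the definition of Ht. The function names and the eight-person toy cohort are hypothetical and serve only as an illustration.

```python
import numpy as np

def conditional_health_dist(h_t, x, level):
    """Relative frequencies of H_t = h among cohort members who are alive
    at age t (h_t >= 0) and have educational level `level`."""
    h_t = np.asarray(h_t)
    x = np.asarray(x)
    alive = (h_t >= 0) & (x == level)          # denominator of the ratio
    if not alive.any():
        return {}                              # undefined: nobody survives
    values, counts = np.unique(h_t[alive], return_counts=True)
    return {int(v): float(c) / int(alive.sum()) for v, c in zip(values, counts)}

def conditional_mean_health(h_t, x, level):
    """E(H_t | H_t >= 0, X = level) as the sum over h of h * Pr(H_t = h | ...)."""
    dist = conditional_health_dist(h_t, x, level)
    if not dist:
        return float("nan")
    return sum(h * p for h, p in dist.items())

# Hypothetical cohort of eight persons at one age t; -1 codes "already dead".
h_t = [3, 2, -1, 2, 0, -1, 4, 1]
x = [1, 1, 1, 1, 0, 0, 0, 0]
print(conditional_health_dist(h_t, x, 1))
print(conditional_mean_health(h_t, x, 0))
```

The deceased members enter neither the numerator nor the denominator, so the value −1 never contributes to a frequency or a mean.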

## Comparing Mean Health Trajectories

A main research question concerns differences between health trajectories of persons with different educational levels (e.g., Beckett 2000; Dupre 2007; Herd 2006; Lynch 2003, 2006; Ross and Wu 1996). Let Cx denote the subset of C that has members of educational level x. One aims to compare the health trajectories of members of Cx′ with those of members of Cx′′, where x′ and x′′ are two educational levels (I assume that x′ < x′′). One possibility is to consider
$\Delta_t := \mathrm{E}(H_t \mid H_t \ge 0, X = x'') - \mathrm{E}(H_t \mid H_t \ge 0, X = x')$
and investigate how these differences develop over the life course (e.g., become smaller or larger with age). This approach basically consists of a cross-sectional comparison of mean health trajectories defined by
$\bar{h}(x) := (\bar{h}_{t_0}(x), \ldots, \bar{h}_{t_m}(x)),$
where $\bar{h}_t(x) := \mathrm{E}(H_t \mid H_t \ge 0, X = x)$. It is noteworthy that such mean health trajectories can be estimated with cross-sectional data. If C is defined by a birth year tc, in order to estimate $\bar{h}_t(x)$, it would suffice to have a representative sample for the calendar year tc + t, and then consider persons born in tc. Of course, because the sample has to be drawn in a delimited region, cohort membership can change because of in- and out-migration.<sup>2</sup>

## Selective Mortality

Several studies found evidence for an age-as-leveler hypothesis, meaning that values of ∆t become smaller at higher ages (e.g., Beckett 2000; Dupre 2007; Herd 2006). This hypothesis motivated a discussion of whether a leveling effect of age could be explained by selective mortality. The basic question concerns how the subsets Cx are changed through mortality. Obviously, their size decreases monotonically with age, at a rate that depends on the educational level, x, as described by the probabilities Pr(L ≥ t | X = x). Because surviving depends on health, one can also think that mortality changes the distribution of health in the surviving population.

This is illustrated in Fig. 1, based on 30 individual trajectories hit = αi + βi(t − 30).<sup>3</sup> Values of αi are random draws, uniformly distributed in the interval (0, 4), and βi = −0.05. In accordance with the definition of Ht, individuals are assumed to be dead if hit < 0. The bold line shows the mean values of the surviving individuals’ health.
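
A setup of this kind can be sketched in a few lines of Python. The seed, the age grid, and all variable names are assumptions made for illustration, so the draws are not those underlying Fig. 1.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
ages = np.arange(30, 116)                    # assumed age grid, t0 = 30
alpha = rng.uniform(0.0, 4.0, size=30)       # health at age 30
beta = -0.05                                 # common slope of all trajectories

# Latent linear trajectories h_it = alpha_i + beta * (t - 30).
h_star = alpha[:, None] + beta * (ages[None, :] - 30)
alive = h_star >= 0                          # a person is dead once h_it < 0
h = np.where(alive, h_star, -1.0)            # H_t uses -1 for the dead

# Mean health of the survivors at each age (the bold line in the figure).
n_alive = alive.sum(axis=0)
sum_alive = np.where(alive, h_star, 0.0).sum(axis=0)
mean_surv = np.full(ages.size, np.nan)       # undefined once nobody is alive
np.divide(sum_alive, n_alive, out=mean_surv, where=n_alive > 0)
```

The mean is left undefined (`nan`) at ages that no member of the group reaches, which mirrors the point that health values exist only conditional on surviving.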

However, the implications of mortality illustrated in Fig. 1 do not allow drawing any definite conclusions for the comparison of mean health trajectories of two groups, Cx′ and Cx′′. This is illustrated in Fig. 2, which compares mean health trajectories of two groups. Cx′′ consists of the 30 trajectories shown in Fig. 1. Cx′ comprises 30 trajectories that equal those in Cx′′ at t0 but decline with slope −0.07 (instead of −0.05). Clearly, the gap between the two mean health trajectories widens with age.
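
The comparison can be sketched as follows. As an assumption made so that the numbers can be checked by hand, 30 evenly spaced starting values in (0, 4) stand in for the random draws, and the age range is cut at 70; the helper `mean_survivor_health` is hypothetical.

```python
import numpy as np

def mean_survivor_health(alpha, slope, ages, t0=30):
    """Mean health of the survivors at each age for linear trajectories
    alpha_i + slope * (t - t0); a person is dead once the value is < 0."""
    h = alpha[:, None] + slope * (ages[None, :] - t0)
    alive = h >= 0
    n = alive.sum(axis=0)
    total = np.where(alive, h, 0.0).sum(axis=0)
    out = np.full(ages.size, np.nan)
    np.divide(total, n, out=out, where=n > 0)
    return out

alpha = np.linspace(0.0, 4.0, 32)[1:-1]      # 30 evenly spaced values in (0, 4)
ages = np.arange(30, 71)
mean_high = mean_survivor_health(alpha, -0.05, ages)   # group with slope -0.05
mean_low = mean_survivor_health(alpha, -0.07, ages)    # group with slope -0.07
gap = mean_high - mean_low                             # Delta_t
print(round(gap[0], 4), round(gap[-1], 4))
```

Under these assumptions the gap is zero at age 30 and about 0.41 at age 70. With only 30 persons per group it moves in small jumps whenever someone dies and is not monotone at every single age, which fits the remark that selective mortality alone permits no definite conclusion about the direction of the gap.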

There is, however, a more fundamental reason why a reference to selective mortality is problematic. Given the distribution of $H_{t-1}$, conditional on X = x, the mean value E(Ht | Ht ≥ 0, X = x) is a result of both mortality and a change in the health of the surviving persons, formally expressed as
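$\mathrm{E}(H_t \mid H_t \ge 0, X = x) = \sum_{h \ge 0} \mathrm{E}(H_t \mid H_t \ge 0, H_{t-1} = h, X = x) \, \Pr(H_{t-1} = h \mid H_t \ge 0, X = x). \qquad (2)$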
Because of mortality,
$\Pr(H_{t-1} = h \mid H_t \ge 0, X = x) \ne \Pr(H_{t-1} = h \mid H_{t-1} \ge 0, X = x),$
and the difference between these two probability distributions could be called a mortality effect (at the respective age). However, because surviving is a necessary precondition for the mean value in Eq. (2) to exist, this mortality effect cannot be hypothetically dismissed. In fact, assuming that this mortality effect is 0 would entail
$\Pr(H_t \ge 0 \mid H_{t-1} = h, X = x) = \Pr(H_t \ge 0 \mid H_{t-1} \ge 0, X = x),$
so that surviving at age t would be independent of the health $H_{t-1}$. But then, surviving should also be independent of the educational level; health otherwise could not depend on education.
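
The decomposition in Eq. (2) and the mortality effect can be checked on a small numerical example. The ten health values below are hypothetical and refer to one educational level; the three persons with the lowest health at age t − 1 are assumed to die before age t.

```python
import numpy as np

# Hypothetical health values of ten persons at ages t-1 and t; -1 codes "dead".
h_prev = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 3])
h_curr = np.array([-1, -1, -1, 0, 1, 1, 2, 1, 2, 3])

alive_prev = h_prev >= 0
alive_curr = h_curr >= 0

def dist(values, mask):
    """Relative frequencies of `values` among the persons selected by `mask`."""
    vals, counts = np.unique(values[mask], return_counts=True)
    return dict(zip(vals.tolist(), (counts / mask.sum()).tolist()))

p_surv = dist(h_prev, alive_curr)   # Pr(H_{t-1} = h | H_t >= 0)
p_all = dist(h_prev, alive_prev)    # Pr(H_{t-1} = h | H_{t-1} >= 0)

# Left-hand and right-hand side of Eq. (2).
lhs = h_curr[alive_curr].mean()
rhs = sum(h_curr[alive_curr & (h_prev == h)].mean() * p
          for h, p in p_surv.items())
print(lhs, rhs)
print(p_surv, p_all)
```

Both sides of Eq. (2) equal 10/7 here, while the two distributions of health at age t − 1 differ: the value 0 has frequency 0.2 among all persons alive at t − 1 but does not occur among those who survive until t.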

## Changes of Health

Mean health trajectories describe the development of age-specific mean values of the health of surviving persons; they are not mean values of individual trajectories, as Fig. 1 illustrates. In fact, because there is no common temporal support, an average of individual trajectories cannot be defined in a temporally extended way. One possibility to come closer to a consideration of individual trajectories is to focus on changes of health between consecutive ages. In the present conceptual framework, such changes can be defined as
$\delta_t(x) := \mathrm{E}(H_t - H_{t-1} \mid H_t \ge 0, X = x),$
that is, the expectation of the difference in health between age t − 1 and t, conditional on education and having survived at least until t. The conditioning on surviving is required for the definition of a change, and is also the reason why δt(x) is generally not equal to a change of the mean health trajectory defined as
$\delta_t^*(x) := \mathrm{E}(H_t \mid H_t \ge 0, X = x) - \mathrm{E}(H_{t-1} \mid H_{t-1} \ge 0, X = x).$

While δt(x) is the mean of the age-specific changes of the individual health trajectories, δt*(x) is the age-specific change of the mean health trajectory. The difference is immediately visible in Fig. 1. In this example, all individual trajectories change in the same way, independent of age. The mean of these changes is simply the gradient –0.05, which is obviously different from the age-dependent gradients of the mean trajectory.

The difference between the two ways of assessing change is due to mortality. To think of an individual’s change of health between t − 1 and t requires that the individual survives at least until t. In contrast, the mean health trajectory relates to a group of individuals that changes continuously through mortality. However, these changes are not a source of bias: δt(x) and δt*(x) are simply different concepts, both providing relevant information.
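
The two quantities can be computed side by side for trajectories of the kind used in Fig. 1. The sketch below assumes an arbitrary seed and an age range of 30 to 70; `masked_mean` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
alpha = rng.uniform(0.0, 4.0, size=30)
ages = np.arange(30, 71)                     # assumed age range
h = alpha[:, None] - 0.05 * (ages[None, :] - 30)
alive = h >= 0                               # dead once h_it < 0

def masked_mean(values, mask):
    """Column-wise mean of `values` over the entries selected by `mask`."""
    return np.where(mask, values, 0.0).sum(axis=0) / mask.sum(axis=0)

mean_surv = masked_mean(h, alive)            # mean health trajectory

# delta_t: mean individual change among those alive at t.
delta = masked_mean(h[:, 1:] - h[:, :-1], alive[:, 1:])

# delta*_t: change of the mean health trajectory between t-1 and t.
delta_star = mean_surv[1:] - mean_surv[:-1]

print(delta[:5])
print(delta_star[:5])
```

Here `delta` equals −0.05 at every age, whereas `delta_star` is never smaller and exceeds it whenever a death occurs, because the persons who die between t − 1 and t are those with the lowest health at t − 1. If the starting values are idealized as a continuous uniform distribution on (0, 4), the mean health of the survivors is 2 − 0.025(t − 30), so the mean trajectory declines at roughly half the individual rate.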

Note that δt(x) is a mean value, derived from a broad variety of underlying individual changes $h_{i,t}^* - h_{i,t-1}^*$, and δt(x′′) − δt(x′) is therefore to be interpreted as a between-person effect. Also note that the quantities δt(x) provide only temporally local descriptions, which is illustrated in Fig. 3, again using the 30 trajectories from Fig. 1. The arrows are defined by
$(t, \mathrm{E}(H_t \mid H_{t+d} \ge 0)) \rightarrow (t + d, \mathrm{E}(H_{t+d} \mid H_{t+d} \ge 0)), \qquad (3)$
where d = 3.<sup>4</sup> The arrows show the temporally local mean direction of health change in three-year intervals. A concatenation of the arrowheads would equal the mean health trajectory shown in Fig. 1. However, because the arrows presuppose surviving until at least the endpoint of the arrow, they cannot be concatenated in a continuous way.
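
The arrows of Eq. (3) can be sketched as follows, again for linear trajectories with slope −0.05. The seed, the starting ages, and the helper `health` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
alpha = rng.uniform(0.0, 4.0, size=30)
d = 3                                        # length of the intervals in years

def health(age):
    """Latent health of all 30 persons at a given age (slope -0.05)."""
    return alpha - 0.05 * (age - 30)

arrows = []
for t in np.arange(30, 68, d):               # arrows start at 30, 33, ..., 66
    survivors = health(t + d) >= 0           # condition: alive at the endpoint
    tail = health(t)[survivors].mean()       # E(H_t     | H_{t+d} >= 0)
    head = health(t + d)[survivors].mean()   # E(H_{t+d} | H_{t+d} >= 0)
    arrows.append((int(t), tail, int(t + d), head))

for t, tail, t_end, head in arrows[:3]:
    print(t, round(tail, 3), "->", t_end, round(head, 3))
```

Each arrow falls by exactly 3 × 0.05 = 0.15. The tail of the next arrow lies at or above the head of the previous one, because it averages only over those who survive a further three years; this is why the arrows cannot be joined into one continuous curve.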

## Causal Considerations

I have argued that a person’s surviving is a necessary precondition for a meaningful reference to health. In this section, I briefly point to a consequence for a causal interpretation of the relationship between education and health.

Health is a time-varying variable, and health at age t + 1 depends on health at age t. Considerations of causally relevant conditions of health therefore require a temporally local (age-specific) approach. The following diagram relates to age t.

I use X (education) and Ht (health) as defined earlier. In addition, the variable Dt represents the survival status (1 = dead, 0 = still alive). Of course, a reference to relationships between education and health at age t presupposes survival at least until this age (Dt = 0).

The diagram allows one to think of a direct effect of X on $H_{t+1}$, which requires conditioning on values of Ht and $D_{t+1}$. There are two ways to do so. One way is to compare
$\mathrm{E}(H_{t+1} \mid X = x, H_t = h_t, D_{t+1} = 0) \qquad (4)$
for two values of X: say, x′ and x′′. These expectations condition on surviving at least until t + 1. The other way is to consider
$\mathrm{E}(H_{t+1} \mid X = x, H_t = h_t, D_{t+1} = 1). \qquad (5)$

However, this expectation is deterministically known to be −1, and the relationship is independent of X and Ht. In other words, conditional on $D_{t+1} = 1$, neither X nor Ht can be attributed a causal effect on $H_{t+1}$. This has an important consequence: namely, that there is no indirect effect of education on health mediated by survival; only the direct effect has a meaningful causal interpretation.

Because Eqs. (4) and (5) are conceptually different, they should be kept distinct. It would not make sense to integrate out $D_{t+1}$, which becomes obvious when writing the following:
$\mathrm{E}(H_{t+1} \mid X = x, H_t = h_t) = \mathrm{E}(H_{t+1} \mid X = x, H_t = h_t, D_{t+1} = 0) \, \Pr(D_{t+1} = 0 \mid X = x, H_t = h_t) + \mathrm{E}(H_{t+1} \mid X = x, H_t = h_t, D_{t+1} = 1) \, \Pr(D_{t+1} = 1 \mid X = x, H_t = h_t)$
because, by definition, $\mathrm{E}(H_{t+1} \mid X = x, H_t = h_t, D_{t+1} = 1) = -1$. To avoid this conclusion, one would need to assume that one can meaningfully refer to the health of a deceased person. However, even if one could refer to properties of a dead person conditional on counterfactually assuming that the person is still alive,<sup>5</sup> the imputation of a particular trajectory of health values for a deceased person is not justifiable. Apart from this problem, it is unclear why one should be concerned with effects of dying on imputed health values of deceased persons. As shown earlier, accepting that survival is a necessary precondition of health is no hindrance to thinking of a causal relationship between education and health. Therefore, an interest in causal relationships does not provide an argument for assuming health trajectories of deceased persons.
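
A small numerical illustration may make the problem concrete. The numbers below are hypothetical: two groups have the same expected health among survivors (2.0) but different probabilities of dying before age t + 1. The function `marginal_expectation` and its `dead_code` parameter are illustrative names.

```python
def marginal_expectation(mean_if_alive, p_dead, dead_code=-1.0):
    """E(H_{t+1} | X, H_t) after integrating out D_{t+1}: a mixture of the
    survivors' mean health and the arbitrary code used for the dead."""
    return mean_if_alive * (1.0 - p_dead) + dead_code * p_dead

# Same health among survivors, different mortality (hypothetical values).
print(marginal_expectation(2.0, p_dead=0.1))   # 2.0 * 0.9 - 1.0 * 0.1 = 1.7
print(marginal_expectation(2.0, p_dead=0.3))   # 2.0 * 0.7 - 1.0 * 0.3 = 1.1

# Recoding the dead as -5 instead of -1 changes the result.
print(marginal_expectation(2.0, p_dead=0.1, dead_code=-5.0))   # 1.3
```

The difference between the first two values reflects only mortality and the coding convention for deceased persons, not any difference in health among the living, and the result shifts with the arbitrary code. This is one way to see why the two conditional expectations should be kept distinct.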

## Growth Curve Models

Researchers often use growth curve models to investigate how health depends on age, education, and/or other covariates. In terms of conditional mean values, a simple growth curve model for the present application can be specified as
$\mathrm{E}(H_t \mid H_t \ge 0, X = x) = \alpha + t\beta + t^2\gamma + x\alpha_x + t x \beta_x + t^2 x \gamma_x. \qquad (6)$

Without explicitly assuming a distribution of residuals, the model can be estimated with ordinary least squares (OLS). The resulting growth curves are then parametric models of mean health trajectories. As illustrated in Fig. 4, the growth curve is derived from OLS estimation of Eq. (6), with x = 0 for the 30 trajectories in Fig. 1.
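
An OLS fit of this kind can be sketched with NumPy for the case x = 0. The seed and the age range are assumptions, and time is measured as years since age 30, which is a reparameterization of Eq. (6) that leaves the fitted curve unchanged. Only person-years in which the person is alive enter the regression.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
alpha = rng.uniform(0.0, 4.0, size=30)
s = np.arange(0, 41)                         # years since age 30 (assumption)
h = alpha[:, None] - 0.05 * s[None, :]
alive = h >= 0                               # condition on surviving

# Pool all person-years of survivors: one row per (person, age) with h >= 0.
t_obs = np.broadcast_to(s, h.shape)[alive].astype(float)
y_obs = h[alive]

# Design matrix for alpha + t*beta + t^2*gamma (x = 0, so no education terms).
X = np.column_stack([np.ones_like(t_obs), t_obs, t_obs**2])
coef, *_ = np.linalg.lstsq(X, y_obs, rcond=None)

resid = y_obs - X @ coef
curve = coef[0] + coef[1] * s + coef[2] * s**2   # fitted growth curve
print(coef)
```

Because the regression uses only observations of living persons, the fitted curve is a parametric approximation of the mean health trajectory of the survivors, not of any individual trajectory.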

Instead of estimating model Eq. (6) with OLS, one can consider hierarchical growth curve models. Several researchers have proposed that such models could be used to investigate how individual (in contrast to mean) health trajectories depend on education or some measure of socioeconomic status (SES) (e.g., Chen et al. 2010; Herd 2006; Lynch 2003). A basic version begins with representing individual health trajectories as
$h_{it}^* = \alpha_{0i} + t\beta_{0i} + t^2\gamma_{0i} + \varepsilon_{it}.$
The parameters are then assumed to depend on education:
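$\alpha_{0i} = \alpha + x_i\alpha_x + \nu_{\alpha,i}, \quad \beta_{0i} = \beta + x_i\beta_x + \nu_{\beta,i}, \quad \gamma_{0i} = \gamma + x_i\gamma_x + \nu_{\gamma,i}.$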
Combining these specifications, the resulting model is
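$h_{it}^* = \alpha + t\beta + t^2\gamma + x_i\alpha_x + t x_i\beta_x + t^2 x_i\gamma_x + \nu_{\alpha,i} + t\nu_{\beta,i} + t^2\nu_{\gamma,i} + \varepsilon_{it}.$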
This notation is in terms of individual values. To obtain a formulation in terms of variables, one assumes that να,i, νβ,i, νγ,i, and εit are realizations of random variables denoted, respectively, by να, νβ, νγ, and εt. Further, one assumes that these variables have a mean of 0 (and makes assumptions about their joint distribution). In terms of variables, the model can then be written as
$H_t^* = \alpha + t\beta + t^2\gamma + x\alpha_x + t x\beta_x + t^2 x\gamma_x + \nu_\alpha + t\nu_\beta + t^2\nu_\gamma + \varepsilon_t,$
with E(να) = E(νβ) = E(νγ) = E(εt) = 0. This formulation allows one to consider the structural core of the model in terms of expectations of the dependent variable:
$\mathrm{E}(H_t^* \mid \mathrm{Age} = t, X = x) = \alpha + t\beta + t^2\gamma + x\alpha_x + t x\beta_x + t^2 x\gamma_x.$

This structural core of the model is identical with Eq. (6) and entails a single expected health trajectory for all persons with the same educational level.

To estimate expected health trajectories with a hierarchical growth curve model, I first consider again the 30 trajectories from Fig. 1. Because all individual trajectories have the same time-constant slope, it suffices to consider the model $H_t^* = \alpha + t\beta + \nu_\alpha + \varepsilon_t$. The solid line in Fig. 5 shows the estimated growth curve ($\hat{\alpha} = 3.364$, $\hat{\beta} = -0.05$). This curve is obviously neither a possible individual trajectory nor some mean of the individual trajectories. The curve might be interpreted as a fictitious reference that allows one to think of the individual trajectories as random deviations (as defined by the stochastic part of the model).<sup>6</sup>
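
The following sketch does not fit a hierarchical growth curve model. As a deliberately simpler stand-in, it contrasts a within-person (fixed-effects) slope with a pooled OLS slope on simulated trajectories of the same kind; the seed, age range, and variable names are assumptions. The within-person slope matches the reported $\hat{\beta} = -0.05$, whereas the pooled slope describes the flatter mean trajectory of the survivors.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
alpha = rng.uniform(0.0, 4.0, size=30)
s = np.arange(0, 41).astype(float)           # years since age 30 (assumption)
h = alpha[:, None] - 0.05 * s[None, :]
alive = h >= 0                               # only survivors are observed

person = np.broadcast_to(np.arange(30)[:, None], h.shape)[alive]
t_obs = np.broadcast_to(s, h.shape)[alive]
y_obs = h[alive]

# Pooled OLS slope over all observed person-years.
pooled_slope = np.polyfit(t_obs, y_obs, 1)[0]

# Within-person slope: demean time and health within each person's own
# observed ages, then regress the deviations on each other.
n_i = np.bincount(person, minlength=30)
t_mean = np.bincount(person, weights=t_obs, minlength=30) / n_i
y_mean = np.bincount(person, weights=y_obs, minlength=30) / n_i
t_dev = t_obs - t_mean[person]
y_dev = y_obs - y_mean[person]
within_slope = (t_dev @ y_dev) / (t_dev @ t_dev)

print(pooled_slope, within_slope)
```

The pooled slope is flatter than −0.05 because persons with high starting values contribute observations at higher ages, so time and the person-specific level are positively correlated in the pooled data.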

In this example, the estimated curve correctly represents the slopes of the individual trajectories, given that all individual trajectories have the same time-constant slope. In general, one does not obtain correct mean values of individual slopes. To show this, I consider a second example consisting of 30 trajectories generated according to
$h_{it}^* = \alpha_i' - (0.008 + \beta_i')\,t - (0.0004 + \gamma_i')\,t^2 + \varepsilon_{it},$
where α′, β′, and γ′ are random numbers uniformly distributed in the intervals (1, 5), (−0.003, 0.003), and (−0.0001, 0.0001), respectively, and εit ~ N(0, 0.01). The dashed curves in Fig. 6 show the individual health trajectories with random fluctuations resulting from a suppressed εit. The figure also shows the mean health trajectory (dotted) as well as a growth curve estimated with a hierarchical growth curve model (solid). Again, this growth curve does not describe a meaningful trajectory but might be interpreted as a fictitious reference.
Mean changes of health can be assessed with the quantities δt(x) defined earlier. Alternatively, the individual trajectories can be thought of as continuous functions of time, starting from their slopes:
$s_{it} := -(0.008 + \beta_i') - 2\,(0.0004 + \gamma_i')\,t.$
Mean values that correspond to δt(x) can be defined as
$\bar{s}_t := \frac{1}{n_t} \sum_{i:\, l_i \ge t} s_{it},$
where nt is the number of persons surviving at least until t. These mean values are shown in Fig. 7 as a dotted curve. In contrast, the solid curve shows the slope of the growth curve estimated with a hierarchical growth curve model (as shown in Fig. 6). This curve corresponds to mean slopes defined by $\sum_i s_{it} / n$, which presuppose a common temporal domain for all individual trajectories.
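
The two kinds of mean slopes can be compared in a short sketch. Several assumptions are made that the text does not state: t is taken to be age in years from 30 to 100, a person counts as dead once the noise-free trajectory falls below 0, and the seed is arbitrary. The noise term is left out because it does not enter the slopes.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed (assumption)
n = 30
a = rng.uniform(1.0, 5.0, size=n)            # alpha'_i
b = rng.uniform(-0.003, 0.003, size=n)       # beta'_i
g = rng.uniform(-0.0001, 0.0001, size=n)     # gamma'_i
t = np.arange(30, 101).astype(float)         # assumed: t is age in years

# Noise-free trajectories and their slopes s_it.
h = a[:, None] - (0.008 + b[:, None]) * t - (0.0004 + g[:, None]) * t**2
slopes = -(0.008 + b[:, None]) - 2 * (0.0004 + g[:, None]) * t
alive = h >= 0                               # assumed death rule: h_it < 0

# Mean slope among the survivors at each age.
n_t = alive.sum(axis=0)
s_bar_surv = np.full(t.size, np.nan)
np.divide(np.where(alive, slopes, 0.0).sum(axis=0), n_t,
          out=s_bar_surv, where=n_t > 0)

# Mean slope over all n persons, which presupposes that every trajectory
# is defined on the whole age range.
s_bar_all = slopes.mean(axis=0)
```

The two curves coincide as long as everyone is alive and can diverge afterward. `s_bar_all` is linear in t by construction, whereas `s_bar_surv` depends on who is still alive at each age.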

## Conclusion

In this article, I argue that a person’s surviving is a necessary precondition for a meaningful reference to her health. Statements about the dependence of health on age, education, and/or other covariates must be understood as being conditional on surviving. Selective mortality should not be considered a source of bias that can hypothetically be dismissed.<sup>7</sup>

Given this understanding, mean health trajectories that are defined conditional on surviving provide meaningful descriptions of the health of the surviving members of a cohort. However, age-dependent changes of a mean health trajectory must be distinguished from mean values of health changes of individual persons.

Selective mortality also has consequences for the understanding of growth curve models. Simple growth curve models (in which residuals have a temporally local definition) can be understood as tools for investigating how mean health trajectories depend on education and/or other covariates. Hierarchical growth curve models, in contrast, implicitly assume that all individual health trajectories have an identical temporal extension. Thus, growth curves estimated with these models do not represent the actually observed health trajectories. Because they do not condition on surviving, these models also misrepresent age-dependent changes of health.

## Notes

1. Several authors have stressed the need to explicitly distinguish between birth cohorts (e.g., Lauderdale 2001; Yang 2007).

2. Additional problems occur when cohorts are based on a broad range of birth years; for some discussion, see Lauderdale (2001).

3. This model, of course, is extremely simplified. Real individual health trajectories show a wide variety of different, generally nonlinear, and often nonmonotonic forms.

4. The plot is inspired by aging-vector graphs as used by Kim and Durden (2007).

5. It already seems difficult to consider this as a thought experiment, as proposed by Herd (2006); see also Noymer (2001).

6. For discussion of this question, see also Kurland et al. (2009). They considered the hierarchical growth curve model as an unconditional model (with regard to surviving), which requires values of the dependent variable for deceased persons as well. This model is contrasted with a partly conditional model, which relates to the surviving members of a cohort and is basically equal to Eq. (6).

7. Of course, it is possible for omitted variables to distort an assessment of the relationship between education and health: for example, if an omitted variable affects both health and mortality such that, conditional on surviving, its correlation with education changes. However, this problem cannot be avoided by hypothetically dismissing the conditioning on survival.

## References

Beckett, M. K. (2000). Converging health inequalities in later life—An artifact of mortality selection? Journal of Health and Social Behavior, 41, 106–119. doi:10.2307/2676363

Chen, F., Yang, Y., & Liu, G. (2010). Social change and socioeconomic disparities in health over the life course in China: A cohort analysis. American Sociological Review, 75, 126–150. doi:10.1177/0003122409359165

Dupre, M. E. (2007). Educational differences in age-related patterns of disease: Reconsidering the cumulative disadvantage and age-as-leveler hypotheses. Journal of Health and Social Behavior, 48, 1–15. doi:10.1177/002214650704800101

Herd, P. (2006). Do functional health inequalities decrease in old age? Research on Aging, 28, 375–392. doi:10.1177/0164027505285845

Kim, J., & Durden, E. (2007). Socioeconomic status and age trajectories of health. Social Science & Medicine, 65, 2489–2502. doi:10.1016/j.socscimed.2007.07.022

Kurland, B. F., Johnson, L. L., Egleston, B. L., & Diehr, P. H. (2009). Longitudinal data with follow-up truncated by death: Match the analysis method to research aims. Statistical Science, 24, 211–222. doi:10.1214/09-STS293

Lauderdale, D. S. (2001). Education and survival: Birth cohort, period, and age effects. Demography, 38, 551–561. doi:10.1353/dem.2001.0035

Lynch, S. M. (2003). Cohort and life-course patterns in the relationship between education and health: A hierarchical approach. Demography, 40, 309–331. doi:10.1353/dem.2003.0016

Lynch, S. M. (2006). Explaining life course and cohort variation in the relationship between education and health: The role of income. Journal of Health and Social Behavior, 47, 324–338. doi:10.1177/002214650604700402

Noymer, A. (2001). Mortality selection and sample selection: A comment on Beckett. Journal of Health and Social Behavior, 42, 326–327. doi:10.2307/3090218

Ross, C. E., & Wu, C.-L. (1996). Education, age, and the cumulative advantage in health. Journal of Health and Social Behavior, 3, 104–120. doi:10.2307/2137234

Yang, Y. (2007). Is old age depressing? Growth trajectories and cohort variations in late-life depression. Journal of Health and Social Behavior, 48, 16–32. doi:10.1177/002214650704800102