## Abstract

Cross-country comparisons of differential survival by socioeconomic status (SES) are useful in many domains. Yet, to date, such studies have been rare. Reliably estimating differential survival in a single country has been challenging because it requires rich panel data with a large sample size. Cross-country estimates have proven even more difficult because the measures of SES need to be comparable internationally. We present an alternative method for acquiring information on differential survival by SES. Rather than using observations of actual survival, we relate individuals’ subjective probabilities of survival to SES variables in cross section. To show that subjective survival probabilities are informative proxies for actual survival when estimating differential survival, we compare estimates of differential survival based on actual survival with estimates based on subjective probabilities of survival for the same sample. The results are remarkably similar. We then use this approach to compare differential survival by SES for 10 European countries and the United States. Wealthier people have higher survival probabilities than those who are less wealthy, but the strength of the association differs across countries. Nations with a smaller gradient appear to be Belgium, France, and Italy, while the United States, England, and Sweden appear to have a larger gradient.

## Introduction

Estimates of differences in survival by socioeconomic status play an important role in multiple domains. In the arena of public policy, they help policymakers and analysts assess how much various groups benefit from public programs, such as Social Security and health care. In financial markets, they are important for designing annuities and life insurance. For individuals, they inform decisions about retirement savings and investments in financial products. In demographic research, they matter when presenting age patterns of a variable (e.g., income or wealth) derived from either cross-sectional or synthetic panel data. This is because differential survival tends to bias these patterns.

Prior studies have estimated differential mortality—the complement to differential survival—by wealth (e.g., Attanasio and Emmerson 2003; Attanasio and Hoynes 2000), income and income inequality (e.g., Bommier et al. 2006; Deaton and Paxson 2004; Duleep 1986, 1989), occupational status (e.g., Marmot 1999), education (e.g., Feldman et al. 1989; Lleras-Muney 2005), and broader measures of socioeconomic status (e.g., Adams et al. 2003; Deaton and Paxson 2001). These studies generally show a strong association between the various measures of economic status and mortality.^{1}

To produce the most reliable estimates of differential survival by socioeconomic status, one ideally needs very rich panel data with consistent observations of socioeconomic status over time, a sufficient sample size, minimal attrition, and observations of actual deaths. But such ideal conditions are quite data-intensive and often difficult to achieve. Attempting to compare differential survival across countries imposes even more demanding requirements on the data because the time periods for observation must match and socioeconomic status must be defined and measured in identical ways from one country to another.

Perhaps in part for these reasons, numerous studies of differential survival by socioeconomic status have focused on individual countries—the United States and the United Kingdom in particular, along with several others.^{2} In contrast, there have been only a few cross-country comparisons of differential survival. But such international studies could be valuable. For instance, they could provide an insightful starting point for understanding the role that institutions (such as national health care systems) might play in reducing inequalities in health—and by extension, survival—between different socioeconomic groups.

The international comparisons that do exist (e.g., Huisman et al. 2004; Kunst and Mackenbach 1994a, b; Mackenbach et al. 1997, 2003) highlight the heterogeneity among certain countries in how socioeconomic status is associated with mortality. For example, Kunst and Mackenbach (1994b) found that inequalities by education in the 1970s were relatively small in the Netherlands and Scandinavian countries, as opposed to in the United States, France, and Italy. Conversely, Mackenbach et al. (1997) found large inequalities by occupational status (manual vs. nonmanual) common to Scandinavian countries in the 1980s.

All of these international studies share a major caveat, however: they rely on data sets that differ along several dimensions, which potentially affects the accuracy of the cross-country comparisons. For instance, the data have sometimes been collected using different survey instruments, or in certain cases have even used information from administrative databases that are unavailable in other countries. Often the data sets do not cover the same time period and groups. They can also differ—often substantially—in the amount of detail they provide on various socioeconomic characteristics, leading to inconsistencies in the variables that could be used in these comparisons.

Given the desirability of comparing differential survival internationally and the lack of data sets with the features ideally needed to do so, we propose in this article an alternative approach for estimating differential survival by socioeconomic status. This approach uses a single cross section of data: people’s subjective estimates of the probability that they will survive to a certain age, and measures of socioeconomic status designed to be directly comparable across countries. Conceptually, this subjective measure of survival is similar to the cohort life-table probability of survival from the respondent’s current age at the time of the survey to a given target age.

Clearly, subjective probabilities of survival do not measure the same concept as observations of actual survival. They convey expectations about a future outcome (i.e., survival) not yet observed; they are subjective, reflecting individuals’ beliefs; and they contain measurement error at the individual level.^{3} But if at the population level and—for our purposes, among relevant subpopulations—subjective survival probabilities strongly correlate with actual survival, then they could be used as informative proxies for estimating differences in actual survival. This method, with its much more modest data requirements, would make it considerably easier to produce estimates of survival directly comparable across countries.

Subjective probabilities of survival have been collected in the U.S. Health and Retirement Study (HRS) every two years since its baseline interview in 1992. The HRS is a nationally representative panel survey of persons born in 1953 or earlier and their spouses,^{4} designed to investigate the health, social, and economic implications of the aging of the American population. To elicit subjective probabilities of survival, the survey asks respondents, “What is the percent chance that you will live to be [X] or more?” The target age X varies as a function of the respondent’s current age. At the same time, the HRS has observations of actual survival, with data spanning a period of up to 14 years linked to the U.S. National Death Index (NDI).

Previous validation studies based on data from the HRS give reason to believe that the approach we propose is viable. For instance, these subjective probabilities predict actual mortality (Bloom et al. 2007; Elder 2007; Hurd and McGarry 1995, 2002; Siegel et al. 2003; Smith et al. 2001). They also vary systematically with known risk factors, such as smoking, and evolve in panel in response to information relevant to survival, such as parental death or onset of disease. Subjective expectations of survival drawn from the HRS data have also been used to construct cohort life tables that predicted an unusual revision of U.S. life expectancy made by the Social Security Administration in 2004 (i.e., increased male and lower female life expectancy, which narrowed the gender gap in longevity by 25%) (Perozek 2008). This suggests that individuals are able to predict their own survival. Finally, subjective probabilities have been shown to predict both expected and actual economic behavior (Bloom et al. 2007; Delavande and Willis 2008; Elder 2007; Hurd et al. 2004). This indicates that these expectations are meaningful to respondents when making decisions.

Recently, subjective probabilities of survival have also been adopted in two European surveys designed to be comparable to the HRS: the English Longitudinal Study of Ageing (ELSA) and the Survey of Health, Ageing and Retirement in Europe (SHARE). Winter (2008) showed that SHARE respondents’ subjective probabilities of survival to age 75 are predictive of two-year survival. Hurd et al. (2005) showed that they vary systematically by known risk factors that affect mortality, just as in the HRS. Moreover, the same authors showed that subjective probabilities of survival correlate strongly with objective measures of health, such as grip strength, which is known to be a strong predictor of morbidity and mortality (Rantanen et al. 1999). These patterns hold in every country.

The availability of ELSA and SHARE allows us to apply the alternative approach we propose to perform cross-country comparisons of differential survival. In the absence of data that are both comparable across countries *and* contain actual observations on survival, studies presenting estimates of differential survival have had to make trade-offs. Most of these have taken advantage of actual observations on deaths but have had to compromise on the comparability of the data sets across countries. We make the opposite trade-off: although we do not use observations of actual survival, the HRS, ELSA, and SHARE provide us with a unique asset—consistently collected socioeconomic data and subjective probabilities of survival for the same time period and the same age group in a range of different countries. This allows us to do the cross-country comparison.

The first part of our article focuses on verifying the viability of our proposed approach. Because the HRS contains data on actual deaths, responses to questions about subjective survival probabilities, and socioeconomic information, it is sufficiently rich to support both the more customary approach to estimating differential survival based on panel data and our proposed alternative. We use panel data from the HRS to estimate differential survival by socioeconomic status in the traditional way on the basis of actual survival and then compare those estimates with ones we obtain from subjective probabilities collected at baseline from the *same* respondents. The estimates we produce using our alternative method are strikingly similar to those based on actual survival. These findings suggest that subjective survival probabilities are informative proxies for estimating differences in actual survival.

In the second part of our article, we use our alternative method to produce directly comparable estimates of differential survival by socioeconomic status based on subjective probabilities of survival for 10 European countries and the United States for the year 2004.

We have conducted the analyses of the first and the second part of the article for three different measures of socioeconomic status: wealth, income, and education. Because of space constraints, we present in detail only the results for wealth, but we provide detailed results for the other two measures of socioeconomic status—income and education—in Online Resource 1.

For every country, we find that people with greater wealth have higher subjective probabilities of survival than those with low wealth, but the strength of the association differs across countries. Nations with a smaller gradient appear to be Belgium, France, and Italy. The gradient appears to be larger in the United States, England, and Sweden. Although our estimates are not meant to replace those based on observations of actual survival, we suggest that they are valuable complements, especially when the data for cross-country evaluations on the basis of actual survival have yet to become available.

## Theoretical Framework of Survey Response

We present a simple theoretical framework of survey response that shows how we estimate differential survival from subjective probabilities of survival.

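Consider the survival function to target age *TA* of a respondent with characteristics *X*_{t} at time *t* before he has reached age *TA*. The survival function determines the outcome *A*_{TA} = 1, being alive at the target age *TA*, according to the following rule:

$$A_{TA} = \begin{cases} 1 & \text{if } X_t\beta + C + \varepsilon > 0 \\ 0 & \text{otherwise,} \end{cases}$$

where β is a vector of coefficients, *C* is a constant, and the ε are independent and identically distributed across individuals, i.e., the ε are individual-specific shocks that affect individual survival.^{5}

The probability of being alive at age *TA* conditional on *X*_{t} is equal to

$$P(A_{TA} = 1 \mid X_t) = P(\varepsilon > -X_t\beta - C).$$

Denoting *G* the cumulative distribution of ε, we get

$$P(A_{TA} = 1 \mid X_t) = 1 - G(-X_t\beta - C). \tag{1}$$

If respondents know the shape of their survival function and the relevant *X*_{t}, their predictions Π_{TA} about their chance of survival to age *TA* are given by

$$\Pi_{TA} = 1 - G(-X_t\beta - C). \tag{2}$$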

To derive Eq. 2 from Eq. 1, we implicitly assume that there is no systematic, unexpected shift in longevity (e.g., because of improved medical technology) occurring between time *t* and the time respondents reach their target age *TA*—or at least, that such a shift affects everybody’s survival proportionally. In the latter scenario, the levels of the survival probabilities would change, but not the differentials by observables *X*_{t}. If Eq. 2 is verified, we can obtain estimates of differential survival using the subjective survival probabilities Π_{TA}.
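The framework can be illustrated with a small simulation. The sketch below assumes a linear index with a single scalar covariate and a standard logistic *G*; the coefficient values and function names are hypothetical and chosen only for illustration. It compares the probability a fully informed respondent would report with the share of simulated respondents who actually cross the survival threshold.

```python
import math
import random

def logistic_cdf(u):
    """CDF G of the standard logistic distribution (an assumed choice of G)."""
    return 1.0 / (1.0 + math.exp(-u))

def survival_probability(x, beta, c):
    """P(alive at TA | x) = 1 - G(-(x*beta + c)) for a scalar covariate x."""
    return 1.0 - logistic_cdf(-(x * beta + c))

def simulate_alive_share(x, beta, c, n, rng):
    """Share of n simulated respondents whose index x*beta + c + eps is positive."""
    alive = 0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse-CDF draw
            u = rng.random()
        eps = math.log(u / (1.0 - u))  # logistic shock via inverse CDF
        if x * beta + c + eps > 0:
            alive += 1
    return alive / n

rng = random.Random(0)
BETA, C = 0.4, 0.5  # hypothetical parameter values, not estimates from the article
for x in (0.0, 1.0, 2.0):
    reported = survival_probability(x, BETA, C)
    realized = simulate_alive_share(x, BETA, C, 20000, rng)
    print(x, round(reported, 3), round(realized, 3))
```

In this setup the reported probability and the realized survival share agree up to sampling noise at every value of the covariate, which is the property the approach relies on.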

## Verifying the Viability of Our Approach: Differentials in Survival Based on Subjective Probabilities of Survival vs. Those Based on Actual Survival

The extent to which subjective probabilities of survival generate estimates of differentials in survival by socioeconomic status similar to those produced by actual survival data is an empirical question. Using the rich longitudinal data from the HRS, we proceed in two stages. First, we illustrate at the population level that subjective probabilities of survival to the target age *TA* have predictive power for actual survival to the target age. Second, we apply our theoretical framework of survey response to estimate differential survival by wealth on the basis of subjective probabilities of survival and then compare these estimates to those we produced by using actual survival to target age *TA*.

### HRS Data

We use HRS data from a 14-year period: 1992 to 2006.

#### Vital Status of HRS Respondents

The HRS determines the vital status of respondents in any particular survey wave through tracking. A respondent is considered alive if he or she was interviewed or contacted directly by an interviewer during the wave, was said to be alive by a spouse or partner, or was not reported dead. If no informative contact was made, the respondent’s vital status is classified as unknown.^{6} In addition, the HRS matches respondent records to the National Death Index (NDI) for those who are reported deceased or are of unknown vital status during tracking. This NDI information is available up to 2004.

We construct our vital status variable as follows:

**Vital status in 2004.**

A respondent is considered dead in 2004 if he or she was reported dead by HRS through tracking.^{7} A respondent is considered alive if he or she answered the questionnaire in 2004 or if he or she had never been reported dead according to the HRS and had no match in the NDI.

**Vital status in 2006.**

A respondent is considered dead in 2006 if he or she was dead in 2004 according to our criteria above or if he or she was reported dead to the HRS in 2006. A respondent is considered alive in 2006 if he or she was reported alive to the HRS in 2006.

We apply these definitions to all 30,890 respondents whom the HRS has ever interviewed. By 2004, 21% had died and 4% were of unknown vital status. By 2006, the fraction of respondents who had died had increased to 26%. Vital status is unknown for 10.5% of the respondents in 2006, given that the NDI information is not yet available for that year.
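The classification rules above can be written as two small functions. This is a minimal sketch, not HRS code: the argument names are hypothetical, and the treatment of cases that the stated definitions do not cover (returned as "unknown") is an assumption.

```python
def vital_status_2004(reported_dead_by_hrs, interviewed_2004, ndi_match):
    """Classify 2004 vital status from three hypothetical boolean flags."""
    if reported_dead_by_hrs:
        return "dead"
    # Alive: answered the 2004 questionnaire, or never reported dead and no NDI match.
    if interviewed_2004 or not ndi_match:
        return "alive"
    # Not covered by the definitions in the text; treated as unknown (assumption).
    return "unknown"

def vital_status_2006(status_2004, reported_dead_2006, reported_alive_2006):
    """Classify 2006 vital status, carrying forward deaths recorded by 2004."""
    if status_2004 == "dead" or reported_dead_2006:
        return "dead"
    if reported_alive_2006:
        return "alive"
    # No NDI information for 2006, so remaining cases stay unknown.
    return "unknown"

status_04 = vital_status_2004(False, False, False)
print(status_04, vital_status_2006(status_04, False, False))
```

The second function shows why the share with unknown status rises in 2006: without an NDI match, a respondent who is neither reported dead nor reported alive cannot be classified.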

#### The Analytical Sample

In the first waves of the HRS, respondents were asked their subjective probability of survival to ages 75 and 85. We identify a sample of respondents who, by the 2006 wave of the HRS, had either reached or would have reached, if still alive, the target age they had been queried about at baseline. We focus on the probability of survival to the target age 75 (denoted P75 hereafter) because this results in a younger sample at baseline (age 61–66). This offers two advantages. First, selection through mortality for a population in their early 60s is fairly small. Second, aging is associated with cognitive decline, and low cognition has been found to introduce bias into subjective expectations of survival (Elder 2007).

Our analytical sample is therefore composed of respondents who were asked their survival to age 75, whose vital status at age 75 we know, and who were 9 to 14 years away from age 75 when asked the subjective survival question. A total of 1,234 observations meet these criteria. Of these, 15 had a missing value for P75, which corresponds to a very low item nonresponse rate (1.2%). Our final analytical sample consists of 1,219 respondents, 30% of whom died before age 75. Sixty-five percent were asked their survival expectations in 1992, 34% in 1994, and less than 1% in 1996.^{8}

Note that this subset is not a random sample. While 35% are HRS respondents who were age-eligible in 1992, 65% are older spouses of the age-eligible respondents. Also, 71% of the respondents are male. But the nonrandomness of our sample does not affect the validity of our verification exercise. The goal of this particular exercise is not to produce population-representative estimates of differential survival, but to verify that for a given population, our method yields the same, or closely comparable, results as the approach based on actual survival observed in panel.

### Predictive Power of Subjective Probabilities of Survival for Actual Survival

For our approach to be viable, subjective probabilities of survival must be strong predictors of actual survival at the population level (not necessarily at the individual level, because survival is a stochastic event). By 2006, a sizable fraction of the HRS sample had reached the target age about which they were asked at baseline. This makes it possible to verify the predictive power of the subjective probabilities for actual survival to the queried target age.

Figure 1 presents the fraction of respondents in our sample alive at the target age by the answer to P75. We find a clear gradient in actual survival (based on panel observations) by the reported subjective probability of survival. In earlier work, subjective probabilities in the HRS were found to exhibit bunching at the focal values of 0%, 50%, and 100% (e.g., Hurd and McGarry 1995; Lillard and Willis 2002). However, when relating the subjective probabilities to actual survival, as shown in Fig. 1, the fractions of respondents alive among those who reported 0% and 50% do not appear as outliers; they follow the patterns of neighboring responses. This does not hold true for respondents who answered 100%, who show lower actual survival than those who reported lower subjective probabilities, such as 70%, 80%, and 90%. The reports of 100% nevertheless contain useful information because these respondents still exhibit higher survival than those who provided answers smaller than 50%.^{9} If we consider men and women separately, we find similar patterns by sex.

We also find that the overall distribution of P75 for respondents alive at 75 is located substantially to the right of that for respondents reported dead at 75. The mean P75 is 67.5% (95% confidence interval: [65.7, 69.4]) among respondents alive at 75, compared with 57.3% (95% confidence interval: [54.1, 60.5]) among respondents who have died. The medians are 70% and 50%, respectively. Overall, these results show that the subjective probabilities of survival to age 75 have strong predictive power for actual survival to the target age.
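As a rough check on the size of this gap, the reported means and confidence intervals can be turned into an approximate test statistic. The sketch below assumes that the intervals are symmetric normal-approximation intervals and that the two groups are independent; neither assumption is stated in the text.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Back out a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z)

# Means and intervals as reported in the text (percentage points).
mean_alive, ci_alive = 67.5, (65.7, 69.4)
mean_dead, ci_dead = 57.3, (54.1, 60.5)

se_alive = se_from_ci(*ci_alive)
se_dead = se_from_ci(*ci_dead)

diff = mean_alive - mean_dead
se_diff = math.sqrt(se_alive ** 2 + se_dead ** 2)  # assumes independent groups
z_stat = diff / se_diff

print(round(diff, 1), round(se_diff, 2), round(z_stat, 2))
```

Under these assumptions the difference of about 10 percentage points is several standard errors away from zero, in line with the non-overlapping intervals.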

### Estimates of Differential Survival Based on Actual Survival vs. Those Based on Subjective Probabilities of Survival

The second requirement for our approach to be viable is that subjective probabilities of survival capture differences by measures of socioeconomic status. Here we report our results for one such measure, wealth.^{10} The strongest such test of validity is to compare estimates of differential survival based on actual survival to age 75 with those based on subjective probabilities of survival to age 75 in the *same* sample. To investigate the possibility that our results might be sensitive to the choice of functional form, we conduct our empirical validation in two ways: we present nonparametric estimates of differential survival by wealth, and we generate parametric estimates that account for additional covariates.

We define *wealth* as the sum of financial assets (including IRAs), housing, other real estate, and transportation, minus all household debt. We exclude pension and Social Security wealth. We use wealth from the last calendar year, measured at the same time as P75, the subjective probability of survival to age 75. All of the following analyses are based on the analytical sample described in the earlier “HRS Data” section.

#### Nonparametric Estimates of Differential Survival

We first estimate kernel regressions of actual survival to age 75 and of P75 on *X*, where *X* denotes household wealth. Because couples have much higher levels of wealth than singles, we run the kernel regressions separately for each group. Figure 2 focuses on couples because the vast majority of respondents in our sample live in couple households. Figure 3 presents another nonparametric validation showing the percentage alive at age 75, alongside the average of P75 by wealth terciles. Terciles have the advantage of being insensitive to outliers. We define terciles over all respondents interviewed in the same wave, stratifying by marital status (singles vs. couples) and age category (60–64 and 65–69). Stratifying by marital status also allows us to pool singles and couples in the analysis.
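A kernel regression of this kind can be sketched as follows. The text does not specify the estimator, kernel, or bandwidth, so the Nadaraya-Watson form, the Gaussian kernel, and the bandwidth below are assumptions, and the data are hypothetical toy values rather than HRS observations.

```python
import math

def kernel_regression(x_obs, y_obs, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E[y | x] at each point in x_eval,
    using a Gaussian kernel (kernel and bandwidth are assumed choices)."""
    fitted = []
    for x0 in x_eval:
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in x_obs]
        total = sum(weights)
        fitted.append(sum(w * y for w, y in zip(weights, y_obs)) / total)
    return fitted

# Hypothetical toy data: wealth in $1,000s, alive at 75 (0/1), P75 as a fraction.
wealth = [20, 50, 80, 120, 200, 300, 450, 600]
alive = [0, 1, 0, 1, 1, 0, 1, 1]
p75 = [0.40, 0.50, 0.50, 0.60, 0.65, 0.70, 0.75, 0.80]

grid = [50, 200, 400, 600]
actual_curve = kernel_regression(wealth, alive, grid, bandwidth=150)
subjective_curve = kernel_regression(wealth, p75, grid, bandwidth=150)
for g, a, s in zip(grid, actual_curve, subjective_curve):
    print(g, round(a, 3), round(s, 3))
```

Running the same function on the binary survival outcome and on P75 yields two curves over wealth that can be compared directly, which is the comparison the kernel regression figure makes.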

Figures 2 and 3 show a positive relationship between wealth and actual survival: the higher the level of wealth, the larger the percentage of respondents alive. This finding is in line with previous work. More importantly, in both figures, the lines showing actual survival are parallel to the ones tracking subjective probabilities of survival. This supports our hypothesis that they exhibit closely comparable patterns of differential survival by wealth. Separating out men and women yields very similar patterns (figures not shown). Note, however, that in both figures, the line of subjective survival is below that of actual survival. Respondents seem to provide subjective expectations that accurately reflect the *differential* in survival but that underestimate, on average, the likelihood of survival.^{11}

#### Parametric Estimates of Differential Survival

To estimate the survival function presented in our theoretical framework of survey response, we need to make a parametric assumption about the distribution *G*. We present results under the assumption that *G* is a logistic distribution with mean 0 and variance π^{2}/3. To simplify the notation, we omit the subscript *t* in this section. Let *X* = [*W*, *Z*], where *W* denotes wealth (or another measure of socioeconomic status) and *Z* denotes nonwealth covariates.

Our empirical strategy is to

1. Estimate the effect of *W* on *actual survival*, β_{WT}, using a logit model. The dependent variable is the binary event of whether the respondent is alive at 75, and we estimate the equation

   $$P(A_{75} = 1 \mid W, Z) = G(W\beta_{WT} + Z\beta_{ZT} + C_T), \tag{3}$$

   where, by the symmetry of the logistic distribution, $G(u) = 1 - G(-u) = 1/(1 + e^{-u})$. The scale and location normalization ensures identification of the parameters β_{WT}, β_{ZT}, and *C*_{T}.

2. Estimate the effect of *W* on *elicited subjective survival*, β_{WS}, using an analogous regression:

   $$E(P75/100 \mid W, Z) = G(W\beta_{WS} + Z\beta_{ZS} + C_S). \tag{4}$$

   This estimation is performed over the *same sample* as the one in step (1). To estimate Eq. 4, we use the quasi-likelihood method presented in Papke and Wooldridge (1996), which is based on a Bernoulli log-likelihood function.^{12} Papke and Wooldridge have shown that the quasi-maximum likelihood estimator is consistent, asymptotically normal, and efficient in a class of estimators containing weighted nonlinear least squares estimators.

3. Test the hypothesis that the coefficients associated with the socioeconomic status variables in the model of actual survival are equal to those in the model of subjective survival:

   $$H_0: \beta_{WT} = \beta_{WS}.$$

Table 1 presents the results. In addition to the variables of interest, we include categorical variables for sex and for age at the time of the query about expectations of survival as independent variables.

Table 1

| | Logit on Actual Survival to Age 75: Coefficient | Logit on Actual Survival to Age 75: *p* Value | Quasi Maximum-Likelihood on Subjective Survival to Age 75: Coefficient | Quasi Maximum-Likelihood on Subjective Survival to Age 75: *p* Value |
|---|---|---|---|---|
| **Wealth Tercile** | | | | |
| Lowest (ref.) | – | – | – | – |
| Second | 0.256 | .094 | 0.211 | .019 |
| Highest | 0.466 | .003 | 0.431 | .000 |
| **Age at Baseline** | | | | |
| 61 | −1.041 | .000 | −0.294 | .078 |
| 62 | −0.705 | .000 | 0.096 | .300 |
| 63 (ref.) | – | – | – | – |
| 64 | 0.333 | .156 | 0.068 | .580 |
| 65 | 0.082 | .724 | 0.303 | .015 |
| 66 | −0.147 | .594 | 0.283 | .087 |
| Female | −0.004 | .978 | 0.020 | .811 |
| Constant | 0.830 | .000 | 0.320 | .000 |
| *N* | 1,219 | | 1,219 | |
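The two estimations can be sketched with a single routine, because the logit likelihood and the Bernoulli quasi-likelihood of Papke and Wooldridge (1996) share the same objective; only the outcome differs (binary survival versus a fraction between 0 and 1). The code below is an illustrative sketch using NumPy and Newton-Raphson, not the authors' implementation. The simulated data, the coefficient values in `true_b`, and the Beta noise on reported probabilities are all hypothetical.

```python
import numpy as np

def fit_logit_qmle(X, y, n_iter=50, tol=1e-10):
    """Maximize sum(y*log(p) + (1-y)*log(1-p)) with p = 1/(1+exp(-X@b)).
    y may be binary (logit) or a fraction in [0, 1] (Bernoulli quasi-likelihood)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (y - p)                            # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X     # negative Hessian
        step = np.linalg.solve(hess, grad)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b

rng = np.random.default_rng(0)
n = 20000
tercile = rng.integers(0, 3, size=n)                    # 0 = lowest (reference)
X = np.column_stack([np.ones(n), tercile == 1, tercile == 2]).astype(float)
true_b = np.array([0.3, 0.25, 0.45])                    # hypothetical: constant, second, highest
p_true = 1.0 / (1.0 + np.exp(-X @ true_b))

alive = rng.binomial(1, p_true)                         # actual survival (0/1)
k = 5.0                                                 # dispersion of noisy reports (assumption)
subjective = rng.beta(p_true * k, (1.0 - p_true) * k)   # reports with mean p_true

b_actual = fit_logit_qmle(X, alive)
b_subjective = fit_logit_qmle(X, subjective)
print(np.round(b_actual, 3), np.round(b_subjective, 3))
```

When the reported probabilities are noisy but correct on average, both fits recover similar coefficients on the tercile indicators, which is the pattern Table 1 reports for the HRS sample.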

Here β_{WT} and β_{WS} are vectors of size 2 containing the coefficients on the wealth terciles in the actual survival function and subjective survival function, respectively. Under the null hypothesis, we have

$$(\hat{\beta}_{WT} - \hat{\beta}_{WS})' \, \hat{V}^{-1} \, (\hat{\beta}_{WT} - \hat{\beta}_{WS}) \sim \chi^2(2)$$

asymptotically, where $\hat{V}$ is the estimated variance-covariance matrix of $\hat{\beta}_{WT} - \hat{\beta}_{WS}$.

To test the hypothesis *H*_{0}, we estimate the variance-covariance matrix of the difference between the two coefficient vectors. The resulting value of the test statistic is 0.088 for wealth, so we cannot reject the null hypothesis *H*_{0} at the 5% significance level. This result suggests once again that subjective probabilities of survival provide a suitable alternative for estimating differential survival by wealth.
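The mechanics of such a test can be sketched as follows, assuming that the statistic is a Wald statistic compared with a chi-squared distribution with 2 degrees of freedom (one per wealth tercile coefficient). The covariance matrix in the example is hypothetical, since the text does not report it; only the coefficient differences and the statistic 0.088 come from the article.

```python
import math

def wald_statistic_2d(diff, cov):
    """Quadratic form diff' * inv(cov) * diff for a 2-vector and a 2x2 matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    x, y = diff
    return (x * (inv[0][0] * x + inv[0][1] * y)
            + y * (inv[1][0] * x + inv[1][1] * y))

def chi2_sf_2df(stat):
    """Upper-tail probability of a chi-squared(2) variable, exp(-stat/2)."""
    return math.exp(-stat / 2.0)

# Differences between the Table 1 coefficients (actual minus subjective).
diff = (0.256 - 0.211, 0.466 - 0.431)
hypothetical_cov = ((0.020, 0.010), (0.010, 0.022))   # NOT from the article

w = wald_statistic_2d(diff, hypothetical_cov)
critical_5pct = -2.0 * math.log(0.05)                 # chi-squared(2) 5% critical value
p_reported = chi2_sf_2df(0.088)                       # tail probability for the reported statistic

print(round(w, 3), round(critical_5pct, 3), round(p_reported, 3))
```

Under the chi-squared(2) assumption, a statistic of 0.088 lies far below the 5% critical value, so equality of the wealth coefficients is not rejected.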

Because the equations estimate survival to age 75, conditional on being alive at one’s current age, we expect the coefficients associated with the indicator variables for age to increase in age in Table 1. This is what we find up to age 63, both for the equation based on subjective survival and that based on actual survival. The point estimates appear to show nonmonotonic patterns after age 63, but the coefficients are not statistically different from each other.^{13} It is worth noting when assessing the estimated age patterns that more than 60% of the sample for this comparative exercise is age 62 or 63, leaving only relatively small numbers of observations for the remaining age groups. As a separate exercise to gauge how well subjective probabilities of survival capture age patterns of actual survival for the ages represented in our sample, we compare the age gradient in average subjective probabilities of survival to 75 with the age gradient in actual survival to 75 implied by the 2003 life table. We find them closely comparable. This suggests that people in their early to mid-60s are aware, on average, of the impact of age on mortality in the overall population.

Table 1 presents estimates of differential survival by socioeconomic status pooled by sex, including an indicator variable for “female.” The coefficient associated with being female is not statistically significantly different from zero in any of the regressions.^{14} Some studies report sex differences in differential survival by socioeconomic status in the age group we consider. When using actual survival and interacting wealth with sex, we do not find a statistically significant impact of sex on the relationship between survival and wealth. Moreover, we find that interacting wealth with sex leads to similarly close estimates of differential survival in the regressions based on actual survival and those based on subjective survival.

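Because we assume a logistic distribution for *G*, the coefficients we estimate can be interpreted in terms of odds ratios. We use this relationship when interpreting our findings. For example, the coefficient β_{2} on the second wealth tercile can be interpreted as the log odds ratio for survival of the second wealth tercile to the first wealth tercile. Considering that the lowest wealth tercile is the omitted variable, the log odds ratio is given by

$$\beta_2 = \ln\left(\frac{P_2 / (1 - P_2)}{P_1 / (1 - P_1)}\right),$$

where $P_k$ denotes the probability of survival to age 75 for a respondent in wealth tercile $k$, holding the other covariates fixed.^{15}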
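As a worked illustration of this interpretation, the wealth coefficients in Table 1 can be exponentiated to obtain odds ratios relative to the lowest tercile. The snippet is only arithmetic on the reported coefficients; the dictionary layout and names are illustrative.

```python
import math

# Wealth tercile coefficients from Table 1 (lowest tercile is the reference).
coefficients = {
    "actual survival (logit)": {"second": 0.256, "highest": 0.466},
    "subjective survival (QML)": {"second": 0.211, "highest": 0.431},
}

odds_ratios = {
    model: {tercile: math.exp(beta) for tercile, beta in terciles.items()}
    for model, terciles in coefficients.items()
}

for model, terciles in odds_ratios.items():
    for tercile, ratio in terciles.items():
        print(f"{model}: {tercile} vs. lowest tercile, odds ratio {ratio:.2f}")
```

An odds ratio above 1 means higher odds of surviving to age 75 than in the lowest wealth tercile; the two models give similar ratios for each tercile.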

#### Robustness Checks

Online Resource 1 presents a series of robustness checks of our validation results. We show that our results are robust to other estimation methods and functional form assumptions. In addition, we investigate how to deal with item nonresponse and focal answers at 50%. People who answer “50%” can be thought of as respondents who either truly believe that their chance of survival is about half or who are uncertain about it (e.g., Bruine de Bruin et al. 2000; Hill et al. 2006). One concern about our methodology is whether nonresponse or the tendency to provide focal answers varies systematically by wealth. We use variables that have been shown to correlate strongly with subjective probabilities of survival to impute P75 and then replace the 50% and missing answers with the imputations. The set of covariates for the imputation includes basic demographics, a number of health-related variables, and parental mortality.^{16} Once again, we find that the coefficients on the wealth terciles are very similar in both the logit regression on actual survival at 75 and in the quasi-maximum likelihood estimator on P75, where we replaced any original 50% or missing answers with the imputation.

## Applying Our Approach: Differential Subjective Survival by Wealth in Europe and the United States

In this part of our article, we apply our approach and use subjective probabilities of survival to age 75 to produce comparable estimates of differential survival by wealth across 10 European countries and the United States.^{17} The data come from ELSA and SHARE for the European countries and the HRS for the United States.

### Data Description: SHARE Wave 1, ELSA Wave 2, and HRS 2004

#### Description of the Surveys

In 2004, SHARE administered the same survey instrument—designed to be directly comparable to the HRS—in 11 European countries.^{18} The selection of countries reflects the different regions of Europe: Sweden and Denmark cover Scandinavia; the Netherlands, Belgium, Germany, France, Austria, and Switzerland represent Central Europe; and Italy, Spain, and Greece cover the south of Europe.^{19} The sample in each participating country is representative of the population age 50 and older when weights are applied. Börsch-Supan and Mariuzzo (2005) found that the SHARE data produce very similar distributions of key concepts—such as employment, income, education, and health—as other European surveys.^{20} They concluded that “SHARE represents the population of individuals aged 50 and over in Europe well” (p. 34).

ELSA is the English counterpart of the HRS and SHARE. In 2002, it collected its first wave of data, drawing the sample from households that had participated in the Health Survey for England in 1998, 1999, or 2001. We use the second wave of ELSA, administered in 2004.^{21}

Sample sizes vary by country. In SHARE, the number of participants varied from just over 1,000 in Switzerland to almost 4,000 in Belgium. ELSA surveyed more than 9,000 respondents in England, while about 20,000 respondents participated in HRS 2004 in the United States.

Because SHARE, ELSA, and the HRS are directly comparable, the variables of interest to our study (i.e., subjective probabilities of survival and wealth) were observed in all countries and measured in the same way. In our analysis, we use respondents aged 51 to 65 who were asked about their survival to age 75.^{22}

#### Preliminary Analyses

Before providing parametric estimates of differential survival from SHARE, ELSA, and the HRS, we evaluate whether the subjective probabilities of survival predict actual mortality and whether they vary meaningfully by wealth. Although we lack a sufficiently long panel to assess (as we did for the HRS) whether the subjective probabilities of survival to age 75 in SHARE strongly predict actual survival to that age, the 2006 wave of SHARE permits us to determine whether subjective probabilities predict two-year survival. Winter (2008) compared average subjective probabilities of survival between those respondents who survived between 2004 and 2006 and those who died.^{23} Indeed, the average subjective probability for those who did not survive is substantially lower than for those who survived (40% vs. 63%). The difference is statistically significant at conventional levels. Table 2 shows that this result holds for every country (*p* < .02 for all countries).

Table 3 presents the average of P75 by wealth tercile for individuals aged 51 to 65 in SHARE, ELSA, and the HRS.^{24} Wealth terciles are defined separately within country by marital status (single/couple) and age band (51–58, 59–65) using household weights. Our measure of wealth is total household net worth in euros, adjusted for purchasing power parity. We do not include entitlements to public or private pensions.
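The tercile construction described above can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary keys, and the rule for observations that straddle a cut point (assignment by the midpoint of the observation's weight span) are assumptions, since the article does not specify them.

```python
from collections import defaultdict


def weighted_terciles(values, weights):
    """Assign 1, 2, or 3 so that each tercile holds about a third of total weight."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = sum(weights)
    out = [0] * len(values)
    cum = 0.0
    for i in order:
        # Position of this observation in the weighted cumulative distribution,
        # measured at the midpoint of its own weight (an assumed tie rule).
        mid = (cum + weights[i] / 2.0) / total
        out[i] = 1 if mid <= 1.0 / 3.0 else (2 if mid <= 2.0 / 3.0 else 3)
        cum += weights[i]
    return out


def terciles_within_cells(rows):
    """Compute terciles separately by country, marital status, and age band.

    Each row is a dict with hypothetical keys: 'country', 'couple' (bool),
    'age' (51 to 65), 'wealth' (PPP-adjusted net worth), 'weight' (household weight).
    """
    cells = defaultdict(list)
    for idx, r in enumerate(rows):
        band = "51-58" if r["age"] <= 58 else "59-65"
        cells[(r["country"], r["couple"], band)].append(idx)
    result = [0] * len(rows)
    for idxs in cells.values():
        t = weighted_terciles(
            [rows[i]["wealth"] for i in idxs],
            [rows[i]["weight"] for i in idxs],
        )
        for i, ti in zip(idxs, t):
            result[i] = ti
    return result
```

Because the cut points are computed within each cell, a household in the top tercile is wealthy relative to households of the same country, marital status, and age band, not relative to the pooled sample.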

We find that for most countries, average subjective probabilities of survival are higher for higher wealth terciles. The gradient is particularly marked for the United States, England, Austria, Spain, and Sweden. Our findings are not driven by differences in the composition of age and sex within the wealth terciles: we find similar patterns when looking at the average subjective survival to age 75 separately by age groups and sex (table not shown). There are some differences in the levels of subjective probabilities across countries, similar to those found in life tables. For example, according to 2002 life tables, a 55-year-old Austrian man had a 67.3% chance of surviving to age 75, while a 55-year-old Swedish man had a 72.5% chance.

### Estimates of Differential Subjective Survival by Wealth Across Countries

^{25}All the regressions we present use respondent weights.
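As a rough illustration of the kind of estimator used for the regressions in this section, the sketch below fits a logit model to a fractional outcome by maximizing a weighted Bernoulli quasi-log-likelihood (see note 12) with Newton's method. It is not the authors' code: the function names `solve` and `fit_fractional_logit`, the toy data, and the choice of Newton's method are assumptions made for illustration, and the covariates are reduced to wealth-tercile indicators only.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_fractional_logit(X, p, w=None, max_iter=50, tol=1e-10):
    """Maximize sum_i w_i * [p_i*log(G_i) + (1 - p_i)*log(1 - G_i)],
    where G_i = 1 / (1 + exp(-x_i'beta)) and p_i is a fraction in [0, 1]."""
    n, k = len(X), len(X[0])
    w = [1.0] * n if w is None else w
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # negative Hessian
        for xi, pi, wi in zip(X, p, w):
            g = 1.0 / (1.0 + math.exp(-sum(xj * bj for xj, bj in zip(xi, beta))))
            for j in range(k):
                grad[j] += wi * (pi - g) * xi[j]
                for q in range(k):
                    info[j][q] += wi * g * (1.0 - g) * xi[j] * xi[q]
        step = solve(info, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Toy data: intercept plus indicators for the second and third wealth terciles.
X = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]]
p75 = [0.5, 0.7, 0.7, 0.8, 0.75, 0.85]  # hypothetical answers as fractions
beta = fit_fractional_logit(X, p75)
print([math.exp(b) for b in beta[1:]])  # odds ratios relative to the lowest tercile
```

In this toy case with only tercile indicators, the fitted odds ratios (2 and 8/3) equal the ratios of odds computed from each tercile's mean P75, which is a convenient check on the implementation. With additional covariates such as sex and age bands, that equivalence no longer holds exactly.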

Table 4 presents our estimates and their significance levels for each of the country regressions, as well as the pooled regression for the European countries and the United States.^{26} In all countries, those with more wealth have higher subjective probabilities of survival than those with less wealth, and at least one of the two wealth coefficients (for the second and third terciles relative to the first) is significant. However, the gradient in the probability of survival is less pronounced in some countries than in others. Germany, France, and Belgium have a smaller gradient. In Belgium, for example, the odds ratio for individuals in the second and third wealth terciles is about 1.06. The gradient is largest in Sweden, England, and the United States, where the estimated odds ratios for the highest wealth tercile are between 1.27 and 1.43. When we test for the joint hypothesis of equality of the coefficients associated with the second and third wealth terciles for each pair of countries, the coefficients for the low-gradient countries are statistically significantly different (at 10%) from those for the high-gradient countries.^{27} In the Netherlands, there is a large difference between the first and second terciles, but very little difference between the second and third. We find a larger gradient in the United States than in the European countries as a whole, and the difference is statistically significant at 1% when we test for the joint hypothesis of equality of the coefficients associated with the second and third wealth terciles.
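One common way to carry out a joint test of this kind is a Wald test. The sketch below assumes that the two country samples are independent, so that the covariance matrix of the coefficient differences is the sum of the two country covariance matrices. The article does not state which test statistic was used, and the function name and all numbers here are hypothetical.

```python
import math


def wald_equal_coefficients(b_a, v_a, b_b, v_b):
    """Joint test that two coefficients are equal across two independent samples.

    b_a, b_b: [beta_tercile2, beta_tercile3] for countries A and B.
    v_a, v_b: the corresponding 2x2 covariance matrices.
    Returns (statistic, p_value); under the null the statistic is
    chi-squared with 2 degrees of freedom.
    """
    d = [b_a[0] - b_b[0], b_a[1] - b_b[1]]
    v = [[v_a[i][j] + v_b[i][j] for j in range(2)] for i in range(2)]
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    if det <= 0:
        raise ValueError("covariance matrix of the difference must be positive definite")
    # d' V^{-1} d, written out for the 2x2 case.
    stat = (
        d[0] * d[0] * v[1][1]
        - d[0] * d[1] * (v[0][1] + v[1][0])
        + d[1] * d[1] * v[0][0]
    ) / det
    # The chi-squared survival function with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical coefficients and covariance matrices for two countries.
stat, p = wald_equal_coefficients(
    [0.06, 0.06], [[0.004, 0.002], [0.002, 0.004]],
    [0.20, 0.35], [[0.003, 0.001], [0.001, 0.003]],
)
print(stat, p)
```

A small p-value would indicate that the pair of wealth coefficients differs between the two countries, which is how the low-gradient and high-gradient groups are distinguished in the text.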

#### Robustness Checks

Item nonresponse might be a concern if it were not randomly distributed by wealth tercile and health status. The item nonresponse rate varies considerably by country: it is quite low in Austria, Germany, and England (less than 2%) and highest in France (12%), followed by Spain and Italy (9% to 10%). It also tends to vary by wealth. In the Netherlands, Spain, Denmark, and the United States, for example, it is lower for higher wealth terciles.^{28} Finally, it varies by self-reported health, being higher among people who report being in fair or poor health.

To address the issues that might arise from differential rates of item nonresponse, we impute subjective probabilities for respondents who did not answer the question about subjective probabilities of survival. We use the same variables used for our imputations in the HRS (i.e., basic demographics, health information, and parental mortality), using a separate regression for each country (tables not shown). The resulting alternative estimates of differential survival and their significance levels are very similar to those presented in Table 4, even in countries with high item nonresponse. Only in Spain is the impact of wealth on survival reduced, and then only slightly.

Focal answers could be another potential issue. The tendency to answer 50% to subjective probability questions might bias our results if it were to vary systematically by wealth.^{29} The percentages are quite similar for each wealth tercile in Germany, Italy, and Belgium (between 23% and 26%), but tend to decrease with wealth in the Netherlands, Spain, England, and the United States.^{30} Here, too, we impute subjective probabilities for missing responses and answers of 50%, using basic demographics, health information, and parental mortality. Table 5 presents the resulting estimates of differential survival and their significance levels. Figure 4 shows the exponential of the coefficients associated with the middle and highest wealth terciles, which can be interpreted as the odds ratio of survival compared with survival in the lowest wealth tercile (the reference group), and their 95% confidence intervals.^{31}
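The odds ratios and confidence intervals plotted in Figure 4 can be computed along the following lines, using the first-degree Taylor approximation described in note 31. This is a sketch under stated assumptions: the function name `odds_ratio_ci`, the normal critical value of 1.96, and the example coefficient and standard error are illustrative and not taken from the article's tables.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio exp(beta) with a first-order Taylor (delta-method) interval.

    With f(x) = exp(x), f'(x) = exp(x), so V(exp(beta)) is approximated by
    exp(beta)**2 * V(beta), and the standard error of the odds ratio by
    exp(beta) * se.
    """
    odds_ratio = math.exp(beta)
    se_or = odds_ratio * se
    return odds_ratio, odds_ratio - z * se_or, odds_ratio + z * se_or


# Hypothetical coefficient on the highest wealth tercile and its standard error.
or_hat, lower, upper = odds_ratio_ci(0.30, 0.08)
print(or_hat, lower, upper)
```

An interval built this way is symmetric around the odds ratio. An alternative that is sometimes used instead is to exponentiate the endpoints of the interval for the coefficient itself, which yields an asymmetric interval.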

Overall, the estimates with the imputations for missing responses and 50% answers shown in Fig. 4 present a picture of differential survival relatively similar to that given by the estimates without imputations, although the differences in survival by wealth appear larger in many countries, in particular Austria, Germany, Sweden, France, Denmark, England, Belgium, and the United States. In Spain, the estimated differential is smaller. The gradient is strongest in the United States, Sweden, and England and weakest in Belgium, France, and Italy. Again, the coefficients for the low-gradient countries are statistically significantly different (at 10%) from those for the high-gradient countries when we test the joint hypothesis of equality of the coefficients associated with the second and third wealth terciles for each pair of countries.^{32} We find a larger gradient in the United States than in the European countries as a whole, and the difference is statistically significant at 1%.

Looking at other coefficients, we see in Table 5 that the coefficient associated with being female is positive and statistically significantly different from zero at a conventional level for most of the countries. This highlights women’s higher probability of survival. The coefficients associated with the age categories are, in most regressions, not statistically significantly different from zero. This may reflect the fact that in this age range (50s and early 60s), the actual probability of being alive at age 75 conditional on being alive at the current age is relatively flat, as confirmed by life tables.

Heterogeneity in survival by wealth terciles across European countries could simply reflect heterogeneity in the distribution of wealth. We investigated this possibility and found that countries that share a similar distribution of wealth had different survival differentials.^{33} This finding suggests that the heterogeneity in differential survival by wealth across European countries is more likely driven by heterogeneity in other factors, such as institutional settings, policy, and cultural and social issues, than by dissimilarities in the distribution of wealth.

### Comparison with Existing Studies

Because existing studies either cover a different time period or age group than our study or use different measures of socioeconomic status, and also because our estimates are forward-looking, it is difficult to make direct comparisons between our study and others. Nevertheless, we cite a few results found in existing work to compare with our own. Mackenbach et al. (1997), who looked at morbidity and mortality for several European countries, also found the strong gradient in survival that we find for Sweden and England based on subjective probabilities of survival. But Mackenbach et al.’s results are not easily comparable with ours because they studied mortality by occupational class in men in manual versus nonmanual occupations for a much larger age group. They found that France has the highest mortality gradient by occupation (but a low morbidity gradient).

Mackenbach et al. (2003) investigated the widening of inequalities in mortality in several European countries from the 1980s to the 1990s. For the 1990s, they reported a ranking of differential mortality similar to ours—from the weakest to the strongest, Denmark, England, and Sweden. Their ranking, however, was by occupational classes. Hoffmann (2005) found, as we do, less differential mortality by socioeconomic status in Denmark than in the United States.

## Caveats

By applying our method to the European countries, we implicitly assume that the validation exercise we performed in the U.S. context would be replicated if done in the European context. The fact that the subjective survival probabilities to age 75 in the SHARE data predict two-year survival gives us confidence that the SHARE data are similarly informative to the HRS data. However, we cannot yet undertake a complete investigation of whether the relationship between socioeconomic status and subjective survival is similar to that between socioeconomic status and actual survival for all the countries we study. That will require panel data that are comparable across countries, of similar length to the HRS (i.e., a significant fraction of the sample will have reached the target age asked about in the subjective survival questions), and matched with national death records or the equivalent.

Other caveats are inherent to the use of survey data in general. First, unit nonresponse varies across countries in SHARE (Börsch-Supan and Jürges 2005). Weighting reduces this effect somewhat. But to the extent that unit nonresponse varies within age and sex groups in a nonrandom manner—and, in particular, in a way that is correlated with wealth—this could affect the estimates of differential survival, both *actual* (if sufficient follow-up data were available) and *subjective*. We call attention again to our decision to exclude Switzerland from our analysis because of a particularly low unit response rate (36%). We also excluded Greece because of anomalies in the data. Second, some population groups (such as the very wealthy, the very poor, or those in nursing homes) may be underrepresented in the sample compared with the general population, potentially leading to biased estimates of mortality differentials.

## Conclusions

Obtaining reliable estimates of differential survival by socioeconomic status from panel data is an exercise with rigorous data demands. In many countries, the necessary data do not exist. Data that would further permit estimates comparable across countries are even harder to come by because measures of wealth or income usually come from different surveys with different designs.

As an alternative, we investigated a much less data-intensive approach that relies on cross-sectional observations of subjective probabilities of survival. We were able to draw the data for a selection of European countries from the same survey, SHARE, and for two other countries from comparable survey instruments: ELSA in England and the HRS in the United States.

To verify the viability of our approach, we first estimated differential survival by wealth from panel data from the HRS. We then compared these estimates with those generated from subjective probabilities of survival collected at the panel’s baseline in the same sample. The subjective probabilities of survival produced estimates of differential survival very close to those generated using actual survival data.

We next used this approach to conduct a cross-country comparison, estimating differential survival by wealth on the basis of subjective probabilities of survival to age 75 collected in 10 European countries and the United States in 2004. The United States, Sweden, and England showed large differences in subjective probabilities of survival as a function of wealth, while Belgium, Italy, and France showed smaller differences.

Overall, in the absence of comparable longitudinal data across countries, our approach appears to offer a useful alternative for conducting international comparisons of differential survival by various measures of socioeconomic status. Future research will show whether our validation exercise using the HRS data would hold if conducted with SHARE and ELSA data. But this will need to wait until these younger surveys have a sufficiently long panel.

In the meantime, our results provide a starting point for investigating the role of institutional settings, policy, and social factors in explaining health inequalities. Further, our method demonstrates the potential of subjective probabilities of survival to enable researchers to study the effects of mortality on behavior, or to construct alternative life tables, without waiting for a significant portion of an age cohort to have died.

## Acknowledgments

We are grateful for financial support from the National Institute on Aging (NIA) through a pilot grant of the RAND Center for the Study of Aging (P30 AG012815) and from the NIA via Grant P01AG08291. Delavande is grateful for additional funding from a Nova Forum research grant. We are thankful to Iliyan Georgiev, Michael Hurd, Chuck Manski, Younghwan Song, two anonymous referees, and seminar participants at the Workshop on Comparative International Research Based on HRS, ELSA and SHARE; the University of Lausanne; and Tilburg University for helpful comments and discussions. This article uses data from the Health and Retirement Study (HRS), the English Longitudinal Study of Ageing (ELSA), and the Survey of Health, Ageing and Retirement in Europe (SHARE). The collection of these data sets is supported by NIA, the U.S. Social Security Administration, the European Commission through the 5th and 6th framework program, and several European governments.

## Notes

^{1}

In addition to quantifying the relationship between mortality and economic variables, most of these studies are concerned with causality between health and socioeconomic status, which is beyond the scope of this article. To conduct those analyses, authors have in some cases combined cross-sectional data on birth-cohort mortality with pooled time-series data on income, education, and poverty (e.g., Deaton and Paxson 2001, 2004).

^{2}

Examples are Desplanques (1991) and Bommier et al. (2006) for France, Nelissen (1999) and references therein for the Netherlands, Hoffmann (2005) for Denmark, and Martikainen (1995) and Valkonen et al. (2000) for Finland.

^{3}

See Manski (2004) for an overview and discussion of the state of knowledge about subjective expectations data.

^{4}

See Juster and Suzman (1995) for an overview of early waves, and the HRS website at http://hrsonline.isr.umich.edu or St. Clair et al. (2008) for information about later waves.

^{5}

Subscripts denoting individuals have been omitted for ease of presentation.

^{6}

Source: Data description and usage of the HRS Tracker file for 2006 (Final, Version 2.0).

^{7}

The vast majority of cases determined to be dead by HRS also have a match in the NDI. The small fraction without a match in the NDI is most likely due to respondents leaving the country or other forms of loss to follow-up.

^{8}

The vast majority of observations that come from 1994 pertain to respondents who were asked both in 1992 and 1994 about their subjective probability of survival to age 75. We use their 1994 response because at that time, the distance to age 75 (i.e., 11 or 12 years) matches more closely that of the rest of the sample.

^{9}

A similar graph of 14-year mortality by P75 for the population-representative sample of age-eligible HRS respondents (age 51–61) interviewed in 1992 shows qualitatively the same pattern, indicating that the patterns in our analytical sample are not driven by our sample selection.

^{10}

Results for the same type of analysis by income and by education are provided in Online Resource 1.

^{11}

These findings are consistent with those reported by Elder (2007), who investigated the predictive power of subjective probabilities of survival for actual mortality and documented biases. Biases in the level of survival do not affect the viability of our proposed method as long as the differentials across socioeconomic groups are preserved.

^{12}

Our estimate is based on the following Bernoulli log-likelihood function: .

^{13}

When testing the joint hypothesis that the coefficients on age 64, 65, and 66 are equal in both regressions, we do not reject this hypothesis at the 5% level.

^{14}

In our analytical sample, 69.9% of the men and 69.2% of the women survived to age 75.

^{15}

Other studies of differential mortality often report results in terms of mortality hazard ratios. In our framework, denotes the probability of dying before the target age *TA* conditional on being alive at the current age and on having characteristics (*Z*, *W*=2). The ratio of the probability of dying before the target age *TA* of the second wealth tercile to the first wealth tercile is given by

^{16}

The health variables include information on drinking alcohol, smoking, the number of chronic conditions, self-rated health, number of limitations with activities of daily living (ADLs), and body mass index.

^{17}

Results for differential survival by income and by education are included in Online Resource 1.

^{18}

Small adjustments were made to accommodate institutional differences.

^{19}

See http://www.share-project.org/ for more details on the sampling and features of SHARE. In this study, we use release 2.0.1 of the first wave of SHARE.

^{20}

These surveys include the European Union Labor Force Survey, 2004; the European Community Household Panel, 2000; and the European Social Survey, 2002.

^{21}

See http://www.ifs.org.uk/elsa/documentation.php for more details on the sampling and features of ELSA.

^{22}

We choose age 51 as the lower age bound because the HRS 2004 data are representative of the population aged 51 and older.

^{23}

Among the Wave 1 respondents, 73.3% were alive at Wave 2, 2.4% were dead, and the remainder were of unknown status.

^{24}

The averages are weighted. For the HRS, SHARE, and ELSA, we use respondent-level weights (Börsch-Supan and Jürges 2005).

^{25}

The sample is relatively small in some countries, so we use narrow age bands rather than age dummy variables.

^{26}

The pooled regression includes indicator variables for the different countries to allow for differences resulting from, for example, sampling or different survey agencies. The weights in the pooled regressions reflect the countries’ population (i.e., the sum of the weights in a country is equal to its population aged 51–65).

^{27}

For example, we reject the joint null hypothesis that the coefficient of the second wealth tercile for Belgium is equal to the coefficient of the second wealth tercile for England and that the coefficient of the third wealth tercile for Belgium is equal to the coefficient of the third wealth tercile for England.

^{28}

Those differences are statistically significant at 5% for Germany, Spain, the United States, and England.

^{29}

As in the HRS, there is no statistical difference in the proportion of 100% answers by wealth tercile. The rate of zeros is higher in the lower wealth tercile, which is consistent with the fact that this group experiences higher mortality.

^{30}

The difference is statistically significant at the 5% level for the United States, England, the Netherlands, and Spain.

^{31}

The predicted probabilities that we impute are based on regressions including respondents who answered 50%. Excluding them does not change our results much. To obtain the variance of the exponential of the coefficients necessary to compute the 95% confidence interval, we use a first-degree Taylor approximation—that is, *V*(*f*(*x*))=(*f’*(*x*))^{2}*V*(*x*).

^{32}

Two exceptions are England and France and England and Italy, for which we get *p* values of .139 and .175, respectively. Note, however, that for these two pairs of countries, we reject equality of the coefficient associated with the third wealth tercile at 10%.

^{33}

See page 6 and Table S4 of Online Resource 1 for detailed results.

## References

*Managing the risk of life* (Michigan Retirement Research Center Working Paper No. 2007–167). Ann Arbor: Michigan Retirement Research Center, University of Michigan.

*Subjective survival probabilities in the Health and Retirement Study: Systematic biases and predictive validity* (Michigan Retirement Research Center Working Paper No. 2007–159). Ann Arbor: Michigan Retirement Research Center, University of Michigan.

*Estimating Knightian uncertainty from survival probability questions on the HRS* (Working paper). Ann Arbor: Department of Economics, University of Michigan.

*Does the socioeconomic mortality gradient interact with age? Evidence from US survey data and Danish register data* (MPIDR Working Paper WP 2005–020). Rostock, Germany: Max Planck Institute for Demographic Research.

*Subjective probabilities of survival: An international comparison* (Unpublished manuscript). Santa Monica, CA: RAND.

*Cognition and wealth: The importance of probabilistic thinking* (Michigan Retirement Research Center Working Paper UM00-04). Ann Arbor: Michigan Retirement Research Center, University of Michigan.

*International Journal of Epidemiology, 32,* 830–837.

*RAND HRS Data Documentation, Version H*. Santa Monica, CA: RAND Labor and Population and RAND Center for the Study of Aging.