## Abstract

Multistate modeling is a commonly used method to compute healthy life expectancy. However, there is currently no analytical method to decompose the components of differentials in summary measures calculated from multistate models. In this research note, we propose a derivative-based method to decompose the differentials in population-based health expectancies estimated via a multistate model into two main components: the proportion resulting from differences in initial health structure and the proportion resulting from differences in health transitions. We illustrate the method using data on activities of daily living from the U.S. Health and Retirement Study to decompose the sex differential in disability-free life expectancy (HLE) among older Americans. Our results suggest that the sex gap in HLE results primarily from differences in transition rates between disability states rather than from the initial health distribution of female and male populations. The methods introduced here will enable researchers, including those working in fields other than health, to decompose the relative contribution of initial population structure and transition probabilities to differences in state-specific life expectancies from multistate models.

## Introduction

Decomposition methods comprise a suite of both index-specific and generalized approaches to decompose differences between summary indices into contributions resulting from their underlying parameters. For the case of healthy (or disability-free) life expectancy decomposition, the approaches include both applications of generalized decomposition (Andreev et al. 2002) and index-specific proposals (Gu et al. 2009; Nusselder and Looman 2004; Sauerberg and Canudas-Romo 2022; Shkolnikov and Andreev 2017). These approaches have been applied to and developed for the case of healthy life expectancy calculated with a life table and health prevalence by age (Sullivan 1971), and each estimates the contribution of differences in health prevalence and mortality to the differences in health expectancy.

A potential drawback to a Sullivan-based approach is that this method uses only aggregated mortality information from all health states and the health prevalence resulting from the transitions between states. Hence, the effects of health state transitions in shaping healthy life expectancy cannot be captured in the decomposition. When mortality rates differ by health states, differences in mortality between populations are likely to be driven by the underlying transitions between health states. The multistate life table can better illuminate the dynamics underlying healthy life expectancy using information on transitions between health states and mortality patterns specific to each health state. Despite the utility of the multistate model, there has been limited development of methods for decomposing multistate health expectancy.

This research note proposes a method to decompose health expectancies computed from population-based multistate life tables using a mathematical derivative-based decomposition approach analogous to the generalized approach of Caswell (1989). Differences in health expectancies are decomposed into the portion driven by the initial (*radix*) distribution of population health characteristics and the portion resulting from differences in transition rates between health states. To demonstrate the method, we decompose the sex differential in the disability-free and disabled life expectancy of females and males in the United States into the part due to differences in transition probabilities between health states and the portion resulting from differences in the initial distribution of health states. This decomposition yields new insights into the components underlying differences in multistate life expectancies.

## Methods

In the single-decrement life table, we usually start with the estimation of the age-specific probabilities of surviving, $p_x$, from the observed age-specific death rates. With the $p_x$, the survival function, $l_x$, is derived recursively as $l_x = l_{x-1}p_{x-1} = l_\alpha p_\alpha p_{\alpha+1} \cdots p_{x-1} = l_\alpha \prod_{k=\alpha}^{x-1} p_k$, where $l_\alpha$ is the survival at *radix* age, $\alpha$. The life table number of person-years lived between age $x$ and $x+1$, $L_x$, can then be approximated as $L_x = \frac{l_x + l_{x+1}}{2}$, assuming deaths are uniformly distributed in the age interval. Life expectancies defined in bounded age limits, that is, temporary life expectancy (Arriaga 1984), can be calculated as ${}_{\beta-\alpha}e_{\alpha} = \frac{\sum_{x=\alpha}^{\beta-1} L_x}{l_\alpha}$, where $\alpha$ and $\beta$ are the lower and upper limits of the range of ages of interest to study. Any life expectancy at age $\alpha$ can also be calculated by studying the mortality starting at that age, that is, a life table starting at age $\alpha$ with radix $l_\alpha = 1$.
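The recursion above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' R code: the function name `temporary_life_expectancy` and the three survival probabilities are hypothetical, and the sketch assumes one-year intervals that start at the radix age and end at the upper age limit.

```python
def temporary_life_expectancy(p, radix=1.0):
    """p[k] is the survival probability over the k-th one-year interval
    starting at the radix age. Returns (l, L, e)."""
    # l_x = l_{x-1} * p_{x-1}, built recursively from the radix
    l = [radix]
    for px in p:
        l.append(l[-1] * px)
    # L_x = (l_x + l_{x+1}) / 2 under uniformly distributed deaths
    L = [(l[k] + l[k + 1]) / 2 for k in range(len(p))]
    # Temporary life expectancy: person-years lived divided by the radix
    e = sum(L) / l[0]
    return l, L, e


# Hypothetical survival probabilities for three one-year intervals
l, L, e = temporary_life_expectancy([0.9, 0.8, 0.5])
print(l, L, e)
```

Because the result is divided by the radix, any positive `radix` gives the same expectancy.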

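Multistate life tables inherit similar functions and equations from the single-decrement life tables and they “are best dealt with by adopting matrix notation” (Schoen 1988:63). For a multistate life table with $n$ states, $1, 2, \ldots, n$, the matrix of transition probabilities at a given age $x$ is an $n$-by-$n$ square matrix,

$$\mathbf{P}_x = \begin{pmatrix} p_x^{11} & p_x^{12} & \cdots & p_x^{1n} \\ p_x^{21} & p_x^{22} & \cdots & p_x^{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_x^{n1} & p_x^{n2} & \cdots & p_x^{nn} \end{pmatrix},$$

where $p_x^{ij}$ represents the transition probability from state $i$ to $j$ between ages $x$ and $x+1$. As in a single-decrement life table starting from age $\alpha$, the survival matrix $\mathbf{l}_x$ can be calculated based on the transition probabilities as

$$\mathbf{l}_x = \mathbf{l}_{x-1}\mathbf{P}_{x-1} = \mathbf{l}_\alpha \prod_{k=\alpha}^{x-1} \mathbf{P}_k, \tag{1}$$

where the product operator $\prod_{k=\alpha}^{x-1} \mathbf{P}_k$ invokes matrix products, taken in increasing order of age. This survivorship function at age $x$ is also a square matrix, represented as

$$\mathbf{l}_x = \begin{pmatrix} l_x^{11} & l_x^{12} & \cdots & l_x^{1n} \\ l_x^{21} & l_x^{22} & \cdots & l_x^{2n} \\ \vdots & \vdots & \ddots & \vdots \\ l_x^{n1} & l_x^{n2} & \cdots & l_x^{nn} \end{pmatrix},$$

where $l_x^{ij}$ is the proportion of people with initial state $i$ at radix age $\alpha$ and in state $j$ at age $x$ (Willekens and Rogers 1978). This definition can also be found in Eq. (1), as the rightmost term, the product of the $\mathbf{P}_k$, can be seen as one $n$-by-$n$ matrix multiplying the radix survival matrix at age $\alpha$. The radix of the survival matrix is a special case as $i$ and $j$ are the same,

$$\mathbf{l}_\alpha = \begin{pmatrix} l_\alpha^{11} & 0 & \cdots & 0 \\ 0 & l_\alpha^{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & l_\alpha^{nn} \end{pmatrix},$$

where $l_\alpha^{ii}$ are the initial proportions of people at state $i$ at the radix age $\alpha$ of the multistate life table. The diagonal in this matrix adds up to 100% of the initial population, so $\sum_{i=1}^{n} l_\alpha^{ii} = 1$ when computing the population-based multistate life expectancy (Crimmins et al. 1994).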
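As a rough sketch of Eq. (1), the survival matrices can be built by repeated matrix multiplication from a diagonal radix. The three-state transition probabilities and the 80/20 initial split below are hypothetical values chosen for illustration, and the helper names are not taken from the authors' repository.

```python
def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def survival_matrices(l_alpha, P):
    """Return [l_alpha, l_{alpha+1}, ...] using l_x = l_{x-1} P_{x-1}."""
    l = [l_alpha]
    for Px in P:
        l.append(matmul(l[-1], Px))
    return l


# Hypothetical example: healthy (H), unhealthy (U), dead (D)
P = [
    [[0.90, 0.08, 0.02], [0.20, 0.70, 0.10], [0.0, 0.0, 1.0]],  # age alpha
    [[0.85, 0.10, 0.05], [0.15, 0.70, 0.15], [0.0, 0.0, 1.0]],  # age alpha + 1
]
# Diagonal radix: 80% start healthy, 20% unhealthy, nobody starts dead
l_alpha = [[0.8, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]]

l = survival_matrices(l_alpha, P)
for l_x in l:
    print(l_x)
```

Row $i$ of each matrix tracks people by initial state, so with an absorbing state included every row keeps the total it had at the radix age.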

As in the single-decrement life expectancy, the population-based multistate life expectancy, ${}_{\beta-\alpha}\mathbf{e}_\alpha$, is calculated in terms of the survival matrix as

$${}_{\beta-\alpha}\mathbf{e}_\alpha = \frac{\mathbf{l}_\alpha}{2} + \sum_{x=\alpha+1}^{\beta-1} \mathbf{l}_x + \frac{\mathbf{l}_\beta}{2}, \tag{2}$$

where $\mathbf{l}_\beta$ is the survival matrix at the last age and the $\mathbf{l}_x$ are those between ages $\alpha$ and $\beta$. The expectancy in the multistate life table from age $\alpha$ to $\beta$ can be represented as

$${}_{\beta-\alpha}\mathbf{e}_\alpha = \begin{pmatrix} {}_{\beta-\alpha}e_\alpha^{11} & \cdots & {}_{\beta-\alpha}e_\alpha^{1n} \\ \vdots & \ddots & \vdots \\ {}_{\beta-\alpha}e_\alpha^{n1} & \cdots & {}_{\beta-\alpha}e_\alpha^{nn} \end{pmatrix},$$

where ${}_{\beta-\alpha}e_\alpha^{ij}$ corresponds to the expected contribution to population-based life expectancy in state $j$ from age $\alpha$ to $\beta$ for individuals in initial state $i$ at exact age $\alpha$, weighted by the initial population structure. The sum of the elements in each column is the expectancy in each state by persons alive at exact age $\alpha$, irrespective of their initial state (Schoen 1988).
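A minimal sketch of Eq. (2) and of the column sums just described follows. The inputs are hypothetical (two one-year intervals, three states including death) and the function name `population_expectancy` is an illustrative choice, not the authors' implementation.

```python
def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def population_expectancy(l_alpha, P):
    """e = l_alpha / 2 + sum of interior l_x + l_beta / 2, as in Eq. (2)."""
    n = len(l_alpha)
    l_x = l_alpha
    e = [[v / 2 for v in row] for row in l_alpha]
    for step, Px in enumerate(P):
        l_x = matmul(l_x, Px)                    # l_{x+1} = l_x P_x
        w = 0.5 if step == len(P) - 1 else 1.0   # the last matrix, l_beta, is halved
        e = [[e[i][j] + w * l_x[i][j] for j in range(n)] for i in range(n)]
    return e


# Hypothetical inputs: healthy (H), unhealthy (U), dead (D)
P = [
    [[0.90, 0.08, 0.02], [0.20, 0.70, 0.10], [0.0, 0.0, 1.0]],
    [[0.85, 0.10, 0.05], [0.15, 0.70, 0.15], [0.0, 0.0, 1.0]],
]
l_alpha = [[0.8, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]]

e = population_expectancy(l_alpha, P)
# Column sums: expected years in each state, irrespective of initial state
state_totals = [sum(row[j] for row in e) for j in range(3)]
print(state_totals)
```

In this sketch the three column totals add up to the length of the age range, because the third column collects the life-years lost to death.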

Following the procedure in Vaupel and Canudas-Romo (2003), we let a dot on top of a function denote the derivative either with respect to time or, in our case, with respect to a between-population comparison (as done in Canudas-Romo and Guillot 2015). This comparison between two populations in the multistate temporary life expectancy can be calculated as the derivative for each of the additive components in Eq. (2),

$${}_{\beta-\alpha}\dot{\mathbf{e}}_\alpha = \frac{\dot{\mathbf{l}}_\alpha}{2} + \sum_{x=\alpha+1}^{\beta-1} \dot{\mathbf{l}}_x + \frac{\dot{\mathbf{l}}_\beta}{2}. \tag{3}$$

The theory of matrix calculus is detailed in Magnus and Neudecker (2019) and the calculation procedure can be found in the online appendix. Similarly, the derivative in the survival matrix in Eq. (1) for values of $x \geq \alpha$ is

$$\dot{\mathbf{l}}_x = \dot{\mathbf{l}}_\alpha \prod_{k=\alpha}^{x-1} \mathbf{P}_k + \sum_{h=\alpha}^{x-1} \mathbf{l}_h \dot{\mathbf{P}}_h \prod_{k=h+1}^{x-1} \mathbf{P}_k, \tag{4}$$

where $\prod_{k=h+1}^{x-1} \mathbf{P}_k$ is an identity matrix when $h = x-1$.
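As a small worked case of Eq. (4), written here under the convention that bold symbols denote matrices, take $x = \alpha + 2$. Applying the product rule to $\mathbf{l}_{\alpha+2} = \mathbf{l}_\alpha \mathbf{P}_\alpha \mathbf{P}_{\alpha+1}$ and using $\mathbf{l}_{\alpha+1} = \mathbf{l}_\alpha \mathbf{P}_\alpha$ gives one term per factor:

$$\dot{\mathbf{l}}_{\alpha+2} = \dot{\mathbf{l}}_\alpha \mathbf{P}_\alpha \mathbf{P}_{\alpha+1} + \mathbf{l}_\alpha \dot{\mathbf{P}}_\alpha \mathbf{P}_{\alpha+1} + \mathbf{l}_{\alpha+1} \dot{\mathbf{P}}_{\alpha+1}.$$

The first term carries the difference in initial structure forward through both transition matrices; the other two place the difference in one transition matrix between the survivors who reach that age and the transitions that follow. The order of the factors is kept because matrix products do not commute.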

Substituting Eq. (4) into Eq. (3) and arranging terms, we obtain

$${}_{\beta-\alpha}\dot{\mathbf{e}}_\alpha = \dot{\mathbf{l}}_\alpha \cdot {}_{\beta-\alpha}\boldsymbol{\epsilon}_\alpha + \sum_{x=\alpha}^{\beta-1} \mathbf{l}_x \dot{\mathbf{P}}_x \left( \frac{\mathbf{I}}{2} + {}_{\beta-x-1}\boldsymbol{\epsilon}_{x+1} \right), \tag{5}$$

where $\mathbf{I}$ is the identity matrix and ${}_{\beta-x}\boldsymbol{\epsilon}_x$ is the status-based life expectancy between age $x$ and $\beta$. The latter quantity is similar to the population-based life expectancy, ${}_{\beta-\alpha}\mathbf{e}_\alpha$, in Eq. (2), but without the product of the initial population structure, $\mathbf{l}_\alpha$, calculated as ${}_{\beta-x}\boldsymbol{\epsilon}_x = \frac{\mathbf{I}}{2} + \sum_{h=x}^{\beta-2} \left( \prod_{k=x}^{h} \mathbf{P}_k \right) + \frac{\prod_{k=x}^{\beta-1} \mathbf{P}_k}{2}$. Thus, the population-based multistate life expectancy in Eq. (2) can be alternatively computed as the product of the initial population structure and status-based life expectancy, ${}_{\beta-\alpha}\mathbf{e}_\alpha = \mathbf{l}_\alpha \cdot {}_{\beta-\alpha}\boldsymbol{\epsilon}_\alpha$. For the last value when $x = \beta - 1$, the term ${}_{0}\boldsymbol{\epsilon}_\beta$ is defined as a zero matrix. Details of the calculation procedures here are included in the online appendix.

Equation (5) can be interpreted as having two terms: (1) the effect from comparisons between the initial population structures, denoted by $\dot{\mathbf{l}}_\alpha \cdot {}_{\beta-\alpha}\boldsymbol{\epsilon}_\alpha$, and (2) the effect from the comparison between the transition matrices across ages, or the second term in Eq. (5). The product within the summation operator of the second term in Eq. (5), $\mathbf{l}_x \dot{\mathbf{P}}_x \left( \frac{\mathbf{I}}{2} + {}_{\beta-x-1}\boldsymbol{\epsilon}_{x+1} \right)$, is the age-specific effect from differences in the transition matrix at age $x$. The sum of each column of the matrices in these two components is the contribution to the state-specific life expectancy comparison.
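The two terms of Eq. (5) can be checked numerically with a short sketch. Several choices here are assumptions rather than the authors' procedure: the dotted quantities are taken as simple differences between two hypothetical groups, the level terms are evaluated at the midpoint of the two groups, and the status-based expectancies are obtained by a backward recursion that is algebraically equivalent to the closed form given in the text.

```python
def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def combine(a, b, wa=1.0, wb=1.0):
    """Element-wise wa * a + wb * b."""
    return [[wa * x + wb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def half_identity(n):
    return [[0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]


def status_expectancies(P):
    """Status-based expectancies eps[x], x = 0..len(P); the last one is zero."""
    n = len(P[0])
    eps = [None] * len(P) + [[[0.0] * n for _ in range(n)]]
    for x in range(len(P) - 1, -1, -1):
        # eps_x = I/2 + P_x (I/2 + eps_{x+1})
        inner = combine(half_identity(n), eps[x + 1])
        eps[x] = combine(half_identity(n), matmul(P[x], inner))
    return eps


def decompose(l_a, P_a, l_b, P_b):
    """Initial-structure and transition terms of Eq. (5) for group b minus a.
    Levels are taken at the midpoint of the two groups (an assumption)."""
    n = len(l_a)
    l_x = combine(l_a, l_b, 0.5, 0.5)
    P_mid = [combine(pa, pb, 0.5, 0.5) for pa, pb in zip(P_a, P_b)]
    eps = status_expectancies(P_mid)
    initial = matmul(combine(l_b, l_a, 1.0, -1.0), eps[0])
    transition = [[0.0] * n for _ in range(n)]
    for x, (pa, pb) in enumerate(zip(P_a, P_b)):
        P_dot = combine(pb, pa, 1.0, -1.0)
        weight = combine(half_identity(n), eps[x + 1])
        transition = combine(transition, matmul(matmul(l_x, P_dot), weight))
        l_x = matmul(l_x, P_mid[x])   # move the midpoint survivors to age x + 1
    return initial, transition


# Two hypothetical groups: healthy (H), unhealthy (U), dead (D)
P_a = [[[0.90, 0.08, 0.02], [0.20, 0.70, 0.10], [0.0, 0.0, 1.0]],
       [[0.85, 0.10, 0.05], [0.15, 0.70, 0.15], [0.0, 0.0, 1.0]]]
P_b = [[[0.91, 0.08, 0.01], [0.20, 0.71, 0.09], [0.0, 0.0, 1.0]],
       [[0.86, 0.10, 0.04], [0.15, 0.71, 0.14], [0.0, 0.0, 1.0]]]
l_a = [[0.80, 0.0, 0.0], [0.0, 0.20, 0.0], [0.0, 0.0, 0.0]]
l_b = [[0.79, 0.0, 0.0], [0.0, 0.21, 0.0], [0.0, 0.0, 0.0]]

initial, transition = decompose(l_a, P_a, l_b, P_b)
e_a = matmul(l_a, status_expectancies(P_a)[0])
e_b = matmul(l_b, status_expectancies(P_b)[0])
gap = combine(e_b, e_a, 1.0, -1.0)
residual = max(abs(gap[i][j] - initial[i][j] - transition[i][j])
               for i in range(3) for j in range(3))
print(residual)
```

With finite differences the two terms reproduce the observed gap only approximately; the residual shrinks as the two groups become more similar, which is why the example uses small differences.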

The difference in health expectancies resulting from differences in the transition matrices experienced by two groups can be interpreted as the combined effect of all the transition probabilities in the matrices. These overall health expectancy differences due to the transition matrices can be further disentangled into the individual effects of each transition probability, via matrix multiplication in the second term of Eq. (5). The effect from differences in any age-specific transition probability from $g$ to $h$ (denoted as ${}_{gh}\lambda_x^{ij}$) on the differential in life expectancy in state $j$ between ages $\alpha$ and $\beta$ for individuals in initial state $i$ (denoted as ${}_{\beta-\alpha}\dot{e}_\alpha^{ij}$) can be calculated as

$${}_{gh}\lambda_x^{ij} = l_x^{ig} \, \dot{p}_x^{gh} \left( \frac{\delta_{hj}}{2} + {}_{\beta-x-1}\epsilon_{x+1}^{hj} \right), \tag{6}$$

where $l_x^{ij}$, $\dot{p}_x^{ij}$, and ${}_{\beta-x}\epsilon_x^{ij}$ are elements within matrix $\mathbf{l}_x$, $\dot{\mathbf{P}}_x$, and ${}_{\beta-x}\boldsymbol{\epsilon}_x$, respectively, of the second term in Eq. (5), and $\delta_{hj}$, the $(h, j)$ element of $\mathbf{I}$, equals 1 when $h = j$ and 0 otherwise. $\dot{\mathbf{P}}_x$ and ${}_{\beta-x}\boldsymbol{\epsilon}_x$ are, respectively, analogous to the matrices $\mathbf{P}_x$ and ${}_{\beta-\alpha}\mathbf{e}_\alpha$ presented earlier with the elements inside replaced by $\dot{p}_x^{ij}$ and ${}_{\beta-x}\epsilon_x^{ij}$. Equation (6) can also be seen as the matrix operations of $\mathbf{l}_x \dot{\mathbf{P}}_x \left( \frac{\mathbf{I}}{2} + {}_{\beta-x-1}\boldsymbol{\epsilon}_{x+1} \right)$ in Eq. (5). The online appendix provides detailed calculation procedures for disentangling these effects. We can obtain the overall effect from differences in any transition probability, from $g$ to $h$, on the differential in life expectancy in state $j$ by summing over all the initial states, $i$, and ages, $x$, from Eq. (6). This can be written as

$${}_{gh}\lambda^{\cdot j} = \sum_{x=\alpha}^{\beta-1} \sum_{i=1}^{n} {}_{gh}\lambda_x^{ij}, \tag{7}$$

with the “$\cdot$” before $j$ representing the aggregation by $i$. Equation (7) reveals that every probability inside the transition matrix has effects on the differential in Eq. (5) of every state-specific life expectancy. The sum of these transition-specific effects across every transition probability, $\sum_{g=1}^{n} \sum_{h=1}^{n} {}_{gh}\lambda^{\cdot j}$, is equal to the contribution to the state-specific life expectancy differences from the transition matrix (i.e., the second term in Eq. (5)). All calculations were conducted using R software (R Core Team 2023), and the generalizable R code is included in the repository https://github.com/tyaSHEN/HLEdecom.
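The element-wise reading of Eqs. (6) and (7) can be sketched as a triple loop over the origin state $g$, the destination state $h$, and the state $j$ whose expectancy is affected. All inputs below are hypothetical numbers for a two-state block and two ages, and the function is an illustration written in Python, not the code from the authors' repository.

```python
def transition_specific(l, P_dot, eps_next):
    """lam[g][h][j]: effect of the g-to-h transition on the expectancy in
    state j, summed over ages x and initial states i.
    l[x], P_dot[x], eps_next[x] hold l_x, the difference in P_x, and the
    status-based expectancy starting at age x + 1."""
    n = len(l[0])
    lam = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l_x, pd_x, e_x in zip(l, P_dot, eps_next):
        for g in range(n):
            # Aggregate over initial states i: everyone in state g at age x
            mass_g = sum(l_x[i][g] for i in range(n))
            for h in range(n):
                for j in range(n):
                    weight = (0.5 if h == j else 0.0) + e_x[h][j]
                    lam[g][h][j] += mass_g * pd_x[g][h] * weight
    return lam


# Hypothetical two-state (H, U) inputs for two ages
l = [[[0.80, 0.00], [0.00, 0.20]],
     [[0.72, 0.06], [0.04, 0.14]]]
P_dot = [[[0.01, -0.005], [0.00, 0.01]],
         [[0.02, 0.00], [-0.01, 0.01]]]
eps_next = [[[1.3, 0.2], [0.4, 0.9]],
            [[0.0, 0.0], [0.0, 0.0]]]   # zero matrix after the last age

lam = transition_specific(l, P_dot, eps_next)
# Summing over all transitions recovers the transition-matrix effect per state
total_by_state = [sum(lam[g][h][j] for g in range(2) for h in range(2))
                  for j in range(2)]
print(total_by_state)
```

Keeping the three indices separate makes it possible to report each transition on its own, for example `lam[1][1][0]` for the effect of remaining unhealthy on the expectancy in the healthy state.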

## Illustration

We illustrate this decomposition method by looking at the sex differential in disability-free life expectancy (HLE) and disabled life expectancy (ULE) from ages 55 to 105 in the United States. This temporary life expectancy is, in this illustration, referred to and interpreted as the remaining life expectancy at age 55 assuming that no one survives beyond 105 years old. This sex differential is decomposed into the contribution resulting from the initial population health structure and the contribution from differences in transition probabilities between health states.

### Data and Estimation Procedures

Our disability data are from the U.S. Health and Retirement Study ([HRS] 2021), a biennial national longitudinal survey (Sonnega et al. 2014). Since 1992, the HRS has surveyed a longitudinal panel of approximately 20,000 individuals aged 51 or above, collecting information on sociodemographic characteristics, health status, wealth, income, and pensions. Mortality follow-up is conducted by linking the HRS sample to the U.S. National Death Index. HRS data include sample weights and are designed to be representative of the U.S. population. We estimate the remaining HLE by sex by aggregating data from the 2008 to 2018 waves of data collection. Disability is conceptualized as difficulty in doing any of the five activities of daily living (ADLs): bathing, dressing, eating, transferring in/out of bed, and walking across a room. Individuals are classified as “disability-free” if they report no difficulty with ADL or “disabled” if they report difficulty with one or more ADL items, as in Payne (2022). The baseline of the initial population and health structure represents the average characteristics of respondents aged between 51 and 60 years old, with only the first record for each individual used in calculating this baseline health state distribution. The respondents from the 10-year age group are pooled together to create a synthetic cohort centered on age 55 to increase the sample size and reduce uncertainty.

Similar to many studies estimating healthy life expectancy from a health survey, the transition probabilities between health states by age and sex are estimated using a multinomial logistic regression model to produce smoothed estimates (e.g., Cai et al. 2010; Huang et al. 2021; Shen and Payne 2023). We include age squared and the interaction between age and sex in the regression to account for potential nonlinearity in the effect of age. Our analyses include the weight from HRS (combined respondent weight and nursing home resident weight) and an attrition weight calculated from inverse probability weighting of sociodemographic characteristics, such as sex, education, race, and ethnicity. The point estimates presented are from the dataset collected in HRS. We resample this original dataset 500 times by bootstrapping to estimate the variance in both baseline characteristics and transition probabilities. The central 95% of the results based on these 500 bootstrapping resamples are taken as the confidence intervals. The state-space for our three-state example is illustrated in Figure 1. The two transient (or nonabsorbing) states are disability-free (or healthy, “H”) and disabled (or unhealthy, “U”), while dead (“D”) is the absorbing state in our model.^{1}
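The percentile interval described above can be sketched as follows. This is a simplified illustration only: it resamples a hypothetical list of 0/1 disability indicators and takes their mean, whereas the analysis in the text re-estimates the weighted regression model and the full decomposition on each resample. The function names and the order-statistic rule for picking the cut-offs are assumptions.

```python
import random


def percentile_interval(estimates, level=0.95):
    """Central `level` share of the bootstrap estimates."""
    s = sorted(estimates)
    k = int((1.0 - level) / 2.0 * len(s))   # number of values trimmed per tail
    return s[k], s[len(s) - 1 - k]


def bootstrap(data, statistic, n_resamples=500, seed=1):
    """Recompute `statistic` on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_resamples)]


def mean(values):
    return sum(values) / len(values)


# Hypothetical sample: 1 = reports an ADL difficulty, 0 = does not
sample = [1] * 40 + [0] * 160
estimates = bootstrap(sample, mean)
lower, upper = percentile_interval(estimates)
print(mean(sample), lower, upper)
```

With 500 resamples this rule drops the 12 smallest and the 12 largest estimates and reports the range of the rest.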

## Results

The initial population health structure at age 55 by sex is provided in panel A of Table 1. The proportions of individuals who are initially disability-free are quite similar among females and males. Panel B of Table 1 presents the status-based HLE and ULE at age 55, where the expectancies are not weighted by the initial population structure from panel A. Each row represents expected years spent in different health states separately by initial disability status. For example, the cell in the top left corner (24.49) represents the expected years spent disability-free for a woman who is disability-free at age 55, and the cell to its right (5.49) represents the expected time spent with disability for a woman who is disability-free at age 55. Therefore, the remaining life expectancy of a woman who is disability-free at age 55 is 29.98 years. In contrast, a woman who is disabled at age 55 has a remaining life expectancy of just under 28 years.

Panel C of Table 1 presents the contribution of individuals from each initial health state to total population-based health expectancies, ${}_{50}\mathbf{e}_{55}$, combining information from panels A and B. The sex differentials in panel C are what we set out to decompose in this paper. The rows represent the expected contribution to the population-based health expectancy by initial health state, weighted by initial disability status. The row total is the total life expectancy (TLE) by initial health state, weighted by initial disability status. The main function of these margin totals is to show the additive relationship of the weighted average TLE (e.g., 29.75 for females) and the relationship between the different steps of the subsequent decomposition. The more important information of panel C is in the margin of each column, which is the average time an individual can expect to spend in each state. These “Total” figures can be understood as the weighted average of the corresponding column in panel B, with the initial state proportions of panel A used as the weights. Females on average spend 23.88 years disability-free and 5.87 years with disability, which adds up to the TLE of 29.75 years. For males, these figures are 22.47 and 4.20 years, respectively, for a total of 26.67 years of TLE at age 55. Examination of the confidence intervals (CIs) in the parentheses below the point estimates shows that females have significantly higher HLE and ULE than males. The contributors to these sex gaps are then explored in the decomposition (see Table 2).
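In symbols, and writing $\epsilon$ for the status-based entries of panel B, the weighted average behind each “Total” in panel C can be sketched for the disability-free column as

$${}_{50}e_{55}^{\cdot H} = l_{55}^{HH} \cdot {}_{50}\epsilon_{55}^{HH} + l_{55}^{UU} \cdot {}_{50}\epsilon_{55}^{UH},$$

with the analogous expression for the disabled column. The “$\cdot H$” superscript is used here only as shorthand for the column total. The totals quoted in the text then add up as

$$23.88 + 5.87 = 29.75 \ \text{(females)}, \qquad 22.47 + 4.20 = 26.67 \ \text{(males)}.$$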

Table 2 has four panels. Panel A is the sex gap in expectancies calculated from Table 1, panel C. Panels B and C in Table 2 represent the components contributing to the sex gap from Eq. (5), and panel D further decomposes the contributions from the transition matrix based on Eq. (7). The row totals represent the sex gap in TLE by initial health state, weighted by initial disability status, with a total sex gap of 3.08 years in TLE. These row totals also show the additive relationship of the decomposition. For example, the row totals in panel A (2.48 and 0.61 years for the healthy and unhealthy, respectively) are the addition of the row totals of panels B and C. However, the primary focus should be on the “Total” (the third) row. As shown in panel A, females live 1.42 more disability-free years and 1.66 more disabled years than males, which combine to produce a sex gap of 3.08 years of TLE. All of these figures are significantly above zero (fully positive 95% CIs). The values in panels B and C add up to the corresponding values in panel A. Panel B shows that the initial health structure contributes to small sex gaps in HLE (−0.04) and ULE (0.02), though these gaps are not statistically significant. The values in the first row are all negative, revealing that there is a slightly higher proportion of males who are initially disability-free at age 55 as compared to females. The decomposition, however, allows us to directly identify how this difference in initial health state composition contributes to the overall sex differential in health expectancies.

Panel C of Table 2 presents the contribution of the difference in transition probability matrices to the sex gap in health expectancies at ages 55 and above. This represents the combined effect of all probabilities in the matrix on each expectancy across ages. The cells in panel C represent the total effect due to differences in transition probabilities, net of the initial health state composition. Thus, from panel C, we conclude that the combined difference in health state transitions between males and females at ages above 55 contributes 1.46 years to the gap in HLE, 1.64 years to the gap in ULE, and hence 3.10 years to the gap in TLE.
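Using only the point estimates quoted in the text, the additive structure of Table 2 can be checked by hand. Panel B (initial structure) plus panel C (transitions) should return panel A (the observed gap) for each state:

$$\underbrace{-0.04}_{\text{panel B}} + \underbrace{1.46}_{\text{panel C}} = \underbrace{1.42}_{\text{panel A, HLE}}, \qquad \underbrace{0.02}_{\text{panel B}} + \underbrace{1.64}_{\text{panel C}} = \underbrace{1.66}_{\text{panel A, ULE}}.$$

Within panel C, the two state-specific effects sum to the effect on TLE, $1.46 + 1.64 = 3.10$, which exceeds the observed TLE gap of 3.08 years by roughly the small negative net contribution of the initial structure. Published figures are rounded to two decimals, so sums of this kind may differ in the last digit.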

Figure 2 presents the age-specific sex gaps in transition probabilities, showing that the transition matrices act to widen the sex gap in both health states. The area under the curve corresponds to the column sum of panel C of Table 2 or the second term in Eq. (5). The CIs are usually wider at earlier ages because the weight, ${}_{\omega-x}\boldsymbol{\epsilon}_x$ in Eq. (5), is much higher at earlier ages and hence the bootstrapped variance of the transition matrices is magnified by this weight. Sex differentials in transition probabilities among the earlier ages contribute more to the gap in HLE, while transitions among ages above 75 predominantly contribute to the gap in ULE. These sex differentials in the transition matrices are the main reason for females' advantage in HLE at younger ages and in ULE at older ages.

The transition-specific decomposition of Table 2, panel C, is shown in panel D of that table, based on Eq. (7). This decomposition can facilitate understanding of which transition probability inside the transition matrix contributes the most to the sex gap. The difference in transition probabilities between nonabsorbing states has effects on the sex gap in both HLE and ULE. The sum of each row in panel D is equal to the corresponding column total in panel C. We find that all transitions except recovery make positive contributions to the sex gap in HLE and ULE, although these contributions are not universally statistically significant. The contribution from the probability of remaining healthy (${}_{HH}\lambda_x^{\cdot H}$) has a high point estimate (0.62) but its CI is also wide, spanning −0.11 to 1.29.

Figure 3 presents the age-specific contribution of each transition probability to differences in HLE (panel a) and ULE (panel b), providing a visualization of the subcomponents underlying Figure 2. The sum of all probabilities at each age in panel a corresponds to the age-specific values of the HLE (solid) line in Figure 2. Similarly, the sum of age-specific values in panel b corresponds to the ULE (dashed) line in Figure 2. The wider CIs at younger ages result from their higher weight, as discussed earlier.

Combining the results from Table 2, panel D, and Figure 3 demonstrates that transition probabilities tend to have greater impacts on time spent in destination states than in origin states. In other words, differences in the probability of staying healthy have a larger effect on HLE than on ULE. On the contrary, differences in disability onset have a larger effect on ULE than on HLE, though the sex differentials in the probability of remaining unhealthy (${}_{UU}\lambda_x^{\cdot H}$) are also the biggest driver of the sex gap in HLE. Females' higher probability of staying unhealthy (i.e., alive) compared with males, with little difference in recovery, implies lower mortality from being unhealthy and their advantage in total LE. By looking at the effect from each transition probability on differences in HLE and ULE, one can deduce differences in total LE (i.e., mortality), which is the sum of these two. The large effect of the probability of staying alive but unhealthy after age 55 leads to females' higher proportion of expected years living with disability (19.7%) compared with males' (15.7%), as shown in Table 1, panel C.
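The two proportions follow directly from the panel C totals quoted earlier, as the ratio of disabled to total life expectancy:

$$\frac{\text{ULE}}{\text{TLE}} = \frac{5.87}{29.75} \approx 0.197 \ \text{(females)}, \qquad \frac{4.20}{26.67} \approx 0.157 \ \text{(males)}.$$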

In Figure 3, we find that the effects of the probability of remaining unhealthy on HLE concentrate in the younger ages and gradually decline after age 70. In contrast, its effects on ULE grow at older ages, which explains the peak of the dashed line in Figure 2. Females' higher probability of staying healthy and of disablement also make a significant and stable contribution to the sex gap in ULE across ages. The probability of recovery is the only transition probability that compresses the sex gap, but it is not significantly below zero across age.

## Discussion and Conclusion

This research note presents a method to decompose differences in population-based multistate life expectancies that is fundamentally different from decomposition methods based on Sullivan-based health expectancies. While Sullivan-based methods decompose differences in health expectancy into portions resulting from mortality differences and health prevalence differences (Oksuzyan et al. 2010), a multistate decomposition offers greater insights into how differences in transition probabilities between states can result in differences in health expectancies and overall life expectancy.

Our multistate decomposition approach includes information on each transition, such as the probability of recovering from disability and remaining disabled, which is unavailable from the Sullivan-based decomposition. Although the Sullivan-based decomposition is valuable in cases where only cross-sectional data are available, our method presents the first comprehensive approach to decomposing multistate life table quantities. In our application of this method to the sex gap in health expectancies in the United States, we find that remaining disabled, rather than recovering, is the largest contributor to the sex differential in both disability-free and disabled life expectancy. The initial population health structure is a relatively small component in our example, partly because the health structure at age 55 between males and females is very similar. Additionally, the remaining life, from 55 to 105, is long enough to attenuate this gap with the transition probabilities at each age (*cf.* Lièvre et al. 2003: figure 2). However, the initial health structure could have big effects when considering remaining life expectancy at older ages or shorter temporary life expectancy (e.g., six years in Payne 2022). More importantly, our method not only decomposes component effects from initial conditions and transitions but also effects from each transition between health states. For example, we can identify that, despite the large effect from differences in the transition matrix in our illustration, recovery from disability accounts for only a very small impact on the sex gap. Our results are based on the most commonly used matrix algebra approach of multistate health expectancy calculations, but future studies could explore other parameterizations (e.g., conditional probability used in Moretti et al. 2023) and the sensitivity of the results to them.

The data used in the illustration are from a health survey and hence produce large CIs in the age-specific results. Provided that one has access to population-level register data or vital statistics, the method can be applied to the observed age-specific transitions, which may reduce uncertainty. The method is also generalizable to other state-spaces without absorbing states, or to abridged life tables. Moreover, it is suitable for analyzing other expectancies based on multistate life tables, such as employment and marriage expectancies. Our decomposition for comparisons between populations can also be applied to changes over time (i.e., derivatives with respect to time). This decomposition could be of great use for understanding the underlying components contributing to the compression or expansion of morbidity.

In summary, this method enables researchers to explore the contribution of the initial population structure and the transition probabilities to the differential in expectancies from multistate models, allowing for unrivaled insight into the factors underlying differences in population-level health expectancies.

## Acknowledgments

The authors thank the three anonymous reviewers and Wen Su for their valuable comments. The views expressed herein are the authors’ and all errors are our own. This work was supported by the Australian National University Futures Scheme at the School of Demography, Australian National University. T.R. was supported by funding from Caixa Foundation Social Observatory SR22-00502. C.F.P. also acknowledges support from an Australian Research Council Discovery Early Career Researcher Award (DE210100087).

## Note

This three-state example should result in three columns of expectancies in ${}_{\beta-\alpha}\mathbf{e}_\alpha$ (or a $3 \times 3$ matrix). However, the third row and column of the resulting matrix do not add any useful information to our results: all values in the third row are zero (as no individuals are initially dead in our analysis), and the third column represents the life-years lost, which can be calculated as the complement of the sum of the two state-specific life expectancies (Andersen et al. 2013). Therefore, the results are shown in a simplified $2 \times 2$ structure even though we use a three-state example. Of note, our decomposition method is also applicable to multistate life tables without an absorbing state, in which case a $3 \times 3$ matrix could be useful.