## Abstract

When independence is assumed, forecasts of mortality for subpopulations are almost always divergent in the long term. We propose a method for coherent forecasting of mortality rates for two or more subpopulations, based on functional principal components models of simple and interpretable functions of rates. The product-ratio functional forecasting method models and forecasts the geometric mean of subpopulation rates and the ratio of subpopulation rates to product rates. Coherence is imposed by constraining the forecast ratio function through stationary time series models. The method is applied to sex-specific data for Sweden and state-specific data for Australia. Based on out-of-sample forecasts, the coherent forecasts are at least as accurate in overall terms as comparable independent forecasts, and forecast accuracy is homogenized across subpopulations.

## Introduction

In recent years, we have seen considerable development in the modeling and forecasting of mortality. This includes the pioneering work by Lee and Carter (1992) and a variety of extensions and modifications of the single-factor Lee-Carter model (e.g., Booth et al. 2002; Cairns et al. 2006, 2011; Currie et al. 2004; Hyndman and Ullah 2007; Lee and Miller 2001; Renshaw and Haberman 2003). Booth and Tickle (2008) provided a comprehensive review. Most of this work has focused on forecasting mortality for a single population. However, modeling mortality for two or more populations simultaneously is often important—for example, when forecasting mortality by sex, by states within a country, or by country within a larger region (such as Europe). In such cases, it is usually desirable that the forecasts do not diverge over time.

Nondivergent forecasts for subpopulations within a larger population have been labeled “coherent” (Li and Lee 2005). Coherent forecasting seeks to ensure that the forecasts for related populations maintain certain structural relationships based on extensive historic observation and theoretical considerations. For example, male mortality has been observed to be consistently higher than female mortality at all ages, and although available evidence supports both biological/genetic and social/cultural/environmental/behavioral hypotheses as to why, Kalben (2002) concluded that the determining factor is biological. Thus, sex differences in mortality can be expected to persist in the future and to remain within observed limits. Mortality forecasts for regions within the same country can also be expected not to diverge radically. Differences attributable to environmental and biological factors can be expected to remain unchanged, whereas differences attributable to social, political, behavioral, and cultural factors are unlikely to produce persistent divergence in modern democracies where principles of equality apply. Similar arguments may apply to countries within a common economic and political framework, such as the European Union.

The problem of obtaining coherent forecasts has previously been considered by Lee and Nault (1993), Lee (2000), and Li and Lee (2005) in the context of the Lee-Carter model. Lee (2006) presented a general framework for forecasting life expectancy as the sum of a common trend and the population-specific rate of convergence toward that trend.

A related problem occurs in the case of disaggregated forecasts that involve several additive absorbing states: for example, in forecasting mortality by cause of death. Wilmoth (1995) quantified the way in which the multiple-decrement forecast based on proportional changes in mortality rates by cause of death tends to be more pessimistic than the single-decrement forecast of total rates. Using death densities and compositional methods, Oeppen (2008) found that the multiple-decrement compositional forecast is not always more pessimistic than the single-decrement case.

Coherence has not previously been defined more precisely than the nondivergence of forecasts for subpopulations. In this article, we adopt a more precise definition: we label mortality forecasts as coherent when the forecast age-specific ratios of death rates for any two subpopulations converge to a set of appropriate constants. Exactly what this implies for the forecast trajectories is detailed in the discussion section.

In the second section of this article, we propose a new method for coherent mortality forecasting that involves forecasting interpretable product and ratio functions of rates using the functional data paradigm introduced in Hyndman and Ullah (2007). The product-ratio functional forecasting method can be applied to two or more subpopulations; incorporates convenient calculation of prediction intervals as well as point forecasts; and is suitable for use within a larger stochastic population modeling framework (e.g., Hyndman and Booth 2008). The new method is simple to apply, flexible in its dynamics, and produces forecasts that are at least as accurate in overall terms as the comparable independent method.

In the third section of this article, we illustrate the product-ratio method with functional time series models by applying it to Swedish male and female mortality data and Australian state-specific mortality data. We compare our mortality forecasts with those obtained from independent functional time series models, and evaluate relative accuracy through out-of-sample forecasts. The final section contains a discussion of the method and findings.

## A Coherent Functional Method

We initially frame the problem in terms of forecasting male and female age-specific death rates because the two-sex application is the most common and the best understood. However, our method can be applied generally to any number of subpopulations.

Let *m*_{t,F}(*x*) denote the female death rate for age *x* and year *t*, *t* = 1, . . . *n*. We will model the log death rate, *y*_{t,f}(*x*) = log[*m*_{t,F}(*x*)]. Similar notation applies for males.

### Functional Data Models

*f*

_{t,F}(

*x*) that we are observing with error. Thus, where

*x*

_{i}is the center of age group

*i*(

*i*= 1, . . .

*p*), ε

_{t,F,i}is an independent and identically distributed standard normal random variable, and σ

_{t,F}(

*x*

_{i}) allows the amount of noise to vary with age

*x*. Analogous notation is used for males. For smoothing, we use weighted penalized regression splines (Wood 1994) constrained so that each curve is monotonically increasing above age

*x*= 65 (see Hyndman and Ullah 2007). The weights are to take care of the heterogeneity in death rates across ages. The observational variance σ

_{t,F}(

*x*) is estimated by using a separate penalized regression spline of {

*y*

_{t}(

*x*

_{i}) − log[

*f*

_{t,F}(

*x*

_{i})]}

^{2}against

*x*

_{i}, for each

*t*.

### Product-Ratio Method for Males and Females

^{1}(If substantial differences exist in the variances of subpopulations, the product and ratio will no longer be uncorrelated on the log scale. This does not bias our forecasts, but it makes them less efficient than they would otherwise be. In any case, our method will still be much better than treating the populations independently because we will still be imposing coherence constraints.)

*p*

_{t}(

*x*) and

*r*

_{t}(

*x*): where the functions {ϕ

_{k}(

*x*)} and {ψ

_{l}(

*x*)} are the principal components obtained from decomposing {

*p*

_{t}(

*x*)} and {

*r*

_{t}(

*x*)}, respectively, and β

_{t,k}and γ

_{t,l}are the corresponding principal component scores. The function μ

_{p}(

*x*) is the mean of the set of curves {

*p*

_{t}(

*x*)}, and μ

_{r}(

*x*) is the mean of {

*r*

_{t}(

*x*)}. The error terms, given by

*e*

_{t}(

*x*) and

*w*

_{t}(

*x*), have zero mean and are serially uncorrelated.

The models are estimated using the weighted principal components algorithm of Hyndman and Shang (2009), which places more weight on recent data and so avoids the problem of the functions {ϕ_{k}(*x*)} and {ψ_{l}(*x*)} changing over time (Lee and Miller 2001:545).

The coefficients, {β_{t,1}, . . . , β_{t,K}} and {γ_{t,1}, . . . , γ_{t,L}}, are forecast using time series models as detailed in the upcoming section on forecasts and prediction intervals. To ensure that the forecasts are coherent, we require the coefficients {γ_{t,l}} to be stationary processes. The forecast coefficients are then multiplied by the basis functions, resulting in forecasts of the curves *p*_{t}(*x*) and *r*_{t}(*x*) for future *t*. If *p*_{n+h|n}(*x*) and *r*_{n+h|n}(*x*) are *h*-step forecasts of the product and ratio functions, respectively, then forecasts of the sex-specific death rates are obtained using *f*_{n+h|n,M}(*x*) = *p*_{n+h|n}(*x*)*r*_{n+h|n}(*x*) and *f*_{n+h|n,M}(*x*) = *p*_{n+h|n}(*x*)/*r*_{n+h|n}(*x*).

### Product-Ratio Method for More Than Two Subpopulations

*J*> 2 subpopulations. Let and where

*j*= 1, . . . ,

*J*, and

*p*

_{t}(

*x*) is the geometric mean of the smoothed rates and so represents the joint (nonstationary) behavior of all subpopulations. The previous equations for males and females are a special case of these equations when

*J*= 2.

We can assume approximately zero correlation between the product functions *p*_{t}(*x*) and each set of ratio functions *r*_{t,j}(*x*). The ratio functions will satisfy the constraint *r*_{t,1}(*x*)*r*_{t,2}(*x*)⋯*r*_{t,J}(*x*) = 1.

Our models do not include a jump-off bias adjustment (Lee and Miller 2001) because it is unnecessary when several principal components are used. In fact, adjusting for jump-off bias may increase the forecast bias (Booth et al. 2006).

*p*

_{t}(

*x*) and each

*r*

_{t,j}(

*x*) for

*j*= 1, 2, . . . ,

*J*: It is necessary to fit only

*J*– 1 models for the ratio functions because of the constraint: recall that we need only one ratio model for the two-sex case when

*J*= 2. However, fitting all

*J*models will simplify the variance calculation for producing prediction intervals.

_{j}(

*x*) = μ

_{p}(

*x*) + μ

_{r,j}(

*x*) is the group mean, and

*z*

_{t,j}(

*x*) =

*e*

_{t}(

*x*) +

*w*

_{t,j}(

*x*) is the error term. Note that Eq. (6) is equivalent to Eq. (15) in Hyndman and Ullah (2007). Here, we are providing an easy and interpretable approach to estimation for this model.

### Forecasts and Prediction Intervals

Forecasts are obtained by forecasting each coefficient β_{t,1}, . . . , β_{t,K} and γ_{t,1}, . . . , γ_{t,L} independently. There is no need to consider vector models because the β_{t,k} coefficients are all uncorrelated by construction (see Hyndman and Ullah 2007), as are the γ_{t,l} coefficients. They are also approximately uncorrelated with each other because of the use of products and ratios.

The coefficients of the product model, {β_{t,1}, . . . , β_{t,K}}, are forecast using possibly nonstationary autoregressive integrated moving average (ARIMA) models (Shumway and Stoffer 2006) without restriction. When fitting ARIMA models, we use the automatic model selection algorithm given by Hyndman and Khandakar (2008) to select the appropriate model orders.

The coefficients of the ratio model, {γ_{t,1,j}, γ_{t,2,j}, . . . , γ_{t,L,j}}, *j* = 1,2, . . . , *J*, are each forecast using any stationary autoregressive moving average (ARMA)(*p*,*q*) (Box et al. 2008) or autoregressive fractionally integrated moving-average (ARFIMA)(*p*,*d*,*q*) process (Granger and Joyeux 1980; Hosking 1981). The stationarity requirement ensures that the forecasts are coherent. We have found ARFIMA models useful for forecasting the ratio function coefficients because they provide for longer-memory behavior than is possible with ARMA models. In implementing ARFIMA(*p*,*d*,*q*) models, we estimate the fractional differencing parameter *d* using the method of Haslett and Raftery (1989), we use the algorithm of Hyndman and Khandakar (2008) to choose the orders *p* and *q*, and then we use maximum-likelihood estimation (Haslett and Raftery 1989) and the forecast equations of Peiris and Perera (1988). In the stationary ARFIMA models, –0.5 < *d* < .05 for *d* = 0; and the ARFIMA(*p*,*d*,*q*) model is equivalent to an ARMA(*p*,*q*) model.

*h*-step-ahead forecast of β

_{n+h,k}, and let denote the h-step-ahead forecast of γ

_{n+h,l,j}. Then the

*h*-step-ahead forecast of log

*m*

_{n+h,j}(

*x*) is given by Because all terms are uncorrelated, we can simply add the variances together so that where

*I*

_{n}denotes all observed data up to time

*n*plus the basis functions {ϕ

_{k}(

*x*)} and {ψ

_{l}(

*x*)};

*u*

_{n+h|n,k}= var (β

_{n+h,k}|β

_{1,k}, …, β

_{n,k}) and

*v*

_{n+h|n,k}= var (γ

_{n+h,k}|γ

_{1,k}, …, γ

_{n,K}) can be obtained from the time series models; (the variances of the smoothed means) can be obtained from the smoothing method used;

*s*

_{e}(

*x*) is estimated by averaging for each

*x*; and

*s*

_{w,j}(

*x*) is estimated by averaging for each

*x*. The observational variance is very stable from year to year, and so it is estimated by taking the mean observational variance in the historical data. A prediction interval is then easily constructed under the assumption that the errors are normally distributed.

Life expectancy forecasts are obtained from the forecast age-specific death rates using standard life table methods (Preston et al. 2001). To obtain prediction intervals for life expectancies, we simulate a large number of future death rates, as described in Hyndman and Booth (2008), and obtain the life expectancy for each. Then the prediction intervals are constructed from percentiles of these simulated life expectancies.

## Empirical Studies

In this section, we illustrate the product-ratio functional forecasting method using two applications: male and female data for Sweden, and state data for Australia. With each application, we compare the forecast accuracy of our method with the accuracy of independent functional forecasts for each subpopulation, based on out-of-sample forecasts. This does not, of course, assess the accuracy of our forecasting applications, but it does provide two comparisons of the accuracy of the coherent and independent approaches. We fit each model to the first *t* observations and then forecast the death rates for the observations in years *t* + 1, *t* + 2, . . . , *n*. The value of *t* is allowed to vary from *n*_{1} = 20 to *n* – 1. Correspondingly, the forecast horizon varies from *n* – 20 to 1.

The number of components, *K* and *L*, in our models is fixed at *K* = *L* = 6. Hyndman and Booth (2008) found that the forecasts are insensitive to the number of components provided that there are enough components. That is, having too many components does not degrade forecast accuracy, but having too few components does. Greenshtein and Ritov (2004) and Greenshtein (2006) termed this phenomenon “persistence in functional linear predictor selection.” We have found that having more than six components makes almost no difference to the resulting forecasts.

*j*th group and forecast horizon

*h*, we define the mean square forecast error (RMSFE) to be

### Application to Swedish Male and Female Mortality

Age-sex-specific data for Sweden are available for the years 1751–2007 from the Human Mortality Database (2010). The data consist of central death rates and midyear populations by sex and single years of age to 110 years. We group the data at ages 100 and older to avoid problems associated with erratic rates at these ages. We restrict the data to 1950–2007, but this restriction has minimal effect because the weighted fitting procedure gives greater weight to recent data.

Figure 1 shows log death rates for males and females for selected ages as univariate time series. The bottom panels represent the same data as a series of smoothed curves (functional observations). We use rainbow plots (Hyndman and Shang 2010) with the time ordering of curves indicated by the colors of the rainbow, from red to violet. (See the online version of the article for the figures in color.) For females, mortality has declined steadily at most ages. For males, however, the decline stagnated at most adult ages in the 1960s and early 1970s. In recent years, male and female mortality at ages 20 and 30 are no longer declining, and small numbers of deaths result in relatively large fluctuations at age 10.

Figure 2 shows the product function (on the log scale) and the ratio function of observed rates. The product function is the geometric mean of smoothed male and female rates; although this mean generally declines over time, recent increases are observed at ages 15 to 30 and at ages 1 to 5. The ratio function is the square root of the sex ratio of smoothed rates. Sex ratios of less than 1 occur when small numbers of deaths result in large random variation: at very young ages in recent years but at very old ages in earlier years. Since the 1980s, the ratio function has declined at ages 40 to 80, and more recently for older ages. In recent years, large fluctuations occur at ages younger than 10 and at ages 20 to 30.

Figure 3 shows the basis functions and corresponding coefficients with 30-year forecasts, produced by applying functional time series models to the product and ratio functions. In the product model, the first principal component shows a declining trend in mortality, which is fastest at childhood ages and slowest at about age 20 and at very old ages. This term alone accounts for 97.0 % of total variation. In the ratio model, the first principal component shows, if anything, a downward trend; but after 1990, fluctuations are substantial. This term accounts for 57.9 % of total variation; the second and third terms account for 15.8 % and 8.5 % of total variation, respectively. No cohort effects or other age patterns were observed in the residuals (i.e., one-step forecast errors) for the models.

Figure 4 shows 30-year forecasts for sex ratios of male and female death rates using the coherent functional model and using independent functional time series models for separate male and female rates (following the approach of Hyndman and Booth (2008)). The coherent model produces sex ratios that are greater than 1 at all ages and remain within the limits of the observations on which they are based. In contrast, the independent models produce forecast sex ratios that are less than 1 at childhood ages and exceed the limits of observation at approximately ages 25 to 30 and 75 and older.

Forecasts of 30-year life expectancy at birth by sex are shown in Fig. 5, along with the sex difference in forecast life expectancy at selected ages. The forecasts based on independent models (dashed lines) are diverging, a feature that goes against recent trends, and leads to sex gaps in life expectancy at older ages that exceed observation. In contrast, the sex differences embodied in the forecasts based on the coherent method continue to decrease, and all tend toward positive differences (see the upcoming subsection on coherence in the Discussion). Thus, the coherent method produces sex differences in both death rates and life expectancies that are in keeping with the levels and trends of the last 30 years.

The prediction intervals for life expectancies shown in Fig. 5 are obtained from the percentiles of simulated life expectancy values, which in turn are obtained from simulated future death rates as described in Hyndman and Booth (2008). A feature of this method is that it takes account of the nonlinear relationship between age-specific death rates and life expectancy, thus producing asymmetric intervals.

To examine forecast accuracy, Fig. 6 shows out-of-sample RMSFE values of log death rates based on Eq. (9). Both the coherent and independent models are more accurate for female than for male mortality, but this sex difference is much smaller in the coherent case. The coherent model is more accurate than the independent model for male mortality but less accurate for female mortality. The better performance for male mortality is attained at the expense of accuracy for female mortality, resulting in greater similarity in the degree of forecast accuracy for the two sexes. In overall terms, the coherent method performs marginally better (an average RMSFE of 0.259 compared with 0.264 for the independent models). Thus, achieving coherence has not resulted in a reduction in overall accuracy.

### Application to Australian State-Specific Mortality

In the second application, we consider the six states of Australia: Victoria (VIC), New South Wales (NSW), Queensland (QLD), South Australia (SA), Western Australia (WA), and Tasmania (TAS). The Australian Capital Territory and the Northern Territory are excluded from the analysis because of many missing values in the available data. For each state, age-specific death rates and corresponding populations for 1950 to 2003 are obtained from Hyndman (2008). The ranking of life expectancy by state has changed over the fitting period; by 2003, the highest life expectancy occurred in Western Australia and Victoria and the lowest (by a margin of more than 1 year) occurred in Tasmania.

Figure 7 shows the first three basis functions of the product model and their coefficients along with 30-year forecasts. The first principal component of the product model accounts for 96.5 % of total variation. The first three principal components of the ratio model for Victoria are plotted in the bottom panels of Fig. 7. The first, second, and third terms account, respectively, for 37.0 %, 13.5 %, and 9.2 % of the total variation. For other states, the first term accounts for between 28.7 % and 52.1 % of the total variation. (Ratio models for other states are not shown.) No cohort effects or other age patterns were observed in the residuals for the models.

Figure 8 shows 30-year life expectancy forecasts by state produced by the coherent and independent models. The life expectancy forecasts from the independent models are diverging, whereas the forecasts from the coherent model are more constrained. For the coherent forecasts, the ranking of the six states changes over the forecast horizon, although Western Australia and Tasmania maintain their 2003 positions as highest and lowest, respectively. The overall difference between the highest and lowest forecast values and the differences between states are consistent with observation.

Figure 8 shows out-of-sample RMSFE values derived from coherent and independent models for the Australian state data. The coherent model is more accurate at all horizons for four of the six state forecasts; the greatest loss of accuracy is for New South Wales. The increase in accuracy is greatest for Tasmania, which might be regarded as an outlier by 2003. On average, the coherent model performs slightly better (average RMSFE of 0.325 compared with 0.346 for the independent models).

## Discussion

In this article, we proposed a new method for coherently forecasting the age-specific death rates of two or more subpopulations. The product-ratio functional method is based on functional forecasting of simple functions of the products and ratios of the subpopulation death rates rather than the rates themselves. Using two-sex data for Sweden and six-state data for Australia, we demonstrated the coherence of our results while also demonstrating that the coherent method is at least as accurate in overall terms as independent forecasts.

### Product-Ratio Method

The use of products and ratios is a simple and interpretable way of imposing coherence on the forecasts. The product function represents the geometric mean of subpopulation death rates—a version of the familiar mortality schedule. The ratio function is the ratio of the death rates of a particular subpopulation to the geometric mean rates; in the two-sex case, this is simply the square root of the sex ratio (and its inverse). Being a relative measure, the ratio function is easily understood and has the advantage of direct comparability over age (Fig. 2).

The use of the product and ratio functions also greatly simplifies the modeling procedure compared with the direct use of rates because the two functions are uncorrelated. The resulting model is elegant in its simplicity. The new method ensures coherence in terms of long-term trends in forecast rates and produces results that remain within the observed range of variation for mortality ratios of subpopulations. The product-ratio method can handle outliers; in such cases, the ratio function is larger than for other subpopulations and may have a different age pattern, but the forecast ratio function is kept within observed bounds.

A limitation of the product-ratio method is that the long-term forecasts of the ratio function will converge to the age-specific mean ratios, which are determined by the choice of fitting period. We address this limitation in two ways. First, the functional model that we employ incorporates weights that give greater weight to recent data (Hyndman and Shang 2009), so the long-term forecasts will converge to approximately the weighted mean ratios, which are largely immune to the earliest data, increasing their relevance to the future. (In our applications, we use a weight parameter of 0.05, which has the effect of giving weight 0.0500 to the most recent data, 0.05(0.95) = 0.0475 to the year before, 0.05(0.95)^{2} = 0.0451 to the year before that, and so on.) However, we acknowledge that the weighted mean ratios may not be optimal at all ages and in all cases. Second, we use ARFIMA models for the coefficients of the ratio function; these models converge very slowly to the mean of the historical values of the coefficients. Tests were conducted to evaluate the possibility of basing the ratio function on a shorter fitting period while maintaining the longer fitting period for the product function; this made almost no difference to average RMSFE values but destabilized the ratio forecast. The price of slow convergence using ARFIMA models is that a longer data series is needed for stable estimation. The full fitting period was therefore used for both functions. The use of the longer period for the ratio function may also be justified on conceptual grounds in that coherence implies that the subpopulation ratios are stochastically stable over time (i.e., the age-specific ratios are assumed to be stationary).

An important advantage of the product-ratio method is greater flexibility in the age pattern of change in forecast mortality. When the product and ratio functions are combined to produce subpopulation death rates, the age pattern of mortality embodied in the product function is modified in accordance with the age pattern of the ratio function. Where the ratio function is close to 1, there is little modification of the geometric mean death rate; where the ratio function differs from 1, the geometric mean rate is significantly modified (by multiplying by the ratio). The forecast product function is typically governed by an almost constant age pattern of change because of a dominant first principal component. In contrast, several principal components are typically required to model the more complex ratio function, producing a variable age pattern of change over time. This variable pattern of change in the ratio function provides flexibility in the age pattern of change in mortality forecasts. The product-ratio functional approach thus allows for greater flexibility of change over time than the independent functional approach. Further, even if using only one principal component to model the product and the ratio functions—and hence fixed patterns of change for both—the product-ratio approach would allow flexibility of change in forecast mortality. The new approach allows for evolving ratios among forecast subpopulation rates (e.g., sex ratios of forecast rates); this is in contrast to the Li and Lee approach, wherein the ratios remain fixed over time (Li and Lee 2005:582).

### Coherence

The terms “nondivergent” and “coherent” have previously been used interchangeably (Li and Lee 2005). In developing the product-ratio method, we have more precisely defined coherence to be the convergence of the ratios of the forecast age-specific death rates from any two subpopulations to appropriate constants. In this article, we used weighted historic mean ratios as the constants, but other values, including theoretically derived values, could be used instead.

Coherence, the essential feature of the product-ratio method, is ensured by the convergence of the ratio function to constant age-specific ratios. In the two-sex case, eventually constant ratios are in keeping with their biological basis (Kalben 2002). In the multistate case, the approach may be regarded as conservative in that constant ratios imply equal rates of mortality improvement in the long term.

Convergence to constant ratios does not imply that mortality differences between subpopulations tend to a set of constants. Trends in forecast differences depend on both the ratio and product functions. Only if both the forecast ratio function and forecast product mortality were constant would subpopulation differences be constant. The forecast ratio function typically increases at some ages and decreases at others; the net effect is that differences in life expectancy may be convergent or divergent. As the ratios converge to their historic means, their ability to effect change diminishes. For constant ratios, changing product death rates produce differences between subpopulations that change at the same rate as product mortality. Thus, change in product mortality governs whether and how differences converge or diverge in the long term. Convergence is constrained by the level of mortality and would occur only when death rates reach zero. Divergence and constant differences are similarly constrained, and for most populations are unlikely to be forecast given observed mortality trends.

For life expectancies at different ages, the age pattern of the mean ratios also influences forecast trajectories. This may result in widening differences in life expectancy at certain ages even when mortality is declining at all ages. Thus, while forecast death rates are constrained to tend toward a defined relative differential, forecast life expectancies are less precisely constrained.

Figure 9 shows results for four additional populations (the United States, Canada, the United Kingdom, and France), illustrating a range of outcomes. In all cases, the sex gap in forecast life expectancies from the coherent model is well-behaved over both time and age.

### Forecasts

The results for Sweden show a life expectancy at birth in 2037 of 86.4 years for females and 82.9 years for males. The forecast sex gap decreases from 4.0 years in 2007 to 3.5 years in 2037. Similarly for life expectancy at all other ages, the forecast sex gap decreases with slow convergence toward positive values.

The forecast life expectancies at birth in 2037 imply average annual increments of 0.113 years for females and 0.138 years for males. These are less than the average annual increments over the fitting period, implying deceleration in the rate of increase. For females, deceleration is in fact observed over the fitting period (Fig. 5) and is continued in the forecast. Bengtsson (2006) noted that deceleration has been a recent feature of several formerly leading countries of female life expectancy. For males, the linear trend since 1980 in observed life expectancy is not continued in the forecast.

Although approximate linearity has been observed in the recent life expectancies of developed countries (White 2002), it is also the case that approximate linearity is observed in the coefficient of the first principal component of models of death rates (Lee 2006). When linearity is continued in the forecast of the coefficient, it does not produce linearity in forecast life expectancy. The main reason for this is that the principal component model assumes a fixed age pattern of change, whereas life expectancy takes into account the fact that the age pattern of change varies over time (Booth et al. 2002). This is illustrated in Fig. 10 for Swedish female mortality. The pessimistic forecast is based on a functional model using a single principal component with linear coefficient, and hence a fixed age pattern of change. The optimistic forecast is a random walk with drift model of life expectancy. This latter model is heavily dependent on the fitting period, and in particular on the first year of the fitting period. If we were to discount the first few years of the fitting period, when the trend is particularly steep, a more gradual increase in forecast life expectancy would result, but this would still be greater than the increase embodied in the single principal component model.

In the Swedish example, several additional interconnected factors contribute to the relatively slow and diminishing increase in forecast life expectancy. Change in the product function, which is the main determinant of the trend in forecast male and female life expectancies, is more closely linear than in either male or female mortality (as determined by the independent models). This contributes to the result that linear change (the first principal component seen in Fig. 3) accounts for the very high percentage of variability after smoothing, which means that the functional forecast of the product term is effectively based on the single linear principal component, and the age pattern of change is fixed. Thus, the forecast of product function mortality embodies a deceleration in the increase in life expectancy.

When there is almost total dominance of the first principal component in the product model, the forecast trend can be heavily dependent on the fitting period (especially if a random walk with drift model is used, which depends totally on the first year and last year of the fitting period). In the Swedish example, the ARIMA(0,1,1) model (with drift) used for the first coefficient somewhat reduces this dependence by giving extra weight to more recent years. Further, we have used weighted functional methods to give greater weight to the more recent data and reduce dependence on the particular fitting period; however, this has little effect in the Swedish example because of linear change in the product function over the fitting period (Fig. 3).

In other work, we found the first principal component to account for about 94 % of variation (Hyndman and Booth 2008), allowing some influence of the second and higher principal components in the forecast. The main effect of the additional principal components is to modify the age pattern of change embodied in the first principal component so that the overall age pattern of change is not fixed. This has the effect of modifying the annual increments in forecast life expectancy.

For Sweden, the ratio model allows for greater flexibility over time in the age pattern of change. For example, one effect of the first principal component is the “shifting” of the primary mode of the ratio function to slightly older ages over the course of the fitting period, while the second component is largely responsible for the increase and then decrease in the secondary mode and also contributes to shifting it to older ages. At the oldest ages, there is little movement over time. In the forecast, the shifts are reversed because the pattern of change tends toward the weighted mean ratio function, but there is relatively little movement over the forecast horizon. At childhood ages, the forecast involves an increase in the ratio. In other words, female rates are declining faster than male rates, and the female advantage is increased. At young adult ages (roughly 20 to 30), this trend is reversed—the female advantage is reduced. Very little change is forecast at about ages 30 to 45. At about ages 45 to 70, the female advantage is increased; at older ages, it is slowly reduced over time in line with recent trends whereby the male mortality disadvantage has diminished.

When the forecast ratio functions for male and female mortality are applied to the forecast product function, the resulting life expectancies are not equidistant from geometric mean life expectancy. Two factors contribute to this result. First, although symmetric on the log scale, the sex ratio and its inverse are asymmetric in their effects on product death rates: the larger effect occurs for males. Second, the effect of a constant ratio is greater for lower death rates. Both factors produce a greater difference from geometric mean life expectancy for male mortality than for female mortality. However, these differences are slight, and the main determinant of the trends in male and female forecast life expectancy is the forecast of geometric mean mortality. In the Swedish example, this explains why male forecast life expectancy decelerates more rapidly than does female life expectancy.

The forecast life expectancies for Australian states show slow convergence of some pairs of states and slow divergence of others. Several trajectories intersect: for example, NSW and QLD. These results are the outcome of the combined influences of overall mortality decline, convergence of age-specific ratios to constant means, and the age pattern of these means, as discussed earlier herein. The average forecast rate of increase is less than the observed average. As in the case of Sweden, the dominant first principal component of the product function (a linear coefficient with fixed age pattern of change) produces slightly convex forecasts of life expectancy.

### Forecast Accuracy

It has been shown that the product-ratio functional method ensures coherence without compromising the overall (average) accuracy of the forecasts. In the two-sex case, the accuracy of the male mortality forecast was improved at the expense of accuracy in female mortality forecast; in other words, by adopting the coherent method, the accuracy of the forecasts for the two sexes was (partly) equalized. Similarly in the six-state example, forecast accuracy is more homogeneous in the coherent method, with an improvement in overall accuracy. This feature of the method is useful in practical applications, such as in population forecasting, for which it is preferable to maintain a balanced margin of error and hence a more balanced forecast population structure than might occur in the independent case. Similarly, in financial planning, it may be preferable to accept moderate error in all subpopulations rather than risk a large error in one subpopulation.

Coherent forecasting incorporates additional information into the forecast for a single subpopulation. The additional information acts as a frame of reference, limiting the extent to which a subpopulation forecast may continue a trend that differs from that of trends in related subpopulations. A similar approach has previously been adopted by Ortega and Poncela (2005) in forecasting fertility, as well as by Li and Lee (2005) for mortality.

The use of additional information raises the question of the choice of information. How important is it that the subpopulations be appropriately chosen? In the two-sex case, in which data for both females and males are used to forecast mortality for each sex, the choice is predetermined. Similarly, there is little doubt about the appropriateness of geographic subpopulations of a country. The grouping of several countries, however, is less clear (Li and Lee 2005). We note that the outlier state of Tasmania has been successfully forecast, and similar results can be expected for countries that might be considered outliers in a larger region. The product-ratio method is relatively insensitive to the composition of the product function because of the simpler orthogonal relationship between the product and ratio functions (compared with the more complex correlated relationship between the group and individual rates used by Li and Lee).

Evidence suggests that the product-ratio functional method may produce more accurate forecasts than other methods. The evaluations of Booth et al. (2006) show that the functional data model of Hyndman and Ullah (2007) produces more accurate forecasts of death rates than the Lee-Carter method and its variants in 13 of 20 populations. In this research, we have used an improved functional data model, which incorporates weights (Hyndman and Shang 2009) and has been shown to produce more accurate forecasts than any other method based on a principal components decomposition, including the Lee-Carter variant used by Li and Lee (Shang et al. 2011). To compare our results for Sweden with those of Li and Lee (2005), we extended our forecasts to 2100: the product-ratio method produced a sex gap in life expectancy at birth of 2.2 years, compared with 3.0 years by the Li-Lee method. Forecast life expectancy in 2100 was 91.3 years for females and 89.1 years for males, compared with approximately 93 and 90 years, respectively, from Li and Lee (2005). Further research is needed to fully evaluate methods for coherent mortality forecasting.

### Generalization of the Li-Lee Method

_{t}is a random walk with drift, ϕ(

*x*) is the first principal component of the common factor Lee-Carter model (based on combined male and female data), γ

_{t,j}is an AR(1) process, ϕ

_{j}(

*x*) is the first principal component of the residual matrix of the common factor model, and

*e*

_{t,j}(

*x*) is the error. The residual matrix is based on β

_{t}after adjustment to match annual life expectancy (Lee and Miller 2001). Coherency is achieved by modeling γ

_{t,j}as an AR(1) process.

There are several points of difference between Li and Lee (2005) and the product-ratio functional method. First, Li and Lee (2005) used a single principal component in each model, but we use up to six (Hyndman and Ullah 2007)—although in practice, only the first two have much effect for mortality. The additional principal components capture patterns in the data that are supplementary to the main trend and that may be significant. Booth et al. (2006:Table 3) showed that methods involving additional principal components were at least as accurate in 16 of 20 populations. The incorporation of additional principal components has the advantage that modeled death rates are no longer perfectly correlated across age (which is a feature of single-component models). The use of the multiple principal components does not add significantly to the complexity of the model because the γ_{t,l} coefficients (*l* = 1, . . . , *L*) are uncorrelated, by construction; there is thus no need to employ a vector approach.

Second, Li and Lee (2005) restricted the time series models to a random walk with drift for β_{t} and a first-order autoregressive process (AR(1)) for γ_{t,j}, whereas we allow more varied dynamics to be modeled through more general ARIMA processes for each β_{t,l}, *l* = 1, . . . , *L*, and any stationary ARMA(*p*,*q*) or ARFIMA(*p*,*d*,*q*) process for each γ_{t,l}, *l* = 1, . . . , *L*. Third, in the Li and Lee (2005) approach, the γ_{t,j} coefficients are highly correlated with each other (over *j*), necessitating a vector approach (Reinsel 2003) for modeling the γ_{t,j} coefficients, which adds considerably to the complexity of the method.

Finally, by modeling γ_{t,j} instead of *f*_{t,j}, the Li and Lee (2005) model includes observational error, which will be propagated in the forecasts. In the functional approach, observational error is removed by smoothing and added back in after modeling as part of variance estimation.

Thus, the Li and Lee (2005) model can be considered a special case of the more general product-ratio functional method—the case in which there is no smoothing, each model has only one component, and the time series models used are either a random walk with drift or a first-order autoregression.

### Extensions and Practical Application

A possible extension of the method would be to incorporate two or more dimensions in defining the product in order to obtain coherency between and within each dimension; for example, sex and state might be used to coherently forecast male and female mortality within state, and mortality by state within each sex. A further possible but less elegant extension is the use of an external standard mortality schedule in place of the product, combined with ratios of subpopulations to this standard. This option may be useful where data for the appropriate group are missing. Best-practice mortality (Oeppen and Vaupel 2002) may provide a useful standard. Further research is needed to examine these possibilities in greater detail.

Finally, it should be acknowledged that the development of improved forecasting methods for mortality, and by extension for fertility and migration (Hyndman and Booth 2008), represents a step toward more reliable and more easily automated demographic forecasting and the acceptance of these stochastic methods by national statistical offices responsible for producing official population projections. Although more complex than traditional methods, these methods are easily accessible through user-friendly code now available on the Comprehensive R Archive Network (CRAN) (Hyndman 2010). Application of the methods is considerably simplified by this free software.

## Note

^{1}

If *X* and *Y* are two random variables, then Corr(*X* – *Y*, *X* + *Y*) = Var(*X*) – Var(*Y*). So if *X* and *Y* have equal variances, then *X* – *Y* and *X* + *Y* are uncorrelated.

## References

*l*

_{1}constraint

*R package version 3.222.*Retrieved from http://robjhyndman.com/software/addb

*Why men die younger: Causes of mortality differences by sex*(SOA Monograph M-LI01-1)