## Abstract

In this article, we show how stochastic diffusion models can be used to forecast demographic cohort processes using the Hernes, Gompertz, and logistic models. Such models have been used deterministically in the past, but both behavioral theory and forecast utility are improved by introducing randomness and uncertainty into the standard differential equations governing population processes. Our approach is to add time-series stochasticity to linearized versions of each process. We derive both Monte Carlo and analytic methods for estimating forecast uncertainty. We apply our methods to several examples of marriage and fertility, extending them to simultaneous forecasting of multiple cohorts and to processes restricted by factors such as declining fecundity.

## Introduction

Forecasting uncompleted cohort experience is a key task in demography. Diffusion models—which describe how a population adopts a new innovation, technology, or behavior—are potentially useful in this respect. We analyze the Hernes, Gompertz, and logistic innovation diffusion models and develop a unifying framework for time-series-based probabilistic forecasting of cohort processes with these models. We introduce the concept of stochastic diffusion, which both expands the theoretical coverage of the models to include period effects and allows evaluation of the probabilistic forecast uncertainty. Applications to marriage and fertility rates illustrate the usefulness of these new methods and extend the methods to simultaneous forecasting of multiple cohorts and to processes restricted by factors such as declining fecundity.

## Background

The Hernes, Gompertz, and logistic diffusion models are commonly used in the social sciences. The Hernes model has been developed and applied for forecasting cohort marriage patterns (Goldstein and Kenney 2001; Hernes 1972; Li and Wu 2008). For cohort fertility forecasts, the Gompertz model previously used to fit period fertility (Hoem et al. 1981; Pollard and Valkovics 1992) can also be used to predict cohort rates (Goldstein 2010). The logistic diffusion model has not been used often for modeling cohort schedules (cf. Ike 2002) but is the standard model of population growth subject to constraints (Pearl and Reed 1920; Preston et al. 2001). Furthermore, in the economic literature, the logistic model had been used extensively to forecast innovation diffusion (Gruber and Verboven 2001; Harvey 1984; Mar-Molinero 1980; Meade and Islam 2006).

Although developed in a deterministic setting, the Hernes, Gompertz, and logistic models can all be extended to a stochastic setting. We do this by allowing random shocks to influence the processes. Introducing randomness is appealing because it acknowledges that the model under consideration is not the only influence on behavior. Introducing random shocks not only incorporates the possibility of other influences but, in our formulation, quantifies the importance of these outside factors to uncertainty in forecasts. A further advantage of introducing randomness is that it allows inclusion of influences that may extend across periods or affect cohorts in similar or correlated ways. In short, we see the stochastic models that we introduce here as a step forward in making diffusion models broader and more realistic.

Our approach builds on the stochastic forecasting framework pioneered by Alho, Lee, Tuljapurkar, and others (Alho 1990; Lee 1993; Lee and Tuljapurkar 1994). We take from these approaches the idea of modeling temporal change as a single or set of univariate stochastic time series, with the difference being that our approach is applied to cohort processes. An innovation of our approach is that we introduce stochastic elements in the context of behavioral cohort models. The mathematics used applies generally to linearizable difference equations and is particularly suitable for the cohort models we focus on. Cohort forecasting is of particular interest to those studying the behavioral basis of demographic rates because it relates to the life course behavior of individuals. For example, cohort fertility or marriage behavior is that experienced by real individuals, as opposed to the synthetic nature of period indices.

The models that we explore were all based in their original formulation on differential equations, wherein the levels and previous rates of change influence the subsequent evolution of the process. We introduce a stochastic element to these differential equations. In physics, finance, and many other fields, the new field of stochastic differential equations has allowed the introduction of random perturbations into previously deterministic models (e.g., the Ito equations with a wide range of applications (Øksendael 2003) or the Black-Scholes (1973) option-pricing equations). Here, we take a first step at introducing a similar conceptualization to demographic models.^{1}

Our approach is based on linearization of the models. When forecasting is the goal, linearization has certain advantages over alternative methods, such as fitting the diffusion curve to observed cumulative proportions, or change in the proportions (Billari and Toulemon 2006; Goldstein and Kenney 2001; Hernes 1972; Martin 2004). In particular, linearization makes the estimation easy and allows the incorporation of stochastic shocks in a straightforward additive rather than a multiplicative fashion.^{2}

Linearizing the diffusion model, in itself, is not new. Winsor (1932) showed how the logistic and Gompertz models can be linearized with respect to time. Harvey (1984) took the next step by showing how the predictions of a logistic model can be constructed from an autoregressive integrated moving average (ARIMA) time series model fit to the underlying linear process. In much of the research, however, the linear process has been modeled as a deterministic time trend with transient disturbances (Frances 1994; Li and Wu 2008).^{3} For example, Li and Wu (2008) used the Hernes model to predict first marriages and then based the predictions on a deterministic underlying process, modeled using a linear regression line. If a more dynamic difference-stationary structure is allowed, as in Harvey (1984), no attempt to derive prediction variance has been made.

We build on prior research on modeling cohort processes with diffusion models by (1) treating the underlying linear process as a dynamic nonstationary process; and (2) deriving forecasts and an analytic variance estimator for the forecasts. The approach allows the user to estimate and understand the sources of the probabilistic uncertainty in the predictions, a topic that is becoming increasingly important in demography (Alho et al. 2006; Keilman and Pham 2004; Lee 1998; Lutz and Goldstein 2004; Lutz et al. 2001). Empirical applications extend the methods to simultaneous forecasting of correlated cohorts and to processes restricted by factors such as declining fecundity, and illustrate that the new methods are useful in quantifying the prediction uncertainty.

Our work is distinct from the large literature that deals with diffusion in various scientific contexts. In spatial and network analysis, the word *diffusion* is often used to describe the influence or feedback between neighboring or linked observations (Anselin 1988; Valente 1995). In spatial analysis, the problem often reduces to the specification and estimation of linear regression models, which describe how regions are linked in time and space (Doreian 1980; Land et al. 1991; Tolnay et al. 1996). In network analysis, the central focus is on describing the structure of the linkages between individuals, and analyzing how the linkages influence the flow of ideas or behaviors (Christakis and Fowler 2008; Cowan and Jonard 2004; Marsden and Friedkin 1993).

In demography and sociology, diffusion models such as Hernes are often used to analyze and forecast the adoption of new ideas in the same spirit we do. The statistical issues that are confronted, however, are mainly about the estimation of the model parameters. The model itself is seen as deterministic, and the only source of prediction uncertainty comes from the uncertainty in the data and in the model parameters (Goldstein and Kenney 2001; Li and Wu 2008). When within-model stochasticity is allowed, this often pertains to individual-level uncertainty. For example, in microsimulation and in agent-based modeling of demographic processes, the diffusion arises from micro-level interactions (Billari and Prskawetz 2003; Hammel et al. 1990; Wachter 1987). At the individual level, there is uncertainty regarding the outcomes, but the macro-level uncertainty is often seen more as a nuisance that arises from the stochastic simulation and needs to be averaged out, rather than being a feature characterizing the process.

Also, in the survival formulation of the diffusion models, the stochasticity is only at the individual level (Diekmann 1989). These formulations may be very useful for estimating the model parameters, in particular because standard statistical packages often allow flexible estimation of survival models. However, the uncertainty in predictions (if they are made) is then limited to the uncertainty in the model parameters. The Coale-McNeil model for first marriages (Coale and McNeil 1972) also incorporates stochasticity only at the individual level.

Another advantage of the consistent stochastic framework that we propose is that we allow shocks indexed by time to influence the cohort processes. This extends the reach of cohort models to allow period influences and to be consistent with stochastic period models (Alho 1990; Lee 1993; Lee and Tuljapurkar 1994). The framework that we put forward may also be useful in analyzing Gompertz mortality models based on a declining stock of vitality with age, or perhaps for testing the hypothesis of an invariant rate of aging (Vaupel 2010). Another use of our framework is to allow researchers to test the behavioral assumptions of diffusion models by seeing whether outside shocks propagate over time.

This article is organized as follows. The next three sections show how estimation, prediction, and prediction interval estimation can be done in the Hernes, Gompertz, and logistic cohort diffusion models, using the dynamic time-series approach. The first section on Hernes is the most detailed, since the Gompertz and logistic cases are highly analogous to the Hernes case. The derivations of the analytic variance estimators are given in the appendix. Following the introduction of the methods, we illustrate the techniques by applying them to marriage and fertility. To anticipate the results, Table 1 summarizes the key results by showing the model equations, linearizations, prediction equations, and prediction-variance estimators.

## The Hernes Diffusion Model

### The Model and Its Linearization

*P*

_{t}be the proportion in a cohort that by age

*t*has adopted the innovation under study. Assume that

*P*

_{0},

*P*

_{1}, . . . ,

*P*

_{t}are observed and that

*P*

_{t + 1},

*P*

_{t + 2}, . . . ,

*P*

_{t + k}are being predicted, using the Hernes model. The Hernes diffusion model (Hernes 1972) for a proportion

*P*

_{t}iswhere

*a*> 0 and 0 <

*b*< 1. A behavioral interpretation of the model in the context of first marriages, which is what Hernes (1972) modeled, is that a person’s chance of first marriage depends on the parameter

*a*that represents the initial level of marriageability; on a peer-pressure effect proportional to the fraction already married,

*P*

_{t}; on the choice of remaining eligible partners ; and on an age effect,

*b*

^{t}, allowing for the lessening attractiveness of marriage with age attributable either to heterogeneity or to the aging process. The model can be solved for

*P*

_{t}aswhere

*P*

_{0}is the initial value at the first age of marriage. The model in Eq. (1) can be linearized with respect to cohort age

*t*with

*g*

_{t}the underlying linear process.

^{4}Equation (4) is a special case of what Li and Wu (2008) called the latent function of the Hernes model. If

*g*

_{t}is assumed to be a deterministic process, then

*P*

_{t}is also deterministic, and the source of prediction uncertainty is the uncertainty in the model parameters. This is the approach of Li and Wu (2008), where the latent function is modeled as a deterministic linear regression model. Here, we take a different approach and assume that the underlying process

*g*

_{t}is a time-series process: for example, an ARIMA. As a special case, we consider the random walk with drift model , where

*g*

_{0}is the initial level of the process; is the drift parameter; and

_{i}are independent, normal, and zero mean shocks with variance . This is a potentially useful representation of

*g*

_{t}: the model is parsimonious but allows the past to influence the future, consistent with the nature of the diffusion concept. The models also allow for arbitrary random shocks, perhaps because of a new period effect. We estimate the parameters using

Here, the first observation for *P* is at time 0, and there are *t* + 1 observations of the process *P*. Because of the approximation of the derivative in Eq. (4), the first and last observations for the underlying linear process *g* are *g*_{1} and *g*_{t –}_{1}, and there are *t* – 2 observations for *g*. Consequently, the denominators in (5) are *t* – 2 and *t* – 3, respectively.

### Prediction

*k*-step-ahead predictions and are based on predictions for the underlying linear process

*g*. Under a random walk with drift, these are and . The predictions and can be derived in several ways; we use the prediction equation:

*P*

_{t}in terms of

*P*

_{t – 1}and

*g*

_{t}. The assumption in this approximation is that a one-step change is close to the average change over two periods: that is, .

^{5}

### Prediction Variance

*k*-step-ahead prediction iswhere is the variance of the error term ,

*g*

_{t}is the value of the underlying linear process at last observation, and . The derivation of the analytic variance estimator is given in the appendix. If the underlying linear process is something other than a random walk with drift, or if computational methods are preferred over analytic ones, a straightforward Monte Carlo variance estimator based on simulating the process

*g*can be applied.

Equation (8) is important because it reveals the various sources of the prediction uncertainty. First, the multiplying factor shows that the prediction variance grows linearly with the variance of the error term in the underlying linear process. Second, the factor exp(2*g*_{t}) implies that if the predictions are made at a late age, the prediction variance is small. This is because the drift in *g*_{t} is negative, so late age (a large *t*) results in small *g*_{t} and small exp(2*g*_{t}). Conversely, if the predictions are made at an early age, the variance is large. Third, the term exp() implies that if the drift in *g* is large—that is, the diffusion takes place rapidly—the variance is small. Conversely, if the drift is close to 0 and diffusion happens at a slow pace, the variance is large. Finally, remembering that , we see that if the proportion *P* is close to the upper bound 1, then the coefficients are small, and additional contribution to the variance is also small. Similar remarks apply also to the Gompertz and logistic diffusion models (discussed in the next two sections).

The key results of this section are summarized in Table 1 in the rows marked “Hernes.”

## The Gompertz Diffusion Model

### The Model and Its Linearization

*P*

_{t}, where

*t*is the age. Values

*P*

_{0},

*P*

_{1}, . . . ,

*P*

_{t}are observed, and

*P*

_{t + 1},

*P*

_{t + 2}, . . . ,

*P*

_{t + k}are predicted. The Gompertz growth model for

*P*

_{t}isand the solution for the cumulative rate is

The behavioral interpretation of the Gompertz model (9) in the context of cumulative fertility is based on the idea that there are two countervailing forces influencing age-specific fertility (see Goldstein 2010). On the one hand, there is a kind of social contagion or influence that causes fertility rates to increase proportional to cumulative fertility of the cohort. The right-hand term *P*_{t} is the cumulative fertility by age *t*, and *a* is a parameter governing the strength of social influence. The opposing force is the tendency of fertility to decline with age itself, a potential combination of (1) biological fecundity decline, (2) a declining susceptibility to social pressure with age (“people becoming more set in their ways”; Hernes 1972), and (3) the heterogeneous achievement of family-size goals, reducing steadily the fraction of people who want more children. The exponential function exp(−*bt*) is used as parsimonious way to describe continuous decline without taking negative values.

*a*−

*bt*). To accommodate the model for discrete data, we use the discretization , proposed by Li and Wu (2008) in the context of the Hernes model. With this linearization we have

We model the underlying linear process *g*_{t} as a time-series process. In the case of a random walk with drift, the model parameters are estimated using Eq. (5).

### Prediction and Prediction Variance

*k*-step-ahead predictions and are based on predictions for the underlying linear process. Under random walk with drift, these are and . To derive the predictions and we use the approximation , which was used also in the Hernes case. We proceed in deriving the predictions as follows. First, note that for the Gompertz model, exp(g

_{t}) describes proportional change. This can be approximated by

*g*

_{t}) in Eq. (11) allows expressing

*P*

_{t}in terms of the previous observation

*P*

_{t – 1}and current value of

*g*

_{t}: . This leads to recursive prediction:

*k*-step-ahead prediction is derived in the appendix; the result is

As in the Hernes case, the estimator (13) reveals the sources contributing to prediction uncertainty. First, the prediction variance grows linearly with the variance of . If the predictions are made at a late age, so that *g*_{t} is small, the prediction variance is small. If the diffusion is rapid so that the absolute value of is large, the variance is small. The only major difference with respect to the Hernes variance equation is that Eq. (13) does not include terms of the type , which implied that as *P* gets closer to 1, increase in variance is small. These terms are absent here because the Gompertz process is not limited to the unit interval.

Table 1 also summarizes the key results of this section: the Gompertz diffusion model.

## The Logistic Diffusion Model

### The Model and Its Linearization

*P*

_{t}is the proportion in a cohort that has, by age

*t*, adopted the innovation under study;

*P*

_{0},

*P*

_{1}, . . . ,

*P*

_{t}are the observed proportions; and

*P*

_{t + 1},

*P*

_{t + 2}, . . . ,

*P*

_{t + k}are predicted. The logistic diffusion model for the proportion

*P*

_{t}isand the solution for the cumulative proportion is

*P*

_{t}=

*a*/ [1 + exp(

*a*–

*bt*)]. For a behavioral interpretation of the logistic diffusion model, see Mansfield (1961). The model is linearized by . To accommodate the model for discrete data, we use the discretization . This gives us

We model the underlying linear process *g*_{t} as a time-series process. In the case of a random walk with drift, the model parameters are estimated using Eq. (5).

### Prediction and Prediction Variance

*P*

_{t}in terms of

*P*

_{t – 1}and

*g*

_{t}, we use the approximation

*P*

_{t}in terms of

*P*

_{t – 1}and

*g*

_{t}:. The predictions can then be constructed recursively as

Table 1 also summarizes the key results of this section: the logistic diffusion model.

## Applications and Extensions

We first illustrate the use of our methods under controlled conditions by using simulated data. We then consider two real data applications that extend the basic models presented earlier. In the first application, we use the Hernes model to forecast the proportion ever married among the French 1965–1975 female cohorts and also estimate the probability of a cohort crossover. In this application, we extend the single-cohort Hernes model to allow the cohort processes be correlated in (period) time. In the second application, we forecast Dutch fertility for the 1960–1977 birth cohorts using the Gompertz model and introduce a method for correcting the predictions for age-specific fecundity decline.

### Illustration with the Hernes Model

We simulate random walk with drift data with , *g*_{0} = 0, and transform these data to a Hernesian process with the updating equation and initial value *P*_{0} = 0.001. We then “observe” *P* for *t* ≤ 20 and predict *P* for *t* > 20.

Panels A, B, and C of Fig. 1 illustrate the simulation. Panel A shows the simulated for one sample path, the simulated 95 % prediction interval for *t* > 20 from 1,000 sample paths that take off at *t* = 20, and the estimated 95 % prediction interval based on the first 20 observations. The estimate for is 0.092, a value close to the true value 0.1. Panel B shows one full realization of the simulated process *g*_{t} with predictions and 95 % prediction interval based on first 20 observations.^{6} The drift estimate is −0.154, which is close to the true drift = −0.15. Panel C shows one full realization of the simulated *P*_{t} and predictions (using Eq. (6)) and the 95 % prediction interval (using Eq. (8)) for *t* > 20. The difference between predicted *P*_{t} and the one sample path at *t* = 35 is small. More importantly, the analytic variance estimator accurately captures the within-model uncertainty: for 1,000 simulated *P*_{t} paths that take off at *t =* 20, the coverage rate for the variance estimator at *t* = 35 was 92.6 % at the 95 % nominal level.

### French First Marriages and a Cohort Crossover

We use the Hernes model that was developed and is often used to predict the proportion married within a cohort (Goldstein and Kenney 2001; Hernes 1972; Li and Wu 2008) to model French first marriages in a probabilistic setting. Hernes (1972) gave no uncertainty bounds for the predictions. Goldstein and Kenney (2001) based their marriage rate predictions on survey data and estimated only the uncertainty arising from sampling variation.^{7} Li and Wu (2008) used a deterministic time trend model in conjunction with the Hernes model, and based their prediction intervals on a mix of within-model and model parameter uncertainty.

We use the Hernes model to predict the proportion ever married among the French female 1965, 1970, and 1975 cohorts, and analyze the likelihood that the younger cohorts would catch up with the older cohorts in proportion ever married. For the 1965 cohort, data are observed up to age 40; for the 1970 cohort, up to age 35; and for the 1975 cohort, up to age 30. We use the full observed data starting from age 18 to construct the predictions.^{8}

To analyze the likelihood of a cohort crossover, we first construct the predictions and prediction intervals for each of the cohorts using the random walk with drift specification developed earlier in this article. We then extend the basic single-cohort Hernes model to a correlated-cohorts model, which allows a more realistic analysis of the cohort crossover. Figure 2 shows the results of the single-cohort approach in which no correlation between the cohorts is allowed.

Figure 2 shows that the prediction bounds for the 1965 cohort do not overlap with those for the 1970 and 1975 cohorts. Thus, it seems unlikely that the 1970 or 1975 cohorts would catch up with the 1965 cohort. The prediction interval for the 1975 cohort, however, overlaps with that of the 1970 cohort, suggesting that the 1975 cohort could possibly catch up with 1970 cohort. Such inference, however, assumes that the cohort processes are independent. In reality, period fluctuations may influence cohorts simultaneously, creating correlations across cohorts. The relevant correlation is the period correlation in the innovations in the underlying random walk with drift processes. If the correlation in cohort processes is assumed to be 0, Fig. 2 gives a reasonable picture of the likelihood of a cohort crossover. We, however, estimate the correlation for years 1995–2005 (ages 25–35 for cohort 1970 and ages 20–30 for cohort 1975) to be 0.31.

We use the Monte Carlo method in conjunction with the estimated cohort correlation to analyze the likelihood of a cohort crossover between the 1970 and 1975 cohorts under the assumption that the correlation stays the same at ages not yet observed. We first estimate the drift and variance parameters for the underlying random walk with drift processes for the 1970 and 1975 cohorts. We then simulate two sets of cohort processes with 1,000 simulated marriage paths in each: in the first set, the correlation in the future innovations in the underlying random walk with drift processes for the 1970 and 1975 cohorts is 0. These processes are transformed to predictions that start at age 35 for the 1975 cohort and at age 30 for the 1970 cohort. The results mimic those shown in Fig. 2. In the second set of simulations, we allow the cohorts to be correlated by generating the future innovations in the underlying linear processes from a bivariate normal distribution with estimated standard deviations 0.06 and 0.07 for the 1970 and 1975 cohorts, respectively, and between-cohort correlation .31, and transform these processes to proportions ever married.

In the simulation without correlation, 0.8 % of the 1970 and 1975 cohorts’ marriage paths had crossed by age 40; 2.5 %, by age 42; and 5.4 %, by age 45. In the simulation with correlated cohorts, 0.2 % of the marriage paths had crossed by age 40; 1.8 %, by age 42; and 4.2 %, by age 45. Thus, when the correlation in the cohort processes is taken into account, the estimated likelihood of a cohort crossover drops below .05.

### Dutch Completed Fertility and the Gompertz Model

Kohler (2001) and Bernardi (2003) argue that social interaction influences fertility. Consistent with the social interaction theories, Goldstein (2010) found that the Gompertz model works well in predicting first births and fertility if applied to cohort data. At older ages—and especially for the later cohorts—though, there may be departures from the model. Without prediction intervals, however, it is difficult to assess what is a departure from the model and what is within-model fluctuation.

One of the factors that may result in a departure from model is declining fecundity. At ages above 30, declining fecundity may influence fertility, so that the Gompertz model—which, as such, does not factor in fecundity—is at risk of overpredicting fertility (see, e.g., Goldstein 2010). We use the Gompertz model to forecast completed cohort fertility, and introduce a method for taking declining fecundity into account. We use cohort fertility data for Dutch female cohorts born from 1960 to 1970 (Human Fertility Database 2011) to first explore the fecundity decline and then estimate an infecundity correction. We test the correction to out-of-sample data (cohorts 1950 and 1955) and then forecast cohort fertility for the 1965–1977 birth cohorts.

Figure 3, Panel A, shows the estimated underlying process *g*_{t} for the Dutch female birth cohorts born in 1960–1970. If the Gompertz model held for these cohorts, the process *g*_{t} should be approximately linear. As Panel A shows, the decline in *g*_{t} accelerates with age. The acceleration is present for all cohorts and starts at an approximately same age, close to 30, suggesting that the force behind the acceleration is physiological rather than cohort- or period-specific. We exploit this observation to develop an infecundity correction, which influences the rate of decline in the underlying linear process of the Gompertz diffusion model. The correction is based on the empirical observation that for cohort fertility and ages 30 and older, the underlying process *g*_{t} departs from linearity in a predictable manner.

We allow declining fecundity to influence the Gompertz process by letting the underlying process deviate from linearity. We model *g*_{t} as is a random walk with drift, for ages 18 to 30. For ages above 30, we assume that for each additional year of age, the pace of the decline (the drift) accelerates at a constant rate. That is, at ages 30 and older, , where IFC is infecundity correction. The IFC is a scalar raised to the power of *t* – 30 that describes the accelerating rate of fertility decline at ages above 30. Using the 1960–1970 cohorts and a simple grid search that minimizes the weighted sum of squared errors,^{9} we estimate the IFC to be 1.118. Panel B of Fig. 3 shows the predicted *g*_{t} for the 1960–1970 cohorts with this IFC (observations: ages 15–30; predictions: ages 31–45).

Because the IFC was estimated from the 1960–1970 cohorts, it is not surprising that the model fits well for the same cohorts and the observed (Panel A of Fig. 3) and predicted patterns (Panel B of Fig. 3) look similar. Therefore we test the IFC = 1.118 for two out-of-sample cohorts: the 1950 and 1955 birth cohorts. We use observations up to age 30; estimate the parameters of the underlying process *g*_{t}; and predict *g*_{t} with , with IFC = 1.118. The cohort fertility predictions are derived from *g*_{t} using Eq. (12), and 95 % PI are calculated by using Eq. (13). For comparison, we estimate the predictions also without the infecundity correction (IFC = 1).

Panels A and B of Fig. 4 show the observations, forecasts with and without the infecundity correction, and 95 % prediction interval for the forecasts with the infecundity correction. For the 1950 cohort, the observed fertility at age 45 is 1.90. Without the infecundity correction, the prediction at age 45 is 2.02. With the correction, the prediction at age 45 is 1.90 and the 95 % prediction interval is [1.83–1.97]. For the 1955 cohort, completed fertility at age 45 is 1.87. Without the infecundity correction, the prediction at age 45 is 2.12. With the correction, the prediction is 1.82, and the 95 % prediction interval [1.73–1.90] captures the true value.^{10} Thus, without the infecundity correction, cohort fertility is overestimated. With the infecundity correction, the predictions seem reasonably accurate.

The Gompertz model with the infecundity correction allows probabilistic forecasting of fertility by age over cohorts. Figure 5 shows cohort fertility forecasts for ages 30, 35, and 45 for Dutch female cohorts 1950–1977 (1950–1975 with five-year birth intervals and the 1977 cohort). The data are observed up to year 2008, so cohort fertility is fully observed at age 30 for all cohorts. At age 35, predictions are needed for the 1975 and 1977 cohorts. The graph shows that the prediction uncertainty grows rapidly as we move to more recent cohorts. For the 1977 cohort, predicted cohort fertility is 1.55 (95 % PI 1.51–1.59) at age 35 and 1.85 (95 % PI 1.72–1.98) at age 45. In other words, we know that the 1977 cohort had cumulative fertility 1.06 by year 2008. The prediction intervals suggest that by 2012, we expect the 1977 cohort to have, on average, 0.45–0.54 children more at age 35 and 0.67–0.92 children more at age 45 than in 2008.

The point estimates also suggest that cohort fertility is on the increase as the prediction 1.85 for the 1977 cohort is higher than for any cohort born after 1960. The prediction interval, however, shows that the uncertainty in the predictions is high and grows rapidly as we move to later cohorts. For the 1970 cohort (for which fertility is observed up to age 38), the length of the prediction interval is 0.04 units (1.72–1.77). For the 1975 cohort, the length of the prediction interval increases to 0.22 (1.69–1.91); and for the 1977 cohort, the interval length is 0.26 units (1.72–1.98). Thus, the data are consistent with the hypothesis that cohort fertility is increasing, but the alternative hypothesis—stable or decreasing fertility—cannot be rejected.

## Discussion

We studied prediction and error propagation in the Hernes, Gompertz, and logistic innovation diffusion models and applied the methods to demographic processes. We developed a unified framework in which the predictions and prediction intervals can be derived from an underlying linear process. We derived an analytic closed-form variance estimator for the case of a random walk with drift as the underlying process; for other processes, a straightforward Monte Carlo simulation can be used. The analytic variance estimator revealed the role of different sources in contributing to total uncertainty: most importantly, that the earlier the predictions are made and the slower the diffusion, the larger the uncertainty in the predictions. In the empirical analyses, we further extended the methods in two dimensions. First, we showed how one can move from the single-cohort specification to a multicohort setting in which the cohort processes are correlated in time, allowing a realistic analysis of the probability of a cohort crossover. Second, we developed a method that allows correcting cohort fertility forecasts for declining fecundity.

Simulation studies and empirical applications to first marriages and cumulative fertility showed that the developed methods are useful in quantifying the uncertainty in the predictions: They give a precise sense of the within-model error, and allow forecasters a new ability to characterize the uncertainty. We showed that if the model assumptions hold less than perfectly—as in the case of cumulative fertility where advanced-age fertility seems to be constrained by extra-model factors, such as declining fecundity—the models can be modified to take such factors into account.

This article considers only the within-model error in the prediction uncertainty. To assess the magnitude of total error, future studies will need to expand the range of fitted populations, incorporating fertility and marriage data from the United States and other European countries, as well as historical data, to compare the relative importance of the within-model error against the total error. It is especially interesting to see where the methods do not work. For example, in the case of fertility postponement, the tendency of the uncorrected Gompertz model to overpredict fertility at oldest ages is likely to be an indication of sterility, a phenomenon that the model is not built to capture. Departures from the model may provide means of indirectly estimating the magnitude of lost fertility due to sterility. More generally, if the within-model error accounts for a large fraction of the total error, our methods can be used as gauges of the uncertainty in forecasts. If, however, the within-model error is small, we would recommend characterizing our methods as providing a lower bound on uncertainty, to which a substantial amount of model specification uncertainty would need to be added.

The new methods give raise to several further applications and research questions. First, the methods may prove useful in predicting period fertility rates, which could be done by combining adjoining cohorts. Second, the models allow testing the assumptions of the social diffusion framework by checking whether a shock in fertility at certain age influences the cohort’s subsequent fertility at older ages. Third, it will be fruitful to look at the correlation in model parameters across cohorts and over time and space. The correlations across these dimensions may be used in increasing the accuracy of the estimated model parameters, and in providing a richer description of past marriage and fertility changes.

## Acknowledgements

We wish to credit Ron Lee, who first suggested to JRG that the Hernes diffusion models be fit using a time-series, stochastic diffusion approach.

### Appendix

Here, we derive variance estimators for the Hernes, Gompertz, and logistic models for the case in which the underlying linear process is a random walk with drift. For more general processes, one can use the Monte Carlo method.

#### Hernes Prediction Variance

*P*

_{t}is a constant, so the prediction variance is

*k-*step-ahead prediction, the recursive nature of the Hernes prediction Eq. (6) means that one has to take into account the cumulation of uncertainty. We approach the problem by approximating the Hernes predictions with

*i*≠

*j*result from double-counting of the errors: shocks up to

*t*=

*i*influence both

*g*

_{t + i}and

*g*

_{t + j}, provided that

*j*≥

*i*. Simulation experiments indicated that these off-diagonal elements have a nonnegligible variance contribution. We approximate the off-diagonal using first-order Taylor series approximation as

The interpretation for Eq. (A5) is the following. There are min(*i*,*j*) common shocks in both *g*_{t + i} and *g*_{t + j}, each contributing to the covariance. The terms of the form scale the covariance proportionally to the size of . For *i* = *j*, the equation for the off-diagonal elements (Eq. (A5)) reduces to the equation for the diagonal elements (Eq. (A4)).

*k*-step-ahead prediction variance estimator is obtained by combining (A3)–(A5):where . This is the estimator given in Eq. (8).

##### Gompertz Prediction Variance

*k*-step-ahead prediction, we base the estimator on a linearization whose variance is

*k*-step-ahead prediction variance estimator:

First-order Taylor series approximation would deliver the same estimator. Note that Eq. (A10) is identical to the Hernes estimator shown in Eq. (A6), with the exception that here, .

##### Logistic Prediction Variance

*k*-step-ahead variance estimator for the logistic model:where . Note that in the Hernes case, .

## Notes

^{1}

The conceptual similarity between stochastic differential equations (SDE) and our results is masked by the fact that we work in an exclusively discrete setup.

^{2}

The tendency to treat complex processes as linear is occasionally criticized, as in the “general linear reality” paradigm in which the timing and order of events are irrelevant for the outcome and there is no feedback from the outcome to the effect (see Abbot 1988). The diffusion models we use take this critique into account by allowing for dynamic feedback over time.

^{3}

Deterministic time trend with transient disturbances appears to be contrary to the idea of diffusion processes in which the past influences the future. The difference-stationary specification, which has “long memory” (Raffalovich 1994), seems to fit better for diffusion processes.

^{4}

Deviations from linearity in *g* signal deviations from model assumptions. In principle, one can use standard methods to test the linearity (Harvery and Leybourne 2007; Hinich 1982). In demography, the number of observations is typically small, resulting in low test power. Therefore, visual inspection of the process may be preferred over formal testing, as in Wu (1990). These remarks apply also to the Gompertz and logistic models.

^{5}

The prediction Eq. (6) is preferred because of its simplicity and linearity in exp(*g*_{t}). Alternatives include , which follows directly from Eq. (2), and a prediction that is based on solving from . This quadratic equation arises from the approximation . Empirically, these result in similar predictions. The predictions have a small bias resulting from a discrete growth factor being applied to , whereas optimally, one would apply a continuous growth factor from to . The bias could be reduced by a midpoint correction, analogous to the Euler method for solving differential equations numerically (see Griffiths and Smith 1991), in which the growth factor is averaged over two observations. The bias, however, is small if the step length is small, and it was negligible in simulations in which the step length corresponded to one year. These remarks apply also to the Gompertz and logistic models: a small bias resulting from the discretization is present and could be reduced using the midpoint correction, but in practice, such a correction is not needed with one-year age groups.

^{6}

The prediction intervals are estimated in a standard way using as the variance estimate for *k*-step-ahead prediction . Because of the discretization (Eq. (4)), *g*_{20} is not observed.

^{7}

Billari and Toulemon (2006) used the Hernes model to forecast cohort childlessness and based the forecast uncertainty on uncertainty in the model parameters, in the same spirit as Goldstein and Kenney (2001).

^{8}

In empirical applications, the starting age may influence the results. For example, if very young ages were included, the process of adopting the modeled behavior could in these young ages differ from that of the older ages, resulting in nonlinearities in the underlying process *g*. As a rule of thumb, a reasonable starting age is when *g* starts to behave linearly.

^{9}

We first estimate the drift parameter for each cohort using data up to age 30. We denote this cohort-specific drift parameter by . For each cohort, we construct predictions for ages 31–45. We estimate the parameter IFC by minimizing , where the weights are *w*_{t} = (45 – *t* – 1) / (15 ⋅ 8). With this specification, the weights decline linearly so that *w*_{31} = 1 / 8 and *w*_{31} = 1 / (15 ⋅ 8). Such a weighting gives more weight to young ages, whose contribution to fertility matters more than that of older ages. We then find the best-fitting IFC using a grid search with IFC ranging from 0 to 1.500 with step length 0.001.

^{10}

The prediction interval for 1955 cohort is 33 % wider than for cohort 1950 even though the predictions both start at age 30, and the ultimate completed fertility rates are similar. The reason for this is that the 1955 cohort had children later than the 1950 cohort, which means that from the model’s perspective, predictions for the 1955 cohort start earlier.