## Abstract

High sampling variability complicates estimation of demographic rates in small areas. In addition, many countries have imperfect vital registration systems, with coverage quality that varies significantly between regions. We develop a Bayesian regression model for small-area mortality schedules that simultaneously addresses the problems of small local samples and underreporting of deaths. We combine a relational model for mortality schedules with probabilistic prior information on death registration coverage derived from demographic estimation techniques, such as Death Distribution Methods, and from field audits by public health experts. We test the model on small-area data from Brazil. Incorporating external estimates of vital registration coverage though priors improves small-area mortality estimates by accounting for underregistration and automatically producing measures of uncertainty. Bayesian estimates show that when mortality levels in small areas are compared, noise often dominates signal. Differences in local point estimates of life expectancy are often small relative to uncertainty, even for relatively large areas in a populous country like Brazil.

## Introduction

Small-area mortality estimation is a challenge for demographers and public health researchers for two main reasons. First, there is the universal problem of small populations and high sampling variability in recorded deaths (Bernardinelli and Montomoli 1992; Pletcher 1999; Riggan et al. 1991). With low mortality rates and short periods of exposure, observed event/exposure ratios are very unstable, and estimation of mortality patterns is difficult. In such situations models must fill the gap: smoothing procedures that use known, robust patterns in rate schedules must be combined with available data (e.g., Brass 1971; Wilmoth et al. 2012).

Second, in many countries or regions, vital registration is incomplete, and some deaths go unrecorded in official statistics (Frias et al. 2013; Mathers et al. 2005). Demographers have proposed a variety of methods to estimate the completeness of death registration and adjust mortality estimates accordingly (Bennett and Horiuchi 1981, 1984; Brass 1975; Hill 1987; 2007; Hill et al. 2009; Preston and Hill 1980; Preston et al. 1980; Queiroz et al. 2013, 2017). However, most methods depend on approximate stability of the population’s sex and age composition (Bhat 2002; Murray et al. 2010), an assumption that is not met in countries that have experienced recent, rapid demographic transitions. Migration between subnational areas can also make the methods’ stability assumptions unlikely (Bhat 2002; Bignami-Van Assche 2005; Hill and Queiroz 2010). Finally, standard methods cannot provide uncertainty measures about the completeness of death records (Murray et al. 2010).

Demographers and statistical epidemiologists have improved models of small-area mortality schedules significantly in recent years. New approaches based on Bayesian models that “borrow strength” over different dimensions (age, time, space, and/or sex) allow estimation of life expectancies and mortality rates in regions with sparse, high-quality data (Alexander et al. 2017; Congdon 2009; Jonker et al. 2012; Ocaña-Riola and Mayoral-Cortés 2010; Stephens et al. 2013; Tsimbos et al. 2014).

Although it is not directly related to small-area estimation, a growing literature has examined the use of Bayesian approaches in international demographic forecasting (e.g., Gerland et al. 2014; Raftery et al. 2013; 2014; Ševčíková et al. 2016) and in methods for constructing coherent sets of estimates over large numbers of administrative units (e.g., Alkema et al. 2011, 2013; You et al. 2015). The methods used in these studies have important features in common with our approach—in particular, partial pooling of information across related observational units and incorporating information from external sources through prior distributions for parameters.

Statisticians have also addressed estimation in cases of incomplete or underreported data. Raftery (1988) proposed a Bayesian approach to the general problem of inferring the number of binomial trials from the number of successes. Moreno and Girón (1998) developed a model for estimating crime rates from imperfectly reported data, which is closely related to the model for mortality that we propose in this article. Very recently, de Oliveira et al. (2017) analyzed Brazilian infant mortality data with a model in which death reports may be censored in some geographic regions.

In this article, we propose a Bayesian regression model that specifically addresses the fundamental problems in small-area mortality estimation in countries with potentially defective registration. The model smooths age-specific mortality rates in small samples while also accounting for uncertainty about the level of death registration. Bayesian regression produces estimates of small-area mortality rates and life expectancies, and of the uncertainty in those estimates.

Our model offers several advantages over existing non-Bayesian approaches to estimating age-specific mortality schedules in small areas with vital registration errors. It incorporates two sources of uncertainty: sampling variability and uncertainty about registration coverage. In addition, it uses a novel functional form for mortality schedules that stabilizes small-sample estimates without requiring strong assumptions about age-specific mortality patterns. Finally, estimated rate schedules from the model are continuous, smooth functions, unlike many existing corrections for underregistration that require different methods for infant, child, and adult deaths and may consequently have discontinuities in the estimated rate schedule.

Prior distributions for death coverage can include any available empirical information related to local data quality. In principle, prior information could include expert opinion, demographic estimates, or both. In our specific application to Brazilian data, prior information on age- and sex-specific vital registration comes mainly from standard demographic estimates and an intensive field audit conducted by public health researchers.

## Modeling Strategy

### A Simple Statistical Model for Deaths and Registered Deaths

For each age x within an area of interest, we observe exposure (Nx) and registered deaths (Rx). The true number of deaths, Dx≥ Rx, is not observed. We assume that true deaths have independent Poisson distributions at each age that depend on age-specific mortality rates (μx):

$Dx~PoissonNxμx.$
1
We further assume that each death is registered independently with an age-specific probability ($πx$), so that the total number of registered deaths at age x has a binomial distribution:
$Rx~BinomialDxπx.$
2
As shown in the  appendix, the distribution of registered deaths (Rx) implied by Eqs. (1) and (2) is1
$Rx~PoissonNxμxπx.$
3

### Identifiability of Mortality Rates

A distribution RPoisson(Nμπ) for registered deaths implies that the mortality rate is not identifiable from data on R and N because all (μ, π) pairs that have the same product will have identical likelihoods L(R|N, μ, π) ∝ e−Nμπ(Nμπ)R. In other words, from the likelihood alone, one cannot distinguish between situations of (high mortality, low registration) and situations of (low mortality, high registration).

In a classical frequentist approach to mortality estimation, this lack of identifiability is fatal. Unless the coverage probability π is known, there is no unique μ that maximizes the likelihood, and it is impossible to estimate the mortality rate from R and N. In contrast, a Bayesian approach allows the analyst to use probabilistic information about which coverage probabilities are more likely (expressed as a prior distribution, π ∼ fπ) to produce probabilistic statements about mortality rates μ, given R and N.

As an intuitive example, consider an age group in which we observe N = 1,000 person-years of exposure and R = 10 registered deaths. Suppose that a small field audit in this location, conducted five years earlier, found that 12 out of 15 total deaths had been registered. From the field audit information, it is reasonable to use π ∼ Beta(12, 3) as an expression of our prior knowledge about local death registration (Lynch 2007:54–57). As seen in the left panel of Fig. 1, this prior implies that the expected registration probability is π = 0.80: there is an 80 % probability that coverage is in the [0.66, 0.92] range indicated by the solid bar, and so forth.

Fig. 1

Prior distribution for vital registration coverage. Posterior distribution of the mortality rate

Fig. 1

Prior distribution for vital registration coverage. Posterior distribution of the mortality rate

In a Bayesian approach, the mortality rate (μ) is treated as an uncertain value with a statistical distribution, and the prior for π adds probabilistic information about coverage that allows us to infer which mortality rates are more and less likely, given exposure and a count of registered deaths. Based on the field audit information in our example, π = 0.8 is a very likely coverage level, and π = 0.4 is very unlikely. Therefore, a posteriori (i.e., after observing R = 10 and N = 1,000), one can say that μ = (10 / 1,000) / 0.8 = .0125 is a very plausible mortality rate, while μ = (10 / 1,000) / 0.4 = .0250 is much less plausible.

The full posterior distribution combines the likelihood for R with the prior for π, producing a posterior distribution for μ that summarizes which mortality rates are more plausible and which are less plausible given the observed data:2
$PμRN=∫01LRNμπfππdπ.$
4

In our specific example, with likelihood RPoisson(Nμπ), prior π ∼ Beta(12, 3), and data (R, N) = (10, 1,000), this distribution is

$PμR=10N=1,000∝∫01e–1000μπμπ10π111−π2dπ,$
5

which appears in the right panel of Fig. 1: the posterior median of the mortality rate is .014, and an 80 % credible interval (10th to 90th percentile) is [.009, .021]. In the full model, we use a parametric relational system for log mortality rate schedules, as described in the next section. However, the main principle is illustrated by this simple example: we combine probabilistic prior knowledge about death registration with a statistical model that relates registered deaths, coverage, and exposure. The result is an a posteriori distribution for local mortality rates.

## TOPALS Relational Model for Mortality Schedules

We model mortality by age with the TOPALS relational model (de Beer 2012; Gonzaga and Schmertmann 2016). In a TOPALS model, the log mortality schedule is a sum of two functions: (1) a constant schedule (called the standard) that incorporates basic age patterns, and (2) a parametric, piecewise-linear function made up of straight-line segments between designated ages (these ages are called knots) that represents differences between the standard and the log mortality schedule in the population of interest.

The vector of log rates over ages x = 0 . . . 99, λ ∈100 in the TOPALS model is
$λ=λ∗+Bα,$
where λ*100 is the standard (in our case, derived from national data for Brazil in 2010), B is a 100 × 7 matrix of fixed B-spline linear basis functions (de Boor 2001), and α ∈7 is a parameter vector.

The seven model parameters α = (α0 . . . α6)′ are the values of the piecewise-linear function at exact ages 0, 1, 10, 20, 40, 70, and 100. For example, $λ40=λ40∗+α4$ and $λ70=λ70∗+α5$. Between knots, the additive offsets to the standard schedule change linearly with age—for example, $λ50=λ50∗+23α4+13α5$, $λ55=λ55∗+12α4+12α5$, and $λ60=λ60∗+13α4+23α5$. The space of possible mortality schedules in a TOPALS model is thus the set of curves that can be constructed by adding piecewise-linear functions to the standard log-rate schedule.3

The mortality rate at age x in a TOPALS model is
$μxα=expλx∗+bx′α,$
where $bx′$ is the xth row of B. Under the distributional assumptions outlined earlier, the log likelihood is
$lnLRN,α,π=c−∑xNxπx·μxα+∑xRxlnμxα,$
6

where c is a constant that does not vary with α.

Figure 2 illustrates a fitted TOPALS model for males in the northern Brazilian state of Amapá in 2010. The figure shows a national standard λ for males (thin black line), observed ln(Rx / Nx) values from registered deaths and exposure in Amapá (open circles), the seven TOPALS offset parameters α0 . . . α6 (the vertical positions of the solid dots), the linear spline offsets (thin line connecting the offsets), and the fitted TOPALS model schedule (standard + spline, λ* + , thick line). In broad terms, this fit suggests that Amapá’s male mortality is higher than the standard at ages below 25 and lower than the standard at ages above 25. More subtly, the fit suggests that downward deviations from the standard are slightly larger at higher ages.

Fig. 2

TOPALS model for males in Amapá state, 2010. Open circles are observed ln(Rx / Nx) from registered deaths. Smooth dark line is national standard log mortality schedule. Heights of solid dots are maximum likelihood offsets α0 . . . α6. Fitted TOPALS schedule is the sum of the standard schedule and the linear spline

Fig. 2

TOPALS model for males in Amapá state, 2010. Open circles are observed ln(Rx / Nx) from registered deaths. Smooth dark line is national standard log mortality schedule. Heights of solid dots are maximum likelihood offsets α0 . . . α6. Fitted TOPALS schedule is the sum of the standard schedule and the linear spline

This particular fit for Amapá maximizes Eq. (6) over α under the assumption of 100 % death registration ($πx$ = 1, Rx = Dx) at all ages. It serves only to illustrate the components of a parametric TOPALS model. In practice, we relax the assumption of complete coverage, replacing it with prior distributions for $πx$.

## Model Summary

Figure 3 summarizes our statistical approach to estimating mortality and life expectancy in a small area. We use information from other sources to develop priors for age-specific coverage. (In the hypothetical example above, for instance, a field audit suggested a beta distribution with a = 12 and b = 3). As indicated in the top right of Fig. 3, we use a weak multivariate normal prior on α (described in detail later) to stabilize schedule estimates in very small populations.

Fig. 3

An integrated coverage and mortality model. Registered deaths (Rx) and exposure (Nx) are observed. α and $πx$ are uncertain parameters

Fig. 3

An integrated coverage and mortality model. Registered deaths (Rx) and exposure (Nx) are observed. α and $πx$ are uncertain parameters

Age-specific exposure (N0 . . . N99) and registered deaths (R0 . . . R99) are observed. The model combines the prior distributions for coverage and mortality parameters (fπ and fα) with the Poisson likelihood L(R|N, α, π) = ∏xL(Rx|Nx, μx(α), $πx$) to produce a posterior marginal distribution for α = (α0 . . . α6)′, similar to Eq. (4):

$PαRN=∫LRNαπfααfππdπ.$
7
The left side of Eq. (7) answers the following question: Given age-specific populations and registered deaths, combined with our uncertain knowledge of local death registration probabilities, which local mortality schedules are more and which are less plausible? In practice, we answer by drawing a large number of random realizations, $α1∗…αK∗$, from the distribution in Eq. (7) via Markov Chain Monte Carlo (MCMC) simulation. It is then an easy logical step to ask the same question about more and less plausible life expectancies, e0, via the simulated posterior distribution of $e0α1∗…e0αK∗$:
$Pe0
8
where I [ ] is a (0,1) indicator function equal to 1 if the condition in brackets is true.

### Data

We apply the model to small-area estimation in Brazil, which has good census data but incomplete vital registration. Death reporting has improved significantly over the past several decades, but large regional differences in the quality of coverage remain (de Mello Jorge et al. 2007; Instituto Brasileiro de Geografia e Estatística 2013; Paes 2005; Queiroz et al. 2017). Fine-grained mortality (and especially life expectancy) estimates are in increasing demand, but the complications of small samples and differential coverage make this a difficult task. In short, Brazil is a good test case for a model that incorporates underregistration into small-area mortality estimation.

Population and deaths by sex and single-year age come from the 2010 Brazilian Demographic Census and from the Mortality Information System of the Ministry of Health (MS/SVS/CGIAE), respectively. The Mortality Information System is an official registry for deaths, with data collected from health providers (de Mello Jorge et al. 2007).

Population and registered deaths are available for 5,565 Brazilian municipalities,4 100 single-year ages 0 . . . 99, and two sexes. For each of the 1,113,000 combinations of (municipality, age, sex), we recorded the 2010 census population, the number of registered deaths over calendar years 2009–2011, and geographic identifiers. We used the 2010 census populations to estimate age- and sex-specific exposure over 2009–2011 for each (municipality, age, sex) combination. Details are in Gonzaga and Schmertmann (2016).

Brazilian census geography begins with five major regions (North, Northeast, Southeast, South, and Center-West) comprising 27 states5 that cover the entire national territory. States are subdivided into mesoregions (137 total), mesoregions into microregions (558), and microregions into municipalities (5,565). Our goal in this analysis is to estimate age- and sex-specific mortality schedules and life expectancies for all 27 states and for each of the 558 microregions.6

## Priors

### Using Death Registration Estimates From Previous Studies

Prior information about the likely levels and age patterns of death registration is essential for our model. In the case of Brazil, we identified six published estimates of death registration coverage, by state and sex, for 2010 (Freire et al. 2015; Instituto Brasileiro de Geografia e Estatística 2013; Queiroz 2012; Queiroz et al. 2013, 2017). These estimates all come from death distribution methods (DDM) and refer mainly to deaths at ages 30+. The six estimates differ substantially in some cases. For example, for males aged 30+ in the Northeastern state of Maranhão (widely regarded as the state with the lowest vital registration coverage), the alternative coverage estimates were 50 %, 66 %, 75 %, 78 %, 78 %, and 97 %. The range of these estimates suggests that in addition to being low, coverage levels for adult male deaths in Maranhão are uncertain. At the municipal level, we use published estimates of death registration coverage from a research project known as busca ativa,7 which we translate as field audit. The field audit project (Frias et al. 2013; Szwarcwald et al. 2011) randomly selected 133 municipalities in Brazil’s poorest regions and compared the registered deaths in 2009–2010 with total deaths found for the same period from an exhaustive search of notary offices, clinics, official and unofficial cemeteries, and from interviews with health workers, midwives, funeral homes, pharmacies, and others. Based on correlations between municipal-level characteristics and levels of mortality coverage in the study areas, the field audit researchers estimated the likely probability of death registration in all 5,565 Brazilian municipalities for infant deaths (π0) and for all deaths (πall). Field audit estimates can be aggregated to higher levels of geography—in particular, to microregion or state level—by using published information from the project on the estimated number of deaths in each municipality.

When aggregated to the 27 Brazilian states, field audit estimates of infant mortality coverage (π0) range from 65 % (Maranhão) to 100 % in several southern states. The range is naturally wider across the 558 microregions: from 31 % in some remote Amazonian locations to 100 % in many southern microregions. Field audit estimates for death registration probabilities at all ages combined (πall) range from 78 % to 100 % across states and from 44 % to 100 % across microregions.

### Prior Distributions for Coverage

To utilize the available coverage data in Brazil, we assume that in each small area, there are distinct probabilities of death registration for three age intervals:

$πx=π0ifx=0π1ifx∈1...29π2ifx∈30...99,$
9
which can be combined to produce an overall probability of registration:
$πall=w0π0+w1π1+w2π2,$
10

where wx terms represent the proportion of all deaths that occur in the corresponding age group.8

Our choice of age groups was based in part on the availability of age ranges in the external coverage estimates—for example, DDM provides straightforward estimates of state-level π2. However, our choice of age groups is also based on the demographic literature and expert opinions. First, many Brazilian studies indicate that coverage of infant deaths (π0) is lower than coverage in any other age interval (Frias et al. 2013; Instituto Brasileiro de Geografia e Estatística 2013; Paes and Albuquerque 1999). This fits the pattern of other countries with defective vital records, in which infant deaths generally have the lowest coverage rates (Målqvist et al. 2008). Field audits also found high percentages of unreported infant deaths (Frias et al. 2013; Szwarcwald et al. 2011). Second, some Brazilian researchers suggest that external causes of death (such as homicides and transit accidents) have almost complete coverage in all Brazilian regions (Agostinho 2009; Campos and Rodrigues 2004). This is plausible because deaths by violence and transit accident must be reported to the local health department not only by family or relatives but also by the municipal coroner’s office. Although the quality of information varies between regions, the notification procedures are determined by national law, and the path of the death certificate from coroner’s office to the Mortality Information System is identical in all Brazilian states (Borges et al. 2012). Deaths from external causes in Brazil have increased in the last decades, and these deaths are concentrated among young adults (de Mello Jorge et al. 1997; Matos et al. 2013; Soares Filho et al. 2007). Taken together, this evidence suggests that coverage of infant deaths should not he higher than coverage of deaths at ages 1–29: π0 ≤ π1.

For infant deaths (π0) and for all deaths (πall), we build priors from field audit estimates. For every area, we aggregate the corresponding municipal field audit estimates, weighted by deaths, to calculate estimated coverage levels, $π̂0$ and $π̂all$. The associated priors are

$π0∼BetaK0π̂0K01−π̂0,K0−5∼exponential0.05$
11
$πall∼BetaKallπ̂allKall1−π̂all,Kall−5∼exponential0.05,$
12

where K0 and Kall are hyperparameters representing (uncertain) levels of precision from the field audit estimates.9

For mortality coverage at ages 30+, which we denote as π2, external information consists of state-level estimates from the six DDM studies. We use the mean and variance of the DDM estimates to estimate the parameters of a beta distribution by the method of moments. The resulting prior is

$π2,STATE∼BetaK2ϕ2K21−ϕ2,$
13

where ϕ2 is the mean of the six DDM estimates, and K2 is an estimated precision index.10 For example, for Maranhão males, the DDM estimates are (.50, .66, .75, .78, .78, .97), so we use π2,MaranhãoBeta(7.00 × 0.74, 7.00 × 0.26). This prior distribution answers the following question: Given the DDM estimates, how plausible are different levels of coverage for males 30 and older in Maranhão? The answer is probabilistic: a priori, there is a 10 % probability that the coverage level is below .52, a 50 % probability that it is between .64 and .86 (the interquartile range), a 10 % chance that it is above .92, and so on.11

Prior information for π2 is at the state level, so when we estimate coverage for substate areas like microregions, the prior applies to the weighted registration probability:

$π2,STATE=γaπ2a+…+γzπ2z∼BetaK2ϕ2K21−ϕ2,$
14
where γi is the proportion of state deaths at ages 30+ that occur in substate area i ∈ {a . . . z}, and π2i is the death registration probability in area i.12

Finally, we use qualitative information from the literature by adding a prior that completely rules out any triples of local coverage probabilities that do not match our assumed order:

$π0≤π2≤π1$
15

That is, we insist that in every local area, infant mortality coverage cannot be higher than coverage at other ages and that coverage of deaths at ages 1–29 cannot be lower than at other ages.

### Prior Distribution for TOPALS Parameters

TOPALS parameters α7 determine the shape and level of the age-specific mortality schedule. We use a vague prior for α, so that local death and exposure data are the primary determinants of rate estimates, via the likelihood function. Our prior for α is a multivariate normal distribution derived from two principles: (1) each αi component should have very similar prior probabilities over a wide range of possible values, and (2) very large differences between consecutive components αi αi − 1, i = 1 . . . 6 are unlikely.

Based on those principles, we use the prior distribution α ~ N (0, Σ) with covariance matrix13
$Σ=3.112.712.392.151.971.861.802.712.802.472.222.031.921.862.392.472.622.352.162.031.972.152.222.352.562.352.222.151.972.032.162.352.622.472.391.861.922.032.222.472.802.711.801.861.972.152.392.713.11.$

The marginal priors for each αi component are uninformative about levels, which are measured on the log-mortality rate scale. For example, the a priori distribution for the infant mortality offset is α0N (0, 3.11), implying a 57 % prior probability that α0 falls in [–1.39, +1.39]. This is a very vague prior assumption: it says that there is a 57 % probability that a region’s infant mortality rates is between one-fourth and four times the rate in the standard schedule and a 43 % prior probability that rates might be even more extreme. The correlation structure in Σ is also only weakly informative about differences between α components; it serves mainly to stabilize estimates in extremely small populations with very few deaths by giving slightly higher prior probabilities to simple up-and-down shifts in the standard schedule (which occur when α0 = α1 = · · · = α6 and differences between consecutive αs are all zero).

## Results

We implemented the full model in Stan (Carpenter et al. 2017), a language that allows MCMC sampling from complex posterior distributions. For both sexes, we estimated posterior distributions of mortality parameters (α), complete log mortality schedules (λ(α) = λ + ), and life expectancy (e0[λ(α)]) for all 27 Brazilian states and all 558 microregions. We call this the adjusted model. In order to learn about the effects of underregistration of deaths, we also estimated the same posterior distributions under the (incorrect) assumption of 100 % registration (π0 = π1 = π2 = 1). We call this second version the unadjusted model. We focus here on summary results for male life expectancy. Complete results, together with data and code, are available on the article’s companion website (http://mortality-subregistration.schmert.net/).

## Effects of Imperfect Mortality Coverage π on Estimated State-Level Life Expectancy

Figure 4 illustrates posterior distributions of male life expectancy for two states: the small state of Amapá (total population 0.6 million) and the Federal District that contains Brasília ( 2.5 million). The figure includes posterior densities before any adjustment for underregistration (on the right in each panel) and after adjustment (on the left).

Fig. 4

Male 2010 life expectancy in two Brazilian states, before and after adjustment for underregistration of deaths. Density curves to the right in each panel represent the posterior density under the assumption of perfect death registration (π = 1 at all ages). Curves to the left include the priors for (π0, π1, π2) developed in the text. Horizontal bars represent 80 % posterior intervals (10th to 90th percentile) for the adjusted distributions. Open circles at unadjusted medians; state abbreviations at adjusted posterior medians

Fig. 4

Male 2010 life expectancy in two Brazilian states, before and after adjustment for underregistration of deaths. Density curves to the right in each panel represent the posterior density under the assumption of perfect death registration (π = 1 at all ages). Curves to the left include the priors for (π0, π1, π2) developed in the text. Horizontal bars represent 80 % posterior intervals (10th to 90th percentile) for the adjusted distributions. Open circles at unadjusted medians; state abbreviations at adjusted posterior medians

Life expectancy is a complicated nonlinear function of the underlying α parameters, but the Bayesian approach permits easy calculation of uncertainty in e0 for both the unadjusted and adjusted estimates, via Eq. (8). Figure 4 shows that even if death registration were complete (right-hand densities), there would still be considerable uncertainty about state-level life expectancy because of sampling variability. Sampling uncertainty is naturally greater for Amapá, which has a smaller population, but is nontrivial even for the Federal District, which has a male population of more than 1 million.

Adjusting for underregistration lowers life expectancy estimates. The difference can be very large in areas where coverage is poor. A probabilistic model allows us to estimate the magnitude of these effects. In Amapá, for example, plausible corrections for underregistration of deaths reduce male e0 by approximately three years, as illustrated by difference between the unadjusted and adjusted posterior medians in the top panel of Fig. 4. The corresponding adjustment is very small for the Federal District, where death registration is nearly complete.

In addition to lowering the mean, uncertainty about vital registration coverage also increases uncertainty about life expectancy. This effect is visible for both states, but it is larger for the state with lower coverage levels (Amapá). As with the decrease in means, the fact that the variance will increase is qualitatively obvious. However, a carefully constructed probabilistic model allows demographers to estimate the increase in uncertainty caused by imperfect vital registration.

Figure 5 shows estimated 2010 male life expectancies for all 27 states, disaggregated by region. It reports the Bayesian model’s death registration adjustments and distributional information using the abbreviated format on the horizontal axes of Fig. 4. The results for Amapá (AP) and the Federal District (DF) in Fig. 4 appear on the 4th line from the top and the 11th line from the bottom, respectively, of Fig. 5.

Fig. 5

Male life expectancy by state: Brazil, 2010. Unadjusted estimates from deaths registered by the Mortality Information System (SIM) are open circles. Official state-level estimates from Brazil’s statistical agency (IBGE) are solid circles. Shaded bars represent 80 % posterior probability intervals after underregistration adjustment in the Bayesian coverage model; IBGE state abbreviations (listed at http://schmert.net/BayesBrass/brazilian_regions_and_states.txt) appear at posterior medians

Fig. 5

Male life expectancy by state: Brazil, 2010. Unadjusted estimates from deaths registered by the Mortality Information System (SIM) are open circles. Official state-level estimates from Brazil’s statistical agency (IBGE) are solid circles. Shaded bars represent 80 % posterior probability intervals after underregistration adjustment in the Bayesian coverage model; IBGE state abbreviations (listed at http://schmert.net/BayesBrass/brazilian_regions_and_states.txt) appear at posterior medians

For each state, Fig. 5 includes the unadjusted male life expectancy (e0) calculated directly from registered deaths (open circles) and the adjusted estimates and 80 % posterior interval from our partial-coverage model (two-letter abbreviations and shaded bars). Horizontal distances between open circles and state abbreviations in Fig. 5 illustrate median adjustments in life expectancy due to unregistered deaths.

Accounting for underregistration of deaths leads to large downward adjustments in life expectancy in the North and Northeast regions, and to small or negligible downward adjustments elsewhere. The most extreme adjustment is for the state of Maranhão (MA) in the Northeastern region. Direct use of death registration data would suggest that Maranhão has Brazil’s highest male life expectancy (74.3 years). In contrast, adjusted Bayesian estimates for Maranhão have a median of 70.4 years, almost four years lower. Northern states, such as Pará (PA: 72.2 68.6 years), Amapá (AP: 72.7 70.0), and Amazonas (AM: 72.3 69.8), also require large downward adjustments in e0 to account for likely levels of unregistered deaths.

### Comparison With Brazil’s Official State-Level Estimates

Bayesian posterior distributions for e0 come from a range of plausible state-specific registration coverage levels. They are therefore useful as comparative benchmarks for the official state-level estimates from IBGE, Brazil’s census bureau. IBGE uses a complex, multistep procedure to correct for underregistration of deaths (Instituto Brasileiro de Geografia e Estatística 2013). Their procedure combines different indirect methods for mortality and coverage estimation, and differs by region. As a result of this complexity, researchers (including us) have been unable to replicate the official state-level estimates.

A comparison to adjusted and unadjusted means from vital registration data is therefore useful for understanding the official state-level estimates.

Filled circles in Fig. 5 correspond to IBGE estimates for male e0 in each state. Thus, when a filled circle falls to the left of the open circle, the IBGE estimate is equivalent to assuming that deaths are underregistered. A filled circle to the right of an open circle is equivalent to assuming overregistration of deaths.

Figure 5 shows that official life expectancy estimates for states in the South and Southeast (bottom panels) are implausibly high. Comparison with registered death data indicates that IBGE estimates are plausible only if the vital registration system substantially overcounts deaths in these states.14

Figure 5 also shows that the IBGE estimates for states in Brazil’s North and Northeast regions are implausibly low. For almost all states in the North and Northeast, the official e0 estimate for males falls far below the 10th percentile of the adjusted posterior distribution. In other words, likely combinations of vital registration coverage and mortality schedules in these states produce life expectancies that are much higher than the official estimates.

### Signal and Noise in State Life Expectancy Estimates

Comparisons between areas are often important for public policy purposes, including allocation of health expenditures and other resources. Realistic estimates of the uncertainty of life expectancy highlight potential problems with allocating resources based on the most likely point estimates, even for areas with large populations. As noted earlier, states are not especially small areas: the least-populous Brazilian state (Roraima) has nearly half a million residents. Even so, after adjusting for likely underregistration of deaths, determining which of a pair of states has the higher life expectancy can be difficult.

Fig. 6 shows 80 % credible intervals for each state’s male e0 in the full model, including likely underregistration of deaths. As in the previous plot, the horizontal bars span the interval between the 10th and 90th percentiles of the posterior distribution. Solid dots indicate the posterior medians. States are sorted in order of posterior medians, which we use as best-guess point estimates. The Federal District has the highest estimate (72.3), Santa Catarina has the second-highest (71.9), and so forth. Some states, such as Rio Grande do Sul (median = 70.9) or Paraná (70.3) have very large populations and almost perfect vital registration. Consequently, certainty about e0 is very high in those states. In contrast, some of Brazil’s sparsely populated northern states such as Roraima (70.6) or Amazonas (69.8) have smaller populations and coverage levels that are much less certain. In these cases, we are much less certain about which state actually has the higher life expectancy.

Fig. 6

Male life expectancy by state (medians and 80 % posterior intervals): Brazil, 2010

Fig. 6

Male life expectancy by state (medians and 80 % posterior intervals): Brazil, 2010

Samples from the posterior distribution allow us to make probabilistic statements about the ranking of areas, which could be important for allocation of health resources. For example, the posterior medians of e0 in Amazonas and Roraima are 69.8 and 70.6, respectively, but in 730 of 4,000 samples from the posterior in Eq. (8), the Amazonas life expectancy was higher. Thus, even though the point estimate for Amazonas life expectancy is almost one year lower, the a posteriori estimate is $Pe0AM>e0RO≈.18$.

Analysis of posterior samples also allows probabilistic statements about which states have the lowest and highest life expectancies. For example, Alagoas had the lowest e0(α) value in 3,592 of 4,000 samples, implying an approximate 90 % probability that this state has a lower male life expectancy than all others. Alternative candidates for lowest male e0 are Pará (plowest = 0.06) and Pernambuco (plowest = 0.01). Analogous calculations show that the Federal District (phighest = 0.91), Piauí (phighest = 0.05), and Rio Grande do Norte (phighest = 0.02) are the top candidates for highest male life expectancy.

### Microregion-Level Estimates

We also used the Bayesian TOPALS model to estimate adjusted posterior distributions of α and e0(α) separately by sex for all 558 Brazilian microregions. Results for males appear in Fig. 7, which displays posterior medians, and in Fig. 8, which displays the widths of each microregion’s 80 % posterior probability interval.

Fig. 7

Posterior medians of male life expectancy at birth: Brazilian microregions, 2010. Darker colors indicate higher mortality and lower e0. The state of Bahia is highlighted with a thick border

Fig. 7

Posterior medians of male life expectancy at birth: Brazilian microregions, 2010. Darker colors indicate higher mortality and lower e0. The state of Bahia is highlighted with a thick border

Fig. 8

Uncertainty of male life expectancy at birth, as measured by the width of the 80 % posterior probability interval: Brazilian microregions, 2010. Darker colors indicate more certain estimates. The state of Bahia is highlighted with a thick border

Fig. 8

Uncertainty of male life expectancy at birth, as measured by the width of the 80 % posterior probability interval: Brazilian microregions, 2010. Darker colors indicate more certain estimates. The state of Bahia is highlighted with a thick border

The point estimates in Fig. 7 show that high-mortality (low-e0) regions for males tend to be concentrated along the Atlantic coast, particularly in large cities. Pockets of high mortality are also found in scattered areas of northern and western Brazil. Point estimates of mortality are especially low in parts of southern Brazil and in the Northeastern states of Piauí and Ceará.

Some of the point estimates in Fig. 7 are much more reliable than others, however. Figure 8 shows a strong North-South gradient in the precision of male e0 estimates. Posterior distributions (analogous the adjusted densities for states in Fig. 4) are notably narrower in southern microregions. As a result, we can be much more certain about small-area life expectancies in Brazil’s South and Southeast than in other regions.

Differences in the uncertainty of these small-area estimates arise for two reasons. First, at any given level of vital registration coverage, microregions with larger populations will have more reliable mortality estimates because of larger sample sizes. Many regions in northern and western Brazil are very sparsely populated; thus, even if registration coverage were known exactly, local estimates would still be subject to considerable sampling error. Second, the level of underregistration of deaths is also uncertain in many small areas. The Bayesian model accounts for this extra source of uncertainty, resulting in wider posterior distributions for small areas with more uncertain vital registration coverage.

### Signal and Noise in Microregion Life Expectancy Estimates

We have already highlighted the difficulty of ranking or comparing state-level e0 estimates. For smaller areas, the signal-to-noise ratio changes for the worse, and comparisons become even more difficult. As an example, Fig. 9 shows Bayesian-adjusted estimates of male e0 for the 32 microregions in the state of Bahia (highlighted with a thick border in Figs. 7 and 8). At this level of geography, rankings of areas and differences in estimated life expectancy are largely overwhelmed by uncertainty. For example, Livramento do Brumado (1st line of Fig. 9) has the highest estimated male e0, with a posterior median of 73.3 years. But the posterior probability that this microregion has the highest male life expectancy is only 37 %: it ranked highest in 1,479 of 4,000 posterior samples from the posterior in Eq. (7). Cotegipe (2nd line, with a posterior median of 73.0) has a 29 % chance of being the highest, Jeremoabo (3rd line, 72.2) has a 9 % chance, and so on.

Fig. 9

Male life expectancy (median and 80 % posterior intervals) for 32 microregions in Bahia

Fig. 9

Male life expectancy (median and 80 % posterior intervals) for 32 microregions in Bahia

Consider comparing Livramento do Brumado (location 1, on the top line of Fig. 9, estimated e0 = 73.3 years) and Boquira (location 11, 11th line, 71.6 years). Repeated sampling from the posterior distribution (local priors + local data) produces a large number of plausible ($e01,e011$) pairs for male life expectancies in these two places, illustrated in Fig. 10. Uncertainty is a result of small sample sizes, very small numbers of registered deaths at some ages, and imprecise estimates of local death registration probabilities. Although the point estimate for $e01$ is nearly two years higher than the estimate for $e011$, considerable uncertainty remains about which region has the higher life expectancy. Given data and priors about death registration, the posterior probability that male life expectancy is actually higher in Boquira than in Livramento do Brumado is 11 % ($e011−e01>0$ in 447 of 4,000 samples), and there is even a 3 % probability that it is more than one year higher ($e011−e01>1$ in 107 of 4,000 samples).

Fig. 10

Sample from joint posterior distribution of male life expectancies in two microregions: Bahia, 2010. Diagonal lines labeled by difference in life expectancies

Fig. 10

Sample from joint posterior distribution of male life expectancies in two microregions: Bahia, 2010. Diagonal lines labeled by difference in life expectancies

The main message of Figs. 9 and 10 lies in the very high overlap between the posterior distributions of many of the microregions. These are not especially small areas: Livramento do Brumado was the least-populous microregion in Bahia, but it had a total population of 97,786 in 2010, and the median total population of Bahian microregions was slightly below 290,000.

Despite these fairly large areas, however, uncertainty dominates most pairwise comparisons. It is clear that at this geographic level, researchers and policy makers should not rely on point estimates to distinguish high- and low-mortality areas—especially if differences in best-guess estimates of median e0 are less than one year. That result applies even more strongly to smaller areas, such as municipalities.

## Conclusion

A Bayesian model is a natural approach to estimating small-area mortality when the vital registration system is imperfect. In this article, we show that it is straightforward to combine age-specific mortality rates and coverage probabilities in a unified model.

The TOPALS relational model for age-specific rates is a useful component in Bayesian modeling for small areas. Because it is flexible but low dimensional, the TOPALS parametric model for age-specific mortality schedules makes it possible to estimate rates and life expectancies even for very small populations.

It is illuminating to consider correction for underregistration as a statistical, rather than arithmetic, process. An explicitly statistical model naturally emphasizes the uncertainty of results. Brazilian microregions are not especially small areas. Nevertheless, the uncertainty in life expectancy estimates is often much greater than the differences between local point estimates.

One of our main substantive findings concerns the signal-to-noise ratio in small-area estimates. The levels of posterior uncertainty in microregional life expectancy estimates for Brazil make it clear that attempts to estimate mortality at even smaller levels of geography (for example, for the 417 municipalities in Bahia that are subregions of those displayed in Fig. 9) face serious and possibly insurmountable difficulties. Even with good statistical methods for estimating mortality in very small populations, realistic assessment of uncertainty suggests that drawing meaningful distinctions about the mortality of those populations may be extremely difficult. Demographers should stay humble.

## Acknowledgments

This research was supported by the Capes Foundation of Brazil’s Ministry of Education. Marcos R. Gonzaga gratefully acknowledges support from Research Projects 470866/2014-4 (Estimativas de mortalidade e construção de tabelas de vida para pequenas áreas no Brasil, 1980 a 2010 MCTI/CNPQ/MEC/Capes/Ciências Sociais Aplicadas) and 454223/2014-5 (Estimativas de mortalidade e construção de tabelas de vida para pequenas áreas no Brasil, 1980 a 2010/MCTI/CNPQ/Universal 14/2014).

### Appendix: Statistical Distribution of Registered Deaths

A generalized Poisson distribution for a random count variable Y, using a mixture of heterogeneous risks, is (Greene 1997:939–940)
$PY=k=∫0∞e−λzλzkk!gzdz,$
where z is a multiplicative risk factor with density g(z) over positive real numbers. This mixture model, YPoissonMix(λ, g), describes the distribution of count variable Y ∈ {0, 1, 2, . . .} in terms of a scalar parameter λ and a density function g (). It generalizes the Poisson distribution by allowing the mean and variance of Y to differ. In particular, it provides a framework for modeling overdispersion (V(Y) > E(Y)), which is often observed in count data.

The mixture model includes the standard Poisson distribution as a limiting case: as the distribution g(z) approaches a constant at z = 1, Y’s distribution approaches a Poisson with E(Y) = V (Y) = λ. It also includes the negative binomial distribution: if g(z) is a gamma density with E(z) = 1 and V(z) = 1 / θ, then Y has a negative binomial distribution with E(Y) = λ and V(Y) = λ + (λ2 / θ). Other {λ, g ()} mixtures yield other discrete distributions for Y.

Suppose that total deaths in a population follow a distribution in this generalized family, so that the probability of D deaths is
$PD=∫0∞e−λzλzDD!gzdz.$
If deaths are registered independently with probability π, then
$PRD=D!R!D−R!πR1−πD−RforR∈0122, . . . D,$
and the joint probability of a pair of integers (R, D) is
$PRD=∫0∞e−λzλzDR!D−R!πR1−πD−RgzdzforD∈012. . .andR∈012. . .D.$
In terms of registered deaths (R) and unregistered deaths (U = D − R), the same expression is
$PRU=∫0∞e−λzλzR+UR!U!πR1−πUgzdzforR∈012. . .andU∈012. . ..$
The marginal probability of registered deaths (R) is therefore
$PR=∑U=0∞∫0∞e−λzλzR+UR!U!πR1−πUgzdz$
$=∫0∞∑U=0∞e−λzλzR+UR!U!πR1−πUgzdz$
$=∫0∞e−λzλzRR!πR∑U=0∞λzUU!1−πUgzdz$
$=∫0∞e−λzλzRR!πRe+λz1−πgzdz$
$=∫0∞e−λπzλπzRR!gzdz.$
The distribution of registered deaths (R) therefore has exactly the same mathematical form as the marginal distribution of total deaths (D), except that parameter λ is replaced with λπ. That is,
$D∼PoissonMixλgR∼BinomDπ⇒R∼PoissonMixλπg.$
This general proof applies to special cases where D ~ Poisson or DNegBinom, as well as to other Poisson mixtures. Most importantly for this article, it demonstrates that if total deaths have a Poisson distribution with expected value λ = Nμ, then registered deaths also have a Poisson distribution, but with expected value λπ = Nμπ.

## Notes

1

In the appendix, we also demonstrate that a negative binomial distribution for total deaths implies a negative binomial distribution for registered deaths. A negative binomial model would be appropriate if the data exhibit overdispersion—that is, higher variance than predicted by a Poisson model. With the Brazilian data that we use in this article, extensive experimentation produced no evidence of meaningful overdispersion, and posterior distributions of mortality rates were virtually identical with Poisson and negative binomial specifications. We therefore use a Poisson distribution for D in this article.

2

By omitting an explicit prior, we assume a priori that μ is equally likely to take any positive real value. The omitted (improper) prior is therefore fμ(μ) ∝ I(μ ≥ 0), where I() is a (0,1) indicator function. This yields a proper posterior distribution for (μ, π) and a proper marginal posterior for μ in Eq. (4).

3

Gonzaga and Schmertmann (2016) showed that this property makes the specific choice of a standard schedule λ* far less important than in other relational models used in demography. Note that the TOPALS model includes indirect standardization as a special case in which all α values are equal and the standard schedule is shifted up or down by the same amount at all ages.

4

In Brazil, municipalities are the smallest areas responsible for registering vital events.

5

For simplicity we consider the Federal District that contains Brasília to be a state.

6

Even microregions are fairly large “small areas.” With a single exception (the remote island of Fernando de Noronha, with a total resident population of only 2630 in 2010), all had resident populations of at least 20,000 in 2010. Rounded to the nearest thousand, the 10th, 50th, and 90th percentiles of microregional population were, respectively, 63,000, 173,000, and 557,000. The largest microregion, metropolitan São Paulo, had a 2010 population of more than 13 million.

7

The complete name in Portuguese is Busca ativa de óbitos e nascimentos no Nordeste e na Amazônia Legal (Active search for deaths and births in the Northeast and the Amazonian administrative region).

8

In practice, we used identical weights for each region: w = (.035, .109, .856) for males and w = (.037, .047, .916) for females. These were calculated from national deaths over 2009–2011.

9

Hyperparameters, K, correspond to sample sizes in a field audit. Prior uncertainty about K represents uncertainty about the precision of the field audit estimates of π. Our (hyper)priors for K are fairly conservative: they imply that the most likely precision of the field audit estimates is equivalent to results from an audit of slightly fewer than K = 25 deaths in a region.

10

Denoting the mean and variance of DDM estimates as $x¯$ and s2, the method of moments estimators (cf. Glen and Leemis 2017:227–228) are $ϕ2=x¯$ and $Κ2=x¯1−x¯s2−1$.

11

Priors based on busca ativa estimates are constructed from a single coverage estimate for each region by adding a hyperparameter for the estimate’s unknown precision. In contrast, priors from DDM estimates are based on multiple estimates per region and use the variance of those estimates as an index of (im)precision. A third alternative, which we do not use here, is to choose beta distribution parameters ϕ and K so that available estimates are all in a specified range of prior probability—for example, a 90 % probability that π ∈ [min(DDM), max(DDM)].

12

Because we have only state-level prior information about death registration in this age group, we can assess only the joint prior probability of a set of substate coverage levels, (π2α . . . π2z), by looking at whether their weighted average is likely.

13

This prior distribution arises from two lines in the Stan programming language that we use for MCMC sampling. From our first principle (diffuse marginal distributions for each αi), we add α ~normal(0,4) to the model. From the second principle (small differences between consecutive parameter values) we add αi – αi – 1 ~ normal(0, sqrt(0.5)), as in Gonzaga and Schmertmann (2016). These statements in Stan represent changes to the log prior density of any proposed α vector, which together yield this specific multivariate normal distribution. The results that we report in this article are extremely insensitive to the choice of priors for α.

14

The high estimates for life expectancy in the South and Southeast probably result from IBGE’s compatibilização step (Instituto Brasileiro de Geografia e Estatística 2013: tables 6 and 13), in which they adjust national totals by removing deaths from these two regions.

## References

Agostinho, C. (
2009
).
Estudo sobre a mortalidade adulta, para Brasil entre 1980 e 2000 e Unidades da Federação em 2000: Uma aplicação dos métodos de distribuição de mortes
[A study of adult mortality in Brazil between 1980 and 2000, and in Brazilian states in 2000] (Unpublished doctoral dissertation).
,
Belo Horizonte, Brazil
.
Alexander, M., Zagheni, E., & Barbieri, M. (
2017
).
A flexible Bayesian model for estimating subnational mortality
.
Demography
,
54
,
2025
2041
. 10.1007/s13524-017-0618-7
Alkema, L., Kantorová, V., Menozzi, C., & Biddlecom, A. (
2013
).
National, regional, and global rates and trends in contraceptive prevalence and unmet need for family planning between 1990 and 2015: A systematic and comprehensive analysis
.
Lancet
,
381
,
1642
1652
. 10.1016/S0140-6736(12)62204-1
Alkema, L., Raftery, A. E., Gerland, P., Clark, S. J., Pelletier, F., Buettner, T., & Heilig, G. K. (
2011
).
Probabilistic projections of the total fertility rate for all countries
.
Demography
,
48
,
815
839
. 10.1007/s13524-011-0040-5
Bennett, N. G., & Horiuchi, S. (
1981
).
Estimating the completeness of death registration in a closed population
.
Population Index
,
47
,
207
221
. 10.2307/2736447
Bennett, N. G., & Horiuchi, S. (
1984
).
Mortality estimation from registered deaths in less developed countries
.
Demography
,
21
,
217
233
. 10.2307/2061041
Bernardinelli, L., & Montomoli, C. (
1992
).
Empirical Bayes versus fully Bayesian analysis of geographical variation in disease risk
.
Statistics in Medicine
,
11
,
983
1007
. 10.1002/sim.4780110802
Bhat, P. N. M. (
2002
).
Completeness of India’s sample registration system: An assessment using the general growth balance method
.
Population Studies
,
56
,
119
134
.
Bignami-Van Assche, S. (
2005, March–April
).
Province-specific mortality in China 1990–2000
. Paper presented at the annual meeting of the Population Association of America,
.
Borges, D., Miranda, D., Duarte, T., Novaes, F., Ettel, K., Guimarães, T., & Ferreira, T. (
2012
).
Mortes violentas no Brasil: Uma análise do fluxo de informações
[Violent deaths in Brazil: An analysis of the flow of information.].
Rio de Janeiro, Brazil
:
LAV/UERJ
.
Brass, W. (
1971
).
Mortality models and their uses in demography
.
Transactions of the Faculty of Actuaries
,
33
,
123
142
. 10.1017/S0071368600005164
Brass, W. (
1975
).
Methods for estimating fertility and mortality from limited and defective data, based on seminars held 16–24 September 1971 at Centro Latinamerico de Demografia (CELADE) San Jose, Costa Rica
(Report).
Chapel Hill, NC
:
International Program of Laboratories for Population Statistics
.
Campos, N. O. B., & Rodrigues, R. N. (
2004
).
Ritmo de declínio nas taxas de mortalidade dos idosos nos estados do Sudeste, 1980–2000 [The pace of decline in mortality rates of the elderly in states of the Southeast, 1980–2000]
.
Revista Brasileira de Estudos de População
,
21
,
323
342
.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, G., Goodrich, B., Betancourt, M., & . . . Riddell, A. (
2017
).
Stan: A probabilistic programming language
.
Journal of Statistical Software
,
76
,
1
32
. 10.18637/jss.v076.i01
Congdon, P. (
2009
).
Life expectancies for small areas: A Bayesian random effects methodology
.
International Statistical Review
,
77
,
222
240
. 10.1111/j.1751-5823.2009.00080.x
de Beer, J. (
2012
).
Smoothing and projecting age-specific probabilities of death by TOPALS
.
Demographic Research
,
27
(
article 20
),
543
592
. 10.4054/DemRes.2012.27.20
de Boor, C. (
2001
).
Applied mathematical sciences: Vol. 27. A practical guide to splines
(Revised ed.).
New York, NY
:
Springer-Verlag
.
de Mello Jorge, M. H. P., Gawryszewski, V. P., & Latorre, M. D. R. D. D. O. (
1997
).
.
Revista de Saúde Pública
,
31
,
5
25
. 10.1590/S0034-89101997000500002
de Mello Jorge, M. H. P., Laurenti, R., & Davidson Gotlieb, S. L. (
2007
).
Análise da qualidade das estatísticas vitais brasileiras: A experiência de implantação do SIM e do SINASC [Quality analysis of Brazilian vital statistics: The experience of implementing the SIM and SINASC systems]
.
Ciência e Saúde Coletiva
,
12
,
643
654
. 10.1590/S1413-81232007000300014
de Oliveira, G. L., Loschi, R. H., & Assunção, R. M. (
2017
).
A random-censoring Poisson model for underreported data
.
Statistics in Medicine
. Advance online publication. doi: 10.1002/sim.7456
Freire, F. H., Lima, E. C., Queiroz, B. L., Gonzaga, M. R., & Souza, F. H. (
2015, May
).
Mortality estimates and construction of life tables for small areas in Brazil, 2010
. Paper presented at the annual meeting of the Population Association of America,
San Diego, CA
.
Frias, P. G., Szwarcwald, C. L., de Souza, P. R. B.Jr, Almeida, W. D. S., & Lira, P. I. C. (
2013
).
Correcting vital information: Estimating infant mortality, Brazil, 2000–2009
.
Revista de Saúde Pública
,
47
,
1048
1058
. 10.1590/S0034-89102013000901048
Gerland, P., Raftery, A. E., Ševčíková, H., Li, N., Gu, D., Spoorenberg, T., & . . . Wilmoth, J. (
2014
).
World population stabilization unlikely this century
.
Science
,
346
,
234
237
.
Glen, A. G., & Leemis, L. M. (
2017
).
International series in operations research & management science. Computational probability applications
.
Cham, Switzerland
:
Springer
.
Gonzaga, M. R., & Schmertmann, C. P. (
2016
).
Estimating age-and sex-specific mortality rates for small areas with TOPALS regression: An application to Brazil in 2010
.
Revista Brasileira de Estudos de População
,
33
,
629
652
. 10.20947/S0102-30982016c0009
Greene, W. H. (
1997
).
Econometric analysis
(3rd ed.).
:
Prentice Hall
.
Hill, K. (
2007
).
Methods for measuring adult mortality in developing countries: A comparative review
(Global Burden of Disease 2000 in Aging Populations Research Paper No. 13).
Cambridge, MA
:
Harvard Burden of Disease Unit
.
Hill, K., & Queiroz, B. (
2010
).
Adjusting the general growth balance method for migration
.
Revista Brasileira de Estudos de População
,
27
,
7
20
. 10.1590/S0102-30982010000100002
Hill, K., You, D., & Choi, Y. (
2009
).
Death distribution methods for estimating adult mortality: Sensitivity analysis with simulated data errors
.
Demographic Research
,
21
(
article 9
),
235
254
. 10.4054/DemRes.2009.21.9
Hill, K. H. (
1987
).
Estimating census and death registration completeness
.
Asian and Pacific Population Forum
,
1
(
3
),
8
13
, 23.
Instituto Brasileiro de Geografia e Estatística
(Ed.). (
2013
).
. Estudos e pesquisas. Informação demográfica e socioeconomic [Sex- and age-specific abbreviated life tables for Brazilian states and major regions in 2010: Studies and research. Demographic and socioeconomic information].
Rio de Janeiro, Brazil
:
Instituto Brasileiro de Geografia e Estatística (IBGE)
.
Jonker, M. F., Van Lenthe, F. J., Congdon, P. D., Donkers, B., Burdorf, A., & Mackenbach, J. P. (
2012
).
Comparison of Bayesian random-effects and traditional life expectancy estimations in small-area applications
.
American Journal of Epidemiology
,
176
,
929
937
. 10.1093/aje/kws152
Lynch, S. M. (
2007
).
Introduction to applied Bayesian statistics and estimation for social scientists
.
New York, NY
:
Springer
.
Målqvist, M., Eriksson, L., Nga, N. T., Fagerland, L. I., Hoa, D. P., Wallin, L., & . . . Persson, L.-Å. (
2008
).
Unreported births and deaths, a severe obstacle for improved neonatal survival in low-income countries: A population based study
.
BMC International Health and Human Rights
,
8
,
4
. 10.1186/1472-698X-8-4
Mathers, C. D., Ma Fat, D., Inoue, M., Rao, C., & Lopez, A. D. (
2005
).
Counting the dead and what they died from: An assessment of the global status of cause of death data
.
Bulletin of the World Health Organization
,
83
,
171
177
.
Matos, K., de Godoy, M., & Baccarat, C. (
2013
).
Mortalidade por causas externas em crianças, adolescentes e jovens: Uma revisão bibliográfica [Mortality from external causes in children, teenagers, and young adults: A bibliographic review]
.
Espaço para a Saúde-Revista de Saúde Pública do Paraná
,
14
(
1/2
),
82
93
.
Moreno, E., & Girón, J. (
1998
).
Estimating with incomplete count data: A Bayesian approach
.
Journal of Statistical Planning and Inference
,
66
,
147
159
. 10.1016/S0378-3758(97)00073-6
Murray, C. J. L., Rajaratnam, J. K., Marcus, J., Laakso, T., & Lopez, A. D. (
2010
).
What can we conclude from death registration? Improved methods for evaluating completeness
.
PLoS Medicine
,
7
(
4
),
1000262
10.1371/journal.pmed.1000262
Ocaña-Riola, R., & Mayoral-Cortés, J.-M. (
2010
).
Spatio-temporal trends of mortality in small areas of Southern Spain
.
BMC Public Health
,
10
,
1
. 10.1186/1471-2458-10-26
Paes, N. A. (
2005
).
Avaliação da cobertura dos registros de óbitos dos estados brasileiros em 2000 [Assessment of completeness of death reporting in Brazilian states in 2000]
.
Revista de Saúde Pública
,
39
,
882
890
. 10.1590/S0034-89102005000600003
Paes, N. A., & Albuquerque, M. E. E. (
1999
).
Avaliação da qualidade dos dados populacionais e cobertura dos registros de óbitos para as regiões Brasileiras [Evaluation of population data quality and death registration coverage for Brazilian regions]
.
Revista de Saúde Pública
,
33
,
33
43
. 10.1590/S0034-89101999000100006
Pletcher, S. D. (
1999
).
Model fitting and hypothesis testing for age-specific mortality data
.
Journal of Evolutionary Biology
,
12
,
430
439
. 10.1046/j.1420-9101.1999.00058.x
Preston, S., Coale, A. J., Trussell, J., & Weinstein, M. (
1980
).
Estimating the completeness of reporting of adult deaths in populations that are approximately stable
.
Population Index
,
46
,
179
202
. 10.2307/2736122
Preston, S., & Hill, K. (
1980
).
Estimating the completeness of death registration
.
Population Studies
,
34
,
349
366
. 10.1080/00324728.1980.10410395
Queiroz, B. L. (
2012, November
).
Estimativas do grau de cobertura e da esperança de vida para as unidades da federação no Brasil entre 2000 e 2010
[Estimates of the degree of coverage and life expectancy for Brazilian states between 2000 and 2010]. Paper presented at the XVIII Encontro de Estudos de População da ABEP, Aguas de Lindóia.
Queiroz, B. L., Freire, F. H. M. A., Gonzaga, M. R., & Lima, E. E. C. (
2017
).
Completeness of death-count coverage and adult mortality (45q15) for Brazilian states from 1980 to 2010
.
Revista Brasileira de Epidemiologia
,
20
,
21
33
. 10.1590/1980-5497201700050003
Queiroz, B. L., Lima, E. C., Freire, F. H., & Gonzaga, M. R. (
2013
).
Adult mortality estimates for small areas in Brazil, 1980–2010: A methodological approach
.
Lancet
,
381
,
S120
10.1016/S0140-6736(13)61374-4
Raftery, A. E. (
1988
).
Inference for the binomial N parameter: A hierarchical Bayes approach
.
Biometrika
,
75
,
223
228
.
Raftery, A. E., Chunn, J. L., Gerland, P., & Ševčíková, H. (
2013
).
Bayesian probabilistic projections of life expectancy for all countries
.
Demography
,
50
,
777
801
.
Raftery, A. E., Lalic, N., Gerland, P., Li, N., & Heilig, G. (
2014
).
Joint probabilistic projection of female and male life expectancy
.
Demographic Research
,
30
(article 27),
795
822
. 10.4054/DemRes.2014.30.27
Riggan, W. B., Manton, K. G., Creason, J. P., Woodbury, M. A., & Stallard, E. (
1991
).
Assessment of spatial variation of risks in small populations
.
Environmental Health Perspectives
,
96
,
223
238
.
Ševčíková, H., Li, N., Kantorová, V., Gerland, P., & Raftery, A. E. (
2016
).
Age-specific mortality and fertility rates for probabilistic population projections
. In Schoen, R. (Ed.),
Dynamic demographic analysis
(pp.
285
310
).
Cham, Switzerland
:
Springer
.
Soares Filho, A. M., Souza, M. F. M., Gazal-Carvalho, C., Malta, D. C., Alencar, A. P., Silva, M. M. A., & Morais Neto, O. L. (
2007
).
Análise da mortalidade por homicídios no Brasil [Analysis of homicide mortality in Brazil]
.
Epidemiologia e Serviços de Saúde
,
16
,
7
18
.
Stephens, A. S., Purdie, S., Yang, B., & Moore, H. (
2013
).
Life expectancy estimation in small administrative areas with non-uniform population sizes: Application to Australian New South Wales local government areas
.
BMJ Open
,
3
(
12
),
e003710
. 10.1136/bmjopen-2013-003710
Szwarcwald, C. L., Morais Neto, O. L., Frias, P. G., de Souza, P. R. B., Jr., Escalante, J. J. C., de Lima, R. B., & Viola, R. C. (
2011
).
Busca ativa de óbitos e nascimentos no Nordeste e na Amazônia Legal: Estimação das coberturas do SIM e do SINASC nos municípios Brasileiros [Active search for deaths and births in the Northeast and in the Legal Amazon: Estimation of coverage of SIM and SINASC in Brazilian municipalities]
. In
Ministry of Health
(Ed.),
Saúde Brasil 2010: Uma análise da situação e de evidências selecionadas de impacto de ações de vigilância em saúde
[Saúde Brasil 2010: An analysis of the health situation and selected evidence of the impact of health surveillance actions] (pp.
79
98
).
Brasília
:
Ministério da Saúde
.
Tsimbos, C., Kalogirou, S., & Verropoulou, G. (
2014
).
Estimating spatial differentials in life expectancy in Greece at local authority level
.
Population, Space and Place
,
20
,
646
663
. 10.1002/psp.1800
Wilmoth, J., Zureick, S., Canudas-Romo, V., Inoue, M., & Sawyer, C. (
2012
).
A flexible two-dimensional mortality model for use in indirect estimation
.
Population Studies
,
66
,
1
28
. 10.1080/00324728.2011.611411
You, D., Hug, L., Ejdemyr, S., Idele, P., Hogan, D., Mathers, C., & . . . Alkema, L. (
2015
).
Global, regional, and national levels and trends in under-5 mortality between 1990 and 2015, with scenario-based projections to 2030: A systematic analysis by the UN Inter-agency Group for Child Mortality Estimation
.
Lancet
,
386
,
2275
2286
.