## Abstract

Recent studies have shown that U.S. Census– and American Community Survey (ACS)–based estimates of income segregation are subject to upward finite sampling bias (Logan et al. 2018; Logan et al. 2020; Reardon et al. 2018). We identify two additional sources of bias that are larger than, and opposite in sign to, finite sampling bias: measurement error–induced attenuation bias and temporal pooling bias. The combination of these three sources of bias makes it unclear how income segregation has trended. We formalize the three types of bias and provide a method to correct them simultaneously using public data from the decennial census and ACS from 1990 to 2015–2019. We use these methods to produce bias-corrected estimates of income segregation in the United States from 1990 to 2019. We find that (1) segregation is on the order of 50% greater than previously believed; (2) the increase from 2000 to the 2005–2009 period was much greater than indicated by previous estimates; and (3) segregation has declined since 2005–2009. Correcting these biases requires good estimates of the reliability of self-reported income and of the year-to-year volatility in neighborhood mean incomes.

## Neighborhood Income Segregation in the United States

A growing scholarship seeks to offer a nuanced description of patterns of income segregation across income level, over time, and with respect to other types of segregation (Jargowsky 1996; Massey et al. 2009; Owens 2016; Owens et al. 2016; Reardon and Bischoff 2011). These descriptions of income segregation—the uneven distribution of people with different incomes across space, place, and institutions—are important given how much people's lives and futures are shaped by the socioeconomic conditions of their surroundings (Chetty and Hendren 2018a, 2018b; Chetty et al. 2016; Chetty et al. 2014; Sampson 2012).

Sociological theories suggest that income segregation exacerbates inequality and hampers class mobility across generations. Since Park (1915), sociologists have theorized that distinct social norms and environments emerge from clustering people into neighborhoods. More recently, collective efficacy theory explains how poor neighborhoods' organizational characteristics and reputations—promoted by higher mobility and biased perceptions, respectively—increase residents' exposure to crime (Sampson 2012; Sampson et al. 1997). That is, the different lived experiences of the rich and the poor are not confined to the household; rather, they are exacerbated by contextual disparities that emerge from residential segregation.

A similar conclusion is reached by a different line of scholarship that emphasizes the political economy of place (Lichter et al. 2012; Logan and Molotch 1987; Massey et al. 2009). These scholars describe how elites capture municipalities by maneuvering both politically and residentially among cities, suburbs, and towns. They argue that the growth imperative of municipalities binds them to elite interests as they compete for growth and affluent tax bases. Income segregation within places enables municipalities to boost their elite clients by investing more in richer neighborhoods, while income segregation between places concentrates resources and needs into distinct political units. This is potentially self-reinforcing as the higher tax bases, social capital, and human capital of rich neighborhoods and municipalities enable those places to attract high-income residents through investments in community resources and amenities like schools, social services, and parks.

One theorized consequence of income segregation, then, is differences in educational opportunity that enable the rich to transfer their class and status across generations. Empirical studies have validated this concern, showing that neighborhood socioeconomic conditions strongly influence child development, economic mobility, and educational attainment (Chetty and Hendren 2018a, 2018b; Chetty et al. 2016; Wodtke et al. 2016). A complementary scholarship focuses on the role of income segregation between schools and school districts, with recent studies highlighting the strong relationship between income segregation and both income and race/ethnicity achievement gaps (Fahle and Reardon 2018; Fahle et al. 2020; Owens 2016, 2017; Owens et al. 2016; Reardon, Kalogrides, and Shores 2019; Reardon, Weathers et al. 2019).

Scholars have attended to these concerns by describing the income segregation of residences and schools in the United States. This literature shows that residential income segregation has increased over the last four decades, particularly in the 1980s and since 2000 (Bischoff and Reardon 2014; Jargowsky 1996; Massey et al. 2009; Owens 2016; Reardon and Bischoff 2011; Reardon et al. 2018). This increase was driven by increasing segregation of poverty in the 1970s and 2000s, as well as increasing segregation at all income levels in the 1980s (Bischoff and Reardon 2014; Reardon and Bischoff 2011). The increase in segregation is also concentrated among households with school-age children (Owens 2016). Partly as a result, income segregation between schools and school districts has increased markedly since 1990 (Owens et al. 2016), with affluent White families increasingly living in affluent school districts (Owens 2017).

## Finite Sampling Bias

Yet this story has recently been questioned. Finite sampling bias partly accounts for the apparent increase in residential income segregation during the 2000s, when the Census Bureau switched from collecting income data in the decennial census to collecting it in the American Community Survey (ACS) (Logan et al. 2018; Logan et al. 2020; Reardon et al. 2018). One of the more recent, prominent studies showing increasing income segregation was by Bischoff and Reardon (2014). It compared residential income segregation in 2000 measured using the decennial census sample to segregation measured using the 2005–2009 through 2007–2011 ACS five-year pooled samples. Because the effective sample of the latter is much smaller (roughly 8% vs. roughly 17% in the census), it is subject to more sampling variation, which upwardly biases income segregation estimates (Logan et al. 2018).

Both Logan et al. (2018) and Reardon et al. (2018) demonstrated that the apparent increase in income segregation between 2000 and 2005–2009 is exaggerated by this bias. Reardon et al. (2018) proposed a correction for finite sampling bias in segregation measures using publicly available data, while Logan et al. (2020) proposed an alternative approach using restricted microdata and new methods that account for weighted sampling. Replicating Bischoff and Reardon (2014), the former found that income segregation increased by about half of what was initially reported, while the latter found that income segregation was stable over the period in question.

The recent attention to finite sampling bias is important, but misses—as we show next—two additional sources of downward bias that are much larger than the upward bias that results from finite samples. We formalize the three types of bias—finite sampling bias, attenuation bias, and pooling bias—and provide a method for correcting them using public data from the decennial census and ACS. Finally, we report new estimates of the national trend in residential income segregation, finding that it is substantially different than previously believed.

## Measurement Error–Induced Attenuation Bias

Estimates of income segregation typically use error-prone self-reported measures of income. In addition to oft-cited sources of error like nonresponse and motivated misreporting, error can also come from the cognitive demands of income reporting in the form of misunderstanding income concepts and terms, information retrieval errors, and motivated misremembering (Moore et al. 2000). Measurement error in self-reported income presents an issue for estimating segregation because it increases the apparent variance of income within neighborhoods, exaggerating the extent to which income distributions of different neighborhoods overlap. In other words, it makes neighborhoods appear more similar than they are, downwardly biasing estimates of income segregation.

Attenuation in segregation estimates has received only passing attention in the literature. In a comment, Dickens and Levy (2003) described downward bias from response error and income volatility in measures of segregation by permanent income, reporting that estimates using dissimilarity indices should be inflated by 15–30%. In a note, Owens et al. (2016) disattenuated their income segregation estimates by dividing $H$ by the reliability of self-reported annual income. Though attenuation bias appears to be severe, it remains unformalized, the previously applied corrections have not been evaluated, and reported estimates rarely consider it.

## Temporal Pooling Bias

A third source of bias emerges when samples of respondents reporting annual income are pooled across multiple years (as is done in the ACS). In this case, temporal variation in neighborhood mean income is included in calculations of within-neighborhood income variation, artificially inflating the variance of incomes within neighborhoods. This biases measures of income segregation downward. This source of bias has not been discussed in the literature.

The ACS uses pooled samples (pooled over five years in the case of tract-level data), but the decennial census does not. As a result, pooling bias affects the trend during the 2000 to 2005–2009 period that has been the focus of recent work. If neighborhood means vary over five-year sampling periods, income segregation increased more than previously believed during the 2000s. Furthermore, the bias may affect trends even when using the ACS only; if the extent to which neighborhood means vary has been changing since the switch to the ACS, the estimated trend from 2005–2009 onward is biased in an unclear direction.

## Formalizing the Three Biases

### Simplifying Assumptions

For simplicity, and to build intuition, we assume the distribution of observed (self-reported) income in log-dollars in each neighborhood $j$ and year $t$ is described by the data-generating model:

$$\hat{Y}_{ijt} = Y_{ijt} + \epsilon_{ijt} = \Theta + \delta_t + \mu_j + v_{jt} + e_{ijt} + \epsilon_{ijt}. \tag{1}$$

Here, $Y_{ijt}$ indicates the true income of household $i$ in neighborhood $j$ in year $t$; $\hat{Y}_{ijt}$ indicates its self-reported income; $\Theta$ is the average value of $Y$ in the population over time; $\delta_t$ and $\mu_j$ are year- and neighborhood-specific deviations from this average; $v_{jt}$ is a neighborhood-by-year specific deviation; $e_{ijt}$ is the difference between a household's true income and the average income in their neighborhood and year; and $\epsilon_{ijt}$ is measurement error in observed household income. Throughout, $\tau$, $\sigma$, $\omega$, and $\eta$ denote the variances of $\mu_j$, $v_{jt}$, $e_{ijt}$, and $\epsilon_{ijt}$, respectively. Table 1 provides a reference for the symbol conventions we use here and introduce below.

Additionally, we assume that all neighborhoods are the same size (so that they each are weighted equally in computing segregation). We consider the case where all neighborhoods in the relevant region are included in the data collection, but a fixed sample of size $n$ is drawn from each neighborhood. As a result, we have the full population of neighborhoods, but a finite sample of households within each. Because all neighborhoods are included in estimation, we treat both the number of neighborhoods and the total population over all neighborhoods as infinite in the derivations that follow.

Given the data-generating model, the true within-neighborhood variance of income (denoted $W$) in a given year is

$$W = \omega, \tag{2}$$

the true between-neighborhood variance of income (denoted $B$) in a given year is

$$B = \tau + \sigma, \tag{3}$$

and the total population variance of income (denoted $V$) in a given year is

$$V = W + B = \omega + \tau + \sigma. \tag{4}$$

Segregation is generally defined as the proportion of variation in income that is due to between-neighborhood differences in income. A simple way to operationalize this is to define segregation in year $t$ as the ratio of between-neighborhood variance to total variance of income:

$$S = \frac{B}{V} = 1 - \frac{W}{V}. \tag{5}$$

The latter formula is often used when it is simpler to estimate $W$ than $B$.
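
As a quick numerical check of these definitions, the following minimal sketch computes the variance components and $S$ for one set of parameter values. It assumes the reading of the stylized model in which $W=\omega$ and $B=\tau+\sigma$; the function name `variance_components` is ours, and the parameter values are the ones used for the simulations later in the article.

```python
def variance_components(omega, tau, sigma):
    """Return (W, B, V, S) for the stylized model in a single year.

    omega: variance of household income around the tract-year mean
    tau:   variance of stable tract deviations (mu_j)
    sigma: variance of tract-by-year deviations (v_jt)
    """
    W = omega            # within-neighborhood variance
    B = tau + sigma      # between-neighborhood variance
    V = W + B            # total variance
    S = B / V            # variance-ratio segregation
    return W, B, V, S


W, B, V, S = variance_components(omega=0.7, tau=0.27, sigma=0.03)
# The two forms of the definition agree: B / V and 1 - W / V.
print(round(S, 3), round(1 - W / V, 3))
```

Both printed values round to .3, the level of segregation used in the simulations.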

### Bias in Segregation Estimates

We start by considering the most general case, when all three features of data collection described above are present: individual income is measured with error; neighborhood income distributions are estimated based on finite samples; and neighborhood income distributions are estimated by pooling data across several years. Below we adopt the following notation: to indicate that a measure of segregation is based on data that include measurement error, we add a superscript prime symbol ($S'$); to indicate that a measure is based on a sample, we add a superscript asterisk ($S^{*}$); and to indicate that a measure is based on data pooled over years, we add a tilde above ($\tilde{S}$). In addition, let $T$ be the number of years over which neighborhood samples are pooled, and let $n$ be the sample size in each neighborhood. Note that in our stylized data-generating model, the reliability of self-reported income would be equal to $r=V/(V+\eta)$.

We first consider the within-neighborhood variance we expect to observe in this case. The within-year, within-neighborhood variance of observed income will be $\omega+\eta$ (true income variance plus measurement error). But we do not observe a sample drawn from one year; nor do we observe income of all households. Instead, we estimate the within-neighborhood variance from a finite sample of size $n$ pooled over $T$ years. If we demean the sample distributions in each year, pool them over years, and then compute the sample variance of the pooled distribution, the expected variance of the observed demeaned within-neighborhood distribution will be $\left(\frac{n-1}{n}\right)(\omega+\eta)$. However, because the sample is pooled over years, the observed within-neighborhood sample variance will be larger, because neighborhoods do not have stable mean income over time. Over a period of $T$ years, the neighborhood means have an expected sample variance of $\Delta_T+\left(\frac{T-1}{T}\right)\sigma$, where $\Delta_T$ represents the expected sample variance of the population mean income over $T$ years and $\left(\frac{T-1}{T}\right)\sigma$ represents the expected variance of the neighborhood means over $T$ years, net of the overall changes in income over the period.^{1}

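Thus, the expected variance of observed income within a neighborhood, when the distribution of income is estimated from a sample of size $n$ surveyed over a period of $T$ years, will be

$$E\left[\tilde{W}'^{*}\right] = \frac{n-1}{n}(\omega+\eta) + \Delta_T + \frac{T-1}{T}\sigma. \tag{6}$$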

Next, we consider the total variance of observed income in a sample pooled over $T$ years. This will be equal to $V$, plus two additional components: additional variance $\eta$ due to measurement error in observed income, and additional variance $\Delta_T$ resulting from changes in mean population income over time:

$$E\left[\tilde{V}'^{*}\right] = V + \eta + \Delta_T. \tag{7}$$

Subtracting the expected within-neighborhood sample variance from the total variance yields an expression for the expected value of the between-neighborhood variance in observed income:

$$E\left[\tilde{B}'^{*}\right] = \tau + \frac{1}{T}\sigma + \frac{1}{n}(\omega+\eta). \tag{8}$$

The first term represents between-neighborhood variance due to stable differences in neighborhood incomes; the second term represents variance among neighborhoods in their average $v_{jt}$; and the third term represents variance in the estimated neighborhood means due to the fact that the neighborhood means are estimated from finite samples of size $n$.

After some algebraic rearranging of terms, the expected value of $S$ can be written as

$$E\left[\tilde{S}'^{*}\right] = \frac{rS + \dfrac{1-rS}{n} - r\,\dfrac{T-1}{T}\,\dfrac{\sigma}{V}}{1 + r\,\dfrac{\Delta_T}{V}}. \tag{9}$$

Note that, because the sample sizes used for computing segregation are generally quite large, we will drop the expected value notation going forward.

Given Eqs. (6)–(9), it is straightforward to derive expressions for the bias in $W$, $B$, $V$, and $S$ under any combination of the three data collection processes. If there is no measurement error, we set $r=1$ in the above expressions. If income data are collected from the full population, rather than samples, we set $n=\infty$. If data are collected in a single year, rather than pooled over years, $T=1$ and $\Delta_T=0$ (because a sample of one has no variance). Table 2 displays the full set of variance and segregation expressions under all combinations of the three data collection conditions. For example, estimates of segregation based on decennial census data have $T=1$ and $\Delta_T=0$, which yields

and

Both the between-neighborhood and total variance components are biased upward in this case, when $r<1$ (which implies $\eta>0$) and $n$ is finite. The direction of bias in segregation, however, is ambiguous: measurement error biases it downward, while sampling biases it upward; the net bias will depend on the specific values of $r$, $n$, and $S$. For example, if true $S=.2$ and $r=.80$ and $n=100$, the estimated value of $S$ will be $S'^{*}\approx.17$; however, if $r=.90$ and $n=25$, $S'^{*}\approx.22$.
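
To make the interplay of the three biases concrete, here is a minimal sketch of the expected naive estimate implied by the stylized model. It assumes the between-neighborhood and total variance components described above (stable differences plus averaged $v_{jt}$ plus sampling noise in the means, divided by true variance plus measurement error plus $\Delta_T$); the function name `expected_naive_S` and its argument names are ours, not the article's.

```python
def expected_naive_S(S, r, n=float("inf"), T=1, delta_T=0.0, sigma=0.0, V=1.0):
    """Expected naive variance-ratio segregation under the stylized model.

    S: true segregation B / V;  r: reliability V / (V + eta)
    n: households sampled per neighborhood (inf = full population)
    T: years pooled; delta_T, sigma: pooling variance terms
    V: true total variance (matters only when delta_T or sigma is nonzero)
    """
    eta = V * (1 - r) / r          # measurement error variance implied by r
    B = S * V                      # true between variance (tau + sigma)
    omega = V - B                  # true within variance
    # Between: stable differences + averaged v_jt + sampling noise in means.
    B_obs = (B - sigma) + sigma / T + (omega + eta) / n
    # Total: true variance + measurement error + drift in the overall mean.
    V_obs = V + eta + delta_T
    return B_obs / V_obs


# First worked example from the text: S = .2, r = .80, n = 100, annual data.
print(round(expected_naive_S(0.2, 0.80, n=100), 3))
# Annual versus pooled data at the simulation's parameter values.
print(expected_naive_S(0.3, 0.75, n=120))
print(expected_naive_S(0.3, 0.75, n=120, T=5, delta_T=0.0008, sigma=0.03))
```

Under these assumptions the first call evaluates to roughly .17, in line with the first example above, and the pooled case falls below the annual case.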

In summary, measurement error and pooling both bias segregation estimates downward, whereas sampling biases segregation estimates upward. ACS and decennial census estimates are known to differ in two ways: (1) the ACS has downward bias from pooling unlike the decennial census and (2) the ACS has greater upward bias than the decennial census from sampling at a lower rate. Because of these countervailing factors, the direction of the bias when comparing ACS and decennial census estimates is unclear.

Consider Figure 1, which depicts how segregation is biased under ACS- and census-type sampling when $S=.3$. Considering the orange line, we see that an annual sample where true income is observed results in upward bias, which is greater at the ACS's lower sampling rate than at the census's sampling rate. Considering the blue line, we see that an annual sample of observed income ($r=.75$) leads to severely underestimating segregation, but this is slightly mitigated when sampling at a lower rate. And considering the maroon line, we see that a pooled sample of observed income (using our average metro-area estimates of $\Delta_T$ and $\sigma$ as benchmarks) leads to further underestimation. Pooling is a substantially more potent source of downward bias than finite sampling is a source of upward bias; hence, segregation in ACS-type data, which use a pooled sample of observed income, will typically be underestimated by more than in census-type data, which use a larger, annual sample of observed income. That is, there is reason to suspect that Bischoff and Reardon's (2014) estimated segregation increase from 2000 to 2005–2009 was underestimated, not overestimated.

## Correcting for Bias in Segregation Estimates

We now turn to the question of how to estimate segregation given these three sources of bias. After defining our generalized estimator for segregation, we consider how to estimate $\Delta_T$ and $\sigma$ using ACS data. The following subsection focuses on estimation with continuous income data, while the subsequent subsection concerns estimation with coarsened income data. Following these two subsections, we validate these approaches, discuss practical considerations in census and ACS data, and apply our method to major metro areas.

Equation (9) can be rearranged to get an estimator for $S$ when any combination of the three biases is present. We observe naive estimates of total within-neighborhood variance and between-neighborhood variance, which we denote $\hat{W}$ and $\hat{B}$, respectively; these variance estimates may have any combination of the three biases depending on the data condition. True segregation is

$$S = \frac{\hat{B} + \dfrac{T-1}{T}\sigma - \dfrac{1}{n-1}\left(\hat{W} - \Delta_T - \dfrac{T-1}{T}\sigma\right)}{r\left(\hat{W} + \hat{B} - \Delta_T\right)}. \tag{11}$$

As before, we set $r=1$ if there is no measurement error, $n=\infty$ if the data are collected from the full population, and $T=1$ and $\Delta_T=0$ if the data are annual rather than pooled.
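
The correction can be sketched in a few lines. The version below is an illustration under the stylized model, not the authors' code: it inverts the expected naive moments (within variance $\frac{n-1}{n}(\omega+\eta)+\Delta_T+\frac{T-1}{T}\sigma$ and between variance $\tau+\frac{\sigma}{T}+\frac{\omega+\eta}{n}$), and the name `corrected_S` and its defaults are ours. The round trip at the end builds naive moments from known parameters and checks that the true value is recovered.

```python
def corrected_S(W_hat, B_hat, r=1.0, n=float("inf"), T=1, delta_T=0.0, sigma=0.0):
    """Recover S = B / V from naive within and between variance estimates.

    Defaults correspond to no measurement error (r = 1), full-population
    data (n = inf), and annual data (T = 1, delta_T = 0).
    """
    pooled_excess = delta_T + (T - 1) / T * sigma      # pooling inflation of W_hat
    # Within-tract variance of observed income, W' = omega + eta.
    W_obs = (W_hat - pooled_excess) / (1.0 - 1.0 / n)
    # Restore the v_jt variance averaged away by pooling; drop sampling noise.
    B = B_hat + (T - 1) / T * sigma - W_obs / n
    # Remove mean drift, then shrink by reliability to strip error variance.
    V = r * (W_hat + B_hat - delta_T)
    return B / V


# Round trip with illustrative values (true S = .3).
omega, tau, sigma, r, n, T, delta_T = 0.7, 0.27, 0.03, 0.75, 120, 5, 0.0008
eta = (omega + tau + sigma) * (1 - r) / r
W_hat = (n - 1) / n * (omega + eta) + delta_T + (T - 1) / T * sigma
B_hat = tau + sigma / T + (omega + eta) / n
naive = B_hat / (W_hat + B_hat)
print(round(naive, 3), round(corrected_S(W_hat, B_hat, r, n, T, delta_T, sigma), 3))
```

With these inputs the naive ratio is about .21, while the corrected value returns to .3.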

While we focus on $S$ for simplicity, alternative segregation indices can also be recovered; in online appendix A, we describe how we estimate the rank-order segregation indices used in the Bischoff and Reardon (2014) and Reardon et al. (2018) analyses. Additionally, we can relax the assumptions that $J$ and the total population are large; online appendix B provides equations for when there are finite tracts or a finite population.

In any case, we can use Eq. (11) to estimate segregation from decennial census data given tract sample size $n$ and an estimate of $r$ from the literature, but to estimate segregation from ACS data we will need to first estimate $\Delta_T$ and $\sigma$. To do so, we first consider the case in which neighborhood segregation is computed within metropolitan areas and income data are continuous. We get our estimators for $\Delta_T$ and $\sigma$ from a system of three equations, which we turn to now.

The first equation we consider is crucial to estimating $\Delta_T$. Here, we leverage that the ACS provides publicly available annual income data for metro areas, which allows us to compute the unadjusted sample variance of estimated annual metro means over time, $\mathrm{svar}(\hat{\bar{Y}}_t)$. The observed estimated metro mean in year $t$ is

$$\hat{\bar{Y}}_t = \bar{Y}_t + u_t, \tag{12}$$

where $\bar{Y}_t$ is the true year $t$ metro mean in constant dollars and $u_t$ is the error in $\hat{\bar{Y}}_t$. Let $u_t \sim \mathrm{MVN}(0,U)$. The expected sample variance of $\hat{\bar{Y}}_t$ over $T$ years is

$$E\left[\mathrm{svar}(\hat{\bar{Y}}_t)\right] = \Delta_T + \frac{T-1}{T}U. \tag{13}$$

We observe $\mathrm{svar}(\hat{\bar{Y}}_t)$ and $T$ but need more information to estimate the sampling error variance, $U$, and to be able to solve for $\Delta_T$.

The second equation we consider is crucial to estimating $\sigma$. Here, we make use of $\mathrm{Var}(\hat{d}_j)$, the variance of the observed difference in pooled neighborhood mean income (in constant dollars) between adjacent pooled samples. Note that because this equation requires adjacent pools, it uses $T+1$ years of data. Suppose $T=5$. The difference, $\hat{d}_j$, between neighborhood $j$'s observed pooled mean in the years 1 through 5 pool, $\hat{\bar{Y}}_{j15}$, and its observed pooled mean in the years 0 through 4 pool, $\hat{\bar{Y}}_{j04}$, is

$$\hat{d}_j = \hat{\bar{Y}}_{j15} - \hat{\bar{Y}}_{j04} = \left(\bar{Y}_{j15} + q_{j15}\right) - \left(\bar{Y}_{j04} + q_{j04}\right), \tag{14}$$

where $q_{j15}$ and $q_{j04}$ are the errors in neighborhood pooled mean income in the two pooled samples. Let $q_{jab} \sim \mathrm{MVN}(0,Q)$. The variance of $\hat{d}_j$ is

We observe $\mathrm{Var}(\hat{d}_j)$ and $T$, but need more information to estimate the error variance in the pooled neighborhood means, $Q$, and to be able to solve for $\sigma$.

### Correcting for Bias Using Continuous Income Data

When data are continuous, $U$ and $Q$ are not observed because they rely on the within-unit variance of observed income in the population, $W'$. Specifically, $U=\frac{TW'}{Jn}+\frac{\sigma}{J}$ and $Q=\frac{W'}{n}$.^{2} We can use the equation for $\tilde{W}'^{*}$ to define $\hat{W}$ in terms of $W'$:

$$\hat{W} = \frac{n-1}{n}W' + \Delta_T + \frac{T-1}{T}\sigma, \tag{16}$$

where $n=\infty$ when observing the full population.
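
For a sense of scale, the two error variances can be evaluated directly from their definitions. This is a small sketch with illustrative inputs taken from the simulation settings reported later; the helper name `error_variances` is ours, and the comments reflect our reading that each annual metro mean rests on $n/T$ households per tract.

```python
def error_variances(W_obs, sigma, T, J, n):
    """Error variances of estimated means when income is continuous.

    W_obs: within-tract variance of observed income, W' = omega + eta
    sigma: variance of tract-by-year deviations
    T: years pooled; J: tracts in the metro; n: pooled sample per tract
    """
    U = T * W_obs / (J * n) + sigma / J   # error variance of an annual metro mean
    Q = W_obs / n                         # error variance of a pooled tract mean
    return U, Q


# omega = .7 and eta = 1/3 (r = .75 with V = 1), J = 500 tracts, n = 120, T = 5.
U, Q = error_variances(W_obs=0.7 + 1 / 3, sigma=0.03, T=5, J=500, n=120)
print(f"U = {U:.6f}, Q = {Q:.6f}")
```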

This gives us a system of three equations (for $\mathrm{svar}(\hat{\bar{Y}}_t)$, $\mathrm{Var}(\hat{d}_j)$, and $\hat{W}$) with three unknowns ($\sigma$, $W'$, and $\Delta_T$), which we can solve for $\Delta_T$ and $\sigma$. Solving for $\Delta_T$:

and for $\sigma $:

where $n=\infty$ when observing the full population.

Note that this estimation strategy leans on the assumption of no time trends within neighborhoods during the pooling period. This is necessary without more information about how neighborhood mean incomes change, but it risks inflating the magnitude of pooling bias and overcorrecting our segregation estimates. In online appendix C, we consider pooling bias when there are linear neighborhood time trends and produce alternate segregation trend estimates under different scenarios. Given linear time trends, the bias from pooling decreases when there is less residual variance among neighborhoods' $T$ observations and when there is less variation between neighborhoods' trend magnitudes. However, even in extreme scenarios, it is likely that Bischoff and Reardon's (2014) estimated segregation increase from 2000 to 2005–2009 was underestimated. Relaxing the assumption decreases our bias-corrected estimates of segregation and its increase over time, but the changes are small relative to the discrepancy between our estimates and prior findings.

### Correcting for Bias Using Coarsened Income Data

We typically have income data that have been coarsened to preserve privacy. In this case, we can use constrained heteroskedastic ordered probit (HETOP) models to estimate tract and metro-area means and variances as needed under the assumption that income is log-normal within neighborhoods (Reardon, Kalogrides, and Ho 2017). Correcting for bias in coarsened data requires a different approach because HETOP estimates include error variance from the HETOP modeling. This modeling error variance inflates $U$, $Q$, and $\hat{B}$.

Estimating $\Delta_T$ and $\sigma$ is easier in coarsened data because HETOP's reported standard errors can be used to estimate $U$ and $Q$. In coarsened data, $U$ and $Q$ include error variance from HETOP modeling in addition to sampling error variance, and HETOP's reported standard errors for each estimated tract mean include error variance from both sampling and modeling. Thus, we can estimate $U$ and $Q$ as the average squared standard error of the estimated metro-year means and tract-pool means, respectively, then compute $\Delta_T$ and $\sigma$ by a simple rearranging of terms in Eqs. (13) and (15).
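
A minimal sketch of this step follows. The averaging of squared standard errors is taken directly from the description above; the rearrangement for $\Delta_T$ additionally assumes that errors in the annual metro means are independent across years, so that the expected unadjusted sample variance is $\Delta_T+\frac{T-1}{T}U$. That form, the function names, and the standard errors below are our assumptions for illustration.

```python
def mean_squared_se(standard_errors):
    """Average squared standard error across a set of estimated means."""
    return sum(se ** 2 for se in standard_errors) / len(standard_errors)


def delta_T_estimate(svar_metro_means, U, T):
    """Solve svar = Delta_T + (T - 1) / T * U for Delta_T (assumed form)."""
    return svar_metro_means - (T - 1) / T * U


# Hypothetical HETOP standard errors, for illustration only.
metro_year_ses = [0.010, 0.012, 0.011, 0.009, 0.013]
tract_pool_ses = [0.09, 0.11, 0.10, 0.08]
U = mean_squared_se(metro_year_ses)   # error variance of metro-year means
Q = mean_squared_se(tract_pool_ses)   # error variance of tract-pool means
print(U, Q, delta_T_estimate(svar_metro_means=0.0009, U=U, T=5))
```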

However, there is an added step before plugging into Eq. (11) once we have estimated $\Delta_T$ and $\sigma$, because $\hat{B}$ is inflated by the HETOP modeling error in the neighborhood means. Let $\hat{B}^{\mathrm{coarse}}$ refer to the estimated between-neighborhood variance in coarsened data while $\hat{B}$ is the between-neighborhood variance one would estimate if the data were continuous. We estimate $\hat{B}$ by first removing the error variance in the estimated neighborhood means, $E$, which includes both modeling error variance and sampling variance, then adding the sampling variance back in:

$$\hat{B} = \hat{B}^{\mathrm{coarse}} - E + \frac{\hat{W}}{n}, \tag{19}$$

where $E$ is the error variance in the estimates of the neighborhood means and $\hat{W}/n$ is the sampling variance of neighborhood sample means. We estimate $E$ from the average of the squared standard errors that HETOP reports for the estimated neighborhood means.
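
The adjustment amounts to one line of arithmetic, sketched below. The function name `continuous_equivalent_B` and the inputs are hypothetical; only the remove-then-add-back logic comes from the text.

```python
def continuous_equivalent_B(B_coarse, tract_mean_ses, W_hat, n):
    """Between-tract variance one would estimate from continuous data.

    B_coarse:       between-tract variance of HETOP-estimated tract means
    tract_mean_ses: HETOP standard errors of those tract means
    W_hat:          naive within-tract variance estimate
    n:              sample size per tract
    """
    # E: total error variance in the tract means (modeling plus sampling).
    E = sum(se ** 2 for se in tract_mean_ses) / len(tract_mean_ses)
    # Remove all error variance, then add the sampling part back in.
    return B_coarse - E + W_hat / n


# Hypothetical values, for illustration only.
print(continuous_equivalent_B(B_coarse=0.30, tract_mean_ses=[0.11, 0.12, 0.10],
                              W_hat=1.05, n=120))
```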

### Assessing the Segregation Estimator's Accuracy in Four Data Conditions

We assess the validity of our approach by simulating $M=100$ metros over $K=12$ years according to the data-generating model in Eq. (1). In each metro, we set $J=500$, $n=120$, $S=.3$, $\Theta=0$, $r=.75$, $\omega=.7$, $\tau=.27$, $\sigma=.03$, and $\Delta=.001$, values similar to what we observe for major metro areas in the analysis in the following section. We first generate tract-year means $Y_{jt}=\delta_t+\mu_j+v_{jt}$ such that the sample variance of $\delta_t$ over the $K$ years is exactly $\frac{K-1}{K}\Delta$, the sample variance of $\mu_j$ over the $J$ tracts is exactly $\tau$,^{3} and the sample variance of $v_{jt}$ over the $JK$ tract-years is exactly $\frac{JK}{JK-1}\sigma$. We then generate samples of $n$ households in each tract-year, drawing $e_{ijt}$ and $\epsilon_{ijt}$ from normal distributions with variances of $\omega$ and $\eta$, respectively.
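
A scaled-down sketch of the annual continuous condition, written for illustration rather than as a replication of the simulations reported here: it uses a single metro and a single year, fewer tracts and households than above, total variance normalized to 1, and tract means placed at rescaled normal quantiles so that their variance is exactly $S$. All function names and defaults are ours.

```python
import random
from statistics import NormalDist, fmean


def simulate_annual_tracts(J=400, n=50, S=0.3, r=0.75, seed=0):
    """Draw observed log incomes for J tracts with n households each (V = 1)."""
    rng = random.Random(seed)
    eta = (1 - r) / r                 # measurement error variance when V = 1
    omega = 1 - S                     # true within-tract variance
    # Tract means: normal quantiles rescaled so their variance is exactly S.
    z = [NormalDist().inv_cdf((j + 0.5) / J) for j in range(J)]
    z_bar = fmean(z)
    z_var = fmean((v - z_bar) ** 2 for v in z)
    tract_means = [(v - z_bar) * (S / z_var) ** 0.5 for v in z]
    sd_obs = (omega + eta) ** 0.5     # true dispersion plus reporting error
    return [[m + rng.gauss(0.0, sd_obs) for _ in range(n)] for m in tract_means]


def naive_and_corrected_S(samples, r):
    """Variance-ratio segregation before and after the annual-data correction."""
    n = len(samples[0])
    means = [fmean(s) for s in samples]
    grand = fmean(means)
    B_hat = fmean((m - grand) ** 2 for m in means)     # between-tract variance
    W_hat = fmean(fmean((x - m) ** 2 for x in s)       # within-tract variance
                  for s, m in zip(samples, means))
    naive = B_hat / (W_hat + B_hat)
    # Remove sampling noise from B_hat, then strip error variance from the total.
    corrected = (B_hat - W_hat / (n - 1)) / (r * (W_hat + B_hat))
    return naive, corrected


naive, corrected = naive_and_corrected_S(simulate_annual_tracts(), r=0.75)
print(f"naive S = {naive:.3f}, corrected S = {corrected:.3f}")
```

In this setup the naive estimate should land well below the true value of .3, and the corrected estimate close to it, up to sampling noise.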

We use the resulting annual continuous data to model three additional conditions: annual coarsened data, pooled continuous data, and pooled coarsened data. When coarsening, we use 10 bins with cut points set at the deciles of a normal distribution with variance $\frac{V}{r}+\frac{4}{5}\Delta$, where $\Delta=0$ if data are annual. When pooling, we draw annual samples of $n/T$ then pool over $T=5$ years. We estimate uncorrected segregation ($S$) and corrected estimates, computing $KM=1{,}200$ estimates in the annual conditions and $(K-T)M=700$ estimates in the pooled conditions.
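
The coarsening step can be sketched as follows, assuming $V=1$ and a mean of zero as in the simulation settings; the helper names `decile_cut_points` and `coarsen` are ours.

```python
from bisect import bisect_right
from statistics import NormalDist


def decile_cut_points(V=1.0, r=0.75, delta=0.0, T=5, bins=10):
    """Cut points at the deciles of a normal with variance V/r + (T-1)/T * delta."""
    variance = V / r + (T - 1) / T * delta
    dist = NormalDist(mu=0.0, sigma=variance ** 0.5)
    return [dist.inv_cdf(k / bins) for k in range(1, bins)]


def coarsen(log_income, cuts):
    """Return the 0-based bin index for one observed log income."""
    return bisect_right(cuts, log_income)


annual_cuts = decile_cut_points(delta=0.0)      # annual data: Delta = 0
pooled_cuts = decile_cut_points(delta=0.001)    # pooled over T = 5 years
print([round(c, 3) for c in annual_cuts])
print(coarsen(-5.0, annual_cuts), coarsen(5.0, annual_cuts))
```

Nine cut points define the 10 bins; the pooled cut points are very slightly wider because of the added $\frac{4}{5}\Delta$ term.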

Figure 2 reports our corrected estimates of $S$ compared with the uncorrected estimates in each condition (online appendix D discusses the results for $\Delta T$ and $\sigma $). We report the 95% confidence interval around each mean estimate to display the bias in the estimates and we report the 90th percentile to 10th percentile range of our estimates to display their imprecision.

Across conditions, our correction substantially reduces bias; across all simulations and conditions, the magnitude of bias is reduced by 99% on average. In both continuous data conditions, our estimates are unbiased; in the annual continuous condition, the 95% confidence interval of the mean is (.2993, .3002), while in the pooled continuous condition it is (.2996, .3001). There is a small downward bias in the annual coarsened and pooled coarsened conditions, which have confidence intervals of (.2982, .2991) and (.2982, .2988), respectively. This appears to be due to HETOP models tending to slightly overestimate within-tract variance, as has been noted elsewhere (Reardon, Shear et al. 2017).

The corrected estimates are not substantially less efficient than uncorrected estimates, most of the imprecision owing to sampling variability, but the imprecision is nonetheless nonnegligible. Across conditions, the standard deviation of the corrected estimates is 32.7–36.4% greater than that of the uncorrected estimates. This variation in the estimates is some cause for caution when comparing single observations of individual metro areas.

## New National Trend Estimates

### Data and Measures

In the foregoing simulations, applying our bias-correcting estimator to annual data sharply increased estimated segregation and applying it to pooled data raised estimated segregation further still. This suggests that the various estimates of national income segregation at the center of recent debates are all severe underestimates. Moreover, given the change from estimates based on one-year, decennial census data in 2000 to estimates based on pooled ACS data starting in 2005–2009, the national trend may differ substantially from the various estimates in prior research.

We produce new national trend estimates using our bias-correcting estimator. Our data rely primarily on 1990 and 2000 census and 2005–2009 through 2015–2019 ACS tabulations of tract-level household income in constant dollars. We focus on the Bischoff and Reardon (2014) sample of the 116 largest metro areas. To facilitate HETOP modeling, we further coarsen income from 16 to 8 bins by combining adjacent bins.^{4} We also exclude the small number of tracts that have small populations (fewer than 50 households) or sparse income distributions (observations in fewer than five bins). In 2015–2019, for example, these exclusions remove 11 of the 45,882 tracts with nonzero populations.

Applying our estimator requires estimates of $r$, which we derive from the literature. In online appendix E, we review the literature on the reliability of self-reported income. The literature provides a wide range of estimates; studies use a variety of methods, compare observed data to a variety of approximations of true values, and estimate the reliability of various types of income reports (e.g., household vs. personal income, total income vs. wage income). As a result, it is not clear exactly how reliable income reports are, though most evidence suggests that the reliability of self-reported household income in the decennial census and ACS is roughly between $.7$ and $.8$. We use the midpoint, $r=.75$, to produce our primary estimates. We treat the reliability of income as stable over time and place and assume it is the same for the decennial census and ACS income items, which have the same design.

To apply our estimator to the pooled ACS data, we need to estimate $\Delta_T$ and $\sigma$. We estimate a constant value of $\Delta_T$ in each metro area using ACS one-year metro-area data from 2010 (when the data first became available) to 2019.^{5} We estimate $\mathrm{svar}(\hat{\bar{Y}}_t)$ when $T=5$ by estimating the population variance of metro mean income over all 10 years with available data, then multiply this by $\frac{4}{5}$ to estimate the five-year sample variance.^{6} This improves precision at the cost of forcing $\Delta_T$ to be constant over time within metros. We estimate a six-year running average of $\sigma$ in each metro area using adjacent five-year pools from the ACS in 2005–2009 (the start of the ACS) through 2015–2019. In practice, the $\sigma$ estimates are imprecise, which can inflate how much $\sigma$ appears to vary over time and across metros. In online appendix F, we describe a procedure we use to reduce imprecision and better estimate variation in $\sigma$.
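
The rescaling from the 10-year variance to the five-year sample variance can be sketched as below. We assume here that "population variance" means the usual estimator with an $N-1$ denominator; the function name and the metro means are hypothetical.

```python
from statistics import variance


def five_year_sample_variance(annual_metro_means, T=5):
    """Scale a long-run variance estimate to the expected T-year sample variance."""
    long_run_var = variance(annual_metro_means)   # N - 1 denominator (our assumption)
    return (T - 1) / T * long_run_var


# Hypothetical metro mean log incomes for 2010-2019, for illustration only.
means_2010_2019 = [10.90, 10.91, 10.93, 10.92, 10.95, 10.97, 10.99, 11.00, 11.02, 11.03]
print(five_year_sample_variance(means_2010_2019))
```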

We estimate household income segregation between tracts within metro areas in three metrics: $S$, $H$, and $R$. While we use $S$ above for illustrative purposes, it is not a standard income segregation measure. It is more common to estimate income segregation using the rank-order information theory index, $H^R$, and the rank-order variance ratio index, $R^R$, measures for computing segregation of a coarsened continuous variable (Reardon and Bischoff 2011). More crucially for our purposes, these are the metrics in which Bischoff and Reardon (2014) reported their unadjusted estimates and Reardon et al. (2018), Logan et al. (2018), and Logan et al. (2020) reported their finite sampling bias-adjusted estimates. The rank-order measures estimate segregation in continuous characteristics like income. Income segregation is estimated by using binary segregation estimates (i.e., $H$, which is based on entropy, and $R$, which is based on variance) taken at several cut points along the income distribution to estimate how binary segregation varies over the full distribution of income. This segregation “profile” is then integrated over to collapse the information into rank-order segregation, a single, distribution-wide measure that equally weights each income quantile. Online appendix A describes how we extend our method to estimating rank-order segregation and includes a simulation assessment of the bias-reduction using this strategy.
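The first step of the rank-order approach, computing binary segregation at each income cut point, can be sketched as below. The sketch uses the standard binary information theory index and variance ratio index applied to raw binned counts; it is illustrative only and omits the bias corrections, the HETOP-implied distributions, and the fitting and integration of the profile described in online appendix A. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def entropy(p: float) -> float:
    """Binary entropy in bits; defined as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def segregation_profile(tracts: List[List[float]]) -> List[Tuple[float, float, float]]:
    """Return (p, H, R) at each interior cut point of the binned distribution.

    `tracts` holds one list of household counts per tract, ordered from the
    lowest to the highest income bin. Every tract must have a positive total.
    """
    n_bins = len(tracts[0])
    tract_totals = [sum(t) for t in tracts]
    total = sum(tract_totals)
    profile = []
    for k in range(1, n_bins):
        below = [sum(t[:k]) for t in tracts]   # households below cut point k
        p = sum(below) / total                 # metro-wide share below cut point
        if p <= 0.0 or p >= 1.0:
            continue                           # indices undefined at the extremes
        within_entropy = 0.0
        within_variance = 0.0
        for b, n in zip(below, tract_totals):
            p_j = b / n
            within_entropy += n * entropy(p_j)
            within_variance += n * p_j * (1 - p_j)
        h = 1 - within_entropy / (total * entropy(p))
        r = 1 - within_variance / (total * p * (1 - p))
        profile.append((p, h, r))
    return profile


# Hypothetical metro area with three tracts and four income bins.
example = [[30, 20, 10, 5], [10, 20, 20, 10], [5, 10, 20, 30]]
for p, h, r in segregation_profile(example):
    print(f"p={p:.2f}  H={h:.3f}  R={r:.3f}")
```

Both indices equal 0 when every tract mirrors the metro-wide distribution and 1 when every tract lies entirely on one side of the cut point.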

Our estimation procedure is unbiased under the simplifying assumptions that income is log-normal within tracts and observed with measurement error that is classical in the logged income metric. Tract income distributions are of course messier than our simple data-generating model, so we assessed whether this assumption biases our estimates and applied a *post hoc* adjustment to remove the bias. We estimate unadjusted rank-order income segregation in $H^R$ and $R^R$ in our analytic sample in two ways: first, using the income tabulations provided by the census or ACS and, second, using the income tabulations implied by the HETOP-estimated means and variances (see Eqs. (A1), (A2), and (A3) in online appendix A for more details). The latter estimates, which assume income is well-behaved, tend to provide overestimates of unadjusted segregation compared to the tabulation-based estimates. The bias is smaller in later years (no more than 2.4% from 2006–2010 on) and is greatest in 1990, where the estimates are 11.1% greater. Our *post hoc* adjustment removes this bias in each metro-year estimate by multiplying our adjusted segregation estimates by the ratio of the tabulation-based unadjusted estimates to the HETOP-based unadjusted estimates.^{7}
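The *post hoc* adjustment amounts to a single rescaling per metro-year, sketched here with hypothetical function names and input values. The second function reflects the averaging of the $H$ and $R$ ratios used for the $S$ metric, as described in note 7.

```python
def post_hoc_adjust(adjusted: float,
                    tabulation_unadjusted: float,
                    hetop_unadjusted: float) -> float:
    """Rescale a bias-corrected estimate by the ratio of the tabulation-based
    unadjusted estimate to the HETOP-based unadjusted estimate."""
    return adjusted * tabulation_unadjusted / hetop_unadjusted


def ratio_for_s(h_ratio: float, r_ratio: float) -> float:
    """Ratio applied in the S metric: the average of the H and R ratios."""
    return (h_ratio + r_ratio) / 2


# Hypothetical metro-year in which the HETOP-based unadjusted estimate
# is 11.1% greater than the tabulation-based one.
tab_h, hetop_h = 0.100, 0.1111
print(post_hoc_adjust(adjusted=0.150,
                      tabulation_unadjusted=tab_h,
                      hetop_unadjusted=hetop_h))
```

When the HETOP-based estimate overstates unadjusted segregation, the ratio is below 1 and the adjusted estimate is scaled down proportionally.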

### Findings

Figure 3 presents the estimated national trends in each segregation metric ($S$, $H$, and $R$) when $r=.75$. The top panel portrays the trend in each metric, while the bottom panel portrays the trend as a percentage change since 1990 to ease comparisons across metrics. The trend is similar across metrics. The rank-order segregation trends, $H$ and $R$, run nearly parallel to one another, with estimates in $R$ consistently above those in $H$ such that they almost perfectly align when presented as a percentage change. The trend in $S$, which is roughly double the rank-order measures in scale, has a similar pattern but with a noticeably less steep percentage increase from 2000 to 2005–2009 (plotted at the midpoint, 2007). Across the three metrics, we find that income segregation declined by roughly 4% from 1990 to 2000 and then increased substantially through 2006–2010, such that the net increase since 1990 was 12.1% in $S$, 19.0% in $R$, and 20.3% in $H$. Over the following eight years, segregation declined, and by 2014–2018, it was only about 5% above the 1990 value in the rank-order metrics and had returned to the 1990 value in the $S$ metric. Segregation ticked back up between then and our last observation, the 2015–2019 period.

Figure 4 compares our estimates of the national trend to what one would estimate using extant methods, with the detailed estimates reported in Tables 3 and 4. We provide three types of estimates: an unadjusted estimate following the procedure of Bischoff and Reardon (2014); an estimate adjusted for finite sampling following the procedure of Reardon et al. (2018), denoted the RBOT estimate; and new estimates that apply our correction to adjust for finite sampling, attenuation, and pooled sampling. We report three new estimates at each time point depending on the reliability of income, which is estimated with a wide range in the literature. The top panel of Figure 4 and Table 3 provide estimates in the $H$ metric, while the bottom panel of Figure 4 and Table 4 provide estimates in the $R$ metric.

We see three main takeaways in these findings. First, segregation is far greater than previously believed. In the 2015–2019 ACS sample, we estimate that if the reliability of self-reported household income is $r=.75$, the average segregation across major metro areas in the $H$ metric is 52.1% greater than unadjusted estimates suggest and 64.4% greater than would be estimated using the Reardon et al. (2018) correction for finite sampling bias. The findings are similar when using the $R$ metric: segregation is 49.5% greater than unadjusted estimates suggest and 59.2% greater than estimates following the Reardon et al. (2018) procedure. Most of the downward bias in the unadjusted and Reardon et al. (2018) estimates is due to attenuation bias; attenuation bias accounts for the entirety of the difference between our estimates and the Reardon et al. (2018) estimates in the decennial census years, where our estimates show that segregation is over 40% greater than previously believed.

Second, the increase from 2000 to 2005–2009 (plotted at 2007) was much greater, not lower, than what we find when we use unadjusted estimates like those of Bischoff and Reardon (2014). This marks the switch from decennial census to ACS data, in which the additional downward bias from pooled sampling far outweighs the increase in the upward bias from sampling at a lower rate. Using unadjusted estimates yields an estimated increase of .0049 in the $H$ metric, or 5.0%. Following the Reardon et al. (2018) procedure, we estimate a substantially smaller increase of .0015 (1.5%). At $r=.75$, we estimate that segregation actually increased by .0318 (23.4%), a change that was masked by unaccounted-for pooling bias in the other estimates. The findings are similar when using the $R$ metric: we estimate that segregation increased by 21.4% compared with changes of 3.8% using unadjusted estimates and 0.8% using finite sampling bias-adjusted estimates.

Third, the decline in segregation after 2005–2009 is underestimated by prior estimation methods because those estimates include a time-varying downward bias from pooled sampling. The difference in trends is starkest when comparing 2011–2015 to 2005–2009. Over this time, both unadjusted estimates and estimates following the Reardon et al. (2018) procedure find a small increase in segregation regardless of metric, whereas we find a decrease of .0109 (6.5%) in the $H$ metric and a decrease of .0113 (6.2%) in the $R$ metric when $r=.75$. Our estimates of $\sigma$ over this period show that pooled sampling bias was decreasing in magnitude as tract-level volatility settled down following the Great Recession, tilting the trend upward and masking the decline in estimates that do not correct for it (see Figure G2 in online appendix G).

This speaks to the sensitivity of segregation trend estimates to the trend in neighborhood-level income volatility when using pooled sample data, which will be standard practice for the foreseeable future. For example, we might suspect that both segregation and tract income volatility have recently been increasing during the COVID-19 pandemic, meaning both segregation and the downward pooled sampling bias are greater. If we estimate current segregation with future ACS data without adjusting for pooled sampling, it is plausible that income segregation will appear steady despite having increased. Correcting for pooled sampling bias will be crucial for accurately estimating segregation trends for as long as researchers are reliant on pooled ACS data.

### Limitations

Our estimator improves on past approaches by formalizing and accounting for attenuation bias and pooling bias in addition to finite sampling bias. However, our estimator is a partial solution that we hope will motivate and facilitate additional improvements. Though we rely on weaker assumptions than prior estimates (which have assumed $r=1$, $\sigma=0$, and $\Delta=0$, and in many cases assumed a 100% sampling rate), we nonetheless had to rely on strong simplifying assumptions about the income distribution, measurement error in income, and neighborhood sampling rates given gaps in the available public data and literature. With more detailed information about the true income distribution, measurement error in self-reported income, and tract sampling in the decennial census and ACS, we could produce estimates that relax our assumptions. We now turn to each set of assumptions and consider how additional information might alter our findings.

First, we assume income is well-behaved such that there is independent annual variation in neighborhoods' means. This implies that there are no neighborhood time trends during the pooling period, which is why we can estimate $\sigma $ and $\Delta $ without annual data. If there are neighborhood trends, we have inflated the magnitude of pooling bias and overcorrected our segregation estimates. Online appendix C considers how our estimates would improve given additional information about how neighborhood mean incomes change. We demonstrate how to extend our estimator to incorporate this information. In the most extreme scenario we consider, neighborhoods follow maximally steep linear trends with annual observations falling almost exactly on the trend line. Accounting for this information would meaningfully decrease our bias-corrected estimates of segregation and its increase, but our estimates would remain much greater than estimates based on existing estimators.

Second, we assume measurement error in income is well-behaved such that (1) the reliability of self-reported income is known and constant and (2) measurement error is classical. Our assumptions about measurement error are crucial because our estimates are particularly sensitive to our corrections for attenuation bias. Though our estimator can readily incorporate better reliability estimates that vary over time and place, we are constrained by the available literature. As Figure 4 demonstrates, our uncertainty about the reliability of self-reported income creates a large band of potential segregation estimates. A greater concern for making comparisons over time or across places is the possibility that the reliability varies. For example, the decennial census is given near tax season and asks about the previous year's income, whereas the ACS is given year-round and asks about the past 12 months' cumulative income, a tougher recall task that may make ACS income reports less reliable. In this case, we are underestimating the increase in income segregation from 2000 to 2005–2009. With more detailed inputs, our estimator may yield different substantive findings.

Our estimator cannot, however, be readily extended to account for violations of the simplified data-generating model's log-normality assumptions, the foremost threat being nonclassical reliability. Yet we can speculate how our estimates may be affected. If lower income households report their incomes with more error, as Kim and Tamborini (2012) reported, measurement error widens income distributions more than we have assumed such that we may have undercorrected our estimates. Conversely, we may have overcorrected our estimates if self-report errors at the tails are skewed toward the population mean (Kim and Tamborini 2012; Pedace and Bates 2000; *cf*. Bingley and Martinello 2017). Nonclassical response error also raises the possibility that response errors are related to tract- and metro-level volatility. For example, if self-reports err toward period means, our estimates may have overcorrected for attenuation bias. This is less of a concern if one is more interested in segregation by long-run income rather than annual income, but challenges remain. While there exist estimates of the reliability of self-reported annual income as a proxy for long-run income (e.g., lifetime earnings) (Brady et al. 2018; Haider and Solon 2006; Hyslop 2001; Mazumder 2001; Rothstein and Wozny 2013), they will imperfectly account for self-reports erring toward period means if there is misalignment between the pooling period length, the long-run averaging length, and the way self-reported income errs toward income in the recent past and near future.

Third, our equations assume simpler sampling than occurs in practice, treating neighborhood sample sizes as constant within populations and ignoring sampling rate variation within neighborhoods. We assume constant sampling across tracts within metro areas, but tract sample sizes can differ in practice. We relax this assumption when computing bias-corrected tract means and variances to estimate rank-order segregation (see online appendix A) by using tract-specific sample sizes, but we cannot relax it for our computations of $\sigma $.

To make our model tractable using publicly available data, we must also ignore sampling rate variation within tracts. This is violated by weighted sampling and, in pooled samples, by time-varying sampling rates. Logan et al. (2020) considered bias from weighted sampling in depth. Our estimates of the national trend are likely overestimated in light of bias from weighted sampling, though the size of this bias is small relative to the three biases considered and the uncertainty in segregation due to uncertainty in income reliability. It poses a larger problem for race-specific analyses, for which sample weights differ more. Logan et al. (2020) provided formulas correcting for weighted sampling that, like our estimator, estimate segregation by estimating within-unit variance and total variance, so extending our estimator to accommodate weighted sampling is straightforward when the necessary data are available. Our estimator also does not consider time-varying sampling rates in the ACS. Annual variation in tracts' sampling rates would lead to overcorrecting for finite sampling bias and underestimating segregation, whereas annual variation in sample weights would have the opposite implications. We cannot extend our estimator to account for these conditions without microdata.

## Discussion

We consider three sources of potential bias in recent income segregation estimates, yielding several important findings. We extend the recent discussion of upward bias in sample-based segregation measures by considering two sources of downward bias that have received less attention. We confirm that segregation measures using noisy characteristic data, such as self-reported income, are severely biased downward. We also demonstrate that segregation measures drawing from pooled samples, as we find in ACS data, are biased downward. This bias can be substantial when the characteristic of interest varies over time; our estimates of tract- and metro-level income variation in the ACS indicate that pooling bias is typically greater in magnitude than finite sampling bias in the case of income segregation. Ignoring any of these three sources of bias may lead to incorrect inferences.

Additionally, formalizing the relationship between the three sources of bias indicates that the direction of bias is often unclear, including in both decennial census– and ACS-type data. We also show how one can use a single procedure to compute segregation estimates that are largely free of the three sources of bias. We demonstrate that this is possible using only publicly available census and ACS tabulations and the extant literature on the reliability of income.

Our bias-corrected estimates indicate that income segregation is on the order of 50% greater than previously believed owing to attenuation and pooled sampling bias. The increase in segregation from 2000 to the 2005–2009 period estimated without adjustment is an underestimate by more than a factor of 4, rather than an overestimate as has been argued. The decline in segregation in the years following 2005–2009 has been underestimated as a result of pooled sampling bias: from 2005–2009 to 2011–2015, a period of declining tract-level volatility and consequently declining pooled sampling bias, segregation declined by over 6%, but estimates ignoring pooled sampling bias instead find a small increase. The latter finding is particularly noteworthy because it indicates that pooled sampling bias substantially distorts unadjusted segregation trends when using only ACS data, and so correcting for pooled sampling bias will be crucial for the foreseeable future.

Though the focus of this article is the measurement of income segregation, the new estimates of the national income segregation trend also have implications for how we understand income segregation. Reardon and Bischoff (2011) estimated that an increase of 1 standard deviation in income inequality raises income segregation by .25 standard deviations, providing one account for what drives the trends we observe here. However, it remains unexplained why we find that segregation declined in the 1990s when income inequality substantially increased, and why we find such a large increase in segregation from 2000 to 2005–2009, a period of only modestly increasing inequality (World Bank 2022). Our estimates call for further examination of what drives changes in income segregation and, potentially, identification of additional mechanisms.

The three types of segregation measurement biases we consider are not limited to income segregation estimates, the segregation indices discussed here, or the datasets we focus on. As others have discussed in more detail, finite sampling bias is relevant to all sample-based measures of segregation (Logan et al. 2018; Reardon et al. 2018).

Similarly, all segregation indices are biased downward by noise in the characteristic of interest, whether because of reporting error (e.g., self-reported educational attainment), the use of data reduction techniques (e.g., socioeconomic status), or using a noisy proxy (e.g., using annual income to proxy for permanent income). Attenuation bias can also lead to erroneous inferences when comparing across times, groups, or datasets in which the reliability of the characteristic of interest differs. For example, income segregation comparisons across groups that differ by educational attainment, race and ethnicity, or occupations may be biased by group differences in response error (Kim and Tamborini 2012).

All segregation indices are potentially biased downward by pooled sampling when data are collected as a pooled sample, as in the ACS, or when researchers combine samples to increase statistical power. This is a relevant concern regardless of the characteristic of interest, but it depends on whether there is unit-level (e.g., neighborhood-level) variation over time within the pooling period. This may occur primarily because people's characteristics change over time, as in the case of income, or because of mobility across units. In addition to pooled sampling in the ACS biasing the comparison of decennial census and ACS data, we find that changes over time in tract-level income volatility mask the recent downward trend in income segregation estimates that ignore pooled sampling bias. Comparing neighborhood segregation across groups in pooled samples will also be biased if the groups have different levels of neighborhood mobility or volatility in the characteristic of interest.

Moreover, outside of the case of annual population data, the aggregate bias in segregation measures is both multiplicative and additive, such that the bias depends on the true segregation level whether comparing raw or proportional differences.

The bias-corrected estimator we describe here brings us much closer to yielding approximately unbiased estimates in each of these cases. Researchers can use it to make valid comparisons when true segregation, sample sizes, unit-level volatility, and reliability differ. In the case of national income segregation trends, this approach allows us to compare income segregation across metro areas and years with different sample sizes, levels of true segregation, neighborhood-level income volatility, and metro-level income volatility. It would also allow comparisons across metro areas and years with different reliabilities of income if such information were available. Though our estimator assumes a large number of units and a large population, researchers can use the adjustments in online appendix B to estimate segregation in smaller populations.

Bias-corrected estimates will not be accurate in all cases. Our estimates are based on a simplified data-generating model with assumptions that may be violated. Income may not be log-normal, and self-reported income may have nonclassical reliability (Kim and Tamborini 2012; Pedace and Bates 2000). Neighborhood income may have time trends within pooling periods, in which case one would need more information, particularly regarding the noisiness of the trends, to avoid overcorrecting for pooling bias (see online appendix C). We also assume unweighted sampling; scholars with access to sampling weights would need to integrate the adjustments in Logan et al. (2020) with ours, though doing so is straightforward.

Income segregation is surprisingly difficult to measure from decennial census or ACS data. In this article, we partially address the challenges of measuring it. We formalize three important threats to segregation measures. Note that these sources of bias pertain equally to aggregated, publicly reported census and ACS data and to household-level census and ACS microdata. Given these sources of bias, we provide a strategy for estimating segregation when income and bias are well-behaved and income reliability is known. Our new estimates of the national income segregation trend demonstrate that previous estimates of both the levels and trends in income segregation were severely biased. That said, there remains some uncertainty about the actual trend, because our estimates rely on simplifying assumptions about the shape of income distributions, the independence of neighborhood income means from different years, and estimates of the reliability of self-reported income. Violations of these assumptions or variation (over time or place) in the reliability of self-reported income might bias the trends further in ways we have not been able to correct for with the information currently available. Nonetheless, we hope the groundwork laid here improves our understanding of the measurement and trends in income segregation.

## Acknowledgments

We thank Ann Owens for providing the necessary code to match our sample to past studies. We received helpful comments from discussants and audience members at the Population Association of America and American Educational Research Association conferences. Leung-Gagné acknowledges support from the Institute of Education Sciences (R305B140009).

## Notes

^{1}

We think of the $T$ years as a random draw of years from some long time period such that the expected value of $\Delta_T$ is $\left(\frac{T-1}{T}\right)\Delta$. Note that while any specific period of $T$ consecutive years might have substantially different variance than the long-run average variance of a set of random years, in practice we estimate $\Delta_T$ from computations of $\Delta$ over periods only slightly longer or shorter than $T$, depending on data availability.

^{2}

Note in the equation for $U$ that $\sigma_J$ is the variance of the annual metro mean of $v_{jt}$ over years.

^{3}

This is in accordance with the assumption that $J$ is sufficiently large to ignore sampling over finite $J$.

^{4}

Note that the ACS reports estimated population counts, which can be scaled to approximate sample bin counts by multiplying by the sampling rate. Alternatively, the standard errors on estimates using population bin counts can be scaled up by dividing by the square root of the sampling rate.

^{5}

We estimate a constant value of $\Delta_T$ owing to data constraints. Moving averages of $\Delta_T$ would be exceedingly noisy (see Figure D2 in the online appendix) and require substantial imputation to account for missing observations, most critically during 2005–2009, when we do not observe annual metro means.

^{6}

Of the 116 metro areas in our analytic sample, 103 appear in all 10 years. Two appear only in the first nine years and eight appear only in the first three years, so we estimate $\Delta_T$ from those years in these areas. Two are missing, so we impute $\Delta_T$ in these areas as the mean over the observed metros. Finally, the Philadelphia metro area appears in all 10 years but shows an implausible income decline from 2012 to 2013, so we estimate $\Delta_T$ separately using the 2010–2012 and 2013–2019 time spans and use the observation-weighted average of these estimates as $\Delta_T$ in our analyses.

^{7}

We can compute this ratio only in the $H$ and $R$ metrics, and it is similar across the two metrics, so the ratio we use for our adjusted estimates in the $S$ metric is the average of the $H$ and $R$ ratios.