## Abstract

Reports of rising income segregation in the United States have been brought into question by the observation that post-2000 estimates are upwardly biased because of a reduction in the sample sizes on which they are based. Recent studies have offered estimates of this sample-count bias using public data. We show here that there are two substantial sources of systematic bias in estimating segregation levels: bias associated with sample size and bias associated with using weighted sample data. We rely on new correction methods using the original census sample data for individual households to provide more accurate estimates. Family income segregation rose markedly in the 1980s but only selectively after 1990. For some categories of families, segregation declined after 1990. There has been an upward trend for families with children but not specifically for families with children in the upper or lower 10% of the income distribution. Separate analyses by race/ethnicity show that income segregation was not generally higher among Blacks and Hispanics than among White families, and evidence of income segregation trends for these separate groups is mixed. Income segregation increased for all three racial groups for families with children, particularly for Hispanics (but not Whites or Blacks) in the upper 10% of the income distribution. Trends vary for specific combinations of race/ethnicity, presence of children, and location in the income distribution, offering new challenges for understanding the underlying processes of change.

## Introduction

Evidence of increasing income inequality in the United States has heightened interest in the degree to which social classes separate into neighborhoods based on income. Several recent studies focusing on the post-2000 period have reported that income segregation is trending upward. For example, Bischoff and Reardon (2014:208) stated, “Socioeconomic residential sorting has grown substantially in the last forty years . . . and the bulk of that growth occurred in the 1980s and in the 2000s” (see also Florida and Mellander 2015; Fry and Taylor 2012). These reports have been questioned by the insight that the observed trends after 2000 are distorted by changes in census data collection. Logan et al. (2018) pointed out that the post-2000 income data, on which all recent measures are based, come from much smaller samples (less than 8%) in the American Community Survey (ACS) than were previously available from the decennial census (about 16%). They demonstrated that sampling at the census tract level results in an inherent upward bias in standard measures of income segregation and that this bias is greater when the sample size is smaller.

Bias due to limited sample size has two main implications for past findings. First, income segregation may erroneously have appeared to increase after 2000. Logan et al. (2018) offered a rough estimate that as much as one-half of the apparent increase in income segregation may be due to the Census Bureau’s new, smaller samples, an estimate seconded by a subsequent reanalysis by Reardon et al. (2018). Second, the same bias has greater effects on estimates of income segregation in racial/ethnic subgroups of the population because samples of African Americans and Latinos in particular are very limited in most census tracts. The much higher level and substantially greater increases of income segregation that have been observed among minorities compared with Whites since 2000 may be misleading as a result.

We confirm these insights but show that there is another equally important source of upward bias in segregation estimates stemming from the use of weighted sample data. Fortunately, more reliable measures can be calculated from the original unit-level household sample data collected by the Census Bureau, and we implement our proposed correction procedures for the decennial censuses of 1980, 1990, and 2000, and the ACS of 2007–2011 and 2012–2016. With these data, we provide unbiased estimates for measures based on both original incomes and their rank-ordered transformation (H, R, and NSI, as defined later). We report new findings on levels and trends in income segregation overall and for the top and bottom tenths of the population, with separate analyses for all families, families with children, and for racial/ethnic subgroups:

- Income segregation increased on all measures and for every category of families between 1980 and 1990. Most measures have been stable since 1990, and some have declined. Instead of explaining how increasing inequality translates into greater residential separation, researchers now need to understand why it may not.
- During this whole period, income segregation was not consistently higher for Black and Hispanic families than for White families. As is true for the whole population, we find that income segregation among all three groups increased in the 1980s, but on most measures not after 1990.
- Looking specifically at families with children, there were increases after 1990 for the total population and for all three racial/ethnic groups. Based on measures that focus on the top and bottom of the income distribution, results vary by group. Segregation increased for the lower 10% among White families with children but declined for the top 10%. Evidence is mixed for Black families with children. For Hispanic families with children, there was no change for the lower 10% but a substantial increase for the top 10%.

These findings bear directly on past observations and interpretations of trends in income segregation. Research on the 1980–1990 decade found growing income segregation and attributed it in part to increasing poverty in central cities and middle-class flight to the suburbs (Jargowsky 1996:996; see also Massey and Eggers 1993). Subsequent studies have emphasized how income inequality itself can translate into income segregation, building on evidence that income segregation was higher in metropolitan areas with greater disparities in income (Mayer 2001; Watson 2009). Reardon and Bischoff (2011) theorized that this is a result of three kinds of processes that *motivate and enable affluent families* to seek more exclusive locations while *limiting options for the poor*: (1) affluent residents’ preference for living with neighbors of similar class standing, (2) the advantages in terms of public services that accrue to higher-income communities with a stronger tax base, and (3) price competition in the housing market that raises prices and restricts access to such places. They reported that segregation of both affluent and poor families increased from 1970 through 2000 as income inequality rose (mainly in the 1980s). In later work, they found that both low-income and high-income segregation also increased after 2000 (Bischoff and Reardon 2014). Our findings contradict these interpretations because they do not apply generally to family households. Income segregation did not increase for families overall, nor did the separation of the lower or upper decile of families from others increase.

Owens (2016) called attention more specifically to families with children and suggested that the main driver of change comes from the upper end of the class structure. She argued that high-income and highly educated parents are becoming more conscious of the need to invest in their children’s futures through more selective residential choices, leading to “increased willingness to pay to live in an expensive area associated with greater opportunities for children; and higher home prices associated with high-quality schools” (2016:553). Consistent with this thesis, she reported that increasing income segregation since 2000 has occurred only among families with children. We do find increasing segregation among families with children but not specifically for either the top or bottom tenths of the income distribution.

Several studies also have examined the role of racial segregation, mainly by studying income segregation trends separately for Whites, Blacks, and Hispanics (Jargowsky 1996; Massey and Fischer 2003; Reardon and Bischoff 2011; Reardon et al. 2015; Watson 2009; Yang and Jargowsky 2006). Most attention has been given to the situation of African Americans, although similar reasoning can apply to Hispanics. A highly segregated racial minority might tend to live in mixed-class neighborhoods because of obstacles to residential mobility for those who are more affluent. However, if racial separation creates large Black districts in urban areas, lower- and higher-income households may cluster in separate class-based neighborhoods within them. This has been the pattern in major cities, like New York City and Chicago, since the early decades of the Great Migration (Logan et al. 2015). Would income segregation across tracts in this case be similar to, less than, or greater than income segregation among Whites?

Jargowsky (1996) reported that income segregation among both Blacks and Hispanics was greater and increased more in the 1970–1990 period. Using a different measure, Bischoff and Reardon (2014) reported that income segregation among Blacks and Hispanics was lower than among Whites in 1970 but (despite a decline in the 1990s) grew much faster over time, especially after 2000. These scholars attributed the rising segregation among Blacks to the modest relaxation of race-based segregation that has occurred since 1970. The logic is that the high segregation of Blacks on the basis of race may restrict housing options even for families that are more advantaged, limiting the separation between higher- and lower-income group members. Reduction in race-based segregation might then result in an exodus of the Black middle class from income-mixed neighborhoods and hence higher income segregation among African Americans. As Jargowsky (1996:993) stated, “greater social distance between [race] groups constricts the housing options available to all members of the lower-status group . . . and leads to lower economic segregation within the group.” But “[d]ecreases in racial segregation, whether spurred by changes in social distance, public policy, or other causes, should increase economic segregation as the artificial boundaries limiting housing options are removed.” This point is reminiscent of Wilson’s (1987) discussion of the Black middle-class exodus from poor inner-city neighborhoods (see also Reardon and Bischoff 2011:1106–1107).

These interpretations also need to be reconsidered because we find that income segregation among Blacks and Hispanics was not generally higher than among Whites. Further, it is also only among Hispanics (especially Hispanic families with children) that we find increasing separation of the top 10% of the group from others.

## Research Design

### Data Sources and Measures

We analyze confidential household records in a Research Data Center of the U.S. Census Bureau’s Center for Economic Studies. Following the lead of past studies, we focus on family income segregation, leaving aside issues associated with single-person and nonfamily households. The original records are from samples: the one-in-six long-form samples of the decennial census in 1980, 1990, and 2000; and the nearly 8% samples that result from pooling annual data from the ACS in 2007–2011 and again in 2012–2016. Family income is measured as the sum of income of all family household members from all sources. We apply household weights to these data as developed by the Census Bureau to correct for under- or overrepresentation of various population segments in the sampling process. The income data are not top-coded. In 1980, the Census Bureau protected privacy of personal information by suppressing income data in tracts with small populations, but there is no suppression in the files available to us. Since 1980, the Census Bureau has relied on data swapping to protect privacy. The general approach to swapping is to exchange the record for one person or household that has an uncommon set of personal characteristics with the record of a somewhat similar person or household in another nearby tract. The files available in the Federal Statistical Research Data Centers (FSRDC), like those in the public data, include such swapped records.

Results of analyses with these data are released only after a disclosure review by census professionals. We conduct analyses in all years for all 384 metropolitan regions using constant 2010 metro boundaries. We gained approval to report segregation measures for metros with more residents than the smallest state in each study year; in 1980, for example, the smallest state was Alaska, with 401,851 residents. These data are available from the DATA section of the Diversity and Disparities website at Brown University (https://s4.ad.brown.edu/projects/diversity/Data/data.htm).

Here we report average values of segregation measures for the 95 metros that met this criterion in every year. Averages are weighted by the number of families reporting income in a given metro, or by the number of White, Black, or Hispanic families for group-specific measures. In addition, we impose a metro sample size threshold of 100 unweighted cases for each category of families studied here. We also omit cases for a given category of families and year if the estimate is below 0, which occurs occasionally with sample sizes that are only modestly above 100. This reduces the number of metros for analysis especially for Hispanic families and Hispanic families with children. Excluding cases with such small samples has little practical effect because all results reported in our main findings are weighted averages.
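The averaging rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`estimate`, `families`, `n_sample`) and the example values are hypothetical, and the threshold is read as "at least 100 unweighted cases."

```python
def weighted_metro_average(metros, min_cases=100):
    """Average a segregation estimate across metros, weighted by family counts.

    Each metro is a dict with hypothetical keys:
      estimate : segregation estimate for one family category and year
      families : number of families reporting income (used as the weight)
      n_sample : unweighted sample cases for that category
    Metros below the sample threshold, or with a negative estimate, are dropped.
    Returns (weighted mean, number of metros kept).
    """
    kept = [m for m in metros
            if m["n_sample"] >= min_cases and m["estimate"] >= 0]
    total_families = sum(m["families"] for m in kept)
    mean = sum(m["estimate"] * m["families"] for m in kept) / total_families
    return mean, len(kept)


# Made-up values, for illustration only.
example_metros = [
    {"estimate": 0.10, "families": 1000, "n_sample": 500},
    {"estimate": 0.20, "families": 3000, "n_sample": 800},
    {"estimate": 0.50, "families": 10, "n_sample": 40},     # too few cases
    {"estimate": -0.01, "families": 500, "n_sample": 120},  # negative estimate
]
average, n_kept = weighted_metro_average(example_metros)
```

Because the weights are family counts, a dropped metro with few families would have contributed little to the average in any case, which is the sense in which the exclusions have little practical effect.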

For the calculation of race-specific measures, families are classified by the race and Hispanic origin of the household head. An advantage of access to the original sample data is that we can identify non-Hispanic Black families in every data file (the published tables include Hispanic Black families in the Black category). In addition, although published tables are for families whose heads are “Black alone” in 2000 and beyond (when multiple-race reporting was introduced), we can identify all who are “Black alone or in combination with another race.” This classifies African Americans in a way that is more consistent with the 1980 and 1990 reports, when only one race could be recorded.

We call attention to our use of information on unit-level family incomes, unlike studies that relied on tabulations of families in income categories (i.e., grouped data). Researchers have long been aware of the difficulties with using public data at the tract level. When income is reported in categories, the distribution of incomes within each category is unknown and must be estimated. This estimation is more difficult for the top category (because it has no upper bound) and the bottom category (where incomes may cluster close to zero). However, it is problematic in any category, especially when samples are smaller, because incomes are not smoothly distributed within categories. Even careful approximation of the underlying income distribution can yield distorted estimates. Reardon and collaborators (e.g., Reardon and Bischoff 2011) simplified the problem by estimating segregation measures after converting incomes into percentiles. The value of their preferred measure (H), which involves dividing the population into families above and below fixed points in the percentile distribution, can be calculated exactly for percentiles that coincide with the cutting points in the available grouped data. The value at other percentiles can be estimated by fitting a polynomial to the known points. If the full curve of values of estimated H at every percentile matches the estimated values from the unit-level household data, the overall value of H can be accurately estimated from it. Reardon et al. (2018:2138) argued that “[b]ecause there is no theoretical reason to expect systematic bias related to the binning of income data,” this procedure is unlikely to bias results.

Because we can replicate estimates using both grouped and ungrouped data, we are now able to assess how the use of grouped data can affect results. We present this analysis in section A of the online appendix. We find that grouped and unit-level data may generate similar results, but they do not always do so. Distortions are most likely for measures of the separation of the top or bottom income groups from all others. For studies that must rely on grouped data, therefore, our advice is to proceed with caution. In our study, as we explain shortly, we must rely on unit-level household income data to carry out the correction procedures to compensate for the bias in standard income segregation estimates.

### Measures of Income Segregation

We study several different measures of income segregation. These differ on whether they measure variation within and between tracts as entropy or as variance. Some of these are based on reducing the income distribution to a dichotomy and asking how segregated people in one income category are from all others. Bischoff and Reardon (2014) did this with a class of measures, H_{p}, and a related class of measures, R_{p}. This is similar to the approach of studies that divide the population into three categories and calculate a standard segregation index (the index of dissimilarity) between the bottom and top categories (the rich and poor) (e.g., Massey and Eggers 1993; Massey and Fischer 2003). Having transformed incomes into rank order, Reardon and Bischoff dichotomized the income distribution at a given percentile (*p*) and computed the segregation between income ranks above and below this point. Both H_{p} and R_{p} can be calculated at multiple cutting points, and Bischoff and Reardon focused particularly on the segregation of those below the 10th percentile from all others (H_{10} and R_{10}, segregation of the poorest) and those above the 90th percentile (H_{90} and R_{90}, segregation of the most affluent). H denotes their use of an information theory measure of segregation between the two categories, where the entropy within census tracts is compared with the total entropy in the population. R is based instead on variance within tracts in comparison to the total variance.
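To make the dichotomy-based measures concrete, the following minimal sketch computes an entropy-based and a variance-based index for a single cutting point, written in the usual "one minus within-tract over total" form. It assumes an unweighted full count of family incomes (no sampling, no weights); the function names and toy tracts are hypothetical, and the sketch is not the authors' implementation.

```python
import math


def binary_entropy(q):
    """Entropy (natural log) of a two-category split with share q in one category."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log(q) + (1.0 - q) * math.log(1.0 - q))


def dichotomy_indices(tracts, cutoff):
    """Return (H_p, R_p) for families below vs. at or above `cutoff`.

    `tracts` is a list of lists of family incomes, treated as an unweighted
    full count. The cutoff must split the pooled population (0 < q < 1);
    otherwise both denominators are zero.
    """
    total = sum(len(t) for t in tracts)
    q = sum(1 for t in tracts for y in t if y < cutoff) / total
    within_entropy = 0.0
    within_variance = 0.0
    for t in tracts:
        share = len(t) / total                      # tract share of families
        q_j = sum(1 for y in t if y < cutoff) / len(t)
        within_entropy += share * binary_entropy(q_j)
        within_variance += share * q_j * (1.0 - q_j)
    h_p = 1.0 - within_entropy / binary_entropy(q)
    r_p = 1.0 - within_variance / (q * (1.0 - q))
    return h_p, r_p


# Toy tracts: complete separation vs. identical tract compositions.
segregated = [[10, 20], [80, 90]]
mixed = [[10, 90], [20, 80]]
h_seg, r_seg = dichotomy_indices(segregated, cutoff=50)
h_mix, r_mix = dichotomy_indices(mixed, cutoff=50)
```

Both indices equal 1 when every tract is entirely on one side of the cutting point and 0 when every tract mirrors the overall split.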

Measures based on dichotomies do not make use of the full income distribution provided by the census. Four other measures do exploit the multiple and ordered category nature of the data. One, H^{R}, is built from the full set of rank-order measures H_{p}. As Bischoff and Reardon (2014:228) described, “if we computed the segregation between those families above and below each point in the income distribution and averaged these segregation values, weighting the segregation between families with above-median income and below-median income the most, we get the rank-order information theory index.” Its equivalent based on analysis of variance is R^{R}, built from the full set of rank-order measures, R_{p}. Another alternative is based on a partitioning of the variance in income without recoding incomes to ranks. This measure is the correlation ratio, which Jargowsky (1996) referred to as the Neighborhood Sorting Index (NSI). It is simply the square root of the between-tract variance in income divided by the total variance of income, a familiar statistic in analysis of variance.

In this study, we report estimates of both H and NSI. We also report an alternative version of R that we call R^{F}, which may be thought of as the NSI applied after the income data have been recoded to quantiles. It uses the same formula as the R_{p} except that the {0,1} index of whether the quantile of income is less than or greater than *p* is replaced with the quantile itself. An attraction of this measure is that there is a convenient and intuitive way to construct small-sample bias corrections for it, as explained later. Henceforth, for notational convenience, we use H to denote H^{R} and R to denote R^{F}.
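The contrast between NSI and the quantile-based R can be illustrated with a short sketch. It assumes unweighted full-count data, assigns quantiles by mid-rank in the pooled distribution, and returns the quantile-based measure as a variance ratio (following the R_{p} formula) while taking the square root for NSI. Whether a square root is applied to the quantile-based measure is not stated in the text above, so that choice is an assumption, as are all function names.

```python
from bisect import bisect_left, bisect_right


def variance_ratio(tracts):
    """Between-tract variance divided by total variance (unweighted full count)."""
    values = [y for t in tracts for y in t]
    n = len(values)
    mean = sum(values) / n
    total = sum((y - mean) ** 2 for y in values) / n
    between = sum(len(t) * (sum(t) / len(t) - mean) ** 2 for t in tracts) / n
    return between / total


def to_quantiles(tracts):
    """Recode each income to its mid-rank quantile in the pooled distribution."""
    pooled = sorted(y for t in tracts for y in t)
    n = len(pooled)

    def quantile(y):
        return (bisect_left(pooled, y) + bisect_right(pooled, y)) / (2 * n)

    return [[quantile(y) for y in t] for t in tracts]


def nsi(tracts):
    """Correlation ratio on income levels: square root of the variance ratio."""
    return variance_ratio(tracts) ** 0.5


def r_f(tracts):
    """Variance ratio computed on quantiles instead of income levels."""
    return variance_ratio(to_quantiles(tracts))


# Made-up incomes: the second example only stretches the top income.
example = [[10, 20], [30, 1000]]
stretched = [[10, 20], [30, 100000]]
```

In this toy case, `r_f` is identical for `example` and `stretched` because the rank order is unchanged, whereas `nsi` shifts with the size of the top income. That is the practical difference between measures based on original incomes and those based on their rank-ordered transformation.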

### Biased Estimates and Their Correction

Our findings rely on progress in identifying and correcting for biases and inaccuracies that have distorted prior studies without being recognized, and also on access to original sample data available only at an FSRDC. We consider two issues: (1) income data at the tract level are based on relatively small samples, and (2) the underlying unit-level data generally have sample weights.

#### Correcting Bias Related to Sample Size

[…] H_{10}, H_{90}) that draw solely on knowledge of the tract-level sample sizes and tract population counts. The approximate bias for the entropy-based measure of income segregation in the case of an unweighted sample is

[…]

where *M*_{j} and *M* are the tract-specific and total metro population, respectively; and *N*_{j} is the tract sample size.^{1} Here the subscript *bu* refers to the “uncorrected (biased), unweighted” estimate of H, and the subscript *cu* refers to the “count-corrected, unweighted” estimate. Recall that *H*_{bu} is a segregation index that depends on the percentage of households in the sample from each tract sample that is below each percentile in the combined sample of all tracts. Let us call the difference between *H*_{cu} and *H*_{bu} the “count-based correction” because it depends only on the sample counts as proposed by Logan et al. (2018). Entropy estimates for points in the income distribution, such as H_{10}, the segregation of households in the lowest 10% of the distribution, have a closely related correction factor.^{2}

Logan et al. (2018) also proposed an approach, sparse sample variance decomposition (SSVD), to correct the partitioning of variance within and between tracts using either the original interval-scale measure of income or a rank-order measure, which then allows for estimates of NSI or R. This is possible because (1) the income variance within tracts can be estimated from samples of any size without bias, and (2) the population-weighted average of the variance estimates for each tract from the sample converges to the within variation for the population as the number of tracts gets large. The total variance in the metropolitan area is estimated from a very large sample, and the between-tract variance is simply the difference between the total and within variance. We refer readers to the original article for details of the SSVD procedure (Logan et al. 2018).
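The logic of this decomposition can be sketched as follows. The sketch follows only the verbal description above (unbiased within-tract variances, a population-weighted average of them, and between-tract variance obtained as a residual) and should not be read as the published SSVD procedure. The function names, the expansion weights *M*_{j}/*N*_{j}, and the toy data are assumptions, and household sample weights are ignored.

```python
import math
from statistics import variance  # sample variance with an n - 1 denominator


def ssvd_style_nsi(tracts):
    """NSI with between-tract variance taken as total minus within variance.

    `tracts` is a list of (tract population M_j, list of sampled incomes).
    Each tract needs at least two sampled households.
    """
    total_pop = sum(m for m, _ in tracts)
    # Expand each sampled household by M_j / N_j to estimate metro totals.
    cases = [(m / len(ys), y) for m, ys in tracts for y in ys]
    mean = sum(w * y for w, y in cases) / total_pop
    total_var = sum(w * (y - mean) ** 2 for w, y in cases) / total_pop
    # Population-weighted average of unbiased within-tract variance estimates.
    within_var = sum(m * variance(ys) for m, ys in tracts) / total_pop
    between_var = max(total_var - within_var, 0.0)
    return math.sqrt(between_var / total_var)


def naive_nsi(tracts):
    """NSI computed directly from the variance of sampled tract means."""
    total_pop = sum(m for m, _ in tracts)
    cases = [(m / len(ys), y) for m, ys in tracts for y in ys]
    mean = sum(w * y for w, y in cases) / total_pop
    total_var = sum(w * (y - mean) ** 2 for w, y in cases) / total_pop
    between_var = sum(m * (sum(ys) / len(ys) - mean) ** 2
                      for m, ys in tracts) / total_pop
    return math.sqrt(between_var / total_var)


# Two equal-sized tracts with two sampled incomes each (made-up values).
toy = [(100, [0, 2]), (100, [4, 6])]
corrected = ssvd_style_nsi(toy)
uncorrected = naive_nsi(toy)
```

In the toy data, the variance of sampled tract means absorbs sampling noise, so the uncorrected value exceeds the residual-based one. The gap shrinks as tract samples grow.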

None of these methods address the risk that when there is only one sample, it is subject to sampling variation that is inherently greater when samples are smaller. However, our analysis (Logan et al. 2018) of many sample draws from a 100% transcription of incomes from the 1940 census of the population in Chicago shows that these methods do yield unbiased estimates of income segregation, whether based on H, R, or NSI.
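The sample-size effect at issue can be illustrated with a small simulation on synthetic data. This is not the 1940 Chicago analysis: the population below is hypothetical, every tract is drawn from the same normal distribution so that true between-tract differences are negligible, and the two sample sizes (64 and 32 of 400 households per tract) are chosen only to mimic sampling rates of roughly 16% and 8%.

```python
import random


def between_share(groups):
    """Naive between-group share of total variance (no bias correction)."""
    values = [y for g in groups for y in g]
    n = len(values)
    mean = sum(values) / n
    total = sum((y - mean) ** 2 for y in values) / n
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups) / n
    return between / total


def simulate(sample_size, n_tracts=200, tract_pop=400, seed=0):
    """Return (full-population share, sample-based share) for one sample draw."""
    rng = random.Random(seed)
    # All tracts share one income distribution, so segregation is near zero.
    tracts = [[rng.gauss(50_000, 15_000) for _ in range(tract_pop)]
              for _ in range(n_tracts)]
    samples = [rng.sample(t, sample_size) for t in tracts]
    return between_share(tracts), between_share(samples)


full_share, share_16pct = simulate(64)   # about a 16% sample of each tract
_, share_8pct = simulate(32)             # about an 8% sample of each tract
```

With the same underlying population, the uncorrected between-tract share is larger in the sample than in the full population, and larger still at the lower sampling rate. This is the pattern that the count-based corrections are meant to remove.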

#### Correcting for Weighted Sample Data

A final step that we take is to show analytically and empirically that weighting of sample data by the Census Bureau also introduces bias. Then we offer an approach to estimate and correct for this weighting-induced bias. Unlike the data for the full population in 1940 on which Logan et al. (2018) relied to validate their sample-count bias correction, the contemporary data are weighted. This is problematic because, as we will show, heterogeneity in weights alters the precision of estimates of the dispersion in tract income. As a result, bias corrections for these measures must also account for weighting. In the following section, we develop this point theoretically and present alternative measures that incorporate weighting.

Let us first consider entropy-based measures (H). In the case of unweighted observations, bias depends only on the sample size. However, the *effective* sample size for the computation of variance of an estimator is smaller when weights are variable than when weights are uniform (e.g., all cases have a weight of 1). To get some sense of this effect, suppose we have a population of 3,200 with income variance *v* that is randomly divided into two equal-sized subpopulations A and B. Then the variance of the estimated mean for a 10% sample of 320 households is *v* / 320. If the sampling rate for A is reduced to 6.25% (1/16), then the sampling rate for B must be raised to 25% to achieve the same variance. This change results in a total sample size of 500 rather than 320.^{3} The sample must be 56% larger because of the heterogeneous weights.
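The arithmetic of this example can be checked directly. The sketch below computes the variance of the estimated mean (in units of *v*) for both designs and, as a cross-check, Kish's approximation to the effective sample size, (∑*w*)² / ∑*w*². Kish's formula is a standard survey-sampling result brought in here for illustration and is not taken from the text. The function names are hypothetical, and the finite population correction is ignored, as in the example.

```python
from fractions import Fraction


def variance_of_mean(strata):
    """Variance of the stratified mean, in units of the common variance v.

    `strata` is a list of (stratum population, stratum sample size) pairs.
    """
    total_pop = sum(pop for pop, _ in strata)
    return sum(Fraction(pop, total_pop) ** 2 / n for pop, n in strata)


def kish_effective_n(weights):
    """Kish's approximate effective sample size for unequal weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Uniform 10% sampling: 160 households from each half of the population.
uniform = [(1600, 160), (1600, 160)]
# Unequal rates: 1/16 of A (100 households) and 1/4 of B (400 households).
unequal = [(1600, 100), (1600, 400)]

var_uniform = variance_of_mean(uniform)
var_unequal = variance_of_mean(unequal)

# Weights are inverse sampling rates: 16 for A households, 4 for B households.
weights = [16] * 100 + [4] * 400
effective_n = kish_effective_n(weights)
relative_size = Fraction(len(weights), 320)   # actual vs. uniform-design size
```

Both designs give a variance of *v*/320, the 500 unequally weighted cases have an effective sample size of 320, and the ratio 500/320 is the 56% increase noted in the text.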

[…] *p*_{j} in a given tract *j* with income below a given level using a weighted sample of given size:

[…]^{4}

We also have to assume that household weights are independent of income. Without this assumption, the bias correction will depend in general on the unknown true tract fraction, *p*_{j}, and thus will not be feasible with sampled data. With this assumption, the bias in the entropy for a weighted sample is

[…] *H*_{gw}. Henceforth, unless further clarification is required, we will call *H*_{gw} the “corrected” estimate, and designate the estimate that corrects only for sample counts as the count-corrected estimate. *H*_{bw} is the uncorrected (biased) estimate calculated using weighted data.

In fact, the Census Bureau generally assigns larger weights to lower-income families. We examined the impact of this correlation on our estimation procedure in two ways. First, we explored this issue analytically in a simplified two-strata sample population. This thought experiment (available from the authors on request) suggests that our proposed expressions will be useful as long as the covariance of weight and income within tracts is small relative to the variation in income within tracts. Second, we validated our estimation procedures with 1940 data in which we introduced weights and where the 100% population measure of segregation is known. We report these analyses in detail in section B of the online appendix. We first assigned weights to the 1940 microdata in accordance with a multilevel model predicting weights in the Chicago metro in the ACS 2008–2012. This model shows that the relationship of household income to weight is small (*b* = –.0305) but statistically significant. We then compared the true value of H, R, and NSI with the estimated value based on our approach to correcting for sample counts and for weighting. These analyses demonstrate that estimates are affected by both sample-count and weight-related bias and also that our proposed alternative measures correct for both types of bias.

[…] *u*_{ij} nonzero and ∑ *u*_{ij} = 0. Thus,

[…]

where the subscript *w* indicates that this measure of the NSI uses weights, and the subscript *b* indicates that it is not corrected for sampling bias. To construct an unbiased estimator, we use the fact that the within- and across-variation sum to the total variance. Again, incorporating the assumption that the weights are uncorrelated with income (see footnote 4) for the purpose of bias adjustment, the unbiased estimate of the within variance is

[…]

where the subscript *g* indicates, as before, that this measure has been count-and-weight corrected. As with *H*_{gw}, this will be called the “corrected” estimate, unless there is a need for further clarification. Note that in the absence of variation in household weights, *w*_{ij} = 1/*N*_{j}, and both expressions reduce to the corresponding expressions in Logan et al. (2018), so the count-corrected and corrected estimates will be the same.

[…] *NSI*_{bw} nor to *NSI*_{gw}. They are based on grouped data, which implicitly incorporate sample weights but correct only for the sample size. They do not correct for the effective sample size, which is lower than the actual sample size because of the differential weights. Formally, those estimates are

[…]

where the subscript *c* indicates the estimate is corrected for counts but not for the effective sample size. Note that the only difference between *NSI*_{gw} and *NSI*_{cw} is the replacement of the $\sum_i w_{ij}^2$ term in the numerator in the former with 1/*N*_{j} in the latter. Because $1/N_j < \sum_i w_{ij}^2$, as before, it follows that *NSI*_{cw} > *NSI*_{gw}. The corrected *NSI* should be lower than the corresponding figures that correct only for sample size counts. The same approach can be used to produce corrected measures of R by replacing *y*_{ij} with its percentile in the population distribution and to estimate, for example, R_{10} by replacing *y*_{ij} with an indicator of whether income is above the 10th percentile in the population.^{5}

#### Consequences of Count and Weight Correction

A way to summarize the impact of the two forms of bias discussed earlier is to show how they affect estimates of change in income segregation over time. We do this in Fig. 1, which plots the estimated change in one segregation measure for all families, H, between 2000 and 2007–2011 using household-level income data in the FSRDC and applying our methods of correction. The figure displays three estimates for every one of the 95 metros studied here: (1) the change in the uncorrected estimate of H, (2) the count-corrected estimates, and (3) the final estimate that incorporates corrections for both sample counts and weighting. The horizontal axis arrays metros according to the change in the final estimate. Thus, the dots along the 45-degree straight line represent the corrected estimates of the change. The average values of H (see upcoming Table 1) are roughly .123, with a standard deviation of .026. Figure 1 shows that most metros experienced change within a range of -.01 to +.01, with the average change close to 0.

We compare corrected and uncorrected estimates in the following way. The plus signs in the figure represent the original uncorrected estimates of change in H. The vertical distance between a plus sign and the corresponding corrected value reveals the total bias from both sources for this metro. In every case, the bias is positive: the uncorrected estimates show more increase in H than do the corrected estimates. In many cases where H actually declined, the uncorrected value of H increased. Where H increased, the uncorrected value increased more.

The figure also shows (as hollow circles) the estimates after we correct only for the reduced sample count in the post-2000 data. These values are intermediate between the corrected and uncorrected estimates, but they are also in a positive direction in every case.

We make three observations about these results. First, the count-alone corrections address only about 60% of the bias in the raw estimates. The (unweighted) mean bias correction only for counts is .0068, and the mean total bias is .0114. Thus, estimates of the change in segregation by Logan et al. (2018) and Reardon et al. (2018) that corrected only for sample counts still overstated the growth in income segregation over this interval. Second, the three groups of points are roughly parallel. This indicates that the ranking of changes in segregation estimates is not substantially affected by the process of bias correction. Third, the fraction of estimates lying above 0 *is substantially affected* by the process of bias correction. Although 93% of the uncorrected observations lie above 0 (the dotted line), only 56% of the count-corrected observations do. Put another way, all the uncorrected observations in the northwest corner of the graph are misclassified as having growing income segregation estimates even though the corrected estimates show decreased segregation.
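The "about 60%" figure follows directly from the two reported means, as the short check below shows. The variable names are hypothetical; the two input values are those given in the text.

```python
mean_count_correction = 0.0068   # reported mean correction for sample counts only
mean_total_bias = 0.0114         # reported mean bias from both sources combined

# Share of the total bias that the count-only correction removes.
count_share = mean_count_correction / mean_total_bias

# Remainder attributable to weighting, left in place by count-only corrections.
weight_component = mean_total_bias - mean_count_correction
```

The count-only correction therefore removes roughly three-fifths of the mean bias, leaving a weighting-related component of about .0046 in estimates that adjust for sample counts alone.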

## Results: Uncorrected and Corrected Estimates

Let us now summarize our methodological conclusions. Relying on grouped income data introduces errors in estimation of several standard measures of income segregation. For this reason, it is preferable to work directly with the original unit-level household data that are accessible in the FSRDC. There is systematic bias associated with the size of samples and with reliance on weighted data (all census and ACS sample data are weighted). These biases can support incorrect conclusions about trends in segregation, but they can be reliably estimated for every one of the income segregation measures that we consider here. We implement these corrections using the unit-level income data in the FSRDC. Here we present the results for all years between census 1980 and ACS 2012–2016 for family households of different types.

The average values (weighted by the number of households) of the largest metropolitan regions are reported in Tables 1, 2, 3, 4, 5, and 6.^{6} Each table includes the uncorrected and corrected values calculated from unit-level family-household data in order to gauge how the bias corrections have altered the observed results. Although we would expect sampling variation to affect estimates for any given metro, we are confident that the average across all large metros is close to the true value. Tables 1 and 2 present results separately for all families and for families with children, providing a test of the influence of children on locational choices. Table 1 presents the overall summary measures across the entire income distribution (H, R, and NSI). Table 2 presents the measures corresponding to the separation of the bottom tenth (H_{10} and R_{10}) and top tenth (H_{90} and R_{90}) of families. Subsequent tables offer parallel sets of results for White, Black, and Hispanic families and for families with children of each specific racial/ethnic group.
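As an illustration of the averaging described above, the sketch below computes a mean weighted by the number of families and compares it with the unweighted mean. The three metro values and family counts are hypothetical and are not taken from the tables; the function name is ours.

```python
def weighted_mean(values, weights):
    """Return the mean of `values`, weighting each by the matching entry in `weights`."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical corrected H values and family counts for three metros.
metro_h = [0.10, 0.14, 0.12]
metro_families = [400_000, 100_000, 250_000]

weighted = weighted_mean(metro_h, metro_families)
unweighted = sum(metro_h) / len(metro_h)

print(f"family-weighted mean H: {weighted:.3f}")
print(f"unweighted mean H:      {unweighted:.3f}")
```

Weighting by families pulls the average toward the values of the largest metros, so the statistic describes the experience of the average family rather than the average metro.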

### All Families and Families With Children

Table 1 replicates findings in previous studies that showed a spike in income segregation between 1980 and 1990 for all measures and both types of families. In these decades, when the decennial census provided a full one-in-six sample of income data in every year, the uncorrected estimates were higher than the corrected estimates, but both increased substantially. Note that if we rely on the uncorrected measures, income segregation for all families appears to have increased again between 2000 and 2007–2011 and then stabilized. The corrected values show that neither H nor R increased after 1990, and NSI vacillated (down by 2000, then up, then down again). By these measures, the general rise in income segregation that has previously been reported did not occur.

However, a different result is found for families with children. For these families, the corrected measures show that income segregation continued to rise in each interval through 2012–2016. This result is consistent with the trend reported by Owens (2016), although the magnitude of these gains is much reduced after correction. For example, the uncorrected H for families with children increased from .170 in 1990 to .215 in 2012–2016 (up .045), but the corrected H rose far more slowly, from .156 to .176 (up .020, about one-half as much).

Table 2 focuses on the upper and lower ends of the income distribution, relying on the dichotomies of the upper (or lower) 10% of the population versus all others to provide more detail about the patterns of change. Let us focus first on the actual trends as reflected in the corrected values. In the 1980s, when overall income segregation was rising strongly, segregation of the poor and segregation of the affluent both rose substantially, as measured by either H or R. Levels of segregation and increases were higher for families with children than for all families. After 1990, the levels stabilized or declined.

For all families, H_{10} and R_{10} (segregation of the lower tenth) dropped during 1990–2000. H_{10} and R_{10} then stabilized or continued to decline through 2012–2016. At the end of these years, these measures were actually lower than they had been in 1980.

Again looking at all families, H_{90} and R_{90} (segregation of the upper tenth) stabilized or declined slightly through 2012–2016, but the final levels remained higher than in 1980.

Trends are somewhat different for families with children. Both H_{10} and R_{10} declined steadily after 1990, but H_{90} and R_{90} rose again during 1990–2000 and then stabilized.

These patterns of change in Tables 1 and 2 challenge recent interpretations. Based on the uncorrected estimates, one could describe a fairly steady rise in overall income segregation (Table 1) that coincides with rising income inequality. The upward trend appeared to be most striking and consistent from decade to decade for families with children (also Table 1). Then turning to Table 2, rising segregation seems especially clear for affluent families with children. These trends could be interpreted in terms of the motivations and behaviors of parents whose locational decisions increasingly seek advantaged communities for their children—especially affluent parents—which is Owens’ (2016) interpretation. However, the corrected results do not fit this narrative as well. After 1990, H, R, and NSI continued to increase for families with children, although not as much as previously reported. Yet this post-1990 trend does not appear either for segregation of the poor or segregation of the affluent families with children. From these results, we infer that the locational shifts evident in Table 1 were occurring more toward the middle of the income distribution.

### Race-Specific Patterns

We turn now to findings for Whites, Blacks, and Hispanics. In the previous tables, some portion of income segregation was due to racial/ethnic segregation given that Black and Hispanic families have lower incomes than White families. The race-specific measures in the following tables consider each group separately, so they measure the degree to which White (or Black or Hispanic) families are segregated by income from other White (or Black or Hispanic) families. In these tables, the mean values and standard deviations of segregation estimates are group-specific, and the measures of segregation of affluence and poverty refer to the top and bottom tenths of that group’s income distribution.

Because many census tracts have few Black or Hispanic residents even in metros with large minority populations, we expected bias corrections for these groups to be especially large, particularly after 2000. An example is provided in Fig. 2, which displays the uncorrected and corrected estimates of H for Whites and Blacks. The uncorrected estimates are upwardly biased, much more so for Blacks than for Whites. After 2000, the corrected value of H for Whites declines modestly, whereas the uncorrected estimate increases. Among Blacks, the corrected value of H remains nearly unchanged, whereas the uncorrected value spikes remarkably from .128 to .173, equivalent to nearly 2.0 standard deviations. This discrepancy leads to widely divergent conclusions about these trajectories. From the uncorrected data, it appears that income segregation among Blacks was much higher than among Whites; and whereas the increase after 2000 among Whites was mild, that among Blacks was enormous. After correction, we conclude that income segregation among Blacks was only modestly higher than among Whites, and both remained rather stable post-2000 after rising in the 1980–1990 decade.

Table 3 presents the full set of values of H, R, and NSI for the three groups. The upward bias in uncorrected estimates of H found in Fig. 2 for Whites and Blacks is replicated for R and NSI, as is the post-2000 spike for Blacks. In these respects, the results for Hispanics follow the same pattern. Now let us focus on the trends revealed by the corrected measures. For every group and every measure (with a small inconsistency for Hispanic NSI), there were substantial increases between 1980 and 1990, as we found previously for the total population. If we then compare the 1990 value with the final value in 2012–2016, we do not find consistent increases:

For Whites, H declined from .095 to .090, R declined from .170 to .158, and NSI declined from .136 to .132.

For Black families, H declined from .108 to .100, R declined from .186 to .175, and NSI declined from .127 to .108. (In this case, however, NSI fluctuated, rising in 2007–2011 before dropping again. We cannot account for this inconsistency.)

For Hispanic families, H remained at .091, after dropping from 1990 to 2000 and then rising back to the 1990 level. R remained at .159, also after dropping from 1990 to 2000 and then rising back to the 1990 level. NSI also fluctuated, but this is the one case where NSI ended up higher in 2012–2016 than it had been in 1990.

From these findings, we can conclude that previously reported results for these groups overstated the differences between Whites and Blacks/Hispanics. Income segregation was somewhat higher among Blacks than among either Whites or Hispanics. Previous reports also overstated the tendency for income segregation to rise for any of them. In fact, income segregation among White and Black families declined after 1990, and income segregation among Hispanic families was the same in 2012–2016 as in 1990 for H and R.

Table 4 repeats our analysis of segregation of the affluent and of the poor for all families in each group, reporting trends for the lower-income (H_{10} and R_{10}) and upper-income (H_{90} and R_{90}) segments. Because there are so many comparisons to make in this table, we do not discuss the uncorrected measures; we report them here only for reference. Consistent with Table 3 for the overall income segregation measures, the segregation of both poverty and affluence increased from 1980 to 1990 for all three groups. We find the following after 1990:

The average levels of all these measures at the ends of the income distribution were stable (for the bottom 10%) or declining (for the upper 10%) for White families.

For Black families, there was some decline for lower-income families, more clearly for H_{10} than for R_{10}, and also for affluent families, more clearly for H_{90} than for R_{90}.

For Hispanic families, little trend is evident for poor families, but income segregation as measured by H_{90} and R_{90} increased substantially among the affluent.

In relation to previous reports, the main conclusion from Table 4 is that instead of a generalized increase in segregation of either the affluent or the poor after 1990, there was a decline for Whites and Blacks and an increase only for the higher-income segment of Hispanic families.

As a final step, we report group-specific results for families with children in Tables 5 and 6. Again, we focus only on the corrected measures. Recall that we found evidence of increasing income segregation for families with children in Table 1 based on measures for the full income distribution (H, R, and NSI) but not for the upper and lower segments (H_{10}, R_{10}, H_{90}, and R_{90}). Is there, however, a tendency for increasing segregation for families within racial/ethnic groups?

As shown in all the tables to this point, segregation rose from 1980 to 1990. With respect to the full income distribution after 1990 (measured by H, R, and NSI), the answer is mixed:

After 1990, there is little trend among White families with children. H rose in the 1990s (from .118 to .127) but then stabilized. R also rose from .207 to .218 in the 1990s and then again to .223 in 2012–2016. NSI rose from .166 to .180 in the 1990s but then declined to .177 by 2012–2016.

There is a more substantial upward trend for Black families with children, shown most clearly in the increases after 2000. For example, after declining in the 1990s, NSI rose from .124 in 2000 to .170 in 2012–2016.

There is a similar upward trend for Hispanic families with children, with segregation reaching its lowest level in 2000 and then rising strongly after that time.

Trends for the poorest and most affluent families with children are reported in Table 6. For all groups and measures, there was a strong increase in the 1980s. Here again the clearest evidence of rising segregation among families with children is among Hispanics, specifically as regards the separation of the most affluent Hispanics from others.

For Whites, both H_{10} and R_{10} declined slightly in the 1990s but then rose moderately after 2000. In contrast, both H_{90} and R_{90} declined after 2000.

For Blacks, H_{10} declined after 2000, while R_{10} remained stable. H_{90} and R_{90} changed little after rising in the 1990s.

For Hispanics, H_{10} and R_{10} were stable after 1990, ending at about the same level as they began. However, both H_{90} and R_{90} had strong upward trajectories through this whole period, starting in 1980 and continuing through 2012–2016. H_{90} rose consistently from .092 in 1980 to .137 in 2012–2016, and R_{90} rose from .073 to .117.

## Discussion and Conclusion

This study contributes to two kinds of goals. One is substantive: to document trends in income segregation in U.S. metropolitan areas since 1980. We compare patterns between all families and families with children, and among White, Black, and Hispanic families, and look separately at trends for the upper and lower tails of the income distribution. The other purpose is methodological. We call attention to the effects of stratified sampling on measures of spatial inequality, and especially to the problems associated with the shift in data sources from the long-form samples of decennial censuses to the smaller samples of the ACS.

### Substantive Findings

We find that social scientists cannot rely on published tract-level data to discover the real levels and trends in income segregation. After recognizing and seeking to take into account the upward bias associated with smaller samples after 2000, two research teams (Logan et al. 2018; Reardon et al. 2018) estimated that the post-2000 increase was about one-half of what had previously been reported. We take advantage of confidential census data files at the individual family level, obviating the need to interpolate income distributions within the categories that are used in public tract data and allowing us to account for variance in sampling probabilities. The results show that not only have increases in income segregation been overstated in past studies, but for several categories of families, there was no change or actual declines after 1990.

An important caveat is that removing the bias associated with sample size does not solve all concerns with sampling variability. Current ACS data come from a single sample that can be very limited in many census tracts, especially for subgroups of the population. Any income segregation measure aggregated up from tract-level distributions can be no more than an estimate of the actual population value. Researchers should be cautious in interpreting results for any single metropolitan area given the possibility that the estimate in a given year is too large or too small and that observed changes over time reflect sampling variability rather than real change. Nevertheless, we have high confidence in the average value of segregation over many metros because in that case, random sampling errors—positive and negative—can be expected to cancel each other out.

After surging in the 1980s, family income segregation has undergone some ups and downs but has not increased, and it has declined for families headed by non-Hispanic Whites and for affluent White families. Segregation for families with children, who have been described as especially conscious of the advantages of moving to places with more resources, continued the trend toward higher levels on H, R, and NSI through 2012–2016. However, segregation of affluent families with children was very stable. This result undermines the interpretation that changes for families with children result from the most advantaged parents seeking special place-based advantages for their children. The finding that this measure of segregation of affluent families with children increased for Hispanics but not for Whites (who presumably have the most locational options) points to a more group-specific process.

Summary statistics like these do not reveal who is moving to more separate neighborhoods at each time point, and we cannot draw strong conclusions from them about the processes at work within metropolitan neighborhoods. The general conclusion is that rather than focusing on why income segregation seems to be rising in parallel with growing income inequality, scholars need to give more attention to why it may not. There are many directions to look. In the post-2000 period, one might consider the possible effects of the Great Recession and foreclosure crisis that occurred in the middle of the 2007–2011 ACS period. As income inequality continued to rise, many people lost jobs, lost their homes, and were forced to postpone moves by changing mortgage requirements, and there was a temporary steep decline in the value of nonhome assets held by the most affluent households. We are not in a position to fit these pieces together into a coherent narrative, and this remains a challenge for future research.

Our findings for Black and Hispanic families are intriguing in light of expectations that even a modest opening up of opportunities in the housing market might motivate and enable some minority families to seek more advantaged neighborhoods. We find such a pattern for Hispanics but not so clearly for Blacks. The hypothesis of an exodus of more affluent minorities from income-diverse neighborhoods after 1980 needs a more direct test through analyses of residential mobility.

### Implications of Bias From Smaller Samples

This research adds to concerns that others have expressed about the use of tract-level data from the ACS. The Census Bureau has attempted to educate users on the potentially large sampling variation in point estimates (such as the median value of income or the percentage of residents born abroad) for census tracts, and it now routinely disseminates measures of standard errors around these estimates. Fortunately, researchers have begun to notice these standard errors, and the point estimates are unbiased. That is, they may be far from the population value in a given tract, but they will cluster randomly around the true value. We draw attention to a different phenomenon associated with sampling variation. Standard measures of spatial inequality, such as the measures of income segregation analyzed here, have an inherent upward bias when based on samples, and the bias is greater when the sample size is smaller and where sampling is stratified. This is why income segregation was observed to increase again for all families after 2000 after seeming to moderate in the 1990s. It is also why differences in levels and trends between Whites and minorities were especially exaggerated after 2000.
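The upward bias described above can be illustrated with a small simulation. The sketch below is not the program used in the FSRDC; it assumes a hypothetical metro of 500 equally sized tracts in which exactly half of all families fall below an income threshold in every tract, so the true value of a two-category H is zero. The per-tract sample sizes of 160 and 80 are illustrative stand-ins for a larger and a smaller sample, and families are sampled with replacement.

```python
import math
import random


def entropy(p):
    """Binary entropy (base 2) of a share p; defined as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def theil_h(tract_shares):
    """Two-category H for equally sized tracts, given each tract's share below the threshold."""
    overall = sum(tract_shares) / len(tract_shares)
    e_total = entropy(overall)
    if e_total == 0.0:
        return 0.0
    e_within = sum(entropy(p) for p in tract_shares) / len(tract_shares)
    return 1 - e_within / e_total


def sampled_h(true_share, n_tracts, sample_size, rng):
    """Estimate H from an independent sample of `sample_size` families in each tract."""
    shares = []
    for _ in range(n_tracts):
        below = sum(rng.random() < true_share for _ in range(sample_size))
        shares.append(below / sample_size)
    return theil_h(shares)


rng = random.Random(0)                   # fixed seed so the run is reproducible
true_h = theil_h([0.5] * 500)            # every tract identical: no segregation
h_large = sampled_h(0.5, 500, 160, rng)  # larger sample per tract
h_small = sampled_h(0.5, 500, 80, rng)   # smaller sample per tract

print(f"true H: {true_h:.4f}")
print(f"estimated H, 160 sampled per tract: {h_large:.4f}")
print(f"estimated H,  80 sampled per tract: {h_small:.4f}")
```

Both sample-based estimates are positive even though there is no segregation in the simulated population, and halving the sample size roughly doubles the estimate, in line with a bias proportional to the inverse of the tract sample size.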

Measures of income segregation are often included in multivariate analyses of other outcomes. In a cross-sectional study, the previously reported metro-level estimates may perform well. In supplementary analyses not reported here, we find very high cross-sectional correlations between uncorrected and corrected measures for the whole population (r > .95). This indicates that studies of the correlates of income segregation in a given year are likely to be only slightly affected by biased measures. Studies of specific segments of the population, however, should be attentive to the average sample size for a given subgroup, which may vary greatly across metros. We also find lower correlations in change between the corrected and uncorrected measures (in the range of .75 to .85), suggesting that there is greater potential for error in longitudinal analyses.

For the 95 largest U.S. metros, we recommend the use of the corrected estimates analyzed here, whether for measures based on entropy (H) or variance (NSI, R). These different measures typically trend in the same direction, but it is prudent not to rely only on one of them. Data for smaller metros may also be approved for disclosure in the future. Finally, for researchers who are able to gain access to the original sample data in the Census Bureau’s FSRDC system, the programs used to calculate measures and implement corrections are available from the authors. There are significant obstacles to FSRDC use, including their geographic location (they are spread unevenly around the country), their cost (sometimes free to faculty of hosting institutions but with fees of as much as $20,000 per year to others), the time required for an individual to gain special sworn status and for a proposed research project to be approved (sometimes six months to a year), the process of learning how to find documentation and use confidential data sets through the FSRDC’s computing system, the difficulty of evaluating interim findings that cannot be printed but only viewed on a terminal screen, and a learning process associated with disclosure reviews. There is a clear rationale for every one of these obstacles and therefore no simple solution. Nevertheless, as we discover that some kinds of studies that rely extensively on census data can no longer be carried out in familiar ways, scholars will increasingly need to learn how to make effective use of this data resource.

## Acknowledgments

This research was supported by the Sociology Program of the National Science Foundation (grant 1756567) and National Institutes of Health (1R21HD078762-01A1). The Population Studies and Training Center at Brown University (P2CHD041020) provided general support. We thank Todd Gardner of the U.S. Census Bureau for his assistance in working with census data through the FSRDC network. Any opinions and conclusions expressed herein are those of the author(s) and do not necessarily represent the views of the U.S. Census Bureau. All results have been reviewed to ensure that no confidential information is disclosed.

## Authors’ Contributions

All authors contributed to the study concept and design. Data preparation in the FSRDC was performed by Todd Gardner, Charles Zhang, and Hongwei Xu. Methodological innovations and programming were developed primarily by Andrew Foster. Analysis of income segregation patterns was carried out primarily by John Logan. The first draft of the manuscript was written by John Logan, and all authors commented on subsequent versions of the manuscript. All authors read and approved the final manuscript.

## Data Availability

All data sets used are held within the FSRDC under strict confidentiality rules. Segregation indices have been approved for disclosure by the U.S. Census Bureau and are posted on the Brown University website of the Initiative on Spatial Structures in the Social Sciences developed by John Logan: https://s4.ad.brown.edu/projects/diversity/Data/data.htm.

## Compliance With Ethical Standards

### Ethics and Consent

The authors report no ethical issues.

### Conflict of Interest

The authors declare no conflicts of interest.

## Notes

^{1}

By taking a second-order expansion of the entropy function around the fraction *p*_{j} of households in tract *j* with income below some given level and taking expectations, Logan et al. (2018) showed that the expected bias in any given tract *j* is $\frac{E\left[(\hat{p}_j - p_j)^2\right]}{p_j(1 - p_j)} = \frac{p_j(1 - p_j)/N_j}{p_j(1 - p_j)} = \frac{1}{N_j}$, where $\hat{p}_j$ is the corresponding fraction in the sample. This expression assumes the sample is done with replacement. Logan et al. also derived expressions for the case without replacement, which corresponds to the ACS procedure. This latter approach, however, complicates the resulting mathematical expressions and does not lead to a measurable improvement in performance in our simulated data.

^{2}

Reardon et al. (2018) subsequently advanced a somewhat similar correction that applied to H and R but not NSI. It has two other limitations. First, the derivation of their correction depends on the assumption that no systematic bias is introduced by grouped data. We show in the online appendix that there may in fact be distortions due to inability to model the upper and lower tails of the income distribution. Second, in grouped data, the sample weights have been applied, but they are invisible to the researcher. Therefore, it is not possible to correct for weighting, which we discuss later. A more useful tack is to turn to unit-level household data in the RDC, as we do here.

^{3}

We use the fact that if *y*_{i} is income and *w*_{i} is the weight normalized such that ∑*w*_{i} = 1, then $\mathrm{Var}\left(\sum w_i y_i\right) = \sum w_i^2 \, \mathrm{Var}(y_i)$.

^{4}

Reassuringly, this expression reduces to $\frac{1}{N_j}$ in the case that all sampled families have equal weight, which would be the case with unweighted data, so that $w_{ij} = \frac{1}{N_j}$. Note that we are in effect disregarding differences in weights across tracts. This is reasonable because the published census data in 2000 provide the true tract sizes and total population (not their sample analogs), and the ACS data include adjustments based on census 2010 full counts.
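The reduction to the equal-weight case can be checked numerically. The sketch below assumes that the tract-level term in question is the sum of squared normalized weights, which is what the variance identity in note 3 implies; the function name and the example weights are hypothetical.

```python
def weighted_bias_term(weights):
    """Sum of squared weights after normalizing them to sum to 1 within a tract."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum((w / total) ** 2 for w in weights)


# Hypothetical tract with 50 sampled families.
equal = weighted_bias_term([1.0] * 50)                 # equal weights
unequal = weighted_bias_term([1.0] * 25 + [3.0] * 25)  # half the sample carries triple weight

print(f"equal weights:   {equal:.4f} (1/N = {1 / 50:.4f})")
print(f"unequal weights: {unequal:.4f}")
```

With equal weights the term equals 1/N exactly. Unequal weights make it larger, which is why a correction based on sample counts alone understates the bias in weighted data.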

^{5}

A fully corrected estimate of R^{R} could be constructed by weight-correcting R_{p} for every value of *p* and then computing the *p*(1 – *p*) weighted average of the weight-corrected values.
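A minimal sketch of the averaging step in this note, assuming a discrete grid of thresholds as an approximation; the actual estimator may use the full income distribution. The threshold grid, the R_{p} values, and the function name are hypothetical, and the weight correction of each R_{p} is taken as already done.

```python
def p_weighted_average(thresholds, r_values):
    """Average R_p over thresholds p, weighting each value by p * (1 - p)."""
    weights = [p * (1 - p) for p in thresholds]
    total = sum(weights)
    if total <= 0:
        raise ValueError("thresholds must include values strictly between 0 and 1")
    return sum(w * r for w, r in zip(weights, r_values)) / total


# Hypothetical weight-corrected R_p values at three income percentiles.
thresholds = [0.1, 0.5, 0.9]
r_p = [0.10, 0.20, 0.12]

summary = p_weighted_average(thresholds, r_p)
print(f"p(1 - p)-weighted average of R_p: {summary:.4f}")
```

The p(1 – p) weights are largest at the median and smallest in the tails, so values of R_{p} near the middle of the income distribution dominate the summary.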

^{6}

We also calculated changes in the unweighted averages, yielding very similar patterns. We prefer to weight by the number of families so that the statistic reflects the experience of the average family, the average family with children, or the average White, Black, or Hispanic family with or without children, in large metropolitan regions.

## Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.