In the first half of the twentieth century, the rate of death from infectious disease in the United States fell precipitously. Although this decline is well-known and well-documented, there is surprisingly little evidence about whether it took place uniformly across the regions of the United States. We use data on infectious disease deaths from all reporting U.S. cities to describe regional patterns in the decline of urban infectious mortality from 1900 to 1948. We report three main results. First, urban infectious mortality was higher in the South in every year from 1900 to 1948. Second, infectious mortality declined later in southern cities than in cities in the other regions. Third, comparatively high infectious mortality in southern cities was driven primarily by extremely high infectious mortality among African Americans. From 1906 to 1920, African Americans in cities experienced a rate of death from infectious disease that was greater than what urban whites experienced during the 1918 flu pandemic.
In the first half of the twentieth century, the rate of death from infectious disease in the United States fell precipitously. Although this decline is well-known and well-documented (Armstrong et al. 1999; Cutler et al. 2006; Haines 2001), there is surprisingly little evidence about whether it took place uniformly across the regions of the United States. In this article, we use data on infectious disease deaths from all reporting U.S. cities to describe regional patterns in the decline of urban infectious mortality from 1900 to 1948.
The fall in infectious mortality in the first half of the twentieth century was one of the last phases of what public health scholars call the epidemiological transition. Starting in the eighteenth century, mortality rates in the United States and Europe began to fall, and life expectancy began to rise. Life expectancy in the United States increased by nearly 20 years between 1900 and 1948 alone (Carter et al. 2006). Some scholars have attributed the initial drop in mortality to improved nutrition (Fogel 2004; McKeown and Record 1962), but others have emphasized the introduction of public health initiatives instead (e.g., Szreter 1988). Improvement in public health has been the leading explanation for the decline in mortality after 1870 (Cutler et al. 2006; Haines 2001; but see Anderson et al. 2018). Cutler et al. (2006) divided public health interventions into two categories: (1) large public works projects, such as water filtration and chlorination (Cutler and Miller 2005), sanitation infrastructure (Alsan and Goldin 2019; Melosi 1999), milk pasteurization (Lee 2007), and mass vaccinations; and (2) changes in behavior promoted by the public sector, such as “boiling bottles and milk, protecting food from insects, washing hands, ventilating rooms and keeping children’s vaccinations up to date” (Cutler et al. 2006:102). After 1930, the introduction of antibiotics played an important role (Smith and Bradshaw 2008).
Cities led the way in the fight against death from infectious disease. In 1900, mortality rates were higher in cities than in rural areas. By mid-century, the opposite was true (Haines 2001). As deaths from infectious disease fell, they made up a smaller proportion of total urban deaths. In 1900, a median 37 % of urban deaths were due to infectious causes; by 1948, this figure had fallen to 6 %.
A small number of studies have used city-level data to study mortality in the first half of the twentieth century.1 Crimmins and Condran (1983) reported regional differences in deaths from several diseases in 1900 using data from 129 cities. They found that southern cities had the highest mortality rates at the turn of the century. Condran and Crimmins-Gardner (1978) showed that the extent of sewers and waterworks in 28 cities in 1900 was negatively correlated with mortality rates in middle to old age groups, although these results varied by region. The correlations between cities’ use of sewers and waterworks and their death rates from typhoid and diarrheal diseases, moreover, were weak. Cutler and Miller (2005), in contrast, found that the introduction of clean water technologies was responsible for nearly one-half of the total mortality decline in 13 cities from 1900 to 1936.2 Ferrie and Troesken (2008) reported similar results using data from Chicago from 1850 to 1925. Anderson et al. (2019) showed that municipal reporting requirements and the opening of state-run sanatoriums reduced the rate of death from pulmonary tuberculosis, although other aspects of the tuberculosis movement had no discernible effect.
Although some previous research used samples of cities to study declines in total, infant, and cause-specific mortality, no study has documented regional variation in infectious mortality over time using data on all cities for which they are available. Consequently, we know very little about how much the urban infectious mortality decline varied across regions. Studying regional variation in deaths from infectious disease may help to direct research on the causes of the decline. The regions of the United States varied in the compositions and densities of their populations, their climates and disease environments, and the extent of the public health interventions they introduced. Observing variation in infectious mortality over space gives researchers a warrant to search for causes of the decline that vary from city to city and region to region. In an influential review essay on the urban mortality transition in the United States, Haines (2001:47) concluded, “There is a need to look at more disaggregated data (e.g., states, counties, and specific cities).”
There are two common ways to measure the decline in infectious mortality. One is to examine year-to-year volatility in life expectancy and mortality. When deaths from infectious disease make up a large proportion of all deaths, waves of infectious disease influence the total mortality rate (Deaton 2013:63; Meeker 1971:359). Taking this approach, Smith and Bradshaw (2006, 2008) showed that yearly variation in period life expectancy narrowed substantially in the 1940s. The second approach, which we use here, is to measure mortality from infectious diseases directly. We use historical data on deaths by cause to estimate the rate of infectious mortality in all reporting U.S. cities. To construct the data set we use for this analysis, we digitized 49 years of death-by-cause data for all reporting cities, classified causes as infectious or not, and indirectly age-standardized the data using a separate series of death-by-cause-and-age data and newly available complete-count census data (Ruggles et al. 2018).3
We restrict our analysis to cities to ensure that we compare like with like. There was a clear urban penalty in mortality at the beginning of the twentieth century (Cain and Hong 2009; Condran and Crimmins 1980; Haines 2001; Higgs 1973). Using other units of analysis, such as states, could lead us to mistake differences in the urbanization of regions for other regional differences.4 Condran and Crimmins (1980:202) argued that “differentiation of space by rural and urban characteristics is essential to understanding the decline in mortality.” States also varied in the completeness of death registration within them (Condran and Crimmins 1980).
We report three main results. First, urban infectious mortality was higher in the South in every year from 1900 to 1948. The higher rate of death from infectious disease in southern cities in 1900, originally documented by Crimmins and Condran (1983), lasted at least through mid-century. Second, urban infectious mortality declined later in the South than in the other regions: deaths from infectious disease fell at an accelerated pace in cities outside the South in the early 1930s; a steeper decline in the South began instead in the late 1930s. Finally, the regional differences that we document were driven mainly by the extremely high risk of infectious mortality among African Americans, who made up a far larger share of the southern urban population than the urban population of other regions throughout this period.
Research in history, sociology, economic history, and demography has documented how barriers that African Americans faced in acquiring safe housing (Acevedo-Garcia 2000; Boustan and Margo 2016; Collins and Thomasson 2004; Du Bois 1908; Eriksson and Niemesh 2016; Galishoff 1985; Roberts 2009; Zelner et al. 2017), accessing urban social programs (Preston and Haines 1991) and medical innovations (Jayachandran et al. 2010), and establishing economic security (Ewbank 1987) put them at greater risk of death from infectious disease (Sen 1998). Our results point to a need for more research in this area. The risk of death from infectious disease among urban African Americans was so high that it was primarily responsible for regional differences in the infectious mortality of all residents of U.S. cities.
To measure regional and city-level mortality from infectious disease, we digitized and standardized a variety of historical records. In this section, we describe the data we collected, how we classified causes of death, and how we age-standardized the data.
We digitized city-level data on the number of deaths by cause from published volumes of the Vital Statistics of the United States.5 States and cities were not legally required to register deaths until 1933 (Haines 2006). However, starting in 1900, 10 states and the District of Columbia made up an official death registration area (DRA), which grew to cover the entire country by 1933 (Haines 2006). Cities gradually enter our data set as they entered the DRA in the period 1900 to 1933, often before their entire state entered. In 1900, the DRA contained 332 cities; by 1948, that number had grown to 1,044.6
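One simple way to catch transcription errors in digitized tables of this kind is to check that the cause-specific counts add up to the reported total for every city and year. The sketch below is illustrative only: the record layout, field names, and example figures are hypothetical and are not drawn from the vital statistics volumes.

```python
def find_sum_mismatches(records):
    """Return (city, year) pairs whose cause-specific deaths do not sum to the total."""
    mismatches = []
    for record in records:
        cause_sum = sum(record["deaths_by_cause"].values())
        if cause_sum != record["total_deaths"]:
            mismatches.append((record["city"], record["year"]))
    return mismatches


# Hypothetical records; the second one contains a deliberate discrepancy.
records = [
    {"city": "City A", "year": 1900, "total_deaths": 120,
     "deaths_by_cause": {"tuberculosis": 30, "pneumonia": 25, "all other causes": 65}},
    {"city": "City B", "year": 1900, "total_deaths": 90,
     "deaths_by_cause": {"tuberculosis": 20, "pneumonia": 15, "all other causes": 50}},
]

flagged = find_sum_mismatches(records)
```

A check of this form flags city-years for manual review; it cannot detect offsetting errors within a single row.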
The DRA data enable us to study in fine detail the decline in deaths from infectious disease in U.S. cities. However, the urban mortality decline should not be considered representative of the national mortality decline. In 1900, child mortality was much higher in the DRA than it was in the entire nation (Preston and Haines 1991). Moreover, regional differences in total mortality in 1900 differed from regional differences in urban mortality. For instance, the South as a whole had a relatively low child mortality rate in 1900, but this rate was pushed up by the high child mortality rate of its black population and pushed down by its rurality (Preston and Haines 1991). The urban child mortality rate of the South—excluding the region’s rural areas—was much higher.
Reporting infectious mortality rates entails classifying causes of death as infectious or not. This is more complicated than it may first appear. In some cases—such as “childbirth deaths”—the cause-of-death categories used in the vital statistics data are ambiguous about whether the cause was infectious. We coded causes conservatively, excluding from our infectious counts categories that mix infectious and noninfectious causes.7 Wherever possible, we followed the infectious classifications that Armstrong et al. (1999) used.8
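A conservative coding rule of this kind can be sketched as a keyword match in which exclusions take priority. The keyword list below follows the causes named in the notes to this article, but the matching logic, the function name, and the example labels are assumptions made for illustration. The actual cause labels in the vital statistics vary by year, and the catch-all category for assorted infectious, epidemic, or parasitic causes is left out of this sketch.

```python
# Keywords for causes coded as infectious (from the list in the notes).
INFECTIOUS_KEYWORDS = (
    "appendicitis", "bronchitis", "diarrhea", "diphtheria", "erysipelas",
    "influenza", "malaria", "measles", "meningitis", "pneumonia", "polio",
    "puerperal fever", "scarlet fever", "septicemia", "small-pox", "smallpox",
    "syphilis", "tuberculosis", "whooping cough",
)

# Categories that mix infectious and noninfectious causes are excluded.
EXCLUDED_KEYWORDS = ("childbirth", "rheumati")


def is_infectious(label):
    """Classify a cause-of-death label, letting exclusions override matches."""
    text = label.lower()
    if any(keyword in text for keyword in EXCLUDED_KEYWORDS):
        return False
    return any(keyword in text for keyword in INFECTIOUS_KEYWORDS)
```

Checking exclusions first means that a label such as "Acute rheumatic fever" is never counted, which mirrors the conservative direction of the coding described here.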
There are two additional challenges to constructing a consistent series of infectious mortality. The first is that the cause-of-death reporting in the vital statistics was updated with each new International List of Causes of Death (ICD).9 Some ICD changes were more extensive than others. Fortunately, the ICD revisions made from 1900 to 1948 are relatively small: the earliest extensive change was the sixth revision in 1948, which was adopted in the 1949 Vital Statistics (International Institute for Vital Registration and Statistics 1993; Moriyama et al. 2011). For this reason, we end our series in 1948. The second challenge is the coarseness of cause-of-death categories in the city-level vital statistics data. Often, specific causes of death listed in the ICD codes are grouped into coarser categories at the city level. Usually the coarser categories in the vital statistics were changed only when the ICD codes changed. This magnifies the differences between ICD coding regimes in our data. Most of our analysis concerns differences in the rate of infectious mortality across regions within years. In those parts of our analysis that focus instead on the pace of the decline in infectious mortality, we broke the data into periods corresponding to the ICD coding regimes to minimize the effect of changes in the classification of deaths on our results. In these parts of the analysis, we also excluded 1918–1920, the flu pandemic years. This yields five periods corresponding to the following approximate decades: 1900–1909, 1910–1917, 1921–1929, 1930–1938, and 1939–1948.
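The period scheme can be written as a small lookup. This is a minimal sketch: the period boundaries are those listed above, while the function name and the label format are assumptions.

```python
# Periods corresponding to the ICD coding regimes; the flu pandemic years
# 1918-1920 fall between the second and third periods and are left out.
PERIODS = [
    (1900, 1909),
    (1910, 1917),
    (1921, 1929),
    (1930, 1938),
    (1939, 1948),
]


def assign_period(year):
    """Return a label such as '1900-1909', or None if the year is excluded."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None
```

Returning `None` for 1918 through 1920, and for years outside the series, makes it easy to drop those city-years before fitting within-period trends.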
In some cases, the grouping of causes in the vital statistics changed even within ICD regimes. This occurred in 1904; 1905; and each year from 1936 to 1944 except 1939, when a new ICD took effect. For instance, puerperal fever/puerperal septicemia was coded as a distinct cause of death from 1910 to 1942, but it was collapsed into the general “childbirth” category from 1900 to 1909 and from 1943 to 1948. The 1910 shift coincides with a shift in the ICD regime, but the 1943 shift took place within the 1939–1948 regime. Because the vital statistics made so many changes to the coarse categories of death reported from 1936 to 1944, it is impossible to create periods with no changes in the categories. However, dividing the data at the ICD changes ensures that the most consequential changes in the classification of death occurred between intervals.
Because infectious mortality is almost always highest at very young and very old ages, differences in infectious mortality across regions or periods may actually reflect differences in how much of a region’s population is very young or very old. To estimate regional differences in infectious mortality over and above regional differences in the age distribution, we report infectious mortality rates that are indirectly age-standardized. We constructed a standard age structure of infectious mortality based on urban age-specific mortality data in the years 1922–1933 and used that schedule to predict the “expected” infectious mortality of each city-year based on its age schedule. This procedure is described in detail in the online appendix. The main outcome in our figures and regressions is the log of the ratio of actual to expected infectious mortality in each city-year, a ratio known as the comparative mortality ratio. An unlogged comparative mortality ratio of 1.5, for example, reflects infectious mortality 50 % above what would be expected given a city’s age distribution and a standardized mortality schedule. For simplicity, we refer to the logged comparative mortality ratio as infectious mortality.
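The logic of indirect standardization can be illustrated with a short calculation. The sketch below is not the procedure in the online appendix: the age groups, the standard rates, and the city figures are hypothetical, chosen so that the ratio works out to the 1.5 used as an example above.

```python
import math


def expected_deaths(population_by_age, standard_rates):
    """Deaths a city would have if it experienced the standard age-specific rates."""
    return sum(population_by_age[age] * standard_rates[age]
               for age in population_by_age)


def comparative_mortality_ratio(observed_deaths, population_by_age, standard_rates):
    """Ratio of observed to expected deaths."""
    return observed_deaths / expected_deaths(population_by_age, standard_rates)


# Hypothetical standard schedule (deaths per person per year) and city data.
standard_rates = {"0-4": 0.010, "5-64": 0.002, "65+": 0.008}
city_population = {"0-4": 10_000, "5-64": 80_000, "65+": 10_000}
observed = 510

ratio = comparative_mortality_ratio(observed, city_population, standard_rates)
log_ratio = math.log(ratio)  # the logged ratio is the outcome analyzed in the text
```

With these inputs, expected deaths are 340, so 510 observed deaths give a ratio of 1.5, or mortality 50 % above what the age distribution alone would predict.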
Southern cities differed from cities in other regions in both their rate of infectious mortality and when that rate declined. But the South’s distinctiveness had less to do with causes affecting all residents of southern cities than with the fact that southern cities were populated by greater proportions of black residents, who suffered extreme risks of death from infectious disease in cities in all regions.
Urban Infectious Mortality Was Higher in the South Every Year From 1900 to 1948
In Fig. 1, we plot median infectious mortality in the four census regions. Panel a of the figure shows the raw infectious mortality rate per 100,000 people. The most notable feature of the plot is that the urban South’s infectious mortality rate exceeded that of the other regions for the entire period. Only in the 1940s were there signs of convergence.
In 1900, the beginning of our series, the median infectious death rate in southern cities was 1,053 per 100,000—almost twice that of Midwestern cities (529 per 100,000). It took at least 20 years for the urban South to cut its rate of death from infectious disease to what the urban Midwest’s was at the beginning of the twentieth century.10 To make the South’s infectious mortality rate in 1900 comparable to that of the Midwest in the same year, we would have to remove all southern deaths from tuberculosis (the top killer in 1900), meningitis, and malaria combined.
In 1948, the urban South still had a higher infectious mortality rate than the other regions, with a median death rate of 102 per 100,000, compared with 58, 61, and 72 in the Northeast, Midwest, and West, respectively. Infectious mortality in southern cities could be made comparable to infectious mortality in Midwestern cities only by excluding southern deaths from tuberculosis (the second-biggest killer in 1948) and diarrhea. But the general convergence across regions, and the greater pace of decline in the 1940s, meant that the urban South was far fewer years behind in 1948 than it had been at the beginning of the twentieth century. The burden of infectious mortality in southern cities in 1948 was very similar to that of Midwestern cities just seven years earlier. The lag between southern and Midwestern cities’ infectious mortality shrank by roughly one-third of one year per year between the beginning and the end of our series.
To check that our results do not simply reflect regional differences in the age distribution, in panel b of Fig. 1 we show logged comparative mortality ratios, which describe age-standardized mortality on a proportional scale. The median (unlogged) comparative mortality ratio in 1900 ranged from 1.7 in the Midwest to 3.4 in the South. The Midwestern figure reflects infectious mortality 70 % above what we would expect given Midwestern cities’ age distributions and a standardized mortality schedule derived from age-specific mortality in 1925; the southern figure reflects infectious mortality nearly three and one-half times what southern cities’ age structures would predict. In 1900, southern cities’ standardized infectious mortality, like their raw infectious mortality rate, was roughly double that of Midwestern cities. By 1948, Midwestern cities had infectious mortality only 18 % of what their age structures would predict from the standardized mortality schedule; southern cities’ infectious mortality had fallen to 30 % of what their age structures would predict, approximately 1.7 times the standardized Midwestern rate.
Does the fact that our sample changes as new cities entered the DRA affect our results? Panel c of Fig. 1 addresses this question. Instead of using an unbalanced panel of cities, as in panel b, here we reproduce our results using a balanced panel of cities for which we have data in all years.11 Despite the smaller sample size, including just 14 southern cities, the broad pattern of our results remains very similar: compared with infectious mortality in other regions, infectious mortality in southern cities continues to be higher but converges considerably by 1948.
Infectious Mortality Declined Later in Southern Cities Than in Cities in the Other Regions
The urban South differed from the other regions not only in its level of infectious mortality but also in the timing of its infectious mortality decline. The pace of the infectious mortality decline in cities in the other three regions was gradual and constant from 1900 to 1930, accelerating from 1930 through the end of the series. The urban South’s decline, in contrast, stayed gradual until the late 1930s, when it became much sharper than that of the other regions.
Figure 2 gives a more detailed look at the data, allowing us to see more clearly how the pace of the infectious mortality decline varied across regions. Each dot represents one city-year observation, with darker dots indicating the median city in each year. Here we divide the data into the five aforementioned periods, each corresponding to an ICD regime. We draw linear trends within each period. The figure makes especially apparent the sharp discontinuity in the fall of infectious mortality in southern cities after the late 1930s.
To examine the timing of the decline more formally, we fit a simple regression model, regressing log age-standardized infectious mortality on region indicators and the interaction of region and year trends, and clustering the standard errors at the city level.12 Panel a of Fig. 3 plots the region indicators. Dots represent the mean infectious mortality in each region in each period. Bars plot 95 % confidence intervals. Looking down the figure, the regional means move gradually to the left, indicating the infectious mortality decline across all regions. The urban South’s distinctively high infectious mortality stands out in this figure: despite the dramatic decline in infectious mortality in the South in the 1939–1948 period, southern cities still lagged behind cities in other regions at mid-century.
Panel b of Fig. 3 plots the slope coefficients, measuring the pace of the decline within each region and ICD regime. Here, the gradual infectious mortality decline through the late 1920s is particularly salient. Across all periods from 1900 to 1929, the (unlogged) annual declines in the mean comparative mortality ratio ranged from 0.7 % in the 1920s urban South to 3.6 % in the 1920s urban Northeast. The slopes for the Northeast, Midwest, and West grew sharper in the 1930s. Most notable is the distinctive pattern of the decline in southern cities: compared with the other regions, infectious mortality in the urban South fell most gradually in the 1930–1938 period but most sharply in the 1939–1948 period, with an annual decline in the comparative mortality ratio of 10.1 %.
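Because the outcome is logged, a within-period slope can be read as an approximate annual percentage change after exponentiating. The sketch below illustrates that conversion with a plain least-squares trend on a hypothetical series. It is not the model reported here: it omits the region interactions and the city-clustered standard errors, and the conversion formula is an assumption about how the unlogged declines could be obtained.

```python
import math


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def annual_pct_change(log_slope):
    """Convert a slope on the log scale into a percentage change per year."""
    return (math.exp(log_slope) - 1) * 100


# Hypothetical series: a ratio that falls by exactly 10 % per year from 1939.
years = list(range(1939, 1949))
log_ratios = [math.log(0.9 ** (year - 1939)) for year in years]

slope = ols_slope(years, log_ratios)
annual_decline = -annual_pct_change(slope)  # positive number for a decline
```

For small slopes the log-scale coefficient and the percentage change are nearly equal, but for declines on the order of 10 % per year the two differ noticeably, which is why the exponentiated form is used in this sketch.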
The broad timing of the 1930s decline across all regions in our data is consistent with that found in previous studies. Catillon et al. (2018), for instance, documented a trend break in 1936 using national data on influenza and pneumonia mortality. Jayachandran et al. (2010) found that sulfa drugs reduced maternal mortality by 24 % to 36 %, pneumonia mortality by 17 % to 32 %, and scarlet fever mortality by 52 % to 65 % from 1937 to 1943. These diseases accounted for roughly 12 % of total mortality in the pre–sulfa drug period.
High Infectious Mortality in Southern Cities Was Driven Primarily by Extremely High Infectious Mortality Among African Americans
In Fig. 4, we show that the regional differences documented in Figs. 1, 2, and 3 primarily reflect differences in the urban infectious mortality of African Americans and whites.13 African Americans in cities across the United States often lived in segregated and crowded housing (Acevedo-Garcia 2000; Collins and Thomasson 2004; Du Bois 1908; Eriksson and Niemesh 2016; Galishoff 1985; Grigoryeva and Ruef 2015; Roberts 2009; Zelner et al. 2017), had high poverty rates (Ewbank 1987), and were prevented from accessing many urban social programs (Preston and Haines 1991) and medical innovations (Jayachandran et al. 2010), all of which increased their susceptibility to death from infectious disease (Sen 1998).
The sample of cities for which the vital statistics report deaths by cause separately for whites and nonwhites is smaller than the full sample we used to generate Figs. 1, 2, and 3. These data span the years 1906 to 1942, excluding 1938. Panel a of Fig. 4 plots logged comparative mortality ratios for all groups in this smaller sample. As in Fig. 1, southern mortality exceeded that of the other regions and sharply declined in the late 1930s. In panels b and c of Fig. 4, we examine infectious mortality among whites and nonwhites separately. The regional differences reported in Figs. 1, 2, and 3 are much less pronounced when we examine infectious mortality among whites and nonwhites alone. The lagging decline in urban infectious mortality in the South appears to be less a consequence of causes of death that affected all urban southerners than of causes that especially threatened the life chances of black city dwellers, irrespective of what region they lived in.14 In regions outside the South, the median percentage nonwhite in cities in our full panel ranged from 0.5 % to 4 %; in the South, it fell from 37 % in 1900 to 22 % in 1940. Still, it is notable that white urban infectious mortality in the South exceeded that of the other regions from the late 1920s to the late 1930s. This may reflect the relative poverty of the region’s white residents.
Most striking, however, is how much higher infectious mortality was among urban African Americans than it was among urban whites. Nationally, the (unlogged) median comparative mortality ratio for whites from 1906 to 1920 ranged from 1.4 to a peak of 3.0 at the height of the flu pandemic. The (unlogged) median comparative mortality ratio for nonwhites, in contrast, never fell below 3.1. African Americans in cities faced such a high risk of death from infectious disease that it is as if they lived through the flu pandemic experienced by urban whites in every year from 1906 to 1920.15
National medians smooth over volatility in infectious mortality that is more apparent at lower levels of aggregation. Median urban nonwhite mortality between 1906 and 1920 exceeded median urban white mortality in 1918 in all 15 years in the Midwest, 9 of 15 years in the Northeast, all 15 years in the West, and 13 of 15 years in the South.
The first term reflects the contribution that regional differences in the racial composition of the population make to the regional difference in infectious mortality. This term represents the difference between racial group i’s share of the population in cities in the South and its share of the population in cities outside the South, weighted by racial group i’s average infectious mortality across cities in the South and non-South.16 The second term reflects the contribution that regional differences in racial group–specific mortality make to the regional difference in infectious mortality. This term represents the difference between the average city’s infectious mortality in the South and the average city’s infectious mortality in the non-South for each racial group i, weighted by racial group i’s average share of the population in the South and non-South.
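The equation to which these two terms refer does not appear in this text. A reconstruction consistent with the verbal description, which has the form of a standard Kitagawa decomposition, is given below. The notation is assumed rather than taken from the original: $p_i^{S}$ and $p_i^{N}$ denote racial group $i$'s share of the urban population in the South and the non-South, and $m_i^{S}$ and $m_i^{N}$ denote group $i$'s average infectious mortality in the two sets of cities.

```latex
\bar{m}^{S} - \bar{m}^{N}
  = \underbrace{\sum_{i} \left( p_i^{S} - p_i^{N} \right)
      \frac{m_i^{S} + m_i^{N}}{2}}_{\text{composition}}
  + \underbrace{\sum_{i} \left( m_i^{S} - m_i^{N} \right)
      \frac{p_i^{S} + p_i^{N}}{2}}_{\text{group-specific mortality}}
```

The identity holds exactly when each region's overall mortality is the share-weighted sum of its group-specific mortality, that is, when $\bar{m}^{R} = \sum_i p_i^{R} m_i^{R}$ for $R \in \{S, N\}$.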
The difference between the racial composition of cities in the South and that in the non-South is responsible for most of the difference in infectious mortality between the South and the non-South over our period. Figure 5 shows these results, plotting the compositional component (the first decomposition term) as a proportion of the total southern excess.17 When this proportion exceeds .5, compositional differences account for most of the difference in infectious mortality between the South and the non-South. Across years, the racial composition of the population contributes a median proportion of .58 of the total southern excess mortality and more than one-half of the total in two-thirds of the years we study.
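The compositional proportion can be computed with a few lines of code. The sketch below assumes a two-term decomposition of the kind described above, applied to a single year. All shares and mortality values are hypothetical and are not estimates from our data, and the function and variable names are illustrative.

```python
def decompose(shares_south, rates_south, shares_other, rates_other):
    """Split a regional mortality difference into composition and rate terms."""
    composition = sum(
        (shares_south[g] - shares_other[g]) * (rates_south[g] + rates_other[g]) / 2
        for g in shares_south
    )
    rate = sum(
        (rates_south[g] - rates_other[g]) * (shares_south[g] + shares_other[g]) / 2
        for g in shares_south
    )
    return composition, rate


# Hypothetical population shares and group-specific mortality for one year.
shares_south = {"white": 0.70, "nonwhite": 0.30}
shares_other = {"white": 0.98, "nonwhite": 0.02}
rates_south = {"white": 1.2, "nonwhite": 3.0}
rates_other = {"white": 1.0, "nonwhite": 3.0}

composition, rate = decompose(shares_south, rates_south, shares_other, rates_other)
total_excess = composition + rate
composition_share = composition / total_excess
```

In this hypothetical case the compositional share is well above one-half, because the groups' mortality differs far more between groups than between regions.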
Like panels a and b of Fig. 1, Figs. 4 and 5 are based on an unbalanced panel of cities. The sample of cities for which we have data for whites and nonwhites in every year from 1906 to 1942 (excluding 1938) is small and predominantly southern. In Fig. 6, we plot city-specific trends in infectious mortality in the cities for which we have data for whites and nonwhites in at least 30 years. The dashed line indicates median infectious mortality during the 1918 flu pandemic among urban whites in cities present in the data for at least 30 years. As shown in the right panel, in many cities, infectious mortality among nonwhites from 1906 to 1920 was comparable with median urban white infectious mortality during the 1918 flu pandemic.18
In this article, we document regional variation in the decline in urban infectious mortality in the United States. The decline in total mortality in the United States in the first half of the twentieth century is well known, but regional variation in both the level of infectious mortality and changes in it is not. Our results point to three main conclusions.
First, the level of infectious mortality in southern cities exceeded that of cities in other regions for the entirety of our study period. Previous research documented that southern cities had comparatively high rates of death in 1900 (Crimmins and Condran 1983). We show that these differences lasted at least until mid-century.
Second, southern cities differed from cities in other regions not only in their levels of infectious mortality but also in the timing of their infectious mortality decline. Infectious mortality in cities in the Northeast, Midwest, and West gradually declined until the early 1930s and then declined more steeply. In southern cities, a sharper fall in infectious mortality did not take place until the late 1930s.
Third, and most strikingly, southern cities’ distinctiveness can be explained primarily by the fact that African Americans, who suffered extremely high risks of death from infectious disease in cities in all regions, made up a comparatively large share of their populations. Until 1920, median infectious mortality among African Americans in cities was higher than that of urban whites at the peak of the flu pandemic.
Our objective in this article has been to document regional differences in the fall of urban infectious mortality. Future research should attempt to explain these differences. Roberts (2009) and Zelner et al. (2017) presented evidence linking African Americans’ high rate of death from tuberculosis in the early twentieth century to segregation and crowded housing. Other well-known causes of black infectious mortality, in contrast, are unlikely to explain the gap in the infectious mortality rates of black and white city dwellers. For instance, by 1900, malaria, a common cause of death in the South (Kitchens 2013), was rare in southern cities (Boustan and Margo 2016; Humphreys 2009). Research by Black et al. (2015) and Eriksson and Niemesh (2016) has shown that the Great Migration increased mortality among African Americans at young and old ages. But a large part of this effect was due to the fact that migrants left rural areas for cities, where the risk of death from infectious disease was comparatively high at the beginning of the twentieth century (Eriksson and Niemesh 2016).19 The Great Migration may have accounted for some portion of the regional convergence in urban infectious mortality for all groups, as black migrants—with comparatively high risks of death from infectious disease—left southern cities for cities in the other regions. But it should account for a smaller portion of the difference in infectious mortality between urban whites and urban African Americans because infectious mortality rates for African Americans in southern cities were similar to those for African Americans in cities in the other regions. In future work, we plan to study infectious mortality among urban African Americans in closer detail, focusing especially on which causes of death were primarily responsible for their extreme mortality rates.
Authorship is alphabetical to reflect equal contributions. We thank Magali Barbieri, Douglas Ewbank, Evan Roberts, Melissa Thomasson, and Jon Zelner for helpful comments. Hero Ashman, Gianluca Russo, and jim saliba provided excellent research assistance. This research was supported by the Robert Wood Johnson Foundation Health & Society Scholars program; the Regents’ Junior Faculty Fellowship at the University of California, Berkeley; and the Minnesota Population Center at the University of Minnesota, Twin Cities, which is funded by a grant from the Eunice Kennedy Shriver National Institute for Child Health and Human Development (P2C HD041023).
Using state-level vital statistics, Moehling and Thomasson (2014) found that home nurse visits, spending on health and sanitation, and the establishment of health centers reduced infant mortality between 1924 and 1929.
Using a larger sample of cities, Anderson et al. (2018) found that these technologies reduced typhoid mortality, as Cutler and Miller (2005) reported, but had a much smaller effect on total mortality.
The complete-count census data cover the decennial censuses from 1900 to 1940 and come from IPUMS-USA at the University of Minnesota (Ruggles et al. 2018). Without the very recent release of complete-count census data with clean city and age variables, our analysis would not have been possible.
For instance, Boustan and Margo (2016) noted that national black life expectancy estimates derived from vital statistics data in the early twentieth century are likely understated because they are based primarily on northern states, whose black residents predominantly lived in cities (see also Logan and Parman 2014).
We were able to minimize errors in our digitization by checking that cause-specific deaths sum to total deaths in each city and each year.
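A consistency check of this kind can be sketched in a few lines. The sketch below is illustrative only: the record layout, the function name `find_sum_mismatches`, and all of the numbers are hypothetical and are not taken from our digitized data.

```python
def find_sum_mismatches(records):
    """Return (city, year, cause_sum, reported_total) for each city-year
    whose cause-specific deaths do not add up to the reported total."""
    mismatches = []
    for rec in records:
        cause_sum = sum(rec["by_cause"].values())
        if cause_sum != rec["total_deaths"]:
            mismatches.append(
                (rec["city"], rec["year"], cause_sum, rec["total_deaths"])
            )
    return mismatches


# Hypothetical digitized rows (made-up counts, for illustration only).
records = [
    {
        "city": "City A",
        "year": 1910,
        "total_deaths": 100,
        "by_cause": {"tuberculosis": 30, "pneumonia": 25, "all other causes": 45},
    },
    {
        "city": "City B",
        "year": 1910,
        "total_deaths": 80,
        "by_cause": {"tuberculosis": 20, "pneumonia": 15, "all other causes": 40},
    },
]

# City B is flagged because its causes sum to 75, not the reported 80.
flagged = find_sum_mismatches(records)
```

A flagged city-year indicates only that some entry in that row was likely mistranscribed; locating the faulty entry still requires returning to the printed table.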
Our own panel ranges from 329 to 982 cities because of data restrictions described later in the article and in the online appendix. In Fig. A4 (online appendix), we map the cities in our sample within their corresponding regions.
The most consequential of these categories is rheumatism and rheumatic fever. In our main results, we exclude these causes, but we reproduce our results with both causes classified as infectious in Fig. A2 in the online appendix. Reassuringly, Fig. A2 is nearly indistinguishable from our main result reported in Fig. 1, panel b. Acute rheumatic fever is the only infectious cause that we exclude even when it is reported separately from noninfectious diseases (chronic rheumatism and gout) because rheumatism and rheumatic fever are reported sometimes separately and sometimes in a combined category and because deaths from these causes combined made up a relatively large share of total deaths. The median death rate from “rheumatism” in 1910, when the category first appears, was 7 per 100,000.
Specifically, we classified the following causes of death as infectious: appendicitis; assorted infectious, epidemic, or parasitic causes; bronchitis; diarrhea; diphtheria; erysipelas; influenza; malaria; measles; meningitis; pneumonia; polio; puerperal fever; scarlet fever; septicemia; smallpox; syphilis; tuberculosis; and whooping cough. Not all of these causes appeared in every year of the vital statistics data we used, and many appeared in multiple variants (e.g., specific forms of tuberculosis).
The ICD provides standardized guidelines for coding causes of death. The first ICD was developed by Jacques Bertillon in 1893 and adopted by many countries. Now called the International Classification of Diseases, it has subsequently been revised many times to incorporate changes in terminology and medical knowledge (Anderson 2011). ICD revisions were implemented uniformly across cities in the vital statistics.
In our series, the southern median urban infectious death rate exceeded the 1900 urban Midwestern median in every year from 1900 to 1920. However, there is a sharp break between 1920 and 1921, in which infectious mortality appeared to fall in all regions, coincident with a change in ICD codes. (For example, meningitis was reported separately before 1921 but collapsed into “all other causes” afterward.) Some of this apparent mortality decline likely reflected the ICD change, although some likely reflected the resurgence of influenza in 1920. We say that the regional difference in 1900 represented “at least” 20 years of the southern mortality decline to come because mortality in all regions might have been higher in years after 1920 absent the ICD change.
We report the sample sizes for all panels in the online appendix. Some cities exited and reentered the full panel over the period.
We centered years relative to the beginning of each period to simplify the interpretation of the coefficients. We saturated the regression model with four region fixed effects rather than include an intercept or the linear trend in year.
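To illustrate how centering affects interpretation, the sketch below assumes a specification with region-specific intercepts and region-specific linear slopes in the centered year. That assumption goes beyond what this note states. Under it, the point estimates equal those from fitting a separate line for each region, so each region's intercept is its fitted level in the first year of the period. The function name `fit_region_trends` and the data are hypothetical.

```python
from collections import defaultdict


def fit_region_trends(rows, period_start):
    """Fit rate = a_r + b_r * (year - period_start) by OLS, separately by region.

    rows is an iterable of (region, year, rate) tuples.
    Returns {region: (intercept, slope)}.
    """
    by_region = defaultdict(list)
    for region, year, rate in rows:
        # Center the year so the intercept is the level at period_start.
        by_region[region].append((year - period_start, rate))

    coefs = {}
    for region, pts in by_region.items():
        n = len(pts)
        mean_t = sum(t for t, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        sxx = sum((t - mean_t) ** 2 for t, _ in pts)
        sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_t
        coefs[region] = (intercept, slope)
    return coefs


# Made-up death rates per 100,000, for illustration only.
rows = [
    ("South", 1900, 600.0), ("South", 1901, 590.0), ("South", 1902, 580.0),
    ("Midwest", 1900, 400.0), ("Midwest", 1901, 380.0), ("Midwest", 1902, 360.0),
]
coefs = fit_region_trends(rows, period_start=1900)
```

With uncentered years, each intercept would instead be an extrapolated level at year 0, which has no substantive meaning.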
The vital statistics report mortality for whites and nonwhites, but in this period, nonwhites overwhelmingly were African American. In all years except 1930–1934, “Mexicans” were classified as white (U.S. Department of Commerce 1941:2). The change in the classification of “Mexicans” had a minimal effect on the nonwhite mortality rates of the North, Midwest, and South. In the West, it caused a spike in nonwhite infectious mortality from 1930 to 1934, visible in panel c of Fig. 4.
According to Troesken (2004), northern and southern cities had similar rates of public water and sewer connections in the early twentieth century.
Although median infectious mortality among African Americans from 1906 to 1920 was higher than that among whites during the flu pandemic, the age pattern in deaths during the flu pandemic was unique. Infectious mortality is usually highest among the very young and the very old, but the flu pandemic killed many people in the prime of their life (Noymer 2009). As discussed earlier, between 1920 and 1921, a change in the coding of causes of death made infectious mortality appear to fall in all groups; median nonwhite comparative mortality ratios were 3.3 in 1920 but 2.4 in 1921, and they remained below 2.8 for the remainder of the series. Some of the infectious mortality decline between 1920 and 1921 likely also reflects the resurgence of influenza in 1920.
Both the racial population shares and the racial group–specific mortality rates are calculated as unweighted means across cities.
The racial composition measures were generated from IPUMS population measures, which in some cases generate city population totals that differ from published city population totals. In each city in our racial group–specific sample, these discrepancies never exceed 2.5% in either direction. When we generated the proportions shown in Fig. 5, we estimated the total regional difference as the sum of the two decomposition terms so that a consistent population size was used in the numerator and denominator of this proportion.
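The logic of taking the total as the sum of the two terms can be illustrated with a standard two-term (Kitagawa-style) decomposition. This is a sketch under stated assumptions: the exact form of the decomposition used in the article is not reproduced in this note, and the function name, group labels, shares, and rates below are all hypothetical.

```python
def decompose_difference(shares_a, rates_a, shares_b, rates_b):
    """Split the difference in aggregate rates (region A minus region B)
    into a composition term and a rate term.

    Each argument maps a group label to that group's population share
    or group-specific mortality rate.
    """
    # Differences in shares, weighted by the average group-specific rate.
    composition = sum(
        (shares_a[g] - shares_b[g]) * (rates_a[g] + rates_b[g]) / 2
        for g in shares_a
    )
    # Differences in rates, weighted by the average group share.
    rate = sum(
        (rates_a[g] - rates_b[g]) * (shares_a[g] + shares_b[g]) / 2
        for g in shares_a
    )
    return composition, rate


# Made-up shares and death rates per 100,000, for illustration only.
south_shares = {"black": 0.30, "white": 0.70}
south_rates = {"black": 900.0, "white": 400.0}
other_shares = {"black": 0.05, "white": 0.95}
other_rates = {"black": 850.0, "white": 350.0}

composition, rate = decompose_difference(
    south_shares, south_rates, other_shares, other_rates
)

# Use the sum of the two terms as the denominator so that the same
# population figures enter both the numerator and the denominator.
total = composition + rate
composition_share = composition / total
```

In this form the two terms sum exactly to the difference in share-weighted aggregate rates, so a proportion computed against their sum is internally consistent even when the population totals differ slightly from published figures.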
Three cities had infectious mortality among whites in some years from 1906 to 1920 that exceeded median urban white infectious mortality in 1918: Key West, FL (two years), San Antonio, TX (five years), and Asheville, NC (seven years).
Eriksson and Niemesh (2016) found that moving to the North increased the infant mortality rates of African Americans in 1920 but that this effect had disappeared by 1940, mostly because of the disappearance of the urban mortality penalty.