Abstract
Flint switched its public water source in April 2014, increasing exposure to lead and other contaminants. We compare the change in the fertility rate and in health at birth in Flint before and after the water switch to the changes in other cities in Michigan. We find that Flint fertility rates decreased by 12 % and that overall health at birth decreased. This effect on health at birth is a function of two countervailing mechanisms: (1) negative selection, in which less healthy embryos and fetuses do not survive (raising the average health of survivors), and (2) scarring of those who survive (decreasing average health). We untangle these mechanisms to find a net-of-selection scarring effect of a 5.4 % decrease in birth weight. Because of long-term effects of in utero exposure, these effects are likely lower bounds on the overall effects of this exposure.
Introduction
Overwhelming evidence has shown that lead in water contributes to higher rates of lead in the blood and is related to eventual developmental problems in children (Edwards et al. 2009; Hanna-Attisha et al. 2016). Despite this, reports suggest that current EPA plans to test for chemical pollutants including lead are lacking (Booker 2018; Davis 2017; Hall 2019).
Lead exposure also may affect health through indirect channels by decreasing the latent health of those infants carried to term. Latent health is difficult to measure and may not manifest until much later in life, as demonstrated across the literatures of epidemiology (Barker 1995), biology (Schultz 2010), and economics (Almond and Currie 2011). As such, the effects of lead exposure measured in utero and in infancy may represent only a lower bound on the overall effect of lead on health and human capital development.
High blood lead content is associated with cardiovascular problems, high blood pressure, and developmental impairment affecting sexual maturity and the nervous system (Agency for Toxic Substances and Disease Registry (ATSDR) 2007; Zhu et al. 2010). Maternal lead exposure is linked to fetal death, prenatal growth abnormalities, and reduced gestational period and birth weight (Edwards 2014; Taylor et al. 2015; Wang et al. 2019; Zhu et al. 2010) as well as increased infant mortality rates (Clay et al. 2014; Troesken 2006a, 2008). Maternal lead crosses the placenta, poisoning the fetus (Lin et al. 1998; Taylor et al. 2015). Lead exposure also decreases sperm count and male fecundity (Paul 1860; Vigeh et al. 2011).
We study the effect of a switch in the water supply from water sourced from Lake Huron to the more corrosive water sourced from the Flint River, which leached lead from pipes into the water supply, on fertility and birth outcomes. It is important to note that during the period when water was sourced from the Flint River, government officials continually reassured residents that the water was safe. Officials did not issue a lead advisory until September 2015 (Fonger 2015c), which reduced the scope of an avoidance behavioral response to the crisis (Neidell 2009).
The Flint water supply also contained higher rates of trihalomethanes (THMs), which can be detrimental to pregnant women (Cao et al. 2016; Gallagher et al. 1998; Nieuwenhuijsen et al. 2000), although others dispute this (e.g., Horton et al. 2011; Yang et al. 2007). The water switch likely also caused a Legionnaires’ disease outbreak that killed at least 12 individuals (Rhoads et al. 2017). Unfortunately, we cannot separately identify the effects of these contaminants. However, we focus on lead for two reasons. First, a large literature links lead exposure to poor pregnancy outcomes, and the specific results from Flint show elevated lead levels (Hanna-Attisha et al. 2016). Second, early efforts to treat bacterial contaminants (with chlorine disinfectants) and THM byproducts (with ferric chloride) had the unfortunate consequence of increasing the corrosiveness of the water and the potential level of lead poisoning (Masten et al. 2016; Torrice 2016).
Only the city of Flint switched its water source at this time: the rest of Michigan did not. These other areas in Michigan provide a natural control group for Flint given that they are economically similar and otherwise followed similar trends in fertility and birth outcomes over this period. Nevertheless, we also compare Flint with cities across the United States.
Using all live births in Michigan from 2008–2015, we estimate the effect of the switch in water supply on fertility and health. Following the switch, the general fertility rate (GFR) in Flint decreased by 7.5 live births per 1,000 women aged 15–49 (12 %). Because the higher lead content of the new water supply was unknown at the time, this decrease in GFR likely reflects increased fetal death and miscarriage and not behavior changes related to conception, such as contraceptive use or reduced sexual activity. Additionally, the ratio of male to female live births decreased by 0.9 percentage points in Flint compared with surrounding areas.
Because of the large decrease in fertility, selection into birth is a major concern for our birth outcome results. We therefore perform a bounding exercise that provides an upper bound on the birth weight effect caused by the water switch (Bozzoli et al. 2009). Accounting for selection, we find a 5 % decrease in birth weight.
This estimate of the selection-scarring effect of in utero exposure to a contaminated water source contributes to both the fetal origins literature and the literature on the health effects of lead. Furthermore, our study contributes to a growing interdisciplinary literature on the consequences of the Flint water switch. Others have shown increased child lead levels (Hanna-Attisha et al. 2016; Zahran et al. 2017), diminished test scores (Sauve-Syed 2017), increased bottled water purchases (Christensen et al. 2019), and worse health at birth (Abouk and Adams 2018; Wang et al. 2019).
Anecdotal references in news reports suggested that adult data are largely nonexistent. For example, “We know the effects become more profound as your blood-lead concentration increases, but in Flint, relatively few adults were ever tested for exposure, says Michael Kosnett, a professor of pharmacology and toxicology at the University of Colorado School of Medicine” (Patterson 2016). We do know of one recently published article, which found little effect on adult lead tests, but it had very small sample sizes in the pre–water switch period (Gómez et al. 2019). Additionally, one study found statistically significantly higher blood lead concentrations in dogs in Flint after the water switch compared with dogs elsewhere, with several dogs having levels higher than 20 parts per billion (Langlois et al. 2017).
Our article contributes to the literatures of economics, health, and epidemiology. We are the first to investigate the impact of the Flint water switch on fertility rates. We use multiple control groups, including cities across the United States, other cities in Michigan, and areas directly proximate to Flint. We also use both standard methods (difference-in-differences and synthetic control) and newer methods (imperfect synthetic control). Additionally, the amount of lead in the water from the water switch was relatively small compared with historical contexts (Clay et al. 2014), making our estimates particularly informative. No amount of lead in the water is considered safe (National Toxicology Program (NTP) 2012).
More broadly, as fresh water sources become more scarce and aquifers dry up, other local governments will be faced with the difficult decision to switch water sources. Our study focuses on the demographic and health consequences of a water switch in the context of crumbling infrastructure and the fiscal decisions of unelected leaders. It provides important information about the potential unintended consequences of these decisions.
Background on Flint1
Flint, an old manufacturing city, is the birthplace of General Motors (GM) (Scorsone and Bateson 2011). The city has been shedding residents for many years; its contraction followed GM closing several manufacturing plants in and around Flint.2
In 1967, Flint stopped supplying water from the Flint River because of concerns about serving a growing population (Carmody 2016). The city began to receive Lake Huron water via pipeline from the Detroit Water and Sewerage Department (DWSD).
In 2011, based on the city’s precarious economic situation, the Governor of Michigan installed an Emergency Manager who would make all fiscal decisions and “rule local government” (Longley 2011). Citizens and elected officials would have little recourse to fight decisions made by the Emergency Manager. Concurrently, DWSD water rates were rising (Zahran et al. 2017). To cut costs, the Emergency Manager—together with other Genesee County officials—began to build a pipeline directly to Lake Huron in March 2013 (City of Flint 2015a; Walsh 2014). However, the project would take more than two years to complete. In the interim, Flint decided to switch its water source from Lake Huron to the Flint River between April 2014 and the completion of the new pipeline, while Genesee County continued to receive water from DWSD (Carmody 2016).
Flint had to treat the new interim Flint River water source. The city used some of the same treatment products as the DWSD, but Flint did not use anticorrosive inhibitors, such as orthophosphate (Olson et al. 2017; Pieper et al. 2017). Flint citizens began complaining about the color and smell of their water but were continually assured that the water was safe to drink (City of Flint 2015a, b). In August 2014, a boil advisory was announced for part of the city because of a positive fecal coliform test, although the city minimized this adverse result by claiming that it was an “abnormal test” caused by a “sampling error” (Adams 2014; Fonger 2014a). Less than a month later, a second boil advisory was announced for a similar issue. In response to these issues, the city determined to increase chlorine levels in the water (Fonger 2014b). Then in October 2014, GM announced that it would switch off the Flint River as the water source for its Flint plant because the water was too corrosive for its engine parts (Fonger 2014c). The city confirmed the GM switch was best for engine parts but continued to advise that the water was safe for human consumption. In December 2014, Flint received an EPA violation for excess total trihalomethanes (TTHMs) in the water, likely caused by the chemicals used to treat the water (Fonger 2015a).
Throughout early 2015, Flint held public meetings to assure citizens the water was safe and that the TTHM violation would be fixed soon (City of Flint 2015a, b). Concurrently, the Emergency Manager commissioned a report on the safety of the water and rebuffed an offer from DWSD to return Flint to service from Lake Huron water. A team from Virginia Polytechnic Institute and State University (Virginia Tech), organized by Marc Edwards, began independently testing Flint consumers’ water. In August 2015, the team reported much higher levels of lead than previously reported, noting that Flint River water was many times more corrosive than the DWSD water (Edwards et al. 2015). Mona Hanna-Attisha, a Flint pediatrician and researcher, held a press conference September 24, 2015, to report a substantial increase in blood lead levels in children following the water switch (Fonger 2015b; Hanna-Attisha et al. 2016). Although the city initially attacked the results of this study, the resulting public outcry finally forced the city to issue a lead warning and ultimately to switch back to Lake Huron water on October 16, 2015 (Emery 2015).
Literature Review
Background on Lead
Lead is a naturally occurring heavy metal that is associated with health problems. Human activities, including burning fossil fuels and industrial chemical reactions, cause the majority of lead emission into the environment (ATSDR 2007). The United States dramatically decreased lead emissions and blood lead levels by first banning lead-based paint in the 1970s and then reducing leaded gasoline throughout the 1980s before banning it in 1996 (Centers for Disease Control and Prevention (CDC) 2005; Zhu et al. 2010).
Previous work has investigated the effects of general exposure to lead, lead levels in the blood, lead exposure from a water source, and the mechanisms through which lead and other in utero exposures affect current and future health. Each has implications for our study of Flint.
General Exposure to Lead
Exposure to lead is associated with cardiovascular problems, high blood pressure, and developmental impairment affecting sexual maturity and the nervous system (ATSDR 2007; Zhu et al. 2010). Lead crosses the placenta (Amaral et al. 2010; Lin et al. 1998; Rudge et al. 2009; Schell et al. 2003) and is correlated with mental health issues, prenatal growth abnormalities, reduced gestational period, spontaneous abortion, and reduced birth weight (Bellinger 2005; Borja-Aburto et al. 1999; Cleveland et al. 2008; Hertz-Picciotto 2000; Hu et al. 2006; Joffe et al. 2003; Taylor et al. 2015; Vigeh et al. 2010; Zhu et al. 2010). Using variation in lead exposure from the introduction of the Interstate Highway System and the Clean Air Act of 1970, Clay et al. (2018) found that exposure to lead in the air resulted in reductions in the birth rate and a worsening of birth outcomes. Additionally, men exposed to lead, including in industrial settings, have lower fecundity (Alexander et al. 1996; Apostoli et al. 1999, 2000; Assennato et al. 1987; Bonde and Kolstad 1997; Coste et al. 1991; Eibensteiner et al. 2013; Hamilton et al. 1983; Hernberg 2000; Lin et al. 1996; Paul 1860; Sallmén 2001; Sallmén et al. 2000; Shaiau et al. 2004; Vigeh et al. 2011; Winder 1993; Wirth and Mijal 2010; Wu et al. 2012). Even moderate blood lead levels reduce female fertility. According to NTP (2012:1), there is sufficient evidence (“chance, bias, and confounding could be ruled out with reasonable confidence”) that blood lead levels <5 μg/dL reduce fetal growth and limited evidence that blood lead levels <10 μg/dL increase spontaneous abortion and preterm birth.
Lead Exposure From a Water Source
High lead content in water results in increases in lead content in the blood (Edwards et al. 2009; Hanna-Attisha et al. 2016; Troesken and Beeson 2003), in turn increasing the risk of the aforementioned negative health outcomes. Clay et al. (2014) found historical evidence of higher rates of fetal deaths in cities with more lead service pipes and more acidic water. Areas with higher water lead levels have higher rates of preeclampsia (Troesken 2006b). Fetal death rates increased and birth rates decreased following the increase of lead in the water in Washington, DC, from 2000 to 2003 (Edwards 2014). Our study is similar to Edwards (2014), but we use a substantially larger group of comparison cities. That study compared Washington, DC, only with the United States overall and with Baltimore, MD. This makes proper inference difficult because of the small number of clusters in both treatment and control groups (see, e.g., Cameron et al. 2008).
Although previous studies used exact measures of lead in the blood (see, e.g., Taylor et al. 2015; Zhu et al. 2010), their study designs did not include exogenous variation in lead supply. Thus, they could not rule out that the worse birth outcomes they observed are actually attributable to an omitted variable.
Lead increased in the Flint water supply because of improper water treatment. Officials did not treat the Flint River water with corrosion inhibitors. Furthermore, they used ferric chloride to combat infectious bacteria in the water, which increased the likelihood of corrosion (Clark et al. 2015; Pieper et al. 2017). Corrosion inhibitors aid in creating protective corrosion scales within pipes, reducing the amount of lead leached from the pipes (Olson et al. 2017; Pieper et al. 2017). Lead, galvanized, and unknown service line connections all potentially leach lead into the water source, and these accounted for approximately 7 %, 21 %, and 27 % of all connections, respectively, in Flint (Clark et al. 2015; University of Michigan–Flint GIS Center 2016).
Other Outcomes From Lead Exposure
Previous studies have found that increases in lead levels have an adverse effect on later-life cognitive function (Ferrie et al. 2012; Hernberg 2000; Reuben et al. 2017), mental health and criminality (Reyes 2007, 2015), educational outcomes (Aizer et al. 2018), and school suspensions (Aizer and Currie 2019; Billings and Schnepel 2018). However, Billings and Schnepel (2018) and Gazze (2016) found that lead remediation can moderate the negative effects of lead on those exposed and reduce blood lead levels. This finding underscores the importance of lead testing and access to information and health care.
Mechanisms Through Which Lead Affects Health
This study contributes to the large literature on the fetal origins hypothesis, which holds that in utero shocks may affect health. The sign of the effect of these shocks is ambiguous because of two countervailing mechanisms (Almond 2006).
First, fetal insults may lead to selective attrition, the culling of weaker fetuses through miscarriage or fetal death (Almond 2006; Clay et al. 2014; Edwards 2014). Thus, the less healthy fetuses would not be born, leaving only the healthier fetuses, with a potentially positive effect on population health. Second, higher rates of lead may shift the overall health distribution of infants affected in utero toward being more unhealthy, leading to worse health outcomes (scarring). The two effects (selection and scarring) could even approximately cancel each other out for survivors (Bozzoli et al. 2009).3
Behavioral selection into pregnancy may also occur if women decide not to get pregnant because of concerns about their future children’s health. Dehejia and Lleras-Muney (2004) documented nonrandom selection into pregnancy in response to changing labor market conditions. Clay et al. (2018) provided evidence of more-educated women reducing fertility in response to lead exposure. However, women would need to be aware of the water crisis in advance for this explanation to affect our analysis. Although women were aware of several issues with Flint water following the switch, they did not know about the elevated lead content in the water until nearly the end of the Flint River water regime (see Fig. B2, online appendix).
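The interplay of selection and scarring can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the bounding exercise used in the article: latent health is assumed to be standard normal, scarring is modeled as a uniform downward shift, and selection as a survival cutoff. The function name, the shift of 0.2, and the cutoff of -1.5 are all hypothetical.

```python
import random
import statistics

def simulate_mean_health(n, scarring_shift, survival_cutoff, seed=0):
    """Return (mean health of survivors, share surviving).

    Hypothetical setup: latent health ~ Normal(0, 1); scarring lowers
    everyone's health by `scarring_shift`; fetuses whose shifted health
    falls below `survival_cutoff` are not born (selection).
    """
    rng = random.Random(seed)
    health = [rng.gauss(0.0, 1.0) - scarring_shift for _ in range(n)]
    survivors = [h for h in health if h >= survival_cutoff]
    return statistics.fmean(survivors), len(survivors) / n

N = 100_000
NO_CUTOFF = float("-inf")  # nobody is culled

baseline, _ = simulate_mean_health(N, 0.0, NO_CUTOFF)
selection_only, share_selection = simulate_mean_health(N, 0.0, -1.5)
scarring_only, _ = simulate_mean_health(N, 0.2, NO_CUTOFF)
both, share_both = simulate_mean_health(N, 0.2, -1.5)

print(f"baseline mean health:        {baseline:.3f}")
print(f"selection only:              {selection_only:.3f}")
print(f"scarring only:               {scarring_only:.3f}")
print(f"selection and scarring:      {both:.3f}")
```

Under these assumed parameters, selection alone raises the mean health of survivors, scarring alone lowers it, and the combination lies between the two, so the observed change at birth understates the scarring effect.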
Data
We use vital statistics data for Michigan from 2008–2015. These data contain detailed information on every birth in the state: health at birth; background information on the mother and father, such as race, ethnicity, education, and marital status; and prenatal care during pregnancy. We calculate the date of conception for a woman from the clinical gestational estimate and exact date of birth. We define Flint per the census tract–level (UM–Flint GIS Center 2016) data on lead pipes, and then we use U.S. Department of Housing and Urban Development (HUD) census tract-to–ZIP code matching4 and SAS ZIP code–to-city matching5 for the 15 largest non-Flint, Michigan cities (Ann Arbor, Dearborn, Detroit, Farmington Hills, Grand Rapids, Kalamazoo, Lansing, Livonia, Rochester Hills, Southfield, Sterling Heights, Troy, Warren, Westland, and Wyoming). As a robustness check, we also use National Vital Statistics data from 2008–2015 with a marker for the city of residence. These data provide 220 additional comparison cities.
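As a rough illustration of the conception-date calculation, the sketch below subtracts the gestational estimate from the date of birth. The function name is hypothetical, and the simple subtraction is an assumption: the text does not state whether any adjustment is made for how clinical gestational age is dated.

```python
from datetime import date, timedelta

def estimate_conception_date(birth_date, gestation_weeks):
    """Approximate conception date as birth date minus gestational length.

    Assumption: no further adjustment is applied to the clinical
    gestational estimate; the article does not spell out this step.
    """
    return birth_date - timedelta(weeks=gestation_weeks)

# Hypothetical birth: January 1, 2015, after 39 weeks of gestation.
conceived = estimate_conception_date(date(2015, 1, 1), 39)
print(conceived)  # 2014-04-03
```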
Methods
A strength of our study is that it exploits a natural experiment in the exposure of women to contaminants, including lead, caused by an exogenous switch in the water supply. Because policy shifts may occur in response to local conditions that were already changing or to other unobservable factors, policy endogeneity is a risk. However, an Environmental Protection Agency (EPA) memo citing lead concerns in Flint was leaked to the public only in July 2015 and was not confirmed by other researchers until September 2015 (Robbins 2016).
Results
Table 1 presents summary statistics. Columns 1 and 2 present means of births to individuals who did not reside in Flint before and after the water switch, respectively. Descriptive statistics for mothers who lived in Flint at the time of birth before the water switch are presented in column 3, and results for Flint mothers who gave birth after the water switch are presented in column 4.
Mothers who gave birth outside of Flint were older in the pre-period. However, we find no differential change in age between the periods. Women in Flint had lower educational attainment: they were much more likely not to have a high school diploma and less likely to have obtained a college degree. Following the water switch, the proportion of mothers who did not receive a high school diploma decreased by approximately 2.5 percentage points for both Flint and non-Flint mothers. In addition, Flint mothers were more likely to receive a high school diploma, and non-Flint mothers were more likely to complete some college or a college degree.
The GFR in Flint was lower by 8.5 births per 1,000 women aged 15–49 following the water switch compared with control areas. The sex ratio of babies born in Flint skewed more female following the water switch, resulting in a decrease in males of 0.74 percentage points. In the pre-period, babies born in Flint were nearly 150 g lighter than babies born in other areas, were born one-half week earlier, and gained 3 g less per week of gestation. The unadjusted difference-in-differences for these variables was a decrease of 15 g at birth, 0.12 weeks of gestational age, and 0.27 g per gestational week in growth rate.
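The unadjusted difference-in-differences is the change in Flint minus the change in the comparison areas. A minimal sketch follows. The four group means are hypothetical placeholders, chosen only to be consistent with the 150 g pre-period gap and the 15 g estimate quoted above. They are not values from Table 1.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Unadjusted difference-in-differences of four group means."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical mean birth weights in grams (NOT taken from Table 1).
flint_pre, flint_post = 3100, 3080
other_pre, other_post = 3250, 3245

print(diff_in_diff(flint_pre, flint_post, other_pre, other_post))  # -15
```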
Fertility Results
In Fig. 2, we present trends in GFR for Flint and the rest of Michigan separately. We present unadjusted fertility rates.9 The graph demonstrates that although births in Flint are more volatile because of the smaller base sample in the area, fertility rates decreased substantially in Flint for births conceived around November 2013, and the decrease persisted through the beginning of 2015. Flint switched its water source in April 2014, so these births would have been exposed to this new water for at least one trimester in utero. Other cities in Michigan had similar seasonally volatile GFR trends but did not display large decreases in GFR following the Flint water switch.
Table 2 presents regression results for GFR by city. The main coefficient of interest is β1, the coefficient on Water_ct estimated using Eq. (2). The unit of observation is city-month. We estimate that women living in Flint following the water switch gave birth to 7.5 fewer infants per 1,000 women aged 15–49 compared with control cities. These results are statistically significant at the 0.1 % level. On a base of 62 births per 1,000 women aged 15–49, this signifies a 12.0 % decrease. In column 2, we include a more flexible measure of fixed effects by interacting month with year. Estimates are nearly identical. Because we have only one treated cluster, we adjust our standard errors using the wild bootstrap method (Cameron et al. 2008) and find consistent inference results. We include city-specific linear time trends in column 3, and the results are statistically significant, but we do not consider them our main results because of concerns about potentially biased treatment group–specific time trends (Lindo and Packham 2017; Wolfers 2006).10,11
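The conversion from the GFR estimate to a percentage change is simple arithmetic, sketched below with hypothetical function names. With the rounded figures quoted in the text (7.5 and 62), the ratio is about 12.1 %, close to the reported 12.0 %, which presumably reflects unrounded inputs.

```python
def general_fertility_rate(live_births, women_15_49):
    """Live births per 1,000 women aged 15-49."""
    return 1000.0 * live_births / women_15_49

def percent_change(effect, base):
    """Express an absolute effect as a percentage of the base rate."""
    return 100.0 * effect / base

pct = percent_change(-7.5, 62.0)  # rounded inputs quoted in the text
print(round(pct, 1))  # -12.1
```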
We also examine how the sex ratio of live births changed in Flint, given the medical literature finding that male fetuses are more susceptible to fetal insults (Sanders and Stoecker 2015; Trivers and Willard 1973). We find that sex ratios decrease by 0.9 percentage points (1.8 %) in Flint compared with other Michigan cities. Sanders and Stoecker (2015) found that birth ratios skewed more male following the implementation of the Clean Air Act and argued that this result is consistent with an increase in health. Although our finding of an increase in the proportion of births that are female likely represents a level of selection consistent with an increase in fetal deaths, it is also consistent with a decrease in health at the time of birth. In light of the concerns about biased treatment group–specific time trends (Lindo and Packham 2017; Wolfers 2006), we are not concerned by the results in column 6 being smaller in magnitude.
In Table 3, we limit our sample to demographic subgroups. We use demographic characteristics of mothers because they are more accurately measured and available for nearly all births. These results are from Poisson regressions because we do not have good measures of population by subgroup. For the full sample, we find a decrease in births of 0.15, which can be interpreted as approximately a 15 % decrease in births in Flint. This decrease drops to 5 % when we include city-specific linear time trends. Results by age show a similar pattern for all age groups, although these results differ when including a city-specific linear time trend, especially for older mothers. Decreases in births are also apparent for both white and black mothers. We find that our sex ratio results are driven by decreases among mothers aged 25–49 and white mothers. Results for black mothers show an oppositely signed, statistically significant estimate.
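If the reported 0.15 is the raw coefficient on the treatment indicator in the Poisson model (an assumption, since the table itself is not reproduced here), the exact implied percentage change is exp(β) − 1, which is slightly smaller in magnitude than the coefficient itself. The helper name below is hypothetical.

```python
import math

def coef_to_percent(beta):
    """Percentage change in the expected count implied by a Poisson
    (log-link) coefficient on a binary indicator."""
    return 100.0 * (math.exp(beta) - 1.0)

print(round(coef_to_percent(-0.15), 1))  # -13.9
print(round(coef_to_percent(-0.05), 1))  # -4.9
```

For small coefficients the approximation "coefficient ≈ percentage change" is close, which is why −0.15 is read as roughly a 15 % decrease.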
Birth Outcomes
Next, we turn our focus to birth outcomes. If increased lead in the water supply has only a selective attrition effect, then we would expect an increase in health among the births in Flint because the selection would remove only the weakest and leave the healthier fetuses to come to term. Alternatively, if a scarring effect also is present, then we would expect a decrease in health for those births that occurred.
We first investigate whether the switch in water supply caused a change in birth weight, shown in Table 4.12 We cluster standard errors in these regressions at the census tract level. We find negative results for birth weight; however, our birth weight estimates are imprecisely estimated. Adding census tract, month, and year of conception fixed effects and additional covariates in columns 2–5 does not substantially change the coefficient.
Results for low birth weight, gestational age, and gestational growth rate all indicate worse health, but no finding is statistically significant. The magnitudes of the coefficients are all small, with the exception of low birth weight, suggesting effect sizes that are not economically meaningful.13
Behavioral Changes
Behavioral changes, rather than the physiological impacts of lead, could be driving our results. Following Barreca et al. (2018), in Table 5 we use the American Time Use Survey (ATUS) to investigate time spent engaged in sexual relations, proxied by any time spent in “personal or private activities.”14 These analyses are at the level of the county or core-based statistical area (CBSA) and are thus not directly comparable to our main results, given that Flint accounts for approximately one-quarter of the population of Genesee County. We find that sexual activity increased in the post-period, which would bias our main result of a decrease in the fertility rate toward zero.15 Although only suggestive, this evidence supports our conclusion that the reduced conception rate is not driven by reduced sexual activity.16
Synthetic Control Methods
We perform an analysis of fertility rates using a synthetic control method (Abadie and Gardeazabal 2003; Abadie et al. 2010).17 This method creates a weighted control group that more closely resembles the characteristics of Flint in the period before the water switch on both level and trend of fertility rates. It also controls for demographic characteristics of mothers in the pre-period, including race/ethnicity, educational attainment, and gender of the child. Figure 3, panel a displays GFR trends in Flint and its synthetic control group before and after the water switch. Panel b shows the difference between each city systematically assigned to treatment and the synthetic version of the city for each month. Flint is denoted by the solid line. The average treatment effect in Flint compared with the synthetic control is a decrease of 11.6 births, presented in panel c by the vertical black bar—a slightly larger effect size than shown in Table 2. The average treatment effect in Flint is substantially larger than the average treatment effect for all other cities.18
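The core idea of the synthetic control method, choosing nonnegative donor weights that sum to one so that the weighted donors track the treated unit in the pre-period, can be sketched as follows. This is a deliberately simplified illustration: it matches on pre-period outcomes only (no covariates), uses a coarse grid search rather than the optimizer of Abadie et al. (2010), and all GFR series and function names are hypothetical.

```python
from itertools import product

def synthetic_weights(treated_pre, donors_pre, steps=20):
    """Grid-search weights (>= 0, summing to 1) that minimize the squared
    pre-period gap between the treated unit and the weighted donors."""
    n_donors = len(donors_pre)
    best_w, best_loss = None, float("inf")
    for combo in product(range(steps + 1), repeat=n_donors - 1):
        if sum(combo) > steps:
            continue  # weights would exceed 1
        w = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        loss = 0.0
        for t, y in enumerate(treated_pre):
            synth = sum(wj * d[t] for wj, d in zip(w, donors_pre))
            loss += (y - synth) ** 2
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

def average_gap(treated_post, donors_post, weights):
    """Mean post-period difference: treated minus synthetic control."""
    gaps = []
    for t, y in enumerate(treated_post):
        synth = sum(wj * d[t] for wj, d in zip(weights, donors_post))
        gaps.append(y - synth)
    return sum(gaps) / len(gaps)

# Hypothetical GFR series (pre-period, then post-period) for three donors.
donors_pre = [[60, 61, 59, 62], [70, 69, 71, 70], [50, 52, 51, 50]]
donors_post = [[61, 60], [69, 70], [51, 50]]
treated_pre = [65, 65, 65, 66]
treated_post = [55, 54]

weights, loss = synthetic_weights(treated_pre, donors_pre)
print(weights)                                           # [0.5, 0.5, 0.0]
print(average_gap(treated_post, donors_post, weights))   # -10.5
```

The grid search is practical only for a handful of donors. It is used here because it makes the constraint set explicit without relying on an external optimization library.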
As an additional robustness check, we perform a synthetic control model matching on all GFR for March in each year before the water switch (2008–2013). The strength of this analysis is that it creates a better pre-trend match on GFR, but it may overfit on GFR and ignore other covariates (see Kaul et al. 2018).19
Finally, we use an imperfect synthetic control method (Powell 2018). This method solves two issues with Abadie et al.’s (2010) synthetic control method. First, it improves inconsistent pre-period match resulting from transitory shocks by using pre-period outcomes predicted from city-specific flexible time trends instead of the actual pre-period outcomes. Second, this approach allows the treated group to be an outlier and therefore not a convex combination of the control groups (i.e., with nonnegative weights). Using the imperfect synthetic control method, a convex combination of the treated group and the rest of the control group can match the outcome of a control group that has been assigned a falsification treatment. If the treated group has a positive weight in this situation, that weight can be inverted to describe the mapping from the control group to the treated group.
Figure 4, panel a shows that the imperfect synthetic control group is a better match for Flint in the pre-period. Panel b demonstrates that the decrease in GFR in Flint is larger than in any other area following the water switch, which provides additional evidence of the statistical significance of our estimates.
Robustness Checks
We perform several robustness checks. First, we perform a randomization inference permutation test (Cunningham and Shah 2018; Fisher 1935). This test systematically assigns treatment status to each control area and compares the size of the treatment effect for each control area with that of the actual treated area. As shown in Fig. B7 in the online appendix, we find our effect size in Flint is larger than all control area treatment effects, providing additional support for our main analyses.
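The logic of the permutation test can be sketched as a rank calculation: the treated unit's estimate is compared with the placebo estimates obtained by assigning treatment to each control area in turn. The function name is hypothetical, the test is written as one-sided (for a decrease), and the placebo values are invented for illustration. Only the count of 15 comparison cities and the −7.5 estimate come from the text.

```python
def permutation_p_value(treated_effect, placebo_effects):
    """One-sided randomization-inference p-value for a decrease: the share
    of all units (treated plus placebos) whose estimated effect is at
    least as negative as the treated unit's."""
    all_effects = [treated_effect] + list(placebo_effects)
    as_extreme = sum(1 for e in all_effects if e <= treated_effect)
    return as_extreme / len(all_effects)

# Hypothetical placebo estimates for 15 control cities (NOT from Fig. B7).
placebos = [1.2, -0.8, 0.3, -2.1, 0.9, -1.5, 2.0, -0.4,
            0.1, -3.0, 1.7, -0.6, 0.5, -1.1, 0.8]

print(permutation_p_value(-7.5, placebos))  # 0.0625
```

With one treated unit and 15 placebos, the smallest attainable p-value is 1/16, which is reached when the treated effect is the most extreme of all.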
Second, we compare county-level GFRs in Table B1 in the online appendix. The treatment in this table includes all of Genesee County, of which Flint accounts for approximately one-quarter of the population. Our estimates are just 17 % as large as those in Table 2, which is to be expected given that the treatment sample is contaminated with nonaffected areas. However, GFR still decreases in a statistically significant way in Genesee County compared with other counties in Michigan following the Flint water switch.20
Table B2 (online appendix) presents results that are robust to limiting our sample to GFRs of births conceived before September 2014 and dropping the cities with the highest and lowest GFRs. We investigate births conceived before September 2014 because of potential avoidance following boil advisories due to fecal coliform in the Flint water source reported around this time. The decrease in fertility rates was larger in this early period than in our main results, which use the full post-period. One potential explanation for this change is that avoidance behaviors caused Flint residents to begin buying bottled water at higher rates after September 2014 (Christensen et al. 2019).21
In Table B3 (online appendix), we estimate the effect of the water switch on log births. We find a 15 % to 18 % decrease in Flint following the water switch, which is comparable to our 12 % to 14 % result in Table 2. In Table B4 (online appendix), we estimate a Poisson model and find a decrease in births of 0.15 (i.e., a 15 % decrease in births in Flint). These results assume a constant population in Flint over the study period. Estimates of population in Flint decrease over the study period, which may partially explain the larger magnitude of the effect in these models. In Table B5 (online appendix), we estimate a 90 % confidence interval using Conley-Taber standard errors and find that our fertility results are still statistically significant (Conley and Taber 2011).
We find consistent results comparing only Flint and the rest of Genesee County (Table B6, online appendix). As a falsification analysis, we compare Genesee County, excluding Flint, with the rest of Michigan in Table B7 (online appendix). We find no change in GFR or sex ratios, providing strong support for a change within Flint at the time of the water switch driving our results.
Table B8 in the online appendix shows that when we aggregate our analysis to the quarterly level, we find nearly identical results. In Table B9 (online appendix), we limit our analysis period to births conceived in 2011 to 2015. This analysis contains a pre- and post-period of similar lengths. We find similarly sized, statistically significant estimates of GFR and nonstatistically significant estimates for sex ratio.
To determine whether Michigan cities are a good comparison group for Flint, we use data for all U.S. cities with a population of 100,000 or larger from the National Center for Health Statistics (NCHS 2019) for the same 2008–2015 period. In these analyses, our Flint sample remains the same as in the previous analyses. Our estimates using these data, presented in Table B10 (online appendix), are slightly smaller but are still statistically significant. Our results are substantially larger when we include city-specific linear time trends, which may be more justified because these cities are distributed across many states and thus may trend differently. In addition, we focus on cities with a larger black population, like Flint, and consistently find effects for GFR that are more negative and more in line with our main results from Table 2.22
We perform several robustness checks on our synthetic control method analyses in Table B11 in the online appendix. First, we include our main results by month as in Fig. 3 and Fig. B6 (online appendix) as a reference. We also perform these analyses collapsing the data to the quarterly level in columns 3 and 4, and we find very similar results. In panel b, we increase our donor pool to include all cities in the NCHS data. We estimate all four specifications on this sample and again find qualitatively and quantitatively similar results.
In section C of the online appendix, we focus on Flint compared with counties in Michigan rather than cities. The results are largely robust to this alternative definition of control areas.
Discussion
Our results for the decrease in the fertility rate are plausible given the broader scientific literature on this topic. Specifically, Edwards (2014) studied an increase in lead in drinking water in Washington, DC, in the early 2000s using somewhat different methods and found a similar 12 % decrease in the fertility rate.
We attempt to extrapolate the consequences of our results. The population of women aged 15–49 in Flint during our study period is approximately 26,000. The GFR dropped from 62 to 57, suggesting that over our study window of 17 months (births conceived from November 2013 through March 2015), between 198 and 276 more children would have been born had Flint not switched its water source.23 We consider this strong empirical support for the existence of a culling effect caused by increased lead in the water. Our results on sex ratios suggest that among the live births that occurred in Flint following the switch in water supply, 18 more female infants were born than expected.24
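The back-of-the-envelope arithmetic above can be sketched as follows. This is a minimal illustration that uses only the rounded figures quoted in the text; the function names are hypothetical, and GFR is taken to be annual births per 1,000 women aged 15–49.

```python
def missing_births(gfr_drop, women, months):
    """Births forgone when GFR (annual births per 1,000 women) falls by gfr_drop."""
    return gfr_drop * (women / 1000) * (months / 12)

def excess_female_births(sex_ratio_change, post_births):
    """Additional female births implied by a change in the female share of births."""
    return sex_ratio_change * post_births

# Rounded inputs taken from the text and notes.
did_based = missing_births(7.5, 26_000, 17)      # difference-in-differences estimate
raw_based = missing_births(62 - 57, 26_000, 17)  # raw pre/post GFR change, rounded
girls = excess_female_births(0.009, 2_010)

print(round(did_based), round(raw_based), round(girls))
```

With these rounded inputs, the difference-in-differences figure reproduces the 276 in the text and the sex ratio figure reproduces 18. The raw GFR change gives about 184 rather than 198; the reported lower figure presumably rests on unrounded GFRs.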
We now turn to a discussion of our nonfinding for the effects of the Flint water switch on birth outcomes. In isolation, one could infer from these results that there was no biological effect of lead on birth outcomes at the population level. However, this inference would be naïve given our finding of a statistically significant decrease in fertility rates and the likely nonrandom selection mechanism behind it, which could have balanced out a negative scarring effect.
We therefore perform an analysis in the spirit of Bozzoli et al. (2009) to disentangle scarring and selection effects. We assume that the pre–water switch birth weight distribution in Flint is normally distributed (see Fig. 5), with the mean (3,082 g) and standard deviation (632 g) as in column 3 of Table 1. We assume that the 12 % reduction in the live birth rate all came from the left tail of the birth weight distribution given that birth weight is often thought of as a proxy for infant health. In other words, there is some minimal birth weight cutoff for live birth, and the selection shock of adding lead to the water shifted the entire distribution left such that the bottom 12 % of birth weight did not survive.
Using the formula for the mean of a truncated normal,25 we calculate that mean birth weight of the surviving newborns, without any scarring, would have been 3,242 g. The observed Flint mean birth weight in the post-period is 3,042 g, a decrease of 200 g. Removing the pre-post difference in the rest of Michigan (from columns 1 and 2 of Table 1) reduces this by 25 g (to 3,217 g), leaving a scarring effect of 175 g—a 5.4 % decrease compared with 3,217 g. This is much larger than the scarring effect found from ignoring how scarring and selection cancel each other out (as in Gørgens et al. 2012) and naïvely using the coefficient in Table 4. We consider this a bounding exercise for the full effect of scarring if no selective attrition had occurred. As Fig. 5 makes clear, despite the large amount of selective attrition documented in Table 2, our probability density function for Flint shows that the health distribution shifted to the left in Flint following the water switch but did not shift in comparison cities.
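The two steps of this bounding exercise can be sketched in code. The sketch assumes the standard formula for the mean of a normal distribution truncated from below, E[X | X > p] = μ + σ ϕ(α) / (1 − Φ(α)) with α = (p − μ) / σ, and hypothetical function names. Evaluated at the rounded inputs quoted in the text, the formula gives roughly 3,226 g, which is close to but not identical with the 3,242 g reported, so the decomposition step below takes the reported value as its input.

```python
from statistics import NormalDist

def truncated_mean(mu, sigma, share_removed):
    """Mean of a normal(mu, sigma) after removing its lowest `share_removed` fraction."""
    std = NormalDist()
    alpha = std.inv_cdf(share_removed)  # standardized truncation cutoff
    return mu + sigma * std.pdf(alpha) / (1.0 - share_removed)

def scarring(no_scarring_mean, observed_mean, control_change):
    """Net out the comparison-group change, then compare with the observed mean."""
    counterfactual = no_scarring_mean - control_change
    effect = counterfactual - observed_mean
    return counterfactual, effect, effect / counterfactual

# Step 1: selection only, using the rounded inputs from the text.
selected_mean = truncated_mean(3082, 632, 0.12)

# Step 2: scarring, taking the reported no-scarring mean (3,242 g) as given.
counterfactual, effect, share = scarring(3242, 3042, 25)
print(round(selected_mean), counterfactual, effect, round(100 * share, 1))
```

The second step reproduces the figures in the text: a counterfactual mean of 3,217 g, a scarring effect of 175 g, and a 5.4 % decrease.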
Additionally, although our sex ratio results are not definitive, they support our main result that fertility rates decrease because of both selective attrition and scarring from a biological effect of an increase in contaminants including lead in the water. The 0.9 percentage point increase (1.8 %) in female births following the water switch is consistent with worse health at birth (Sanders and Stoecker 2015; Trivers and Willard 1973). Moreover, as in Table 5, we find no evidence to support a decrease in sexual relations among individuals living in Flint during this period. For our results to be explained by behavioral changes, we would have to hypothesize that at the same time Flint switched its water source, parents changed their preference for male children and began performing sex-selective abortions showing a preference for female children. This result would run counter to the prevailing evidence of fewer female births than expected, especially in Asian countries (e.g., Das Gupta 2005; Sen 1990) and also in the United States (Abrevaya 2009).
Conclusion
We provide the first estimates of the in utero effect of increased amounts of lead and other contaminants in drinking water in Flint. General fertility rates in Flint decreased substantially following the water switch, and health outcomes displayed mixed results, with suggestive evidence of an overall decrease in birth weight and an increase in rates of low birth weight. Our careful empirical approach sufficiently eliminates potential biases in the comparison groups, and our use of several comparison groups lends credence to our results.
An overall decrease in fertility rates can have lasting effects on a community, including by decreasing school funding because of a decrease in the number of students. If the decrease in births reflected only a culling effect, that effect could reduce the health expenditures of the community. Given the research demonstrating a substantial increase in blood lead levels among children exposed to lead, an overall decrease in health expenditures in both the short- and long-term is highly unlikely (Edwards et al. 2009; Hanna-Attisha et al. 2016). Furthermore, children who seemed to be healthy at birth may still have worse latent health at birth, which could manifest later in their lives (Barker 1992, 1995). Although local programs work to counteract the water crisis (e.g., Hanna-Attisha 2017), these latent health effects remain a concern.
This study has several limitations. First, other contaminants may be present in the water that also affect health, so our results estimate the overall effect of the water switch on these outcomes. Additionally, we are not able to follow women long term to determine whether these fertility rate decreases persist after Flint switched back to a less corrosive water source. Future research should investigate the long-term effects on fecundity of both men and women and associated pathways following an increased exposure to lead. In addition, the health effects of a switch in water supply are not limited to pregnant women and neonates. This is just one consequence of this water supply switch. With the litany of evidence linking fetal and birth outcomes to health, education, and labor outcomes later in life, this study is an important step in investigating this public health issue. Despite these limitations, the culling of births in Flint provides robust evidence of the effect of lead on the health of not just infants but also potential newborns in utero.
To our knowledge, this is the first study of the effect of the Flint water switch on fertility and birth outcomes. The switch is a natural experiment from which to study the effect of high concentrations of lead in water on birth outcomes. Lead problems in many municipalities have recently been reported, making these estimates important in informing public policy (see Wines and Schwartz 2016).
This study is of great importance as the current legislative environment includes calls for a substantial decrease in funding for the EPA, which is charged with ensuring that localities maintain minimum water standards. Our results suggest that a less restrictive regulatory environment in the context of drinking water may have substantial unforeseen consequences on maternal and infant health, including large reductions in the number of births.
Acknowledgments
We thank Vincent Francisco, Kate Lorenz, Matt Neidell, Dhaval Dave, Dietrich Earnhart, Josh Gottlieb, Ben Hansen, Shooshan Danagoulian, Scott Cunningham, Edson Severnini, David Keiser, Peter Christensen, Charles Pierce, Nicolas Ziebarth, Nigel Paneth, Tom Vogl, Nick Papageorge, Karen Clay, Bryce Steinberg, Ken Chay, Osea Giuntella, Werner Troesken, Emily Rauscher, Donna Ginther and Jarron Saint Onge; seminar participants at the University of Minnesota, University of North Carolina, Appalachian State University, Johns Hopkins University, University of Pittsburgh, Brown University, and the University of Kansas Medical Center; and other conference participants at the 2017 iHEA conference, the 2017 National Bureau of Economic Research Summer Institute, and the 2017 APPAM Conference for their suggestions and feedback. We also thank Glenn Copeland of the Michigan Department of Community Health, Vital Records and Health Statistics Division, for providing vital statistics data; the University of Michigan–Flint GIS Center for sharing data; David Powell for sharing his imperfect synthetic control method code; Anil Kumar for sharing his Conley-Taber code; the West Virginia University Center for Free Enterprise for financial support; the Big XII Faculty Research Fellow Program; and the staff at KU IT for managing our research server.
Notes
Figure B1 in the online appendix provides a timeline of events around the water switch.
GM employment in Flint decreased from 80,000 in 1978 to 30,000 in the 1990s to less than 10,000 as of 2011 (Scorsone and Bateson 2011).
In the Great Chinese Famine, taller children were more likely to survive but were stunted, resulting in minimal height change but taller, unscarred grandchildren (Gørgens et al. 2012).
See https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_15_1YR_S0101&prodType=table for “State 040,” “Place 160.”
Our sample covers 95 conception months: May 2007–March 2015, corresponding to January 2008–December 2015 birth data.
Our results are robust to varying the treatment date (see Fig. B3, online appendix).
We calculate a 13-month moving average (+/– 6 months) to remove seasonality and idiosyncratic noise (see Fig. B4, online appendix).
We test for statistically significant differences in pre-trends by dropping the treated months for Flint and then regressing GFR on a linear time trend interacted with the Flint dummy variable as well as city and month-year fixed effects. We find a monthly pre-trend difference of –0.0385 (SE = 0.0180; p < .05). However, we are not concerned with this difference for three reasons. First, the magnitude is small. Suppose that this pre-trend continued after the water switch and was driving our results. With 95 months of data (78 pre-period and 17 post-period), at the end of the pre-period, the GFR in Flint would be 3.00 lower than in the comparison cities (78 × –0.0385 = –3.00), and so the average pre-period difference would be –1.50. At the end of the post-period, the gap would have grown to –3.66 (95 × –0.0385), resulting in a post-period difference in averages of –3.33. A difference-in-differences analysis would therefore give a result of –1.83, but this would explain only 24.5 % of our main result (–7.451). Thus, nonparallel trends explain less than one-quarter of our estimated effect. Furthermore, removing this –1.83 from our estimate in column 2 gives –5.62, which is almost exactly the result in column 3 that controls for city-specific linear time trends. Second, this pre-trend difference has a p value of .049, whereas our main result has a t statistic of –9.19. Thus, our main results are much more precisely estimated. Finally, we find consistent results (discussed later) when applying both the synthetic control method and the imperfect synthetic control method, both of which by design better match pre-trends between the treatment and control cities.
We sought administrative records on fetal deaths from the state of Michigan. Unfortunately, the state sent us two data files showing different results, and so we are unable to report whether the fetal death rate increased or decreased. However, using the data set that gives the largest increase in the fetal death rate explains only 2 % of the decrease in the fertility rate.
We also estimate models using abnormal conditions as the dependent variable, but this variable is often missing data. We estimate models only for 2010 onward and find no evidence of changes in abnormal conditions.
Clustering at the city level, rather than the census tract level, provides mostly statistically significant estimates. Still, scarring and selection may be negating each other, and so the disentangling presented in the Discussion section applies.
The specific ATUS wording is “having sex, private activity (unspecified), making out, personal activity (unspecified), cuddling partner in bed, spouse gave me a massage.”
This result is analogous to Barreca et al.’s (2018) findings of a statistically significant increase in the probability that individuals spend time on sex during environmental conditions that reduce fertility.
The ATUS has only county/CBSA identifiers. In Table B1 of the online appendix, we repeat our analyses at the county level and show that although considering the rest of Genesee County (where Flint is) as treated reduces the magnitude of our results, they are still directionally consistent and statistically significant in some specifications.
We describe this method in section A of the online appendix.
We find similar effect sizes when we drop outlier cities (Fig. B5, online appendix) and when we drop Flint from the inference analysis so that when we assign treatment to each control group, Flint cannot be part of a synthetic control.
Our estimates are robust to this alternative specification (see Fig. B6, online appendix). We also find similar results when we match on the fourth quarter GFR for each year before the water switch (2007, 2008, 2009, 2010, 2011, and 2012) and when we use a 13-month moving average for GFR.
The exception is the county-specific linear time trend, which biases the results for the reasons described earlier.
Our final period, March 2015, shows an increase in GFR for Flint. We are unable to determine whether that month’s GFR is an outlier or a general trend toward higher GFRs.
Limiting the comparisons to cities with a population density similar to Flint’s (approximately 3,000 individuals per square mile) also provides comparable results.
The change in GFR in Flint = (62 – 57) × population aged 15–49 in Flint (26,000) × the number of years affected (17 / 12) = 198; the difference-in-differences estimate = (7.5) × 26,000 × 17 / 12 = 276.
The change in sex ratio (0.009) × the number of post–water switch births (2,010) = 18.
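$E[X \mid X > p] = \mu + \sigma \, \frac{\phi\left(\frac{p - \mu}{\sigma}\right)}{1 - \Phi\left(\frac{p - \mu}{\sigma}\right)}$, where μ is the mean, σ is the standard deviation, Φ is the standard normal CDF, ϕ is the standard normal PDF, and p is the truncation cutoff.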
References
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.