Exploiting unique German administrative data, we estimate the association between an expansion in maternity leave duration from two to six months in 1979 and mothers’ postbirth long-term sickness absence over a period of three decades after childbirth. Adopting a difference-in-difference approach, we first assess the reform’s labor market effects and, subsequently, prebirth and postbirth maternal long-term sickness absence, accounting for the potential role of the reform in mothers’ selection into employment. Consistent with previous research, our estimates show that the leave extension caused mothers to significantly delay their return to work within the first year after childbirth. We then provide difference-in-difference estimates for the number and length of spells of long-term sickness absence among returned mothers. Our findings suggest that among those returned, mothers subject to the leave extension exhibit a higher incidence of long-term sickness absence compared with mothers who gave birth before the reform. This also holds true after we control for observable differences in prebirth illness histories. At the same time, we find no pronounced effects on mothers’ medium-run labor market attachment following the short-run delay in return to work, which might rationalize a negative causal health effect. Breaking down the results by mothers’ prebirth health status suggests that the higher incidence of long-term sickness absence among mothers subject to the reform may be explained by the fact that the reform facilitated the reentry of a negative health selection into the labor market.
Most Organisation for Economic Co-operation and Development (OECD) countries offer at least brief periods of maternity leave covering some weeks before and after childbirth. Their primary aim is to protect mothers and their offspring from immediate health impairments around childbirth (for an overview, see Tanaka 2005). In addition, many countries offer more-generous policies. One such example is Germany, which over the past decades experienced several expansions in maternity leave. Although more recent reforms were primarily initiated to enhance children’s well-being and the compatibility of childrearing and female employment, West Germany’s earliest reform in 1979 explicitly aimed to improve mothers’ health (e.g., Dustmann and Schönberg 2012).
The goal of our study is to evaluate possible maternal health consequences of this reform, which increased the length of paid, job-protected maternity leave from eight weeks to six months. This corresponds to a pre-reform setting that was quite close to the current maximum of 12 weeks of unpaid, job-protected leave in the United States and the minimum of 14 weeks of paid, job-protected leave in the European Union (EU) (see Dagher et al. 2014). The role of maternity leave regulations in Germany and elsewhere has been widely studied with regard to labor market outcomes (e.g., Baker and Milligan 2008b; Ondrich et al. 2003; Schönberg and Ludsteck 2014), fertility (Hoem 1993; Lalive and Zweimüller 2009), and children’s outcomes (e.g., Baker and Milligan 2010; Dustmann and Schönberg 2012; Rossin 2011), but relatively few studies have investigated potential associations with maternal health (for reviews, see Aitken et al. 2015; Staehelin et al. 2007). Against the background of a continuously growing share of working mothers, this lack of knowledge seems unfortunate. We contribute to closing this gap by exploiting unique administrative data from the German Pension Register and the Federal Employment Agency (BASiD), covering a period of up to three decades after the initial expansion of leave duration.
Because the reform considered here became effective in May 1979, assignment to the two policy regimes depends on the child’s birthday and is therefore close to random. To assess the reform’s implications for mothers’ health outcomes, we adopt a difference-in-difference approach by comparing the differences in outcomes of mothers who gave birth shortly before and after the reform with differences in outcomes of mothers giving birth during the same calendar months one year prior to the reform. Health is measured by the number and length of spells of long-term (>6 weeks) sickness absence, which Marmot et al. (1995:124) proposed to “be used as an integrated measure of physical, psychological, and social functioning in studies of working populations.” As we will argue in this article, long-term sickness spells in Germany provide a highly reliable measure of illness given that the misuse of sickness payments is strongly restricted by the health insurance’s auditing system. Because long-term sickness absence is conditional on labor market participation, our measure is informative about the health outcomes of those mothers who returned to the labor market. To accomplish this, we take a two-step analytical approach, assessing (1) the reform’s effect on mothers’ return-to-work behavior (extending previous research by Schönberg and Ludsteck 2014) and (2) prebirth and postbirth maternal health outcomes, accounting, importantly, for the potential role of the reform in mothers’ selection into employment.
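To make the comparison concrete, the following minimal sketch computes a two-by-two difference-in-difference from group means. The function name, the dictionary layout, and all numbers are hypothetical illustrations and are not taken from the BASiD data; the regression-based estimates in the article additionally include covariates.

```python
from statistics import mean

def did_estimate(outcomes):
    """Two-by-two difference-in-difference from group means.

    `outcomes` maps (birth_year, period) to a list of individual outcomes,
    where period is "pre" (birth before May) or "post" (birth from May on).
    """
    # Before/after difference in the reform year 1979
    diff_reform = mean(outcomes[(1979, "post")]) - mean(outcomes[(1979, "pre")])
    # Same calendar-month difference one year earlier (no reform)
    diff_comparison = mean(outcomes[(1978, "post")]) - mean(outcomes[(1978, "pre")])
    return diff_reform - diff_comparison

# Hypothetical numbers of long-term sickness spells per mother (made up).
toy = {
    (1979, "pre"): [0, 1, 0, 1],
    (1979, "post"): [1, 1, 0, 2],
    (1978, "pre"): [0, 1, 1, 0],
    (1978, "post"): [1, 0, 1, 0],
}
estimate = did_estimate(toy)
print(estimate)
```

Subtracting the 1978 difference is meant to net out seasonal differences between mothers giving birth early and later in the year that are unrelated to the reform.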
Whereas our focus on returning mothers is highly relevant from a policy point of view, it restricts us to identifying a joint effect on health outcomes comprising both a causal and a potential unobservable compositional component. Although we cannot separate the latter, we shall attempt to assess the quantitative relevance of observable compositional effects. Our strategy here is to examine the relationship between observable prebirth illness histories and return-to-work patterns across mothers who gave birth shortly before and after the reform. Controlling for such differences in health is not only informative about a health selection on observable characteristics but may also yield important insights into the likely direction of a potential unobservable composition effect.
Previous Empirical Findings
Our analysis is closely related to the literature on the maternity leave–maternal health nexus (see Aitken et al. 2015). A first strand of empirical evidence stems from the psychological and public health literature, whose results tend to suggest that short maternity leave bears a negative association with women’s postpartum mental health (e.g., Hyde et al. 1995; Tucker et al. 2010). Whitehouse et al. (2013) estimated the optimal paid leave duration to reduce Australian mothers’ psychological distress to be more than 6 months but not more than 12 months, because longer leave duration would be associated with larger economic pressures (also see Dagher et al. 2014).
A major drawback of these studies is that they are mainly descriptive. Obviously, mothers’ timing of their postbirth employment interruption is not independent of their health. To address the resulting endogeneity problem, another strand of literature exploits differences in maternity leave legislation providing an exogenous variation in individuals’ uptake behavior. Baker and Milligan (2008a) followed this approach and found no effect of an increase in maternity leave entitlements on Canadian mothers’ self-reported health, depression, or specific postpartum health problems in the first two years following childbirth. Chatterji and Markowitz (2005) exploited regional variation in leave mandates to provide instrumental variable estimates from a national sample of employed U.S. mothers, finding only weak evidence that returning to work later affects the probability of having three or more outpatient physician or clinic visits in the six months following childbirth (for similar findings from Denmark, see Beuchert et al. 2016). Liu and Skans (2010) found no association between extended leave spells in Sweden and mothers’ hospital admissions due to mental disorders within 16 years after childbirth, but Chatterji and Markowitz (2005) suggested that a one-week increase in the length of maternal leave reduces a scale of depressive symptoms by approximately 7 %, on average. Chatterji and Markowitz (2012) corroborated this result and found a positive relationship between longer maternity leave and mothers’ overall health. Exploiting institutional variation across continental European countries, Avendano et al. (2015) found evidence that maternity leave policies yield significant mental health benefits for mothers that extend beyond the period around birth and persist into older age.
Most previous studies have been limited to considering variations in maternity leave uptake ranging from roughly 6 to 14 weeks, monitoring mothers’ health status for no more than 12 months after childbirth (cf. Aitken et al. 2015; Staehelin et al. 2007). The analysis by Avendano et al. (2015) constitutes a noteworthy exception because it links maternity leave benefits at the time when women aged 16–25 gave birth to their first child to mental health outcomes approximately 40 years later. Maternal health was, however, measured at a single point in time (at age 66, on average).
The German Institutional Context
The German leave legislation allows mothers to take time off from work after childbirth by guaranteeing them the right to return to a comparable job at their previous employer. When paid job-protected maternity leave was introduced in West Germany in May 1979, its maximum duration was six months, including eight weeks of postpartum mothers’ protection (Mutterschutz) available since 1968. During the mothers’ protection period, women continued to receive their average salary over the three months prior to childbirth, whereas benefits in the subsequent four months were fixed to a nominal amount of 750 Deutschmark, regardless of women’s prior earnings. However, only mothers who were employed before childbirth were entitled to these benefits.
The mother’s protection law (Mutterschutzgesetz) and the 1979 reform were explicitly designed to protect mothers from health impairments. This and the initially short leave duration make the 1979 reform particularly suitable for testing potential direct associations with health, which, as discussed in the next section, may be expected to result primarily from leave expansions within the first year after childbirth.1
Proposed Mechanisms Linking Maternity Leave and Health
Maternity leave duration might affect mothers’ health directly and indirectly, in positive and negative ways. As to the direct effect, much of the evidence on the labor market effects of maternity leave suggests that expansions in leave duration delay mothers’ return to work (e.g., Baker and Milligan 2008b; Lalive and Zweimüller 2009; Schönberg and Ludsteck 2014). By increasing the time spent away from work, the availability of paid and job-protected maternity leave should therefore reduce young mothers’ exposure to stress (e.g., Chatterji et al. 2013). Such a benefit spares mothers, for some time, the double burden of childrearing and gainful employment by allowing them to concentrate on childcare responsibilities while facing only limited (if any) financial strain and job insecurity (e.g., Tucker et al. 2010), which should protect mothers from immediate health impairments during the period in which they are eligible for maternity leave. Next to short-run effects, one may also expect longer-run effects because lack of protection after childbirth might cause an initial health shock, triggering increasingly worse health over the individual’s life course. An accumulation of health disadvantage may result from health shocks experienced both during critical periods of the life course (e.g., childhood; see Haas 2008) and following critical life events (e.g., unemployment; see Strandh et al. 2014). Moreover, a later return to work is associated with higher odds of initiating and continuing breastfeeding over a longer period (e.g., Baker and Milligan 2008a; Roe et al. 1999), which some studies suggest is positively related to mothers’ short- and long-term health (e.g., Labbok 1999; Lawrence 2000).
Mothers’ entitlement to maternity leave also affects their subsequent attachment to the labor market. The direction of this indirect effect—and a possibly resulting effect on maternal health—is theoretically ambiguous, though. On the one hand, causal studies have provided little support for adverse long-run employment consequences (e.g., Lalive and Zweimüller 2009 for Austria; Rasmussen 2010 for Denmark; Schönberg and Ludsteck 2014 for Germany). Rather, job-protected leave entitlements seem to facilitate higher levels of labor force participation among women with small children (e.g., Schott 2012) and have been shown to increase job continuity with the prebirth employer (Baker and Milligan 2008b). Job-protected leave may thus enable more women to enjoy the health benefits of gainful employment (e.g., Schnittker 2007).
On the other hand, incentives to postpone one’s return to work—set by very generous maternity leave programs—might result in lower labor market attachment and potential economic disadvantages in the long run because time spent out of the labor force may devaluate the individual’s human capital and lower her earnings (e.g., Buligescu et al. 2009).2 A potentially negative career trajectory and lower socioeconomic status might thus bear the potential for an indirect health penalty for mothers who take leave from work after childbirth (e.g., Frech and Damaske 2012). Moreover, women in particular have been suggested to derive subjective utility from noneconomic interpersonal work rewards, such as social support or recognition from others (e.g., Ross and Mirowsky 1996). Women losing their attachment to the labor force might thus face not only economic but also psychosocial disadvantages, resulting in adverse health outcomes.
We argue that the sign of causal health effects should depend on the time horizon as follows:
If maternity leave has a direct causal health effect in the short run, we expect to observe a positive effect, where longer leave protects mothers from health impairments related to the aftermath of childbirth. In this case, the effect of leave on health (or, more specifically, long-term sickness absence) should be primarily mediated through the effect of the leave extension on mothers’ delay in return to work shortly after childbirth.
Causal effects of maternity leave extensions on women’s long-run health outcomes, however, are likely to reflect indirect effects. In this case, a positive effect of a longer leave on health should be mediated through a direct positive effect of leave duration on mothers’ health shortly after childbirth, through a direct effect of maternity leave on mothers’ subsequent employment careers, or through both. If the latter effect dominates and the leave extension fostered mothers’ detachment from the labor market, one might also expect a negative causal effect of longer leave duration on maternal long-term sickness absence in the longer run.
Next to causal effects, potential compositional effects are another channel through which expansions in leave coverage might alter postbirth health outcomes of those mothers who have returned to the labor market. A key concern is that an expansion in leave coverage may give rise to differences in the dependencies of return-to-work decisions on health status. Note that the direction of this effect is not clear a priori. On the one hand, one might argue that longer leave mandates offer better opportunities to recover from the stresses associated with childbirth, particularly for mothers with a poor prebirth health status. More time off work will also allow these women to better adapt to their (prospective) double role as employee and mother, reducing subsequent work–family conflict and its adverse health consequences (e.g., Allen et al. 2000), which might encourage those who otherwise would have stayed away from the labor market to return to work. On the other hand, more time spent at home causes a depreciation of human capital. The latter is arguably more costly for those with a poor health status and might therefore, compared with the pre-reform setting, worsen the incentives to return to work for the less-healthy ones.
Data and Empirical Strategy
The empirical analysis is based on German register data (BASiD), combining information from the German Pension Register with various data sources from the German Federal Employment Agency (see Hochfellner et al. 2012).3 The BASiD data set is a stratified random 1 % sample of all birth cohorts from the early 1940s to the early 1990s who have at least one entry in their social security records. The data provide longitudinal information on individuals’ entire pension-relevant histories up to the calendar year 2007. For individuals who have not yet retired, work histories generally cover the period from the year they were aged 14 until the age of 67. In Germany, statutory pension insurance is mandatory for all employees in the private and public sector, thus excluding civil servants and self-employed individuals. In addition, contributions to the pension insurance are paid by the unemployment insurance or the health insurance during periods of unemployment and prolonged illness.
The Pension Register contains information on all periods for which contributions were paid (such as employment, long-term illness, and unemployment) as well as periods without contributions, which were still creditable for the pension insurance. These are periods of school or university attendance after age 15, periods of training and apprenticeship, and periods of caring. For the wide majority of mothers, we can retrieve information on their entire prebirth and postbirth employment and illness histories. (See Table 6 in the appendix for a more detailed description of the individual characteristics provided by the Pension Register.)
Starting from 1975 (in West Germany), employment spells subject to social security contributions from the Pension Register can be merged with data from the German Federal Employment Agency, the Integrated Labor Market Biographies, and the Establishment History Panel. The Integrated Labor Market Biographies provide further time-varying individual information on educational status and an establishment identifier. The latter allows us to retrieve information on tenure at the current employer. (See Table 7 in the appendix for a more detailed description of the variables obtained from the Integrated Labor Market Biographies.)
Measurement of Births and Maternal Leave Durations
The pension insurance records the year and month of all births. Compared with other data sets used for the analysis of female employment histories, the information on children and births in the data can be considered highly reliable (Kreyenfeld and Mika 2008). The recorded births generally pertain to the child’s parent who claims the (pension-relevant) period of childrearing, though. The data may thus also include fathers with a recorded birth. However, the pension insurance records the period of childrearing as a default for the child’s mother, and fathers may claim childrearing periods only upon formal request (Kreyenfeld and Mika 2008). As a result, the fraction of fathers claiming the period of childrearing is negligible. Moreover, the pension data do not provide direct information on maternal leave takeup. In our analysis, we will measure leave durations by the number of months that elapse until the first postbirth employment spell. Thus, although the data allow us to precisely measure postbirth employment interruptions,4 they are not informative about the effective takeup of benefits. However, given that we are interested in the duration of job protection (as opposed to the takeup of benefits), this constitutes a minor restriction.
Spells of Long-Term Sickness Absence
We retrieve information on spells of long-term sickness absence as a measure of mothers’ postbirth health status. The BASiD data record (1) all spells of illness that are subject to sickness pay covered by the mandatory health insurance, and (2) all spells covering long-term rehabilitation measures. The first type comes into effect after a period of six weeks of absence and may cover either spells of employment or unemployment.5 The six-week period corresponds to the mandatory duration of sickness pay to be paid by employers and may also derive from the accumulation of several shorter illness spells within the last 12 months, as long as these are caused by the same disease diagnosis. In 2014, the most frequent diagnoses were related to mental and behavioral disorders (especially depressive episodes) as well as to diseases of the musculoskeletal system and connective tissue (back pain in particular); see SVR Gesundheit (2015: chapter 7). A potential concern is that illness spells subject to sickness payments may, to a limited extent, also cover caring periods for ill children younger than age 12. However, these periods were capped at a maximum length of 5 days per year per child before 1992 and of 10 days thereafter. To address this potentially confounding effect, we exclude from our illness episodes all those spells up to a length of 5 or 10 days, respectively.6
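The exclusion rule for short spells can be sketched as a simple filter. This is an illustration under stated assumptions rather than the procedure actually applied to the data: the function names are hypothetical, the applicable cap is assumed to be determined by the year in which a spell starts, and spell length is assumed to count both the first and the last day.

```python
from datetime import date

# Caps on child-care sick days per child and year, as stated in the text.
CAP_BEFORE_1992 = 5
CAP_FROM_1992 = 10

def care_cap_days(spell_start: date) -> int:
    """Cap applying to a spell (assumption: decided by the start year)."""
    return CAP_BEFORE_1992 if spell_start.year < 1992 else CAP_FROM_1992

def keep_spell(start: date, end: date) -> bool:
    """Keep a spell only if it is longer than the applicable cap."""
    length_days = (end - start).days + 1  # assumption: both endpoints count
    return length_days > care_cap_days(start)

# Hypothetical spells (start, end); not taken from the BASiD data.
spells = [
    (date(1985, 3, 1), date(1985, 3, 5)),    # 5 days, before 1992: dropped
    (date(1985, 3, 1), date(1985, 4, 20)),   # 51 days: kept
    (date(1995, 6, 1), date(1995, 6, 10)),   # 10 days, from 1992 on: dropped
    (date(1995, 6, 1), date(1995, 6, 11)),   # 11 days: kept
]
kept = [spell for spell in spells if keep_spell(*spell)]
print(len(kept))
```

Under these assumptions, any spell that could consist entirely of child-care days is removed, at the cost of also dropping genuinely short illness spells of the mother.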
The second type of illness episodes covers spells with measures aimed at reintegrating individuals who suffer from long-term ill health into the labor market. Taken together, periods of illness recorded by the Pension Register Data generally refer to spells of long-term illness of employees who have been absent due to the same disease diagnosis for more than six weeks.
Given that the six-week period is strongly linked to the mandatory duration of employer-provided sickness pay, it is important to stress that the latter has remained unchanged since 1970. Thus, using spells of long-term illness as an indicator of mothers’ health has the clear advantage that one obtains a consistent measure of health over the whole available observation period. A further advantage over health measures based on survey data (e.g., Baker and Milligan 2008a) is that our administrative measure does not suffer from attrition bias. Moreover, different from short-term sickness absence, which has been suggested to indicate absenteeism rather than ill health (e.g., Marmot et al. 1995; also see Johansson and Palme 2002), long-term sickness absence spells may be expected to provide a highly reliable measure of illness for several reasons. First, long-term sickness payments apply to individuals who are still in the labor force with the overall aim to sustain their long-term employability. Thus, unlike disability insurance schemes, long-term sickness payments offer no possibility to permanently withdraw from the labor market along with the associated disincentives. Second, the misuse of sickness payments is strongly restricted by the health insurance’s auditing system. According to the German Social Code (SGB V, § 275), the Medical Service of the Health Insurance (MDK) has the right to audit individuals’ sickness absence if the health insurance expresses profound doubts about its acceptability. Audits may be performed either based on an assessment of the documentation provided by the medical doctor who ascertained the individual’s inability to work, or based on a personal assessment of the individual’s ability to work by the service’s medical staff.
In 2014, approximately one-half of the receivers of long-term sickness payments were subject to an audit; in roughly one of five audited cases, the MDK recommended terminating the payments (which does not imply, though, that the individual was not eligible initially); see Eckl (2015). Objection against the MDK’s recommendation is possible but must be based on another medical expert’s opinion. Overall, this auditing system renders misreporting or misuse of long-term sickness payments very difficult and costly. The notion that moral hazard should be of minor relevance is further supported by indirect evidence from Ziebarth (2013), who explored a German reform of long-term sickness payments that involved a cut in the replacement level from 80 % to 70 % of forgone gross earnings. The author found that the cut in sickness payments did not significantly reduce the average incidence and duration of long-term sickness periods, thereby confirming the view that long-term sickness absence reflects true illness.
Despite the overall advantages over self-reported health measures, our measure has some limitations. First, it may be somewhat conservative given that shorter illness spells are not captured by the data. However, because the data allow us to measure not only the number but also the length of long-term illness spells, we obtain sufficient variation in this measure. Second, even though our proposed measure is not strictly contingent on employment after childbirth, it clearly conditions on labor market participation. As mentioned earlier, this may imply compositional effects, which we address by explicitly investigating the reform’s labor market effects.
POST_i is defined as earlier. The indicator 1979_i is equal to unity for mothers giving birth in the reform year 1979, and 0 otherwise. The coefficient β_{3t} on the interaction of these two variables (POST_i · 1979_i) identifies the (intention-to-treat) effect of the expansion in the leave duration on mothers’ outcomes Y_{it} in month or year t after childbirth. The vector of controls, X_i, is interacted with the indicator 1979_i, such that the coefficients on X_i are allowed to vary across mothers who gave birth in 1978 and 1979, respectively.
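The displayed equation that this paragraph describes is not reproduced here. A specification consistent with the description might take the following form. Only the post-reform indicator, the 1979 indicator, the outcome, the control vector, the interaction coefficient, and the error term are named in the text; the remaining coefficient labels (the intercept, the two main-effect coefficients, and the two covariate coefficient vectors) are assumed for illustration.

```latex
Y_{it} = \beta_{0t} + \beta_{1t}\,\mathit{POST}_i + \beta_{2t}\,1979_i
       + \beta_{3t}\,\bigl(\mathit{POST}_i \cdot 1979_i\bigr)
       + X_i'\gamma_t + \bigl(X_i \cdot 1979_i\bigr)'\delta_t + u_{it}

% Without covariates, the interaction coefficient equals the
% difference-in-difference of conditional means:
\beta_{3t} =
  \bigl(E[Y_{it} \mid \mathit{POST}_i = 1,\, 1979_i = 1]
      - E[Y_{it} \mid \mathit{POST}_i = 0,\, 1979_i = 1]\bigr)
- \bigl(E[Y_{it} \mid \mathit{POST}_i = 1,\, 1979_i = 0]
      - E[Y_{it} \mid \mathit{POST}_i = 0,\, 1979_i = 0]\bigr)
```

In this sketch, the interaction of the controls with the 1979 indicator is what allows the covariate coefficients to differ between the 1978 and 1979 birth cohorts.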
Another important prerequisite for our empirical strategy to work is the assumption that mothers subject to the reform were treated by chance; that is, their fertility behavior is not a response to an anticipated reform. Whether and for which time span this assumption is plausible depends on when the reform was seriously debated and announced to the public. For the 1979 reform considered here, Dustmann and Schönberg (2012) have already investigated the validity of the no-anticipation assumption. Analyzing newspaper articles that appeared prior to the reform, the authors showed that the change in maternity leave legislation was announced to the public only shortly before it came into effect. In particular, the government proposed the reform’s draft bill in January 1979. Because children born up to four months after the reform were conceived before the draft bill was proposed, our choice of an observation window of four months before and after the reform ensures that mothers’ fertility decisions were exogenous to the reform.
After having identified the causal effect of a prolonged maternity leave duration on mothers’ postbirth return-to-work behavior, we then address the reform’s consequences for mothers’ health outcomes.7 Given that our health measure is conditional on labor market participation, it restricts us to identifying a joint effect comprising both a causal and a potential unobservable compositional component. In terms of Eq. (2), this implies that as long as the reform has altered the unobservable health composition of those returned to the labor market, the error term u_{it} conditional on having returned will differ across mothers giving birth shortly before and after the reform. In this case, β_{3t} will also capture an unobservable compositional component.
As argued earlier, the direction of such a potential effect is not clear a priori. To address this issue, we examine the relationship between the observable prebirth health status and return-to-work patterns across pre-reform and post-reform mothers. In this regard, it is important to note that the data allow us to retrieve information on the full prebirth employment and (associated) illness histories. Including these observables in X_i may thus give us some indication of whether a comparison of long-run labor market and health outcomes across pre-reform and post-reform mothers is driven by systematic differences in the return behavior across both groups.
As set out earlier, for the 1979 cohort, post-reform (pre-reform) mothers are those who gave birth after (before) the change in maternity leave legislation, with the observation window spanning four months before and after the threshold date. Because the eligibility rules for job protection require that mothers be employed prior to childbirth, we restrict both groups to those women who were employed for at least three months within the year prior to giving birth. For the 1979 cohort, this results in a sample comprising 967 observations, which corresponds to 42.2 % of the total number of births (2,294) observed between January and August 1979 in our data. This covers approximately 60 % of a 1 % sample of the official number of births in 1979 in West Germany.8
Table 1 reports differences in leave durations after childbirth. The first row indicates that post-reform mothers—those giving birth from May to August 1979—spent on average more time away from work than pre-reform mothers (105.2 compared with 100.88 months).9 The second row reports the share of mothers whose leave durations are right-censored in the data: those who had not returned to work by 2007. The figures indicate that the reform slightly raised the fraction of mothers never returning to the labor market. Note, however, that the differences are not significantly different from zero. Conditional on returning, post-reform mothers feature somewhat lower average leave durations than pre-reform mothers (albeit again not statistically significant at conventional levels; see the third row). This suggests that the unconditional longer leave duration among post-reform mothers can be fully explained by a larger fraction never returning to work. Given that we expect an expansion in leave coverage to delay mothers’ return to work, the conditional shorter leave duration among post-reform mothers deserves some more attention. In the next section, we explore whether this result is driven by potentially long- or medium-run positive effects that may counteract the short-run negative effects.
To address whether any systematic differences exist across pre-reform and post-reform mothers, Table 2 provides descriptive evidence on a number of prebirth characteristics. For the 1979 cohort, the upper panel of Table 2 indicates that both groups appear to be quite similar with respect to age and education. Pre-reform and post-reform mothers also exhibit similar amounts of previous employment, unemployment, and non-employment experience. The same is true for the prebirth illness duration as well as the number of prebirth illness spells, which can be taken as indicators of mothers’ prebirth health status. The p values of the t tests indicate that none of the differences in these health-related attributes are statistically significant. To check whether these two variables may jointly predict group outcomes, we also perform an F test indicating that the two variables are not jointly significant either (with a p value of .49).
The lower panel of Table 2 contains the corresponding descriptive statistics for mothers giving birth during the same observation window in 1978. The figures show that post-reform mothers in 1978 (i.e., those giving birth between May and August 1978) were less likely to be high-skilled, and exhibited longer non-employment durations than pre-reform mothers. This further highlights the importance of including the covariates in our regression framework.
Labor Market Outcomes
We next use three indicators to explore the extent to which the extension of the mandated maximum leave duration affects mothers’ postbirth return-to-work behavior. First, we measure the return-to-work probability with an indicator equal to 1 if a mother has returned at least once to the labor market by month m or year t (return-to-work). A mother is defined as having returned to work if she is employed for at least two consecutive months after childbirth. To distinguish between short- and long-run effects, we construct this indicator for up to 24 months and up to 28 years after childbirth. Second, to address the fact that mothers may only temporarily return to work, we measure whether a mother is employed in month m or year t after childbirth (employed). Third, to capture the intensive dimension of employment, we also look at the cumulative number of months worked by year t after childbirth (months worked).
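The three indicators can be sketched from a mother’s monthly postbirth employment sequence. The function names and the example history are hypothetical. Two assumptions are made that the text does not settle: the return month is taken to be the first month of the first run of at least two consecutive employed months, and horizons are expressed in months (so year t corresponds to month 12t).

```python
def first_return_month(employed, min_run=2):
    """1-based month in which the first run of >= min_run employed months starts.

    `employed` is a list of booleans, one per postbirth month.
    Returns None if no such run exists (mother never returns).
    """
    run = 0
    for idx, is_employed in enumerate(employed):
        run = run + 1 if is_employed else 0
        if run == min_run:
            return idx - min_run + 2  # convert run start to 1-based month
    return None

def returned_by(employed, m, min_run=2):
    """Return-to-work: 1 if the mother has returned at least once by month m."""
    first = first_return_month(employed, min_run)
    return int(first is not None and first <= m)

def employed_in(employed, m):
    """Employed: 1 if the mother is employed in month m."""
    return int(employed[m - 1])

def months_worked_by(employed, m):
    """Months worked: cumulative number of employed months up to month m."""
    return sum(employed[:m])

# Hypothetical history: 6 months at home, 3 at work, 3 at home, 12 at work.
history = [False] * 6 + [True] * 3 + [False] * 3 + [True] * 12
print(first_return_month(history), returned_by(history, 12),
      employed_in(history, 10), months_worked_by(history, 24))
```

The example history illustrates why the second indicator is needed: the mother counts as returned from month 7 onward but is not employed in month 10.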
Panel a of Fig. 1 compares the return-to-work profiles during the 24 months after childbirth for the 1979 cohort of pre-reform and post-reform mothers. This panel demonstrates that the reform strongly affects the short-run return-to-work behavior. Approximately 38 % of pre-reform mothers but only 6 % of post-reform mothers return to work after two months following childbirth; 17.9 % of post-reform and 21.9 % of pre-reform mothers return to work before the mandated leave duration has run out. This implies that 82.1 % of post-reform and 78.1 % of pre-reform mothers fully exhaust the maximum mandated leave duration. Although 21.8 % of post-reform mothers and 16.3 % of pre-reform mothers return to work exactly when the mandated leave duration has run out, 60.3 % of post-reform and 61.8 % of pre-reform mothers continue to stay away from work at the end of the job protection period. The figure further reveals that the delay in return to work appears to be only of a short-run nature: two years after childbirth, the fraction returned is approximately 2 percentage points larger among post-reform than among pre-reform mothers.
As noted earlier, our data offer the great advantage of tracking women’s employment histories over a much longer time span than previously used data sets. To explore whether the long-run return-to-work profiles differ from the short-run profiles, panel b of Fig. 1 displays the (insignificant) differences in return-to-work probabilities between both groups during the maximum number of available years after childbirth for pre-reform and post-reform mothers. The figure reveals that the (insignificant) return-to-work advantage among post-reform mothers disappears approximately four years after childbirth. After 17 years, the pattern tends to reverse, as pre-reform mothers exhibit slightly larger return-to-work probabilities. As to the absolute fractions, after 28 years, 82 % of post-reform mothers ever return to the labor market, compared with 85 % among pre-reform mothers. Figure 2 shows that pre-reform mothers feature a larger fraction employed as well as a larger cumulative number of months worked by year t, especially during the last years of our observation period. As with panel b of Fig. 1, these differences are insignificant, though.
Table 3 reports regression results to assess the leave extension’s effect on subsequent labor market outcomes, explaining the outcome variables return-to-work probability, the fraction employed per year, and the number of months worked per year by the pre-reform or post-reform status as well as a number of controls. Estimates of the return-to-work and employment probabilities are based on a linear probability model. To further assess the importance of different time horizons, the four panels of the table report the regression results measuring the outcomes three months after childbirth and 3, 10, and 28 years after childbirth, respectively.
In line with Eq. (1), column 1 of Table 3 displays the baseline differences in the outcomes at different points in time for the 1979 cohort, with the data corresponding to those shown in Fig. 1. The estimates in the first panel show that mothers subject to the leave extension are significantly less likely to have returned to work three months after childbirth—a difference of 32 percentage points. The difference in the fraction employed is also estimated to be significantly negative in the short run (i.e., three months after childbirth). A similar pattern emerges for the intensive dimension of employment as post-reform mothers work approximately 1.5 weeks (0.344 months) less during the first three months after childbirth. For the return-to-work probability and the fraction employed, the pattern appears to reverse three years after childbirth, even though the positive differences of 1.6 and percentage points are estimated fairly imprecisely. The long-term differences in all outcome variables, in contrast, are estimated to be negative and insignificant at conventional levels.
As specified by Eqs. (1) and (2), column 2 of Table 3 adds a set of control variables (such as information on age and education) as well as the prebirth employment and illness histories. The data here show that in the short to medium run, the results are quite robust to including controls: 28 years after childbirth, the estimated coefficients reverse their sign but remain insignificant. Finally, the last two columns of Table 3 report the results from estimates using data on women giving birth one year prior to the observation window in 1979. Column 3 reports the estimates corresponding to those in column 2 for the 1978 cohort (i.e., for mothers giving birth between January and August 1978), whereas column 4 reports the differences with regard to the estimates in 1979. The estimates in column 4 therefore correspond to the estimated interaction coefficient β3t from Eq. (2). The estimates in column 3 suggest that for the number of months worked per year, some seasonal effects appear to confound the differences across pre-reform and post-reform mothers.10 In year 3 after childbirth, mothers giving birth between May and August appear to supply less labor in intensive terms compared with those giving birth between January and April. This difference is in absolute terms considerably larger than the difference in the reform year, such that the estimated net effect in column 4 becomes even positive, albeit insignificantly so. Apart from that, there are no further significant seasonal differences between 3 and 28 years after childbirth, even though the difference in the fraction employed gives rise to a significant positive net effect three years after childbirth. For the other outcomes, in contrast, there are no significant long-run effects 3, 10, or 28 years after childbirth, respectively.
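The difference-in-difference logic behind column 4 can be sketched with group means. Eq. (2) is not reproduced in this section, so the setup below is an assumption: it ignores all control variables, in which case the coefficient on the interaction of a May-to-August birth indicator with a 1979 cohort indicator equals the difference between the two within-year differences in means. The outcome values are invented purely for illustration and are not taken from Table 3.

```python
from statistics import mean


def group_mean(y, late, cohort79, late_value, cohort_value):
    """Mean outcome in the cell defined by birth season and birth cohort."""
    return mean(
        v for v, a, b in zip(y, late, cohort79)
        if a == late_value and b == cohort_value
    )


def did_estimate(y, late, cohort79):
    """(May-Aug minus Jan-Apr in 1979) minus (May-Aug minus Jan-Apr in 1978)."""
    diff_1979 = group_mean(y, late, cohort79, 1, 1) - group_mean(y, late, cohort79, 0, 1)
    diff_1978 = group_mean(y, late, cohort79, 1, 0) - group_mean(y, late, cohort79, 0, 0)
    return diff_1979 - diff_1978


# Invented outcomes (e.g., months worked), NOT data from the paper.
y        = [10, 12, 8, 10, 11, 13, 12, 14]
late     = [0, 0, 1, 1, 0, 0, 1, 1]   # 1 = birth between May and August
cohort79 = [0, 0, 0, 0, 1, 1, 1, 1]   # 1 = reform-year (1979) cohort

effect = did_estimate(y, late, cohort79)
```

In this invented example, the seasonal difference in the pre-reform year is negative while the difference in the reform year is positive, so the net effect exceeds the raw 1979 difference. This mirrors the way a seasonal gap in 1978 can turn a small raw difference into a positive net effect in column 4.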
How do these results compare with those that have been obtained earlier in the literature for Germany? The estimated difference-in-difference results obtained by Schönberg and Ludsteck (2014) are of the same order of magnitude as our estimates for month 3 after childbirth.11 Although their medium-run responses differ somewhat from our findings,12 their estimated effects 72 months after childbirth are broadly consistent with our results pointing to small negative return-to-work effects in the long run.
The empirical analysis thus far has documented a delay in return-to-work caused by the expansion in leave coverage. Moreover, the delay is most pronounced and significant only within the first year after childbirth. Does the reform-induced change in return-to-work behavior after childbirth translate into different health outcomes? To explore this issue, we construct three health indicators from the information on long-term sickness absence spells in the administrative records. Because these spells are contingent on women’s labor market participation, the health outcomes can be measured only conditional on being either employed or unemployed. Motivated by the fact that long-term sickness absence spells represent a relatively infrequent event, we first construct a dummy variable equal to 1 if a mother has experienced at least one such long-term illness spell by year t (ever become ill). Second, we also look at the number of such long-term illness spells (number of illness spells per 1,000 days) relative to the cumulative time spent in the labor market by year t after childbirth. Finally, to capture the intensive dimension of illness, we compute the cumulative length of all long-term sickness absence spells (length of illness) relative to the cumulative length of labor market participation by year t after childbirth (including periods of employment and unemployment). Note that the recorded illness episodes in our data are confined to those illness periods exceeding the mandatory duration of sickness pay to be paid by employers (six weeks). To compute the exact duration of long-term illness spells, we therefore add the duration of six weeks to each recorded continuous illness spell.
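The construction of the three health indicators can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a hypothetical list of recorded spell lengths in days (each excluding the six employer-paid weeks) together with the cumulative days spent in the labor market by year t. The function name and data layout are invented for illustration and do not describe the authors' implementation.

```python
from typing import List, Tuple

EMPLOYER_PAID_DAYS = 42  # six weeks of employer-paid sick pay, not in the records


def health_indicators(
    recorded_spell_days: List[int], labor_market_days: int
) -> Tuple[int, float, float]:
    """Return (ever become ill, spells per 1,000 days, fraction length of illness).

    recorded_spell_days: recorded length of each long-term illness spell by year t.
    labor_market_days: cumulative days employed or unemployed by year t.
    """
    if labor_market_days <= 0:
        raise ValueError("indicators are defined only for labor market participants")
    # Add the six unrecorded weeks to each recorded spell to get its full length.
    full_lengths = [d + EMPLOYER_PAID_DAYS for d in recorded_spell_days]
    ever_ill = int(len(full_lengths) > 0)
    spells_per_1000 = 1000 * len(full_lengths) / labor_market_days
    illness_fraction = sum(full_lengths) / labor_market_days
    return ever_ill, spells_per_1000, illness_fraction


# Hypothetical mother: two recorded spells of 5 and 30 days, 2,000 days in the
# labor market; the full spell lengths are 47 and 72 days.
example = health_indicators([5, 30], 2000)
```

The sketch raises an error for mothers with no time in the labor market, reflecting that the outcomes are measured only conditional on participation.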
Turning to the first measure (ever become ill), panel a of Fig. 3 displays the differences in the cumulative proportion of women that have experienced at least one long-term illness spell during the 28 years after childbirth for pre-reform and post-reform mothers. The expansion in leave coverage from two to six months is associated with a larger fraction of women ever having experienced a long-term illness spell at each point in time. In year 9 after childbirth, the difference peaks at 6.7 percentage points. After this period, the health disadvantage among post-reform mothers becomes somewhat smaller and almost disappears starting from year 17 after childbirth.
Panel b of Fig. 3 compares the number of illness spells per 1,000 days in the labor market after childbirth across both groups. Over the whole observation period, post-reform mothers experience more illness spells than pre-reform mothers, with the difference being largest between years 3 and 9 after childbirth. Evaluated at the mean time spent in the labor market, this implies that after 28 years, post-reform (pre-reform) mothers who returned to the labor market experienced on average 1.03 (0.88) illness spells.
Panel c of Fig. 3 compares the cumulative days of illness as a fraction of the duration of labor market participation after childbirth across both groups. Over the whole observation period, post-reform mothers experience a longer duration of illness relative to the time spent in the labor market than pre-reform mothers. Even after approximately 20 years, the difference amounts to approximately 1 percentage point, suggesting that the differential tends to be long-lasting. Evaluated at the mean time spent in the labor market, the data imply that after 28 years, post-reform (pre-reform) mothers experience, on average, 141 (131) days of long-term illness.
The overall result of a health disadvantage among post-reform mothers is surprising and clearly deserves further attention. We argued earlier that such a finding may be rationalized by a potential negative compositional effect because the reform might encourage the return to work among less-healthy mothers, who would have stayed away from the labor market under shorter leave mandates. We address this potential compositional effect in the next section within a regression framework.
Table 4 reports regression results to assess the reform’s effect on subsequent health outcomes, explaining the outcome variables (the indicator variable ever become ill, the number of illness spells, and the fraction length of illness) by the pre-reform or post-reform status as well as a number of controls. For the indicator variable ever become ill, the estimates are based on a linear probability model. To further assess the importance of different time horizons, the four panels in the table report the regression results measuring the outcomes 1, 3, 10, and 28 years after childbirth, respectively.13
Column 1 of Table 4 displays the baseline differences in the outcome variables at different points in time, with the data corresponding to those shown in Fig. 3. These baseline estimates show that the fraction of women ever having experienced a long-term illness spell is 3.2 percentage points larger among post-reform compared with pre-reform mothers three years after childbirth. The difference in the fraction length of illness is also estimated to be significantly positive in the medium run, amounting to 1.1 percentage points three years after childbirth. The same is true for the number of illness spells, with the difference being 0.064 per 1,000 days (i.e., per approximately three years) spent in the labor market. Column 2 adds a set of control variables, such as information on age, education, and prebirth employment histories. The data show that 3 years after childbirth, the results are robust to including controls, whereas adding controls leads to a decline in the estimated coefficients after 10 and 28 years, respectively. This finding may potentially reflect that among post-reform mothers, those who return between years 3 and 10 are particularly negatively selected with respect to their observable characteristics. To explore whether the health disadvantage of post-reform mothers can be explained by a negative health selection among this group, we show the results after additionally controlling for prebirth illness histories in column 3. The positive coefficients for all outcome variables become slightly smaller but are still significant three years after childbirth. Even though the differences between the estimated coefficients from columns 2 and 3 are not significant, the observed declines may be taken as some (weak) evidence of a negative observable health selection among those returned after the reform.
Finally, the last two columns of Table 4 report the results from estimates using data on women giving birth one year prior to the observation window in 1979. Column 4 reports the estimates corresponding to those in column 2 for mothers giving birth between January and August 1978, whereas column 5 reports the difference-in-difference estimates (i.e., the coefficient β3t from Eq. (2)). Contrary to the labor supply outcomes from Table 3, all estimates referring to 1978 turn out to be insignificant, suggesting that there are no distinct seasonal effects confounding the results. In terms of illness duration, the estimates indicate that post-reform mothers exhibit a 0.8 and 0.4 percentage point larger fraction length of illness 3 and 28 years after childbirth, respectively. Evaluated at the mean time spent in the labor market, this implies that post-reform mothers experience approximately 5 and 13 additional long-term illness days compared with pre-reform mothers 3 and 28 years after childbirth, respectively.
Overall, the estimates indicate that even after controlling for observable health differences, mothers subject to the leave extension still exhibit unfavorable health outcomes compared with those giving birth prior to the reform. Although particularly pronounced after three years, the health disadvantage is already visible (albeit small and imprecisely estimated) within the first year after childbirth. This finding strongly argues against a dominating substitution effect within the first year after childbirth, inducing those who are subject to the shorter leave duration to substitute maternity leave by an illness episode. At the same time, the results for labor market outcomes show that there are no pronounced effects on return-to-work behavior within the first three years after childbirth that might rationalize a negative causal health effect for post-reform mothers. Overall, these findings suggest that the positive coefficient reflects a further negative health selection on unobservable characteristics.
In this section, we provide further support for the idea that the reform may have facilitated reentry of a negative (unobservable) health selection of mothers into the labor market. Recall from earlier that a potential negative selection effect might stem from longer leave mandates offering mothers with a poor health status the possibility to recover. This suggests that a potential negative health selection on unobservable characteristics is likely to be particularly pronounced for mothers with more unfavorable observable prebirth illness histories.
To investigate this hypothesis, Fig. 4 first illustrates the return-to-work profiles by mothers’ prebirth health status; the sample is broken down by “good health” (panel a) and “poor health” (panel b) mothers. A mother is considered to have poor health if her prebirth long-term illness duration per year in the labor market exceeds the median of mothers’ prebirth illness durations (which equals 0 in our sample), and to have good health otherwise. According to this definition, 124 mothers of the 1979 cohort are classified as being of poor (prebirth) health and 843 are classified as having good (prebirth) health. Among the poor-health mothers, 66 are post-reform and 58 are pre-reform mothers; corresponding figures for mothers in good health are 430 for post-reform and 413 for pre-reform mothers. A comparison of panels a and b clearly shows that the return-to-work behavior of both groups is affected differently by the reform. In particular, poor-health post-reform mothers show much larger positive differences in return-to-work probabilities after the extended leave duration has expired compared with good-health post-reform mothers.
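The median-based split can be illustrated with a short sketch. The input values below are invented, and the function name is hypothetical; only the classification rule (poor health if the prebirth illness duration per year in the labor market exceeds the sample median) follows the definition in the text.

```python
from statistics import median
from typing import List


def classify_prebirth_health(illness_days_per_year: List[float]) -> List[str]:
    """Label each mother 'poor' if her prebirth long-term illness duration per
    year in the labor market exceeds the sample median, and 'good' otherwise."""
    cutoff = median(illness_days_per_year)
    return ["poor" if d > cutoff else "good" for d in illness_days_per_year]


# Invented sample in which most mothers have no prebirth long-term illness,
# so the median is 0, as in the paper's sample.
sample = [0, 0, 0, 0, 3.5, 0, 12.0]
labels = classify_prebirth_health(sample)
```

Because the median equals 0 and durations cannot be negative, the rule amounts to classifying every mother with any recorded prebirth long-term illness as being in poor health, which explains why the poor-health group is much smaller than the good-health group.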
To demonstrate that this difference also holds after controlling for observables and seasonal effects, Table 5 repeats the baseline results from Tables 3 and 4 for both groups. Turning to the labor market outcomes in columns 1 and 2 of Table 5, the estimates indicate that the reform causes particularly those with a poor prebirth health status to increase their short-run labor supply. A similar pattern holds in the medium run, with poor-health post-reform mothers working approximately five additional months compared with their pre-reform counterparts in year 3 after childbirth. Moreover, the differences in the labor market effects across poor- and good-health mothers are significant at conventional levels in the first year after childbirth. In year 3, the difference in the effect on the number of months worked across both groups borders on the 10 % significance level (with a p value of .103). These differences in return-to-work behavior were already visible by comparison of columns 2 and 3 of Table 4.
One potential explanation for the more pronounced return-to-work response among poor-health mothers is that this group may have benefitted to a greater extent from the legislation’s guaranteed right to return to their old employer compared with those with a better prebirth health status. For instance, poor-health mothers might have had much less of a chance under the old legislation to return to their old employer after a leave period longer than the mandated duration of two months (e.g., for reasons of negative signaling or stigmatization).
To investigate this issue further, we run a difference-in-difference regression with an indicator taking on the value of unity if a mother returned to her prebirth employer as the dependent variable. Estimating this regression separately by prebirth health status, the results corresponding to the difference-in-difference estimates in column 4 in Table 3 and column 5 in Table 4 suggest that among poor-health mothers, the reform-induced increase in the probability of returning to the old employer is 18.7 percentage points (with a standard error of 12.2), whereas for those with a good prebirth health status, the increase is only 2 percentage points (with a standard error of 4.3). Even though the difference between poor- and good-health mothers is not statistically different from 0 (with a p value of .18), these results lend some weak support to the notion that particularly poor-health mothers’ chances to return to their previous employer improved in response to the reform.
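To give a feel for why a gap of this size is not statistically distinguishable from zero, the following sketch applies a simple two-sided z test to the two reported estimates. It assumes that the poor-health and good-health subsample estimates are independent and uses a normal approximation; this is not necessarily the test the authors used, so it is not expected to reproduce the reported p value of .18 exactly. The function name is hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def z_test_difference(b1: float, se1: float, b2: float, se2: float) -> Tuple[float, float]:
    """Two-sided z test for b1 - b2, assuming independent estimates.

    Returns the z statistic and the two-sided p value under normality.
    """
    z = (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)
    # 2 * (1 - Phi(|z|)) is identical to erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Estimates and standard errors reported in the text, in percentage points.
z, p = z_test_difference(18.7, 12.2, 2.0, 4.3)
```

Under these assumptions the z statistic is roughly 1.29 and the p value is a little below .20, the same order of magnitude as the reported value. The large standard error of the poor-health estimate, which rests on only 124 mothers, dominates the denominator.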
Overall, our results strongly support the notion that the reform has induced particularly those with a poor prebirth health status to reenter the labor market. Still, they do not tell us anything about whether this negative health selection also accounts for the unfavorable health outcomes among post-reform mothers. To explore this issue further, columns 3 and 4 of Table 5 report separate health outcome estimates for both groups. The results indicate that the established health disadvantage among post-reform mothers 3 years after childbirth is strongly driven by poor-health mothers. This is true for all health outcome variables. Turning to the long-run effects, the health disadvantage is still visible even 28 years after childbirth. Good-health mothers, in contrast, do not exhibit major positive and statistically significant differences in health outcomes between both groups.14 Taken together, these results therefore strongly support the view that the reform appears to have induced particularly those with a poor prebirth health status to reenter the labor market and that the unfavorable health outcomes among post-reform mothers are driven by this negatively selected group.
To further strengthen our findings, we conducted several robustness checks. To begin, we reestimated the model by reducing the time window to 2 (instead of 4) months before and after the reform to check whether the results are robust to the adopted time window around the reform’s threshold date. Although narrowing the observation window comes at the expense of a smaller sample size, it has the advantage of reducing any unobservable and seasonal differences across pre-reform and post-reform mothers. Second, to address the issue that we excluded shorter illness spells from our analysis, we reestimated our regressions after including these shorter spells in our health outcomes. Overall, our results are robust to these checks. The results of these robustness checks are available in Table S2 in Online Resource 1.
Finally, our estimates rely on the assumption that potential seasonal differences in outcomes between mothers giving birth during May to August and mothers giving birth during January to April do not change over time; otherwise, we could not use the differences between the two groups of mothers from one year prior to the reform to infer potential seasonality effects in the reform year. In general, we argue that it is implausible to assume that seasonal differences will differ across cohorts of mothers who gave birth just one year apart. To provide empirical support for this conjecture, we also conducted placebo difference-in-difference estimates for the years 1978 and 1977 in order to test the underlying parallel trends assumption of the difference-in-difference approach. As reported in Online Resource 1, our results are robust to the placebo estimates.
Summary and Conclusions
In this study, we explore the relationship between the 1979 expansion in maternity leave coverage in West Germany and long-term sickness absence among mothers who returned to the labor market after childbirth. Increasing the leave duration from two to six months, the reform explicitly aimed at improving working mothers’ health by alleviating the double burden of childrearing and paid employment immediately following childbirth. Exploiting unique administrative data from the German Pension Register and the Federal Employment Agency, we take a two-step approach, assessing jointly the reform’s effect on maternal labor supply (extending the analysis by Schönberg and Ludsteck 2014) and the association between the reform-induced expansion in leave duration and returning mothers’ long-term sickness absence. Different from Avendano et al. (2015), who investigated the long-run effect of maternity leave benefits on mental health based on a single measurement of maternal health in later life, our data allow us to monitor the intensive and extensive dimensions of mothers’ sickness absence over a period of up to 28 years after childbirth. Adopting a difference-in-difference approach, we exploit the fact that the reform caused an exogenous variation in mothers’ actual leave uptake behavior. Consistent with findings of previous studies, our results suggest that the leave extension caused mothers to significantly delay their return to work within the first year after childbirth. To explore whether the reform-induced change in return-to-work behavior after childbirth translates into different health outcomes, we look at the number and length of long-term sickness spells of gainfully employed mothers who gave birth before and after the change in leave legislation.
Our findings suggest that mothers subject to the leave extension exhibit a higher incidence of long-term sickness absence (in terms of the intensive and extensive dimension) compared with pre-reform mothers three years after childbirth. This result also holds after we control for observable prebirth illness differences. Because there are no pronounced effects on mothers’ labor market participation following the short-run delay in return to work that might rationalize a negative causal effect, the less-favorable health outcomes among post-reform mothers might reflect the lower bound of a negative unobservable health selection. To provide further support for this idea, we break down the estimates by mothers’ observable prebirth health status. Two major findings emerge from this analysis. First, the leave expansion induced particularly those with a poor prebirth health status to reenter the labor market. Second, the unfavorable health outcomes among post-reform mothers are mainly driven by this small, negatively selected group, thus indicating that the 1979 reform has indeed facilitated reentry of a negative health selection of mothers into the labor market.
Unfortunately, our data provide no direct information about the diagnoses underlying mothers’ sickness absence in our sample. It seems plausible to assume, though, that the two quantitatively most relevant diagnoses in 2014 (mental and behavioral disorders as well as diseases of the musculoskeletal system; SVR Gesundheit 2015) were also very common (prebirth and postbirth) health problems among these women. Consistent with research suggesting a positive association between negative work–family spillover and depressive symptoms (e.g., Allen et al. 2000; Grzywacz and Bass 2003) or musculoskeletal problems (e.g., Hammig et al. 2011), we argue that these women likely would have dropped out of the labor force for good without a reform allowing them to take a longer leave and thereby reducing work–family conflict for some more time. This additional time, however, appears to have been insufficient to protect this particularly vulnerable group of mothers from further health impairments.
Taken together, our findings lead us to conclude that the reform failed to improve the quality of female labor market participation by supporting the return of a larger fraction of healthier female workers to the labor market. It seems important to keep in mind, though, that results from the West German context in 1979 cannot be naively transferred to other countries, such as the United States, where mothers’ attachment to the labor force tends to be higher and maternity leave entitlements continue to be less generous (e.g., Dagher et al. 2014). Still, the overall effects of expansions in maternity leave on the health composition of those returning to the labor market might be very small (or even negative) if such reforms are not complemented by other measures directed at maintaining or improving working mothers’ health following their return to the labor market.
We would like to thank Hendrik Jürges, Martin Salm, the anonymous referees, and the editor of this journal for their helpful comments and suggestions.
Another reason why we focus on the 1979 reform is that subsequent maternity leave reforms in Germany gave rise to a variety of reform scenarios that varied according to whether the policies altered (1) the length of the job protection period, (2) the length of the paid leave period, (3) the monetary amount of the maternity benefit, or (4) several of these components simultaneously. Schönberg and Ludsteck (2014) found that the combination of these reform parameters may matter greatly for the effects on labor market outcomes. We therefore refrain from exploring the subsequent reforms because this would require a full discussion of all scenarios’ labor market effects and their health implications.
Moreover, extensions of leave duration have also been shown to have a variety of unintended consequences. Puhani and Sonderhof (2011), for instance, documented that maternity leave extensions in Germany negatively affected job-related training for young women, irrespective of whether they have children.
Data access was provided via onsite use at the Research Data Centre (FDZ) of the German Federal Employment Agency (BA) at the Institute for Employment Research (IAB) and subsequent remote data access.
The data do not allow us to measure the exact day of birth. We therefore set each child’s birthday on day 15 of its respective month of birth. As a result, the measured leave duration will be associated with a measurement error of up to +14/–15 days.
Absence from unemployment may arise from the fact that an individual unemployed because of sickness is not available for the labor market. In such a case, she would not be obliged to accept job placements by the Federal Employment Agency.
Our data record illness episodes only in excess of the mandatory duration of sickness pay to be paid by employers (six weeks). This implies that, for example, a spell of 5 days recorded in the data reflects a spell of 47 days (six weeks plus 5 days).
The reform might theoretically also have had an effect on subsequent fertility, mediating its effect on health. Evidence from Austria (Lalive and Zweimüller 2009) and Sweden (Hoem 1993) has suggested that, for example, leave extensions opening up the possibility of renewing benefits by having another child without going back to work had a significant impact on subsequent births. However, these studies also suggested that this entailed a pure timing effect without affecting total fertility. The extension of maternity leave from two to six months observed in our study clearly was insufficient to allow for this kind of effect. Supplementary analyses based on our sample suggest that post-reform mothers did not experience more subsequent births than their pre-reform counterparts (more detailed results are available upon request).
This number derives from ≈ (580,000 / 12) ⋅ 8. We have several reasons for not observing the full number of births. First, the data exclude or underreport employment histories of civil servants and the self-employed. Second, due to the data’s restriction to cohorts born after 1939, we do not observe all relevant cohorts at risk of birth in 1979. Third, until 1967, married women had the possibility to apply for an advance payment of their pension entitlements, in which case their pension records were completely deleted. Note, however, that this latter restriction is very unlikely to affect our sample selection given that these women should have had a weak attachment to the labor market.
As we show later, the leave durations are fairly long: a large fraction of mothers continued to stay away from work at the end of the job protection period.
However, these seasonal effects three years after childbirth are significant only after we control for observable characteristics. See Table S1 in Online Resource 1, which reports the difference-in-difference estimates with respect to the baseline differences in column 1 without including any controls.
More precisely, the authors’ estimates of the differences in the return-to-work probability, the fraction employed, and the number of months worked are –30.5, –28.4, and –0.597, respectively (see Schönberg and Ludsteck 2014: table 1, column 1).
Although Schönberg and Ludsteck (2014) obtained a 1 percentage point lower return-to-work probability 28 months after childbirth, our estimates three years after childbirth point to a positive, albeit insignificant, effect. Their results, however, are not directly comparable with ours because their analysis included a different set of control variables and is based on a data set that does not allow for a precise identification of career interruptions due to childbirth.
In Table 4, the number of observations is reduced, compared with Table 3, because the health outcomes are measured conditional on being employed or unemployed. Note also that our health outcomes are measured for a larger group than those defined by our outcome indicator returned to work, which requires a mother to be employed for at least two consecutive months after childbirth.
The negative health outcomes might also reflect a negative causal effect that might stem from the reform’s effect on mothers’ labor market participation at the intensive margin, thereby giving rise to an increased double burden. To address this issue, we also performed regressions with the number of months worked conditional on having returned to the labor market as the dependent variable. The estimates indicate that conditional on having returned, the effect on poor-health post-reform mothers’ cumulated months worked becomes much smaller. In particular, one and three years after childbirth, the insignificant estimated effects corresponding to those in columns 1 and 2 of Table 5 are –1.00 and 2.01, with standard errors of 1.08 and 3.25, respectively. This suggests that the positive effect on poor-health mothers’ months worked is mainly due to higher labor market participation and should rule out a negative causal effect of the reform via its impact on labor market participation at the intensive margin.