Abstract

The impact of community-based family planning programs and access to credit on contraceptive use, fertility, and family size preferences has not been established conclusively in the literature. We provide additional evidence on the possible effect of such programs by describing the results of a randomized field experiment whose main purpose was to increase the use of contraceptive methods in rural areas of Ethiopia. In the experiment, administrative areas were randomly allocated to one of three intervention groups or to a fourth control group. In the first intervention group, both credit and family planning services were provided and the credit officers also provided information on family planning. Only credit or family planning services, but not both, were provided in the other two intervention groups, while areas in the control group received neither type of service. Using pre- and post-intervention surveys, we find that neither type of program, combined or in isolation, led to an increase in contraceptive use that is significantly greater than that observed in the control group. We conjecture that the lack of impact has much to do with the mismatch between women’s preferred contraceptive method (injectibles) and the contraceptives provided by community-based agents (pills and condoms).

Introduction

Family planning programs have been adopted in several developing countries with the purpose of increasing contraceptive use, improving reproductive health, lowering fertility, and reducing rates of population growth.1 One widely used approach has been community-based programs wherein individuals based within a community are trained to provide information on family planning and nonclinical methods, such as pills and condoms. The active involvement of communities is appealing, but the effectiveness of these programs in increasing contraceptive use and lowering fertility has not been demonstrated in a convincing way (Bauman 1997; Freedman 1997).

In this paper, we present results of a randomized field experiment. In the study, administrative areas in the Amhara and Oromia regions of Ethiopia were randomly allocated to one of three intervention groups or to a fourth control group. In the first intervention group, both credit and family planning services were provided, and the credit officers also provided information on family planning. In the other two intervention groups, only credit or family planning services, but not both, were provided. Areas in the control group received neither type of service. The study was designed to determine whether linking microcredit and family planning programs would increase contraceptive use by more than what could be accomplished by each program on its own. Pre- and post-intervention surveys (in 2003 and 2006) of two independent cross sections of approximately 6,400 households were used to collect data on contraceptive use, fertility, and various other outcomes. The study areas are rural, where households are largely subsistence-oriented and agriculture and livestock are the main sources of income.

Our data show that the study areas experienced considerable changes in fertility behavior and socioeconomic status during the study period, but these changes in fertility behavior and preferences appear to have been largely unrelated to changes in availability of credit and family planning services. Linking credit and family planning services did not increase contraceptive use any more than what was achieved by either program on its own, and neither type of program, linked or unlinked, led to an increase in contraceptive use significantly greater than that observed in the control group. In addition, the programs appear to have been ineffective at changing reproductive behavior and preferences among women in different age groups, even when we take into account differences in pre-intervention demand for contraceptive use.

The Impact of Family Planning Programs and Microcredit on Fertility-Related Outcomes

Evaluation of the impact of social programs is typically complicated by the fact that both program placement and individual participation are endogenous, that is, correlated with location or individual unobservable characteristics, which in turn are correlated with the outcome of interest. Family planning programs (FPPs hereafter) are no exception because they often target high-fertility areas and are used by self-selecting individuals. Pitt et al. (1993) and Gertler and Molyneaux (1994) used location-specific fixed effects to evaluate the impact of FPPs in Indonesia in the 1970s and 1980s while accounting for endogenous program placement. Both studies concluded that fertility was only marginally affected by the programs and showed that simple cross-sectional estimates would provide misleading results. Miller (2010) showed that the far-reaching Colombian FPP PROFAMILIA expanded in a near-random fashion and then estimated that it contributed to only 10% of the fertility decline observed during the country’s demographic transition.

Other studies have found substantial effects of FPPs while accounting for the potential endogeneity of program placement and participation. Thomas and Maluccio (2001) found that mobile family planning clinics and community-based distributors led to a substantial increase in contraceptive use in Zimbabwe. Angeles et al. (1998) used a structural model to analyze fertility decisions and the choice of program location and estimated that family planning clinics had a large impact on fertility in Tanzania. Angeles et al. (2005a) adopted an analogous estimation strategy to show that the 1985 enactment of the National Policy on Population in Peru was successful at reducing fertility. Using a stochastic dynamic model of family planning, education, and fertility, Angeles et al. (2005b) reevaluated the efficacy of the FPPs in Indonesia studied in Pitt et al. (1993) and Gertler and Molyneaux (1994). Using data from a longer time span, they found substantial program effects.

A more radical approach for avoiding these identification issues is the evaluation of FPPs by means of a true experimental design. Field experiments to evaluate the impact of FPPs are, however, complex and expensive, which may account for their limited number to date (Bauman 1997; Bauman et al. 1994; Phillips et al. 1999). The evaluation of an outreach family planning and health services project in the rural district of Matlab, Bangladesh, is perhaps the most widely cited social experiment. This long-term study has been conducted in an area followed since 1966 with a Demographic Surveillance System. A community-based program was introduced in 70 of the 149 villages that are part of the System; it consists of training local women who visit all reproductive-aged married women approximately every two weeks. Several studies have documented the sizeable impact of this program, which, however, was unusually intensive (see, in particular, Foster and Roy 1997; Joshi and Schultz 2007; and Sinha 2005).

Other family planning experiments have yielded varying results. Yang et al. (1965) reported results of a study from rural Korea, completed in 1962–1964, of married couples in which the wife was between 15 and 50 years old. In that study, a family planning program involving home visits, group meetings, and clinical services led to a large increase in contraceptive practices in treatment areas (from 8% to 37%), and this was greater than the increase observed in control areas (from 12% to 22%). However, the study was carried out in only two locations, with different preprogram characteristics. Freedman and Takeshita (1969) described the results of an experiment in Taichung (Taiwan) involving about 36,000 married couples with wives aged 20–29 randomly allocated to four treatment groups. They found that home visits and small group meetings increased acceptance of intrauterine devices (IUDs), while mailings and husband involvement did not. Bang (1971) found that randomly assigned requests for early check-up visits in Koyang (South Korea) yielded evidence against the initial hypothesis that such a program would lead to increased IUD retention. Chan (1971) found that a home-visit program in Hong Kong aimed at reducing the number of IUD removals among IUD acceptors had only a negligible impact. Rosenfield and Limcharoen (1972) described an experimental program that allowed auxiliary midwives (rather than physicians) to prescribe oral contraceptives in rural Thailand, leading to substantially larger acceptance rates in treatment versus control provinces.

Experimental evaluations of FPPs in Africa are rare. Most of them have been carried out in West Africa, and almost all of them rely on a very small number of primary stage units. As a result, even though they are very valuable as case studies, it is too early to draw more general inferences on the basis of these results. (See Phillips et al. 1999 for a broader overview of community-based distribution of family planning in Africa.) Omu et al. (1989) reported results from an experiment in Nigeria in which more than 1,000 high-parity women admitted for prenatal care at the University of Benin Teaching Hospital during a 19-month period were randomly assigned to receive either standard family planning information (the control group) or individualized counseling sessions on family planning methods and the health risks associated with high parity. The fraction of women who were using contraception six weeks postpartum was 71% among women in the treatment group and only 51% among the controls.

In the Navrongo study (Binka et al. 1995; Debpuur et al. 2002; Phillips et al. 2006), carried out in a poor and relatively remote area of Northern Ghana, four groups of neighboring communities were randomly assigned to different experimental arms. In the first intervention arm, trained nurses were assigned to live and work in communities. They provided health services and made door-to-door visits to provide family planning services, including contraception methods. In a second arm, analogous tasks were assigned to local volunteers, and community mobilization was emphasized to reduce the social cost of adopting contraception. A third group of communities saw the introduction of both interventions, while a fourth received neither. In areas with both interventions, mean fertility was reduced by one birth in three years, a 15% reduction.

In The Gambia, a community-based information campaign in one primary health care circuit increased contraceptive use (relative to a control area), but improved access to fertility control methods in a third circuit did not lead to any additional increase in contraceptive use (Luck et al. 2000). Elsewhere, Katz et al. (1998) discussed the results of interventions in five subdistricts of rural Mali: A nongovernmental organization’s primary health care system saw the introduction of a community-based distribution program in two subdistricts, information campaigns were introduced in two other subdistricts, and a fifth subdistrict served as a control group. A survey completed a year and a half after the introduction of these interventions showed increased knowledge and use of contraceptives in the subdistricts where the community-based distribution was initiated, but no additional increase was observed in the subdistricts with information campaigns.

Knowledge and availability of contraceptive methods are of course only two of the many factors that influence fertility decisions. At least as important are socioeconomic factors, such as gender-specific human capital, which can modify the cost-benefit calculus of contraceptive use and childbearing (see Schultz 1997, 2005 for overviews). Some have argued that credit programs that encourage borrowing by women can increase the opportunity cost of women’s time, potentially increase their control over household resources, and thus empower them enough to express their fertility preferences (Hashemi et al. 1996; Schuler and Hashemi 1994; Schuler et al. 1997; but see also Mayoux 1999 for an opposite argument).

Some authors have argued that fertility in poor areas may decrease among women with access to microcredit, that is, to small, collateral-free, group-based loans that relax budget constraints and may facilitate the creation of small economic enterprises. The theoretical predictions are, however, unclear. On the one hand, if access to credit enhances women’s economic opportunities, it may reduce desired fertility by increasing their opportunity cost of time. Where women’s desired fertility is lower than men’s, access to credit may also improve the woman’s bargaining power and increase contraceptive use. Empowering effects may also derive from the interactions with other women and social workers that usually accompany microcredit operations. On the other hand, if children are “normal goods,” fertility may actually increase if access to credit results in an increase in income. Overall, the direction of any causal impact of microcredit on fertility is unclear because it depends on income and substitution effects of opposite signs.

A number of studies have discussed these considerations and attempted to empirically test the existence and magnitude of a causal link between access to credit and fertility choices. The identification of the causal link has to overcome the usual selection concerns: microcredit programs are usually nonrandomly placed, and both eligibility and actual participation in the programs are likely correlated with characteristics that also directly affect fertility choices. Using different estimation strategies, several authors have found that in Bangladesh, in the 1990s, participation in credit programs increased contraceptive use and in some cases reduced fertility (Amin et al. 1995; Hashemi et al. 1996; Schuler and Hashemi 1994; Schuler et al. 1997; Steele et al. 2001). Pitt et al. (1999) found instead no evidence of an impact when program participation was measured as a function of the amount borrowed. Buttenheim (2006) used panel data from Indonesia with community and household fixed effects to control for endogenous program placement and participation. She found that the presence of credit programs increases contraceptive use, while borrowing does not. To the best of our knowledge, the link between microcredit and fertility choices has not been studied rigorously outside Bangladesh (with the exception of Buttenheim 2006), and it has never been analyzed using a randomized field experiment.

One further aspect that has not been addressed in the literature is whether microfinance programs can represent useful entry points for family planning services. The regular contact that credit officers have with their clients offers a convenient avenue for providing information on family planning methods, and the group-monitoring element of micro-financing programs has the potential for building a support mechanism for adoption of a new practice. One of the purposes of the evaluation described in this paper was precisely this, to analyze whether such an accessory service offered by loan officers could lead to an increase in contraceptive adoption over and above that achievable by a community-based FPP on its own.

Program Interventions and Study Design

In Ethiopia, the Packard Foundation provides grants and technical assistance to microcredit programs and family planning programs in the Amhara and Oromia regions (see Fig. 1 for a map). In Amhara, the Program supports the credit activities of the Amhara Credit and Savings Institute (ACSI) and the community-based family planning programs of the Amhara Development Association (ADA). In Oromia, support is provided to the credit activities of the Oromia Credit and Savings Share Company (OCSSCO) and the family planning programs of the Oromia Development Association (ODA).

The credit programs target poor households and “emphasize” women borrowers, though no specific activities or criteria are used to seek out this target group. Each organization has a specific set of criteria that it uses to select borrowers, of which creditworthiness, viable business plan, and poverty are the more salient. There is no collateral requirement; instead, borrowers form small groups and take on collective responsibility for repayment of loans. Loans are made for a year at interest rates that reflect market conditions. Credit officers help fill out loan applications and also monitor the groups through monthly or biweekly meetings with all clients. Borrowers are expected to make regular deposits and repayments, and in recent years, the repayment rate has been reported to exceed 95%.

The family planning programs have a community-based orientation. Residents from the communities are trained, given a uniform, and paid a fee for their services. They make house-to-house visits to provide information as well as pills and condoms. They also provide referrals for clinic-based services like injectibles (the main method in use in these regions), but they do not provide injectibles themselves. The organizations also organize other events to provide information on family planning, reproductive health, and sexually transmitted diseases, including HIV/AIDS.

The family planning programs have been in operation for several years and have steadily extended their coverage. In this process, they have continually sought to improve the quality of services they provide and to identify new service delivery options. Given the Packard Foundation’s continued support for microcredit in the same regions, one option that evoked interest was the linking of the family planning programs with the credit programs. The underlying rationale was that the monthly meetings of the borrowers and credit officers present an (additional) opportunity for providing information on family planning and motivating clients to adopt contraception. One of the objectives of the present study is to evaluate this link.

The evaluation was conducted by Family Health International with the purpose of determining whether linking the credit programs of ACSI and OCSSCO with the family planning activities of ADA and ODA could lead to a measurable increase in contraceptive use over and above that achievable by each type of program on its own. An experimental design was used to randomly allocate administrative areas to four groups with different combinations of credit and family planning services. One intervention group received both credit and family planning services, with the credit officers also providing information on family planning. A second intervention group received only family planning services, and a third received only credit services. The fourth arm was a control group that received neither type of program service. Change in contraceptive use (and related variables) was then measured with pre- and post-intervention household surveys.

The randomly allocated administrative areas are kebeles, or peasant associations (PAs). Administratively, Ethiopia is divided (from large to small) into regions, zones, woredas, kebeles, and villages, and so a kebele is a cluster of villages. In 2002, the four organizations identified 133 PAs as areas where they intended to start activities in the coming years. Fifty-five of these are in the Amhara region, and 78 are in the Oromia region. In Amhara, the PAs are from Bugna, Gidan, Meket, Delanta, Metema, Chilga, Alefa Takusa, and Lay Armachiho woredas, which are in the North Wollo and North Gonder zones. In Oromia, the PAs are from Mendi, Harru, Nejo, Ayra Guliso, Sayo, Anfilo, Metu, and Chora woredas, which fall in the West Wollega and Illubabor zones. Population data from the most recent census were used to randomly allocate PAs to the four groups, and the allocation was stratified by region.

Unfortunately, the randomized design was not always followed by the implementing agencies, although deviations remained relatively limited, especially in Oromia. The study protocol was followed in 37 of 55 PAs in Amhara and 66 of 78 PAs in Oromia (see Table 1). Eight PAs already had functioning programs at the time of randomization, of which two were interrupted at the start of the study, and two other PAs were merged by local authorities during the study period. We analyze noncompliance in detail in the next section, which also describes the data and the main estimation strategy.

Data and Methods

A baseline survey was conducted between the months of January and April in 2003, preceding the start of subgrantee programs in the study areas. The survey covered 6,440 households and was spread over 356 villages in the 133 PAs where the family planning and credit organizations intended to expand in the following years. The population in each PA includes a relatively small number of households, ranging from 109 to 1,377. Sampling was designed to select approximately 3,200 households containing at least one woman aged 15–49 years in each of the two regions. Within each region, PAs were randomly allocated to the four study cells so as to yield 800 to 810 households in each cell in each region. In the selected PAs, interview teams obtained a complete listing of all villages along with estimates of the number of households in each village. If a PA had more than 400 households, then three villages were selected at random for interviewing. If the PA had fewer than 400 households, then two villages were selected at random. Within the selected villages, a complete enumeration of households was undertaken, and a random sample of households was selected for interviewing. The sample is not self-weighted; therefore, sampling weights are required to produce unbiased estimates of population statistics.
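The role of the sampling weights can be illustrated with a minimal sketch. It assumes equal-probability sampling at both stages within a PA (villages, then households), which is consistent with the description above but is not spelled out in full there. The function names and the example numbers are hypothetical, and selection of the PAs themselves is ignored.

```python
def villages_to_sample(n_households_in_pa):
    """Village-selection rule described in the text: three villages if the PA
    has more than 400 households, otherwise two. The text does not say what
    happens at exactly 400; this sketch treats that case as the smaller one."""
    return 3 if n_households_in_pa > 400 else 2


def household_weight(n_villages_in_pa, n_villages_sampled,
                     n_households_in_village, n_households_sampled):
    """Inverse-probability weight for one interviewed household.

    Assumes simple random sampling at each stage (an assumption of this
    sketch, not a documented detail of the survey)."""
    p_village = n_villages_sampled / n_villages_in_pa
    p_household = n_households_sampled / n_households_in_village
    return 1.0 / (p_village * p_household)


def weighted_mean(values, weights):
    """Weighted estimate of a population mean (for example, a usage rate)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical PA: 600 households spread over 6 villages.
n_sampled = villages_to_sample(600)              # 3 villages are drawn
# Hypothetical village: 120 households, of which 20 are interviewed.
w = household_weight(6, n_sampled, 120, 20)      # each household stands for about 12
# Two respondents with different weights: the unweighted mean would be 0.5.
example_rate = weighted_mean([1.0, 0.0], [12.0, 4.0])
```

Because selection probabilities differ across villages and PAs, households from under-sampled places receive larger weights, which is why unweighted sample means would generally be biased for population quantities.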

A follow-up survey was completed during the months of April to July 2006. The survey was conducted in the same villages as the baseline survey, but a new sample of households was drawn using the same procedures used in the baseline survey. It is therefore important to note that the two surveys constitute a panel of villages, but not a panel of households. Difficulties in accessing some areas resulted in the survey teams not being able to cover all villages in one PA and one village in another PA. As a result, the follow-up sample has only 6,275 households.

During the final household survey, a community questionnaire was used to collect village-level information on demographics, income sources, infrastructure, access to markets and healthcare facilities, and availability (and timing) of family planning services; unfortunately, similar information was not collected at the time of the baseline survey. During the study period, monthly service statistics data were obtained from the woreda offices of the four subgrantee organizations. These data allow us to determine when program services were first introduced in a PA, what services were provided, and how many clients were served. At the end of the study, the same woreda offices were visited again, and data were collected on provision of program services in all PAs in the woreda. This information allows us to gauge program coverage in a woreda, in the PAs that make up the study area, and in PAs not in the study area.

Table 2 presents selected summary statistics from the baseline survey for the four assigned study groups along with tests of randomization. For each variable, we test the null hypothesis of equal means across the study groups, taking into account the clustered nature of the design. These test results show that in each region, the four study groups are well-balanced along almost all examined dimensions. The main outcome of interest is contraceptive use, and in neither region can we reject the null hypothesis of equal pre-intervention usage rates across the four arms at standard significance levels. In Amhara, the null hypothesis of equality is rejected at the 10% level for three variables: intention to use family planning in the future, desired number of children, and proportion of households who borrowed from revolving credit associations. In Oromia, the null hypothesis cannot be rejected for any variable.

Table 2 also highlights clear demographic and socioeconomic differences between the two regions. Fertility was high in both regions in 2003 but distinctly higher in the study areas in Amhara, where women married earlier, began childbearing sooner, had more births, and wanted to have more children than their counterparts in Oromia. For example, mean desired family size was 4.83 in Amhara, which is 0.5 more than the mean in Oromia. Relative to Oromia, twice as many women were married at age 25 in Amhara (74% versus 37%), and a higher fraction had already had children (53% versus 42%).

There was minimal contraceptive use in both regions in 2003, with the current contraceptive prevalence rate (CPR) at 3.5% in the Amhara study areas and 7.4% in Oromia. Intentions to use were higher in Oromia, where 71% of nonusers said they intended to use contraceptives in the future; the corresponding figure for Amhara was 46%. While contraceptive use and intentions to use were lower in Amhara than in Oromia, awareness of contraceptives was higher: 85% of women in the Amhara study areas were able to list at least one form of contraceptive, and 58% were aware of the existence of pills or injectibles. In Oromia, 78% identified at least one contraceptive, but only 45% were aware of pills or injectibles. The latter results, however, are not necessarily at odds with the lower contraceptive use observed in Amhara because factors other than awareness affect use, and these (like desired family size) point toward lower motivation to use contraception.

The study areas in the two regions are also substantially different in their economic structure, their schooling levels, and the religious affiliation of inhabitants. In both regions, almost all households engage in crop cultivation, but households in Oromia are much more likely (48% vs. 3% in Amhara) to cultivate coffee (a cash crop), even though a smaller fraction sell crops (37% vs. 54% in Amhara). Three quarters of households in Amhara are engaged in livestock maintenance, and the value of their livestock assets is, on average, around twice that of households in Oromia, where livestock care only engages about half the sample. In both regions, 18% of households reported taking or repaying a loan in the 12 months before the interview, and borrowing by women, as well as borrowing from revolving credit associations, was infrequent. Education levels are much higher in Oromia. For example, in Amhara, only 11% of women 15 to 49 years old had ever attended school, while in Oromia, the corresponding figure was 47%. Almost all households in Amhara described themselves as Christian Orthodox, while only one-third of respondents did so in Oromia. In the latter region, religious affiliation is more heterogeneous and also includes 51% Christian Protestants and 13% Muslims. More than half of the household heads in Oromia ever attended school, while only 10.5% did in Amhara.

These differences between the two regions present an interesting possibility for comparing the impact of essentially similar credit and family planning programs in different settings, but we do not pursue this in this paper. Instead, we analyze the two regions separately, largely because different organizations implemented the interventions in each region, and for our results to have programmatic value, it is important that they be region- and therefore organization-specific.

Estimation Strategy

To estimate the impact of the programs on different outcomes, we use a difference-in-differences (DD) approach. Let $y_{pit}$ denote an outcome for individual (or household) $i$ from peasant association $p$ at time $t$, and let $D_t$ denote a binary variable equal to 1 in the post-intervention period (that is, when $t = 1$). Let also $Cr_p$, $FP_p$, and $CrFP_p$ denote binary variables equal to 1 for PAs where the intervention introduces, respectively, microcredit, family planning, or both services (with loan officers also providing contraceptive information). Then, if treatment is assigned randomly, the causal impact of each intervention is measured by the coefficients $\alpha_1$, $\alpha_2$, and $\alpha_3$ in the following regression:
$$y_{pit} = \beta_0 + \beta_1 D_t + \alpha_1 (Cr_p \times D_t) + \alpha_2 (FP_p \times D_t) + \alpha_3 (CrFP_p \times D_t) + \mu_p + \varepsilon_{pit} \qquad (1)$$
where $\mu_p$ captures time-invariant PA-level characteristics and $\varepsilon_{pit}$ is an error term. Eq. 1 cannot be estimated in first differences at the household level because we do not have a panel of households, so household-specific differences cannot be calculated. We therefore transform Eq. 1 into DD form by calculating survey-specific, PA-level means of both sides and taking first differences, which leads to
$$\Delta y_p = \beta_1 + \alpha_1 Cr_p + \alpha_2 FP_p + \alpha_3 CrFP_p + u_p \qquad (2)$$
where $\Delta y_p$ is the change over time in the PA-specific mean of $y_{pit}$, and where the error $u_p$ is the corresponding mean change of $\varepsilon_{pit}$.2 Because the left-side variable in Eq. 2 is a mean, we estimate the model using weighted regression, with weights proportional to the number of observations in the PA.3
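The PA-level procedure just described (collapse each survey to PA means, difference them, and run a weighted regression of the change on a constant and three treatment dummies) can be sketched as follows. This is an illustration on invented toy data, not the estimation code used in the study. The function names, the arm labels, and the choice of weights (total observations per PA across both surveys) are assumptions of the sketch.

```python
from collections import defaultdict


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(x_rows, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    k = len(x_rows[0])
    xtwx = [[sum(wi * x[a] * x[b] for x, wi in zip(x_rows, w))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * x[a] * yi for x, yi, wi in zip(x_rows, y, w))
            for a in range(k)]
    return solve(xtwx, xtwy)


def pa_level_dd(households, arm_of_pa):
    """households: iterable of (pa, t, y) with t = 0 (baseline) or 1 (follow-up).
    arm_of_pa: dict mapping each PA to "control", "Cr", "FP", or "CrFP".
    Returns [constant, effect of Cr, effect of FP, effect of CrFP]."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(lambda: [0, 0])
    for pa, t, y in households:
        sums[pa][t] += y
        counts[pa][t] += 1
    x_rows, dy, w = [], [], []
    for pa in sorted(sums):
        n0, n1 = counts[pa]
        # Change over time in the PA-specific mean of the outcome
        dy.append(sums[pa][1] / n1 - sums[pa][0] / n0)
        arm = arm_of_pa[pa]
        x_rows.append([1.0, float(arm == "Cr"), float(arm == "FP"),
                       float(arm == "CrFP")])
        w.append(n0 + n1)  # assumed weight: observations in both surveys
    return wls(x_rows, dy, w)


def toy_households(pa, users_before, users_after, n):
    """Hypothetical data: n households per survey, outcome 1 = uses contraception."""
    rows = [(pa, 0, 1.0 if i < users_before else 0.0) for i in range(n)]
    rows += [(pa, 1, 1.0 if i < users_after else 0.0) for i in range(n)]
    return rows


data = (toy_households("c1", 1, 3, 10) + toy_households("c2", 2, 4, 20)
        + toy_households("cr1", 1, 3, 10) + toy_households("fp1", 1, 4, 10)
        + toy_households("cf1", 2, 4, 10))
arms = {"c1": "control", "c2": "control", "cr1": "Cr", "fp1": "FP", "cf1": "CrFP"}
coefs = pa_level_dd(data, arms)
```

Because the three dummies saturate the design, each estimated effect equals the weighted mean change in that arm minus the weighted mean change in the control arm, which is a convenient check on any implementation.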

With random assignment of treatment, the parameters of interest can be estimated consistently using Ordinary Least Squares (OLS). However, the analysis of the results is complicated by the imperfect compliance of the subgrantees with the randomization protocol. We therefore analyze whether noncompliance was associated with specific characteristics of the communities involved, as measured using data from the 2003 baseline household survey. In each region, we estimate two regressions, each with a binary dependent variable equal to 1 if the PA actually received, respectively, the FPP or microcredit (MC). The regressors are two dummy variables equal to 1 if, respectively, the FPP or MC was assigned to the PA, and a set of 17 pre-intervention, PA-specific characteristics likely correlated with contraceptive demand and supply, and with the local socioeconomic environment. Here, we only summarize the results; details about the estimates and variable definitions are available upon request.

In Amhara (where, as we documented above, compliance was more problematic), the only predictor of actual family planning (FP) implementation that is individually statistically significant at standard levels is the randomly assigned FP treatment. The joint null hypothesis that the coefficients for all 17 PA-specific characteristics equal zero is rejected at standard levels (p = .0039), but the individual coefficients do not follow any consistent pattern that would suggest that actual treatment status was associated with observables related to existing demand or supply of family planning or to the economic environment. In Oromia, the results are even more strongly consistent with the absence of endogenous placement. First, the coefficient for randomly assigned FP treatment is 0.960, very close to 1. Second, none of the other regressors is significant at standard levels, and in this case, the joint null hypothesis that all PA-specific characteristics do not enter the regression cannot be rejected (p = .9998). When we turn to microcredit, we again find no clear indication of endogenous program placement in Oromia. In Amhara, the null hypothesis that the coefficients for the PA-specific characteristics are jointly equal to zero is rejected (p = .00), although once again we find no easily interpretable pattern that explains how the associated factors might have informed decisions on microcredit program placement. Overall, our analyses show that the deviations from protocol of program implementation do not appear to be associated in any transparent way with PA-level indicators likely correlated with unobserved levels or trends in demand or supply of contraception, especially in Oromia. Note also that OLS estimation of the DD model in Eq. 2 will adequately account for endogenous program placement if endogeneity depended only on time-invariant characteristics at the PA level, which are differenced out in the DD framework. At the same time, correlation with unobserved and time-variant characteristics cannot be ruled out, in which case a simple difference-in-differences approach would not uncover the causal relationship between interventions and outcomes of interest.

In order to measure the causal impact of treatment, we adopt an instrumental variable approach, using treatment assignment, which was random and therefore exogenous, as an instrument for actual treatment, which is potentially endogenous due to imperfect compliance with the study design. In other words, we estimate Eq. 2 with two-stage least squares (2SLS), using three binary indicators for the randomly assigned treatment groups as instruments.4 As a consequence, the number of instruments is the same as the number of endogenous variables, so that Eq. 2 is exactly identified. One limitation of this is that we cannot perform tests of overidentification; this is not a serious concern because the instruments, by construction, are random, and thus the tests are not necessary.
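The logic of instrumenting actual treatment with random assignment can be illustrated with a deliberately simplified sketch. It assumes a single binary treatment rather than the three treatment indicators of Eq. 2; in that special case, 2SLS with a constant reduces to the Wald estimator, that is, the effect of assignment on the outcome divided by the effect of assignment on actual treatment. The function name and all numbers below are hypothetical and are not taken from the study data.

```python
# Minimal sketch: instrumenting actual treatment with random assignment.
# Simplified to ONE binary treatment; all numbers are made up for illustration.

def mean(values):
    return sum(values) / len(values)


def wald_iv(outcome, actual, assigned):
    """IV estimate with one binary instrument (the Wald estimator).

    outcome  -- PA-level change in the outcome between the two surveys
    actual   -- 1 if the PA actually received the program, else 0
    assigned -- 1 if the PA was randomly assigned to the program, else 0
    """
    y1 = [y for y, z in zip(outcome, assigned) if z == 1]
    y0 = [y for y, z in zip(outcome, assigned) if z == 0]
    d1 = [d for d, z in zip(actual, assigned) if z == 1]
    d0 = [d for d, z in zip(actual, assigned) if z == 0]
    reduced_form = mean(y1) - mean(y0)  # effect of assignment on the outcome
    first_stage = mean(d1) - mean(d0)   # effect of assignment on actual treatment
    return reduced_form / first_stage


# Eight hypothetical PAs: four assigned to treatment, four to control.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
actual = [1, 1, 1, 0, 0, 0, 0, 1]  # one PA deviates from protocol in each arm
outcome = [0.16, 0.12, 0.14, 0.10, 0.08, 0.12, 0.10, 0.14]

estimate = wald_iv(outcome, actual, assigned)
print(round(estimate, 4))
```

With imperfect compliance the first stage is smaller than 1, so the IV estimate scales up the difference in mean outcomes between assigned groups.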

We estimate all regressions using linear models, and we adjust standard errors for heteroskedasticity. Note that no adjustment for clustering is necessary because we estimate the regressions at the cluster (that is, at the PA) level. In an attempt to increase precision, we also estimate models in which we include a set of pre-intervention controls. Given the relatively small number of observations (54 in Amhara and 78 in Oromia), we chose a small set of controls, thus avoiding excessive reduction in degrees of freedom. Specifically, we estimate models in which we add, as controls, the same 17 PA-level characteristics we included in the analysis of compliance with the experimental protocol. The DD estimates, by construction, eliminate all PA-specific pre-intervention and/or time-invariant characteristics. This means that such estimates already wipe out residual variance associated with time-invariant PA-level characteristics. Still, the inclusion of covariates may be interpreted as allowing for the model to include time trends associated with pre-intervention PA characteristics, which may contribute to an increase in the precision of the estimates.5

Basic Results

The study areas witnessed substantial demographic and economic change during the three years of this study, though the patterns of change were different in the two regions. In Table 3, we briefly highlight some of these aggregate changes before turning to an examination of the impact of the interventions. As we will show, while the changes are remarkable, particularly so given the relatively short period of time between the two surveys, the evidence suggests that they were not necessarily associated with the interventions.

In Amhara, contraceptive use increased by 9 percentage points, and the percentage of non-users who said they intend to use contraception in the future increased from 46% to 65%. Awareness of contraceptives, which was already high at 85% in 2003, increased to 97%, and the percentage of women who had heard of pills and injectibles, the two most commonly used methods, increased from 58% to 80%. The increase in contraceptive use does not appear to have had much effect on fertility: the total fertility rate (TFR) actually went up by 0.5 births, from 5.5 to 6 births per woman, a significant change at the 5% level. The number of births women had in the three years before the interview also increased from 0.51 to 0.55 (significant at the 1% level). Desired family size was essentially unchanged, and women, on average, continued to want almost 5 children.

In the Oromia study areas, demographic change was more marked. Contraceptive use went up by 14 percentage points among all women (from 7% to 21%), and even more among currently married women, for whom the increase was threefold; in 2006, almost a third were using contraceptives. There was little change in intention to use contraception in the future among non-users, but this was already high (at 71%) in 2003. Awareness increased and, as in Amhara, was almost universal at 97%; awareness of pills and injectibles increased from 45% to 78%. There was a small drop in fertility in most age groups, resulting in a drop in the TFR from 5.1 to 4.8. The change is, however, not statistically significant at standard levels. Desired family size also dropped by 0.5 births and, on average, women in the study areas in Oromia wanted only four children.

In both regions, there is underlying momentum for further change because younger cohorts, who have lower desired family size and high levels of awareness of contraceptive methods, are delaying marriage and the start of childbearing. For example, in the study areas, the fraction of women married by age 25 decreased from 74% to 63% in Amhara and from 37% to 30% in Oromia. The share of women who already had children at 25 also declined, from 53% to 48% in Amhara and from 42% to 28% in Oromia. Furthermore, there has been a large increase in schooling in recent years, and as a result, a substantially larger percentage of younger cohorts have attended school. As younger, better-educated women—with high levels of awareness of contraception and lower desired family size—move into childbearing years, contraceptive use is likely to increase further and also result in lower fertility.

Table 4 shows that remarkable changes also took place in relation to economic indicators such as credit uptake, market participation, and livestock holdings. For example, the percentage of households that took a loan in the 12 months before the interview increased from 18% to 44% in Amhara and from 18% to 37% in Oromia. Even though much of the borrowing in 2006 was still being undertaken by males, the proportion of households in which a woman borrowed increased substantially, from 3% to 10% in Amhara and from 2% to 14% in Oromia. There was a significant increase in the value of livestock holdings, which, in real terms (2003 prices), doubled in Amhara and almost tripled in Oromia. The large increases in the number of animals owned by households show that the increase in livestock value was not merely due to increases in the relative price of animals. In Oromia, household income sources became more diversified, with larger percentages of households deriving income from services, trade, and manufacturing and production. In both regions, but particularly in Oromia, there was an increase in cash crop cultivation and marketing of crops. What is most striking, though, is the large increase in school attendance. In the primary-school age group (6 to 10 years), attendance in Amhara more than doubled, from 17.3% to 41.7%. In Oromia, attendance in these ages increased from 36.2% to 45%. Among 11–14-year-olds, school attendance increased by 27.4 percentage points in Amhara and by 12.5 percentage points in Oromia. Similarly, in the 15–18-year age group, there was a 23 percentage-point increase in Amhara and a 14.8 percentage-point increase in Oromia.6

Overall, Tables 3 and 4 reveal a scenario of remarkable change in the study areas. This is consistent with the important changes documented between 2000 and 2005 in Ethiopia in two nationally representative Demographic and Health Surveys (Macro International Inc. 2007). For example, data from these surveys show a 10% decrease in the fraction of women aged 15 to 49 with no schooling in Amhara, and a 16% decline in Oromia (see Table 2.2 in the report). The increase in contraception among currently married women of childbearing age in the DHS data is also remarkable, from 6.6% to 15.7% in Amhara and from 4.3% to 12.9% in Oromia (see Table 5.2 in the report).

Basic Estimates of Program Impact

Turning to the focus of this paper, Table 5 presents estimates of the impact of interventions on contraceptive use. All estimates should be interpreted as intent-to-treat effects, that is, as the average impact of exposure to the programs in the community rather than the average impact of using any of the studied interventions (see, e.g., Heckman et al. 1999:1903). Three sets of estimates are presented for each region. Columns 1 and 4 display the results from an OLS estimation of Eq. 2, where exposure to interventions is defined by dummy variables for actual exposure to intervention. These estimates do not identify the causal program impacts if program placement is systematically correlated with unobserved differences in trends across the assigned treatment groups. For this reason, we also estimate a second model using 2SLS (columns 2 and 5), with the assigned exposure dummy variables as instruments for actual exposure. Such instruments, which are strongly correlated with actual exposure in both regions, are also arguably exogenous because their being randomly determined implies that their only correlation with the dependent variable should be through actual treatment. Finally, columns 3 and 6 show the results of 2SLS estimation with the inclusion of pre-intervention PA-level characteristics as controls.

Even though our focus is on the two-stage least squares specifications, the OLS results are, on the whole, very close to those obtained using instrumental variable estimation, suggesting that deviations from the study protocol in program placement were not strongly correlated with unobserved location-specific differences in outcome trends. We also test formally the null hypothesis that actual program placement is exogenous, using a test that is robust to the presence of heteroskedasticity of unknown form. Under the null hypothesis, both the 2SLS estimator in Table 5 and an alternative 2SLS estimator, where both actual and randomly assigned treatment are used as instruments, are consistent.7

As we discussed earlier, the dummy variables for assigned treatment (the instruments) are strongly associated with actual treatment (the endogenous variables). However, we also formally test the null hypothesis that the estimated equations are identified by using a Kleibergen-Paap rk LM statistic, which tests the null hypothesis of underidentification (Kleibergen and Paap 2006).8 We do not provide results for formal tests of instrument strength. Because our model has three instruments and three endogenous variables, we cannot use the “usual” rule-of-thumb F test, where the value of the first-stage F test for the exclusion restrictions is compared with a threshold equal to 10, below which weak instruments should be suspected. Such a test is appropriate only when there is a single endogenous variable (see Section 4.2 in Stock et al. 2002).9

The sample is restricted to currently married women because in rural Ethiopia, contraceptive use occurs essentially only within marital unions; in the sample, 95% of contraceptive users are currently married. Results for all eligible women (15 to 49 years of age) are not substantively different and are available upon request. The estimated intercepts in the models without added controls (columns 1, 2, 4, and 5) are large and significant at the 1% level, indicating that contraceptive use increased in the control group between 2003 (baseline survey) and 2006 (follow-up survey) by about 12 percentage points in Amhara and by twice as much in Oromia.10 This is consistent with the aggregate results in Table 3. However, in both regions and in all regressions, almost all the coefficients that measure intervention impacts are small, not significant at standard levels, and in most cases negative, suggesting that, if anything, the increase in contraceptive use in the intervention groups was slightly smaller than in control groups. The only two exceptions arise for the impact of microcredit. The null hypothesis of no impact is rejected at the 5% level in Amhara when we estimate the model with OLS and at the 10% level in the 2SLS estimates with added controls for Oromia. In both cases, the point estimate is negative (−0.066 and −0.076, respectively). The null hypothesis of no differential change in contraceptive use between the linked group and the groups with only family planning services is never rejected at standard significance levels (row 3). The p values in row 4 indicate that the null hypothesis of exogeneity of program placement is never rejected at the 1% or 5% level, although it is rejected at the 10% level in Oromia (column 6). Note finally that the null hypothesis of underidentification is always rejected at standard levels (row 5), which is not surprising given that assigned treatment is a strong predictor of actual treatment in both regions.
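The reading of the intercept as the change in the control group can be made concrete with a small sketch. It assumes, as the interpretation above suggests, that the dependent variable is the PA-level change in contraceptive use and that the regressors are a constant plus mutually exclusive group indicators; the exact form of Eq. 2 may differ. Under that assumption, OLS reproduces simple differences in mean changes. Group labels and numbers are hypothetical.

```python
# Sketch: with a constant and mutually exclusive group dummies, OLS on PA-level
# changes reproduces differences in mean changes. All data are hypothetical.
from collections import defaultdict


def dd_by_group(changes, groups, control="control"):
    """Return (intercept, effects): the control-group mean change and each
    other group's mean change minus the control-group mean change."""
    by_group = defaultdict(list)
    for change, group in zip(changes, groups):
        by_group[group].append(change)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    intercept = means[control]
    effects = {g: m - intercept for g, m in means.items() if g != control}
    return intercept, effects


# Change in contraceptive prevalence (2006 minus 2003) in eight made-up PAs.
changes = [0.12, 0.14, 0.10, 0.12, 0.11, 0.09, 0.13, 0.11]
groups = ["control", "control", "fp_only", "fp_only",
          "credit_only", "credit_only", "linked", "linked"]

intercept, effects = dd_by_group(changes, groups)
print(round(intercept, 3), {g: round(e, 3) for g, e in effects.items()})
```

In this made-up example, negative entries in `effects` would correspond to intervention groups whose prevalence rose by less than in the control group, which is the pattern described for most coefficients in Table 5.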

Table 6 reports the 2SLS results for other correlates of contraceptive use, namely, the intention to use family planning in the future among current non-users, the number of contraceptive methods the woman has heard of, the number of births in the previous three years, and desired family size.11 The number of methods known to the respondent (which is at most 11) includes methods mentioned by the respondent either spontaneously or after prompting. As in Table 5, we estimate Eq. 2 both with and without additional controls, and we perform the same set of tests. In all cases, the null hypothesis of underidentification is rejected at the 1% level.

In Amhara, the null hypothesis of no program impact cannot be rejected at standard significance levels in all but one case. The only exception is the estimate for the introduction of microcredit only, which is estimated to have increased desired family size by almost one child and is significant at the 10% level (column 7), although even this coefficient is no longer significant after we include pre-intervention controls (column 8). In Oromia, too, we find that most estimates are close to zero and not statistically significant at standard significance levels, although the picture is more complex. When we include pre-intervention covariates, the introduction of microcredit predicts a 9% decline in the intention to use contraception (significant at the 10% level) and an increase of 0.38 in desired family size (significant at the 5% level). The predicted impact on awareness and births in the previous three years is instead negative but small and not significant. Overall, these results are consistent with microcredit leading to an increase in the demand for children.

All estimates of the impact on number of births in the previous three years are negative, and even though in some cases they are statistically significant, the magnitude is always small, ranging from −0.106 to −0.166. When we look at estimated program impacts on desired family size, we find instead positive and, in most cases, statistically significant results. In the model with added controls (column 8), the already mentioned 0.4 predicted increase associated with microcredit is complemented by a similar increase in FP areas (0.42, significant at the 5% level) and a lower increase in PAs where both services were introduced (0.27, significant at the 10% level).

Overall, then, the impact of the credit and family planning programs on contraception and other fertility-related outcomes appears to have been limited and, in some cases, even opposite of what was expected. In the next section, we examine possible explanations and their implications, as well as alternative estimates of program impacts.

Discussion

What explains the limited impact revealed by our estimates? A first important drawback of the intent-to-treat analysis described in the previous section is that the parameter being estimated is a measure of the impact of the programs on mean outcomes for the whole targeted population. This implies that the intent-to-treat estimates will be a weighted average of likely heterogeneous responses, with weights proportional to the prevalence of each response type in the population. In particular, the introduction of FPPs should be expected to be more effective in communities with a higher degree of latent demand for contraception. First, we analyze the hypothesis that program impact differed among women in different age groups or among communities where indices of demand for contraception are dissimilar. In addition, we analyze further how the results may be explained by the design of the study and by the coverage and content of the interventions. As a final step, we look at patterns of migration in the study areas to see if the observed results could be due to selective migration related to program placement.

Heterogeneity in Program Response

The lack of significant program impacts on contraceptive prevalence may, in principle, hide the existence of heterogeneous response. In particular, gauging whether the interventions had some efficacy among women with latent demand for family planning methods is important because, given the relatively short duration of the study, one would expect increases in contraceptive use due to easier access to contraception to derive mostly from the satisfaction of existing demand. We first explore the hypothesis that different age groups may have been impacted differently by the programs by re-estimating Eq. 2 separately for different cohorts defined on the basis of women’s age at baseline, that is, in 2003. Overall, we find that contraceptive adoption increased especially among younger women, but the lack of program impacts was shared by all demographic groups.12

Age, however, is a very poor indicator for demand for contraception. On the one hand, larger impacts can be expected among younger women, who are on average better educated and who may be more willing to accept methods of fertility control largely ignored in the past by older women. On the other hand, older women may be more likely to adopt because they are, on average, closer to achieving their desired family size. To probe further whether the program impacts were related to pre-intervention demand for contraception, we reestimate the basic model in Eq. 2, interacting treatment status with indicators of demand, which we construct using baseline information on the desire to delay or avoid past pregnancies. More specifically, we measure “demand” as the fraction of ever-pregnant women of child-bearing age in the PA who (or whose partner) would have preferred to avoid or delay the current pregnancy (if any) or the previous one (if any). Admittedly, this measure of demand is not perfect. First, wanting to delay or “avoid” a pregnancy may signal, but does not imply, a demand for contraception: instead of using contraceptives such as condoms, pills, or injectibles, couples may decide to regulate their fertility with abstinence, or they may differ in their attitudes toward pregnancy risk. Second, the measure is subject, to some extent, to post hoc rationalization, that is, respondents may avoid describing a pregnancy or its timing as unwanted because they prefer to see their fertility outcomes as the result of rational choices, or they may feel uneasy about describing the birth or conception of a child as undesired. In addition, the indicator may to some extent reflect not only demand but also the cost and availability of contraception (Pritchett 1994).
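The construction of the PA-level demand measure can be sketched as follows. The record layout and the field names (`pa`, `ever_pregnant`, `wanted_to_delay_or_avoid`) are hypothetical stand-ins for the baseline survey variables, and the example records are made up.

```python
# Sketch of the PA-level "demand" index. Field names are hypothetical.

def pa_demand(women):
    """Fraction of ever-pregnant women in each PA who report that the current
    or the previous pregnancy was wanted later or not at all."""
    totals, unwanted = {}, {}
    for w in women:
        if not w["ever_pregnant"]:
            continue  # the measure is defined only over ever-pregnant women
        pa = w["pa"]
        totals[pa] = totals.get(pa, 0) + 1
        if w["wanted_to_delay_or_avoid"]:
            unwanted[pa] = unwanted.get(pa, 0) + 1
    return {pa: unwanted.get(pa, 0) / n for pa, n in totals.items()}


# Made-up baseline records for two PAs.
women = [
    {"pa": "A", "ever_pregnant": True, "wanted_to_delay_or_avoid": True},
    {"pa": "A", "ever_pregnant": True, "wanted_to_delay_or_avoid": False},
    {"pa": "A", "ever_pregnant": True, "wanted_to_delay_or_avoid": False},
    {"pa": "A", "ever_pregnant": True, "wanted_to_delay_or_avoid": False},
    {"pa": "A", "ever_pregnant": False, "wanted_to_delay_or_avoid": False},
    {"pa": "B", "ever_pregnant": True, "wanted_to_delay_or_avoid": True},
    {"pa": "B", "ever_pregnant": True, "wanted_to_delay_or_avoid": False},
]

demand = pa_demand(women)
print(demand)
```

Women who were never pregnant are excluded from the denominator, so the index measures unwanted or mistimed pregnancies among women with at least one pregnancy, not among all women of childbearing age.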

In Amhara, the values of the PA-specific measure of demand have mean and median equal to 0.19 and range from 0% to 38%. In Oromia, consistent with the descriptive statistics presented in Table 1, demand for contraception is significantly higher, with the median equal to 0.37 (almost twice as large as in Amhara) and the values ranging from 2% to 81%. We then re-estimate Eq. 2 with 2SLS, adding interactions between the treatment variables and our measure of demand and using as outcomes both contraceptive use and the other outcomes analyzed in Table 6. We treat the three actual program dummy variables and their interactions with demand as six endogenous variables, and we use the randomly assigned program dummy variables and their interactions with demand as the six instruments. We also include demand itself in the model, but because demand is measured before the intervention, we treat it as exogenous. The results indicate that, in both regions, the null hypothesis of no interactions between demand and program indicators is never rejected at standard significance levels.13 Overall, we are left to conclude that the FPPs were ineffective in changing contraceptive use even after taking into account differences in preexisting demand. So, in the next subsections, we consider other possible explanations for these results.

Study Design and Contamination

The design of the study called for randomly allocating administrative areas (the PAs) to the four study groups, and for the most part, the process followed protocol. Subgrantee organizations provided the list of PAs to be randomized, and this consisted of areas they intended to start programs in. Randomization was undertaken by Family Health International, and the randomized list was communicated to the implementing organizations in both regions. It turns out that the list was not entirely error-free because programs were already functional in 8 of the 133 PAs. Depending on the allocation, these were either continued or interrupted. We analyzed the results excluding these areas, but all the conclusions remain essentially unchanged.

As already indicated, implementation of the interventions deviated from the study protocol in a number of PAs (see Table 1). This occurred either because of pressure from local authorities or due to organizational decisions related to availability of services from other organizations or inaccessibility of PAs. Note, however, that we do not find evidence of important systematic correlations between program implementation and pre-intervention community characteristics. We also note that the DD estimates control, by construction, for any pre-intervention difference in PA characteristics, as long as these enter the outcome equation in an additively separable form. In addition, the 2SLS estimates should take care of the possible endogeneity of program placement, which may still generate inconsistent estimates of program impacts if program placement was correlated with group-specific unobserved trends. At the same time, the estimates that include pretreatment characteristics also implicitly control for differences in trends explained by such added variables. On the other hand, neither DD nor 2SLS estimates can overcome the confounding impact that may derive, first, from spillovers across treatment groups and, second, from the fact that nonprogram FPPs may have endogenously reacted to the introduction of program FP services in a way that could have attenuated any impact of our study interventions. We address these two issues in turn.

The possibility of spillovers across study areas is especially relevant for the impact of family planning services, for which information was a central component. To the extent that PAs in the two groups that did not receive family planning services from ADA or ODA (the control group and the credit-only group) bordered those with family planning services from the subgrantees, there could have been spillover of information and infusion of the idea of family limitation. This possibility cannot be ruled out because the program office is organized at the level of the woreda, the larger administrative area within which the study PAs are based, and each woreda was blanketed by family planning services from ADA and ODA. For example, program functioning data show that in the study woredas, on average, 70% to 73% of all PAs had family planning services from one of these two organizations. Also, we find that all woredas, each of which included at least one control PA, also included at least one PA where FP services were introduced. While this certainly raises the possibility that information available in PAs with family planning programs spread to other PAs, its impact is unlikely to have been large for at least two reasons. Firstly, personal contact with community-based reproductive health agents, and their motivational impact, was available only in the designated PAs. Secondly, injectibles are the preferred method for most women in this region, and the health agents only provided pills and condoms (an issue we return to later).

Another possible source of study contamination is the presence of nonprogram providers of FP services. The four study groups are defined in terms of exposure to the credit and family planning services provided by the subgrantee organizations. There was no control over the health services provided by government facilities and providers, and only limited ability to influence the actions of other organizations. The expectation was that the initial randomization of PAs would yield a random distribution of government health facilities and services from other organizations, so that the services provided by the subgrantees could be viewed as being “additional.” However, the presence of additional programs may have contributed to the attenuation of any cross-group differences in the impact of the interventions implemented by the subgrantees if such other programs were disproportionately introduced after 2003 in communities not impacted by ADA or ODA.

Data from a community questionnaire administered at the same time as the post-intervention survey suggest that in about half of all surveyed villages, family planning services were available from non-ADA/ODA sources. In Oromia villages, these other sources were primarily public-sector providers like the “Health Post,” “Health Center,” or “Health Worker,” and in Amhara, nongovernmental organizations played an important role (see Table 7). In Oromia, 46% to 55% of villages have non-ODA family planning services, but the differences between the four study arms are not statistically significant; a chi-square test of association is not able to reject the null hypothesis that the presence of nonintervention programs is the same across all treatment arms (p = .535). However, in the Amhara region, the presence of nonintervention programs is not similar across the four study arms, and the control villages are much more likely to have services from other sources (58% vs. 21%–50% in the other three groups); a chi-square test of association strongly rejects the null hypothesis of equal distributions (p = .000).
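A chi-square test of association of this kind can be sketched for a 2 × 4 table (presence or absence of nonintervention services by study arm). The sketch assumes a standard Pearson test with 3 degrees of freedom; the text does not say which implementation was used, and the village counts below are hypothetical, not the study data. For 3 degrees of freedom the upper-tail probability has a closed form, which avoids any dependency beyond the standard library.

```python
# Sketch: Pearson chi-square test of association for a 2 x 4 table.
# Counts are hypothetical; the closed-form p value is valid ONLY for 3 df.
import math


def pearson_chi2(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Rows: villages with / without services from other sources.
# Columns: the four study arms (made-up counts).
table = [[14, 6, 10, 10],
         [6, 14, 10, 10]]

stat = pearson_chi2(table)
p_value = chi2_sf_df3(stat)
print(round(stat, 3), round(p_value, 4))
```

A small p value would indicate that the share of villages with nonintervention services differs across arms, as reported for Amhara; a large one is in line with the Oromia result.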

These data suggest that in the case of Oromia, it is reasonable to assume that the presence of other nonintervention services did not compromise the study design. In the Amhara study areas, it is possible that the availability of family planning from nonpublic sources compensated for the lack of ADA services, though data also indicate that this is likely only in the control group villages. No such compensatory placement is evident for the credit-only group, which also did not receive services from ADA. Indeed, the credit-only group is least likely to have received services from all other sources (public and nonpublic), and yet we observe no difference in change in contraceptive use between any of the other groups and this group. Note also that the two intervention groups in Amhara (receiving both family planning and credit services or receiving only family planning) are more likely to have had public providers, who are the primary source for injectibles, the method of choice. Discussions with organization officials also indicate that placement of public providers—who provide a wide range of health services besides family planning—is unlikely to have had anything to do with the operation of the ADA program. In fact, if anything, the unequal presence of these providers should have produced an upward bias to any difference between the family planning intervention groups and the groups not receiving these services. Unfortunately, we do not have baseline data on the presence of these other programs, so it is not possible to provide a definitive answer to this issue. But given the role played by public providers and their disproportionately higher presence in the intervention villages, it seems reasonable to conclude that placement of other services probably did not have a major bearing on the study results.

Coverage and Content of Interventions

If programs did not reach sufficiently large numbers of individuals or if the types of services they provided were not consistent with what women want or what holds them back from adopting family planning, then the interventions might not have any demonstrable impact on contraceptive use. We examine these issues separately for credit and family planning interventions because the functioning of the former is relevant to the question of linking while the latter might help explain the lack of any impact of family planning programs.

The microcredit programs operated by ACSI and OCSSCO are necessarily limited in their coverage because, like several other microcredit programs, they employ selection criteria that restrict lending to certain types of individuals and households. Service statistics show that, on average, the credit programs serve between 112 and 125 clients per PA per month, and while 60% to 70% of these clients are female, these account for no more than 20% to 25% of the adult population of a PA. The number of credit clients is higher in Amhara, and even though in both regions the number of clients (per month) increased over the two-year period covered by these data, at the end of this period, these still make up no more than 28% of households in the credit intervention PAs. In both regions, coverage of households in the linked PAs (at 25%) is significantly lower than that in the unlinked, credit-only PAs (34%). This means that even if the linking of credit and family planning services were to lead to higher contraceptive use among borrowers (from the credit program), this might not get reflected in a group-wide measure of contraceptive prevalence. Of course, it is an open question as to whether this type of linking even has any effect on the subset of the population that borrows from the credit program.

We next examine data on borrowing to see if there is any relationship between borrowing, awareness of family planning methods, and contraceptive use. Our intent is not to establish a causal relationship between these variables, but to see if there is any association between participation in the credit intervention, contraceptive awareness, and contraceptive use. Borrowing and contraceptive use are both individual decisions, and as such are affected by individual characteristics, only some of which are observed in our data set. Factors such as entrepreneurship, quality of schooling, risk aversion, and attitudes toward modern contraceptive methods are all likely to affect both outcomes, but they are all inherently hard to measure. Establishing a causal relationship between borrowing and contraceptive use requires identifying at least one variable that affects borrowing but not contraceptive use. Such a requirement does not seem to hold for any of the variables in our data, so even though exposure to a credit program is randomized in our experiment, we cannot identify the causal impact of borrowing on fertility-related choices.

Table 8 presents data on contraceptive awareness and use among women from households that did not take any loans in the 12 months preceding the follow-up survey, those that took loans from the subgrantee credit organizations (ACSI and OCSSCO), and those that took loans from other sources. On the whole, those who took a loan are somewhat more likely to be aware of family planning methods, but the differences are minor and not statistically significant (test results not shown but available on request). Here, awareness is measured by the number of family planning methods mentioned by a woman either spontaneously or after being prompted. Contraceptive use displays greater differences between the different types of households, and there is some indication that contraceptive use is higher among women from households that are engaged in the credit market. However, there is no difference between those who borrowed from ACSI/OCSSCO and those who borrowed from other sources, suggesting that the information provided by credit officers did not necessarily lead to their clients having appreciably higher levels of awareness and contraceptive use. What this means is that not only do the credit programs reach a subset of households, but the type of family planning service they provide (information) is largely redundant. It is, therefore, not surprising that linking the credit programs of ACSI and OCSSCO and the family planning programs of ADA and ODA, in this particular way, had no impact on contraceptive use.

Turning to the coverage and content of the family planning intervention, there are few concerns with coverage because most programs (87%) started within 12 months of the baseline survey and were in operation for at least 24 months of the 36-month study period. We also find that program duration has little bearing on levels of contraceptive use (Family Health International 2007). In addition, service statistics data from the woreda offices show that the programs covered at least 50% of eligible households in the initial 9-month period (August 2004 to April 2005) and almost 60% in the later 12-month period (Family Health International 2007). Interestingly, in both regions and over both time periods, the rate of household coverage was much greater in the PAs that received both credit and family planning programs. This was not part of the study design but is an important finding nevertheless because, even with a more intensive effort in the linked group, contraceptive prevalence increased by the same amount in all groups.

The content of these family planning programs is more likely to be the reason for their limited impact on contraceptive use. By all accounts, the information provision activities of the community-based agents were remarkable, but these do not seem to have translated into significantly higher levels of awareness among women in intervention PAs. This might well reflect the limitations of the survey instrument and of the questions that we use to measure knowledge and awareness, but it is important to remember that awareness was already high before these programs were introduced (Table 2), so limited awareness was not the main barrier to adoption of family planning.

A bigger shortcoming of the programs might have been that the contraceptives provided by the community-based agents, that is, pills and condoms, were not the ones women were increasingly turning to by 2006. Figure 2 shows that in 2003, the method mix was dominated by injectibles and pills, with injectibles making up a larger share in Amhara and pills a larger share in Oromia. With contraceptive prevalence at only 3% in Amhara and 7% in Oromia in 2003, however, these shares do not translate into large numbers of users. Over the next three years, the method mix shifted toward injectibles. By 2006, almost 80% of women using contraceptives in Amhara were using injectibles, and in Oromia the share of injectibles was almost 62%. Since women have to go to a health center or clinic for an injectible, location of these facilities, more than the efforts of community-based agents, might at least partly account for differences in contraceptive use across communities. Indeed, Fig. 3 shows that there is a clear correlation between contraceptive use among currently married women and distance to the nearest health center. Of course, such correlation does not necessarily indicate that a causal relation exists, because women who live at different distances from health centers might also differ along several other characteristics, such as attitudes toward contraceptives or schooling levels.
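
The remark that method shares at low prevalence do not translate into many users can be illustrated with simple arithmetic. The sketch below is only illustrative: the 2003 prevalence figures are taken from the text, whereas the 60% method share is a hypothetical value chosen for the example (the actual 2003 shares are shown in Fig. 2 and are not reproduced here), and the function name is our own.

```python
def users_per_thousand(prevalence: float, method_share: float) -> float:
    """Expected users of one method per 1,000 women.

    prevalence   -- fraction of women using any contraceptive (0 to 1)
    method_share -- fraction of users relying on the given method (0 to 1)
    """
    if not (0.0 <= prevalence <= 1.0 and 0.0 <= method_share <= 1.0):
        raise ValueError("prevalence and method_share must lie in [0, 1]")
    return 1000.0 * prevalence * method_share


# 2003 contraceptive prevalence as reported in the text.
PREVALENCE_2003 = {"Amhara": 0.03, "Oromia": 0.07}

# Hypothetical share for illustration only; not a figure from the study.
HYPOTHETICAL_SHARE = 0.60

for region, prevalence in PREVALENCE_2003.items():
    users = users_per_thousand(prevalence, HYPOTHETICAL_SHARE)
    print(f"{region}: about {users:.0f} users of the method per 1,000 women")
```

Even a method that dominates the mix accounts for only a few dozen users per 1,000 women when overall prevalence is in the single digits.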

Migration

As in virtually any program evaluation, results can be biased by selective attrition. In our case, it is important to establish whether the results we have described are likely to be driven by selective migration. Recall that the two surveys that constitute our data set are repeated cross sections from the same list of villages. This, unfortunately, does not allow us to evaluate the extent of migration away from the sample villages. On the other hand, the follow-up survey includes a random sample from the complete listing of households that resided in the selected villages at the time of the field work. In this post-intervention sample, only 80 of 6,275 respondents (about 1.3%) report having lived in their village for less than four years. Information on the reason for migration is available for only 46 of these households, but in no case is availability of family planning or microcredit indicated as a reason for relocation. Overall, the data suggest that the extent of migration in the study areas between the two surveys was very limited and unrelated to the interventions. Note also that any selective migration would likely bias the estimated impact of the interventions upward, because we would expect relocation to be mostly toward areas where the programs have been introduced and by households that intend to use the programs.

Conclusions

The results of this study show that, in the study areas, linking credit and family planning services did not increase contraceptive use any more than what was achieved by either program on its own. More importantly, neither type of program, linked or unlinked, led to an increase in contraceptive use that is significantly greater than that observed in the control group. We also do not find systematic differences in program impact among women of different ages or in areas where pre-intervention measures of demand for contraception were higher.

When interpreting these findings, and in assessing whether they can be extended to other geographical and institutional frameworks, it is important to recognize the specifics of the interventions and the study locations; in other words, the usual caveats about external validity of experimental results should be kept in mind. In the programs evaluated in this paper, the linking of credit and family planning services took the specific form of credit officers providing information on family planning to their clients, and the family planning services offered relied on using community-based reproductive health agents to inform and motivate potential users and to provide nonclinical contraceptives (pills and condoms) and referrals for clinical methods. It is also possible that fertility behavior did not respond to the intervention because the study period (three years) was not long enough to allow reproductive behavior and especially demand for contraception to change. On the other hand, as we have documented, even in such a relatively short period, the study areas did, as a whole, experience a large increase in contraceptive use. This change does not appear to be associated with the programs we evaluated, but is most likely related to the improved availability of injectibles in local health facilities. This latter observation still strongly points to the potential importance of family planning service provision in changing contraceptive behavior.

Our finding that linking credit and family planning services does not have incremental benefits for contraceptive use is quite robust. The lack of differences in change in contraceptive use is not an isolated finding; it is confirmed by the lack of statistically significant differences in current fertility, contraceptive awareness, intentions to use contraception, and other relevant demographic variables. We hypothesize that linking has such a limited impact because the credit programs reach only one-quarter of all adults and provide them only with information, which is important but probably not the main constraint. If linking were to take a form that altered the incentive structure for contraceptive use, say by offering credit on better terms to women or to contraceptive users, it might have a greater impact, although our data are silent about this possibility. The data show higher contraceptive use in households that are engaged in the credit market, but because this correlation cannot be interpreted causally, it is not clear whether an expansion of credit access would lead to an increase in contraceptive use. Indeed, the 2SLS results reported in Table 5 show that PAs where credit was expanded saw relative declines in contraceptive use, even though the estimates are small in magnitude and not statistically significant at the 5% level.

Our second finding, that the family planning programs of ADA and ODA had no measurable impact on contraceptive use, is perhaps more surprising. We have hypothesized that, besides the relatively short time between pre- and post-intervention surveys, the most likely cause for the lack of impact is the fact that community health agents were able to supply condoms and pills but not injectibles, despite injectibles being the most commonly used contraceptive method in the area, especially in the post-intervention year. We have also argued that spillovers from intervention to control areas (or from neighboring nonstudy PAs where ADA or ODA programs were already in place) may have further attenuated any impact. On the other hand, these concerns are mitigated by the fact that information about contraceptives was already widespread in study areas and by the fact that ADA-/ODA-trained health agents did not supply injectibles. Similarly, we do not find clear evidence (especially in Oromia) that the lack of impacts on fertility behavior and preferences may have been caused by the entry of alternative family planning services offered by public or private structures other than ADA or ODA.

Overall, given women’s preference for injectibles and the importance of location of the health center for provision of injectibles, one obvious modification of the family planning programs operated by ADA and ODA is to train their community-based reproductive health agents to provide injectibles. As it turns out, independent of this evaluation, the Ethiopian government has recently adopted exactly this type of approach and started placing trained village health workers in each PA.

Acknowledgements

Jaikishan Desai was employed by Family Health International during the course of this study. We thank the David and Lucile Packard Foundation for financial support and encouragement for the study, the Packard Foundation in Ethiopia for assistance with coordinating all aspects of the study, Birhan Research and Development Consultancy and Miz Hasab Research Center for conducting the two household surveys, and the four subgrantee organizations (ACSI, ADA, OCSSCO, and ODA) who, despite several pressures, extended their cooperation in implementing their interventions according to the study design. Last but not least, we are grateful to Laura Chioda, Jed Friedman, Jonathan Robinson, seminar participants at the World Bank and NEUDC (Boston), the Editor, and especially two anonymous referees for valuable comments and suggestions. The authors remain solely responsible for all remaining errors and omissions as well as for all the views and interpretations expressed throughout the paper.

Notes

1

There is not much consensus, however, on the overall effectiveness of such programs in achieving these goals. While Pritchett (1994) is skeptical of family planning programs, others like Bongaarts (1994) have argued for an important albeit nondominant role. An intermediate position is taken by Freedman (1997), whose literature survey concluded that, while an impact on fertility preferences has rarely been documented convincingly, several family planning services have allowed families to meet existing demand for fertility control.

2

Alternatively, Eq. 1 could be estimated directly using PA-fixed effects. An earlier draft of this paper used this approach and led to almost identical results.

3

More specifically, we calculate the weights as the mean number of observations from a given PA in the baseline and in the follow-up. These numbers are not always identical, although they are always very similar (the correlation is approximately .97 in both regions). Note also that the dependent variable in Model 2 is a weighted PA-specific mean, where all observations are weighted using the village-specific sampling weights.
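
A minimal sketch of the weighting described in this note is given below. It is an illustration rather than the code used for the paper: the function names and the example numbers are hypothetical, and the sketch assumes that each PA contributes one count per survey round and that each observation carries a single sampling weight.

```python
from typing import Sequence


def pa_weight(n_baseline: int, n_followup: int) -> float:
    """Regression weight for a PA: mean number of observations
    across the baseline and follow-up surveys."""
    return (n_baseline + n_followup) / 2


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """PA-specific mean of an outcome, weighting each observation
    by its sampling weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical PA with 40 baseline and 44 follow-up observations.
print(pa_weight(40, 44))

# Hypothetical contraceptive-use indicators with sampling weights.
print(weighted_mean([1, 0, 1], [2.0, 1.0, 1.0]))
```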

4

Alternatively, Eq. 1 could be estimated with 2SLS and PA fixed effects. This estimation strategy, which we adopted in an earlier version of the paper, leads to almost identical results.

5

We note, however, that in small samples, the inclusion of pre-intervention characteristics does not necessarily increase the precision of the estimates, even in situations in which the randomization was done carefully. On the one hand, the R² of the regression will, by construction, increase. However, in small samples, the orthogonality between assigned treatment and the residuals usually does not hold exactly, with the consequence that the inclusion of other covariates is not guaranteed to decrease the standard errors. We also note that, although the inclusion of additional covariates is often advocated by randomized controlled trials practitioners (see, e.g., Duflo et al. 2008), the practice has been criticized by others as an ex-post adjustment not justified by randomization (see Deaton 2010; Freedman 2008).

6

The changes in socioeconomic status described in Table 4 were relatively similar across treatment groups with the exception of borrowing, which, not surprisingly, shows significantly larger increases in areas where microcredit was introduced. For almost all other variables, we cannot reject the null hypothesis that the changes were the same across the four (actual) treatment groups. The detailed results are available upon request from the authors.

7

For a description of the test, which is identical to a Hausman test under conditional homoskedasticity, see Hayashi 2000:200–201 or Baum et al. 2007:16.

8

A version of the test, robust to the presence of heteroskedasticity or clustering, can be performed by using the Stata command ivreg2. This test can be considered as a generalization of the Anderson canonical correlation rank statistic to the non-i.i.d. case. The null hypothesis is that the smallest canonical correlation is zero, in which case the equation is not identified. A rejection of the null hypothesis indicates instead that the excluded instruments are relevant. The results of the first-stage regressions are available upon request from the authors.

9

In the presence of more than one endogenous variable, multivariate versions of the test have been developed that evaluate the first stage for all endogenous variables jointly. Critical values for such tests have been developed only for specific combinations of the number of instruments and endogenous variables (see Tables 1 and 2 in Stock and Yogo 2002). Unfortunately, such critical values do not exist for a case such as ours, where there are three endogenous variables and three excluded instruments.

10

In the model with pre-intervention controls estimated in columns 3 and 6, the intercept does not have a meaningful interpretation.

11

The OLS estimates, which for brevity are not reported, are available upon request from the authors.

12

The full results are available upon request.

13

The full results, omitted for brevity, are available upon request. We also note that the results for Amhara should be interpreted with caution because the Kleibergen-Paap tests indicate that the null of underidentification is not rejected at standard levels.

References

Amin, R., Hill, R. B., & Li, Y. (1995). Poor women’s participation in credit-based self employment: The impact on their empowerment, fertility, contraceptive use, and fertility desire in rural Bangladesh. Pakistan Development Review, 34, 93–119.
Angeles, G., Guilkey, D., & Mroz, T. (1998). Purposive program placement and the estimation of family planning program effects in Tanzania. Journal of the American Statistical Association, 93, 884–899. doi:10.2307/2669827
Angeles, G., Guilkey, D., & Mroz, T. (2005). The determinants of fertility in rural Peru: Program effects in the early years of the national family planning program. Journal of Population Economics, 18, 367–389. doi:10.1007/s00148-005-0226-5
Angeles, G., Guilkey, D., & Mroz, T. (2005). The effects of education and family planning programs on fertility in Indonesia. Economic Development and Cultural Change, 54(1), 165–201. doi:10.1086/431261
Bang, S. (1971). KOREA: The relationship between IUD retention and check-up visits. Studies in Family Planning, 2, 110–112. doi:10.2307/1965146
Baum, C. F., Schaffer, M. E., & Stillman, S. (2007). Enhanced routines for instrumental variables/generalized method of moments estimation and testing. Stata Journal, 7, 465–506.
Bauman, K. E. (1997). The effectiveness of family planning programs evaluated with true experimental designs. American Journal of Public Health, 87, 666–669. doi:10.2105/AJPH.87.4.666
Bauman, K. E., Viadro, C. I., & Tsui, A. O. (1994). Use of true experimental designs for family planning program evaluation: Merits, problems and solutions. International Family Planning Perspectives, 20, 108–113. doi:10.2307/2133513
Binka, F. N., Nazzar, A., & Phillips, J. F. (1995). The Navrongo community health and family planning project. Studies in Family Planning, 26, 121–139. doi:10.2307/2137832
Bongaarts, J. (1994). The impact of population policies: Comment. Population and Development Review, 20, 616–620. doi:10.2307/2137604
Buttenheim, A. (2006). Microfinance Programs and Contraceptive Use: Evidence from Indonesia (Working Paper CCPR-020-06). Los Angeles, CA: California Center for Population Research.
Chan, K. C. (1971). Hong Kong: Report of the IUD reassurance project. Studies in Family Planning, 2, 225–233. doi:10.2307/1965120
Deaton, A. (2010). Instruments, randomization, and learning about development. Journal of Economic Literature, 48, 424–455. doi:10.1257/jel.48.2.424
Debpuur, C., Phillips, J. F., Jackson, E. F., Nazzar, A., Ngom, P., & Binka, F. N. (2002). The impact of the Navrongo project on contraceptive knowledge and use, reproductive preferences, and fertility. Studies in Family Planning, 33, 141–164. doi:10.1111/j.1728-4465.2002.00141.x
Duflo, E., Glennerster, R., & Kremer, M. (2008). Using randomization in development economics research: A toolkit. In Schultz, T. P., & Strauss, J. (Eds.), Handbook of development economics (pp. 3895–3962). Amsterdam, The Netherlands: Elsevier.
Family Health International. (2007). Linking Access to Credit and Family Planning Services in Ethiopia. Final Report. Prepared for the David and Lucile Packard Foundation Population Program in Ethiopia.
Foster, A., & Roy, N. (1997). The dynamics of education and fertility: Evidence from a family planning experiment (Economics Department Working Paper). Philadelphia, PA: University of Pennsylvania.
Freedman, R. (1997). Do family planning programs affect fertility preferences? A literature review. Studies in Family Planning, 28, 1–13. doi:10.2307/2137966
Freedman, D. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40, 180–193. doi:10.1016/j.aam.2006.12.003
Freedman, R., & Takeshita, J. Y. (1969). Family planning in Taiwan: An experiment in social change. Princeton, NJ: Princeton University Press.
Gertler, P., & Molyneaux, J. (1994). How economic development and family planning programs combined to reduce Indonesian fertility. Demography, 31, 33–63. doi:10.2307/2061907
Hashemi, S. M., Schuler, S. R., & Riley, A. P. (1996). Rural credit programs and women’s empowerment in Bangladesh. World Development, 24, 635–653. doi:10.1016/0305-750X(95)00159-A
Hayashi, F. (2000). Econometrics. Princeton, NJ: Princeton University Press.
Heckman, J., LaLonde, R., & Smith, J. (1999). The economics and econometrics of active labor market programs. In Ashenfelter, O., & Card, D. (Eds.), Handbook of labor economics, Vol. 3A. Amsterdam, The Netherlands: Elsevier Science.
Joshi, S., & Schultz, T. P. (2007). Family planning as an investment in development: Evaluation of a program’s consequences in Matlab, Bangladesh (Center Discussion Paper No. 951). New Haven, CT: Economic Growth Center, Yale University.
Katz, K., West, C., Doumbia, F., & Kané, F. (1998). Increasing access to family planning services in rural Mali through community-based distribution. International Family Planning Perspectives, 24, 104–110. doi:10.2307/3038206
Kleibergen, F., & Paap, R. (2006). Generalized reduced rank tests using the singular value decomposition. Journal of Econometrics, 127, 97–126. doi:10.1016/j.jeconom.2005.02.011
Luck, M., Jarju, E., Nell, M. D., & George, M. O. (2000). Mobilizing demand for contraception in rural Gambia. Studies in Family Planning, 31, 325–335. doi:10.1111/j.1728-4465.2000.00325.x
Trends in demographic and reproductive health indicators in Ethiopia. (2007). Calverton, MD: Macro International Inc.
Mayoux, L. (1999). Questioning virtuous spirals: Microfinance and women’s empowerment in Africa. Journal of International Development, 11, 957–984. doi:10.1002/(SICI)1099-1328(199911/12)11:7<957::AID-JID623>3.0.CO;2-#
Miller, G. (2010). Contraception as development? New evidence from family planning in Colombia. The Economic Journal, 120, 709–736. doi:10.1111/j.1468-0297.2009.02306.x
Omu, A. E., Weir, S. S., Janowitz, B., Covington, D. L., Lamptey, P. R., & Burton, N. N. (1989). The effect of counseling on sterilization acceptance by high-parity women in Nigeria. International Family Planning Perspectives, 15, 66–71. doi:10.2307/2133484
Phillips, J., Bawah, A., & Binka, F. (2006). Accelerating reproductive and child health programme impact with community-based services: The Navrongo experiment in Ghana. Bulletin of the World Health Organization, 84, 949–955. doi:10.2471/BLT.06.030064
Phillips, J., Greene, W., & Jackson, E. (1999). Lessons from community-based distribution of family planning in Africa (Policy Research Division Working Paper 121). New York: The Population Council.
Pitt, M., Khandker, S., McKernan, S.-M., & Abdul Latif, M. (1999). Credit programs for the poor and reproductive behavior in low-income countries: Are the reported causal relationships the result of heterogeneity bias? Demography, 36, 1–21. doi:10.2307/2648131
Pitt, M., Rosenzweig, M., & Gibbons, D. (1993). The determinants and consequences of the placement of government programs in Indonesia. World Bank Economic Review, 7, 319–348. doi:10.1093/wber/7.3.319
Pritchett, L. (1994). Desired fertility and the impact of population policies. Population and Development Review, 20, 1–55. doi:10.2307/2137629
Rosenfield, A. G., & Limcharoen, C. (1972). Auxiliary midwife prescription of oral contraceptives: An experimental project in Thailand. American Journal of Obstetrics and Gynecology, 114, 942–949.
Schuler, S. R., & Hashemi, S. M. (1994). Credit programs, women’s empowerment, and contraceptive use in rural Bangladesh. Studies in Family Planning, 25, 65–76. doi:10.2307/2138085
Schuler, S. R., Hashemi, S. M., & Riley, A. P. (1997). The influence of women’s changing roles and status in Bangladesh’s fertility transition: Evidence from a study of credit programs and contraceptive use. World Development, 25, 563–575. doi:10.1016/S0305-750X(96)00119-2
Schultz, T. P. (1997). Demand for children in low income countries. In Rosenzweig, M. R., & Stark, O. (Eds.), Handbook of population and family economics. Amsterdam, The Netherlands: Elsevier Science.
Schultz, T. P. (2005). Fertility and income (Center Discussion Paper No. 925). New Haven, CT: Economic Growth Center, Yale University.
Sinha, N. (2005). Fertility, child work, and schooling consequences of family planning programs: Evidence from an experiment in rural Bangladesh. Economic Development and Cultural Change, 54, 97–128. doi:10.1086/431259
Steele, F., Amin, S., & Naved, R. T. (2001). Savings/credit group formation and change in contraception. Demography, 38, 267–282. doi:10.1353/dem.2001.0021
Stock, J., Wright, J., & Yogo, M. (2002). A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics, 20, 518–529. doi:10.1198/073500102288618658
Stock, J., & Yogo, M. (2002). Testing for weak instruments in linear IV regression (NBER Technical Working Paper 284). Cambridge, MA: National Bureau of Economic Research.
Thomas, D., & Maluccio, J. (2001). Fertility, contraceptive choice, and public policy in Zimbabwe. The World Bank Economic Review, 10, 189–222.
Yang, J. M., Bang, S., Kim, M. H., & Lee, M. G. (1965). Fertility and family planning in rural Korea. Population Studies, 18, 237–250. doi:10.2307/2173286