The desire for male children is prevalent in India, where son preference has been shown to affect fertility behavior and intrahousehold allocation of resources. Economic theory predicts less gender discrimination in wealthier households, but demographers and sociologists have argued that wealth can exacerbate bias in the Indian context. I argue that these apparently conflicting theories can be reconciled and simultaneously tested if one considers that they are based on two different notions of wealth: one related to resource constraints (absolute wealth), and the other to notions of local status (relative wealth). Using cross-sectional data from the 1998–1999 and 2005–2006 National Family and Health Surveys, I construct measures of absolute and relative wealth by using principal components analysis. A series of statistical models of son preference is estimated by using multilevel methods. Results consistently show that higher absolute wealth is strongly associated with lower son preference, and the effect is 20%–40% stronger when the household’s community-specific wealth score is included in the regression. Coefficients on relative wealth are positive and significant although lower in magnitude. Results are robust to using different samples, alternative groupings of households in local areas, different estimation methods, and alternative dependent variables.
Promoting gender equality and empowering women is one of the eight Millennium Development Goals that 191 U.N. countries pledged to meet by 2015. Gender equality is seen as both a development goal and a necessary condition for sustainable development. Economists tend to think of economic development and gender equity as reinforcing: reduced gender discrimination facilitates income growth, and higher incomes move societies toward less gender bias. Many have convincingly argued and empirically shown the negative impact of gender discrimination on economic development (e.g., Klasen 1999; Scanlan 2004; Sen 1984; Sweetman 2002; World Bank 2001), but the other side of the causality—that is, the role of income growth in altering gender biases—has received little attention. Recent evidence indicates that gender bias—measured by using juvenile sex ratios and mortality—is generalizing and intensifying in India (Agnihotri 2003; Basu 1999, 2000; Murthi et al. 1995; Premi 2001; Rajan et al. 2000). Taking son preference as a barometer of girl neglect in India and using data from the first wave of the National Family and Health Survey (NFHS-1), a recent publication of the International Center for Research on Women concluded that “wealth and economic development do not reduce son preference” (Pande and Malhotra 2006). In light of these facts, it is important to reconsider the linkages between wealth and anti-female biases.
Different theories drawn from the economics, sociology, demography, and anthropology literatures provide conflicting predictions about the impact of increased wealth on gender bias in India. This article does not add to these theories but, rather, argues and empirically shows that they rely on different notions of wealth: theories predicting less bias consider absolute notions of wealth based on resource constraints; theories predicting more bias consider relative notions of wealth based on local status. Using a single measure of wealth in multivariate analyses of gender bias could capture either or both effects. If the effects are indeed opposite, omitting relative wealth would lead one to underestimate the effect of absolute wealth.
To test this hypothesis, I estimate a series of nested multivariate models of son preference by using cross-sectional data from the 1998–1999 and 2005–2006 NFHS. The dependent variable, son preference, is measured using survey responses on “ideal” family composition. Two objections can be raised about this choice of variable to capture gender bias: (1) answers are biased by rationalization—for example, a mother may declare preferring more sons because she has more sons; and (2) son preference does not measure actual manifestations of gender bias.1 The first issue of rationalization can be treated by using appropriate controls (as in Bhat and Zavier 2003 and Pande and Astone 2007). The second objection is valid in the sense that stated son preference will never be perfectly correlated with manifestation of bias in any specific area; son-preferring households may still treat their sons and daughters equally, and girl-preferring households could be biased against girls when it comes to education and health. However, measures of revealed bias in one area are also problematic when generalizing to overall bias; for example, finding gender bias in education does not imply bias in health decisions. Stated son preference, on the other hand, can serve as a good barometer of trends in gender bias in all areas: girls in son-preferring households, all other things being equal, are at greater risk of discrimination than girls in neutral or girl-preferring households. Bhat and Zavier (2003), hereafter referred to as B&Z, showed that the summary statistics on stated son preference from NFHS-1 and NFHS-2 are consistent with sex ratios by state from the 1991 Indian Census.
Using dependent variables based on the same NFHS questions, neither B&Z nor Pande and Astone (2007), hereafter referred to as P&A, found strong support for a negative relationship between individual wealth and son preference. P&A constructed a measure of wealth by using principal components analysis (PCA) on a wide array of assets, as is done here for absolute wealth; they found no significant positive or negative effect of wealth on son preference in NFHS-1. B&Z used NFHS-1 and NFHS-2 data and a coarser measure of standards of living; they found “[ . . . ] weak support for the possibility that the preference for sons will decline with the increase in wealth and rising wage employment of women” (Bhat and Zavier 2003:649). Here, I construct two different measures of wealth by using PCA: the absolute wealth score is obtained using the full India sample, while relative wealth is calculated using a sample of households living nearby. Both variables are included in a multivariate model of son preference inspired by B&Z. I use multilevel statistical methods to take account of the hierarchical structure of the model and complex survey design. Consistent with economic theory, I find that higher standards of living at the household and macroeconomic levels are associated, all things being equal, with lower son preference. Incorporating relative wealth into the model reinforces the negative relationship between absolute wealth and son preference, increasing elasticities by 20%–40% in magnitude, depending on the estimation method and sample. Higher relative wealth is significantly associated with higher son preference, although the effect applies primarily to landed households. To test whether results generalize to gender bias beyond son preference, a similar model is estimated by using stated educational preferences for boys and girls as the dependent variable. The same pattern is revealed, although the role of landownership is no longer prevalent.
Results are checked by using alternative estimation methods with alternative definitions of son preference as well as an alternative delineation of local areas for relative wealth calculations. Details of these analyses are given in supplements available online. In all cases, qualitative results support the main hypothesis.
There is no consensus across disciplines about the impact of increased wealth on gender bias in India. Economists generally predict a decrease in gender bias with increased wealth, at least after a certain level of development, whereas sociologists and anthropologists predict an increase in gender bias with increased wealth, based on features of Indian society. In this section, I review the arguments on both sides, separating those based on changes in resource constraints from those related to one’s economic position relative to others.
Theories Based on Absolute Notions of Wealth
Economic models of resource allocation within the family (including fertility choices) generally consider some type of maximization of expected returns from having and raising male/female children, given a budget constraint. Increases in wealth are represented by outward shifts of the budget constraint. The shifts represent increases in the quantity of resources available relative to a previous state, not relative to other households. The budget constraint may shift because of pure income effects if family resources increase while relative expected returns from boys and girls do not change. Alternatively, the budget constraint may shift from both income and substitution effects—for example, when expected net benefits from sons increase less rapidly than expected net benefits from daughters, an evolution that would be expected with economic development if wage gaps narrow. Larger budget sets may reduce bias through several direct or indirect interrelated channels: reduced competition for resources within the household, a narrowing of real gender-based wage gaps, and reduced fertility. Theories range from intrahousehold allocation decisions motivated by utility maximization and/or bargaining to representative-agent macroeconomic growth models.
Rosenzweig and Schultz (1982) hypothesized that households choose investment expenditures on boys and girls to maximize expected utility. Anticipations of higher returns motivate a larger share of expenditures on male children. If having a child is itself an investment, this could also explain son preference. Because the inequality is generated through budget constraints, relaxation of the constraint should reduce bias. Galor and Weil (1996) modeled the relationship between gender gaps, fertility, and growth in an overlapping-generations model. As the stock of capital grows, women’s relative wages increase (they assumed that capital is more complementary to women’s labor than to men’s), so the opportunity cost of having children increases, thus lowering fertility. Lower fertility in turn increases the amount of capital per worker, resulting in higher relative wages for female workers. Their argument implies that women participate in the labor force, or at least that the increase in the wage of women relative to men provides greater incentives to do so. Such participation in the labor force creates positive feedbacks toward wage equality. The model does not address son preference, but it is clear that the difference in expected lifetime earnings between boys and girls would decrease as the wage gap narrows, and this should in turn contribute to reducing son preference. Similarly, higher expected earnings would motivate greater educational investments in girls (Behrman et al. 1999; Kingdon 2005). Zhang et al. (1999) proposed an endogenous growth model in which a parent’s expectation of old-age support perpetuates culturally and religiously induced son preference because greater support is expected from children who can accumulate greater human capital. Son preference is an argument of the parent’s utility function as well as a result of differential expected returns.
If old-age support comes only from sons, bias continues with growth; however, if both children can accumulate human capital and girls can provide old-age support, gender bias decreases with growth. Taking the argument further and assuming growth promotes better functioning capital markets, old-age support would become less of a concern in wealthier households who would have less reason to prefer sons.
The relationship between wealth and gender bias may become negative only after a certain level of wealth is reached. Kanbur and Haddad (1994) introduced the possibility of such Kuznets’ effects (Kuznets 1955) between wealth and intrahousehold inequality by using a bargaining framework. They did not consider resources spent on children but the division of additional resources (pure income effects) within the household. Theoretical solutions to the bargaining model result in U-shaped relationships when there are increasing returns in production and differences in productivity between the two household members—in particular, when the members’ outside options do not change proportionately with joint production. If intrahousehold equality is associated with less gender bias, the theory would predict an eventual leveling of expenditures on boys and girls and lower son preference as household resources continue to grow. At the aggregate level, we may also find Kuznets’ effects if income effects first dominate substitution effects in the labor supply decision of women (Goldin 1995). When income effects dominate, higher incomes reduce female labor force participation, thus reducing women’s contribution to household wealth. Considering some evidence that gender bias at the household level is negatively related to the contribution of the female member to household resources (Agnihotri 2000), such reduction contributes to son preference. As economic activity expands, however, opportunities for women in the labor market become more lucrative. Substitution effects start working toward increased female labor force participation, creating positive feedbacks toward gender equality as in Galor and Weil (1996).
Theories Based on Relative Notions of Wealth
This section requires some background specific to the Indian context. There are three interrelated dimensions to socioeconomic status in India: economic, political, and ritual rankings (Srinivas 1962). Srinivas described “high” castes, or castes that have achieved a high ritual status, as “Sanskritized.” In Sanskritized castes, women are subordinate to men in religious rituals, do not work outside the home (and in some cases, do not leave the home), and, in order to preserve their social status, they need to marry and have male children and male grandchildren. They cannot initiate a divorce, and they cannot remarry as widows. Parents are responsible for finding a suitable husband for their daughter, generally within the caste; finding a husband of higher status than one’s own—a practice called hypergamy—is looked upon positively. Not being successful in arranging a marriage for one’s daughter, however, reflects badly on the unmarried daughter and her family. Parents of the bride pay all marriage costs and give some dowry and/or some substantial cash payment to the family of the groom. After marriage, the daughter is completely committed to the family of her husband and can no longer provide emotional, physical, or economic support to her own family, so parents can count exclusively on male children as support for old age (Arnold et al. 1998; Dyson and Moore 1983). The more Sanskritized a caste and the higher one’s status in the caste hierarchy (status being determined on all three aforementioned axes), the more costly it is to have a daughter instead of a son and the more barriers there are for women to integrate into the productive economy. Modern dowry (Srinivas 1989) or groomprices (Billig 1992) have become more and more common and have spread to all areas of India.2
Within this context, several (interrelated) channels have been identified linking rising gender biases to rising income. A wide body of research has pointed to the persistence of son preference combined with increased availability of sex-determination techniques and voluntary pregnancy terminations as largely responsible for the increasing number of “missing women” (e.g., Agnihotri 2003; Basu 1999; Das Gupta 2005; Sen 2003). It is a fact that wealth increases the ability to use such techniques, but the desire to do so is a phenomenon of son-preferring households only. If son preference is shown to decrease with wealth, fewer families would practice sex selection, and sex ratios would eventually improve.
The next arguments relate to the prevalence of son preference itself. Most compelling is the “prosperity effect” argument, also often loosely called “Sanskritization.”3 A simplified version of the argument is that individuals in families or castes of lower ritual status who have the economic means to do so feel inclined to emulate the behavior of higher castes (Agnihotri 2000; Basu 1999, 2000; Berreman 1993; Srinivas 1962, 1989). Following from earlier, although ritual status (related to caste), economic status (related to wealth), and political status are different concepts that need not be related, the three axes of the hierarchy are very much intertwined. Srinivas (1962) argued that the highest possible place in the social hierarchy includes a high ranking on all three axes. A family with high ritual status will likely seek a strong economic position to reinforce its social status, and a family or a group of families that have achieved relative economic prosperity will likely seek to improve their ritual status. The “prosperity effect” implies that gender bias/son preference is likely a feature of richer households, regardless of caste.
Another possible channel through which wealth can increase son preference is related to rising marriage costs (groomprices, dowries, and wedding ceremonies). Billig (1992) noted that groomprices have become common practice in the South and that the amounts of money involved have risen significantly (see also Basu 1999). Anderson (2003) showed theoretically that economic development can explain increases in modern dowries in India through the effect of wealth dispersion in a caste-based society. Higher groomprices and dowry values can cause greater son preference (or daughter aversion) because they increase the expected costs of having a daughter and the expected benefits of having a son. If dowries and groomprices as a proportion of income are greater for wealthier households, the argument can explain why richer households would be more likely to prefer sons. Given the hypergamous tradition, however, one must also take into account that the expected gains from marriage may depend on one’s opportunity to gain status via marriage. If wedding ceremonies are themselves used to signal social status (as argued in Bloch et al. 2004), higher-ranked families will be expected to give grand weddings just to maintain their status, while families of lower social status will incur the expense in order to gain full social benefits when their daughter marries into a family of higher status. Given that it is looked upon positively to marry upward for a girl but not for a boy, a daughter can become more valuable than a boy in terms of gaining status as long as there are wealthier families to marry into:
Families seeking social advancement compete among themselves to amass a dowry large enough to secure a place for their daughter in an elite household. This brings a prestigious alliance for parents along with prospects of well-endowed grandsons. . . . At the top of the hierarchy, however, hypergamy dooms daughters. There is no family for them to marry into. (Blaffer Hrdy 1999:325–26)
The problem faced by higher-ranked families in this context can contribute to greater son preference, while aspiration for social advancement via marriage in lower-ranked families may explain some preference for daughters if they can afford such marriage—the caveat being related to notions of absolute wealth.4
Arguments related to Sanskritization, rising marriage costs, and hypergamy all suggest that the positive link between wealth and gender bias relates more to relative notions of socioeconomic status within the community than to standards of living.
Absolute Versus Relative Wealth
Although there is often confusion between the two concepts, the case for distinguishing between absolute and relative wealth in economics has been made before. Frank (1985) saw the value of local status as a major driver of economic decisions. He stressed, in particular, the role of imitative behavior:
Indeed, the most sensible strategy might involve focusing on the behavior of those positioned above ourselves on whatever index of ‘success’ we find most important. After all, if what they are doing seems to work for them, then why not for us? (Frank 1985:19)
In relation to health access and outcomes, Deaton (2003) highlighted the importance of distinguishing between income and other related dimensions of well-being, such as education, wealth, control, or rank. He argued that all these variables have a potential effect on health, independently of income or even income dispersion; control and rank, in particular, are dimensions directly linked to relative wealth.
If these theories are supported by the data, the direction of the effect from wealth to son preference must depend on the reference base on which wealth indices are constructed. If households in the sample are not connected by a notion of community and if wealth is measured relative to the whole sample, one would expect less son preference in richer households. If, however, the country is highly segmented by community and wealth is measured relative to households in the same community, one would expect more son preference in richer households. Because measures of absolute and relative wealth are necessarily positively correlated, the estimated effect of a single wealth variable on son preference will include both effects. Estimating the effect of absolute levels of wealth on son preference independently of the confounding effect of relative wealth is the focus of the empirical analysis that follows.
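The direction of the bias from relying on a single wealth measure can be made explicit with the standard omitted-variable formula. The following is only a sketch under simplifying assumptions that are not the article's estimated model: a linear relationship, no other covariates, an error term uncorrelated with both wealth measures, and coefficient symbols $\beta_A$, $\beta_R$, and $\hat{b}_W$ introduced here purely for illustration.

```latex
\begin{align}
SP_i &= \alpha + \beta_A W_i + \beta_R WR_i + \varepsilon_i,
      \qquad \beta_A < 0,\ \beta_R > 0, \\
\operatorname{plim}\,\hat{b}_W &= \beta_A
      + \beta_R \,\frac{\operatorname{Cov}(W_i, WR_i)}{\operatorname{Var}(W_i)} .
\end{align}
```

Here $\hat{b}_W$ is the coefficient from regressing $SP$ on $W$ alone. With a positive covariance between the two wealth measures and $\beta_R > 0$, the second term is positive, so the single-variable coefficient is pulled toward zero relative to $\beta_A$. This is the sense in which omitting relative wealth would understate the negative effect of absolute wealth.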
Empirical Model: Data and Methods
The degree of son preference (SP) of an individual (i) at one point in time is assumed to be a function of various characteristics of the individual, her household, and the macroeconomic context in which she lives. Individual-level demographic and socioeconomic characteristics are borrowed from B&Z. At the household level, two measures of wealth are considered: absolute wealth (W) measures the household’s standard of living, and relative wealth (WR) measures the economic standing of the household in its community. Landownership, generally considered a prime element of socioeconomic status in rural areas, is also considered independently of its contribution to W and WR. The macroeconomic level is included to account for the effect of economic development in terms of goods and services available to all.
Individual- and household-level variables are constructed from the 1998–1999 and 2005–2006 National Family and Health Surveys of India (NFHS-2 and NFHS-3). NFHS is the Indian version of the Demographic and Health Surveys (DHS); it has been widely used for academic research and by international organizations since its first wave (1992–1993).5 The NFHS data are organized as individuals within households within primary sampling units (PSUs) within states. PSUs are villages (in rural areas) or census blocks (in urban areas) selected with probability proportional to size to form the rural and urban samples in each state. Households are selected randomly within each PSU. The full sample of households is used to calculate principal component (PC) absolute and relative wealth scores. The household data and associated wealth scores are then merged with the individual-level data. All ever-married women between the ages of 15 and 49 who were present in the selected household at the time of the visit were interviewed. After visitors and a few observations with missing data or unusable information on key variables are excluded, the samples consist of 77,886 women in NFHS-2 and 83,785 in NFHS-3. Overall, dropped observations on independent variables are evenly distributed in the data in terms of levels of son preference, education, and income (Online Resource 1, section 1).
Dependent Variable: Son Preference
Both NFHS-2 and NFHS-3 questionnaires included questions on ideal family size. In particular, women with living children were asked, “If you could go back to the time when you did not have any children and could choose exactly the number of children to have in your whole life, how many would that be?” Women with no living children were asked, “If you could choose exactly the number of children to have in your whole life, how many would that be?” Women who answered the first question with a numerical answer (as opposed to, for example, “more (or fewer) than my brother”) were then asked, “How many of these children would you like to be boys, how many would you like to be girls, and for how many would the sex not matter?”6 SP is calculated by using the difference between ideal number of boys and ideal number of girls divided by the ideal family size.7 The ratio was chosen as a measure of son preference because it provides an intuitive ordering and can capture the full range of preferences, including preference for girls (counterbalancing some “pure” son preferences that are not related to a societal notion of gender bias). The proportion of sons in ideal family size chosen by B&Z for their ordinary least squares (OLS) model was not retained because it eliminates variation on the girl preference side and, more important, dampens variation on the son preference side.8 Figure 1 gives the distribution of the dependent variable SP; it ranges between −1 and +1 (by construction), with a large number of observations clustered at 0 (zero), indicating no preference for either boys or girls. Means are 0.13, 0.09, and 0.11 for the NFHS-2, NFHS-3, and pooled samples, respectively, with standard deviations around 0.27. The numerator—the difference between ideal number of boys and ideal number of girls—lies mostly between −1 and +2 (99% of the data). The denominator—ideal family size—averages between 2 and 3.
Very few women gave answers of 0 (1%) or more than 10 (less than 0.02%); they were excluded without creating detectable selection problems (Online Resource 1, section 1). Dividing by the ideal number of total children avoids giving more weight to large ideal family sizes, which would be especially problematic here because desired total number of children and wealth level are strongly negatively correlated (ρ = −.31 for NFHS-2 and −.27 for NFHS-3).
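The construction of SP described above can be sketched in a few lines of Python. This is an illustration rather than the article's code: the function name and argument names are hypothetical, and it assumes that ideal family size equals the sum of desired boys, desired girls, and children for whom the sex does not matter.

```python
from typing import Optional


def son_preference(ideal_boys: int, ideal_girls: int, ideal_either: int) -> Optional[float]:
    """Return SP = (ideal boys - ideal girls) / ideal family size.

    Assumption (not stated in this form in the text): ideal family size is
    the sum of the three answers. Returns None for responses the text
    excludes (ideal family size of 0 or more than 10).
    """
    family_size = ideal_boys + ideal_girls + ideal_either
    if family_size == 0 or family_size > 10:
        return None
    return (ideal_boys - ideal_girls) / family_size


# Hypothetical responses: (boys, girls, sex does not matter)
responses = [(2, 1, 0), (1, 1, 0), (0, 0, 2), (1, 2, 0), (3, 0, 0)]
for boys, girls, either in responses:
    print((boys, girls, either), son_preference(boys, girls, either))
```

Dividing by family size keeps the measure within [−1, +1], so a woman who wants three sons and no daughters scores the same as one who wants one son and no daughters, which is the weighting property the text argues for.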
Table 1 provides descriptions of explanatory variables and summary statistics. Variables are separated into three groups depending on the level at which they are measured: the macroeconomic level (state), the household level, and the individual level. I focus here on describing the methodology used to measure variables that distinguish this article from the existing literature.
Economic Development (Macroeconomic Level)
A single variable is used to approximate the level of development in each state. State-level gross domestic products for 1999 and 2005 are converted to constant rupees (thousands) and divided by state populations, using the 1991 and 2001 census to obtain the real gross per capita state domestic product (GSP/c).9 I anticipate that overall economic development would decrease son preference, or possibly increase son preference in the first stages of development and eventually decrease it.
These normalized scores provide an ordinal measure of wealth levels that can be used as an index of wealth. The measure is fundamentally relative by construction; but, the reference base being all of India, it can be used to approximate the variation in absolute wealth within the country at a given point in time. Note, however, that household wealth scores cannot be compared across the two surveys because PCA is carried out by using each sample separately (asset variables differ between surveys). Thus, the pooled analysis treats the absolute wealth scores of the two samples as two different variables: W99 and W06.
The relative wealth variable (WR) is constructed by using PCA on subsamples of households living nearby. The area of influence one would ideally choose depends on the size of the population and specific customs, but in general, the area must be large enough to constitute a community of people but not so large that it cannot be differentiated from measures of absolute wealth (the larger the area considered, the greater the correlation with absolute wealth and the less meaningful the distinction). The only geographical identifiers available in both NFHS-2 and NFHS-3 that can place households as “neighbors” at the community level are the PSUs.11 There are more than 3,000 PSUs in each sample, with a median size of 30 households per PSU (as shown in Table 2). PCA is run PSU by PSU, using the same list of assets as for the full sample (if no one in the PSU owns a specific asset, the asset is dropped). Because PSU samples are small, tests are run to verify the behavior of PCA in small samples; they indicate that the procedure is relatively stable down to sample sizes of 10 (Online Resource 2). Scores obtained from the PSU-level analysis are normalized in a manner similar to Eq. 1 by using PSU-specific minimum and maximum scores. Consequently, wealth scores are completely independent between PSUs but dependent within them. The overall distribution of WR includes a large number of zeros and ones by construction (normalization is done PSU by PSU); in between, it is close to uniformly distributed with slightly more mass below 0.4. More interesting statistically is the distribution of the variable within each PSU; Fig. 2 summarizes this distribution by PSU size. Each point on the graph is obtained by taking the average of the PSU-specific WR means and standard deviations among all PSUs of equal size.
Standard deviations stabilize around sample sizes of 15; because close to 99% of PSUs are larger, samples with fewer than 15 households can be dropped without biasing results (Online Resource 1, section 1).
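The PSU-by-PSU procedure can be illustrated with a small, self-contained Python sketch. It is not the article's implementation: the asset list and households are made up, the first principal component is obtained by power iteration on the correlation matrix, min-max scaling is assumed as the normalization (consistent with the zeros and ones noted above), every zero-variance asset is dropped (which also covers assets that everyone in the PSU owns), and the sign of the component is fixed by a simple heuristic.

```python
import math
from collections import defaultdict


def first_pc_scores(rows):
    """Score each row on the first principal component of standardized columns."""
    n = len(rows)
    kept = []  # standardized columns with nonzero variance
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        if var > 0:  # drop assets with no variation in this sample
            sd = math.sqrt(var)
            kept.append([(x - mean) / sd for x in col])
    if not kept:
        return [0.0] * n
    k = len(kept)
    corr = [[sum(a * b for a, b in zip(kept[i], kept[j])) / n for j in range(k)]
            for i in range(k)]
    vec = [1.0 + 0.1 * j for j in range(k)]  # arbitrary starting vector
    for _ in range(500):  # power iteration toward the leading eigenvector
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            break
        vec = [x / norm for x in new]
    if sum(vec) < 0:  # heuristic sign: owning more assets should raise the score
        vec = [-x for x in vec]
    return [sum(vec[j] * kept[j][i] for j in range(k)) for i in range(n)]


def min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def relative_wealth(households):
    """households: list of (psu_id, asset indicators); returns WR in input order."""
    by_psu = defaultdict(list)
    for idx, (psu, assets) in enumerate(households):
        by_psu[psu].append((idx, assets))
    wr = [0.0] * len(households)
    for members in by_psu.values():
        scores = min_max(first_pc_scores([assets for _, assets in members]))
        for (idx, _), score in zip(members, scores):
            wr[idx] = score
    return wr


# Hypothetical assets: [tv, bicycle, fridge, car]
households = [
    ("A", [0, 0, 0, 0]), ("A", [0, 1, 0, 0]), ("A", [1, 1, 0, 0]),
    ("B", [1, 1, 0, 0]), ("B", [1, 1, 1, 0]), ("B", [1, 1, 1, 1]),
]
wr = relative_wealth(households)
absolute = min_max(first_pc_scores([assets for _, assets in households]))
for (psu, assets), w, r in zip(households, absolute, wr):
    print(psu, assets, "W =", round(w, 3), "WR =", round(r, 3))
```

In this toy sample, the household owning a TV and a bicycle has the same absolute score wherever it lives, but it ranks at the top of the poorer PSU "A" and at the bottom of the richer PSU "B". That is the kind of divergence between the two measures that the distinction relies on.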
Absolute and relative wealth scores are obviously positively correlated, but a well-to-do individual by national standards may score lower in relative wealth than a much poorer individual living in a different area if the poorer individual is relatively richer than her neighbors (cf. Jaffe et al. 2004).12 The correlation coefficients between the all-India principal components household wealth scores and the PSU-level scores are approximately .75 in both samples. Categorical variables were constructed to create PSU-level wealth quintiles based both on the number of households and on deviations from the PSU means. Regressions using these categorical variables did not change the qualitative results or add important information relative to using the scores.
Ownership of agricultural land and the size of land holdings (cultivated land) enter in the principal components calculation of both the absolute and relative measures of wealth described earlier, along with other asset variables. There are two reasons to give them additional consideration, both related to relative wealth considerations. First, landownership, beyond its contribution to wealth, has traditionally been an important consideration when families in the same community evaluate one another’s worth; for example, in the hypergamous tradition, it is well regarded to marry a daughter into a family with lots of land. Second, ownership of agricultural land (regardless of size) may identify a group of households more likely to behave according to the theories outlined earlier that argue a positive link between wealth and son preference. Indeed, higher-caste families traditionally owned much of the land in agricultural village economies (Chakraborty and Kim 2008), and lower-caste landowners have been identified as the group most likely to sanskritize (Staal 1963). To account for the first effect, I include a continuous variable that records the number of acres of cultivated land that a household owns. Among households with agricultural land, the correlation between acres of land and absolute wealth scores is around .14 in both samples; correlations with relative wealth scores are .17 in NFHS-2 and .14 in NFHS-3. The relationship between son preference and land holdings is likely very different in rural and urban areas, so I allow the slopes to differ. To account for the second effect, I allow the impact of relative wealth to differ between landed and nonlanded households.
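One way to read the two land-related adjustments is as interaction terms in the household-level part of a linear specification. The expression below is a schematic sketch only. The coefficient symbols, the indicator names $Landed_h$, $Rural_h$, and $Urban_h$, and the omission of all other covariates are assumptions made here for illustration and are not the article's estimated equation.

```latex
\begin{equation}
SP_{ih} = \dots + \beta_1 W_h + \beta_2 WR_h
        + \beta_3 \,(WR_h \times Landed_h)
        + \gamma_1 \,(Acres_h \times Rural_h)
        + \gamma_2 \,(Acres_h \times Urban_h) + \dots
\end{equation}
```

Under this reading, the effect of relative wealth is $\beta_2$ for nonlanded households and $\beta_2 + \beta_3$ for landed households, while $\gamma_1$ and $\gamma_2$ give separate rural and urban slopes for land holdings.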
Other Determinants of Son Preference, Control Variables, and Fixed Effects
Socioeconomic and demographic variables measured at the individual level closely follow those in B&Z, and I refer the reader to their explanations (Bhat and Zavier 2003).13 Modifications were made regarding education variables and age; they are explained in Online Resource 1, section 3.
Variables from B&Z related to family demographics are included to account for rationalization effects (possible endogeneity issues are addressed in B&Z). Another variable from B&Z identifies observations with odd-numbered ideal family sizes; it is used because women who reported an odd number as their ideal family size generally preferred one more son rather than one more daughter (true in both the NFHS-2 and the NFHS-3 samples). The inclusion of this variable does make a significant difference in the estimation results, although qualitative results on the effects of absolute and relative wealth are robust to the change.
Because state-level random effects are used in the multilevel regression (explained later), I include regional fixed effects to account for differences in cultural norms that are known to vary closely with regions. In particular, the North, home of the Rajputs, traditionally has stronger son preferences, whereas the South’s cultural traditions have been more favorable to women. I follow the Indian Census to group states into regions. Finally, as in B&Z, I include a dichotomous variable for urban residence. In the pooled regression, another binary variable captures additional differences between the two samples; in addition to time, it can account for differences in general surveying procedures.
Although the distribution of the dependent variable would favor a discrete probability model, the linear model was chosen for several reasons: (1) it preserves more variance in terms of degrees of son preference; (2) it allows inferences about differential causal effects from estimated coefficients when residual variation varies across groups, which is not the case in logit or probit models (Allison 1999); (3) as will be shown later and as found in B&Z, OLS results compare favorably to logit and ordered logit models; and (4) the linear model facilitates the use of multilevel methods.14 Compared with feasible alternatives, the linear multilevel procedure proved to be the most conservative in hypothesis testing (Online Resource 4).
In the pooled regressions, state-level effects are taken to be random and different for NFHS-2 and NFHS-3. The choice was made for two reasons. First, B&Z found that coefficients and significance levels on state dummy variables were different between NFHS-2 and NFHS-3 without following any logical pattern; they concluded that the effects were likely measuring differences in data quality across states. Including a random effect at the state/survey level should account for differences in surveying procedures, while unobserved regional characteristics can be captured by regional fixed effects. The second reason is that the state GDP per capita figures do not always change the same way, state by state, between 1999 and 2005 because a few state delineations were changed in 2000 (Online Resource 1, section 2).
Using the same mixed-effects linear estimation methods, several models are estimated, each one nested into the other in the sense that new variables are added to the X(1) matrix (household-level variables), while other variables are unchanged. All models are estimated for the NFHS-2, NFHS-3, and pooled samples. The base model, MW1, is the closest to B&Z (aside, notably, from the estimation procedure and the inclusion of the macroeconomic context). The model is first used as a benchmark to evaluate whether there is evidence of U-shaped relationships in aggregate and household wealth; it is then compared with models with variables related to dimensions of economic status. The Akaike information criterion (AIC), calculated as −2 × ln(L) + 2 × k (where L is the maximized value of the likelihood function, and k is the number of parameters), is used to compare the different models; the lower the AIC, the better the fit for a given sample.
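As a minimal sketch of the comparison just described, the AIC formula can be written out directly. The function name, model labels, log-likelihood values, and parameter counts below are hypothetical placeholders chosen for illustration; they are not estimates from this article.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: -2 * ln(L) + 2 * k."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical (log-likelihood, parameter count) pairs, for illustration only;
# these are NOT values from the article.
candidates = {
    "base": (-1250.0, 30),
    "base_plus_one_term": (-1249.8, 31),
    "base_plus_two_terms": (-1249.7, 32),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # lowest AIC = preferred model
```

In this made-up example the added terms barely raise the log-likelihood, so the penalty of 2 per parameter dominates and the smaller model keeps the lowest AIC.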
Before choosing MW1 as a benchmark to evaluate models with relative wealth variables, I estimate three alternative models to test for the presence of U-shaped relationships for aggregate and individual wealth in NFHS-2, NFHS-3, and the pooled data. All models are identical to MW1 except that they include squared terms for state wealth (MW21), household wealth (MW12), or both (MW22). Coefficients on squared terms are insignificantly different from zero in all models, with p values greater than .25.19 MW1 has the lowest AIC value and MW22 has the highest in all samples (the same rankings are obtained using Bayesian information criterion). I conclude that given the nature of the data examined (in particular, the fact that I am using cross-sectional data), there is no evidence of U-shaped relationships between son preference and absolute wealth at the microeconomic or macroeconomic levels.
MWR1 adds landownership (acres of cultivated land) as a separate variable, with slopes allowed to differ between rural and urban households. I start with this specification because landownership has been used in previous models of gender bias focused on rural areas (in addition to wealth and for reasons related to social status). MWR2 adds the new relative wealth variable. Finally, MWR3 allows the relative wealth effect to differ depending on whether the household owns land. All models include absolute wealth at the macroeconomic level. Full estimation results are presented in Table 3 for the NFHS-2 data and Table 4 for the pooled data. NFHS-3 results are not reported here to save space (see table S10 in Online Resource 4, section 1), but differences are highlighted below. For ease of presentation and interpretation, p values (instead of standard errors) are given beneath coefficients; p values below .001 are omitted. Likelihood ratio tests, which are conservative in this setting, indicate that the multilevel regression is appropriate (compared with a regression without group random effects). All estimated variance components for the different levels are virtually the same across models (but not across samples) and significantly different from zero at the strictest confidence level, although their respective contributions to the overall variance are very small compared with the individual-specific residual variance. All demographic control variables and regional fixed effects are the same across models and are consistent with B&Z’s findings except for ideal number of children and its square (discussed in Online Resource 4, section 2). When the dependent variable is a proportion of ideal family size, however, the interpretation of the effect of ideal total and its square is very different and less obvious than in discrete probability models.
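The likelihood ratio comparison of a multilevel regression against a regression without group random effects can be sketched for the simplest case of one added variance component. The log-likelihoods below are hypothetical, and the one-degree-of-freedom chi-square reference distribution is an assumption of this sketch rather than the article's procedure. Because a variance cannot be negative, that reference distribution tends to overstate the p value, which is the sense in which such a test is conservative.

```python
import math


def lr_statistic(loglik_restricted: float, loglik_full: float) -> float:
    """Likelihood ratio statistic: 2 * (ln L_full - ln L_restricted)."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log-likelihoods (NOT from the article): a single-level model
# versus the same model with one group-level random effect added.
stat = lr_statistic(loglik_restricted=-1260.0, loglik_full=-1250.0)
p_value = chi2_sf_df1(stat)
```

With several variance components, as in the models estimated here, the reference distribution has more degrees of freedom; the sketch covers only the one-component case.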
Coefficient estimates on socioeconomic individual-level variables change slightly across models, but signs and significance remain consistent with the literature. I concentrate therefore on commenting on results related to the main goal of the article: disentangling relative and absolute wealth effects.
First, looking at the statewide wealth level, I find that higher per capita state GDP reduces the prevalence of son preference at the strictest confidence level in the NFHS-2 and pooled samples. Using the 1998–1999 data and evaluating elasticities at mean values, I find that a 1% increase in state GDP per capita (measured in thousand rupees) decreases SP by roughly 0.25%. The relationship is less strong economically in the pooled data with elasticities of –0.14 to –0.15, but the result is still statistically significant at the strictest confidence level. There are two explanations for the lower magnitude of the pooled effect. First, the relationship is weaker in NFHS-3, with elasticities between –0.16 and –0.18, and p values between .06 and .09. Second, the estimated coefficient in the pooled sample depends on both time and cross-sectional variance; variation between states in a given sample is expected to be larger than variation across the six years considered. Indeed, the cross-sectional effect should capture a long-run effect in the sense that it measures the effect of deeper, more structural changes; effects measured using changes across short periods of time (here, six years) are expected to be smaller.
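A minimal sketch of how an elasticity evaluated at mean values follows from a linear coefficient: the coefficient is scaled by the ratio of the regressor's mean to the dependent variable's mean. The function name and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def elasticity_at_means(coef: float, mean_x: float, mean_y: float) -> float:
    """Elasticity of y with respect to x in a linear model, evaluated at the means.

    For y = a + coef * x, the elasticity (dy/dx) * (x / y) at the sample
    means is coef * mean_x / mean_y.
    """
    return coef * mean_x / mean_y


# Hypothetical values (NOT from the article): a coefficient on state GDP per
# capita, its sample mean, and the sample mean of the son-preference measure.
e = elasticity_at_means(coef=-0.005, mean_x=10.0, mean_y=0.2)
# e is about -0.25: a 1% rise in x goes with a fall of about 0.25% in y.
```

Because the elasticity depends on the sample means as well as the coefficient, the same coefficient can translate into different elasticities in different samples.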
The specific hypothesis of this analysis relates to the direction of the effect of the household’s standard of living (W99 and W06) and its relative economic standing within the community (WR). Results on absolute wealth in all models and across samples are consistent with economic theory: wealthier households are less likely to express preference for boys. Coefficients vary between −0.045 and −0.066, translating into elasticities of −0.12 to −0.30 (using mean values). The largest elasticities are in models MWR2 and MWR3 with NFHS-3. Estimated coefficients on W (W99 and W06) are stable across all samples, with p values below .001. Controlling for local status increases the strength of the relationship, mostly because of the addition of the relative wealth variable: including landownership alone (size of cultivated land) increases the absolute value of the coefficient on W by less than 6% (the largest impact is in NFHS-2); this is very small compared with the increase in the magnitude of the effect after controlling for relative wealth in MWR2 (32%–43%).
The most compelling result, which is the least intuitive to economists but finds much support in the sociology and anthropology literatures on gender bias in India, is the role of variables related to relative economic status. A higher standing in the local community, whether related to higher relative wealth or landownership, is associated with a higher preference for sons. Looking at landownership alone does not suffice to capture the effect of relative economic status. Landownership variables are significant and positive in the NFHS-2 and the pooled sample when looking at rural areas only; they become largely statistically insignificant both for urban and rural areas when using NFHS-3 data. When the effect is statistically significant, its magnitude is quite small, with elasticities not exceeding 0.009, indicating that a large percentage difference in the size of land holdings is necessary to create a noticeable change in son-preferring attitudes. The relationship between son preference and relative economic status is, however, very robust when considering the principal components measure of relative wealth that defines relative economic status along a wide array of assets (including land). The coefficients on WR remain stable around 0.014 ±0.002 in all samples, with p values lower than .001 in the pooled sample and NFHS-3, and below .02 for NFHS-2. The relative wealth effect, with elasticities between 0.04 and 0.07, is about four times weaker than the household absolute wealth effect; but including WR in the regression significantly enhances the magnitude of the absolute wealth effect in terms of son-preference reduction.
Although the size of land holdings does not have a clearly important role in son-preferring attitudes, the impact of relative wealth appears to be a feature of landed households: in MWR3, coefficients on relative wealth for nonlanded households lose significance, while the significance level is reinforced for landed households.
Finally, the pooled regression includes a fixed effect for the NFHS-3 survey. The effect clearly shows lower levels of son preference overall in the 2005–2006 sample (independently of differences captured in the model).
Results on absolute and relative wealth effects are robust across all samples. An alternative specification of local areas using the NFHS-2 geographical identifiers is also tested (the full procedure is described in Online Resource 3); the significance of the relative wealth variable is slightly reduced, but the variable remains significant at the same level, and estimation results are virtually identical to those in Table 3. The qualitative linear results are also validated by comparisons of single-level OLS results with logit and ordered logit estimations on dependent variables as defined in B&Z and P&A, and by taking account of the complex survey design. Table 5 presents results of the three alternative estimations for the pooled sample. Changes in the absolute wealth impact with or without relative wealth were also tested with the logit model, both the single-level model and a random-effect logit model with state fixed effects. Adding the size of cultivated land decreases the odds ratio of absolute wealth by less than 5%, while adding WR decreases it by about 25%. Details of the estimations and additional results are given in Online Resource 4.
As mentioned in the introduction of this article, although variation in son preference should be a good approximation of overall changes in attitudes toward women, the dependent variable does not measure gender bias in terms of girls’ access to resources. In addition, explanatory variables on family demographics could be endogenous in a context where sex selection is increasingly practiced. To test the absolute/relative wealth dichotomy further, I use an alternative dependent variable (EduBias) based on questions on desired level of education for boys and girls. Details on the variable, summary statistics, and complete results are given in Online Resource 4, section 5. Although it is still a stated preference variable, it is expected to be positively correlated with son preference and gender bias in general, while mostly free of possible endogeneity problems. Unfortunately, the information was no longer available in NFHS-3. The same empirical analysis is run with EduBias as the dependent variable, removing demographic controls specifically related to son preference. To facilitate comparison with the son-preference model, Table 6 reports results on wealth-related variables in terms of elasticities. The effects of absolute wealth, both at the macroeconomic and household levels, are significant and negative, with magnitudes 50% higher than for son preference. Relative wealth elasticities are also higher than in the son preference model. Inclusion of the relative wealth variable in the estimation also increases the magnitude of the absolute wealth impact, although by a lower percentage than in the son-preference model (13%). One important difference between the two models concerns the role of landownership. Size of land holdings, outside of its contribution to W and WR, is found to have no impact on stated educational bias (see table S5 in Online Resource 4, section 5). 
The link between relative wealth and gender bias is still positive and significant but is no longer a feature of landed households only. The result is not surprising if one believes that issues of marriage are what distinguish the behavior of a landed household and if educating a girl facilitates her placement in a good family.
Contradictory predictions from different disciplines about son preference in India as households become richer can be reconciled if one accepts that these predictions rely on different notions of wealth. Empirical evidence demonstrates that the link between son preference and absolute wealth is clearly negative and significantly stronger when one controls for the effect of relative wealth. The observed negative influence of “prosperity” on gender equality appears to work through notions of local economic status: that is, it is the higher relative wealth position of a household in the local community that generates gender bias.
Including both absolute and relative wealth in the empirical analysis is not only necessary to avoid biased results, but it also allows us to think differently about the types of development strategies that are most likely to mitigate son preference and gender bias in general. In addition to the general goal of increasing standards of living and specific targeting of women in education and health, development strategies that decrease the importance of local status are expected to be instrumental in reducing son preference. Such policies could include, for example, the development of transportation infrastructure and the promotion of its use, delocalization of information (e.g., with the use of modern communication technology), and access to education. Their potential effect on labor mobility and geographical reach of marriage searches is most likely to minimize the role of local status-seeking behavior and to reinforce the positive effect of increased standards of living in son-preference reduction.
To better understand the links between son preference and economic development, it is important to estimate how much stated preference for sons effectively translates into de facto daughter neglect. Tarozzi and Mahajan (2007) showed that nutritional status improved more for boys than for girls in India and that the trend was mostly marked in son-preferring rural areas of northern states. Further research is needed to understand the strength of the relationship and whether it generalizes to other domains. Indeed, the main channels linking gender bias to slower economic development are not directly related to son preference or unbalanced sex ratios; rather, they are related to the treatment of “surviving” girls in terms of health care and access to education.
Initial work for this article was done while the author was a visiting research associate with the University of Maryland, AREC, and a research affiliate with the World Bank, Washington DC. In addition to her host institutions, the author thanks Gautam Datta and Jayati Datta-Mitra for substantial comments. Acknowledgments also go to David Bishai, Barbara Craig, Hirschel Kasper, Kala Krishna, Imran Lalani, Yana Rogers, and Abdo Yazbeck. This revised version owes a great deal to comments of three anonymous referees. All analyses and remaining errors are the author’s responsibility.
A related issue is that “pure” son preference—that is, wanting a son based on personal taste rather than a response to structural motivating factors—will be confounded in the measure. Statistically, however, such preference should cancel out with girl preference, on average.
A groomprice is, according to Billig (1992), any payment in kind or cash that can be used by the husband’s family as it pleases (whereas a dowry remains the property of the daughter). Srinivas called it “modern dowry.”
The term “Sanskritization,” as it was introduced by Srinivas (1962), refers to a phenomenon deeply rooted in the caste system and more specifically related to group behavior; “prosperity effect” is used more generally to apply to individual households.
The relevance of this argument depends on conditions of the marriage market. The less geographically segmented the market and the less costly it is to establish contact with families further away, the less important narrowly defined relative wealth should be. Further information on this point is provided online (Online Resource 5).
Poorer women with no education were disproportionately represented in nonnumerical answers, but the number of cases was small enough to justify ignoring the bias (Online Resource 1, section 1).
In robustness analyses, alternative definitions of the dependent variable are used: a binary variable defined as in B&Z for logit models, and a three-level ordered categorical variable defined as in P&A for the ordered logit alternative (Online Resource 4, section 3).
Consider three responses: (A) 3 boys and 3 of either sex; (B) 3 boys, 2 girls, and 1 of either sex; and (C) 3 boys and 3 girls. In terms of son preference, most people would rank A above B, and B above C. Using B&Z’s measure, all three responses get a value of 1/2; with the measure chosen here A scores 1/2, B scores 1/6, and C scores 0.
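The three responses above can be checked with a short sketch. Both formulas are assumptions inferred from the scores quoted in this note, not definitions taken from the text: the B&Z-style measure is taken to be ideal boys over ideal family size, and the measure used here is taken to be ideal boys minus ideal girls over ideal family size. How responses with more girls than boys are treated is not stated in this note, so the sketch does not address that case.

```python
from fractions import Fraction


def share_of_boys(boys: int, girls: int, either: int) -> Fraction:
    """Assumed B&Z-style measure: ideal boys as a share of ideal family size."""
    return Fraction(boys, boys + girls + either)


def net_son_preference(boys: int, girls: int, either: int) -> Fraction:
    """Assumed measure used here: (boys - girls) / ideal family size."""
    return Fraction(boys - girls, boys + girls + either)


# (boys, girls, either sex) for responses A, B, and C in the note.
responses = {"A": (3, 0, 3), "B": (3, 2, 1), "C": (3, 3, 0)}

for label, counts in responses.items():
    print(label, share_of_boys(*counts), net_son_preference(*counts))
```

Under these assumed formulas the first measure cannot separate the three responses, while the second ranks them A, B, C, as the note describes.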
Details on the construction of GDP/c are given in Online Resource 1, section 2.
The procedure is the same as in Gwatkin et al. (2000; see also Gaudin and Yazbeck 2006a; and Pande and Yazbeck 2003). For a list of asset variables and results on asset weights and scores obtained with NFHS-2 data, see Gaudin and Yazbeck (2006b).
Online Resource 3 discusses the construction of an alternative grouping of households based on geographical identifiers available in NFHS-2.
Jaffe et al. (2004) also distinguished between absolute and relative measures of socioeconomic status in their analysis of mortality in Israel.
Multilevel ordered logit is theoretically feasible but proved to be too computationally intensive with such a large data set and a four-level structure.
The multilevel modeling literature is divided in terms of numbering levels from the top (as here) and from the bottom. Models that consider base-level observations as level one would call this model a four-level hierarchical model rather than three-level.
The notation is adopted to facilitate the intuition behind the estimation procedure. For a clear exposition of hierarchical linear multilevel analysis and advantages of its use in policy analysis, see Leyland and Groenewegen (2003); for a more rigorous treatment see, for example, Rabe-Hesketh and Skrondal (2006).
The number of households is large, and 75%–80% of households contain a single observation. Although it complicated the procedure, the household-level residual was kept because its variance was consistently found significantly different from zero and higher in magnitude than variances for upper levels.
Maximum likelihood is easier to implement for unbalanced panels and allows comparisons between models with different fixed portions. Restricted maximum likelihood did not yield noticeable differences in coefficient estimates and standard errors.
Detailed results are not reported for these models (other than MW1) because coefficient estimates on all variables other than wealth and squared wealth did not change.