Using the nationally representative India Human Development Survey (IHDS), we create a unique son–father matched data set that is representative of the entire adult male population (aged 20–65) in India. We use these data to document the evolution of intergenerational transmission of educational attainment in India over time, among different castes and states for the birth cohorts of 1940–1985. We find that educational persistence, as measured by the regression coefficient of father’s education as a predictor of son’s education, has declined over time. This implies that increases in average educational attainment are driven primarily by increases among children of less-educated fathers. However, we do not find such a declining trend in the correlation between educational attainment of sons and fathers, which is another commonly used measure of persistence. To understand the source of such a discrepancy between the two measures of educational persistence, we decompose the intergenerational correlation and find that although persistence has declined at the lower end of the fathers’ educational distribution, it has increased at the top end of that distribution.
The principle of equality of opportunity finds support from most policy-makers and the general public alike.1 Intergenerational persistence in economic status is an important mechanism in perpetuating inequality of opportunities in a society. For instance, such persistence may differ across groups of people in a society who are typically identified by race, gender, and region, implying differential access to opportunities for different groups. Hence, the extent to which economic status is transmitted from one generation to the next has long been of interest to social scientists and policy-makers.2
India serves as an excellent case study for intergenerational mobility for two reasons. First, Indian society has historically been characterized by a high degree of social stratification governed by the caste system, which results in exclusion of certain groups from certain economic and social spheres. At the time of independence, the Indian Constitution identified the disadvantaged castes and tribes in a separate schedule of the constitution as Scheduled Castes and Scheduled Tribes (SC/STs), and extended affirmative action protection to these groups in the form of reserved seats in higher educational institutions, in public sector jobs, and in state legislatures as well as the Indian parliament. The Indian Constitution also described “socially and educationally backward classes,” and the Government of India is enjoined to ensure their social and educational development. In contrast to SC/STs, who were enumerated under a separate schedule, the Constitution did not identify those backward classes. The Government of India uses the collective term Other Backward Castes (OBCs) for these classes and has classified a number of castes as OBCs over time through various commissions. The Government of India provided reserved positions for OBCs in public sector jobs in 1993. Although the caste system has been weakened as a result of various policy measures taken by the government, social identity still remains an important dimension of social exclusion in India.3 For example, recent work by Munshi and Rosenzweig (2006) has shown how caste-based labor markets have trapped individuals in narrow occupational categories for generations, and persist even as of this writing. Hence, gauging how such inertia in economic mobility has changed over time is of interest. Second, although India has experienced rapid economic growth in recent decades, this economic growth has been far from uniformly distributed across state boundaries in India. Chaudhuri and Ravallion (2006) documented that between 1978 and 2004, among the 16 major states, Bihar (including the newly created state of Jharkhand) had the lowest growth rate of 2.2 %; by contrast, Karnataka had the highest, at 7.2 %. Such large statewise variation in growth rates implies increasing regional disparities in India. Asadullah and Yalonetzky (2012) found a significant variation in educational attainment at the state level.
Furthermore, until the mid-1970s, education policy was under the purview of state governments, which in principle could generate significant variation in education policy across states. To achieve the national objective of Universal Elementary Education, in 1976, the 42nd amendment to the Indian constitution placed education on the concurrent list.4 The main implication of this amendment is that the central government can directly implement any education policy decision in the states. One possible consequence of this change is increased uniformity in education policies across state boundaries, which in turn should reduce variation in educational opportunities across states. Hence, it can be argued that different cohorts may have differential access to educational opportunities across generations, depending on their state of residence.
Although, as argued earlier, understanding the extent of intergenerational mobility is especially important for India, this issue has received relatively little attention, mainly because of the lack of suitable data. In the absence of long panel or administrative data, existing studies have used cross-sectional data and have relied on coresidence to identify son–father pairs (Emran and Shilpi 2015; Hnatkovska et al. 2013; Jalan and Murgai 2008; Maitra and Sharma 2009). This leads to a significant loss of observations and, more importantly, raises serious sample selection issues given that the distribution of education of both generations is different in the subsample of adult males who coreside with their fathers versus the total sample of all adult males.5 Furthermore, a sample based on coresidence does not allow cohort-wise analysis or the study of longer-term trends in intergenerational transmission.
In this article, using the nationally representative 2005 India Human Development Survey (IHDS), we relax the data restriction by creating a unique son–father matched data set that is not limited to coresident households. Our data contain all the adult males surveyed in the IHDS; the data thus include not only adult males whose fathers resided in the same household at the time of the survey but also those whose fathers lived elsewhere or were deceased. We measure economic status by educational attainment and examine the transmission of educational attainment across generations. We group adult males in year-of-birth cohorts and study the intergenerational mobility of these cohorts over time.6 To shed light on caste and geographical dimensions, we investigate the cohort trend in mobility by major castes and by major states.
This article contributes to the existing literature in four ways. First, we track changes in educational persistence across birth cohorts going back to as early as the 1940s using two widely used measures of persistence: a correlation coefficient (ρ) and a regression coefficient (β). We provide persistence estimates for successive birth cohorts at the aggregate level as well as for the major social groups (castes). We further decompose the correlation coefficient to address the issue of diverging trends in the two measures of persistence. Second, because our sample is comparable with the data sets used by Hertz et al. (2007) to rank 42 countries, we are able to rank India in terms of intergenerational educational persistence among other nations, following the Hertz et al. (2007) methodology closely. To the best of our knowledge, there exists no comparable estimate of intergenerational educational persistence that can be used to rank India among other countries. Third, we relax an important data restriction for intergenerational mobility studies in India by creating a unique son–father matched data set that is not limited to coresident households. We are able to match about 97 % of males aged 20–65 in a nationally representative household survey with their fathers’ information. As a result, unlike other studies on India, the estimates presented in this article do not suffer from sample selection bias caused by limiting the analysis to adults coresiding with their parents. Fourth, the regional dimension of intergenerational educational mobility in India remains largely unexplored. In this article, we attempt to fill this void in the literature and provide state-level estimates of intergenerational educational persistence by birth cohorts.
Our study reveals four key findings. First, among the birth cohorts of 1940–1985, we see a pronounced declining trend in the estimated intergenerational educational persistence, as measured by the regression coefficient of father’s schooling as a predictor of son’s schooling, thus implying greater mobility for more-recent cohorts in India. A one-year difference in father’s education corresponds to a smaller difference, on average, in the expected value of son’s education in the recent cohorts compared with older birth cohorts. This implies that increases in average educational attainment are driven primarily by increases among children of less-educated parents. This declining cohort trend exists at the aggregate level, for major castes, and for major states. However, as the variance of fathers’ years of schooling has also increased relative to the variance of sons’ years of schooling, the correlation between son’s and father’s schooling—another commonly used measure of persistence—shows no such declining cohort trend. Further decomposing the correlation coefficient, we find that persistence at the lower end of the fathers’ educational distribution has declined; however, at the top end of the fathers’ educational distribution, the persistence has increased, resulting in an overall steady trend in the correlation coefficient.
Second, the average intergenerational correlation in educational attainment in India is .52, which is higher than the global average of .42 reported by Hertz et al. (2007). Educational mobility in India is better than in Latin American countries. This conclusion remains similar if we measure persistence using the regression coefficient.
Third, we find that the education distributions of both fathers and sons are different in the sample of adult males who reside with their fathers versus the sample of all adult males. For instance, in the coresident sample, the average level of education is higher for both generations, but the relative dispersion of sons’ education to that of fathers is lower. As a result, the estimated education persistence based on the regression coefficient is significantly lower for the coresident sample when compared with the sample of all surveyed adult males.
Fourth, at the state level, we observe significant variation in the estimated intergenerational educational persistence, although this variation is smaller for more-recent cohorts. In addition, educational persistence has declined in all states based on the regression coefficient; however, we find mixed results based on the correlation coefficient.
The issue of intergenerational mobility in income, education, and occupation has been extensively explored in the literature. Black and Devereux (2011) presented a recent survey of the evidence and methodological problems of the research available for developed economies. Hertz et al. (2007) studied trends in intergenerational transmission of education for a sample of 42 countries,7 documenting large regional differences in educational persistence, with Latin America displaying the highest intergenerational correlations, and the Nordic countries having the lowest. They estimated the global average correlation between parent and child schooling to be approximately .42 for the past 50 years. They also found a 30-point reduction in the estimated mean regression coefficient over 60 years, from 0.80 in 1920 to 0.50 in 1980. However, they found no such trend in the estimated intergenerational correlation coefficients. Daude (2011) presented educational mobility estimates for 18 Latin American countries and found a relatively low degree of intergenerational social mobility in Latin America. With respect to the persistence measure based on the regression coefficient, he found a statistically significant and steep decline in the intergenerational transmission coefficient for both women and men with respect to their parents’ education. However, he did not find a significant change in educational persistence based on the correlation coefficient. Shavit and Blossfeld (1993) analyzed data on 13 industrialized nations and found that the effect of father’s education on son’s educational attainment declined in 7 of the 13 economies.
The issue of intergenerational mobility in India has only recently started attracting attention.8 Jalan and Murgai (2008) investigated educational mobility in the age group 15–19 using 1992–1993 and 1998–1999 National Family Health Survey (NFHS) data. They found that educational mobility, measured as 1 – β (where β is the regression coefficient), for the age group 15–19 increased significantly between 1992–1993 and 1998–1999, and that education gaps between SCs/STs/OBCs and others are not that large after other attributes are controlled for. An important limitation of their analysis is that respondents in the NFHS are not directly asked about the education of their parents. Hence, parental outcomes are known only for child–parent pairs that are still living in the same household. As a result, they focused on children aged 15–19 years, who were more likely to be living with their parents.
Maitra and Sharma (2009) used the IHDS-2005 and explored the effect of parental education (both father and mother) on years of schooling of children, identifying children–parent pairs if they resided in the same household. Thus, they provided only a point-in-time estimate of persistence (measured by the regression coefficient (β)) based on a sample constructed through coresidence: they did not investigate the evolution of intergenerational persistence in India.
Finally, Hnatkovska et al. (2013) used five rounds of the National Sample Survey (NSS), covering the period 1983–2005, to analyze intergenerational persistence in occupational choices, educational attainment, and wages. They estimated intergenerational elasticity based on synthetic parent–child pairs, wherein all household heads are combined into a group called “parents” and children/grandchildren are combined into a group called “children.” Specifically, they focused on households with an adult head of household coresiding with at least one adult of a lower generation (child and/or grandchild), both being in the age group 16–65. They also removed individuals who were enrolled in school at the time of a particular NSS survey round from their analysis. They found that the period 1983–2005 was characterized by a significant convergence of education, occupation distribution, wages, and consumption levels of Scheduled Castes/Scheduled Tribes toward non–Scheduled Castes/Scheduled Tribes levels.
We use data from the India Human Development Survey 2005 (IHDS), a nationally representative survey of households jointly organized by the National Council of Applied Economic Research (NCAER) and the University of Maryland. The IHDS covers 41,554 households in 1,503 villages and 971 urban neighborhoods located throughout India (Desai et al. 2009).9 The survey was conducted between November 2004 and October 2005 and collected a wealth of information on education, caste membership, health, employment, marriage, fertility, and geographical location of the household. There are two distinct advantages of using the IHDS data for an intergenerational education mobility study over the larger and more commonly used household surveys for India, such as the National Sample Survey (NSS) and National Family Health Survey (NFHS). First, the IHDS contains additional questions not asked in the NSS or NFHS. These questions allow us to identify father’s education for each adult male in the age group 20–65. We provide a detailed discussion of identification of fathers’ information for all adult males in the appendix. Second, the IHDS contains data on actual years of schooling rather than levels of schooling completed, which is generally reported in the NSS data. Having data on years of schooling avoids discontinuities in schooling distribution as a result of the imputation of years of schooling for the categorical variable measuring the level of schooling completed.
In any intergenerational study, the measurement of economic status remains an important issue, and several studies have proxied economic status by labor market characteristics, such as earnings, occupation, and educational attainment. In this article, we focus on intergenerational transmission of educational attainment. Although education is not the only proxy for economic status, there are several advantages in using education instead of earnings to measure intergenerational transmission, especially in a developing country context where the existence of long panel data is rare. First, on the measurement side, education is less prone to serious errors than earnings. In addition, a majority of workers in developing countries are self-employed and generally do not report labor earnings in a typical household or labor force survey. Second, because most individuals complete their education by their early or mid-20s, life cycle biases are less likely to affect estimates based on education than those based on earnings. Finally, a vast literature shows that higher education is associated with higher earnings, better health, and other economic outcomes (see Black and Devereux 2011), rendering a measure of intergenerational transmission based on education a reasonable proxy for mobility in overall economic status. For India, using nationally representative National Sample Surveys for 1983–1984 and 1993–1994, Duraisamy (2002) found that a greater level of educational attainment substantially increases the likelihood of wage employment. Using Oaxaca-Blinder decomposition, Bhaumik and Chakrabarty (2009) found that educational differences between Hindu and Muslim wage earners in India, especially differences in the proportion of wage earners with tertiary education, are largely responsible for the differences in the average (log) earnings of the two religious groups. By contrast, differences in the returns to education do not explain the aforementioned difference in average (log) earnings. Dutta (2006) reported positive returns to education for regular wage workers during 1983–1999.
We focus on the adult men in the age group 20–65.10 Our survey is from 2005, providing data on individuals born between 1940 and 1985. Hence, we can study the persistence in educational attainment across birth cohorts going as far back as 1940. We conduct our analysis at the all-India level, by social groups, and by states. For analyses at the all-India level and social group level, we divide our sample into nine 5-year birth cohorts: 1940–1945, 1946–1950, . . . , 1976–1980, and 1981–1985. At the state level, driven by sample size and space considerations, we concentrate on two 10-year cohorts: 1951–1960 and 1976–1985.11
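The cohort grouping described above can be sketched in a few lines. This is only an illustration: the function name `birth_cohort` and the hyphenated labels are hypothetical, and the sketch assumes that birth year is already known (for example, approximated as 2005 minus reported age, which ignores the fact that fieldwork ran from late 2004 to late 2005).

```python
def birth_cohort(birth_year):
    """Return the birth-cohort label used in the text, or None if out of range.

    The first cohort (1940-1945) spans six birth years, as in the text;
    every later cohort spans five.
    """
    if not 1940 <= birth_year <= 1985:
        return None
    if birth_year <= 1945:
        return "1940-1945"
    # Later cohorts start in 1946, 1951, ..., 1981.
    start = 1946 + 5 * ((birth_year - 1946) // 5)
    return f"{start}-{start + 4}"


# All distinct cohort labels for the birth years covered by the sample.
COHORTS = sorted({birth_cohort(year) for year in range(1940, 1986)})
```

Applied to every birth year from 1940 to 1985, the function yields exactly nine labels, matching the nine cohorts used for the all-India and social-group analyses.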
The main variable of interest is the son’s educational attainment, which is measured as years of schooling. In the literature, parental education is proxied by the father’s education, the maximum of father’s or mother’s education, or the average of both parents’ education. In our analysis, we use father’s years of schooling to proxy for parental education.12 The sample statistics are presented in Tables 1 and 2. In Table 1, we report summary statistics at the all-India level and by social groups; in Table 2, we report these at the state level.
In columns 1 and 2 of Table 1, we present the total sample size and the minimum sample size, which is the size of the smallest five-year birth cohort. One issue in analyzing education data is the inclusion of individuals who have not completed their education in the sample. This right-censoring in the data could reflect delayed completion and/or pursuit of higher education. The main consequence of including individuals who are still in school is that it can potentially bias the estimates of intergenerational persistence downward. To shed light on the incidence of right-censoring in our data, in columns 3 and 4 of Table 1, we report the shares of adults in our sample who are enrolled in school in two age groups. For those aged 20–24, these shares are, on average, 10 %; by contrast, for the age group 25–29 the shares are less than 1 %. Given that the shares enrolled in school are small and that the true value of schooling is most likely to be just a year or two greater than what is observed for the right-censored observations, any potential bias caused by their inclusion should be relatively small.
The last four columns in Table 1 report the average levels of education for both generations for the first and the last five-year birth cohorts. On average, sons have higher levels of education than their fathers. Further, between the two cohorts, educational attainment has increased for both generations. Table 1 also reports the aforementioned statistics for major social groups in India. In our analysis, we consider four social groups: Scheduled Castes and Scheduled Tribes (SC/ST), Other Backward Castes (OBC), Muslims, and Higher Hindu Castes (HHC).13 HHCs are more educated than other groups, and this difference holds for both generations and across cohorts.
In Table 2, columns 1 and 2 report the average educational attainment for the two generations for the two 10-year birth cohorts: 1951–1960 and 1976–1985. Columns 5 and 6 contain the sample sizes for the two cohorts in each state. In all states, sons are more educated than fathers, and educational attainment has increased for both generations. However, significant variation exists across states. For instance, Southern states (such as Kerala and Tamil Nadu) have much higher average educational attainment for both generations when compared with the relatively poorer states of Bihar and Orissa. Between the two cohorts, at least in terms of average educational attainment, there seems to be some convergence across states.
A higher estimate of ρ would indicate that sons’ schooling is heavily influenced by fathers’ schooling, whereas an estimate close to 0 would indicate that sons’ schooling is independent of fathers’ schooling. As argued by Checchi et al. (2008), the main difference between the β coefficient in Eq. (1) and the ρ coefficient in Eq. (3) is that the former, by considering the ratio of variances, takes into account a change of inequality of educational outcomes in the sons’ and fathers’ generations, providing a relative measure of intergenerational mobility. The latter provides an absolute measure of intergenerational transmission: that is, one cleansed of possible evolution of the distribution of educational attainments, for instance, resulting from school reforms that increased the average schooling of the population while reducing its variance. The changes in the relative standard deviations will cause both measures to evolve differently, and international evidence (Hertz et al. 2007) has shown that in several countries, β and ρ behave differently. For this reason, it is a common practice in the literature to report both measures of persistence. In our empirical results, we follow this convention and report both the regression coefficient (β) and the correlation coefficient (ρ) across different cohorts.14
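For reference, the textbook link between the two measures can be written as follows. The notation is an assumption made here for illustration and is not necessarily that of the article's numbered equations: $S^{s}$ and $S^{f}$ denote son's and father's years of schooling, and $\hat{\sigma}_{s}$ and $\hat{\sigma}_{f}$ their sample standard deviations.

```latex
\hat{\beta} = \frac{\widehat{\operatorname{Cov}}(S^{s}, S^{f})}{\hat{\sigma}_{f}^{2}},
\qquad
\hat{\rho} = \frac{\widehat{\operatorname{Cov}}(S^{s}, S^{f})}{\hat{\sigma}_{s}\,\hat{\sigma}_{f}}
\quad\Longrightarrow\quad
\hat{\rho} = \hat{\beta}\,\frac{\hat{\sigma}_{f}}{\hat{\sigma}_{s}}
```

Under this identity, the two measures coincide only when the dispersion of schooling is the same in both generations, and they can trend differently whenever the ratio of standard deviations changes across cohorts.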
We estimate Eqs. (1) and (3) separately for nine 5-year cohorts starting with 1940. Note that the interpretation of β and ρ is descriptive rather than causal. However, assuming that the factors potentially biasing the persistence estimates are time-invariant, the evolution of these estimates can be reliably inferred from this approach (Checchi et al. 2008).
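A minimal sketch of cohort-by-cohort estimation is given below. It is not the authors' procedure: it computes the simple bivariate slope and the Pearson correlation without survey weights, additional controls, or standard errors, and the names `persistence`, `persistence_by_cohort`, and the toy records are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def persistence(pairs):
    """Return (beta, rho) for a list of (father_years, son_years) pairs.

    beta is the OLS slope of son's schooling on father's schooling;
    rho is the Pearson correlation between the two.
    """
    n = len(pairs)
    mean_f = sum(f for f, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    var_f = sum((f - mean_f) ** 2 for f, _ in pairs) / n
    var_s = sum((s - mean_s) ** 2 for _, s in pairs) / n
    cov = sum((f - mean_f) * (s - mean_s) for f, s in pairs) / n
    return cov / var_f, cov / sqrt(var_f * var_s)


def persistence_by_cohort(records):
    """records: iterable of (cohort_label, father_years, son_years) tuples."""
    groups = defaultdict(list)
    for cohort, father_years, son_years in records:
        groups[cohort].append((father_years, son_years))
    return {cohort: persistence(pairs) for cohort, pairs in sorted(groups.items())}


# Made-up records, for illustration only (not IHDS data).
toy_records = [
    ("1940-1945", 0, 0), ("1940-1945", 0, 2), ("1940-1945", 5, 8), ("1940-1945", 10, 10),
    ("1981-1985", 0, 5), ("1981-1985", 5, 8), ("1981-1985", 10, 10), ("1981-1985", 12, 12),
]
estimates_by_cohort = persistence_by_cohort(toy_records)
```

Each entry of `estimates_by_cohort` holds the pair (β, ρ) for one cohort, so the two series can be compared side by side across cohorts.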
In this section, we present our estimation results. We first present our findings at the all-India level for the pooled sample. This is followed by a cohort-level analysis of intergenerational persistence in educational attainment, at the aggregate level, for major castes, and for major states in India.
Intergenerational Educational Persistence in India
Table 3 presents the estimation results for our pooled sample, revealing several findings of interest. First, from column 1, we observe that the estimated intergenerational educational persistence based on the regression coefficient (β) is 0.635. Hence, in our data, father’s education is an economically and statistically significant predictor of the child’s education: a 1-year difference in fathers’ education is associated with a 0.635-year difference in sons’ education.15 As discussed earlier, one advantage of our son–father matched data is that they are not contingent on the father coresiding with the child. To illustrate the consequence of imposing the coresidence condition, we estimate the intergenerational educational persistence using the sample of adult males coresiding with their father. The estimated regression coefficient is 0.525, which is roughly 17 % lower than the estimate based on the full sample. We also conducted a chi-square test for the equality of the estimated regression coefficient from the full sample and that based on the coresidence sample. The test statistic is 225.36, with a p value of .000. Hence, we reject the null hypothesis of the equality of the two estimates at all conventional levels of significance. Our findings are consistent with those of Francesconi and Nicoletti (2006), who showed that using coresidence to identify parent–child pairs leads to a downward bias in intergenerational elasticity estimates in the range of 12 % to 39 %.
As is evident from Table 3, the mean and standard deviation of sons’ years of schooling and fathers’ years of schooling differ greatly between the coresident sample and full sample. We fail to reject the equality of the correlation coefficient between the coresident and full samples because the ratio of the standard deviation of sons’ schooling to that of fathers’ schooling is lower in the coresident sample than in the full sample, which brings the correlation coefficients in the two samples much closer together despite the observed differences in the regression coefficient and in the education distributions.
A Cohort Analysis of Intergenerational Education Mobility in India
Cohort Analysis at the All-India Level
Table 4 presents estimates for both measures of persistence for nine 5-year birth cohorts. As we observe from Table 4, a one-year difference in fathers’ education has been associated with a 0.739-year and a 0.508-year difference in sons’ education for sons born during 1940–1945 and 1981–1985, respectively. We observe a pronounced decline in the educational persistence based on the regression coefficient (β) over the 1940–1985 period: a fall of 23 points in the estimated mean regression coefficient over 40 years.16 However, there is no such trend visible in the standardized measure of intergenerational persistence, ρ. The intergenerational schooling correlation increased marginally from 0.530 for the 1940–1945 birth cohort to 0.535 for the 1981–1985 birth cohort: thus, a difference of 1 standard deviation in parental education is associated with a difference of 0.530 and 0.535 of a standard deviation in children’s education for the birth cohorts 1940–1945 and 1981–1985, respectively.17 These results are in line with the findings of Hertz et al. (2007) that the estimated mean regression coefficient declined globally over 60 years, from 0.80 in 1920 to 0.50 in 1980. They also found that the correlation coefficient held steady during the same period. Daude (2011), who analyzed intergenerational transmission of educational achievements in 18 Latin American economies, reported similar findings. He found that educational persistence, measured by the regression coefficient (β), was 23 % and 33 % smaller (women and men, respectively) in the cohort aged 25–34 compared with the cohort older than 55 years in 2008. However, he found no significant change across generations in the correlation measure of education persistence.
To explore why the two measures of persistence show different trends, we compute the trend in the standard deviations of educational attainment of both generations in our sample (reported in Table 4). The standard deviation of fathers’ schooling increased during 1940–1985. This finding is expected: when nearly all fathers initially have no education and then a minority gain access to schooling, the variance of education will increase. Except for the most-recent cohort, the variance of sons’ schooling is greater than that of fathers’ schooling. This implies that the ratio of the standard deviation of fathers’ years of schooling to that of sons’ years of schooling will be less than 1, causing ρ to be less than β for all cohorts between 1940 and 1980. For the 1981–1985 cohort, the ratio becomes greater than 1, implying a correlation coefficient that is greater than the estimated regression coefficient. These patterns are similar to those reported by Hertz et al. (2007) for other countries.
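As a rough consistency check on this argument, the identity ρ = β × (SD of fathers’ schooling / SD of sons’ schooling) implies that the ratio of standard deviations can be backed out as ρ/β. The sketch below applies this to the estimates quoted in the text for the oldest and youngest cohorts; the function name `implied_sd_ratio` is hypothetical, and the calculation assumes β and ρ come from the same bivariate sample.

```python
def implied_sd_ratio(beta, rho):
    """Ratio of fathers' to sons' schooling SD implied by rho = beta * (sd_f / sd_s)."""
    return rho / beta


# Estimates quoted in the text for the first and last birth cohorts.
quoted = {
    "1940-1945": {"beta": 0.739, "rho": 0.530},
    "1981-1985": {"beta": 0.508, "rho": 0.535},
}

ratios = {cohort: implied_sd_ratio(e["beta"], e["rho"]) for cohort, e in quoted.items()}

for cohort, ratio in ratios.items():
    print(f"{cohort}: implied sd_f / sd_s = {ratio:.3f}")
```

The implied ratio is about 0.72 for the 1940–1945 cohort and about 1.05 for the 1981–1985 cohort, which matches the pattern described above: below 1 for the oldest cohort and above 1 for the most recent one.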
A negative trend in β implies that increases in average educational attainment are driven primarily by increases among children of less-educated parents, which typically happen when access to primary school is expanded.18 To understand the lack of trend in the correlation coefficient, we decompose the correlation coefficient using Eq. (5). Table 5 presents decomposition results grouped by stages of father’s schooling for the 1940–1945, 1961–1965, and 1981–1985 cohorts. Row 31 of Table 5 reports the correlation coefficient ρ, which is the sum of the contributions of each combination of son’s and father’s education. Row 6 shows the total contribution of sons with uneducated fathers to the intergenerational correlation coefficient. This group accounts for a large part of the correlation in each cohort, but its weight declined from about two-thirds to one-third between the 1940–1945 and the 1981–1985 cohorts. This is a natural consequence of the increase in average education over time, starting from a largely uneducated society. However, this decline in correlation at the lower end of fathers’ education distribution is compensated by an increase at the higher end of the distribution. As observed from rows 18, 24, and 30, the contribution of sons of fathers with middle school and higher levels of education has increased steadily across cohorts. This leads to a steady trend in the overall correlation coefficient. The intergenerational transmission of education still remains polarized. Sons of disadvantaged fathers are still likely to be more disadvantaged (rows 1–4).
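The idea of splitting a correlation into additive group contributions can be illustrated with a generic identity: the Pearson correlation is the average of products of standardized deviations, so summing those products within each stage of father’s schooling gives contributions that add up to ρ. The sketch below is an illustration of that identity only and is not necessarily the article's Eq. (5); the stage cutoffs in `schooling_stage` and the toy data are assumptions.

```python
from collections import defaultdict
from math import sqrt


def schooling_stage(years):
    """Hypothetical mapping from years of schooling to a stage (cutoffs are assumed)."""
    if years == 0:
        return "none"
    if years <= 5:
        return "primary"
    if years <= 8:
        return "middle"
    if years <= 10:
        return "secondary"
    return "senior secondary or above"


def correlation_contributions(pairs):
    """Split the Pearson correlation of (father_years, son_years) pairs into
    additive contributions, one per stage of father's schooling."""
    n = len(pairs)
    mean_f = sum(f for f, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    sd_f = sqrt(sum((f - mean_f) ** 2 for f, _ in pairs) / n)
    sd_s = sqrt(sum((s - mean_s) ** 2 for _, s in pairs) / n)
    contributions = defaultdict(float)
    for f, s in pairs:
        # Each observation adds its standardized cross-product to its father's stage.
        contributions[schooling_stage(f)] += (f - mean_f) * (s - mean_s) / (n * sd_f * sd_s)
    return dict(contributions)


# Made-up pairs, for illustration only (not IHDS data).
toy_pairs = [(0, 0), (0, 3), (0, 5), (4, 6), (8, 9), (10, 10), (12, 15)]
parts = correlation_contributions(toy_pairs)
rho = sum(parts.values())  # the contributions add up to the overall correlation
```

Tracking each stage's share of `rho` across cohorts is one way to see how a falling contribution at the bottom of the fathers' distribution can be offset by a rising contribution at the top.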
Checchi et al. (2008, 2013) argued that term B of Eq. (4) is the correct measure for analyzing the transmission of education: a system would achieve equality of opportunity if the probability of obtaining a particular degree for the son was independent of the father’s educational achievement.19 Figure 1 presents the probability of a son achieving different levels of schooling conditional on his father’s education.20 Panel 1 of Fig. 1 plots the probability of a son having less than a primary education conditional on different levels of his father’s education. As expected, with the expansion of primary education, the probability of the son having less than a primary education declines over time. With the universalization of primary education, one should expect the probability of obtaining less than a primary education to approach 0, irrespective of the father’s education level. Although there seems to be a convergence in probabilities of achieving less than a primary education, the probability of achieving less than primary remains quite high for a son whose father has either less than a primary education or a primary education. We also observe no convergence in probability of achieving a primary education (second figure of panel 1, Fig. 1). This implies that family background still plays an important role in access to education in India.
Panel 2 of Fig. 1 presents the probability of a son achieving middle and secondary education conditional on his father having different levels of education. The probability of sons achieving middle education has been increasing, and there seems to be some convergence at the lower end of fathers’ education; however, the probability of a son having middle education is much less if his father holds a secondary degree or more. Similarly, there is convergence in the probability of a son having a secondary degree for fathers who have less than a secondary degree, while the probability of sons achieving a secondary education is marginally lower if the father has a secondary or senior secondary education or above.
Panel 3 of Fig. 1 presents the probability of a son achieving senior secondary education or above. There remains a significant difference in the probability of a son achieving a senior secondary education or above conditional on his father’s education: the probability is larger as his father’s education increases. The probability of achieving a senior secondary education or above for someone born to a father with senior secondary education or above in 1940–1945 is about 0.75 points higher than for someone born to an illiterate father in the same period. Most importantly, the gap in probabilities remains more or less similar even for the 1981–1985 cohort. There is some increase in the probability of achieving senior secondary education or above for sons of low-educated fathers, and a marginal decline in the probability of achieving senior secondary education or above for sons of fathers with a secondary education or more. One may conclude that the opportunity for achieving senior secondary education or above remains unequal, with seemingly no convergence in probabilities.
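The conditional probabilities discussed for Fig. 1 can be computed along the following lines. This is a minimal sketch, not the authors' code: it assumes the five-stage grouping of years of schooling given in the notes (0–4, 5–7, 8–9, 10–11, and 12–15), and the function names, the optional weights argument, and the toy data are all hypothetical.

```python
import numpy as np

STAGES = ["below primary", "primary", "middle", "secondary",
          "senior secondary or above"]
UPPER_BOUNDS = [4, 7, 9, 11]  # inclusive upper bound (years) of the first four stages


def stage_index(years):
    """Map years of schooling to a stage index 0..4."""
    # side="left": 0-4 -> 0, 5-7 -> 1, 8-9 -> 2, 10-11 -> 3, 12+ -> 4
    return np.searchsorted(UPPER_BOUNDS, years, side="left")


def conditional_probabilities(son_years, father_years, weights=None):
    """Estimate P(son stage | father stage); rows are father stages."""
    son = stage_index(np.asarray(son_years))
    father = stage_index(np.asarray(father_years))
    w = np.ones(len(son)) if weights is None else np.asarray(weights, dtype=float)
    counts = np.zeros((len(STAGES), len(STAGES)))
    np.add.at(counts, (father, son), w)  # weighted cross-tabulation
    row_totals = counts.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Father stages with no observations are reported as NaN.
        return np.where(row_totals > 0, counts / row_totals, np.nan)


# Toy data (years of schooling), for illustration only.
son_years = [0, 5, 8, 10, 12, 3, 15, 9]
father_years = [0, 0, 0, 5, 12, 0, 12, 5]
probs = conditional_probabilities(son_years, father_years)
for name, row in zip(STAGES, probs):
    print(f"father {name}: {np.round(row, 2)}")
```

Under equality of opportunity in the sense described above, every row of this matrix would be the same; comparing rows within a cohort, and the same cell across cohorts, is what the panels of the figure do graphically.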
Ranking India Among Other Nations in Terms of Educational Persistence
To rank India among the other nations, we take a simple average of persistence measures across cohorts.21 We find a correlation of .52 for India, which is above the global average of .42 (for 42 countries) reported by Hertz et al. (2007). It is comparable with their estimates of .54 in Western Europe and the United States and .48 in Eastern Europe, higher than their estimate of .39 in Asia, and lower than their estimate of .60 in Latin America. Similarly, we find that persistence in terms of the regression coefficient in India is 0.64, which is higher than Hertz et al.’s (2007) persistence estimates of 0.41 in Eastern Europe and 0.39 in Western Europe and the United States, and lower than their persistence estimates of 0.79 in Latin America, 0.69 in Asia, and 0.80 in Africa. Figure 1 in Online Resource 2 presents the ranking of India in terms of correlation and regression coefficients among other countries.
Overall, educational persistence in India is lower than the average of seven Latin American countries reported in Hertz et al. (2007) based on both measures of persistence. Based on the correlation coefficient, persistence in India is larger than the average of the 10 Asian countries and comparable with the average for Western Europe and the United States reported in Hertz et al. (2007); however, based on the regression coefficient, educational persistence in India is smaller than the average reported for the 10 Asian countries and larger than the average reported for Western Europe and the United States.
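The averaging step can be sketched as follows, under the assumption that persistence is estimated separately for each cohort and the cohort estimates are then averaged with equal weight. The function names, cohort labels, and data are hypothetical, and survey weights are omitted for brevity.

```python
import numpy as np


def persistence(son, father):
    """Return (regression coefficient, correlation) of son's on father's schooling."""
    son = np.asarray(son, dtype=float)
    father = np.asarray(father, dtype=float)
    cov = np.cov(son, father, ddof=0)  # 2x2 covariance matrix
    beta = cov[0, 1] / cov[1, 1]       # slope from regressing son on father
    rho = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return beta, rho


def simple_cohort_average(data_by_cohort):
    """Average the cohort-specific estimates, giving each cohort equal weight."""
    estimates = [persistence(s, f) for s, f in data_by_cohort.values()]
    return tuple(np.mean(estimates, axis=0))


# Toy cohorts of unequal size: (son's years, father's years). Illustrative only.
cohorts = {
    "cohort_1": ([2, 4, 7, 11], [0, 0, 5, 10]),
    "cohort_2": ([5, 8, 6, 10, 12, 14], [0, 5, 5, 10, 12, 12]),
}
avg_beta, avg_rho = simple_cohort_average(cohorts)
print("average beta:", round(float(avg_beta), 3))
print("average rho:", round(float(avg_rho), 3))
```

Because each cohort enters the average once, a large cohort does not dominate the summary measure the way it would in a single pooled regression.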
The regression coefficient (β) measures interpersonal differences in status by the difference in grade, whereas the correlation coefficient (ρ) divides the grade difference by the standard deviation of education in that generation. Therefore, the question of which is a more appropriate measure of status differences is a matter of opinion (Hertz et al. 2007).22 Measuring both the regression coefficient and the correlation allows researchers to assess the contribution that differences in inequality make to the observed patterns of persistence.
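The link between the two measures can be checked numerically. The sketch below uses simulated data (not the IHDS) to illustrate the textbook identity that the correlation equals the regression coefficient multiplied by the ratio of the standard deviation of fathers' schooling to that of sons' schooling; all parameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated years of schooling; the slope of 0.6 and noise level are arbitrary.
father = rng.integers(0, 16, size=5000).astype(float)
son = 4.0 + 0.6 * father + rng.normal(0.0, 3.0, size=5000)

beta = np.polyfit(father, son, 1)[0]       # OLS slope of son on father
rho = np.corrcoef(father, son)[0, 1]       # Pearson correlation
sd_ratio = father.std() / son.std()        # dispersion of fathers relative to sons

print("beta:", round(float(beta), 3))
print("rho:", round(float(rho), 3))
print("beta * sd_father / sd_son:", round(float(beta * sd_ratio), 3))
```

The identity makes clear why the two measures can move differently over time: a falling regression coefficient can coexist with a flat correlation if the dispersion of fathers' schooling rises relative to that of sons.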
Cohort Analysis by Caste
In Table 6, we present the regression coefficients and estimated correlation by caste for each of the five-year birth cohort bands.23 The results are more or less similar to our findings for the all-India level. For HHCs, the persistence based on the regression coefficient has declined for recent cohorts. The persistence based on the correlation coefficient, however, has remained about the same. This is because the ratio of sons’ and fathers’ education dispersion has declined over time: while the dispersion of sons’ education has declined, the dispersion of fathers’ education has increased. Taking account of inequality, one may conclude that persistence has declined among HHCs. The regression and correlation estimates show patterns for OBCs that are similar to those of HHCs: although there is a decline in the regression coefficient for recent cohorts, the correlation coefficient does not show a definite trend. The dispersion in sons’ education has declined recently after increasing in the initial period, and the dispersion in fathers’ education has been increasing. For SC/STs, there is a decline in β for recent cohorts, but ρ remains steady. For Muslims, both measures fluctuate across successive birth cohorts. However, compared with the 1940–1945 birth cohort, the regression coefficient suggests lower persistence in the 1981–1985 birth cohort, whereas the correlation coefficient increased marginally.
Based on the regression coefficient, the transmission from father to son has declined among all social groups; however, the correlation coefficient shows no trend for HHCs, OBCs, and SC/STs. The different evolution of dispersion of sons’ and fathers’ schooling led to different patterns in the regression and correlation coefficients. Taking inequality into account, educational persistence within each group has declined over time. To explore the issue further, we turn our focus to term B of Eq. (4). As argued by Checchi et al. (2013), unlike the regression and correlation coefficients, which are not suitable for intergroup comparisons based on stratification, term B of Eq. (4) can be used to compare groups.24
Figure 2 presents the probability of a son achieving a secondary education or above, conditional on his father’s education (term B of Eq. (4)) for different caste groups.25 Muslims have a lower probability of achieving a secondary education or above for each level of father’s education, whereas HHCs have a significantly higher probability of achieving a secondary education or above than any other group.26 The probability of a son achieving a secondary education or above conditional on father’s education shows convergence across social groups other than HHCs. However, there is no convergence of probabilities between HHCs and others, suggesting not only inequality of opportunities based on caste membership (especially between HHCs and others) in India but also little improvement in such inequality over time.27
Education Mobility in Indian States
For the statewise analysis, we estimate the educational persistence for each state for two birth cohorts: 1951–1960 and 1976–1985. Table 7 presents the results of this exercise. Persistence estimates vary considerably among states for both cohorts; however, the variation among statewise estimates of both the regression coefficient and the correlation coefficient is smaller for the 1976–1985 cohort compared with the 1951–1960 cohort. For the 1951–1960 birth cohort, Madhya Pradesh has the highest persistence of 0.856 in terms of the regression coefficient, while Tamil Nadu and Kerala show the lowest persistence of 0.423 and 0.446, respectively. In terms of correlation, Tamil Nadu also has the lowest persistence, while Jammu and Kashmir has the highest persistence. The ranking of persistence differs based on the measure of persistence; however, the simple correlation between the rankings based on the two estimates is .63. For the 1976–1985 birth cohort, Himachal Pradesh, Maharashtra, and Kerala show the lowest persistence in terms of the regression coefficient, while West Bengal has the highest persistence. In terms of correlation, Maharashtra and Himachal have the lowest persistence, while West Bengal has the highest persistence. The simple correlation between the rankings based on the two estimates is .66 for the 1976–1985 birth cohort.
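The comparison of state rankings under the two measures can be sketched as a correlation between ranks. The example below is an assumption about how such a figure could be computed, not the authors' procedure; the state labels and estimates are hypothetical placeholders rather than values from Table 7, and ties are not handled.

```python
import numpy as np


def rank(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def ranking_correlation(beta_by_state, rho_by_state):
    """Pearson correlation between the two state rankings (Spearman's rho)."""
    states = sorted(beta_by_state)
    beta_ranks = rank(np.array([beta_by_state[s] for s in states]))
    rho_ranks = rank(np.array([rho_by_state[s] for s in states]))
    return np.corrcoef(beta_ranks, rho_ranks)[0, 1]


# Hypothetical estimates for four placeholder states.
beta_by_state = {"state_a": 0.80, "state_b": 0.50, "state_c": 0.60, "state_d": 0.70}
rho_by_state = {"state_a": 0.60, "state_b": 0.40, "state_c": 0.55, "state_d": 0.50}

print("rank correlation:", round(float(ranking_correlation(beta_by_state, rho_by_state)), 2))
```

A value well below 1 indicates that a state's relative position depends on which measure of persistence is used.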
Not only has the variation in statewise persistence declined, but the persistence in all states except Tamil Nadu has declined in terms of the regression coefficient. Tamil Nadu, which has the lowest persistence in the 1951–1960 cohort, shows similar persistence in the 1976–1985 cohort. However, the story is mixed based on the correlation coefficient: only one-half of the states experienced a decline in persistence. Although the variation in sons’ years of schooling declined in a majority of states, the variation in fathers’ years of schooling increased in a majority of states, resulting in different patterns in the regression and correlation coefficients (Table 7).
The results presented here indicate significant variation in education persistence across states. One possible explanation for such variation could be state-level differences in education policy and other institutional factors. For example, Chetty et al. (2014), who correlated regional variation in their mobility measures for the United States with local area characteristics, suggested that regional variation in public funding of education may be associated with the observed regional variation in intergenerational mobility in income. Similarly, Behrman et al. (2001), who covered 16 Latin American countries, demonstrated that greater government expenditures on primary schooling (per student of primary age) and higher average levels of education of the teachers both work to reduce their measure of educational persistence. In the working paper version of our article (Azam and Bhatt 2012), we documented a strong negative association between cumulative state-level per capita expenditure on primary education and education persistence in the state as measured by the regression coefficient for the 1976–1985 birth cohort. Naturally, such an association cannot be interpreted as causal. We believe that an important area for future research is to understand the mechanisms underlying the differential evolution of educational persistence across states in India.
In this article, we document the intergenerational educational persistence in India for successive birth cohorts from 1940 to 1985. Intergenerational educational persistence, as measured by the regression coefficient of fathers’ education as a predictor of schooling in the next generation, has decreased significantly across birth cohorts over the last 45 years. This trend holds true across social groups and geographic boundaries. However, based on the estimated correlation between son–father educational attainments, no such trend is visible. We find that this discrepancy between the two measures is due to the differential evolution of the dispersion in educational attainment of the two generations. Based on a decomposition of the intergenerational correlation coefficient, we find that the decline in such correlation at the lower end of fathers’ education distribution is offset by the increase at the top end of fathers’ education distribution.
Our results have implications for policy in India. The decline in persistence at the lower end of fathers’ education indicates that public education has been increasingly able to compensate for the lack of educational inputs in the family. We also find a significant difference in the probability of achieving a senior secondary education or above based on fathers’ education levels. Thus, the policies that are able to move sons of less-educated fathers to higher levels of education compared with their fathers are not able to move those sons to the highest levels of education. This finding is corroborated by the recent experience in India: more and more students are enrolling in primary schools, but the dropout rates after primary and middle school remain significant.28
We also find no evidence of convergence between HHCs and other social groups in the probability of a son achieving a secondary education or above conditional on father’s education. Specifically, sons belonging to HHCs have a higher probability of achieving a secondary education or above compared with sons belonging to other social groups, conditional on their fathers’ educational background. Thus, the caste gap remains a significant challenge to policy-makers in India despite various affirmative action policy measures taken since independence.
The authors would like to thank the participants of the 10th Midwest International Economic Development Conference, the 2013 Pacific Conference for Development Economics (PacDev), the 2012 Northeast Universities Development Consortium (NEUDC) Conference, the 7th IZA/World Bank Conference on Employment and Development, and the 2014 IHDS Users Conference for their comments and suggestions.
Appendix: Identification of Father’s Educational Attainment
This section highlights the additional information in IHDS that is not available in the NSS or the NFHS that allows us to identify father’s schooling for almost every adult male respondent in the age group 20–65. Table 8 presents our sample selection process and the loss of observations at each stage.
The first variable that we use is the ID of father in the household roster, which helps to link individuals to their fathers directly if the father is living in the household.29 Using this information by default imposes the coresidence condition, which severely reduces the sample size. The last row of Table 8 in the appendix shows that using only this variable, we were able to extract father’s educational attainment for 34 % of the male respondents in the 20–65 age group.
In contrast to the NSS and the NFHS, the IHDS has another question regarding the education of the household head’s father (irrespective of whether the father lives in the household).30 Combining this variable with the ID of father variable, we are able to identify fathers’ schooling for about 97 % of the adult male respondents.31 In comparison, Hnatkovska et al. (2013), who used five rounds of the NSS, were able to identify fathers’ education for less than 15 % of males aged 16–65 interviewed in the NSS.32
This measurement issue is of practical as well as theoretical importance: using only coresidence to identify parents’ educational attainment may cause a severe sample-selection problem. The issue of sample selection and the resulting non-randomness in survey data has been extensively documented in the literature (see Francesconi and Nicoletti 2006). Further, a sample of son–father pairs achieved through coresidence may be misleading because it may not be a representative sample of the entire adult population of interest. For example, in our sample, almost 86 % of respondents whose father is identified through coresidence are aged 20–35. Hence, the coresidence condition generates a sample that effectively overrepresents younger adults, which is expected as these individuals are more likely to be living with their parents. The distribution of sons’ and fathers’ years of schooling is very different in the coresidence sample versus the total sample (see Table 3).
According to the Pew Research Center (2009:106), 87 % of respondents agree with the following statement: “Our society should do what is necessary to make sure that everyone has an equal opportunity to succeed.” Similarly, in an opinion poll on Economic Mobility and the American Dream by Pew Research Center conducted in March 2009, 71 % favored ensuring that everyone had a fair chance of improving their economic standing. In the same survey, only 21 % favored reducing inequality as a more important goal (Breen 2010). “In policy and political discourse, ‘equality of opportunity’ is the new motherhood and apple pie. In its strongest form, the position is that equality of outcomes should be irrelevant to policy; what matters is equality of opportunity” (Kanbur and Wagstaff 2014:2).
See Béteille (2002) for a discussion of the caste system and affirmative action by the Government of India.
The Indian constitution defines the power distribution between the federal (center) government and its states. Both the central and the state governments have power to legislate in the areas mentioned under the concurrent list.
In Online Resource 1, we discuss the issue of coresidence in the existing studies on India and the resulting decline in the sample size, and the sample selection issue that can arise from the use of coresidence for matching children with parents. In Online Resource 1, we demonstrate that one can identify fathers’ information for merely 27 % of adult males in the age group 20–65 based on coresidence. In addition, the majority of the resulting sample consists of individuals in the age group 20–30 (roughly 80 %).
Intergenerational transmission among women is another interesting question; however, we are unable to address it because the necessary data are not available. The majority of married women in India reside in households (their husbands’ households) other than those of their parents, and household surveys typically collect information only on members residing in the same household (through the household roster) at the time of the survey. The IHDS used in this article collected information on the father of the male household head separately, which helped us to identify fathers for adult males only.
Their sample does not include India.
Munshi and Rosenzweig (2006) used a survey of 4,900 households residing in Bombay and investigated the effect of caste-based labor market networks on occupational mobility. They found strong effects of traditional networks for males on occupational choice. However, for females, they found relatively greater mobility in occupational choices. Munshi and Rosenzweig (2009) used panel data on rural households: namely, the 1982 Rural Economic Development Survey, which covered 259 villages in 16 states in India. They reported low rates of spatial and marital mobility in rural India, and related these to the existence of caste networks that provide mutual insurance to their members.
The survey covered all the states and union territories of India except Andaman and Nicobar, and Lakshadweep. These two account for less than 0.05 % of India’s population. The data are publicly available from the Data Sharing for Demographic Research program of the Inter-university Consortium for Political and Social Research (ICPSR).
We use a lower limit of 20 because most individuals in India finish college (about 15 years of education) around this age. In our data, only 10 % (1 %) of individuals in the age group 20–24 (25–29) are still in school and have not completed their education. Following Behrman et al. (2001), we use an upper age limit of 65 years. All the analyses in this article use the survey weights provided in the data.
In our analysis, we adopt 5- and 10-year age-bands, and do not examine results under alternative aggregation schemes. As suggested by Hertz et al. (2007), such an aggregation scheme, although essentially arbitrary, should not bias the trend estimates unless it is chosen with a particular set of results in mind.
We do not have information on mother’s education for the entire sample. We also carried out our analysis using average education for both parents: 44 % of the observations in our sample have information on mother’s education. We find a similar correlation coefficient but a larger regression coefficient at the all-India level. For brevity, we do not report these results in this article, but they are available upon request from the authors.
Muslims are the largest minority religious group in India, and according to the Government of India (2006), their performance on many economic and education indicators is comparable with that for SC/ST. Certain differences exist among ST and SC. However, because of small sample sizes of ST after we divide the data into cohorts, we group SC and ST.
It is common among economists to refer to both intergenerational regression coefficients and correlation coefficients as inverse measures of intergenerational mobility (Solon 1999). Hertz et al. (2007) provided comparable estimates of the two measures of status persistence for 42 countries.
A weak intergenerational association (closer to 0) would have indicated that the opportunity to get any level of education is open to all, regardless of their fathers’ education.
A chi-square test of equality of β for cohorts 1940–1945 and 1981–1985 rejects the null (p value = .000). A chi-square test of equality of β for successive cohorts rejects the null for 1956–1960 versus 1961–1965; 1961–1965 versus 1966–1970; 1966–1970 versus 1971–1975; 1971–1975 versus 1976–1980; and 1976–1980 versus 1981–1985. However, it fails to reject the null for 1940–1945 versus 1946–1950; 1946–1950 versus 1951–1955; and 1951–1955 versus 1956–1960.
A chi-square test of equality of ρ for cohorts 1940–1945 and 1981–1985 fails to reject the null (p value = .60). However, a chi-square test of equality of ρ for successive cohorts rejects the null for 1951–1955 versus 1956–1960; 1961–1965 versus 1966–1970; 1966–1970 versus 1971–1975; 1971–1975 versus 1976–1980; and 1976–1980 versus 1981–1985. It fails to reject the null for 1940–1945 versus 1946–1950; 1946–1950 versus 1951–1955; and 1956–1960 versus 1961–1965.
If the sons of better-educated fathers are the first to take advantage of new educational opportunities, the persistence measured by β will increase (Hertz et al. 2007).
Note that it is important to account for preference for mobility when analyzing the issue of equality of opportunity based on realized outcomes in data. Immigrants, for instance, may have lower opportunity but greater observed mobility due to stronger preference for mobility. Because preferences are not uniform across social groups, one has to be careful in interpreting the extent of equal opportunity based on observed mobility (Breen and Goldthorpe 1997). We thank an anonymous referee for bringing this point to our notice.
To investigate the persistence in education, or term B, we collapse our years of schooling into stages of schooling achieved by sons and fathers. We group the years of schooling into five achievement levels: 0–4 years, below primary; 5–7 years, primary; 8–9 years, middle; 10–11 years, secondary; and 12–15 years, senior secondary or above.
Hertz et al. (2007) also formed the simple average across cohorts. They argued that the advantage of this approach compared with running a single regression for all ages is that it does not give more weight to larger cohorts.
Daude (2011) found that the countries in Latin America that show a high persistence using the beta coefficient measure also present a high correlation-coefficient persistence, with the correlation between the two measures being .75. Hertz et al. (2007) found a correlation of .51 between the two measures.
We are interested in analyzing only the cohort trend in intergenerational educational persistence within each group, where we identify a group by caste membership and by state of residence. This analysis is based on intergenerational educational persistence estimated separately from a subsample of observations belonging to a particular group. Such an analysis is useful only in describing the extent of intergenerational mobility in educational attainment within a group. However, these intergenerational educational persistence estimates are not very informative for comparisons across groups because the estimated persistence for any group provides only an estimate of the rate of regression to the mean for that particular group and not for the overall education distribution. See Hertz (2005, 2008) and Mazumder (2011) for a detailed discussion of group-specific measures of intergenerational persistence.
This is because the estimated persistence for any group provides only an estimate of the rate of regression to the mean for that particular group and not for the overall education distribution.
For space considerations, we present only the probability of achieving a higher level of education.
The estimates for sons of fathers with secondary education or above among Muslims are quite imprecise because of a small sample size in this group.
The non-convergence is in contrast to Hnatkovska et al. (2013), who found convergence in mobility among caste groups based on education switches and average size of education switch, where an education switch is defined as sons and fathers having different education levels. Hnatkovska et al. (2013) looked at SC/ST versus non–SC/ST, thus grouping OBCs and Muslims with HHCs as non–SC/ST. Because they also measured mobility by education switches rather than by conditional probabilities, their results are not directly comparable with ours.
According to the Planning Commission of India (2008:5) the dropout rate in primary classes—which has been decreasing at a very low average rate of 0.5 % per annum since the 1960s—showed a steeper decline of 10.03 % over the first three years of the Tenth Plan (29 % in 2004–2005 compared with 39.03 % in 2001–2002). However, the dropout rate at the elementary level (classes I–VIII) has remained very high, at 50.8 %. In our sample, the most recent birth cohort is 1981–1985, who should have attended the schools in the 1990s. Hence, the dropout rates representing individuals in our sample would be even more severe than those reported in the Planning Commission report for 2004–2005.
This is Question 2.8, on page 4 of the Household Questionnaire. In both the NSS and the NFHS, the analogous identification is achieved by utilizing the “relationship to the household head” question in the household roster (see Online Resource 1 for a discussion of such identification in the NSS data).
This is Question 1.20, “How many standards/years of education had the household head’s father/husband completed?,” on page 3 of the Household Questionnaire.
We also identify fathers’ years of education for some of the remaining adult males (who are not the household heads and whose fathers are not identified through coresidence) by exploiting relation to the head. A Stata do-file used to construct the son–father sample is available from the authors. Maitra and Sharma (2009) also used the IHDS data in their analysis but used coresidence to identify parental educational attainment. As a result, their sample is restricted to only 27.7 % and 6.4 % of the total adult male and female samples, respectively, interviewed in the IHDS. For instance, in table 4 of their paper, they reported a sample size of 5,789 and 11,515 for males in urban and rural areas, respectively, although the total adult male (20 and older) sample is 22,071 and 40,460 in urban and rural areas, respectively. Similarly, they used a sample of only 1,886 and 2,078 adult females living in urban and rural areas, whereas the total adult female sample is 21,790 and 40,378 in urban and rural areas, respectively.
In the supplement to their paper, Hnatkovska et al. (2013: table S2) reported the sample sizes for each round of the NSS. They reported the number of observations (son–father pairs) as 24,119 in 1983; 28,149 in 1987–1988; 25,716 in 1993–1994; 25,994 in 1999–2000; and 27,051 in 2004–2005. The actual numbers of males aged 16–65 surveyed in these cross-sections are 177,008; 196,412; 173,182; 183,732; and 188,585, respectively.