Abstract
Since the end of the 1990s, approximately 160 million Chinese rural workers have migrated to cities for work. Because of restrictions on migrant access to local health and education systems, many rural children are left behind in home villages to grow up without parental care. This article examines how cumulative exposure to parental migration affects children’s health and education outcomes. Using the Rural-Urban Migration Survey in China (RUMiC) data, we measure the share of children’s lifetime during which parents were away from home. We instrument this measure of parental absence with weather changes in their home villages when parents were aged 16–25, when they were most likely to initiate migration. Results show a sizable adverse effect of exposure to parental migration on the health and education outcomes of children, particularly boys. We also find that the use of the contemporaneous measure for parental migration in previous studies is likely to underestimate the effect of exposure to parental migration on children’s outcomes.
Introduction
The human capital development of children has long been of great interest to social scientists. Children’s current education and health outcomes have significant implications for their own social and economic well-being in the future, and how they are faring today as a group will affect the quality of human capital supply in the future for society as a whole.
The unprecedented economic growth in China in the past few decades has led to a large-scale rural-urban migration. In 2015, 166 million rural workers migrated to cities to work (National Bureau of Statistics 2016). This number is expected to grow rapidly in the coming years, with rural-urban migrant workers already accounting for more than 40 % of the total urban labor force.
Large-scale rural-urban migration may have a significant effect on human capital development of future generations. In particular, China’s institutional restrictions on rural-urban migration prevent adult migrants from bringing their children with them when they choose to work in cities. The Household Registration System (hukou) limits rural migrant workers’ and their families’ access to subsidized education, health care, and other public services available to local residents in cities. As a result, rural parents often leave their children behind in rural villages when they migrate to cities. Children who remain in rural areas either live with other family members or enroll in boarding schools, without the care and supervision of their parents.
The number of children being left behind is large. A 2009 study by the All-China Women’s Federation estimated that 58 million children aged 0–18 are left behind in rural villages because of parental migration (40 million aged 0–15), accounting for 28 % of all rural children (Ruan and Sun 2011). Also, the Rural-Urban Migration in China (RUMiC) project survey of 5,000 migrant households suggested that in 2008, approximately 57 % of migrant children aged 0–15 were left behind in rural villages.
The education and health outcomes of these children are likely to be affected via two channels. On the positive side, additional income brought home by migrant parents can increase investments in children’s education and health. On the negative side, lack of parental care can decrease such investments because parental care is one of the most important determinants of children’s development (see, e.g., Ginther and Pollak 2003; Lamb 1998; Love et al. 1996; McLanahan and Sandefur 1994; Sigle-Rushton and McLanahan 2006; Whitebook et al. 1989). The net effect of parental migration depends on the relative magnitude of the two effects, and it is largely an empirical issue.
Parental migration is also likely to affect children’s human capital development cumulatively. Human capital theory suggests that education and health outcomes are formed in an accumulative process, in which current and past inputs (such as home learning environments and school inputs for education outcomes; and medical care, diet, and exercise for health outcomes) are combined with an individual’s innate abilities or inherited health endowments (Grossman 2000; Hanushek 2003; Todd and Wolpin 2003).1 Thus, the current health and education outcomes of children are likely to depend not only on contemporaneous investments but also on a history of investments.
However, to date, the literature on the effect of parental migration on left-behind children’s education and/or health outcomes has largely focused on the contemporaneous effect: that is, the effect of whether parents were recently or are currently absent. The findings in this strand of literature are mixed for China as well as other countries. Some studies have found positive effects on children’s educational attainment/achievement of remittances or the migration of a household member or a parent. See, for example, Yang (2008) for the Philippines; Edwards and Ureta (2003) for El Salvador; Hanson and Woodruff (2003) and Antman (2012) for Mexico; and Chen et al. (2009), Lee and Park (2010), and Lu (2012) for China. Other studies, however, have found negative effects of parental migration. For instance, in Mexico, the migration of a parent or household member is found to reduce children’s study hours, increase their engagement in household chores, and increase their work hours (Antman 2011; McKenzie and Rapoport 2011). For China, studies have found that parental migration delays children’s educational attainment (Meyerhoefer and Chen 2011), negatively affects their test scores (Zhang et al. 2014), and increases the probability of being underweight (de Brauw and Mu 2011). Yet, an equally large number of studies found no effect of parental migration on children’s education or health outcomes (see, e.g., Chen 2013; Xu and Xie 2015).
A few studies have examined how the length of parental absence affects children’s outcomes, with mixed findings. For instance, Krein and Beller (1988) examined the length of time a child lived in a single-parent family and showed that the negative effect of living in such a family grows as the length of parental absence increases. Using Chinese data, Zhou et al. (2015) indicated that the duration of parental absence due to migration has no association with children’s outcomes, whereas Zhou et al. (2014) found a detrimental effect of parental absence in the last three years on children’s test scores. Therefore, considerable uncertainty remains regarding the relationship between lifetime exposure to parental migration and children’s education and health outcomes.
Another important issue in the investigation of parental migration is how migration of one parent versus both parents, and migration of fathers versus mothers, may differentially affect children’s outcomes. Some of the aforementioned studies examined whether the absence of one parent versus both parents for work has different effects. This question is in line with the literature on the effects of family structure on children’s outcomes, which suggests that children exhibit poorer educational attainment when raised by single parents (Astone and McLanahan 1991; Krein and Beller 1988). Another line of literature on evolutionary psychology suggests that the absence of mothers is more detrimental to children because mothers focus more on the investment in current children than fathers, who could have more children in the future with different partners because of their longer reproductive period (Biblarz and Raftery 1999). Empirical evidence on the difference between paternal and maternal migration is mixed. On one hand, children left behind by their mothers are disadvantaged in terms of both education and health relative to children who have neither parent absent and those who have only fathers absent (Wen and Lin 2012). A long-term absence of mothers is also associated with a lower probability of school enrollment (Jampaklay 2006). On the other hand, maternal migration is positively correlated with educational attainment of children (Arguillas and Williams 2010). These studies, however, did not take into account the potential endogeneity problem; thus, it is unclear whether the results reflect a causal effect of parental migration.
This article contributes to the literature in the following ways. First, we provide evidence for the effect of lifetime, cumulative exposure to parental migration on children’s education and health outcomes. We exploit detailed information on parental migration history, which is available in the Rural Household Survey (RHS) of the RUMiC, a large panel survey of 8,000 rural households, covering nine provinces from different regions in China. To the best of our knowledge, this is the only study that addresses the question on cumulative exposure to parental migration. Although we do not show the trajectory of those outcomes at different points in time, we show how children’s lifetime exposure to parental absence cumulatively results in differential levels of educational and health outcomes for children aged 0–15.2
Second, we address the issue of endogeneity in the parental migration decision by adopting the instrumental variable (IV) approach. We use as instruments (1) the weather shocks that occurred when parents had just become adults but before their children were born, and (2) the distance between one’s home village and the provincial capital city. The validity of the first IV rests on the fact that rural individuals are most likely to migrate for work when they are in their late teens and early 20s; weather shocks, which can reduce agricultural income, are likely to induce rural youth to migrate to cities. However, these shocks are unlikely to directly affect children who are yet to be born. The second IV, distance to the provincial capital city, is valid because it captures both the cost of and the need for migration, and hence it should be highly correlated with parental migration. At the same time, because the household registration system (hukou) in China restricted people from relocating in the past 60 years,3 the location of one’s home village, and thus the distance between the village and the provincial capital, is likely to be exogenously determined.
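The logic of the IV approach can be illustrated with a small simulation. The sketch below is not the estimation used in this article: the data are simulated, every coefficient value is an arbitrary assumption, and it uses a single instrument with no control variables (the just-identified case), whereas the analysis here uses two instruments and a set of controls. The names `weather_shock`, `unobserved`, and `simulate` are hypothetical.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(outcome, treatment):
    """Slope from a simple regression of outcome on treatment."""
    return covariance(treatment, outcome) / covariance(treatment, treatment)


def iv_slope(outcome, treatment, instrument):
    """Just-identified IV estimate: cov(z, y) / cov(z, m)."""
    return covariance(instrument, outcome) / covariance(instrument, treatment)


def simulate(n=20000, true_effect=-0.5, seed=0):
    """Simulated data (assumed values): an unobserved factor drives both
    migration exposure and the child outcome; the instrument affects the
    outcome only through migration exposure."""
    rng = random.Random(seed)
    outcome, treatment, instrument = [], [], []
    for _ in range(n):
        weather_shock = rng.gauss(0, 1)   # instrument
        unobserved = rng.gauss(0, 1)      # confounder
        exposure = 0.8 * weather_shock + unobserved + rng.gauss(0, 1)
        y = true_effect * exposure + 2.0 * unobserved + rng.gauss(0, 1)
        outcome.append(y)
        treatment.append(exposure)
        instrument.append(weather_shock)
    return outcome, treatment, instrument


y, m, z = simulate()
ols_estimate = ols_slope(y, m)   # biased by the unobserved confounder
iv_estimate = iv_slope(y, m, z)  # close to the assumed true effect of -0.5
print(f"OLS: {ols_estimate:.3f}  IV: {iv_estimate:.3f}")
```

In this simulated setting the simple regression slope is pulled away from the assumed effect by the confounder, whereas the IV estimate recovers it because the instrument is unrelated to the confounder by construction.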
Finally, we investigate not only the effect of parental migration on children’s outcomes but also the channels through which such effects occur. In particular, we investigate the effect of parental migration on various investments in children’s education. Although some studies have shown that parental migration can have negative effects on children’s time allocation measured by hours of household chores (Chang et al. 2011; de Brauw and Mu 2011) and study and work hours (Antman 2011; McKenzie and Rapoport 2011), we assess the effect on a more comprehensive set of education-related inputs, including age of school commencement, boarding status, hours of study, and expenditures on fees and private tutoring.
Our results suggest that children’s lifetime exposure to parental migration significantly worsens health and education outcomes. We also find that a longer exposure to parental migration reduces children’s after-school study time and delays their grade progression for a given age. Our results also show that longer exposure to paternal migration increases the probability that left-behind children will be enrolled in boarding schools, the quality of which has been debated.4
Background
Since the Communist Party’s rise to power in 1949, China has segregated rural and urban economies. Rural-urban migration was strictly restricted before the mid-1980s. As market-oriented economic reform deepened, demand for unskilled labor in cities increased. To meet this demand, the government gradually relaxed the migration restrictions, but migrants are still treated differently from urban local people. In particular, migrant workers are restricted in the type of job they can obtain, and they have limited access to urban social welfare and social services, such as education, health care, unemployment benefits, and pensions (Meng and Manning 2010). These restrictions prevent migrant workers from staying in cities for long and from bringing their families to cities. They often work in the cities for a few years, depending on their personal and family circumstances, and then return to their rural home. Sometimes, they engage in circular migration.
Because access to schools and health care in cities is very costly for migrants, many of them leave their children behind in rural villages. For example, the RUMiC 2008 survey of migrant households in cities showed that among the 2,300 children aged 0–15, 57 % were left behind in rural areas. Of these left-behind children, 30 % were looked after by the one parent who stayed behind, and 59 % of them were with grandparents or other relatives. The remaining 8 % were at boarding schools.
Left-behind children may be disadvantaged with regard to their nutrition intake, health outcomes, and school performance if alternative caretakers have less time to spend with the children or attend less to their day-to-day needs. Alternatively, if left-behind children are in boarding schools, the living conditions and management system of boarding schools could be important. In rural China, anecdotal evidence suggests that “the quality of the facilities and the nature of the management (of the boarding schools) might best be described as horrific. The safety, hygiene, supervision, diet and nutrition are all serious problems to these boarding schools” (REAP 2009).5 A recent study reported that boarding school students in rural areas are 9 cm shorter than the relevant median height set by the World Health Organization (WHO) (Shi and Zhang 2010). Another study found that boarding schools also have a negative effect on students’ school performance (Mo et al. 2012).
Data
Data Sources
The data used in this study are collected for the RUMiC project. This annual longitudinal survey began in 2008, covering nine provinces or municipalities that are major sending or receiving areas of rural-to-urban migration. In each province, three random samples were drawn: rural households, urban local households, and rural-to-urban migrant households. The sizes of the three samples are 8,000, 5,000, and 5,000, respectively. In this article, we primarily use the sample of rural households from the second and third waves (2009 and 2010).6
The rural surveys were conducted by the National Bureau of Statistics (NBS). The 8,000 households (in 800 rural villages) in our survey provinces, which are included in the regular NBS annual RHS, are surveyed between March and June each year.7 All household members who are officially registered in the household are included in the survey. Members who are not officially registered in the household but have been living there for six months or more are also included in the survey. Information on household members who were not present at the time of the survey was reported by other household members.
The RUMiC contains a rich array of information for these household members, including basic information for adults (aged 16 or older), such as education, employment, migration experience, and anthropometry. The survey also asks parents or guardians to report their children’s anthropometry (if aged 0–15) and school test scores (if aged 7–15 years). In addition, the survey conducted a uniform mathematics test for most primary school children (grades 1 to 6) in five of the nine rural sample provinces in 2011.8
Outcome Variables
From the RUMiC data, we use the z scores for height-for-age and weight-for-age as our main outcome variables for children’s health status. (See the Data Appendix for details on how these z scores are generated.) If parental migration has any effect on child’s health, it is likely to be found in his/her height-for-age, which reflects the health and nutrition conditions in the long run. Although weight-for-age reflects both height-for-age and weight-for-height, if wasting is not prevalent, it is also likely to indicate the long-run health/nutritional conditions (de Onis and Blossner 1997). Because weight data have fewer missing value cases, weight-for-age can provide a robustness check for the results for the height-for-age z score. We do not use weight-for-height or body mass index (BMI) as outcomes because they are more likely to reflect temporary changes in nutrition and health conditions.
The outcome variables that we use for children’s school achievement are the final exam scores attained in the last school term for the subjects of Chinese and mathematics. School test scores have been used to capture children’s cognitive skills in the education literature. For example, recent studies have found that schooling test scores have long-lasting effects on labor market outcomes (Chetty et al. 2011; Hanushek and Woessmann 2012; Hanushek and Zhang 2009; Mulligan 1999). In particular, Chetty et al. (2011) found that kindergarten test scores have a strong effect on individual earnings at ages 25–27. These studies suggest that test scores are good measures for embodied human capital. If parental absence leads to limited supervision and discipline of children, their study habits may not develop as well as they might under parental care. The lack of parental supervision thus can result in limited human capital accumulation, as measured by lower test scores.
These outcome variables were reported by parents or guardians, who are likely to have good knowledge of the children’s anthropometry and test scores. For health outcomes, rural primary schools typically measure height and weight during annual basic health check-ups, and the results are reported to parents/guardians. Younger children who have not started schooling are also invited by village clinics to have similar annual regular check-ups, and this information is shared with parents/guardians. Regarding the test scores, most schools in China have good communication channels with parents. In addition to regularly organized parental meetings in each semester, parents also receive report cards, which display the child’s assignments as well as final test scores (see, e.g., Chen and Feng 2013). If some parents are not fully aware of the children’s outcomes, the resulting measurement errors will not bias our estimates to the extent that they occur randomly. Nevertheless, nonrandom measurement error may arise in our case. For example, left-behind children are often cared for by their relatives. If those relatives pay less attention to the children than the children’s parents and report the children’s outcomes with greater errors, the measurement error could cause an endogeneity problem. In order to purge any resulting bias, we use the IV approach, discussed in detail in the next section.
The test scores are likely to be comparable across children in the sample. The contents of exam papers are largely consistent across different regions because all the schools use one of the three textbook series, all of which closely follow the National Curriculum Standard, the standard for basic education (grades 1–9) designed by the Ministry of Education. This is particularly true for the core subjects of Chinese, mathematics, and English. Although the full (maximum attainable) test score may vary across schools, we normalize test scores by dividing them by the full test score used in the child’s school, which was also reported by parents/guardians. Ideally, we would have information on which schools these children were attending so that we could standardize the scores across schools by taking student ranking within the school (Zhang et al. 2014) or dividing the test score by within-school standard deviations. Unfortunately, our data were collected from the household surveys, and no detailed school information is available. To control for potential differences across schools, we include provincial dummy variables in the regression analysis. Based on the information collected from our own fieldwork, provincial governments have the flexibility to choose from three textbook series. Usually, the same textbook series is used within a province or prefecture. However, potential differences occur within a province across counties, which may not be fully controlled for. As a robustness check for both potential parental reporting errors and possible inconsistency of exam papers across different counties, we also use the mathematics score obtained from the uniform test we conducted in the 2011 survey.9 The health outcome variables are also likely to be comparable because the z scores take into account differential growth patterns for each group defined by age and gender. We also control for a set of age- and gender-specific dummy variables in the regression.
Key Treatment Variables
As discussed earlier, we focus mainly on the cumulative exposure effect of parental migration on children. A frequently used indicator for parental migration is the very recent migration experience, such as the number of months during which parents or household members were away from home in the previous year or whether parents were away in the past few years. However, the parental inputs provided before the previous year (or the time of the survey) are also likely to be important in shaping current health status and educational achievement of the children. To measure the children’s exposure to parental migration more comprehensively, we define the share of a child’s lifetime in which the parents were away from the rural home, hereafter referred to as “lifetime exposure to parental migration”: the number of months a parent was away from home since the child’s birth, divided by the child’s age in months.
If a parent has never migrated, this share takes the value of 0. If a parent has been away all the time since a child was born, this share takes the value of 1. If a parent started working away from home some time after a child was born, this share takes a value somewhere between 0 and 1.
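In symbols, the definition can be sketched as follows. The notation (the label Exposure and the index p for the parent) is assumed here for illustration and is not necessarily the notation of the original display equation.

```latex
\text{Exposure}_{i}^{p}
  = \frac{\text{months parent } p \text{ was away from home since child } i\text{'s birth}}
         {\text{age of child } i \text{ in months}},
\qquad p \in \{\text{father}, \text{mother}\},
\qquad 0 \le \text{Exposure}_{i}^{p} \le 1 .
```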
We use the following two pieces of information regarding when parents started migration to construct the numerators for two lifetime-exposure measures. First, all the adult respondents were asked, “If you ever migrated to a city, in which year/month did you first migrate?” This information is available for parents who have ever migrated. Second, in the 2009 survey wave, those adult respondents who indicated that they migrated in the previous year were asked an additional question: “If you migrated in 2008, in which year/month did this episode of migration start?”10 In addition to these two pieces of information, the 2008, 2009, and 2010 RUMiC survey waves also collected detailed information on the number of months during which adult respondents were away in the previous year (2007, 2008, and 2009, respectively).
Based on these pieces of information, we construct the numerators for the two measures of lifetime exposure as follows. When the start year of parental first migration or the 2008 spell of migration was before the child’s birth year, we assume that parents were away in all the months since the child’s birth until the end of 2006. We then add these durations to the number of months in which the parents were away from home between 2007 and 2009. If the start year of the first migration (or the start of the 2008 migration spell) was after the child’s birth year, we count the number of months from the start of parental migration until the end of 2006, and add that to the duration of migration between 2007 and 2009. We calculate these two numerators for each parent (see the Data Appendix for more details), and then divide them by the child’s age measured in months to generate the two exposure variables.
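The construction just described can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure documented in the Data Appendix: the function and argument names are hypothetical, dates are compared as (year, month) pairs rather than by year alone, the child’s age in months is supplied by the caller, and the cap at 1 is an added safeguard.

```python
from typing import Optional, Tuple

YearMonth = Tuple[int, int]  # (year, month), with month in 1..12


def months_until_end_of_2006(start: YearMonth) -> int:
    """Months from `start` (inclusive) through December 2006; 0 if later."""
    year, month = start
    return max(0, (2006 - year) * 12 + (12 - month + 1))


def lifetime_exposure(
    child_birth: YearMonth,
    migration_start: Optional[YearMonth],
    months_away_2007_2009: int,
    child_age_months: int,
) -> float:
    """Share of the child's lifetime during which the parent was away.

    `migration_start` is either the first migration date or the start of
    the 2008 spell, depending on which exposure measure is being built;
    None means the parent never migrated.
    """
    if migration_start is None:
        return 0.0
    # If migration began before the birth, count from the birth;
    # otherwise count from the start of migration.
    counting_start = max(child_birth, migration_start)
    months_away = months_until_end_of_2006(counting_start) + months_away_2007_2009
    # Cap at 1 (an assumption) so that the share stays within [0, 1].
    return min(1.0, months_away / child_age_months)


# Child born January 2000, aged 120 months at the end of 2009.
always_away = lifetime_exposure((2000, 1), (1998, 5), 36, 120)
partly_away = lifetime_exposure((2000, 1), (2004, 1), 18, 120)
never_away = lifetime_exposure((2000, 1), None, 0, 120)
print(always_away, partly_away, never_away)
```

In the second example the parent is counted as away for 36 months from January 2004 to December 2006 plus 18 months in 2007–2009, or 54 of the child’s 120 months.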
The two numerators generated from either initial migration or the start of the 2008 spell may suffer from a measurement error problem. First, the duration calculated based on the first migration spell is likely to overestimate the true duration of parental migration because we assume that parents were away in all the months between their first migration and the end of 2006. However, it is very common for Chinese migrants to churn back and forth between city jobs and villages.11 Our calculation counts the periods during which migrants were back in home villages as part of their migration spell. Second, the duration of parental migration calculated based on the start of the 2008 migration spell is likely to underestimate the true duration of parental migration because parents may have had other spells prior to the 2008 spell.12 To purge a possible bias that might stem from these measurement errors, we use the IV method (see the Estimation Strategy section for detailed discussion). In addition, using the two measures provides us with a bounded range of the effect, with the measure calculated based on the 2008 spell representing the upper-bound estimate and the one based on the first migration duration representing the lower-bound estimate. We present the results using the exposure measure based on the 2008 spell in our main results tables and the one based on the first migration in the Robustness Test section.
To compare our results with the existing literature, we also estimate the effect of contemporaneous parental migration using the following two measures: (1) the share of the average number of months the mother and/or father was away from home in 2008 and 2009 (divided by 12), and (2) a dummy variable indicating whether mother or father was away in either 2008 or 2009.
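For one parent, the two contemporaneous measures can be computed as in the following sketch. The function and argument names are hypothetical, and the inputs are the reported months away in each year.

```python
from typing import Tuple


def contemporaneous_measures(months_away_2008: int, months_away_2009: int) -> Tuple[float, int]:
    """Return (share of the year away, away dummy) for one parent.

    Measure (1): average months away in 2008 and 2009, divided by 12.
    Measure (2): 1 if the parent was away at all in either year, else 0.
    """
    average_months = (months_away_2008 + months_away_2009) / 2
    share_away = average_months / 12
    away_dummy = int(months_away_2008 > 0 or months_away_2009 > 0)
    return share_away, away_dummy


# A parent away for 6 months in 2008 and at home throughout 2009.
share_example, dummy_example = contemporaneous_measures(6, 0)
print(share_example, dummy_example)
```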
Sample
The analysis for children’s height-for-age and weight-for-age z scores includes children aged 0–15 years, and the analysis for the test scores is conducted using the sample of children aged 7–15 years.13 The size of these two samples is affected by the availability of the outcome variables in the RUMiC survey and the construction of the key independent variable, children’s exposure to parental migration. First, although health outcome variables are available for all children aged 0–15 in all three survey waves, the education variables were collected only in 2009 and 2010 for children at school. To make our analysis consistent across different outcome variables, we focus on the data from the 2009 and 2010 waves. Also, we use the mean value of the outcome variables from these two waves to minimize the cases of measurement errors and missing values.14 If either wave has a missing value, we substitute it with the data in the other wave.15 Second, to construct the key treatment variable, children’s exposure to parental migration, up to 2009, information is needed on parental migration for both 2008 and 2009.16 Thus, our analytical sample includes children aged 0–14 in 2008 who have parental migration information in both the 2008 and 2009 waves, a total of 4,406 children.17 An additional 314 children were excluded from the sample in 2010 mainly because there was a change in survey sites,18 but this does not reduce our sample size for this analysis given that we use the 2009 values for these cases. We also exclude 647 children who spent some time in 2007, 2008, or 2009 away from their rural homes in order to focus on children who were left behind in rural areas. Another 412 children whose parents have missing values in their years of birth are excluded because one of our IVs, the weather patterns when parents were young, requires this information.
All the remaining children have both parents; therefore, our sample is not affected by their parental marriage dissolution.19 Another 380 observations have missing values for other control variables. Thus, our final sample for the health outcomes (for children aged 0–15) is 2,967. The sample for the education outcomes (children aged 7–15) is 2,215. These samples will also be used for our contemporaneous effect analysis.
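The sample restrictions above can be tallied as a quick arithmetic check. The counts are those reported in the text; the variable names are hypothetical. The 314 children dropped in 2010 are not subtracted because their 2009 values are used.

```python
# Children aged 0-14 in 2008 with parental migration information
# in both the 2008 and 2009 waves.
starting_sample = 4406

# Exclusions reported in the text, applied in sequence.
exclusions = {
    "spent time away from rural home in 2007-2009": 647,
    "parental year of birth missing": 412,
    "other control variables missing": 380,
}

final_health_sample = starting_sample - sum(exclusions.values())
print(final_health_sample)
```

The result matches the reported health sample of 2,967 children; the education sample of 2,215 is the subset aged 7–15.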
To assess whether our final sample is random, we compare those who are included in the final sample with those who are excluded. For this exercise, we focus on the 4,548 children who can potentially be included in our analysis sample: those aged 14 or younger in 2008.
Table 1, column 1 shows the results of regressing the dummy variable indicating whether the child is included in the final health sample on a vector of child-, household-, and village-level characteristics, controlling for provincial fixed effects. Column 2 presents the results for the education sample. Because the major cause of the exclusion from the analysis samples is the mechanical omission of children who turned 16 years old in 2009 or 2010, age is negatively correlated with the inclusion in the samples. After age is controlled for, the results suggest no statistically significant differences in the outcome variables between children who are included in our final samples and those who are excluded. However, we observe some statistically significant differences in control variables. For example, children included in the samples have lower birth weight and live relatively close to a junior high school, although both differences are very small in magnitude. In addition, those included in the education sample are 3 percentage points more likely to be girls and have a larger household. Nevertheless, any differences that might stem from the differences in these observed characteristics are controlled for in all the regressions conducted in our analysis.
Summary Statistics
Table 2 presents the summary statistics of the outcome, treatment, and other explanatory variables for the health and education samples. Panel A of the table presents children’s outcome and treatment variables, and panel B shows summary statistics for children, parents, and households’ characteristics.
The average height and weight z scores are –1.49 and 0.16, respectively. Note that z scores compare each child’s height and weight with the United States’ age-specific height and weight distribution. Thus, the average height z score of –1.49 indicates that, on average, our sample children are 1.49 standard deviations below the U.S. age-standardized height distribution, whereas the weight z score of 0.16 indicates that they are 0.16 standard deviations above the age-standardized weight distribution for U.S. children. Panel a of Fig. 1 presents the distributions of these z scores. It shows that the height z score distribution is skewed to the left, while the weight z score almost follows a normal distribution. Both weight and height z scores are trimmed because they contain extreme values (see the Data Appendix for details). For school-aged children (the education sample), the Chinese and math test scores averaged approximately 82 % and 83 %, respectively. Panel b of Fig. 1 shows that more children achieve above 90 % in the mathematics test than in the Chinese test.
The summary statistics of the treatment indicators are reported separately for fathers and mothers. First, the measure for children’s lifetime exposure to parental migration based on the start of the 2008 migration spell suggests that for our health sample, the average father and average mother were away from home for 15 % and 10 % of the child’s lifetime, respectively. The alternative indicators for children’s lifetime exposure to parental migration based on the start of the initial migration spell exhibit higher values: for an average child, the father and mother were away from home for, respectively, 41 % and 24 % of the child’s lifetime. This finding is not surprising because the alternative indicators are more likely to overestimate the true duration of parental migration since the birth of the child.
Figure 2 presents the distribution of children’s lifetime exposure to parental migration based on the 2008 migration spell. It shows that approximately one-half of the children had fathers away for some time in their lives, whereas approximately 33 % of the children had their mothers away for some time in their lives (panel A of Fig. 2). Among those children with a father and/or mother away for at least a month since they were born, 3 % to 4 % of them had their father/mother away for their entire lifetime (panel B of Fig. 2). The average duration of paternal absence from home amounts to 32 % of the child’s lifetime if we exclude children whose fathers were never away. The equivalent figure for mothers is 31 %. These statistics indicate that both fathers and mothers spend approximately the same length of time away from their children after they leave home, although mothers are less likely to leave.
For the contemporaneous migration measure among the health sample, we find that fathers and mothers were away for an average of 3.4 and 2.2 months, respectively, in the previous year. For the education sample, the average number of months that fathers and mothers were away was slightly smaller. This difference may reflect that parents are less likely to migrate when their children are at school. Among the health sample, 43 % of the children had their fathers away in the previous year, and 28 % had their mothers away. The equivalent figures for the education sample are lower as well: 39 % and 24 % for fathers and mothers, respectively.
Panel B of Table 2 presents children’s, parents’, and household characteristics. Children in the health and education samples are, on average, 10.0 and 11.8 years old, respectively. Approximately 55 % of them are males, and their average birth weight is 3.2 kg for both samples. Their fathers and mothers are on average 168 cm and 160 cm in height, respectively, with approximately eight and seven years of schooling, respectively.20 The average age of fathers is 38 and 40 for the health and education samples, respectively. On average, mothers are two years younger than fathers in both samples. Finally, the households in our samples, on average, consist of approximately five individuals. Approximately one-third of the household members are children aged 0–15.
Estimation Strategy
To examine the effect of parental migration on health and education outcomes of children who are left behind in their rural hometown, consider the following equation:
$$Y_{ijt} = \beta_0 + \beta_1 M_{ijt-1} + X_{ijt}\beta_2 + \nu_{ijt} \qquad (1)$$
where Yijt is the health or education outcome for child i, in village j, at time t (the average of the outcome variables over the years of 2009 and 2010). Mijt − 1 measures either the contemporaneous parental migration in time t − 1, or child i’s lifetime exposure to parental migration up to time t − 1. Xijt is a vector of child, parent, and village characteristics that may affect children’s outcomes. These variables include child’s birth weight; a set of gender-specific age dummy variables for children; parental age, education, and height; and four village-level characteristics measuring village public facility accessibility (i.e., the distance from the village to the nearest primary school, junior high school, health clinic, and bus station). We also control for provincial fixed effects. νijt is the error term.21
Recall that our conceptual framework indicates that parental migration may affect children’s outcomes in two ways. First, it reduces direct parental care for children, which in turn could have adverse effects on children’s health and education outcomes. Second, parental migration could increase household income and potentially have positive effects on children’s health and education outcomes. Thus, the estimated effect of parental migration, β1 in Eq. (1), captures the net of these two migration effects. We do not directly control for the household per capita income or parental remittances because of the potential endogeneity problem that they may introduce.
Equation (1) is first estimated using ordinary least squares (OLS) regressions. However, because parental migration is an endogenous variable, the estimated results are by no means causal effects. First, parental migration decisions may be affected by children’s health and education outcomes. Parents of children who are healthy and doing well at school may be more likely to choose to migrate or stay away for a longer time. If this is the case, the OLS results will underestimate the true effects of cumulative exposure to parental migration. Second, it is also possible that healthier and more-able parents are more likely to choose to migrate. These unobserved characteristics can be inherited by their children and hence are positively related to children’s health and education outcomes. If so, this will also cause the underestimation of the actual effects of exposure to parental migration on children’s outcomes. Third, our major dependent variables and independent variable may suffer from a measurement error problem (see the discussions in the Data section), which can also bias the results if Eq. (1) is estimated using OLS. Note that regarding parental migration measures, the contemporaneous parental migration is less likely to suffer from the measurement error problem than the cumulative parental migration measure.
Considering these problems, the most suitable way to investigate the effect of exposure to parental migration on children’s outcomes is probably to adopt an instrumental variable (IV) approach. We use two potential instruments for children’s lifetime exposure to parental migration and the contemporaneous parental migration measures.
The first instrument is the number of weather shocks that occurred when a parent (for father and mother separately) was aged 16–25, when individuals are most likely to initiate migration (Meng 2012). For each parent, we identify the years in which she or he was aged 16–25. Within those 10 years, we count the number of springs with extraordinarily little rain and the number with extraordinarily high temperatures.22 Too little rain or too high temperatures in spring could have reduced agricultural output in that year. If this happened many times during their youth, parents might have decided to migrate to cope with the income losses. At the same time, these weather shocks occurred before the children were born. Our oldest sample children were aged 15 in 2009 (i.e., born in 1994). On average, their fathers were aged 27 at their births.
More specifically, we define rainfall and temperature shocks relative to their averages over the spring months in the 30 years between 1980 and 2010. If, in a particular year, a weather station recorded spring rainfall at least 1 standard deviation below the 30-year rainfall average or spring temperature at least 1 standard deviation above the 30-year temperature average, we define this as a shock in that year. We count, separately, the number of these shocks in the 10 years during which the parent was aged 16–25. We then interact the number of years with very high temperature and the number with very little rain in spring to generate one instrument. We also test whether our IV results are robust to other definitions of weather shocks in the Robustness Tests subsection.
The 10-year youth interval is unlikely to be sensible for older parents who turned age 25 before 1980 because migration to cities was not a realistic option before 1980. Thus, for these cohorts, we assign the equivalent interaction term for the years between 1980 and 1989. The weather data are from China’s National Meteorological Information Centre (NMIC). The historical daily rainfall and temperature information is available for 824 weather stations across China between 1980 and 2010. Of these 824 stations, 75 have been assigned to our 82 sample counties as the nearest weather stations.23
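The construction of the weather instrument described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the dictionary inputs (one spring average per year for the nearest station), and the use of the sample standard deviation are all hypothetical choices. Years of the youth window that fall outside the available weather record simply contribute no shocks.

```python
import numpy as np

def shock_years(spring_values, direction, threshold_sd=1.0):
    """Years whose spring value lies at least threshold_sd standard deviations
    below ("low") or above ("high") the mean over all years supplied."""
    years = sorted(spring_values)
    x = np.array([spring_values[y] for y in years], dtype=float)
    mean, sd = x.mean(), x.std(ddof=1)  # assumption: sample standard deviation
    if direction == "low":
        flags = x <= mean - threshold_sd * sd
    else:
        flags = x >= mean + threshold_sd * sd
    return {y for y, flagged in zip(years, flags) if flagged}

def youth_window(parent_birth_year):
    """Years in which the parent was aged 16-25; cohorts that turned 25
    before 1980 are assigned the window 1980-1989 instead."""
    if parent_birth_year + 25 < 1980:
        return 1980, 1989
    return parent_birth_year + 16, parent_birth_year + 25

def weather_instrument(parent_birth_year, spring_rain, spring_temp):
    """Interaction of the number of dry springs and the number of hot springs
    during the parent's youth window."""
    start, end = youth_window(parent_birth_year)
    dry = shock_years(spring_rain, "low")
    hot = shock_years(spring_temp, "high")
    n_dry = sum(start <= y <= end for y in dry)
    n_hot = sum(start <= y <= end for y in hot)
    return n_dry * n_hot

# Hypothetical station record: flat series with two dry springs and one hot spring.
rain = {y: 100.0 for y in range(1980, 2011)}
rain[1990] = rain[1992] = 20.0
temp = {y: 15.0 for y in range(1980, 2011)}
temp[1991] = 25.0
print(weather_instrument(1970, rain, temp))
```

Passing `threshold_sd=1.5` to `shock_years` would correspond to the stricter shock definition used as a robustness check.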
The second instrument is the distance between the home village and the capital city of the province where the child lives. This instrument captures the combination of potential migration cost (the farther the village is from the provincial capital city, the higher the cost) and the need to migrate (the closer the village is to the capital city, the lower the need to migrate to the city to find a job). One might consider using the distance to the actual destination in order to measure the costs of migration. However, moving to a city that is close to, or far from, the rural village is likely to be an endogenous choice that can be affected by unobserved factors influencing children’s outcomes. By using the distance to the home provincial capital instead, a correlation with such unobserved heterogeneity is likely to be purged because it has been extremely difficult for rural people to change their hukou registration status from rural to urban or even from one rural village to another over the past 60 years. Typically, individuals born in a rural village are registered with a rural hukou at their birth village for the rest of their lives, with very few exceptions. This feature of China’s Household Registration System implies that, in general, the location of an individual’s home, and thus the distance between the home village and the home provincial capital, is exogenous. In other words, the basic exogeneity of household registration location ensures that the distance between one’s home village and the capital city of one’s home province is out of the individual’s control, regardless of whether one’s village is rich or poor, or whether one’s village suffered from natural disasters. We derive this instrument from Google Earth, measuring the straight-line distance between our 82 survey counties and their respective provincial capital cities. We add this distance to the distance between the village and the county city reported in the 2008 RUMiC Rural Village Survey.
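For readers who want to reproduce a distance measure of this kind without Google Earth, the sketch below substitutes the haversine great-circle formula for the straight-line measurement. It is an illustrative stand-in, not the authors' procedure: the function names, the coordinate tuples, and the mean Earth radius of 6,371 km are assumptions introduced here.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (latitude, longitude) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def village_to_capital_km(county_coord, capital_coord, village_to_county_km):
    """County-to-provincial-capital distance plus the surveyed distance
    from the village to the county city."""
    return haversine_km(*county_coord, *capital_coord) + village_to_county_km

# Hypothetical coordinates, for illustration only.
print(round(village_to_capital_km((30.0, 114.0), (30.6, 114.3), 12.0), 1))
```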
However, concerns may remain regarding the two instruments. The weather shocks are measured at the time parents were aged 16–25. During that period, some children might have been born, and hence their health and education outcomes may be affected by these weather events. In the sensitivity test section, we test the robustness of using weather events occurring when parents were aged 16–21, given that the Chinese legal marriage age for men and women is 22 and 20 years, respectively, and an out-of-wedlock birth is extremely rare in rural China. The distance variable may also suffer from the concern that the closer to the provincial capital city, the better the economic conditions of the village. To mitigate this possibility, we include in our regressions a vector of village public facility accessibility indicators, as well as the average per capita income for the village, in addition to provincial fixed effects.
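The logic of the IV strategy in this section can be illustrated on simulated data. The sketch below is not the authors' estimation code and uses no RUMiC data: the data-generating process, the variable names, and the true effect of -3 are all invented for illustration. An unobserved factor raises both exposure and the outcome, so OLS is biased toward zero, whereas two-stage least squares (2SLS) with a valid instrument recovers a value close to the true effect.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, m, z, controls):
    """2SLS coefficient on the endogenous regressor m, instrumented by z."""
    const = np.ones((len(y), 1))
    exog = np.column_stack([const, controls])       # included exogenous regressors
    first_stage_X = np.column_stack([exog, z])      # exogenous regressors plus instrument
    m_hat = first_stage_X @ ols(m, first_stage_X)   # first stage: predicted exposure
    second_stage_X = np.column_stack([m_hat, exog])
    return ols(y, second_stage_X)[0]                # second stage: coefficient on m_hat

def simulate(n=20000, beta=-3.0, seed=0):
    """Simulated outcome y, endogenous exposure m, instrument z, control x."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)   # instrument (e.g., a weather-shock count)
    x = rng.normal(size=n)   # observed control
    u = rng.normal(size=n)   # unobserved factor affecting both m and y
    m = 0.5 * z + 0.3 * x + u + rng.normal(size=n)
    y = beta * m + 1.0 * x + 2.0 * u + rng.normal(size=n)
    return y, m, z, x

y, m, z, x = simulate()
naive = ols(y, np.column_stack([m, np.ones(len(y)), x]))[0]
iv = two_stage_least_squares(y, m, z, x)
print(f"OLS: {naive:.2f}  2SLS: {iv:.2f}  (true effect: -3.00)")
```

The sketch returns point estimates only; standard errors from a manually run second stage would be incorrect and are deliberately omitted.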
The Empirical Results
Before discussing empirical results, we note an important technical issue. Ideally, one would like to include both father’s and mother’s migration duration separately as measures of Mijt − 1 in Eq. (1). However, these two variables are highly correlated. The correlation coefficient between the share of father’s and mother’s migration durations in child’s lifetime for our sample is .78. This level of correlation may be tolerable in the case of OLS estimation but not in the case of the IV (or two-stage least squares (2SLS)) estimation, in which the correlation coefficient between the predicted lifetime exposure for father’s and mother’s migrations is .96. Such a high level of multicollinearity results in insignificant estimates for both treatment variables. Hence, we estimate Eq. (1) for mother’s and father’s effects separately.
The OLS Results
The OLS results are reported in Table 3. The results for the lifetime exposure regression (panel A) indicate that fathers’ and mothers’ accumulated migration years as the proportion of children’s lifetime are negatively associated with children’s height-for-age (columns 1 and 2), but these relationships are not statistically significant. For weight-for-age z scores, the negative effect of exposure to mothers’ migration is statistically significant. For the education outcomes, however, we find small, positive, and significant correlations between children’s lifetime exposure to parental migration and test scores. These positive relationships are also visible in the unconditional relationships plotted in Fig. 3, which displays strong positive relationships between the outcome variables and the child’s exposure to parental migration.
Is this finding counterintuitive? Not necessarily. As discussed earlier, in the presence of endogeneity due to possible reverse causality, omitted variables, and measurement errors, the unconditional relationships and the OLS results are likely to underestimate the actual causal effects of children’s lifetime exposure to parental migration on children’s health and education outcomes. Intuitively, it is easy to understand that parents decide to migrate when their children are doing well, whereas when their children are not doing well, they may choose to stay at home. At the same time, we understand that parents who choose to migrate may be a special group. They may be more risk-seeking, more competitive, and more driven. These characteristics could be inherited by their children; and children who are more competitive and more driven may also be more likely to do well in school tests. Because our regressions fail to control for these unobserved characteristics, the correlation observed from the regressions may simply be a reflection of the endogeneity issues.
In addition, the fact that our measure for cumulative parental migration is very likely to suffer from a measurement error problem could also generate the tendency to underestimate the coefficients. There may be some evidence for this. The measures of contemporaneous parental migration are less likely to suffer from large measurement errors than the cumulative parental migration measures. Although we find a significantly positive correlation between children’s lifetime exposure to parental migration and their test scores, we observe no relationship between test scores and children’s exposure to parental migration in the last year (panel B, Table 3). Panel C of Table 3 further strengthens this argument: the treatment variable used in this panel is a dummy variable indicating whether the father or mother migrated at any time in the last year. Relative to the months-of-migration measure, this dummy is even less likely to be measured with error. Here we observe that both health and education outcome variables are negatively and statistically significantly related to fathers’ migration indicator.
With regard to other independent variables, we find that after we control for children’s age, gender, and birth weight, parental height is closely related to children’s health outcome variables, and parental years of schooling is closely related to children’s education outcome variables. Household size is negatively associated with both children’s health and education outcomes.
In the last panel of Table 3 (panel D), we also test whether children’s lifetime exposure to migration by one parent and both parents have differential effects. One of the commonly used contemporaneous measures for parental migration is whether one parent or both parents migrated in the previous year. For comparison, we construct cumulative measures indicating the share of children’s lifetime exposure to migration of one parent and both parents. Unfortunately, we are unable to observe a stable pattern with this test. However, the signs of the coefficients are largely the same as those observed using the indicators for exposure to migration by mothers and fathers.
The IV Results
We now discuss the IV results. First, Table 4 reports selected results from the first-stage estimations. They indicate that all instruments are highly significantly correlated with children’s lifetime exposure to parental migration (panel A) and the indicator for parental contemporaneous migration (panel B). The instruments are stronger for parental contemporaneous migration than for parental cumulative migration. This is understandable because the latter is affected by not only whether parents migrated but also by the duration of parental migration and the age of the children when parents started the 2008 migration spell.
We observe that the more often severe droughts occurred when parents were aged 16–25, the longer the parents stayed away from their children. Also, the farther the home village is from the provincial capital, the longer the parents stayed away from their children. As discussed earlier, the distance between a parent’s own village and provincial capital city is used to capture the combination of potential migration costs and the need to migrate to find a job in the city: the closer the village is to the capital city, the lower the need to migrate. The sign of the relationship between the distance and the migration measure is, therefore, ambiguous a priori. If the distance to the provincial capital chiefly reflects the costs of migration, it should be negatively correlated with the child’s exposure to parental migration. However, because our migration measure includes a duration component (accumulated parental migration during children’s lifetime or the number of months parents migrated last year), high migration costs could reduce the number of home visits for migrants and increase the time they stay away from their children. In addition, the closer the parents’ own village is to the provincial capital, the lower the need for parents to live away from their children to work in cities because they could commute to work daily. Thus, the positive relationship we observe between the distance and our endogenous variables suggests that the IV is likely to predominantly capture the absence of nearby job opportunities and the costs of returning home.
Other covariates in the first-stage regression, which are statistically significant, include child’s birth weight, father’s age, household composition, and some village-level economic condition variables. All have the expected signs. The heavier the child is at birth, the longer the parent stayed away. This seems to suggest reverse causality. Further, younger fathers and parents with large families have a longer duration of absence. Finally, parents from villages with lower average income are also more likely to be absent from home for a longer period.
The first three panels of Table 5 report the IV results for the effect of children’s cumulative exposure to parental migration. Panel A shows the results for the whole sample. We find that children’s exposure to parental migration has a negative and, in most of the cases, statistically significant effect on children’s health and education outcomes. The estimated effects suggest that every 10 percentage point increase in the exposure to mother’s migration reduces the child’s height-for-age z score by 0.34, which is 20 % of a standard deviation for our sample (see Table 2 for the standard deviation). Because the average children’s lifetime exposure to mother’s migration is 9.9 %, the coefficient implies that, on average, the reduction in height-for-age z score is approximately 0.33 (= 3.356 × 0.099). This is the average effect over all children regardless of whether their mothers were away. If we confine the calculation to those whose mothers actually migrated, the effect is larger. In our sample, 32 % of children experienced maternal migration, and for them the average lifetime with mothers being absent is 30 %. Thus, for this group, the reduction in the height-for-age z score associated with the average exposure to maternal migration is 1.02 (= 3.356 × 0.305), which is 59 % of the standard deviation for the sample. This is a very large effect. The effect of a father’s migration, as the share of the child’s lifetime, on children’s height-for-age z score is also negative but much smaller and statistically insignificant. For the weight-for-age z score, the exposure to both paternal and maternal migration has a negative and statistically significant impact. 
Although the weight-for-age z score can reflect not only the long-run growth (such as height-for-age) but also the short-run fluctuation, to the extent that the results are similar to those for the height-for-age z score, these results provide suggestive evidence that cumulative exposure to both paternal and maternal migration limits the long-run growth of left-behind children. These IV estimates indicate larger negative effects, confirming the upward bias in the OLS estimates.
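The implied magnitudes quoted above for mothers’ migration and height-for-age follow from multiplying the IV coefficient by an exposure share. The short sketch below reproduces that arithmetic using only figures stated in the text (the coefficient magnitude of 3.356 and the exposure shares of 0.099 and 0.305); the function and variable names are hypothetical.

```python
def implied_reduction(coefficient, exposure_share):
    """Reduction in the z score implied by a given share of the child's
    lifetime spent with the parent away (linear in exposure)."""
    return coefficient * exposure_share

COEF_MOTHER_HEIGHT = 3.356  # magnitude of the IV coefficient quoted in the text

per_10_points = implied_reduction(COEF_MOTHER_HEIGHT, 0.10)     # 10 percentage point increase
all_children = implied_reduction(COEF_MOTHER_HEIGHT, 0.099)     # mean exposure, whole sample
migrant_children = implied_reduction(COEF_MOTHER_HEIGHT, 0.305) # mean exposure, mothers who migrated

for label, value in [("per 10 points", per_10_points),
                     ("all children", all_children),
                     ("children of migrant mothers", migrant_children)]:
    print(f"{label}: {value:.2f}")
```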
We now turn to education outcomes (columns 5–8 of Table 5). The results show that every 10 percentage point increase in the exposure to father’s migration reduces the Chinese and mathematics test scores by 2.7 and 1.7 percentage points (25 % and 15 % of 1 standard deviation of the Chinese and mathematics test scores), respectively. For the whole sample of children, regardless of whether the father has ever been away, the fathers were away, on average, 11 % of the child’s lifetime. Hence, the average effect of fathers’ migration on this sample is a 2.9 and 1.9 percentage point reduction in the Chinese and mathematics test score, respectively. For the 43 % of children whose fathers ever migrated—the fathers were away 24 % of the child’s lifetime—the reduction in their Chinese and mathematics test scores caused by their fathers’ migration is 6.6 and 4.2 percentage points, respectively, accounting for 60 % and 37 % of the standard deviation of the Chinese and mathematics scores for the sample.
For children whose mothers have been away, the negative effect is particularly large for the Chinese test score. A 10 percentage point increase in a child’s exposure to maternal migration reduces the child’s Chinese test score by 3.5 percentage points. Among the whole sample, the mothers have been away for 6.5 % of the children’s lifetime, on average, and the effect on the Chinese test score associated with this level of exposure is a 2.3 percentage point reduction. Among the 29 % of children whose mothers have been away at some time in their lives, mothers were away for 23 % of the child’s lifetime on average. The associated reduction in the Chinese score for this group is 8.0 percentage points, which accounts for 73 % of a standard deviation.
Panels B and C of Table 5 present the results for the effect of cumulative exposure to parental migration by the gender of children. In general, the negative significant effect persists in most cases for sons. However, the effects for daughters, although negative, are often not precisely estimated, most likely because of the weak instruments. These results are consistent with previous studies that show greater vulnerability to parental absence for sons (Bertrand and Pan 2013).
The results for the contemporaneous effect of parental migration (panels D and E of Table 5) show that children whose parents migrated in the previous year exhibit less favorable outcomes in general. Understandably, the size of the coefficients is smaller for the contemporaneous effect (panel D) than the cumulative effect (panel A) given that the latter encompasses the former. Although the literature often uses contemporaneous measures to estimate parental migration effects, our findings indicate that this is likely to underestimate the true effect of parental migration.
Panel F shows the IV estimates for the impact of exposure to one parent or both parents, using as the IVs the weather shock incidents during youth for both fathers and mothers as well as the distance to the provincial capital. Odd-numbered columns show the results based on the specification including only the measure for exposure to both parents, and even-numbered columns indicate the results that simultaneously include both indicators. The results generally point to negative effects on the outcomes, particularly for the test scores. However, because the first-stage results are weak in both specifications (see the F test for weak instruments), the effect of parental migration defined by the number of migrating parents is not precisely estimated.
Robustness Tests
In this subsection, we report the results of four sensitivity tests. The first concerns the measure of cumulative parental migration. Until now, we used the measure for parental migration calculated based on the start of the 2008 migration spell. As discussed earlier, this measure may underestimate the actual duration of parental migration since the child was born because of the possibility of multiple spells. Alternatively, we could calculate the duration based on the start of the first migration spell. Because this method is likely to overestimate the duration of parental migration, the estimated coefficient is likely to provide the lower bound for the effect of parental migration. We present the estimated results using this lower-bound measure of parental migration in panel A of Table 6. The results are consistent with those using the 2008 spell, but the magnitude of the coefficients is indeed smaller.
The second set of the tests is related to the concern that the extreme spring drought is measured when parents were aged 16–25. If the child was born during this time, the drought could have a direct effect on children’s outcomes. To test this, we restrict the droughts counted to those occurring when parents were aged 16–21. The 1980 Chinese Marriage Law stipulates that the legal marriage ages for females and males are no earlier than 20 and 22, respectively. This age restriction has not changed since then. Thus, the alternative measure for the drought is likely to minimize a potential direct effect on children’s outcomes. The results using this IV are reported in panel B of Table 6. The results show that this change in the definition of the IV does not affect the sign and size of the coefficients much, and most of the coefficients are statistically significant.
One might also wonder what happens if the definitions of extremely low rainfall and extremely high temperature are changed. To examine this, we use two drought IVs based on slightly different definitions. The current IV is defined as the interaction between the number of years in which spring rainfall was 1 standard deviation below the historical mean and the number of years in which spring temperature was 1 standard deviation above the historical mean. The first alternative IV uses the same definition as the current IV but changes the threshold level for extreme weather to 1.5 standard deviations away from the historical means. The second alternative IV also uses the 1.5 standard deviation thresholds but counts the number of spring months instead of years during which the shocks occurred. The estimates based on these two alternative IVs are reported in panels C and D, which indicate that the findings are qualitatively unchanged from those based on the original definition of the IV reported in panel A of Table 5. In particular, the negative effects on height, weight, and Chinese test score are robust to the use of differently defined IVs.
Finally, we test for the comparability of the test score variables across regions. As discussed earlier, although we normalize the test scores obtained from the survey, the content of exam papers may be different across regions. To test whether our results are sensitive to the way the education outcomes are measured, we use the data collected from a uniform mathematics test conducted in 2011 (see footnote 9 for details of the test instruments). This test was conducted for a subset of the RUMiC sample households (only for primary school children (Years 1 to 6) in five of nine RUMiC rural sampling provinces). Using this subgroup of children, we reestimate the IV regression in Table 5. The results are reported in panel E of Table 6. Because of the small sample size, the explanatory power of the IVs is somewhat reduced. Nevertheless, all the estimated coefficients have the right sign and are statistically significant at the 5 % level. The sensitivity tests seem, to some extent, to confirm a strong negative effect of left-behind children’s lifetime exposure to parental migration on their health and education outcomes.
How Parental Migration Affects Test Scores
Why does parental migration have negative effects on children’s outcomes? In this section, we examine whether and to what extent parental migration affects the level of inputs into children’s educational outcomes.24 In particular, we examine school starting age, distance to school, whether the child attends a boarding school, the number of hours the child studies after school, and various annual fees paid to school as well as private tutoring. We also include a variable measuring the difference between the child’s current age and current grade. The normal gap should be 6 or 7, depending on when the child started schooling. A larger gap may suggest delayed entry and/or grade repetition.
The model specification for the equations of these educational inputs is exactly the same as that specified in Eq. (1). The IV results are reported in Table 7. We find that children’s exposure to parental migration increases their likelihood of grade repetition and enrollment in boarding schools, but it reduces their time spent on studying after school.
The magnitude of these effects is rather large. Every 10 percentage point increase in exposure to paternal and maternal migration increases the child’s age-grade gap by 0.27 and 0.39 of a year, respectively. The results for children’s school starting age (column 1) are not statistically significant. Thus, it is not the case that left-behind children are older for their grades because they start school late. Combined with the results in column 2, they suggest that left-behind children experience considerable grade repetition. This finding is consistent with findings of Meyerhoefer and Chen (2011), who also found that left-behind children are more likely to repeat a school grade.
The results also indicate that children of migrants study less at home. The results in column 6 indicate that relative to a child whose father or mother has never migrated, a child exposed to father or mother migration during the child’s entire life spends 20 and 25 hours per week less on homework, respectively. In other words, a 10 percentage point increase in exposure to paternal and maternal migration reduces children’s homework time by 2.0 and 2.5 hours per week, respectively. Given that, on average, our sample children spend 7 hours weekly studying after school, this level of reduction in the hours of study amounts to a 28 % and 35 % reduction for paternal and maternal migration, respectively. Also, a 10 percentage point increase in exposure to paternal migration increases the probability of attending a boarding school by 7.9 percentage points.
These results provide some explanation for why exposure to parental migration has a negative effect on left-behind children’s education outcomes. Namely, parental migration undermines desirable educational inputs, such as ample homework time and normal grade progression, which likely contributes to lower test scores for children experiencing parental migration.
Conclusion
Using RUMiC data, we examined the effect of cumulative exposure to parental migration on children’s health and education outcomes. Unlike most studies in this area, we were able to measure the share of children’s lifetime to date during which their parents were away from home. To the best of our knowledge, this is the first study providing evidence on the effect of lifetime exposure to parental migration.
The unconditional relationships between our parental migration measures and children’s outcomes revealed potential reverse causality, omitted variables, and measurement error problems. To mitigate these problems, we adopted an IV approach. We instrumented the exposure to parental migration using weather shocks during parental youth and prior to children’s births, as well as the distance to the provincial capital city.
One major finding is that children’s health and education outcomes were adversely affected by exposure to parental (both paternal and maternal) migration. For example, if a child’s father was away for one-quarter of his/her life, the child’s weight-for-age and Chinese test score are likely to be 0.33 standard deviation and 6.8 percentage points lower, respectively, than a child whose father was never away. The same amount of exposure to maternal absence is likely to reduce weight-for-age and the Chinese test score by 0.57 standard deviation and 8.8 percentage points, respectively. Note that our estimated effect is a combination of the negative effect of parental absence and the positive effect of remittances generated from parental migration. Had we been able to control for the income effect, the estimated parental absence effect could have been even larger.
Another important finding is that when parents migrate and leave their children behind in rural villages, children spend considerably less time studying after school and are more likely to repeat a grade. These negative effects on inputs into children’s schooling are likely to contribute to the negative effect on their education outcomes measured by the test scores.
Finally, by comparing our results for the impact of children’s lifetime exposure to parental migration with the effect of contemporaneous parental migration, we show that what the literature has commonly estimated as the effect of parental migration (using contemporaneous measures for it) is likely to reflect the lower bound of the full exposure effect.
China’s rural-urban migration has affected tens of millions of rural children. More than 60 % of migrants’ children are left behind by their migrating parents in rural villages, primarily because migrant children have limited access to cities’ public services (such as education and health care). The large negative effect of parental migration on left-behind children’s health and education outcomes uncovered in this study is alarming. If allowed to continue, the intergenerational impact of parental migration may have significant adverse effects on the quality of future labor supply, reinforcing the lack of skills in general and widening income inequality between rural and urban areas.
Acknowledgments
This research has benefited from Australian Research Council (ARC) Linkage Grants LP0669728 and LP140100514, which funded the RUMiC survey, and from the Grant-in-Aid for Scientific Research (JSPS KAKENHI Grant No. JP24730239) of the Japan Society for the Promotion of Science.
Data Appendix
The share of a child’s lifetime during which parents were away is calculated using two types of information from the RUMiC study. One is the number of months in the previous year during which parents were away from home. This information is available from all the waves. Thus, we know the number of months parents were away in 2007, 2008, and 2009. The other piece of information is when parents started migration, which was asked only in the 2009 wave for individuals who reported having been away in 2008. To assess the number of months parents were away since the child’s birth, we combine the information from the two sources. First, if a child was born in or after 2007, we aggregate the number of months in which parents were away between 2007 and 2009. Second, if a child was born before 2007, we compare the year of birth and the year in which the 2008 parental migration started. If the child was born after the starting year, we assume that parents were away for work from the birth of the child until the end of 2006. If the child was born before the starting year, we assume that parents were away from the starting year until the end of 2006. For individuals who answered that their migration started before 2007, we add the number of months between the beginning of the migration and the end of 2006. In all cases, the duration of migration since the birth of children is expressed in months and is divided by the number of months since the child was born. In a robustness test, we use an alternative indicator for the year in which parental migration started: the year in which parents migrated for the first time.
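The construction described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name, the argument names, and the simplification that births and migration spells are dated at the start of a calendar year are all assumptions made for illustration, as is the choice to add the observed 2007 to 2009 months to the imputed pre-2007 months for children born before 2007.

```python
def exposure_share(birth_year, months_away, migration_start_year=None,
                   end_year=2009):
    """Share of a child's lifetime (in months) during which a parent was away.

    months_away: dict mapping year (2007, 2008, 2009) to months away that year.
    migration_start_year: reported start year of the migration spell, or None.
    Assumption: births and spell starts are dated to January of the given year.
    """
    # Months lived from the (assumed) January birth through December of end_year.
    months_alive = 12 * (end_year - birth_year + 1)

    # Observed months away, counting only years in which the child was alive.
    first_observed = max(birth_year, 2007)
    away = sum(months_away.get(year, 0)
               for year in range(first_observed, end_year + 1))

    # Imputed months away before 2007: from the later of the child's birth
    # and the start of the spell until the end of 2006.
    if (birth_year < 2007 and migration_start_year is not None
            and migration_start_year < 2007):
        first_away_year = max(birth_year, migration_start_year)
        away += 12 * (2007 - first_away_year)

    return away / months_alive
```

For instance, under these assumptions a child born in 2000 whose father started migrating in 2003 and was away for all of 2007 through 2009 would have 84 months of exposure out of 120 months of life.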
The measures for migration of one parent and both parents are created in a similar manner. We compare the timing of the three events: the birth of a child, start of paternal migration, and start of maternal migration. For example, if the father was away for work before the child’s birth, and the mother joined the father for work in cities some years after the child’s birth, the child is assumed to have been exposed to migration of one parent since birth, and to migration of both parents since the mother started migration. We count the number of months that fall in each period and divide by the total number of months in the child’s lifetime. Because the specific months of absence are unknown between 2007 and 2009, a parent is assumed to have been away in a given year if he or she was away for six months or more. Thus, if one parent was away for six months or more and the other was away for five months or less, we assume that the child was exposed to the migration of one parent in that year.
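The six-month rule for 2007 to 2009 can be sketched as follows. This is an illustrative Python sketch rather than the authors' code; the function names and the year-level summary are assumptions introduced here.

```python
def parents_away(father_months, mother_months, threshold=6):
    """Number of parents counted as away in a year (0, 1, or 2).

    A parent counts as away if absent for `threshold` months or more.
    """
    return int(father_months >= threshold) + int(mother_months >= threshold)


def yearly_exposure_shares(months_by_year):
    """Share of observed years with one parent away and with both away.

    months_by_year: list of (father_months, mother_months), one per year.
    """
    counts = [parents_away(f, m) for f, m in months_by_year]
    n = len(counts)
    return {
        "one_parent": counts.count(1) / n,
        "both_parents": counts.count(2) / n,
    }
```

A year in which the father was away for six months and the mother for five is therefore classified as exposure to the migration of one parent.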
Respondents were asked for individual test scores from the previous semester, together with the full score for each subject. The ratio of the individual score to the full score is used as our outcome variable.
Distance to public facilities is coded using five categories: (1) <2km, (2) 2–5km, (3) 5–10km, (4) 10–20km, and (5) >20km.
The height-for-age z score is created using parameters from the Centers for Disease Control and Prevention (CDC) 2000 Growth Charts and 2006 World Health Organization (WHO) Growth Charts (de Onis et al. 2007; Kuczmarski et al. 2002; WHO 2006). The results do not change substantively with the choice between the two parameter sources. In this article, we report the estimates based on the CDC parameters because they provide a more suitable reference group for our analysis. In the WHO growth charts, the comparison group for children aged 0–5 comprises children following optimal health practices. Thus, the charts depict the standard that is likely to be realized under optimal conditions, rather than just a reference. However, the same standard was not used for the comparison group for 5- to 19-year-olds, which is instead based on the U.S. reference children used in the 1977 National Center for Health Statistics (NCHS) Growth Charts. On the other hand, the CDC 2000 charts provide a more consistent reference for children in our sample, based on a group of children in the United States. We apply the CDC parameters for infants (based on length) to observations aged 0–24 months, and the parameters for children (based on stature) to observations aged 25–180 months. The transition between these two charts has been made smooth in the 2000 CDC charts (Kuczmarski et al. 2002).
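For readers unfamiliar with how growth-chart parameters translate into z scores, the CDC 2000 charts publish, for each sex and age, a Box-Cox power (L), a median (M), and a generalized coefficient of variation (S). The following Python sketch shows the standard LMS conversion. The function name is hypothetical, and the parameter values in the usage note are invented for illustration and are not actual CDC values.

```python
import math


def lms_z_score(measurement, L, M, S):
    """z score of a measurement given LMS parameters for the child's sex and age.

    z = ((X / M) ** L - 1) / (L * S)   if L != 0
    z = ln(X / M) / S                  if L == 0
    """
    if L == 0:
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1) / (L * S)
```

With made-up parameters L = 1, M = 100, and S = 0.05, a measured height of 110 cm corresponds to a z score of about 2, and a height equal to the median gives a z score of 0.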
Because our anthropometric data are based on reports by parents, there are likely to be measurement errors. Several methods have been suggested to deal with them. The WHO recommends the use of a different formula for children whose z scores are larger than 3 in absolute terms. The CDC training modules contain a note that recommends distinguishing “biologically implausible values” (CDC n.d.), which are observations whose z scores are larger than certain values or lie a certain number of standard deviations away from the mean z score. We report the results that are commonly found regardless of the choice among these alternative adjustments to the outcome measures.
Notes
The idea that human capital acquisition is a cumulative process goes back to Ben-Porath (1967), who provided the theory of individual human capital investment where one chooses the level of time and monetary investments over one’s life cycle. Leibowitz (1974) applied this idea to the investments in children, which include home investments and school inputs at various stages of child development.
In the study, we also attempted to distinguish between the migration of one parent and that of both parents, but because we lack strong instruments, our results are not conclusive in this respect.
Rural-urban migration in China occurs under a “guest worker” system, whereby migrants are unable to settle in cities (see Meng 2012).
Findings regarding the effect of boarding schools are mixed. The low-quality care offered by boarding schools has been reported in REAP (2009). Other work has found that they improve children’s academic skills but worsen some health outcomes (Shu and Tong 2015).
The situation may have improved after significant investment in rural schools after 2009. The authors recently visited some rural schools in one county and found that most schools had new buildings and that the living conditions at these schools had improved significantly.
The first wave is not used because, as we discuss later, two of the key outcome variables were not available in the first wave data, and one of the key questions on the duration of parental migration was asked only in the second wave.
The RUMiC Rural Household Survey uses the National Bureau of Statistics annual household survey sample, which is designed to be representative nationwide as well as within each province. The nine provinces covered in the RUMiC survey are Hebei, Jiangsu, Zhejiang, Anhui, Henan, Hubei, Guangdong, Chongqing, and Sichuan. Of the nine provinces, five are considered to be the predominant migrant-sending provinces (Anhui, Henan, Hubei, Sichuan, and Chongqing), and the number of migrants from these provinces exceeds 50 % of total rural-to-urban outmigration.
The test was not conducted in the remaining four rural sample provinces because of the cost and complexity of conducting the test in rural areas.
The test instruments were designed specifically for the RUMiC project by the Research Institute for Education Statistics and Measurement (RIESM) at Beijing Normal University. The RIESM organized teachers from the nine sample provinces (including the sample province without rural areas) in order to design and test the instruments. They created two types of instruments: one for Years 1–3, and the other for Years 4–6. These tests are designed to take about 30 minutes. University students were sent to the sample households to conduct the test during the months of July through August.
As discussed earlier, in the RUMiC Rural Household Surveys, all individuals who were registered in the household were asked to record all the information. For those who were absent at the time of the survey, the household main respondent reported on their behalf, except for subjective questions.
For example, one-third of those who have ever migrated and reported the initial year of migration in the three waves of the RUMiC data spent the entire previous year in rural homes. Thus, parents with migration experience might have had years without migration between the initial year and 2006.
Rural individuals who were working in cities in 2008 had spent an average of 2.4 years there since the start of their current migration spell. On the other hand, they had spent an average of 7.1 years there since the start of their first migration spell, which occurred before the current migration spell for those with the experience of multiple migration spells. This difference indicates that, on average, these parents had more than one episode of migration, with their first migration spell starting in 2000 and the current spell starting in 2005.
The RUMiC defines children as individuals aged 0–15 years. After they turn age 16, they are classified as adults under the RUMiC framework, and their test scores and other school-related information are no longer collected.
For the small number of cases (10, 14, 78, and 89 for height, weight, the Chinese test score, and the math test score, respectively) with obvious reporting errors, we replaced the original data with the likely values. For instance, we use the adjusted score of 70 when the original reported score was 700 but the full score was 100; we use the average between 120 and 125 when the original height was reported to be 120cm in 2008, 600cm in 2009, and 125cm in 2010. Excluding these cases with obvious reporting errors does not change our regression results.
Thus, if a child has two data points on health, for example, his/her health measure is constructed as the mean value of the two data points. However, if a child has only one data point, be it in 2009 or 2010, his/her health measure is the nonmissing data point.
Similarly, exposure up to 2010 requires information from all three waves.
The 2008 wave contains 4,548 children, 142 (3.1 %) of whom attrited by 2009.
The survey team within NBS changed between 2009 and 2010, resulting in 11 villages from the 2009 wave not being included in the 2010 wave; instead, 8 new villages were included in the 2010 wave. In total, 19 villages were not matched between the two waves.
The rate of divorce has been rising in China recently, but the share of children with divorced parents in the RUMiC survey is very low, at approximately 1 % in all three survey waves.
Approximately 1 % and 5 % of fathers and mothers, respectively, have missing values in the years of schooling variable. We coded them as 0 years of schooling and used a dummy variable to identify this group.
The error term, νijt, is assumed to be independent across households in the basic OLS estimation. In the IV estimation, we assume that it is independent across groups defined by county and cohort for each parent. As discussed later, our IVs vary across counties and parental cohorts.
We focus on spring months because March through May (or in Chinese calendar terms, Chun Fen to Xiao Man) is the period believed to be crucial to the year’s harvest.
To assign the nearest weather station to each of our sample counties, we marked all the counties and the stations in Google Earth based on their latitude and longitude and then measured the straight-line distances between them.
Unfortunately, we do not have good measures for the inputs into children’s health production function.