In the past century, China has undergone rapid and dramatic social and economic changes. This article describes trends in educational assortative marriages of cohorts born in 1906–1995 in China. We measure educational attainment relatively as an individual's percentile position in the education distribution of a 10-year birth cohort and study trends using comparable, easy-to-interpret couple rank-rank correlations. We analyze microdata samples from the 1982, 1990, 2000, and 2010 China censuses and the 2015 1% intercensus survey and nationally representative surveys between 1996 and 2018. We find a large and steady increase in educational assortative marriage over the past century, except among those born in 1946–1965, whose schooling and marriage were impacted by the Cultural Revolution. Our study highlights the critical roles of social, political, and economic contexts in shaping trends in educational assortative marriage.
Assortative marriage refers to the tendency of marriage to occur between individuals with similar characteristics. The extent of assortative marriage by socioeconomic status (SES) is an important indicator of social openness (e.g., Blau and Duncan 1967; Blossfeld 2009; Hout 1982; Kalmijn 1991a, 1998; Smits 2003; Smits and Park 2009; Smits et al. 1998). Marriage within the same status group, or homogamy, limits interaction and mobility between groups, exacerbating social inequality within and between generations (for a review, see Schwartz 2013).
Educational assortative marriage is a widely studied type of assortative marriage (Blossfeld 2009; Kalmijn 1998; Mare 1991; Schwartz 2013). In modern societies, while generally considered an achieved characteristic, educational attainment mediates the influence of family background on economic outcomes (Blau and Duncan 1967; Breen and Jonsson 2005; Hout and DiPrete 2006; Treiman 1970). For mate selection, educational attainment is a strong predictor of the socioeconomic prosperity and cultural affinity of a potential spouse (Kalmijn 1991a; Thornton et al. 1995; Xie et al. 2003). Educational institutions are common venues where young adults interact while considering marriage (Mare 1991). An extensive literature documents changes in educational assortative marriage in the West as a society undergoes educational expansion, experiences increasing returns to schooling, and witnesses changing economic roles of husband and wife (Esteve et al. 2012; Oppenheimer 1994; Smits et al. 1998, 2000; Sweeney and Cancian 2004). However, systematic longitudinal research on how patterns in educational assortative marriages change throughout modernization and in response to rapid economic development, particularly in non-Western settings, remains limited.
The historical context in which educational assortative marriage has evolved in China—and probably many other parts of the developing world—differed qualitatively from those of Western settings. Chinese traditional education had been delivered in private domains for centuries, mainly through family tutors, academies, or lineage schools. Most women were prevented from attaining any systematic education. Until the early twentieth century, public schools offering Western-style modern education were rare in China (Liang et al. 2017). Only a small, select group of young persons received higher education (Elman 2013; Liang et al. 2017). Also, parents traditionally arranged marriages in China (e.g., Lavely 1991; Xu and Whyte 1990). However, since the mid-twentieth century, China has experienced ideological changes regarding gender inequality and freedom of choice in marriage, as well as rapid educational expansion and economic development, particularly after 1978 (Bauer et al. 1992; Hannum 1999; Hannum and Xie 1994; Lavely et al. 1990; Treiman 2013; Wu 2010; Yu and Xie 2015).
Meanwhile, China has witnessed societal changes in state regime and ideology before and after 1949; the prereform period (1949–1978) was characterized by a centralized economy and egalitarian redistribution policies, and the postreform period (1979–present) by a market economy and rising inequality. Previous research has already shown that large, abrupt interruptions of the social stratification order resulted from the 1949 Communist Revolution (e.g., Treiman and Walder 2019; Whyte et al. 1977; Xie and Zhang 2019) and the 1966–1976 Cultural Revolution (Bian 2002; Deng and Treiman 1997; Song 2009; Walder 1989; Zhou and Hou 1999). A large literature has also explored social changes and their consequences for inequality in the postreform period (e.g., Wu 2019; Xie et al. 2022; Xie and Zhou 2014). Hence, through dramatic social and economic transitions in China, opportunities, preferences, and behaviors regarding educational assortative marriage may have changed substantially.
This study is among the first to offer an extensive examination of long-term trends in educational assortative marriage for Chinese cohorts born from 1906 to 1995. Using large-scale surveys and census microdata samples, we apply a novel percentile rank-rank correlation approach to overcome the methodological challenge of studying such long-term trends amid structural changes in a developing society. From a longitudinal, comparative perspective, our findings also contribute to a better understanding of how social closure in general and educational assortative marriage in particular are shaped by large societal changes in one of the world's largest developing societies.
Theoretical Issues and Post-1900 China Settings
Social science research has long been concerned with mate selection for marriage and has proposed various explanations for assortative mating patterns (for reviews, see Blossfeld 2009; Kalmijn 1998; Schwartz 2013). However, explanations of static or steady-state educational assortative marriage patterns cannot be used to explain long-term trends amid substantive social changes. As detailed later in this section, two major theoretical strands in the literature stand out as candidates for explaining long-run variations in educational assortative marriages: the modernization thesis and the economic inequality thesis.
However, empirical evidence has not consistently supported the predictions of these two theses (Blossfeld 2009; Kalmijn 1998; Schwartz 2013). One reason is the difficulty of finding longitudinal, systematic micro-level data on changes in educational assortative mating patterns throughout modernization or across inequality shifts in a society. The lack of historical data for many societies has prevented observations of the evolution of assortative marriages since early modernization. To compensate for such longitudinal data deficiency, many studies have made inferences about long-term temporal variation based on cross-national spatial variation in modernization levels among contemporary societies (e.g., Smits et al. 1998, 2000). Others have examined historical patterns limited to a few European regions in the eighteenth and nineteenth centuries for which historical data are available (for a review, see van Leeuwen and Maas 2010). Comparative research has revealed large variations in assortative marriage patterns over time and across places (e.g., Hou and Myles 2008; Monaghan 2015; Smits and Park 2009; Torche 2010). In particular, substantial differences and changes have been observed in non-Western societies, such as those in East Asia and Latin America (e.g., Raymo and Xie 2000; Smits and Park 2009; Torche 2010), highlighting the need to consider society-specific contexts.
Hence, new empirical evidence describing single-country long-term trends in educational assortative marriage is particularly valuable for the ongoing debates surrounding the modernization and economic inequality theses. More attention should be paid to those rapidly developing countries where marriage patterns in early modernization are still observable and where norms and contexts differ from the more thoroughly examined West. In the remainder of this section, we discuss the two theses and their implications for the Chinese context. We summarize existing knowledge about trends in Chinese educational assortative marriage in the final subsection.
The Modernization Thesis
Many facets of a society undergo tremendous changes during what is commonly called modernization, the transition from a traditional to a modern society (e.g., Inglehart 1997). Examples include, but are not limited to, industrialization, urbanization, educational expansion, increased geographic mobility, improvements in transportation and communication, separation of work and family, increased individual freedom and liberty, declining mortality and fertility, and improvement in women's status. Modernization also has important implications for status hierarchy and social fluidity (e.g., Treiman 1970). In a modern economy, typically a market economy that values efficiency, individual workers are supposed to be allocated to positions based on achieved attributes, such as job skills and qualifications, rather than functionally irrelevant ascribed attributes, such as family origin (Blau and Duncan 1967). Thus, one fundamental change with modernization is the rising importance of individual achievement, especially educational attainment, over family background (Treiman 1970). Nonetheless, the extensive literature on social mobility trends provides only inconsistent and nuanced evidence supporting the predicted effects of modernization (for reviews, see, e.g., Hout and DiPrete 2006; van Leeuwen and Maas 2010).
How does modernization shape educational assortative marriages? One widely accepted prediction is that as a society modernizes, assortative marriage on educational attainment increases (Goode 1970; Kalmijn 1991a, 1991b; Smits et al. 1998, 2000). Because industrialization requires professional knowledge and skills accumulation, education becomes important in determining an individual's SES. Industrialization increases educational homogamy through three mechanisms: (1) education expansion means that young persons are likely to meet potential spouses in school settings while completing their education (Mare 1991); (2) highly educated persons may wish to marry similarly educated spouses to maximize joint economic outcomes (Schwartz and Mare 2012); and (3) modernization differentiates people in sociopsychological domains by educational level, such that a man and a woman with the same educational level have more similar values and beliefs and are more compatible as marriage partners (Kalmijn 1991a, 1991b).
However, two other modernization hypotheses predict different trends (Smits et al. 1998, 2000). First, the romantic love hypothesis predicts decreasing trends in educational homogamy. As an indicator of social openness, educational homogamy should decline along with modernization because sorting on family background gives way to (supposedly SES-blind) romantic love. Second, the inverted U-curve hypothesis attempts to unify the two seemingly contradictory trends discussed earlier by suggesting that educational homogamy increases in the early stages of modernization but decreases in the later stages.
Social Transformation and Modernization in China
Chinese society has experienced several major transformations since the early twentieth century. The imperial Qing Dynasty was ended by the Xinhai Revolution in 1911, when China was still a premodern agrarian society. The ensuing Republican era saw early modernization, but development was hindered first by weak state capacity owing to frequent leadership turnovers and the interference of warlords and foreign powers and then by the Sino-Japanese War (1937–1945) and the Chinese Civil War (1946–1949). In 1949, the Communist Revolution culminated in the founding of the People's Republic of China (PRC). Under a centrally planned economy in the following three decades, the PRC initiated policies and campaigns for rapid industrialization and modernization: land reform; the establishment of the rural–urban hukou (household registration) system; agriculture collectivization; nationalization of banking, trade, and industry; and education expansion. After the turmoil of the Cultural Revolution (1966–1976), the economic reform that began in 1978 replaced the redistributive, centralized economy with a market economy. Since then, China has experienced more than four decades of rapid and steady economic growth.
Several macroeconomic indicators characterize China's rapid modernization. According to the Maddison Project Database version 2018 (Bolt et al. 2018), China's real GDP per capita increased from approximately 800 in 1911 to 12,002 in 2015 (both in 2011 U.S. dollars), a 15-fold increase. Data from the National Bureau of Statistics of China indicate that from 1952 to 2015, the percentage of the working population in primary industry dropped from more than 80% to 28%, whereas the proportion in secondary industry increased from approximately 7% to 29%. In 2015, more than 42% of individuals in the country's working population were in the service industry. Alongside the shift of the working population from farming to nonfarming sectors, the country has rapidly urbanized. In 1949, urban residents accounted for just 10% of the national population, a share that rose to 56% by 2015.
Educational expansion is a central aspect of modernization in our study. Education at all levels expanded dramatically over the twentieth century in China. Educational attainment has risen steadily, and the gender gap in attainment has narrowed. In Figure 1, our calculation based on the 1982, 1990, 2000, and 2010 censuses and the 2015 1% intercensus survey (also called the “mini-census”) microdata samples reflects the shifting distribution of educational attainment among husbands and wives by birth cohort. Among those born between 1906 and 1915 (the 1910 cohort), more than 60% of men and 90% of women were illiterate, and very few attained an education higher than primary school. Over the successive birth cohorts, the proportion of the population that was illiterate shrank and the proportion that was educated expanded rapidly.
Studies have documented historical and political contexts for changes in educational attainment (Deng and Treiman 1997; Hannum 1999; Treiman 2013; Wu 2010). Educational distribution and its temporal changes are consistent across married and unmarried groups in the general population. The Cultural Revolution marked a critical period during which the educational distribution deviated from generally smooth trends toward educational expansion (Treiman 2013). Between 1966 and 1972, most colleges and universities were shut down. While a small number of students were allowed in colleges and universities toward the end of the Cultural Revolution, admission was based on political nomination rather than examination. As a result, education was massively disrupted for most Chinese youth during the period. Moreover, political class labels became critical in determining educational opportunities, which particularly reduced the educational advantages of children from educated, professional families (Deng and Treiman 1997; Xie and Zhang 2019). However, the proportions of the population receiving middle and high school education increased, partly owing to the expansion of new schools in rural areas (Hannum 1999; Lavely et al. 1990; Treiman 2013). Gender inequality in educational attainment was also reduced (Hannum and Xie 1994). These changes are reflected in Figure 1, with the increased proportions of middle and high school education groups in the 1960 cohort and the low proportions of junior college or higher education groups among the 1950 and 1960 cohorts.
In sum, relating China's modernization processes to the modernization thesis, we predict an overall increasing trend in educational assortative marriages over the past century. Given that average education is still much lower in China than in developed countries, the possibility of observing decreasing or inverted U-shaped trends in later periods should be low. Furthermore, given the shock of the Cultural Revolution, we may observe a temporary aberration from the overall increasing trend: a relative decline in educational assortative marriages among the affected cohorts.
The Economic Inequality Thesis
Economic inequality is regarded as an important cause of varying patterns of educational homogamy (Schwartz 2013), although it is mainly referenced to explain trends in recent decades rather than during modernization per se. Rising inequality enlarges economic distances between social groups and alters preferences regarding intermarriage. When returns to schooling increase, the cost of marrying someone with low education becomes more consequential to a family's well-being. Hence, the thesis predicts that rising economic inequality leads to increases in educational homogamy.
Also important to consider are the different perspectives of the husband and wife in the context of changing economic inequality (Becker 1981; Fernández et al. 2005; Schwartz 2013; Sweeney and Cancian 2004). Although men's educational attainment may have long been a desirable socioeconomic characteristic for women choosing men, women's educational attainment gained importance for men choosing women only in recent decades. Women's educational attainment rapidly improved to a level reaching parity with or surpassing men's (DiPrete and Buchmann 2013; Esteve et al. 2016; van Bavel et al. 2018). Further, female labor force participation has increased, and the gender gap in pay has declined (Goldin 2006, 2014). In choosing marriage mates, men desire women with high educational attainment who can contribute significantly to the family's economic well-being (Esteve et al. 2012).
Shifting Levels of Economic Inequality in China
China's economic inequality levels in the past century can be broadly characterized in terms of three distinct periods: seemingly moderate before 1949, relatively low in 1949–1979, and increasingly high since 1979. Before 1949, agriculture dominated gross domestic production. Although, to our knowledge, systematic time series data on economic inequality for the earlier period do not exist, Brandt and Sands (1992) provided comprehensive cross-sectional evidence about 1930s China. The land concentration in the 1930s between Chinese households did not differ qualitatively from that in the eighteenth and nineteenth centuries and was comparable with or better than levels in 1920s Mexico, Victorian Great Britain (excluding London), and the United States in 1798 and 1860 (Brandt and Sands 1992:183–384). Moreover, the Gini coefficient for household income inequality was approximately .38, which was modest compared with that of several comparable low-income societies (e.g., Brazil, Colombia, India, Malaysia, Singapore, and Thailand) in the 1960s and 1970s (Brandt and Sands 1992:202–204).
Between 1949 and 1979, China established its centrally planned redistributive economy through egalitarian campaigns and policies, which substantially changed social stratification and daily lives (Bian 2002; Walder 1989; Whyte et al. 1977). In rural areas, land reform reduced inequality in landownership by redistributing land to peasants. According to evidence from several communes in north and northeast China, for example, land distribution during this period was relatively equal (Noellert 2020). In the late 1950s, collectivization essentially abolished private property rights in rural China. Meanwhile, in urban areas, the state-owned work-unit (danwei) system organized residents' daily lives, including work, housing, and access to public services such as education and health care (Walder 1986; Xie and Wu 2008). Private economic activities were mostly forbidden. Earnings per se were largely equal among urban employees. The World Income Inequality Database compiled by the United Nations University World Institute for Development Economics Research (version 2022) reported Gini coefficients for income inequality of .56 in 1953 and approximately .3 between 1964 and the early 1980s.
China's economic reform and marketization that began in 1979 increased social inequality (Wu 2019; Xie and Zhou 2014). Starting from the early 1980s, the new Household Responsibility System de-collectivized rural production, and the country marketized the economy and opened up to foreign investment (Bian 2002; Nee and Matthews 1996). Coinciding with rapid economic development, the family income Gini coefficient increased rapidly, from approximately .3 in 1985 to more than .5 in 2012, with a widening rural–urban gap and regional inequality (Xie and Zhou 2014).
Thus, the economic inequality thesis would predict that Chinese educational assortative mating increased from the period of planned economy and low inequality (ca. 1949–1979) to the period of marketization and rising inequality (post-1979). Prima facie, a decline in educational assortative marriages from the early to the mid-twentieth century would be expected because inequality was higher before 1949 than during 1949–1979. However, deriving plausible predictions for the pre-1949 period from the economic inequality thesis requires caution. One major concern regards the thesis's applicability to such a period of underdevelopment, particularly considering the very low average education and very high levels of gender segregation in schooling, labor participation, and returns to education. Although status homogamy by family background might be strong because of high inequality, assortative marriages on spouses' educational attainment per se might not have been a significant phenomenon until the PRC era.
Continuity and Change in Chinese Assortative Marriages
Chinese marriages are comparatively universal and occur at young ages, especially for females (Raymo et al. 2015), a pattern that has been fairly stable from the past to the present (Lee and Wang 1999). Chinese norms encourage status homogamy between families of “matching doors,” that is, families sharing similar SES (Baker 1979; Croll 1981; Ebrey 1991; Lavely 1991). Status hypergamy, in which the husband has higher social status than his wife, was also common in the past because the Chinese patriarchal tradition defined the couple's status based solely on the status of the husband and his family (e.g., Chen et al. 2018; Ebrey 1991). Moreover, most Chinese marriages were arranged (Baker 1979; Parish and Whyte 1978). However, after the Communist Revolution, the 1950 New Marriage Law stipulated that individuals, rather than their families, were to make marriage decisions (Lavely 1991; Xing et al. 2020; Xu and Whyte 1990). Yet, even in postreform China, parents remain involved in marriage decision making, mainly through social and economic support (Riley 1994; Yu and Xie 2015). As the historically very low divorce rate has risen since 2000 (Yu and Xie 2021), evidence suggests that relative to first marriages, Chinese remarriages tend to be more hypergamous in terms of status and age (Hu and Qian 2019).
Education as a criterion for mate selection evolved over the twentieth century in China. Until the early twentieth century, the long-standing civil examination system meant that educational attainment was important for men seeking to climb the social ladder. However, this channel of mobility was closed to women. Women began to attend schools in the Republican period, but women who attended schools were mainly from privileged families (Liang et al. 2017). A systematic study based on administrative data from Shanxi Province in the 1960s (Xing et al. 2020) found that in the mid-twentieth century, educational homogamy and hypergamy were already important types of assortative marriage. Another study documented the prevalence of educational assortative marriage in the 1960s in urban areas (Xu et al. 2000). Education was also important for women in the marriage market in the collectivization period in the 1970s, when better educated women were more likely to marry into wealthier villages in Sichuan Province (Lavely 1991). In sum, although historical quantitative evidence is limited and geographically sparse, it is clear that education has become an increasingly important mate selection criterion for both men and women.
Previous studies examining China's trends in educational assortative marriages mostly focused on different periods from 1950 to 2000 by marriage year (Han 2010; Li 2008; Raymo and Xie 2000; Shi 2019; Song 2009; Xing et al. 2020). Together, they show that Chinese educational homogamy followed a U-shaped curve: it was high in the 1950s–1960s, declined in the 1970s–1980s, and increased again in the 1990s. Song (2009) attributed the decline in educational homogamy in the 1970s–1980s to the education disruption during the Cultural Revolution. By birth year, the revealed U-curve trends roughly correspond to cohorts born from 1930 to 1975. The U-curve trends found in previous research in the second half of the twentieth century are at odds with predictions of the modernization thesis, which predicts increases, declines, or inverted-U-curve trends in educational homogamy.
Given China's unique historical context over those decades, are such observed U-curve trends simply a temporary aberration from the longer term overall trends accompanying modernization? Answering this question requires an extended overview of the trends throughout Chinese modernization, from the early twentieth century to date. Although a few recent studies included cohorts born later than 1975, they either examined regional trends (Hu and Qian 2016; Qian and Qian 2017) or did not focus on long-term national trends in educational homogamy (Hu and Qian 2015; Hu and Qian 2019; Qian and Qian 2014). Systematic evidence remains scant about recent trends in educational homogamy of cohorts born in the 1980s and 1990s, who experienced elevated inequality and the rapid expansion of higher education after 1999. Further, little is known about those born before the 1930s, most of whom married before the PRC era.
Our study aims to document the long-term trends in educational assortative marriages in China over the past century. It is, however, beyond the scope of this study to disentangle various effects of modernization, inequality, and other social and historical factors because their interrelationships are complicated over the long run (e.g., Inglehart 2016; Piketty 2019/2020). The observed trends in educational assortative marriages may reflect their nuanced, joint influences. While we demonstrate without any ambiguity the empirical pattern that educational homogamy has substantially increased in China, we acknowledge different theoretical explanations consistent with the pattern.
Structural shifts during modernization and, in particular, educational expansion make it challenging to compare educational attainment across cohorts and between men and women consistently and meaningfully. When education expands in a population, the selectivity of and returns to attaining a certain level of education vary across cohorts. The gender gap in education may also change quickly along with expanding education. Research has typically defined types of educational assortative marriage (e.g., homogamy, hypergamy, and hypogamy) by comparing nominal levels of the husband's and wife's educational attainment. However, using nominal levels to measure educational assortative marriages throughout modernization may produce a large distortion in the trends: in the presence of substantial distributional changes in education by gender, assortative marriage types by nominal educational levels do not consistently measure the extent of marital sorting on couples' educational status over time. For example, in the 1930 cohort shown in Figure 1, an illiterate couple reflects a homogamous marriage by nominal educational attainment, which means that, roughly on average, a husband who ranks in the bottom 20% among his cohort in educational attainment marries a wife who ranks in the bottom 40%. In contrast, a hypergamous marriage between a husband with a high school education and a wife with a primary school education reflects a homogamous marriage by relative ranks because they both are approximately in the top 10% among males and females in the 1930 cohort. The discrepancy between the two types of assortative marriage measurement decreases in the later cohorts, for whom education further expanded and the gender gap narrowed.
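The contrast between nominal levels and relative ranks can be made concrete with a small numerical sketch. The category shares below are hypothetical values chosen only to resemble the description of the 1930 cohort; they are not the Figure 1 estimates. The helper `midpoint_ranks` is likewise an illustrative name, and placing everyone in a category at the midpoint of that category's percentile band is an assumed convention for handling ties.

```python
def midpoint_ranks(shares):
    """Map ordered category shares (lowest level first) to midpoint percentile ranks.

    Everyone in a category is placed at the middle of the percentile band
    that the category occupies in the cumulative distribution.
    """
    ranks = []
    cumulative = 0.0
    for share in shares:
        ranks.append(100 * (cumulative + share / 2))
        cumulative += share
    return ranks


# Hypothetical shares, NOT the Figure 1 estimates.
# Order: illiterate, primary, middle school, high school or above.
men_shares = [0.40, 0.32, 0.10, 0.18]
women_shares = [0.80, 0.18, 0.015, 0.005]

men_ranks = midpoint_ranks(men_shares)
women_ranks = midpoint_ranks(women_shares)

# Nominally homogamous couple: both illiterate.
illiterate_pair = (men_ranks[0], women_ranks[0])
# Nominally hypergamous couple: high school husband, primary school wife.
mixed_pair = (men_ranks[3], women_ranks[1])
```

Under these assumed shares, the illiterate couple sits at about the 20th and 40th percentiles, a 20-point gap despite identical nominal levels, whereas the high school husband and primary school wife sit at about the 91st and 89th percentiles and are nearly matched in relative terms.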
The log-linear model has been the method of choice for studying long-term trends in assortative mating because it is believed to account for such incomparability between cohorts or periods due to changing marginal distributions. Yet log-linear models require certain parametric assumptions that are often hard to verify empirically. For example, the log-multiplicative layer-effect model assumes the estimated trend to be a varying force of a common association pattern between husbands' and wives' educational attainment (Raymo and Xie 2000; Xie 1992). That assumption appears plausible when studying trends during a short period or in a society experiencing modest changes. However, in studying long-term trends in assortative marriages, especially in the presence of disruptive shifts in educational distribution, it is difficult to separate changes in the strength of couples' educational association patterns from changes in the association patterns themselves (e.g., see van Bavel et al. 2018). Moreover, critical historical events or periods, such as the Cultural Revolution (Deng and Treiman 1997; Song 2009), may alter the social meaning of education and educational assortative mating patterns. Although it is possible to accommodate individual-level covariates in a version of log-linear models through conditional logit models (Zhou and Xie 2019), results are still sensitive to model specification and selection.
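For readers less familiar with this model class, the assumption at issue can be written out. The following is a sketch of the log-multiplicative layer-effect specification in its commonly presented form; the notation (H for husband's education, W for wife's education, C for cohort) is ours and is not taken from the studies cited above.

```latex
% F_{ijk}: expected number of couples with husband's education i,
%          wife's education j, in cohort (layer) k.
\log F_{ijk} = \lambda + \lambda^{H}_{i} + \lambda^{W}_{j} + \lambda^{C}_{k}
             + \lambda^{HC}_{ik} + \lambda^{WC}_{jk} + \phi_{k}\,\psi_{ij}
% \psi_{ij}: association pattern between husbands' and wives' education,
%            constrained to be common to all cohorts.
% \phi_{k}:  cohort-specific strength of that common pattern.
```

In this form, all cohort variation in the association is channeled through the scalar \phi_k, so a change in the pattern \psi_{ij} itself cannot be distinguished from a change in its strength.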
Hence, following recent studies on intergenerational mobility and status exchange in marriage (Chetty et al. 2017; Hannum et al. 2019; Song et al. 2020; Xie and Dong 2021; Xie et al. 2022; Xie and Zhang 2019), we apply an alternative nonparametric approach based on relative percentile ranks to study patterns of educational assortative marriage over the long term. We first measure individual educational attainment, instead of using nominal levels, with percentile ranks relative to the person's same-gender peers from the same birth cohort. We then examine trends in educational assortative marriages, instead of using log-linear models, with cohort-specific rank-rank correlations between the husband's and wife's educational attainment. Building on relative percentile ranks, correlations in education between a couple are, by construction, scale-free and adjusted for changing gender-specific education distributions. They are thus comparable between men and women and across cohorts.
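The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimation code used in the study: the function names and the toy records are hypothetical, ties are resolved with midpoint percentile ranks, and each spouse is ranked only against same-gender spouses in the same group of couples, a simplification of ranking against all same-gender peers in the birth cohort.

```python
from collections import defaultdict
from math import sqrt


def percentile_ranks(values):
    """Midpoint percentile rank (0-100) for each value; tied values share a rank."""
    n = len(values)
    counts = defaultdict(int)
    for v in values:
        counts[v] += 1
    rank_of = {}
    below = 0
    for v in sorted(counts):
        rank_of[v] = 100 * (below + counts[v] / 2) / n
        below += counts[v]
    return [rank_of[v] for v in values]


def pearson(x, y):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def rank_rank_by_cohort(couples):
    """couples: iterable of (cohort, husband_edu, wife_edu) with ordinal codes.

    Returns {cohort: correlation between husbands' and wives' percentile ranks}.
    """
    by_cohort = defaultdict(list)
    for cohort, h_edu, w_edu in couples:
        by_cohort[cohort].append((h_edu, w_edu))
    result = {}
    for cohort, pairs in by_cohort.items():
        h_ranks = percentile_ranks([h for h, _ in pairs])
        w_ranks = percentile_ranks([w for _, w in pairs])
        result[cohort] = pearson(h_ranks, w_ranks)
    return result


# Toy records (hypothetical): education coded as ordered integers.
toy_couples = [
    (1980, 0, 0), (1980, 1, 1), (1980, 2, 2), (1980, 3, 3),
    (1950, 1, 0), (1950, 2, 0), (1950, 3, 1), (1950, 4, 1),
]
toy_result = rank_rank_by_cohort(toy_couples)
```

Because only the ordering of education codes within each gender and cohort enters the calculation, any order-preserving recoding of the categories leaves the correlation unchanged, which is the sense in which the measure is scale-free.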
Data and Cohort Definition
We use 1% microdata samples from the 1982, 1990, and 2000 censuses, a one-per-thousand microdata sample from the 2010 Chinese census, and a 10% microdata sample from the 2015 1% intercensus survey. To further confirm the findings, we also exploit 12 nationally representative large-scale sampling surveys: Life Histories and Social Changes in Contemporary China (LHSCCC) 1996; China General Social Surveys (CGSS) 2003, 2005, 2006, 2008, 2010, 2011, 2013, 2015, 2017, and 2018; and the China Family Panel Studies (CFPS) 2010.
We restrict our analysis to prevailing marriages. Admittedly, prevailing marriages are subject to the selection of marital stability and survival of the couple. However, we need this restriction to include early cohorts in our study because systematic micro-level data on these early cohorts as newlyweds are not available. Unlike in Western countries, Chinese marriages have been virtually universal, and the divorce rate has remained very low until very recently (Raymo et al. 2015; Yu and Xie 2021). Consequently, any selection on observed prevailing Chinese marriages would be driven mainly by the differential survival chances of one or both spouses by educational attainment. Such a selection bias, as suggested by a supplementary analysis reported later, should not alter our main findings.
We use a birth cohort perspective for studying trends in educational assortative marriage. One practical reason is that information on marriage timing and order is missing from most of our data sets. To fully utilize the data to study long-term trends, we prioritize the robustness of our findings with multiple data sources. Because the marital age distribution is more concentrated for wives than for husbands in China, we operationalize the research by grouping marriages based on the wife's 10-year birth cohort, centering on each decadal year. For example, the 1980 cohort in our study refers to those born in 1976–1985.
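As a minimal sketch of the cohort grouping described above, the following maps a birth year to the decadal cohort it is centered on. The function name `birth_cohort` is hypothetical and introduced here only for illustration.

```python
def birth_cohort(birth_year: int) -> int:
    """Map a birth year to its 10-year cohort label.

    Each cohort spans (label - 4) through (label + 5), so births in
    1976-1985 fall into the cohort labeled 1980.
    """
    return ((birth_year + 4) // 10) * 10


# Illustrative birth years around the 1980 cohort boundaries.
for year in (1975, 1976, 1980, 1985, 1986):
    print(year, "->", birth_cohort(year))
```

Under this rule, the full span of 1906–1995 births maps onto the nine cohorts labeled 1910 through 1990.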
Another substantive consideration for using the birth cohort perspective is that Chinese society has changed so rapidly that different Chinese birth cohorts have distinct educational opportunities, values, experiences, and life trajectories. In particular, the expansion of education has produced highly differentiated educational distributions by birth cohort. Figure A1 (available in the online appendix, along with all tables and figures designated with an “A”) shows a Lexis diagram presenting critical historical events experienced by different birth cohorts at different ages. Individuals in the 1920 cohort were born right after the end of the Qing Dynasty and faced uncertain educational prospects because of the abolition of the long-lasting imperial civil examination system. During the Sino-Japanese war (1937–1945), individuals in the 1920 cohort were of marital age, those in the 1930 cohort were enrolled in school, and those in the 1940 cohort had just been born. Most members of the 1950 cohort were married during the Cultural Revolution (1966–1976). The education of the 1960 cohort, many of whom were born during the Great Leap Forward campaign and the associated famine (1958–1961), was disrupted by the Cultural Revolution. Living standards and educational opportunities of the 1970 cohort substantially improved thanks to the resumed national college entrance examination system (the gaokao) and the economic reform since the late 1970s. The 1980 and 1990 cohorts were impacted by China's strict family planning policies but later enjoyed further improved opportunities for higher education owing to college expansion in 1998, as well as continuous marketization and rising economic inequality.
In constructing our census analytic sample of prevailing marriages, when multiple census waves are available for a birth cohort, we choose the wave collected when cohort members were in, or closest to, their 30s and 40s. At such ages, most individuals have already attained their highest educational level and have suffered the least from health and survival differences by education. We use data from the 1982 census for the 1910–1940 cohorts,² the 1990 census for the 1950 cohort, the 2000 census for the 1960 and 1970 cohorts, and the 2010 and 2015 (mini-)censuses for the 1980 and 1990 cohorts. To construct our survey analytic sample, we pool all surveys and partition them into 10-year birth cohorts. In addition, we restrict both census and survey analytic samples to marriages in which both partners were aged 20 or older in the selected census wave or survey.
Percentile Rank Measure of Educational Attainment
We measure an individual's educational attainment as a relative positional status: the percentile rank among the individual's same-gender peers in the same 10-year birth cohort. To construct this rank measure, we use the microdata samples of the 1982, 1990, 2000, and 2010 China censuses and follow the same birth cohort definition and data selection as introduced in the previous section. We sort all individuals, regardless of their marital status, on educational attainment from the lowest to the highest attainment to obtain the gender- and cohort-specific educational distribution. Within each birth cohort and gender group, the cumulative percentage from the lowest to the individual's nominal level of educational attainment indicates their relative status. The rank theoretically ranges from 0 to 100, with higher values meaning higher status.
To accommodate the most detailed coding schemes available in both our census data and survey data, we distinguish educational attainment using most or all of the following nominal levels: illiterate, primary school, middle school, high school, professional college, university, and postgraduate. Specifically, the 1982 and 1990 censuses enumerated the first six categories, whereas the 2000 and 2010 censuses and the 2015 mini-census included all seven educational attainment categories. Because the educational attainment measure is categorical, individuals of the same educational attainment have tied ranks. We use the midpoint percentile rank of each attainment level for all its members, specific to gender and birth cohort. This method for handling ties in ranking is conventional in the statistical literature (see, e.g., Powers and Xie 2000), which implies that the midpoint rank of a group reflects the average standing of all its members. We then assign the resulting percentile ranks to husbands and wives in our analytic samples from their nominal educational attainment level, gender, and birth cohort. Variance in educational ranks for husbands and wives increased from the 1910 cohort to the 1960 cohort because of educational expansion and fluctuated for those in the 1970–1990 cohorts at levels similar to that of the 1960 cohort.
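The midpoint percentile rank can be illustrated with a short sketch. The function name `midpoint_percentile_ranks` and the toy counts are hypothetical; the category labels follow the seven nominal levels listed above, and the function is meant to be applied separately to each gender-by-cohort group.

```python
from collections import Counter

# Nominal levels in ascending order, as listed in the text.
LEVELS = [
    "illiterate", "primary school", "middle school", "high school",
    "professional college", "university", "postgraduate",
]


def midpoint_percentile_ranks(observed_levels):
    """Return {level: midpoint percentile rank} for one gender-by-cohort group.

    The rank of a level is the cumulative percentage of people below it
    plus half of the percentage of people at that level.
    """
    counts = Counter(observed_levels)
    total = sum(counts.values())
    ranks = {}
    below = 0
    for level in LEVELS:
        n = counts.get(level, 0)
        if n:
            ranks[level] = 100.0 * (below + n / 2) / total
        below += n
    return ranks


# Hypothetical group: 50 illiterate, 30 primary school, 20 middle school.
group = ["illiterate"] * 50 + ["primary school"] * 30 + ["middle school"] * 20
print(midpoint_percentile_ranks(group))
```

In this toy group, the illiterate occupy the bottom 50% of the distribution and therefore share the midpoint rank of 25; the same nominal level would receive a different rank in a cohort with a different educational distribution.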
Rank-Rank Correlation of Educational Assortative Mating
We estimate the correlation between the husband's and wife's educational ranks to examine the assortativeness of marriages by birth cohort. The estimated rank-rank correlation coefficient, a straightforward nonparametric statistic, indicates the closeness between the husband's and wife's educational statuses. Being scale-free, rank-rank correlation coefficients are comparable across cohorts despite substantial changes in the marginal distributions of the husband's and wife's educational attainment. These cohort-specific correlation coefficients reveal the long-term trends in educational assortative marriage in China.
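A minimal sketch of the cohort-specific calculation follows, assuming each couple is stored as a tuple of the wife's cohort label, the husband's percentile rank, and the wife's percentile rank. The function names and the toy couples are hypothetical and serve only to illustrate the computation.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_rank_by_cohort(couples):
    """couples: iterable of (wife_cohort, husband_rank, wife_rank) tuples."""
    husband = defaultdict(list)
    wife = defaultdict(list)
    for cohort, h_rank, w_rank in couples:
        husband[cohort].append(h_rank)
        wife[cohort].append(w_rank)
    return {c: pearson(husband[c], wife[c]) for c in sorted(husband)}


# Hypothetical couples; the ranks are illustrative, not estimates from the data.
couples = [
    (1910, 25.0, 25.0), (1910, 25.0, 65.0), (1910, 65.0, 25.0), (1910, 65.0, 65.0),
    (1980, 25.0, 25.0), (1980, 65.0, 65.0), (1980, 90.0, 90.0),
]
print(rank_rank_by_cohort(couples))
```

Because the inputs are already percentile ranks within gender and cohort, the ordinary Pearson formula applied to them yields the rank-rank correlation.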
Extending this approach from bivariate correlation to multiple regression is straightforward. When introducing covariates to adjust for confounders, what we estimate is instead the rank-rank regression slope (or partial correlation) coefficient for the level of educational assortative marriages. We report rank-rank correlations as the main findings. Our supplementary analysis includes the estimation of rank-rank slopes controlling for the husband's and wife's ages to partly account for the survival selection within each birth cohort, as well as the confounding age effects on union formation in China (Hu and Qian 2019; Mu and Xie 2014; Qian and Qian 2014; Yu and Xie 2015). Also, when pooling all census waves or surveys as alternative data strategies, we conduct supplementary analyses to estimate rank-rank slopes with the census wave or survey fixed effects as additional controls for systematic differences between data sources.
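The covariate-adjusted version can be sketched as an ordinary least-squares regression. The sketch assumes, for illustration only, that the wife's rank is the outcome and that the husband's rank and both spouses' ages enter linearly; the text does not specify which spouse's rank is the outcome. The helper `ols` and the toy data are hypothetical.

```python
def ols(y, columns):
    """Least-squares coefficients of y on an intercept plus the given columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination;
    adequate for a handful of well-conditioned covariates.
    """
    X = [[1.0, *row] for row in zip(*columns)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * v for r, v in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical couples in one cohort: ranks in percentiles, ages in years.
wife_rank = [22.0, 35.0, 48.0, 66.0, 81.0, 30.0]
husband_rank = [10.0, 30.0, 50.0, 70.0, 90.0, 20.0]
husband_age = [30.0, 42.0, 35.0, 50.0, 38.0, 45.0]
wife_age = [28.0, 40.0, 36.0, 45.0, 39.0, 41.0]

coef = ols(wife_rank, [husband_rank, husband_age, wife_age])
print("rank-rank slope net of ages:", coef[1])
```

Census-wave or survey fixed effects could be added in the same way by appending dummy-variable columns, omitting one reference category.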
We first focus on results based on the census microdata samples. Figure 2 displays rank-rank correlation coefficients between the husband's and wife's educational percentile rank by the wife's birth cohort, with the numerical results reported in Table A1. We find that educational assortative marriages generally increased over time for Chinese cohorts born from 1906 to 1995. The estimated couple rank-rank correlation coefficient increases from .25 for the 1910 cohort to .77 for the 1980 and 1990 cohorts.
Despite the overall increasing trends, educational assortative marriages declined among individuals in the 1950 cohort, most of whom experienced the Cultural Revolution during their schooling ages or their prime marriage ages. Moreover, the level of educational assortative marriage in the 1960 cohort recovered to that of the 1940 cohort. If we narrow our observation, say, to the 1930–1970 cohorts, the trends appear to be in line with the U-curve trends in the corresponding 1960–2000 marriage cohorts found in previous studies (e.g., Han 2010; Shi 2019; Song 2009). However, by extending the trends to the previously unexamined earlier and later cohorts, we find that such U-curve trends should be regarded as a temporary aberration from the trend of a general increase in educational assortative marriage. The short-term decline in educational assortative marriage probably reflects the influence of disruption and devaluation of education during the Cultural Revolution, as well as the general societal context of egalitarian policies and political campaigns that leveled economic inequality in the first three decades after 1949. The subsequent increase in educational assortative marriage for the 1980 and 1990 birth cohorts also supports the argument derived from the economic inequality thesis, given the dramatic expansion of higher education and rising economic inequality.
Further analysis using the pooled data of 12 nationally representative surveys collected between 1996 and 2018 generally confirms the foregoing findings. Because the surveys were conducted recently, few respondents were born in early cohorts. Given the limited data capacity, we attempt to make good use of available information by studying two groups of couples: respondents and their parents. We estimate the couple rank-rank correlation coefficients between respondents and their spouses for the 1940–1990 cohorts and between respondents' parents for the 1930–1960 cohorts. The latter, albeit statistically unrepresentative of the national population, gives us some information on the early cohorts as an alternative to census microdata. All estimations are weighted using individual survey weights, which are standardized by dividing them by each survey's mean weight (see, e.g., Xie et al. 2022). As shown in Figure 2 (and numerically in Table A1), the estimated coefficients based on pooled survey data suggest trends that are similar to those based on the census microdata. This similarity holds for both the overall increasing trends in the past century and the specific decline–recovery–rise pattern of trends from the 1940 cohort onward.
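The weight standardization and the weighted correlation can be sketched as follows. The record layout, the survey labels "A" and "B", and the function names are hypothetical; the sketch assumes that a weighted Pearson correlation of percentile ranks is an adequate stand-in for the weighted estimation described above.

```python
from collections import defaultdict
from math import sqrt


def standardize_weights(records):
    """Divide each record's weight by the mean weight of its own survey."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec["survey"]] += rec["weight"]
        counts[rec["survey"]] += 1
    means = {s: sums[s] / counts[s] for s in sums}
    return [rec["weight"] / means[rec["survey"]] for rec in records]


def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy / sqrt(sxx * syy)


# Hypothetical pooled records from two surveys with very different weight scales.
records = [
    {"survey": "A", "weight": 2.0, "h_rank": 25.0, "w_rank": 30.0},
    {"survey": "A", "weight": 4.0, "h_rank": 65.0, "w_rank": 55.0},
    {"survey": "B", "weight": 100.0, "h_rank": 90.0, "w_rank": 85.0},
    {"survey": "B", "weight": 300.0, "h_rank": 40.0, "w_rank": 60.0},
]
w_std = standardize_weights(records)
r = weighted_corr(
    [rec["h_rank"] for rec in records],
    [rec["w_rank"] for rec in records],
    w_std,
)
print(w_std, r)
```

After standardization, the weights average to 1 within each survey, so a survey whose raw weights happen to be on a larger scale does not dominate the pooled estimate.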
Comparison With Conventional Log-Linear Models
For comparison, we estimate 11 log-linear models, reported in Table A2. We collapse educational attainment into five nominal levels (i.e., illiterate, primary school, middle school, high school, and college or above) to avoid data sparsity in certain cells. The resulting analytic table has 225 cells (i.e., 5 husband's education levels × 5 wife's education levels × 9 wife's birth cohorts) with a total sample size of 3,891,492 marriages.³
First, following Xie (1992), we fit a class of five association models. These are quasi-association models: main diagonal cells are blocked with dummy variables accounting for the prevalence of homogamous marriages, akin to inheritance effects in intergenerational mobility studies. Model 1 is the null association (NA) model assuming no association between the husband's education (H) and the wife's education (W) given the wife's cohort (Y), controlling for the marginal distribution of H, W, and Y and the two-way interactions of HY and WY. Models 2 and 3 are two versions of the row and column effects association II model (Goodman 1979), commonly referred to as the RC model (Powers and Xie 2000). One attraction of the RC model is that while assuming the association of the row and column variables through latent scores, it requires neither the value nor the correct ordering of scores associated with row and column variables. Rather, the RC model treats the ordering and relative distances of row and column scores as unknown and to be revealed from estimation. This property is particularly relevant to our study setting, given the substantial changes in the social value of education and in (gender-specific) educational mating preferences. Model 2 (RC(o)) specifies a cross-cohort homogeneous pattern of HW association. Model 3 (RC(x)) allows the strength of HW association to vary log-multiplicatively across cohorts. Models 4 and 5 include full-interaction terms for the two-way HW association. The two models differ in specifying the three-way interaction between H, W, and Y: no three-way interaction effects in Model 4 (FI(o)) and log-multiplicative layer effects in Model 5 (FI(x)), which is also often referred to as the Unidiff model (Erikson and Goldthorpe 1992). According to the goodness-of-fit statistics reported in Table A2, none of these models fit the data well, nor are they preferred over the saturated model. 
Among these models, the two log-multiplicative layer-effects models have the lowest BIC statistics: 4,424.8 with 84 degrees of freedom for the RC(x) model and 4,167.2 with 80 degrees of freedom for the FI(x) model.
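For readers less familiar with this model class, the general form of a log-multiplicative layer-effects specification can be sketched as follows. The notation is assumed for illustration and is not taken from Table A2: F_{ijk} denotes the expected frequency for husband's education i, wife's education j, and wife's cohort k, and the diagonal-blocking terms are omitted for brevity.

```latex
\log F_{ijk} = \lambda + \lambda^{H}_{i} + \lambda^{W}_{j} + \lambda^{Y}_{k}
             + \lambda^{HY}_{ik} + \lambda^{WY}_{jk} + \psi_{ij}\,\phi_{k}
% FI(x): \psi_{ij} is left unrestricted (full two-way HW interaction).
% RC(x): \psi_{ij} = \mu_{i}\,\nu_{j}, with row and column scores estimated from the data.
```

In this form, \psi_{ij} describes the pattern of HW association common to all cohorts, and \phi_{k} is the single cohort-specific parameter that scales its strength.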
Then, we fit six additional log-linear models following Schwartz and Mare's (2005) study of trends in American educational assortative marriage. Model 6 accounts for the marginal distributions and two-way interactions of H, W, and Y, assuming no variation in HW association across Y. Model 7, the homogamy trends model (Schwartz and Mare 2005), adds interaction terms between one homogamy parameter (i.e., whether H equals W) and Y to capture varying homogamy effects by cohort. Model 8 is an expanded homogamy trends model, with full-interaction terms between main diagonal cells and cohorts, accounting for varying educational level-specific homogamy effects by cohort. Models 9–11 are crossings models (see, e.g., Powers and Xie 2000), which include parameters to measure the varying difficulty of marrying across certain numbers of educational levels. Whereas Model 9 includes only the interactions between crossings parameters and cohort, Models 10 and 11 further add restricted and expanded homogamy trends parameters, respectively. Again, according to the G² and BIC statistics in Table A2, Models 6–11 fit the data poorly. Relatively speaking, Model 11 fits the data the best (G² = 2,711.7 and BIC = 1,619.2, with 72 degrees of freedom). Unlike the two log-multiplicative layer-effects models (Models 3 and 5), Model 11 does not allow us to summarize assortative marriage trends with one parameter per cohort.
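The relationship between the reported fit statistics can be checked with a short sketch. It assumes the conventional definition for contingency-table models, BIC = G² − df × ln(N); the text does not state the formula, so this is an assumption. The inputs are the Model 11 figures and the total sample size reported above.

```python
from math import log


def bic(g2: float, df: int, n: int) -> float:
    """BIC relative to the saturated model: G2 minus df times ln(N)."""
    return g2 - df * log(n)


# Model 11: G2 = 2,711.7 with 72 degrees of freedom; N = 3,891,492 marriages.
print(bic(2711.7, 72, 3_891_492))
```

Under this definition, the result comes out close to the reported BIC of 1,619.2. A positive value means that the saturated model is preferred, which matches the conclusion that these models fit the data poorly.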
The fact that none of the 11 log-linear models fit the empirical data highlights the limitations of the log-linear modeling approach for studying such long-term trends amid social changes. Still, we can compare our results of the two log-multiplicative layer-effect models (RC(x) and FI(x)) with our earlier results based on the rank-rank correlation approach. We present these log-linear results in Figure 3, with numerical results reported in the first two columns of Table A3. The trends from these two poorly fitted log-linear models appear to be broadly similar to our main findings in Figure 2, especially regarding the sharp decline for the 1950 cohort, recovery since the 1960 cohort, and further increase to high levels in later cohorts.
Not surprisingly, discrepancies between the relative rank-rank correlation and nominal log-linear modeling approaches are notable, particularly in earlier cohorts. The log-linear results suggest a much stronger assortative marriage tendency in early cohorts than the rank-rank results do. One potential reason is that, as Figure 1 suggests, many couples in early cohorts are illiterate. Their marriages are homogamous by nominal education levels in log-linear models but less homogamous by relative ranks because of the wide gender gap in educational attainment (see Table A4 for cohort-specific proportions of homogamous and heterogamous marriages). Also, the discrepancies could result from log-linear model misspecification issues, given their overall poor fit to the data.
Alternative Correlation Measures
We now compare our main results with those based on four alternative bivariate correlation measures, focusing on the problem of tied ranks in measuring education. First, we consider educational attainment as a continuous measure of (imputed) years of schooling corresponding to the individual's highest educational level attained. With that, we calculate Pearson's correlation coefficient for the linear relationship between a couple's years of schooling. Second, we consider educational attainment as an ordinal variable measured by the individual's highest educational level attained. We then compute Spearman's rank correlation rho with average ranks for ties. Third, we calculate Kendall's rank correlation tau between a couple's educational levels. Compared to Spearman's rho, Kendall's tau better accounts for the problem of tied ranks by taking advantage of the information on concordant and discordant pairs of ranks. Finally, we implement another solution for tied ranks with the polychoric correlation between a couple's nominal educational levels, assuming a latent joint normal distribution. We skip the detailed technical introduction of these four correlation coefficients because they are standard in the statistical literature. Our main results are robust: the trends reported with rank-rank correlation coefficients in Figure 2 resemble those with different correlation measures, as presented in Figure 4 (numerical results are reported in Table A3). In particular, similarities between our main results and Spearman's rho and Kendall's tau results suggest that the problem of tied ranks does not alter our main findings.
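To make the treatment of ties concrete, the following sketch computes Spearman's rho with average ranks and Kendall's tau for ordinal education codes. It assumes the tau-b variant, which adjusts the denominator for ties; the text does not specify which variant is used. The function names and the integer coding of education levels are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    conc = disc = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: enters neither count
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    denom = sqrt((conc + disc + tied_x_only) * (conc + disc + tied_y_only))
    return (conc - disc) / denom


# Hypothetical couples; education coded 0 (illiterate) to 6 (postgraduate).
husband = [0, 1, 1, 2, 3, 3, 5]
wife = [0, 0, 1, 2, 2, 3, 4]
print(spearman_rho(husband, wife), kendall_tau_b(husband, wife))
```

The pairwise loop is quadratic in the number of couples, so it suits small illustrations only; census-scale samples would call for computing the pair counts from the cross-tabulated cell frequencies instead.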
Additional Sensitivity Analyses
First, we focus on the age selection effects on observed marriages. Because of limited data availability, our decision to pool selected cross-sectional data particularly impacts the early cohorts (who are old) and the recent cohorts (who are young) under observation. Although we do not have earlier or alternative data to quantify the biases, we can gauge the direction of the potential biases. We compare rank-rank estimates of the same cohort over censuses (i.e., as they aged). Differences in their rank-rank estimates, therefore, suggest the direction of change as a function of age. We present these cross-wave comparisons in Figure A2, with numerical results reported in Table A3. We observe a remarkably consistent pattern: a monotonic increase in the rank-rank correlation from younger ages (observed in an earlier census) to older ages (observed in a later census) for every cohort. These biases are conservative with respect to our finding that educational assortative marriage has been steadily increasing over the past century: the early cohorts, observed at old ages, likely have upwardly biased correlations, whereas the recent cohorts, observed at young ages, likely have downwardly biased ones. Without the data limitation, the overall increasing trend would therefore be stronger than our main results suggest.
Then, we conducted seven sensitivity analyses to verify the robustness of our main results. First, to account partly for the potential confounding of age, we estimated the rank-rank slopes (i.e., partial correlation coefficients) after controlling for both spouses' linear age effects. Second, we reported parallel correlations based on the husbands' birth cohorts rather than the wives' birth cohorts. Third, we calculated educational percentile ranks based on five rather than seven nominal levels by combining the three levels of tertiary education into one level, thus enabling a comparison of the rank-rank results with those from the log-linear analysis in Figure 3 with a comparable categorization of educational attainment. Fourth, we calculated educational percentile ranks according to a cohort's overall educational distribution instead of gender-specific distributions. Fifth, using the pooled survey analytic sample, we estimated the rank-rank slopes for couples and their parents, controlling for spouses' linear age effects and survey fixed effects for potential systematic differences between data sets. Sixth, instead of using the selected single wave for each cohort, we replicated the analysis by pooling all available census waves for each cohort. Seventh, also using the pooled data of all censuses, we estimated the rank-rank slopes after controlling for not just both spouses' ages but also the census wave fixed effects for systematic differences across censuses. As reported in Table A5, the results of the seven checks yielded trends very similar to our main findings.
Finally, to examine the educational assortative marriage trends by marriage cohort, we resort to surveys with marriage timing information: the CFPS 2010; the CGSS 2006, 2010, 2011, 2013, 2015, 2017, and 2018; and the LHSCCC 1996. As shown in Figure A3, the trends for the 1950–2010 marriage cohorts are broadly consistent with those for the 1930–1990 birth cohorts reported in the main analyses, with one exception: the 1950 marriage cohort shows higher than expected assortativeness in the survey data, partly because its members were observed at old ages (see our discussion on age and assortativeness in this subsection).
In this study, we find a marked increase in educational assortative marriages in China over the past century, with the exception of a decline or stagnation among prereform cohorts who experienced the Cultural Revolution at young ages. Using China as a research site, this study provides new longitudinal evidence about the variation in educational assortative marriage throughout the entire course of modernization within a country and across radically different levels of inequality. The findings are notable in pertaining to a large population with very different developmental trajectories and marriage norms from those of Western societies.
How well do the observed long-term trends in Chinese educational assortative marriages fit the theoretical predictions from the modernization and economic inequality theses? Our findings support the prediction of overall increasing trends along with modernization, while finding no evidence in line with the other two predictions of decreasing or inverted U-curve trends. Post-1949 cohort changes also appear consistent with the predicted increasing trends in educational assortative marriage, along with the sharp rise of inequality in the most recent four decades. Interestingly, recent research also found similar trends by birth cohort for intergenerational occupational and educational mobility (Xie et al. 2022). However, we cannot disentangle the relative explanatory power of modernization and inequality because they have trended upward together in China's recent past. Our findings about particular cohorts influenced by the Cultural Revolution also exemplify the importance of country-specific historical contexts for understanding irregular deviations from theoretically predicted regular trends.
We generate these findings via an approach that views educational attainment as a relative positional status. We examine trends in educational assortative marriages with correlations between a couple's percentile ranks in education relative to peers of the same gender and birth cohort. The rank-rank correlation approach and conventional log-linear modeling approach share the same methodological rationale of examining a couple's status association, net of marginal status distributions. Besides the parametric versus nonparametric designs, one major difference between the two approaches is whether to treat a couple's status gap as relative or absolute. Conventional absolute status gap–based assortative mating patterns—such as homogamy, hypergamy, and hypogamy—are convenient for categorizing marriage types when social status can be measured similarly between men and women. Indeed, when educational distributions are similar between genders and stable across cohorts, the two approaches produce similar substantive findings. However, when educational distributions differ greatly between genders and change substantially across cohorts, the operationalization of homogamous or heterogamous marriages by absolute education encounters both conceptual and measurement challenges for comparability. The rank-rank approach, purely based on distributional properties of husbands' and wives' education, has been verified by our supplementary analyses to be a viable methodological alternative for studying aggregate-level assortative marriage trends over an extensively long term. The rank-rank approach could also be incorporated in studying the causes and consequences of assortative marriages using regression and other statistical methods. We welcome researchers to extend our approach to other research settings.
This research was partially supported by the National Natural Science Foundation of China (grants 72003006 and 71461137001) and the Paul and Marcia Wythes Center on Contemporary China at Princeton University. The authors thank the Editor and reviewers of Demography and participants of our presentations at the 2018 annual meeting of the Population Association of America, the 2019 Summer Meeting of Research Committee 28 of the International Sociological Association, and the 2019 annual meeting of the Social Science History Association, and at seminars at the Hong Kong University of Science and Technology (HKUST), HKUST-Guangzhou, Renmin University of China, the University of Minnesota Twin Cities, and the University of Pennsylvania, for constructive comments and suggestions. The authors are also grateful to the National Bureau of Statistics of China (NBS) and the NBS-Peking University Research Data Center for the access to 2010 and 2015 (mini-)census microdata samples. The ideas expressed herein are those of the authors.
A longer preprint version of this article is available at osf.io/preprints/socarxiv/dr6ew.
Data from the National Bureau of Statistics of China are available online at http://data.stats.gov.cn/easyquery.htm?cn=C01.
2. Note that the 1910 cohort also includes a few individuals born before 1906, as observed in the 1982 census. The exclusion of such individuals does not substantively change our estimation of the 1910 cohort coefficient. Details are available upon request.
3. Frequencies calculated based on 2010 and 2015 (mini-)census microdata samples are multiplied by 10 to be comparable to those calculated from the 1% sample of earlier censuses.