Abstract
Current literature states that early-life exposure to smoking produces adverse health outcomes in later life, primarily as a result of subsequent engagements with firsthand smoking. The implications of prior research are that smoking cessation can reduce health risk in later life to levels comparable to the risk of those who have never smoked. However, recent evidence suggests that smoking exposure during childhood can have independent and permanent negative effects on health—in particular, on epigenetic aging. This investigation examines whether the effect of early-life firsthand smoking on epigenetic aging is more consistent with (1) a sensitive periods model, which is characterized by independent effects due to early firsthand exposures; or (2) a cumulative risks model, which is typified by persistent smoking. The findings support both models. Smoking during childhood can have long-lasting effects on epigenetic aging, regardless of subsequent engagements. Our evidence suggests that adult cessation can be effective but that the epigenetic age acceleration in later life is largely due to early firsthand smoking itself.
Introduction
Tobacco use is one of the most significant causes of mortality in the United States, responsible for approximately 480,000 deaths each year. This figure translates to nearly one in five deaths annually, making tobacco use the leading cause of preventable death in the country (Fenelon and Preston 2012; Mokdad et al. 2004). The negative health impacts of tobacco use are well-documented and include a range of diseases, such as lung cancer, chronic obstructive pulmonary disease (COPD), heart disease, stroke, and numerous other cancers. Tobacco use also confers substantial societal and economic costs, including increased health care costs, lost productivity, and increased absenteeism rates.
At the individual level, a large body of literature has shown that early-life exposure to smoking leads to negative health outcomes in later life primarily because of subsequent smoking (D'Agostino et al. 2008; Kawachi et al. 1994). The literature states that smoking cessation can reduce disease risks to levels comparable to those of nonsmokers (Duncan et al. 2019; Lloyd-Jones et al. 2017).
However, recent evidence suggests that the focus on effects operating primarily via subsequent smoking might overshadow important and permanent negative effects of tobacco smoking on health trajectories. In particular, developments in epigenetics have demonstrated that smoking might leave a permanent mark on DNA methylation (DNAm) patterns and that these patterns endure after the individual stops smoking (Beach et al. 2015; Gao et al. 2016; Klopack et al. 2022). Additionally, growing evidence indicates that epigenetic mechanisms might be crucial in mediating the link between early developmental environments and health impacts later in life, such as increased risk of developing cancer, osteoporosis, lung disease, and cardiovascular problems (Klopack et al. 2022).
Smoking damages multiple tissues, which hastens epigenetic aging. Less clear is whether the associations between early-life smoking and later-life epigenetic aging are permanent or are an artifact of cumulative firsthand smoking trajectories. Previous research supported the hypothesis that epigenetic mechanisms are involved in the physiological cascade leading to earlier illness onset and early mortality in those exposed to adversity in early life (Beach et al. 2015; Klopack et al. 2022; McCrory et al. 2022). Yet, the literature is unclear about whether early-life adversity sets off trajectories of cumulative firsthand smoking that lead to accelerated epigenetic aging, potentially creating the illusion of permanent damage. Adversity can occur at several developmental stages during the life course, so its effects depend on the timing and duration of exposure. Such effects are particularly apparent for the epigenome, given that it is more malleable early in the life course. Early life might represent a sensitive period for the embodiment of negative social exposures: rather than ticking at a constant rate, the epigenetic clock ticks rapidly in early childhood and then slows and maintains a constant rate in later life (Burns et al. 2018; McCrory et al. 2022).
The emphasis on exposure timing as a source of variability implies that behavioral risks (e.g., cigarette smoking) might influence epigenetic aging and thereby affect later-life health via several pathways. We investigate whether exposure to cigarette smoke during childhood is more compatible with the sensitive period model (i.e., indicating a permanent effect independent from subsequent engagements with smoking) or the cumulative risks model (i.e., implicating continued smoking). Using three indicators of epigenetic aging, we examine the effects of early-life exposures and engagement via independent and mediating pathways. We first characterize smoking behaviors throughout the life course using sequence analysis and hierarchical machine learning. We then investigate mediating pathways of firsthand smoking behavioral patterns.
Background
Early Life as a Sensitive Period Vulnerable to Permanent Health Damage
The developmental origins of health and disease (DOHaD) theory is predicated on the notion that the origins of lifestyle-related diseases are established during early developmental phases through the interaction of genes and nutrition, stress, or environmental chemicals (Gluckman et al. 2005; Wadhwa et al. 2009). Key to the DOHaD theory is the biological embedding process, which predisposes children to adverse health consequences (Erickson and Sbihi 2018; Hertzman 1999). Biological embedding refers to mechanisms that cause initially temporary homeostatic responses to affect physiology permanently. Because of the prevalence of sensitive periods, or windows of fast growth and heightened plasticity (responsiveness to experience), events early in life might be entrenched preferentially and operate permanently (Berens et al. 2017; Hertzman 2012). Research adopting the DOHaD framework has shown that early-life exposures and engagements might have important health consequences in later life, independent of subsequent exposures or engagements. After correcting for life course smoking duration, a Canadian study discovered that starting smoking at puberty was associated with an increased risk of breast cancer in adult women (Band et al. 2002). A recent study investigated the impact of exogenous variation in fetal and infant secondhand smoke exposure, caused by state-level cigarette taxation, on mortality (ages 55–73) among cohorts of males born in the United States in the 1920s and 1930s. The results indicated that the implementation of state cigarette taxation delayed mortality for exposed infants by approximately two months (Helgertz and Warren 2023).
Further, studies have shown that exposure to parental smoking during childhood tends to alter airway development, predisposes individuals to respiratory problems in adulthood, and increases the risk of COPD death in adulthood (Diver et al. 2018; Svanes et al. 2004). Another study conducted in Finland found that individuals who were exposed during childhood and adolescence to both parents smoking had greater carotid intima-media thickness in adulthood than individuals who were not exposed (Gall et al. 2014). The observed impact was consistent among two distinct cohorts, persisted even when alternative exposure measures were implemented, and exhibited a high degree of independence from cardiovascular risk factors. These findings provide additional evidence that early-life exposure to parental smoking negatively and irreversibly impacts health.
An emphasis on the timing of exposure as a source of heterogeneity reveals that behavioral risks (such as cigarette smoking) could alter physiological pathways (which might be reflected in epigenetic aging) and thereby affect later-life health.
Early-Life Smoking as Precursor to Cumulative Smoking Hazards
Cumulative risk processes highlight an additional key understanding of the life course approach: individual outcomes at any given time can be comprehended only in the context of the cumulative impact of earlier lived experiences. Individual health trajectories are the outcome of the accumulation of healthful inputs and harmful risks resulting from social, environmental, and behavioral exposures throughout the life course. Deleterious early-life exposures and behaviors might affect individual health and set off a trajectory that accumulates risks over time (Kuh and Ben-Shlomo 2004).
The chains of risk model, which is a specific component of the cumulative risks framework, is the basis for recent research on the health effects of smoking. This model denotes interconnected exposures that increase the risk of contracting a disease, with one adverse experience or exposure typically precipitating another. Social, biological, and psychological chains of risk can exist, each of which frequently incorporates mediating factors. Prior exposures do not influence the likelihood of developing a disease in the absence of the concluding link in the chain that initiates the disease (Kuh et al. 2003). Research on the health effects of smoking that adopts the chains of risk model includes studies showing that smoking cessation can reduce risks to levels comparable to those of nonsmokers. A recent study found that individuals who quit smoking did not display greater cardiovascular disease risk 10–15 years after cessation than those who never smoked (Duncan et al. 2019). Another study found that one third of the smoking-related excess risk of coronary heart disease was eliminated within two years of smoking cessation and that, 10–14 years after cessation, the excess risk had reverted to the level experienced by individuals who had never smoked (Kawachi et al. 1994). Other studies have documented similar findings (Kenfield et al. 2008; Mons et al. 2015).
Consequently, the data regarding the potential long-term impact of smoking during one's youth on health in later years, even subsequent to smoking cessation, are called into question. The use of epigenetic measures and further analyses might help clarify the matter.
Epigenetic Clocks
Epigenetics is the study of the mechanisms that regulate gene expression and modify DNA without altering its linear sequence of bases; it establishes a connection between the environment and the homeostasis of the human body. Thus, epigenetics pertains to the investigation of alterations in organisms that result from modifications in gene expression rather than in the genetic code per se. Consequently, epigenetic processes are an essential collection of mechanisms that potentially reflect biological embedding. Epigenetic change refers to the enduring alteration of gene expression induced by mechanisms such as the attachment of chemical residues (e.g., methyl groups) to DNA or transcriptional regulatory components (e.g., histones). Epigenetic mechanisms, which promote or inhibit DNA transcription, refer to chemical mechanisms, such as histone modifications, noncoding RNA interaction, and DNAm. DNAm plays a crucial role in gene regulation and expression, given that a greater proportion of DNAm in gene promoters is linked to transcriptional repression (i.e., gene silencing) (Champagne 2010).
This knowledge prompted the development of the first epigenetic clocks, which were capable of predicting chronological age from the methylation of specific Cytosine–phosphate–Guanine sequences (CpGs) given a source of DNA (Hannum et al. 2013). First-generation epigenetic clocks (e.g., Hannum et al. 2013; Horvath 2013) were trained on chronological age, establishing a higher predictive power than chronological age itself. However, chronological clocks display limited capability for tracking and quantifying health conditions, also termed biological age (Bernabeu et al. 2023).
Many epigenetic clocks followed these first-generation clocks. These so-called second-generation clocks were trained on different health outcomes (e.g., all-cause mortality, lifespan, cancer, or even rates of change of biomarkers over time), alongside chronological age (Belsky et al. 2020; Levine et al. 2018; Lu et al. 2019; Yang et al. 2016; Zhang et al. 2017). Second-generation clocks have been trained using additional age-related metrics. For example, GrimAge (Lu et al. 2019) is an indicator of time to all-cause mortality, and PhenoAge is a phenotypic biomarker of morbidity (Levine et al. 2018). An age acceleration residual is produced when chronological age is subtracted from an epigenetic clock–estimated biological age. Positive values of this residual indicate more rapid biological aging (Bernabeu et al. 2023).
Epigenetic clocks are measures that have highlighted a widespread lack of understanding about the biological mechanisms underlying aging: though all such clocks capture methylation changes that are partially correlated with chronological age, they are built differently and are composed of different CpGs, with methylation variations depending on distinct biological mechanisms and processes (Horvath and Raj 2018). Deciphering the biological meaning entails unraveling the processes in which the CpGs that comprise these clocks are involved, given that various epigenetic mechanisms might contribute to the overall signals in every clock to different degrees. Although measurement using epigenetic clocks requires further research, performance tests show that the GrimAge clock represents a step improvement in the predictive utility of the epigenetic clocks for identifying age-related declines in an array of clinical phenotypes and promises to advance the field (McCrory et al. 2021).
The Effect of Tobacco Smoke on Epigenetic Aging
Several epigenome-wide association studies have found that smoking alters DNAm at various CpG sites across the human methylome (Allione et al. 2015; Guida et al. 2015; McCrory et al. 2022). Numerous studies using epigenetic clocks have proposed DNAm as a mediator between environmental factors such as smoke exposure during development and health risks later in life (Besingi and Johansson 2014; Fujii et al. 2022; Rathod et al. 2021; Tobi et al. 2018). For instance, Klopack et al. (2022) demonstrated that tobacco exposure predisposes individuals to adverse health outcomes, such as cancers, osteoporosis, lung, and cardiovascular disorders, via DNAm. Second-generation clocks are particularly useful for identifying such patterns, as they capture strong effects of smoking where first-generation clocks do not. A recent study indicated that the associations between smoking and diabetes-related outcomes are substantially mediated by second-generation epigenetic clocks; conversely, no significant associations between smoking variables and four health outcomes were mediated by the first-generation epigenetic clocks used (Chang and Lin 2023).
Despite a clear connection between smoking exposures and epigenetic aging, whether the effects are reversible remains contested in the literature. Recent studies have found evidence that after smoking cessation, methylation alterations appear to be reversible to a certain extent (Dugué et al. 2020; McCrory et al. 2021). Most smoking-associated methylation signals show lower levels of DNAm in current smokers than in nonsmokers and variable dynamics after smoking cessation (Tsai et al. 2018). Although some changes in methylation patterns can last for decades, some evidence suggests that smoking cessation can restore methylation levels to those seen in nonsmokers (Guida et al. 2015; Tsai et al. 2018; Zeilinger et al. 2013; Zhang et al. 2013).
Other studies found that smoking exposure leaves a long-term signature in methylation patterns throughout the entire genome and that these patterns continue long after smoking has stopped (Beach et al. 2015; Gao et al. 2016; Shenker et al. 2013). Another study found that smoking cessation protects airway cells from epigenetic aging but has no effect on lung tissue. Smoking raised the epigenetic age of airway cells and lung tissue by an average of 4.9 and 4.3 years, respectively. After smoking was discontinued, the epigenetic age acceleration in airway cells (but not in lung tissue) declined to the level observed in nonsmokers (Wu et al. 2019).
Cigarette smoke causes tissue damage, accelerating epigenetic aging. However, a key aspect of smoke exposure—often noted in the literature—is the developmental life course stage when it occurs, potentially rendering increased effects contingent on the timing of exposure. In short, it is imperative to apply Elder's groundbreaking life course principle: the developmental impact of successive life transitions or events is contingent on when they occur in an individual's life (Elder 1998).
Early life, particularly childhood, is a sensitive period of development that is subject to heightened vulnerability to insults (Kuh et al. 2003). The relationship between exposure to maternal smoking (during gestation and childhood) and changes in the DNAm of offspring is well established (de Prado-Bert et al. 2021; Joubert et al. 2016; Lee et al. 2015). Recent research found that prenatal and childhood exposure to tobacco smoke, as well as indoor particulate matter absorbance during childhood, are associated with accelerated epigenetic aging. This research showed that epigenetic modifications in pathways implicated in cellular cycle regulation, detoxification, and inflammation might exert early-life effects on human health as a result of these environmental exposures (de Prado-Bert et al. 2021). Furthermore, many of the epigenetic marks associated with these exposures were maintained during adolescence.
Hence, the literature suggests that the effects of early-life exposure to smoking on epigenetic aging might last throughout the life course. However, little is known about how early-life smoking affects epigenetic aging later in life and whether such changes result from subsequent smoking habits or have an independent and long-lasting influence across the life cycle.
Hypotheses
The life course theory presents a collection of theoretical concepts that explain how early-life insults affect health trajectories throughout the lifespan. The sensitive periods and cumulative risk models illuminate how the timing of exposure to cigarette smoke might produce varied paths to accelerated epigenetic aging (and poor health in later life). Exposures to adversity during developmentally sensitive periods can permanently affect significant outcomes, with various unfavorable repercussions across the life course (Ferraro 2011; Hertzman 1999). Considering this framework, we arrive at our main hypothesis:
Hypothesis 1: Because DNAm is most plastic during early life, early-life exposure to smoking will accelerate epigenetic aging, regardless of subsequent engagement in behavioral risks.
Results supporting Hypothesis 1 would be most consistent with a sensitive periods model, where the effects of early-life exposures are largely independent of and robust to subsequent smoking. However, because prior smoking behaviors likely set off cumulative risk trajectories (i.e., chains of risk) owing to a higher likelihood of engagement later in the life course, the effects of early-life exposures and engagements might be fully mediated by subsequent smoking. Thus, we propose a competing hypothesis:
Hypothesis 2: Because prior smoking behaviors are likely to entail further smoking engagement, the effects of early-life exposure to smoking on DNAm will be largely mediated by subsequent engagement in behavioral risks.
Both of our hypotheses entail a direct effect of smoking behaviors on epigenetic aging acceleration. The distinction between the sensitive periods model and the chains of risk model lies in the pathways that lead to epigenetic age acceleration. In the sensitive periods model, the effects would stem from an effect of early-life smoking that is independent of subsequent engagement. In the chains of risk model, the effects of early-life smoking are mediated by downstream engagement in adulthood.
Data and Methods
We utilize the Health and Retirement Study (HRS) data from each wave of the core data set (1992–2016) and the epigenetic clocks derived from the 2016 Venous Blood Study (VBS), a component of the HRS. First conducted in 1992, the HRS is an ongoing, nationally representative longitudinal study of community-dwelling U.S. individuals aged 51 or older and their spouses of any age (Fisher and Ryan 2018). The HRS was created to research economic well-being, labor force participation, health, and family composition among older people through biennial surveys conducted by phone or in person. A subsample of 4,104 individuals in the VBS agreed to provide biological samples for DNAm tests in 2016. The VBS yielded valid epigenetic clock measurements from 4,018 individuals (Crimmins et al. 2020). After accounting for missing data, we obtain a final sample of 3,678.
Variables
Dependent Variables
Our main dependent variables are the second-generation epigenetic clocks available in the HRS: GrimAge, Levine, and Pace of Aging (PoAm). Second-generation epigenetic clocks are an improvement over the original clocks. These new clocks not only provide chronological age predictions but are also based on measures such as blood biomarkers and can detect variations in DNAm associated with poorer health status, diseases, or stress conditions in a way that first-generation clocks cannot (Poganik et al. 2022).
GrimAge Age Acceleration
GrimAge is a combination of DNAm-based biomarkers identified to be associated with health-related plasma proteins and smoking pack-years, as well as sex and chronological age (Lu et al. 2019). It is associated with the key hallmarks of aging and stands out among its peers in matters of correlation with disease status, age-related clinical phenotypes, and mortality (Hillary et al. 2021; Liu et al. 2020; Maddock et al. 2020; McCrory et al. 2021). We calculate the measure of GrimAge acceleration as the difference between DNAm GrimAge and chronological age (i.e., epigenetic age acceleration), yielding a measure of years above or below chronological age. A positive value indicates that biological age is older than chronological age (i.e., accelerated aging). A negative value indicates that biological age at the time of the measurement is younger than chronological age.
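The sign convention can be made concrete with a minimal sketch. The function name and the respondent values below are hypothetical and serve only to illustrate the subtraction described in the text; they are not HRS data.

```python
def age_acceleration(dnam_age: float, chronological_age: float) -> float:
    """Epigenetic age acceleration in years: DNAm age minus chronological age."""
    return dnam_age - chronological_age


# Hypothetical respondents: (DNAm GrimAge, chronological age), both in years.
examples = {
    "older_than_expected": (71.5, 68.0),
    "younger_than_expected": (64.0, 68.0),
}

accelerations = {
    name: age_acceleration(dnam, chrono) for name, (dnam, chrono) in examples.items()
}
# Positive values mean the clock reads older than the calendar (accelerated aging);
# negative values mean it reads younger.
```
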
Levine Age Acceleration
Levine et al. (2018) developed PhenoAge, an epigenetic age estimator based on the methylation of 513 CpGs, in three steps. First, they constructed a measure of phenotypic age from clinical blood biomarker data and chronological age. Second, they used DNAm to predict phenotypic age and construct the DNAm PhenoAge measure. Finally, they validated PhenoAge's association with mortality, family longevity, socioeconomic status, diet, smoking, and different diseases (Levine et al. 2018).
PoAm
We use PoAm (also known as DunedinPoAm), a blood-based DNAm measure that represents individual variation in the pace of epigenetic aging. Unlike other methylation clocks, PoAm reflects a rate measure (i.e., how quickly a person ages) rather than a state metric (i.e., how much aging has occurred until that point).
Independent Variables: Early-Life Exposure to Behavioral Risks and Engagement
As indicators of early-life exposure to and engagement in smoking, we utilize several dichotomous measures (yes/no) available in the HRS childhood module.
Parental Smoking
The HRS childhood panel asked respondents to reply yes or no to the following question: “Did your parents or guardians smoke during your childhood?” We utilize this item as an indicator of exposure to secondhand smoke during childhood.
Childhood Smoking
HRS respondents were asked, “Did you regularly smoke cigarettes while you were in grade school or high school? By ‘regularly’ we mean at least one cigarette a day for most days of the week, for six months or more.”
Adulthood Smoking
As indicators of smoking behavioral patterns during adulthood (i.e., between ages 18 and 50, before entry into the HRS study), we utilize retrospective HRS questions asking when respondents started smoking and when they stopped. We then utilize sequence analysis and hierarchical machine learning (AGNES) to identify clusters of similar smoking patterns. The resulting variable is categorical, identifying nonsmokers, short-term smokers, and long-term smokers in adulthood. Short-term smokers are respondents who started smoking late or quit early in adulthood. Long-term smokers are respondents who smoked and continued smoking throughout adulthood. The clustered sequences for smokers are illustrated in Figure 1.
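As a rough illustration of how retrospective start and stop ages could be turned into a yearly state sequence, consider the sketch below. The function name, the state labels "S" and "N", and the treatment of the stop age as a nonsmoking year are assumptions made for this example, not the coding used in the analysis.

```python
from typing import List, Optional

ADULT_AGES = range(18, 51)  # ages 18 through 50 inclusive, as in the text


def adulthood_sequence(start_age: Optional[int], stop_age: Optional[int]) -> List[str]:
    """Return one state per year of age: 'S' (smoking) or 'N' (not smoking).

    start_age is None for never-smokers; stop_age is None for respondents
    who had not quit by age 50. Counting the stop age itself as a
    nonsmoking year is an assumption of this sketch.
    """
    sequence = []
    for age in ADULT_AGES:
        smoking = (
            start_age is not None
            and age >= start_age
            and (stop_age is None or age < stop_age)
        )
        sequence.append("S" if smoking else "N")
    return sequence


# Hypothetical respondents.
never = adulthood_sequence(None, None)   # never smoked
quit_early = adulthood_sequence(16, 25)  # started at 16, quit at 25
persistent = adulthood_sequence(17, None)  # started at 17, still smoking at 50
```

Sequences of this kind, one per respondent, are the input that a clustering step would then group into typologies such as short-term and long-term smokers.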
Later-Life Smoking
As indicators of smoking behavioral patterns during later life (i.e., the period of the life course since entry into the HRS), we use questions administered throughout the HRS waves asking respondents how many cigarettes per day they currently smoke. Following CDC research (Li et al. 2015), we classify respondents by their cigarette consumption per day as follows: nonsmokers, light smokers (5 or fewer), moderate smokers (6–16), and heavy smokers (>16). We then utilize sequence analysis and hierarchical machine learning (AGNES) to identify clusters of similar smoking patterns. The resulting variable is a categorical variable identifying nonsmokers, light smokers, and heavy smokers during the later-life period. Light smokers are respondents who smoked lightly during the HRS period, including individuals who quit smoking during that period. Heavy smokers are respondents who mostly smoked large quantities of cigarettes per day during the HRS period, including those with short reductions in quantity and short cessation periods. The clustered sequences are illustrated in Figure 2.
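The wave-level classification can be sketched as follows. The thresholds come from the text; the function name, the state labels, and the six-wave example are hypothetical.

```python
def classify_cpd(cigarettes_per_day: int) -> str:
    """Map reported cigarettes per day to the four wave-level states in the text."""
    if cigarettes_per_day <= 0:
        return "nonsmoker"
    if cigarettes_per_day <= 5:
        return "light"
    if cigarettes_per_day <= 16:
        return "moderate"
    return "heavy"


# Hypothetical respondent observed over six biennial waves.
cigarettes_by_wave = [20, 18, 10, 4, 0, 0]
later_life_sequence = [classify_cpd(c) for c in cigarettes_by_wave]
```
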
Controls
As controls, we include the following basic demographic information: gender, year of birth, and years of education. Additionally, as an indicator of overall health during childhood, we use self-reported childhood health before age 16. Furthermore, as an indicator of childhood socioeconomic status, we use self-reported responses to the following question: “Now think about your family when you were growing up, from birth to age 16. Would you say your family during that time was pretty well off financially, about average, or poor?”
Analytic Approach
Sequence Analysis
Previous research aiming to characterize smoking behaviors throughout the life course has tended to use measurements that characterize smoking histories on the basis of single self-reported measures (e.g., dichotomous measures of being a current smoker or not, or dichotomous measures of ever having smoked daily). This approach tends to oversimplify smoking histories, overlooking the timing and length of exposure (e.g., current smoker, past smoker) and number of cigarettes. Because these factors are crucial, we utilize sequence analysis to characterize smoking behaviors throughout the life course. Sequence analysis is a nonparametric method for analyzing trajectories and social processes (Abbott and Tsay 2000). The method has the benefit of providing a comprehensive view of trajectories, describing how processes evolve over time and when transitions occur. Sequencing (the order of a distinct state occurrence), duration (the length of a spell in a particular state), and timing (when the transition occurs) are all important sequence aspects to consider (Studer and Ritschard 2016).
We take advantage of the rich smoking histories available in the HRS to capture single-channel sequences of smoking histories (one sequence for each individual). The HRS collects smoking information in two ways. First, it collects a retrospective measure asking when individuals started and stopped smoking. Second, at each wave, the HRS asks respondents who smoke how many cigarettes they smoke per day. We use retrospective measures to capture sequences during the adulthood period (i.e., from childhood until age 50), and we use the different HRS wave information to capture sequences of smoking behavior in later life (after entry into the HRS). Adulthood sequences contain information only on whether respondents smoked, whereas later-life sequences carry more detailed information on the amount of smoking.
The time points at which smoking is recorded differ for each respondent, producing thousands of unique smoking trajectories. To synthesize these trajectories, we produce clusters of smoking trajectories using a hierarchical machine learning algorithm (AGNES). Following best practices, we first use an optimal matching algorithm that calculates the distance between each pair of sequences according to the duration and timing of each state, resulting in a dissimilarity matrix (Studer and Ritschard 2016). Second, we rely on the elbow method and the silhouette method to choose the number of clusters. Third, we implement a hierarchical clustering approach based on agglomerative nesting (AGNES) with Ward's method. We use the resulting clusters to measure adulthood and later-life smoking histories in our subsequent modeling. Smoking patterns identified by our procedure can be visualized in Figures 1 and 2.
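The three steps can be sketched in plain Python on toy data. Everything below is illustrative: the indel and substitution costs, the six toy sequences, and the use of the Lance-Williams update on squared dissimilarities as a stand-in for Ward's criterion are assumptions of this sketch, not the settings used in the analysis.

```python
from itertools import combinations


def om_distance(a, b, indel=1.0, sub=2.0):
    """Optimal matching distance: cheapest mix of insertions, deletions,
    and substitutions that turns sequence a into sequence b."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + indel,                          # delete x
                            curr[j - 1] + indel,                      # insert y
                            prev[j - 1] + (0.0 if x == y else sub)))  # match/substitute
        prev = curr
    return prev[-1]


def ward_clusters(dist, k):
    """Bottom-up clustering on a dissimilarity matrix; merges the closest pair
    and updates squared dissimilarities with the Lance-Williams Ward formula."""
    n = len(dist)
    members = {i: [i] for i in range(n)}
    d2 = {(i, j): dist[i][j] ** 2 for i, j in combinations(range(n), 2)}
    next_id = n
    while len(members) > k:
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        ni, nj = len(members[i]), len(members[j])
        for m in members:
            if m in (i, j):
                continue
            nm = len(members[m])
            dmi = d2.pop((min(m, i), max(m, i)))
            dmj = d2.pop((min(m, j), max(m, j)))
            d2[(m, next_id)] = ((ni + nm) * dmi + (nj + nm) * dmj - nm * dij) / (ni + nj + nm)
        del d2[(i, j)]
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    labels = [0] * n
    for label, items in enumerate(members.values()):
        for item in items:
            labels[item] = label
    return labels


def mean_silhouette(dist, labels):
    """Average silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i in range(len(labels)):
        by_cluster = {}
        for j in range(len(labels)):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist[i][j])
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(v) / len(v) for v in by_cluster.values())
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy sequences: S = smoking, N = not smoking, one state per period.
sequences = ["NNNNNN", "NNNNNN", "SSNNNN", "SNNNNN", "SSSSSS", "SSSSSN"]
dist = [[om_distance(a, b) for b in sequences] for a in sequences]
silhouette_by_k = {k: mean_silhouette(dist, ward_clusters(dist, k)) for k in (2, 3, 4)}
labels = ward_clusters(dist, 3)
```

Comparing `silhouette_by_k` across candidate values of `k` mirrors the second step in the text; in practice a dedicated sequence analysis package would normally replace these hand-written helpers.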
Mediation Analysis
We estimate formal mediation models using the method developed by Breen et al. (2013) (hereafter, KHB), which permits the assessment of the proportion mediated by certain pathways utilizing discrete covariates. In addition, the KHB estimation provides a formal test of the difference in coefficients between the models with and without mediators (i.e., the Sobel test). Our analytic strategy has two stages. First, we estimate ordinary least-squares (OLS) regressions in which the dependent variables are the epigenetic clocks (for GrimAge and Levine, we use acceleration, measured as the difference between epigenetic age and chronological age) and the key independent variables are each of the exposures to smoking and engagement in smoking (i.e., parental smoking, childhood smoking, adulthood smoking sequence clusters, and later-life smoking sequence clusters).
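A first-stage specification consistent with the symbol definitions given for Eq. (1) can be written as follows. The coefficient symbols ($\beta_0$, $\beta_k$, $\boldsymbol{\gamma}$) and the error term $\varepsilon_i$ are assumed here for illustration and may differ from the notation of the original equation:

```latex
\begin{equation}
  y_i = \beta_0 + \beta_k S_{ik} + \boldsymbol{\gamma}^{\top} \mathbf{C}_i + \varepsilon_i
  \tag{1}
\end{equation}
```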
In Eq. (1), yi represents the epigenetic clock of interest for individual i. Sik represents the early-life condition variable of interest k for individual i, as well as adulthood and later-life smoking. Finally, Ci is a vector of control variables, including year of birth, gender, years of education, childhood socioeconomic status, and self-reported health during childhood (before age 16).
In the second stage, Eq. (2), yi represents the outcome of interest for individual i. Sik represents the early-life condition variable of interest k for individual i. Zij represents the key mediators j for individual i. Finally, Ci represents a vector of control variables identical to those specified in Eq. (1). The key mediators utilized in this stage of the analysis represent engagement in behavioral risks during adulthood and late life and are introduced sequentially in the KHB model to gauge the proportion mediated for each indicator.
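For the second stage, a mediation specification consistent with the symbol definitions given for Eq. (2) adds the mediators to the first-stage model. As before, the coefficient symbols ($\beta_0$, $\beta_k'$, $\delta_j$, $\boldsymbol{\gamma}$) and the error term are assumed for illustration:

```latex
\begin{equation}
  y_i = \beta_0 + \beta_k' S_{ik} + \sum_{j} \delta_j Z_{ij}
        + \boldsymbol{\gamma}^{\top} \mathbf{C}_i + \varepsilon_i
  \tag{2}
\end{equation}
```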
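The logic of the decomposition can be illustrated with a small linear example. With a continuous outcome estimated by OLS, the difference between the coefficient without the mediator (total effect) and the coefficient with it (direct effect) equals the product of the exposure-to-mediator and mediator-to-outcome coefficients. The sketch below checks that identity on made-up numbers; it omits controls, uses a single binary mediator, and is not the KHB estimator itself, which additionally handles categorical mediators and supplies standard errors.

```python
def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan
    elimination with partial pivoting. X is a list of rows (with a leading 1)."""
    p = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)]
         + [sum(row[a] * t for row, t in zip(X, y))]
         for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Made-up data: s = childhood smoking (0/1), z = adult smoking (0/1),
# y = epigenetic age acceleration in years.
s = [0, 0, 0, 0, 1, 1, 1, 1]
z = [0, 0, 1, 0, 1, 1, 0, 1]
y = [-1.0, 0.5, 1.5, -0.5, 3.0, 2.5, 0.5, 3.5]

total = ols([[1, si] for si in s], y)[1]                        # model without mediator
_, direct, b = ols([[1, si, zi] for si, zi in zip(s, z)], y)    # model with mediator
a = ols([[1, si] for si in s], z)[1]                            # exposure -> mediator

indirect = total - direct               # part of the effect running through z
pct_mediated = 100 * indirect / total   # share of the total effect that is mediated
```

A small `pct_mediated` would point toward the sensitive periods reading (the early-life effect survives adjustment for later smoking), whereas a value near 100 would point toward the chains of risk reading.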
Results
Our first series of results pertains to the identification of smoking typologies during adulthood and later life using sequence analysis and hierarchical machine learning (AGNES). The clustering process identifies three smoking typologies in each period (i.e., adulthood and later life). Figure 1 displays the adult smoking typologies: nonsmokers, short-term smokers, and long-term smokers. As noted earlier, short-term smokers are individuals who initiated smoking later in adulthood or ceased smoking early in adulthood, whereas long-term smokers are individuals who have smoked continuously since they reached maturity.
Figure 2 displays the later-life smoking typologies resulting from the sequence analysis and hierarchical machine learning clusters. As noted earlier, respondents who smoked only sparingly during the HRS period, including those who quit smoking, are classified as light smokers. Respondents who primarily consumed substantial amounts of cigarettes daily throughout the HRS period, including brief cessation periods and reductions in quantity, are classified as heavy smokers. Nonsmokers (N = 3,067) represent an additional category.
Table 1 displays descriptive statistics to gauge central tendencies in the variables of interest. A large proportion of individuals were subject to parental smoking. Similarly, a substantial percentage of respondents were regular smokers during childhood.
Next, we show OLS estimates for the relationship between each smoking indicator and the epigenetic clock acceleration measures. Regressions are conducted individually for each smoking indicator to determine whether it is related to the epigenetic clock acceleration measures. This step is necessary before mediation. If no relationship is revealed, the mediation analysis would not be justified. Results for all OLS regressions are available in the online appendix.
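As a rough illustration of this step, the sketch below fits one OLS regression of an outcome on a single binary smoking indicator plus one control. The solver, the variable names, and the data are hypothetical and constructed so that the true coefficients are known; the article's models include the full control vector and were not estimated this way.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical data: a binary smoking indicator and one centered control.
smoked = [0, 0, 0, 1, 1, 1]
control = [-1.0, 0.0, 1.0, -1.0, 0.0, 2.0]

# Outcome built without noise so the known coefficients are recovered.
y = [1.0 + 2.0 * s + 0.5 * c for s, c in zip(smoked, control)]

# Design matrix columns: intercept, smoking indicator, control.
X = [[1.0, float(s), c] for s, c in zip(smoked, control)]

intercept, b_smoking, b_control = ols(X, y)
print(f"smoking coefficient: {b_smoking:.3f}")
```

The same call would be repeated once per smoking indicator, each time replacing the indicator column while keeping the controls fixed.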
Figure 3 shows OLS regression estimates for the relationship between each smoking indicator (the early-life exposures and the subsequent smoking behaviors) and epigenetic age acceleration. Parental smoking is associated with a 0.8-year increase in GrimAge acceleration (with all else held constant) but is not associated with the Levine clock. Additionally, childhood smoking is strongly associated with GrimAge and Levine acceleration, with 2.8 and 0.8 years of acceleration, respectively (all else held constant). Similarly, we find strong effects for adulthood smoking and later-life smoking indicators. The results for the GrimAge clock are noteworthy: heavy smoking in later life is associated with approximately a 7-year GrimAge acceleration.
Figure 4 shows similar OLS coefficients for the PoAm measure. However, the estimates shown in Figures 3 and 4 are from separate OLS models and are intended to justify using mediation models to gauge viable pathways.
Table 2 displays total and indirect effects of KHB mediation analysis for GrimAge acceleration. As conveyed earlier, we introduce each mediator independently to gauge the percentage of the direct effect that is mediated. We then introduce all mediators at once to gauge whether the direct effect of the full model remains. Panel A presents the results of the KHB mediation analysis for parental smoking. The direct effect of parental smoking in the reduced form (i.e., not including any mediators, only controls) is 0.788, which is statistically significant at p < .001. This finding indicates that those respondents who were exposed to parental smoking display approximately 0.8 years higher GrimAge acceleration than those who were not exposed to parental smoking.
The indirect effects show that the effect is fully mediated: the coefficient for the full model (0.236) is no longer statistically significant. The largest mediators are childhood smoking (mediating 15.76% of the effect), adulthood smoking (mediating 23.34%), and later-life smoking (34.75%). These mediators are introduced independently and pass the Sobel test for mediation (indicating that the difference in coefficients of direct effects between reduced and full models is statistically significant), suggesting that parental smoking has an effect only through subsequent smoking engagement.
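For readers unfamiliar with the Sobel test mentioned above, a minimal sketch of its usual first-order form follows. Here `a` is the coefficient of the exposure on the mediator and `b` is the coefficient of the mediator on the outcome net of the exposure. The function name and the input values are hypothetical, not estimates from Table 2.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p-value) for the indirect effect a * b.

    Uses the first-order Sobel standard error:
    sqrt(b^2 * se_a^2 + a^2 * se_b^2).
    """
    indirect = a * b
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical path coefficients and standard errors.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"z = {z:.3f}, p = {p:.4f}")
```

In a linear model the product a * b equals the difference between the reduced-model and full-model coefficients, which is why the test can be read as a test of that difference.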
Panel B of Table 2 shows the KHB estimates for childhood smoking. Note that these models add parental smoking as an additional control. The direct effect in the reduced model is 2.264 and is statistically significant at p < .001, indicating that those who smoked regularly in childhood display 2.264 more years of GrimAge acceleration than those who did not smoke as children. Interestingly, when all mediators are introduced in the KHB regression, the direct effect remains high (1.186) and statistically significant. Thus, regardless of whether respondents engaged in smoking in adulthood or in later life, those who smoked as children display 1.186 more years of GrimAge acceleration than those who did not. In other words, this effect is independent of subsequent cumulative behavioral risks. The indirect effects are also informative: the largest mediators are the adulthood and later-life smoking indicators (mediating, respectively, 13.29% and 22.75% of the effect). Thus, a substantial share of the effect of childhood smoking operates via a higher likelihood of engaging in behavioral risks in adulthood.
Panel C of Table 2 shows results for adulthood smoking as mediated by later-life smoking behaviors. Here, too, we introduce prior smoking indicators as controls (i.e., parental smoking and childhood smoking). The direct effect in the reduced model is high: 4.231 years of GrimAge acceleration. However, when mediators are introduced in the model, the coefficient drops to 2.784 and remains statistically significant. These findings indicate that a large part of the effect (34.19%) is mediated through engagement in subsequent smoking in later life. Interestingly, the remainder of the effect is independent of subsequent engagements, acting as a potential indicator of a permanent DNAm signature during adult years.
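The share of each effect that is mediated can be checked with simple arithmetic on the reduced-model and full-model coefficients quoted in the text for Table 2. The sketch below is an illustration of that arithmetic only; the function name is an assumption, and the inputs are the rounded coefficients reported above, so results may differ in the last digit from percentages computed on unrounded estimates.

```python
def percent_mediated(reduced, full):
    """Share of the reduced-model coefficient that disappears once mediators are added."""
    if reduced == 0:
        raise ValueError("reduced-model coefficient must be nonzero")
    return 100.0 * (reduced - full) / reduced


# (reduced-model coefficient, full-model coefficient) as quoted in the text.
table2_grimage = {
    "Panel A, parental smoking": (0.788, 0.236),
    "Panel B, childhood smoking": (2.264, 1.186),
    "Panel C, adulthood smoking": (4.231, 2.784),
}

for label, (reduced, full) in table2_grimage.items():
    share = percent_mediated(reduced, full)
    print(f"{label}: {share:.1f}% mediated, {100 - share:.1f}% remaining")
```

With these inputs, roughly 70% of the parental smoking coefficient, 48% of the childhood smoking coefficient, and 34% of the adulthood smoking coefficient are accounted for by the mediators; the last figure agrees with the reported 34.19% up to rounding.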
Table 3 displays the total and indirect effects of KHB mediation analysis for the Levine age acceleration measure. The specifications of the KHB models are identical to those displayed in Table 2. However, because we find no direct effects of parental smoking on the Levine measure, we do not test for mediation in this case. For childhood smoking, the results show full mediation via subsequent engagements, with 96.64% of the effect mediated by other indicators. This finding provides strong evidence for a cumulative risks model. The direct and indirect effects in panel B (adulthood smoking) show that the mediators are similar to those obtained in Table 2: subsequent engagement in later life mediates 16.04% of the effect of adulthood smoking. These mediators are introduced independently and pass the Sobel test for mediation (indicating that the difference in coefficients of direct effects between reduced and full models is statistically significant).
Table 4 displays the total and indirect effects of KHB mediation analysis for the PoAm measure. The specifications of the KHB models are identical to those displayed in Table 2, and very similar patterns are obtained with the GrimAge acceleration measure and the PoAm measure. In panel A, the direct effect of parental smoking in the reduced form (i.e., not including any mediators, only controls) is 0.0113 and statistically significant at p < .001, indicating that PoAm is approximately 0.011 higher for respondents who were exposed to parental smoking than for those who were not. Here, too, the effect is fully mediated by subsequent engagements. The indirect effects show that the largest mediators are similar to those obtained in Table 2. Childhood smoking mediates 13.20% of the effect, and adulthood and later-life smoking mediate, respectively, 33.41% and 34.87% of the effect.
Panel B of Table 4 shows KHB regression estimates for childhood smoking. These models include parental smoking as an additional control. The direct effect in the reduced model is 0.032 and is statistically significant at p < .001: PoAm is 0.032 higher for respondents who smoked regularly as children than for those who did not. Interestingly, when all mediators are introduced in the KHB regression, the direct effect is reduced by two thirds but remains statistically significant. Thus, regardless of whether respondents engaged in smoking as adults, those who smoked as children display a PoAm that is 0.011 higher than that of those who did not smoke. In other words, similar to the results obtained in Table 2, this effect is independent of subsequent smoking behaviors.
Panel C of Table 4 shows results for adulthood smoking as mediated by later-life smoking. Similar to the results shown in Table 2, part of the effect operates via later-life smoking, but the direct effect in the full model remains large and statistically significant after the mediators are introduced.
Conclusion
A large body of literature has shown that early-life exposure to smoking leads to adverse health outcomes in later life primarily because of subsequent engagements with smoking (D'Agostino et al. 2008; Kawachi et al. 1994). These findings imply that smoking cessation can reduce disease risks to levels comparable to those of nonsmokers (Duncan et al. 2019; Lloyd-Jones et al. 2017). Yet, more recent research indicates that exposure to smoking might have a permanent effect on the epigenome by changing DNAm patterns in specific regions. These changes can be reflected in certain epigenetic age measurements later in life and might result in the development of chronic conditions, even if smoking behaviors cease (Beach et al. 2015; Gao et al. 2016; Klopack et al. 2022). If so, the literature might have overlooked the presence of permanent damage due to early-life exposure to tobacco.
In this study, we investigate whether the effects of childhood exposure to cigarette smoke on epigenetic aging are more compatible with the sensitive periods model or the chains of risk model. Utilizing KHB regressions, we evaluate the effects of early-life exposures via direct and indirect routes using GrimAge and Levine age acceleration and PoAm as indicators of epigenetic aging.
The results show processes consistent with both the sensitive periods model and the chains of risk model. Parental smoking is most consistent with a chains of risk model: exposure entails a higher likelihood of adulthood smoking and thereby greater epigenetic acceleration. However, childhood smoking effects are consistent with both models. Regarding GrimAge acceleration, half of the effect of early-life smoking is compatible with a sensitive periods model (i.e., independent effects beyond further smoking), and the other half is consistent with a chains of risk model (operating via further engagement in smoking). The results for PoAm are similar: two thirds of the effect of childhood smoking is most consistent with a chains of risk model, but one third of the effect persists despite engagement in further smoking throughout adulthood.
Previous research on life course smoking and epigenetic aging has obtained estimates pointing in the same direction as ours, lending credence to our findings. Klopack and colleagues (2022) used the HRS to evaluate epigenetic clocks as valid mediators connecting lifetime exposure to smoke and chronic conditions and mortality. Despite utilizing different measures and modeling strategies, our study echoes and enhances their findings.
This study has several limitations, largely related to data constraints. First, we observe epigenetic clocks at only one time point (in 2016). A full assessment of whether early-life exposures to behavioral risks have effects consistent with a sensitive periods model or a chains of risk model requires epigenetic information at different points throughout the life course. Second, our measures of early-life exposure to tobacco smoke are imperfect. Our assessments of childhood smoking and parental smoking are based on retrospective self-reports, which are susceptible to recall and social desirability bias. Moreover, the exact timing of exposure is unclear, especially regarding parental smoking. Given that developmental stages differ in their sensitivity to epigenetic modifications, having more detailed data on when such exposures and engagements occur is crucial.
An additional limitation concerns the use of specific epigenetic clocks to capture effects of early-life exposures or engagement with tobacco smoke. The biological mechanisms of aging that epigenetic clocks capture remain largely unknown. Although methylation changes are correlated with chronological age, each clock is constructed differently and draws on its own set of CpGs, and the underlying methylation variations are caused by distinct biological mechanisms and processes. Further research is needed to clarify which epigenetic clocks are most appropriate for different physiological systems and their development. However, evaluations of the GrimAge clock's performance indicate that it significantly enhances the predictive capability of epigenetic clocks in detecting age-related deterioration across a variety of clinical phenotypes (McCrory et al. 2021).
Another potential limitation to consider is related to selection effects on both mortality and volunteering to participate in the VBS module. We believe this selection understates our effects because individuals who opted to participate have higher socioeconomic status, higher self-rated health, and lower mortality hazards. For detailed analyses, see the online appendix.
Despite these limitations and caveats, the results of this study have several implications. First, we find partial evidence supporting the notion that early life is vulnerable to smoking effects that are robust to subsequent amelioration. This result highlights how crucial it is for children not to engage in regular smoking, regardless of whether they later quit or never smoke in adulthood. Second, the study provides strong evidence supporting the chains of risk model, indicating the positive effects of disengaging from tobacco smoking early on despite early-life regular smoking.
Acknowledgments
This article received funding from the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Program (grant agreement 788582). García and Lund contributed equally to this article as third authors.
Notes
Acceleration measured with GrimAge is associated with many age-related conditions, lifestyle factors, and clinical biomarkers (Lu et al. 2019). GrimAge is particularly interesting in measuring the association with smoking: DNAm pack-years (the DNAm-based CpG sites associated with smoking and used in GrimAge) could be capturing intrinsic variation among individuals with lasting smoking-related biological damage (Lu et al. 2019).
The PoAm clock was developed in several steps. First, using three successive waves of the Dunedin Multidisciplinary Health & Development Study, Belsky et al. (2020) quantified the rate of change in 18 blood chemistry and organ system function biomarkers among participants at ages 26, 32, and 38. Then, for each individual, they calculated how each biomarker’s rate of change varied from the cohort norm. Finally, they combined the 18 individual rates of change to compute a composite score for each study member, which they referred to as the pace of aging (PoA). Second, they validated the PoA indicator and measured its association with physical function, cognitive decline, and early-life impacts. Third, they used elastic-net regression to derive an algorithm capturing DNA methylation patterns linked with variation among individuals in their pace of aging. The algorithm was termed DunedinPoAm, and the resulting measure is estimated in years per chronological year.
We implemented multiple robustness checks specifying different cutoff points before and after age 50. The clustering results do not differ, indicating that the AGNES algorithm identifies approximately the same groupings of respondents.
Here, too, we implemented multiple robustness checks, calculating clusters using different multiple imputation strategies as well as the nonimputed trajectories. Again, the clustering results do not differ, meaning that groupings are not contingent on imputation strategy.
A few cases whose smoking behavior sequences include brief smoking episodes were classified into the nonsmoking category by the AGNES algorithm. We conducted robustness checks to address potential misclassification issues. Results are available in the online appendix.
Individuals were asked the following question: “Consider your health while you were growing up, before you were 16 years old. Would you say that your health during that time was excellent, very good, good, fair, or poor?”
All sequence analysis was conducted using the R package TraMineR (Gabadinho et al. 2011), and sequence plots were generated using the R package ggseqplot (Raab 2022).
The elbow method provides a visual aid for assessing the trade-off between total variation explained and the number of clusters, whereas silhouette width compares within-cluster cohesion with between-cluster separation; we use both to guide our choice of the number of clusters (Cornwell 2015). As we show in the online appendix, the optimal number of clusters for both adulthood and late-life smoking is three.
Our results differ slightly from those of the Klopack et al. (2022) study, perhaps because of different measures and methodologies. Our study utilizes a different HRS parental smoking variable that has fewer missing cases. Additionally, we utilize multiple smoking measures condensed in the adult and later-life smoking clusters, as well as different regression methodologies. Ultimately, we believe our results enhance the Klopack et al. (2022) study in valuable ways.