## Abstract

In this analysis, guided by an evolutionary framework, we investigate how the human genome as a whole interacts with historical period, age, and physical activity to influence body mass index (BMI). The genomic influence is estimated by (1) heritability or the proportion of variance in BMI explained by genome-wide genotype data, and (2) the random effects or the best linear unbiased predictors (BLUPs) of genome-wide association studies (GWAS) data on BMI. Data come from the Framingham Heart Study (FHS) in the United States. The study was initiated in 1948, and the obesity data were collected repeatedly over the subsequent decades. The analysis samples are drawn from a pool of >8,000 individuals in the FHS. The hypothesis testing based on the Pitman test, permutation Pitman test, *F* test, and permutation *F* test produces three sets of significant findings. First, the genomic influence on BMI is substantially larger after the mid-1980s than in the few decades before the mid-1980s within each age group of 21–40, 41–50, 51–60, and >60. Second, the genomic influence on BMI weakens as one ages across the life course, or the genomic influence on BMI tends to be more important during reproductive ages than after reproductive ages within each of the two historical periods. Third, within the age group of 21–50 and not in the age group of >50, the genomic influence on BMI among physically active individuals is substantially smaller than the influence on those who are not physically active. In summary, this study provides evidence that the influence of the human genome as a whole on obesity depends on historical period, age, and level of physical activity.

## Introduction

Obesity has consequences for morbidity and mortality. It is associated with hypertension, metabolic syndrome, dyslipidemia, type 2 diabetes, coronary heart disease, osteoarthritis, stroke, and several types of cancers (Lewis et al. 2009; National Institutes of Health 1998). Currently, more than two-thirds of adults in the United States are overweight or obese (defined as a body mass index (BMI) of 25 kg/m^{2} or higher), and about one-half of them are obese (BMI ≥ 30) (National Center for Health Statistics 2015).

Is obesity a result of genetic destiny or personal choices about eating and exercise? In the quest for genetic evidence over the past five to six decades, most research has relied on biometrical methods, that is, family and twin studies based on genetically related individuals (Jou 2014). These studies suggest that the heritability of BMI ranges from 40 % to 70 % (Maes et al. 1997; Stunkard et al. 1986). Recent genome-wide association studies (GWAS) and GWAS consortia studies have confirmed that dozens of genetic loci are associated with obesity (Frayling et al. 2007; Locke et al. 2015; Speliotes et al. 2010). These GWAS-identified genetic variants explain about 3 % of the variation in BMI.

Since the 1980s, the United States has seen an increase of almost 200 % in obesity prevalence. Why did this obesity epidemic happen so dramatically and quickly when human gene pools could not possibly have changed to such a degree? Environmental factors such as eating and exercise must have played an important part. Thus, obesity is a complex health problem that is influenced by both genes and environment, and possibly the interaction between the two.

Gene-environment (GxE) interaction holds that an environment influences how sensitive we are to the effect of a genotype and vice versa. Ignoring GxE interactions forces us to estimate only an average genetic effect (averaged over all environments) or an average environmental effect (averaged over all genotypes), thus potentially missing genetic effects, environmental effects, or both effects entirely.

The present analysis investigates whether the human genome interacts with nongenetic factors (historical period, life course, and physical activity) to influence obesity. We use the term “genome” rather than “gene” to indicate our objective of estimating the effects of the human genome rather than specific genes. One main innovation of this GxE interaction analysis is its consideration of the influences of an entire human genome, using hundreds of thousands of genetic variables simultaneously in a single regression model (Yang et al. 2010).

## Background

The joint effects of human genome and environmental factors, such as physical activity and dietary patterns, can be understood from a perspective of evolution (Bellisari 2008). The “thrifty genotype” hypothesis proposed several decades ago offers an evolutionary explanation of the current obesity epidemic (Neel 1962). This hypothesis suggests that thrifty genes were selected to give advantages to individuals by storing extra calories as body fat in times when food was abundant. Thrifty genes were advantageous because throughout almost all of human history and all over the world, food was scarce, and the level of physical activity was high. However, thrifty genes have become disadvantageous in the contemporary developed world, where food is plentiful and inexpensive, and intense physical activity is typically unnecessary.

If thrifty genes are evolutionarily selected, they must be significant for reproduction. This suggests that thrifty genes have larger effects at reproductive ages than at other ages. Empirically, we observe a number of peaks in fat storage over the life course (Zafon 2007). The first two peaks are an adaptive strategy for reproduction. Fat storage in infancy assists in the transition from placental period to lactation and then the transition from lactation to solid food. Fat storage during pregnancy for the mother is considered a safeguard for an infant’s lactation period. Humans often experience a third peak in fat deposition at older ages, but its evolutionary significance is not as clear as that in reproductive ages.

Although difficult to prove (Lazar 2005; Speakman 2008), the thrifty genotype hypothesis is plausible and timely, hypothesizing that in human populations, genes of certain forms are conducive to obesity. Also derived from the hypothesis is how some environmental factors may interact with genetic propensities for obesity. The evolutionary theory predicts that obesity genes tend not to be expressed in “normal times” with respect to food and physical activity, or the BMIs of human individuals possessing obesity genes tend not to become “overweight” unless food is abundant and inexpensively available and level of physical activity is low.

Specifically for the current analysis, we hypothesize a smaller genomic influence on BMI among individuals engaged in heavy physical activity than among individuals not engaged in such activity. Previous GxE interaction studies targeting one or a few genetic variants report that the effect of the *FTO* gene on obesity among individuals leading a physically active lifestyle is attenuated by about 30 % when compared with those who are inactive (Andreasen et al. 2008; Cauchi et al. 2009; Rampersaud et al. 2008). It remains to be seen whether physical activity exerts a similar attenuating effect on the genome-wide susceptibility for obesity.

Consistent with the evolutionary theory, we hypothesize a larger genomic influence on BMI in the current obesity epidemic since the mid-1980s than before the epidemic or before the mid-1980s. The current obesity epidemic is often characterized as an “obesogenic” environment wherein unhealthy food is more easily available, exercise is reduced, and obesity genes are expected to be more expressed. The division of the past decades into a pre-obesogenic and an obesogenic environment could only be approximate and is based on national surveys of obesity over the same periods. The prevalence of obesity in the United States remained approximately the same from the 1960s through 1980s but has increased dramatically since the mid-1980s (Flegal et al. 1998, 2002, 2012). Our data from FHS confirm this national trend (Fig. 1). The level of BMI before 1970 is similar to that between 1970 and 1985, and a much higher level of BMI is observed for the period after about the mid-1980s.

Because of a much clearer evolutionary significance for obesity genotype in reproductive ages than after reproductive ages, we hypothesize a larger genomic effect on obesity among females and males during reproductive ages than after reproductive ages. The evolutionary theory predicts such an effect only among females and does not comment on a similar effect among males (Zafon 2007). We extend the prediction to males because the same genotype is likely to be passed onto both females and males.

In this analysis, we investigate whether and to what extent historical period, life course, and level of physical activity interact with the genome to influence obesity. Equivalently, we seek answers to the following three questions. Does physical activity reduce the genome-wide genetic susceptibility for obesity? Has the genome-wide influence on obesity increased over the past few decades, when the prevalence of obesity rose dramatically in the United States? Does the genome-wide genetic susceptibility become smaller after reproductive ages than during reproductive ages?

This investigation uses data from FHS. Since 1948, FHS has repeatedly collected information on health and health behavior of three cohorts of more than 10,000 individuals. Recently, genome-wide genotype data were collected from about 9,000 of these individuals. By extending a recent mixed-model approach (Yang et al. 2010), our GxE interaction analysis considers the overall impact of the human genome as a whole on obesity in a single regression model.

This analytical approach represents a major methodological shift from the traditional fixed-effects (or effects of observed variables) GWAS strategy in genomic data analysis. For example, one of the questions that we ask is, Does physical activity reduce the genome-wide genetic susceptibility for obesity? In spite of the generally recognized importance of physical activity and rapid advances in genomic technologies in recent decades, this seemingly straightforward question has not been addressed. The investigation of GxE interaction involving a few genetic variants is uncomplicated, but incorporating genome-wide genotype data into GxE interaction analysis has been a formidable challenge.

A central challenge in working with genomic data is to develop a way to take advantage of the entire panel of GWAS data simultaneously in one analysis. Since the mid- to late 2000s, a GWAS has typically measured 500k–2,500k genetic variables for each individual. The extremely large number of genetic variables poses two difficulties: multiple testing, and too many predictors for a single regression model. The current prevailing GWAS strategy estimates the effect of one single nucleotide polymorphism (SNP) at a time and sets the significance threshold for the *P* value at an extremely low level of 5 × 10^{–8}. The effects of the large number of genetic variables in these fixed-effects (or effects of observed variables) regression models are estimated separately. In a fixed-effects framework, it is impossible to include all the SNPs as independent variables in a single regression model because the number of predictors is much larger than the number of observations. The difficulty of obtaining credible GxE interaction findings via this GWAS strategy was recently demonstrated (Boardman et al. 2014).

In a random-effects model, we calculated the heritability of BMI as well as the random effects of this large number of SNPs on BMI. Both biometrical data (nonmolecular family data) and GWAS data can be used to estimate heritability. However, we reason that GWAS data consisting of 500k–2,500k genetic variables per person must contain substantially more information than biometrical data in which genetic information depends entirely on genetic relatedness among relatives.

We demonstrate that such information can be tapped by examining random genomic effects based on molecular data. Once treated as random, the entire panel of genetic polymorphisms can be considered simultaneously. The variance of these genomic effects can be estimated with a random-effects model. The variance of the random genomic effects is an indicator of the overall size of genomic effects. The difference in the size of the genomic effects across environments can be tested formally without having to deal with the issue of multiple testing.

## Data and Methods

### Data Source

The Framingham Heart Study (FHS) (FHS 2012) is a community-based, prospective, longitudinal study following three generations of participants. The Original Cohort enrolled in 1948 (*N* = 5,209); the Offspring Cohort enrolled in 1971 (*N* = 5,124), consisting of the children of the Original Cohort and spouses of the children; and the Generation Three Cohort enrolled in 2002 (*N* = 4,095), consisting of the grandchildren of the Original Cohort. These individuals are of predominantly European origin. About 1 % of the subjects self-report as native, black, and Asian American. All study subjects undergo physical exams and complete written questionnaires at regular intervals. Weight and height were measured on FHS subjects repeatedly at dozens of medical exams over the decades. These measures at all adult ages and over several decades provide an opportunity for an age-period analysis.

At five times between 1979 and 2008, FHS asked the study subjects how many hours per day they engaged in activities such as heavy household work, heavy yard work (such as stacking or chopping wood), and exercise (such as intensive sports: e.g., jogging, swimming). About 85 % of the FHS subjects whose BMI and genotype are available responded. A binary variable is constructed that divides the FHS individuals into two groups: those who engaged in heavy activity (*N* = 2,102) and those who did not (*N* = 2,216). Splitting the entire sample into more than two groups encounters the issue of sample size to be discussed in a later section. Our measure of physical activity is merely a proxy for energy expenditure. An accurate measure needs to take into consideration numerous dimensions, such as frequency, intensity, duration, energy cost, efficiency, and locomotory effects (Wells 2006).

Of the 14,428 study subjects in FHS, a total of 9,237 consenting individuals were genotyped: 4,986 women, and 4,251 men. Genotyping for FHS participants was performed using the Affymetrix 500K GeneChip array. The Y chromosome was not genotyped. A standard quality control filter is applied to the genotype data. Individuals with 5 % or more missing genotype data were excluded from analysis. SNPs that are on X chromosomes, that have a call rate of ≤99%, or that have a minor allele frequency of ≤0.01 were also eliminated from analysis. The application of the quality control procedures left 8,738 individuals with 287,525 SNPs from the 500k genotype data. Genotype data were converted to minor allele frequencies for analysis.
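The quality control thresholds above can be illustrated with a minimal sketch. It assumes genotypes are stored as allele counts (0, 1, or 2) with `None` for a missing call, and that individuals are filtered before SNPs; the source does not state the order, and the function names are hypothetical. The X-chromosome exclusion is omitted because it requires chromosome annotation.

```python
def call_rate(values):
    """Fraction of non-missing entries (missing calls are coded as None)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def minor_allele_frequency(snp_column):
    """Minor allele frequency from allele counts among non-missing genotypes."""
    observed = [g for g in snp_column if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def quality_control(genotypes, max_missing=0.05, min_call_rate=0.99, min_maf=0.01):
    """genotypes[i][j] is the allele count of individual i at SNP j.

    Returns (kept_individuals, kept_snps) as lists of indices.
    """
    # Exclude individuals with 5 % or more missing genotype data.
    kept_individuals = [
        i for i, row in enumerate(genotypes)
        if 1 - call_rate(row) < max_missing
    ]
    # Exclude SNPs with a call rate <= 99 % or a minor allele frequency <= 0.01.
    kept_snps = []
    for j in range(len(genotypes[0])):
        column = [genotypes[i][j] for i in kept_individuals]
        if call_rate(column) > min_call_rate and minor_allele_frequency(column) > min_maf:
            kept_snps.append(j)
    return kept_individuals, kept_snps
```

In practice, dedicated genetics software would apply these filters to the full array; the sketch only makes the thresholds concrete.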

### A Mixed Model for GxE Interaction Analysis for GWAS Data

Our GxE interaction approach builds on the mixed linear model of Goddard et al. (2009) and Yang et al. (2010). These mixed models are a special class of the general mixed-effects or random-effects linear models. A key innovation of Yang and colleagues is their recognition that, through a transformation, the models can estimate the effects of a million or more genetic variables for each individual in a single regression model. This was accomplished by not estimating fixed effects of the large number of variables. Rather, the effects of the large number of genetic variables are assumed to follow a random distribution, and the distribution parameters can be estimated. After the mixed model is estimated, the random effect of each SNP can be derived.

The mixed model can be written as

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\boldsymbol{\mu} + \boldsymbol{\varepsilon}, \tag{1}$$

where **Y** is an *n* × 1 vector of the phenotype, with *n* being the number of observations; **β** is a vector of fixed effects; **μ** is a vector of SNP effects with **μ** ~ *N*(0, **I**_{μ}σ_{μ}^{2}), where **I**_{μ} is an *N* × *N* identity matrix with *N* being the number of SNPs; **ε** is a vector of residual effects with **ε** ~ *N*(0, **I**_{ε}σ_{ε}^{2}), where **I**_{ε} is an *n* × *n* identity matrix; and **W** is an *n* × *N* standardized genotype matrix with the *ij*th element

$$w_{ij} = \frac{s_{ij} - 2p_{j}}{\sqrt{2p_{j}(1 - p_{j})}},$$

where *s*_{ij} is the number of copies of the reference allele for the *j*th SNP of the *i*th individual, and *p*_{j} is the frequency of the reference allele.

Define **A** = **WW**′/*N* and σ_{g}^{2} = *N*σ_{μ}^{2}. Then Eq. (2) is mathematically equivalent to Eq. (1):

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \boldsymbol{\varepsilon}, \tag{2}$$

where **g** is an *n* × 1 vector of the total genetic effects of the individuals with **g** ~ *N*(0, **A**σ_{g}^{2}), and **A** is the genetic relationship matrix (GRM) between individuals. Because *N* (the dimension of **W** or the number of SNPs) is reduced to *n* (the dimension of **g** or the number of individuals), the entire panel of 500k–2,500k SNPs can be incorporated into this single mixed model. Yang et al.’s (2010) mixed model based on about 4,000 individuals and approximately 300,000 SNPs shows that 0.45 of the variance in human height can be explained by common SNPs. In contrast, the prevailing GWAS strategy explains about 10 % of the variance in height (Allen et al. 2010).

For the GxE interaction analysis, the model is extended so that the SNP effects differ by environmental category:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \sum_{k=1}^{r} \mathbf{G}_{k}\boldsymbol{\mu}_{ek} + \boldsymbol{\varepsilon}, \tag{3}$$

where **G**_{k} is a standardized genotype matrix with **G**_{k} = **W** for individuals in the *k*th environmental category and with **G**_{k} = 0 for individuals in the other environmental categories; **μ**_{ek} is a vector of SNP effects for individuals in the *k*th environmental category, with **μ**_{ek} ~ *N*(0, **I**_{μ}σ_{μek}^{2}), where **I**_{μ} is an *N* × *N* identity matrix with *N* being the number of SNPs, and σ_{μek}^{2} can be understood as the total variance explained by the *N* SNPs for the individuals in the *k*th environmental category; and *r* is the number of categories of the environmental factor. All models are estimated with control for sex and the first seven principal components for population stratification (Price et al. 2006).
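To make the reduction from *N* SNPs to *n* individuals concrete, the following is a minimal sketch of how a standardized genotype matrix **W** and the GRM **A** = **WW**′/*N* could be built. It assumes the standardization of Yang et al. (2010), *w*_{ij} = (*s*_{ij} − 2*p*_{j}) / √(2*p*_{j}(1 − *p*_{j})), estimates *p*_{j} from the sample itself, and uses plain Python lists on a toy dataset; the function names are hypothetical, and this is not GCTA's implementation.

```python
import math


def standardize_genotypes(counts):
    """counts[i][j]: copies (0, 1, 2) of the reference allele, individual i, SNP j.

    Returns W with w_ij = (s_ij - 2 p_j) / sqrt(2 p_j (1 - p_j)).
    """
    n = len(counts)
    n_snps = len(counts[0])
    W = [[0.0] * n_snps for _ in range(n)]
    for j in range(n_snps):
        # Assumption: the allele frequency is estimated from the sample.
        p = sum(row[j] for row in counts) / (2 * n)
        sd = math.sqrt(2 * p * (1 - p))
        if sd == 0:
            raise ValueError(f"SNP {j} is monomorphic and cannot be standardized")
        for i in range(n):
            W[i][j] = (counts[i][j] - 2 * p) / sd
    return W


def genetic_relationship_matrix(W):
    """A = W W' / N: an n x n matrix, however many SNPs (N) there are."""
    n = len(W)
    n_snps = len(W[0])
    return [
        [sum(W[i][k] * W[l][k] for k in range(n_snps)) / n_snps for l in range(n)]
        for i in range(n)
    ]
```

Whatever the number of SNPs, the resulting matrix has one row and one column per individual, which is why the full SNP panel fits into a single mixed model.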

The method of genome-wide complex trait analysis (GCTA) is a special case of the general mixed model (Searle et al. 1992), and the model is built on some assumptions. It assumes that the effects of the large number of observed genetic variables follow a normal distribution. This differs from multilevel or longitudinal models (Goldstein 2011; Guo and Hipp 2004; Raudenbush and Bryk 2002), which are also special cases of the general mixed model but which assume that the unobserved effects at level 2 or above follow a normal distribution. The GCTA model implies that all genetic variables, which can amount to 2,500k, have an effect on the outcome except for those right on the zero value along the *x*-axis.

### Greater Statistical Power From Genomic Random Effects

Our GxE interaction analysis, implemented with the software GCTA (Yang et al. 2011), yields two sets of findings on genomic influence. The first set consists of estimates of heritability, or the proportion of variance in BMI that is explained by GWAS data, in each subsample defined by period, age, and/or physical activity. The GCTA models allow **Xβ** and **ε** to vary by environmental category *k*. The second set consists of the estimated random effects on BMI, or the best linear unbiased predictors (BLUPs) of **μ**. Given that Eqs. (1) and (2) (i.e., **Y** = **Xβ** + **Wμ** + **ε** and **Y** = **Xβ** + **g** + **ε**) are mathematically equivalent, the BLUP of **μ** can be obtained from the BLUP of **g** by $\hat{\boldsymbol{\mu}} = \mathbf{W}^{T}\mathbf{A}^{-1}\hat{\mathbf{g}}/N$ (Yang et al. 2010). The second set of results includes a large number of random effects on BMI for each age-period and age-activity subsample. Also estimated is the variance of these random effects of GWAS data in each subsample. A larger variance of the random effects implies that larger proportions of random effects are located away from zero, and thus indicates larger genomic effects on BMI.
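The transformation from the BLUP of **g** to the BLUP of **μ** can be motivated by a short derivation. This is a sketch under the model's normality assumptions, taking **A** to be invertible and using **g** = **Wμ**, **A** = **WW**′/*N*, and σ_{g}^{2} = *N*σ_{μ}^{2}:

```latex
\begin{aligned}
\operatorname{Cov}(\boldsymbol{\mu}, \mathbf{g})
  &= \operatorname{Cov}(\boldsymbol{\mu}, \mathbf{W}\boldsymbol{\mu})
   = \sigma_{\mu}^{2}\,\mathbf{W}^{T}, \\
\operatorname{Var}(\mathbf{g})
  &= \mathbf{W}\mathbf{W}^{T}\sigma_{\mu}^{2}
   = \mathbf{A}\,N\sigma_{\mu}^{2}
   = \mathbf{A}\,\sigma_{g}^{2}, \\
\operatorname{E}[\boldsymbol{\mu} \mid \mathbf{g}]
  &= \operatorname{Cov}(\boldsymbol{\mu}, \mathbf{g})\,
     \operatorname{Var}(\mathbf{g})^{-1}\,\mathbf{g}
   = \sigma_{\mu}^{2}\,\mathbf{W}^{T}
     \bigl(\mathbf{A}\,N\sigma_{\mu}^{2}\bigr)^{-1}\mathbf{g}
   = \mathbf{W}^{T}\mathbf{A}^{-1}\mathbf{g}/N .
\end{aligned}
```

Replacing **g** with its BLUP gives the expression for the BLUP of **μ**; the variance component σ_{μ}^{2} cancels, so the SNP-level effects follow directly from the individual-level genetic effects.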

Previous applications of GCTA have reported only heritability estimates. Heritability estimates, based on ratios of variance, are intrinsically unstable and low in statistical power. In this analysis, in addition to heritability estimates, we take advantage of random genomic effects. We conduct our GxE interaction analysis by comparing two sets of random effects at a time, each set corresponding to a particular social environment. We test whether the two sets of random genomic effects are statistically different.

### Hypothesis Testing

We performed six sets of hypothesis tests: two for heritability, and four for variance of random effects. In the heritability analysis for genome-age-period interaction, the first set of tests examines whether each of the eight heritability estimates is statistically larger than zero; the second set of tests examines whether these heritability estimates are different from one another. The other four sets of tests—Pitman test, permutation Pitman test, *F* test, and permutation *F* test—examine whether the random genomic effects from two environments are statistically different. As we argued earlier, random genomic effects should contain considerably more information than heritability estimates. We performed multiple tests for the random effects because such hypothesis testing has not been done before. Multiple tests would increase the robustness of the findings.

We also considered other procedures, such as bootstrapping (Efron and Tibshirani 1993). A bootstrapping procedure requires repeatedly sampling a portion of observations, and this leads to genetic relatedness in each subsample. Such relatedness would bias the estimates in GCTA analysis (see upcoming discussion).

We used a Pitman test (Howell 1997; Pitman 1939; Snedecor and Cochran 1967) to test the hypothesis that one set of random effects or the BLUPs on BMI in one “environmental” subsample is larger than another in a different “environmental” subsample. The Pitman test was developed to test the null hypothesis that two correlated samples are drawn from populations with identical variances. Because a pair of subsamples on which a Pitman test was performed could contain the same or related individuals, the two subsamples can be correlated. The random effects in our analysis are paired and correlated because the two sets of random effects are based on the same set of SNPs. Even if the individuals are not related, the BLUPs can still be paired and correlated.

The Pitman test proceeds in three steps. First, the test computes the *F* ratio of the larger variance to the smaller variance in the pair of subsamples. Second, the test computes

$$t = \frac{(F - 1)\sqrt{N - 2}}{2\sqrt{F(1 - r^{2})}},$$

where *N* is the number of SNPs, and *r* is the Pearson correlation between the two sets of BLUPs estimated from two subsamples, respectively (e.g., one subsample consists of individuals aged 21–40 before 1985, and the second subsample consists of individuals aged 21–40 after 1985). Finally, *t* is evaluated on *N* – 2 degrees of freedom. To summarize, we calculated the BLUPs for each age-period and age–physical activity subsample, and employed a Pitman test to compare the distribution of the BLUPs between age groups within each historical period; between historical periods within each age group; and between the physically inactive and the physically active within each age group.
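The three steps of the Pitman test can be sketched in a few lines. This is an illustration only, assuming the textbook form of the statistic, *t* = (*F* − 1)√(*N* − 2) / (2√(*F*(1 − *r*²))), and two equal-length lists of paired BLUPs; the function names are hypothetical, and the *P* value lookup against the *t* distribution is left out.

```python
import math


def sample_variance(x):
    """Unbiased sample variance."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def pearson_r(x, y):
    """Pearson correlation between two paired samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pitman_test(x, y):
    """Return (F, r, t, df) for two paired samples of equal length."""
    if len(x) != len(y):
        raise ValueError("the two samples must be paired")
    n = len(x)
    vx, vy = sample_variance(x), sample_variance(y)
    # Step 1: ratio of the larger variance to the smaller variance.
    f_ratio = max(vx, vy) / min(vx, vy)
    # Step 2: t statistic, which also depends on the correlation between the pairs.
    r = pearson_r(x, y)
    t = (f_ratio - 1) * math.sqrt(n - 2) / (2 * math.sqrt(f_ratio * (1 - r ** 2)))
    # Step 3: t is referred to a t distribution with n - 2 degrees of freedom.
    return f_ratio, r, t, n - 2
```

Because *t* grows with √(*N* − 2), a very large number of SNPs makes even a modest variance ratio highly significant, which is worth keeping in mind when reading the results.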

In addition to a conventional Pitman test, we performed permutation Pitman tests. Permutation tests are exact tests and do not depend on distributional assumptions. The permutation Pitman test was implemented as follows. Suppose *X*_{1}, . . . , *X*_{N}; *Y*_{1}, . . . , *Y*_{N} are BLUPs from two samples, respectively (e.g., age 21–40 before or after 1985), where *N* is the number of genetic variables. We repeated the following steps (a–c) 10,000 times: (a) mix the 2*N* observations together; (b) randomly split them into two samples of size *N*; and (c) pretending that the first sample is *X* and the second sample is *Y*, calculate a Pitman test statistic and the corresponding *P* value. We then (d) used these 10,000 test values as the null distribution and found the location of the true Pitman value in that distribution. If the true Pitman value lies at the extreme of the null distribution, the Pitman test is significant. The permutation Pitman *P* value is calculated as the percentage of permutation Pitman values greater than the true value; the permutation Pitman test is a two-sided test.
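The permutation procedure can be sketched as follows. This is a minimal illustration rather than the code used in the analysis: the function names and the fixed random seed are assumptions, and the Pitman statistic is written in its textbook form. Because the statistic uses the ratio of the larger to the smaller variance, it is never negative, so counting permuted values above the observed one treats both directions of difference alike.

```python
import math
import random


def _variance(x):
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def pitman_t(x, y):
    """Pitman t statistic for two paired samples of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx, vy = _variance(x), _variance(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    r_squared = cov * cov / (vx * vy)
    f_ratio = max(vx, vy) / min(vx, vy)
    return (f_ratio - 1) * math.sqrt(n - 2) / (2 * math.sqrt(f_ratio * (1 - r_squared)))


def permutation_pitman(x, y, n_perm=10_000, seed=0):
    """Return (observed statistic, permutation P value)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    observed = pitman_t(x, y)
    pooled = list(x) + list(y)  # (a) mix the 2N observations together
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # (b) randomly split into two samples of size N
        permuted = pitman_t(pooled[:n], pooled[n:])  # (c) statistic on the split
        if permuted > observed:
            exceed += 1
    # (d) share of the null distribution lying beyond the observed value
    return observed, exceed / n_perm
```

A permutation *F* test would follow the same loop with the variance ratio in place of the Pitman statistic.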

*F* tests and permutation *F* tests were performed to address the potential issue of the extremely large degrees of freedom in the Pitman test. Unlike the Pitman test statistic, the *F* test statistic is not materially affected by the extremely large *N*, the number of SNPs (287,523). An *F* test is based on a ratio of two variances, and it becomes insensitive to the degrees of freedom once they exceed 1,000. In other words, the effect of the degrees of freedom is negligible regardless of whether they number 1,000 or 287,523, indicating that our significant findings about group differences are not driven by the extremely large number of SNPs. The permutation *F* tests were implemented in a similar fashion to the permutation Pitman test. The permutation *F* test is a one-sided test.

### Analytical Samples

When creating samples for GCTA, we must weigh two conflicting considerations. First, each mixed-model analysis must be based on a sample of genetically unrelated observations; second, each analysis sample must be maximized in size to obtain statistical power. Including related individuals or observations in the same mixed model results in biased estimates (Yang et al. 2010). To satisfy this requirement, we could use the information on sibling and parent–child relationships in FHS and delete one or more individuals in a known genetically related cluster. However, some individuals, such as cousins, could still be genetically related even if they are not siblings or parents and children. FHS does not provide information on these relationships. The relationship matrix estimated by GCTA can be used to address this issue by setting the cutoff point in genetic correlation very close to zero. However, too many individuals may be unduly deleted if the cutoff point is set too close to zero. GCTA’s solution is to set the cutoff at .025 (Yang et al. 2010); correlations below this level are assumed to reflect noise among genetically unrelated individuals. The cutoff of .025 was arrived at from the observation that the maximum negative genome-wide correlation is –.025. Because related individuals are only correlated positively, the negative genotype correlation is likely to be caused by noise. Assuming that positive noise has a similar magnitude as the negative one, GCTA deletes one or more individuals in each genetically correlated cluster in which individuals are correlated more than .025. We followed a similar logic and found the cutoff point in FHS to be .034.
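The cutoff logic can be illustrated with a small sketch. It assumes the GRM is available as a symmetric list of lists, takes the cutoff to be the magnitude of the most negative off-diagonal entry, and prunes with a simple greedy rule that repeatedly drops the individual with the most relationships above the cutoff. The greedy rule and the function names are illustrative assumptions, not GCTA's own algorithm.

```python
def noise_cutoff(grm):
    """Magnitude of the most negative off-diagonal genetic correlation."""
    n = len(grm)
    return -min(grm[i][j] for i in range(n) for j in range(i))


def prune_related(grm, cutoff):
    """Greedily drop individuals until no remaining pair exceeds the cutoff.

    Returns the sorted indices of the individuals that are kept.
    """
    kept = set(range(len(grm)))
    while True:
        # For each remaining individual, count partners correlated above the cutoff.
        counts = {
            i: sum(1 for j in kept if j != i and grm[i][j] > cutoff)
            for i in kept
        }
        # Ties are broken in favor of dropping the lowest index.
        worst = max(sorted(kept), key=counts.get)
        if counts[worst] == 0:
            break
        kept.remove(worst)
    return sorted(kept)
```

Dropping the most connected individual first tends to retain more observations than dropping one member of each related pair in arbitrary order, which matters when sample size is the binding constraint.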

GCTA uses only genetically independent observations. Eliminating correlation among observations due to genetic relatives and repeated measures of the same individual in the FHS data leaves only a portion of the total number of observations for GCTA analysis. This is less than optimal but unavoidable. Standard random-effects statistical methods can routinely handle datasets with complicated correlation structures that include siblings, twins, and cousins (Searle 1971); however, the two approaches should not be confused. The GCTA method uses genome-wide genotype data to estimate heritability (Yang et al. 2010), while the standard random-effects model can use genetic relatives to estimate heritability in the absence of genotype data (Guo and Wang 2002). When the GCTA method uses genetic relatives and genotype data simultaneously, it has two overlapping sources of genetic information, resulting in biases.

Ideally, in the present study, we would carry out each GxE interaction analysis for each subsample defined by historical period, age range, and level of physical activity. As a compromise given the available samples, GxE interaction analysis is performed only in age-period subsamples and age-activity subsamples, but not in age-period-activity subsamples, which would result in prohibitively small samples.

In preparation for the age-period analysis, we grouped all the BMI measures for which genotype information is available into eight subsamples by two historical periods (before and after 1985) and four age groups (21–40, 41–50, 51–60, and >60). The cutoff of the mid-1980s for historical periods is supported by the well-known obesity trends in the United States over the past decades (National Center for Health Statistics 2015) and confirmed by the FHS data described in Fig. 1.

A few of our age-group classifications appear “irregular.” In the age-period analysis, those aged 21–40 are grouped instead of those aged 21–30 and 31–40. In the analysis of physical activity, those aged 21–50 are grouped instead of those aged 21–40 and 41–50. A wider age range for the 21–40 group is justified on two grounds. First, evolutionary theory predicts a distinction with regard to the effects of genotype on obesity between reproductive ages and after reproductive ages. The age range of 21–40 overlaps reproductive ages. Second, many fewer independent measures of BMI are available in the 21–40 age group than in other age groups. The age group of 21–40 for the period after 1985 has a sample of 799, which is much smaller than the other age groups in spite of an age range about twice as wide as the age ranges of 41–50 and 51–60. Splitting the age group of 21–40 into two age groups of 21–30 and 31–40 would lead to underpowered samples.

In the age-period analysis, because a separate mixed model is estimated within each age-period subsample, genetically related measures from the same individual or related individuals can be used as long as they are included in a separate regression so that the BMI measures within a mixed-model regression remain unrelated. This strategy maximizes sample sizes. Within each of the eight age-period subsamples, our analysis used the first BMI measure obtained for each individual.
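The construction of the age-period subsamples can be sketched as below. The record layout `(person_id, exam_year, age, bmi)` and the function names are hypothetical, and assigning the year 1985 itself to the later period is an assumption, because the text gives only an approximate boundary.

```python
def age_group(age):
    """Map an age to one of the four age groups; ages under 21 are excluded."""
    if age < 21:
        return None
    if age <= 40:
        return "21-40"
    if age <= 50:
        return "41-50"
    if age <= 60:
        return "51-60"
    return ">60"


def period(exam_year):
    # Assumption: exams in 1985 are assigned to the later period.
    return "before 1985" if exam_year < 1985 else "after 1985"


def build_subsamples(records):
    """records: iterable of (person_id, exam_year, age, bmi) tuples.

    Returns {(age_group, period): {person_id: bmi}}, keeping only each
    person's first BMI measure within a given subsample.
    """
    cells = {}
    for person_id, exam_year, age, bmi in sorted(records, key=lambda rec: rec[1]):
        group = age_group(age)
        if group is None:
            continue
        cell = cells.setdefault((group, period(exam_year)), {})
        cell.setdefault(person_id, bmi)  # later measures in the same cell are ignored
    return cells
```

The same person can appear in several subsamples, but never twice within one, so the measures inside any single mixed-model regression stay distinct at the individual level; removing genetically related individuals within a subsample is a separate step.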

In the age-activity analysis, the BMI measures were grouped by age group (21–50 and >50) and level of activity. Again, the overwhelming consideration is to work with a reasonably large sample. Other groupings such as those used in the age-period analysis (21–40, 41–50, 51–60, and >60) would lead to extremely small samples because a large proportion of respondents did not respond to the question of physical activity. Within each of the 21–50 and >50 groups, we used the BMI measure that was measured at the same time as the first response to the question of physical activity.

## Results

Figure 2 presents the estimated heritability of BMI by age group and historical period. The sample size or the number of individuals used in each subsample is also provided. The heritability findings suggest that genomic effects on BMI in the obesogenic period after 1985 are larger than those before 1985 within each of the three age groups of 21–40, 41–50, and 51–60. The estimated proportions of BMI variance explained by GWAS for the two periods after and before 1985 are 0.71 versus 0.42, 0.56 versus 0.30, and 0.27 versus 0.10, respectively, for the three age groups.

The heritability findings in Fig. 2 also suggest that genomic influences on BMI tend to decline as individuals age. Before 1985, the estimated heritabilities are 0.42, 0.30, and 0.10 for the age groups of 21–40, 41–50, and 51–60, respectively. After 1985, the estimated heritabilities are 0.71, 0.56, and 0.27, respectively, for the age groups of 21–40, 41–50, and 51–60. Those older than 60 are exceptions. Neither the age effect nor the historical effect observed among the younger age groups is present among individuals older than 60.

However, the analysis of 95 % confidence intervals (Table 1) shows that only the heritability estimates in the reproductive ages (21–40 and 41–50) are significantly larger than zero, and none of the period differences within an age range or age differences within a period are significant.

We now turn to the random effects from the GCTA analysis, first examining them visually and then through formal hypothesis tests. Panel a of Fig. 3 shows that the random effects of SNPs, or the BLUPs, on BMI are substantially larger after 1985 than before 1985 within each age group. In every age group, the variance of the random effects on BMI after 1985 is much larger than that before 1985, especially for the age groups of 21–40, 41–50, and 51–60. A larger variance indicates that a higher proportion of random effects lies further away from zero and thus represents larger genomic effects on BMI.

Panel b of Fig. 3 shows that the random effects of SNPs, or the BLUPs, on BMI generally grow smaller over the life course within each historical period in FHS. Within the historical period after 1985, the variance of the random effects is strictly ordered by age: the older the age group, the smaller the variance, and thus the smaller the genomic effects on BMI. Within the historical period before 1985, the same pattern emerges with the exception that the variance in >60 is larger than that in 51–60.

Panel c of Fig. 3 demonstrates that among those aged 21–50, the random effects on BMI are much smaller among those engaged in heavy physical activity than among those not engaged in heavy physical activity. Such an effect is absent among those older than 50.

While Fig. 3 visually describes differences in random genomic effects on BMI between pairs of subsamples, Tables 2–4 present the results of tests of whether the differences are statistically significant. Tables 2–4 report test statistics and *P* values from the Pitman test, the permutation Pitman test, the *F* test, and the permutation *F* test. Each tests the null hypothesis that the variances of two subgroups are equal against the alternative hypothesis that the variances of the two subgroups are different.
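To make the logic of a permutation test for equal variances concrete, the following is a minimal sketch of a generic permutation variance-ratio test. It is an illustration under stated assumptions rather than the procedure used to produce Tables 2–4: the function names are hypothetical, the two samples are simulated rather than actual BLUPs, each group is centered before permuting, and the Pitman test is not shown.

```python
import random
import statistics


def variance_ratio(a, b):
    """Two-sided statistic: the larger sample variance over the smaller."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


def permutation_f_test(a, b, n_perm=1000, seed=0):
    """Return (observed ratio, permutation p-value) for H0: equal variances."""
    rng = random.Random(seed)
    # Center each group so that a difference in means is not mistaken
    # for a difference in spread.
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    a = [x - mean_a for x in a]
    b = [x - mean_b for x in b]
    observed = variance_ratio(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        if variance_ratio(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # The add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Simulated stand-ins for two sets of random effects (not real data).
sim = random.Random(1)
wide = [sim.gauss(0.0, 3.0) for _ in range(200)]
narrow = [sim.gauss(0.0, 1.0) for _ in range(200)]
ratio, p_value = permutation_f_test(wide, narrow)
print(f"variance ratio = {ratio:.2f}, permutation p = {p_value:.4f}")
```

The permutation version replaces the reference *F* distribution with the empirical distribution of the statistic under reshuffled group labels, so it does not rely on the random effects being normally distributed.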

Overall, the test results show that our three hypotheses are mostly supported. The random genomic effects of the SNPs (BLUPs) are significantly different between the two historical periods (before and after 1985) within each age group (Table 2); mostly significantly different between age groups within a historical period (Table 3); and substantially and significantly different between the physically inactive and the physically active within the age group of 21–50 (Table 4).

Figure 3 and Tables 2–4 together show three findings. First, the random genomic effects of the SNPs (BLUPs) in the period after 1985 are significantly larger than those before 1985 within each age group. Second, the random genomic effects of the SNPs (BLUPs) are generally larger for a younger age group than for an older age group within a historical period. Third, in the age group of 21–50, the random genomic effects on BMI among those engaged in heavy physical activity are significantly and dramatically smaller than those among individuals not engaged in heavy physical activity; this effect is not found in the age group of >50.

We consider a comparison significant only if it is significant by all four tests. The *F* test and the permutation *F* test suggest that the difference in genomic effects between age group 51–60 and age group >60 before 1985 is not significant. Panel b in Fig. 3 shows that the genomic effects in the group of >60 are larger than those in the group of 51–60, which is the opposite of what we had predicted. The two permutation tests show that the difference between the two age groups of 21–40 and 41–50 is barely significant. The two *F* tests show that the random genomic effects do not depend on level of physical activity for those older than 50. This test finding is consistent with the visual result in panel c in Fig. 3, in which the two distributions of random genomic effects overlap almost completely.
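The all-four-tests rule is conservative by construction, and a short sketch makes that explicit. The significance level of 0.05, the function name, and the *P* values below are assumptions chosen for illustration; none of them is taken from Tables 2–4.

```python
ALPHA = 0.05  # assumed threshold; not stated in this section


def significant_by_all(p_values, alpha=ALPHA):
    """A comparison counts as significant only if every test rejects H0."""
    return all(p < alpha for p in p_values.values())


# Hypothetical P values for one comparison (illustrative only).
example = {
    "pitman": 0.001,
    "permutation_pitman": 0.004,
    "f": 0.080,
    "permutation_f": 0.110,
}
print(significant_by_all(example))
```

Here the two Pitman tests reject the null hypothesis but the two *F* tests do not, so the comparison is not counted as significant under the rule.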

## Discussion and Conclusion

Guided by an evolutionary theory of obesity, this study investigates how the human genome as a whole interacts with environment to influence BMI, using data from the Framingham Heart Study. This analysis provides empirical support for three hypotheses concerning genome-environment interactive effects on obesity. The empirical support is first suggested by heritability analysis and confirmed by formal hypothesis testing using random genomic effects from GCTA analysis. We consider a comparison significantly different only if it is significantly different by all four tests. Consistent with theoretical reasoning, *F* tests tend to be more stringent than Pitman tests, and permutation tests tend to be more stringent than nonpermutation tests. Permutation *F* tests are the most stringent of the four tests.

The first is a genome-period interaction on BMI. The genomic influence on BMI is substantially larger in the current obesity epidemic after the mid-1980s than in the few decades before the mid-1980s within each age group of 21–40, 41–50, 51–60, and >60. Second, this investigation shows a genomic influence on BMI that weakens as one ages across the life course or as reproduction becomes less important over the life course. This result holds within each of the two historical periods. Third, within the age group of 21–50, the genomic influence on BMI among physically active individuals is substantially smaller than the influence among those who are not physically active. Such an influence is absent among those older than 50.

Our empirical evidence beyond reproductive ages is much weaker than that during reproductive ages. The period effect in the >60 group is the smallest (Fig. 3, panel a). Before 1985, the age effect in the age group of >60 is larger than that in the age group of 51–60; this is the reverse of what is expected (Fig. 3, panel b). After 1985, the age effects in the age groups of 51–60 and >60 are very close in size, but these effects at older ages are much smaller than those at the younger ages of 21–50 (Fig. 3, panel b). The interactive effect of physical activity is not observed in the age group of >50 (Fig. 3, panel c).

The relationship between genomic influence and BMI at post-reproductive ages may be substantially different from that at reproductive ages. The evolutionary pressure in the development of the thrifty genotype is most likely to have occurred during reproductive ages. Beyond reproductive ages, the function of the thrifty genotype might not have been selected, or might not have been selected as consistently.

Empirically, body mass among older individuals develops differently from that among younger individuals, and both males and females start losing lean body mass beginning at about age 50 (Kyle et al. 2003). He and Meng (2008) reported that individuals 70 and older in the United States are prone to weight loss rather than weight gain and that males 70 and older who are engaged in physical activity actually experience less weight loss. In addition, BMI is only an approximation for excessive adiposity. A difference in BMI may not always indicate a proportional difference in body fat, especially among the elderly population. Older individuals could maintain a constant BMI while simultaneously losing lean body mass and gaining a greater proportion of adiposity.

One limitation of the current study is that in order to amass reasonably sized samples, we had to group those older than 60 in the genome-age analysis and group those older than 50 in the genome-activity analysis. Future studies with sufficiently large samples should investigate those in their 50s, those in their 60s, and those 70 and older separately.

The findings of this analysis are genome-wide. The focus on the overall genomic influence in the mixed-model framework, rather than on individual genetic loci, can be a feasible alternative to fixed-effect GWAS. Investigating whether and how much, for example, physical activity reduces the effects aggregated over the entire panel of GWAS data on obesity will likely yield insights beyond those obtained from investigating whether and how much physical activity reduces the effect of a single or a few genetic variants.

The GxE interaction effect from the physical activity analysis or the period analysis can be quite large. GWAS main-effect studies show that on average, an *FTO* gene allele makes a difference of 1.2 kg in body weight (e.g., Frayling et al. 2007). Activity-*FTO* interaction studies suggest that physical activity attenuates the effect of *FTO* by 30 %, which amounts to approximately 0.40 kg. This 0.40 kg is the gene-activity interaction effect based on a single gene. Our estimated gene-activity interactive effects are based on a collection of numerous genes throughout the human genome. The finding suggests that a large proportion of genome-wide susceptibility for obesity could be attenuated by physical activity. The genome-wide interaction effect with physical activity is therefore likely to be many times larger than 0.40 kg.
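The single-gene figure quoted above follows from simple arithmetic, taking the 30 % attenuation and the 1.2 kg main effect at face value:

$$0.30 \times 1.2\ \text{kg} = 0.36\ \text{kg} \approx 0.4\ \text{kg}$$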

The large period effects found in our analysis may help isolate the exact culprits of the current obesity epidemic. These period effects suggest that changes over the past three decades in the United States have induced the human genome to have a larger impact on BMI. Food and exercise are the two most likely candidates. For most of human history until very recently, food was scarce, and the level of physical activity was high (e.g., Bellisari 2008; Swinburn et al. 2011). Health disparity is also considered a source of the current obesity epidemic (Braveman 2009). Does the timing of food abundance, sedentary lifestyle, and/or health disparity correspond to the recent increase in genomic influence?

A small number of other factors have been considered. An intriguing line of research points to environmental endocrine-disrupting chemicals as a possible source for the development of obesity (Casals-Casas et al. 2008; Newbold et al. 2007; Wells 2006). Low-grade systemic inflammation has been considered a factor for obesity even though individuals with excessive adiposity do not typically have overt infection (Visser et al. 1999, 2001; Wisse 2004). Our findings suggest looking for endocrine-disrupting chemicals and/or increased low-grade inflammation that appeared in the environment about the same time the obesity epidemic began; these may have altered the genomic susceptibility for obesity.

This study is based on the premise that GWAS data contain substantially more information than biometrical data, which have information only on genetic relatedness among family members without DNA measurement. So far, GCTA analysis has used GWAS data only to estimate heritability, which is what biometrical data have been used for. In this analysis, in addition to estimating heritability, we took advantage of the estimated random effects from GCTA analysis, tapping information from the extremely large number of genetic markers. We also developed procedures for hypothesis testing with regard to the random effects. Our analysis shows that random effects provide much more statistical power than heritability for a gene-environment interaction analysis. Our approach may be applied to a wide range of human outcomes beyond obesity in studies that assess how the effects of the human genome as a whole are moderated by environmental factors.

## Acknowledgments

The Framingham Heart Study (FHS) is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University (Contract No. N01-HC-25195). This manuscript was not prepared in collaboration with investigators of the FHS and does not necessarily reflect the opinions or views of FHS, Boston University, or NHLBI. Funding for sharing genotyping was provided by NHLBI Contract N02-HL-64278. This manuscript is based on data from FHS. Our group is independent of any commercial funder, and the principal investigator had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. None of the authors have potential conflicts of interest, including relevant financial interests, activities, relationships, and affiliations.