Smoking-Associated DNA Methylation Biomarkers and Their Predictive Value for All-Cause and Cardiovascular Mortality

Background With epigenome-wide mapping of DNA methylation, a number of novel smoking-associated loci have been identified. Objectives We aimed to assess dose–response relationships of methylation at the top hits from the epigenome-wide methylation studies with smoking exposure as well as with total and cause-specific mortality. Methods In a population-based prospective cohort study in Germany, methylation was quantified in baseline blood DNA of 1,000 older adults by the Illumina 450K assay. Deaths were recorded during a median follow-up of 10.3 years. Dose–response relationships of smoking exposure with methylation at nine CpGs were modeled by restricted cubic spline regression. Associations of individual and aggregate methylation patterns with all-cause, cardiovascular, and cancer mortality were assessed by multiple Cox regression. Results Clear dose–response relationships with respect to current and lifetime smoking intensity were consistently observed for methylation at six of the nine CpGs. Seven of the nine CpGs were also associated with mortality outcomes to various extents. A methylation score based on the top two CpGs (cg05575921 and cg06126421) showed the strongest associations with all-cause, cardiovascular, and cancer mortality, with adjusted hazard ratios (95% CI) of 3.59 (2.10, 6.16), 7.41 (2.81, 19.54), and 2.48 (1.01, 6.08), respectively, for participants with methylation levels in the lowest quartile at both CpGs. Adding methylation at those two CpGs into a model that included the variables of the Systematic Coronary Risk Evaluation chart for fatal cardiovascular risk prediction improved the predictive discrimination. Conclusion The novel methylation biomarkers are highly informative for both smoking exposure and smoking-related mortality outcomes. In particular, these biomarkers may substantially improve cardiovascular risk prediction. 
Nevertheless, the findings of the present study need to be further validated in additional large longitudinal studies.

Citation Zhang Y, Schöttker B, Florath I, Stock C, Butterbach K, Holleczek B, Mons U, Brenner H. 2016. Smoking-associated DNA methylation biomarkers and their predictive value for all-cause and cardiovascular mortality. Environ Health Perspect 124:67–74; http://dx.doi.org/10.1289/ehp.1409020


Introduction
Tobacco smoking has been recognized as a risk factor for a variety of complex diseases (CDC 2014), including cardiovascular diseases (CVDs) (Ezzati et al. 2005b), at least 15 types of cancer (Ezzati et al. 2005a), and pulmonary diseases (Decramer et al. 2012). Nevertheless, accurate prediction of smoking-attributable health risk is still hampered by various factors (CDC 2010). In particular, it is well known that self-reported smoking exposure suffers from recall bias or intentional underreporting (Connor Gorber et al. 2009; Rebagliato 2002). Even though a number of biomarkers are well established, such as breath carbon monoxide (CO) and cotinine levels, they exclusively reflect short-term smoking exposure and are of limited use for quantifying cumulative exposure and consequently for predicting smoking-related risk (CDC 2010). DNA or protein adducts are considered integrative biomarkers that reflect the internal effective dose of smoking, but they may be useful only for carcinogenic risk assessment (CDC 2010; Lodovici and Bigagli 2009). In cardiovascular risk assessment, although several biomarkers have been described and used, no biomarker has yet been identified for specifically predicting smoking-related risk (CDC 2010).
Recent advances in genome-wide methylation profiling have opened new avenues in the search for biomarkers reflecting both current and lifetime smoking exposure that might have the potential to enhance prediction of smoking-related risks. Recently, a number of novel smoking-associated blood DNA methylation biomarkers were identified by using the Illumina Infinium HumanMethylation450K BeadChip (Joubert et al. 2012; Shenker et al. 2013a; Zeilinger et al. 2013), among which seven loci located in four intragenic or intergenic regions [F2RL3 (cg03636183), AHRR (cg21161138 and cg05575921), 2q37.1 (cg21566642, cg01940273, and cg05951221), and 6p21.33 (cg06126421)] were the top seven CpGs reported by both epigenome-wide studies conducted in adults (Shenker et al. 2013a; Zeilinger et al. 2013). To further explore the use of methylation levels of these regions for quantifying biologically effective smoking exposure and for enhancing risk prediction of smoking-related disease, we carried out comprehensive analyses on the associations of methylation at nine CpGs with both current and lifetime smoking exposure as well as mortality in a population-based cohort of older adults. The nine CpGs were the top seven CpGs listed above and two other CpGs in those regions reported to be associated with smoking [AHRR (cg23576855) and 2q37.1 (cg06644428)] (Shenker et al. 2013a; Zeilinger et al. 2013). In addition, we aimed to evaluate whether these methylation biomarkers can improve the fatal cardiovascular risk prediction estimated by the Systematic Coronary Risk Evaluation (SCORE) chart of the European Society of Cardiology (Conroy et al. 2003).

Study design and data collection.
The study subjects were selected from the ESTHER study, a statewide population-based cohort study conducted in southwest Germany (Schöttker et al. 2013a). Briefly, 9,949 older adults (50-75 years of age) were enrolled by their general practitioners during a routine health check-up between July 2000 and December 2002 and have been followed up since then. The distribution of sociodemographic factors and major risk factors in the cohort was similar to the distribution seen in representative surveys of the population in Germany in the corresponding age range (Löw et al. 2004). A genome-wide methylation screen was performed in baseline blood samples of 1,000 participants who were recruited between July and October 2000 (i.e., those with the longest follow-up time) and included in the current analysis. The study was approved by the ethics committees of the University of Heidelberg and of the state medical board of Saarland, Germany. Written informed consent was obtained from all participants.
Participants' sociodemographic characteristics, lifestyle factors, health status, and history of major diseases at baseline were obtained by a standardized self-administered questionnaire. Detailed information on lifetime active smoking was also ascertained from the self-administered questionnaire, including age at initiation of smoking and intensity of smoking at various ages, as well as age at smoking cessation for former smokers. Additional information on height, weight, blood pressure, and prevalent diseases (e.g., diabetes, hypertension, CVD) was extracted from a standardized form completed by the general practitioners during the health check-ups. Prevalent CVD at baseline was defined by either physician-reported coronary heart disease or a self-reported history of myocardial infarction, stroke, pulmonary embolism, or revascularization of coronary arteries. Prevalent cancer [International Classification of Diseases, 10th Revision (ICD-10) codes C00-C99 except nonmelanoma skin cancer (code C44)] was determined by self-report or record linkage with data from the Saarland Cancer Registry [http://www.krebsregister.saarland.de/ziele/ziel1.html (in German)]. Blood samples (21 mL from each participant) were collected during the health check-up, aliquoted, and stored at -80°C until further processing. Total cholesterol level was measured in serum by standard high-performance liquid chromatography methods (Schöttker et al. 2013b). Deaths during follow-up (between 2000 and the end of 2011) were identified by record linkage with population registries in Saarland; the few participants who moved out of Saarland were censored at the date they were last known to be alive. Information about the major cause of death was obtained from death certificates provided by the local public health offices and was coded with ICD-10 codes. Cardiovascular and cancer deaths were defined by ICD-10 codes I00-I99 and C00-C99, respectively; nonmelanoma skin cancer (ICD-10 code C44) was excluded.
Methylation assessment. DNA was extracted from whole blood samples collected at baseline by a salting-out procedure (Miller et al. 1988) and was allocated to 96-well plates. Three random duplicate samples were placed on each plate as quality controls. The Infinium HumanMethylation450K BeadChip Assay (Illumina Inc., San Diego, CA, USA) was used to quantify DNA methylation at 485,577 CpG sites. Briefly, a sample of 1.5 μg genomic DNA was bisulfite converted, and 200 ng bisulfite-treated DNA was applied to the 450K BeadChips. The samples were analyzed following the manufacturer's instructions at the Genomics and Proteomics Core Facility of the German Cancer Research Center. GenomeStudio® (version 2011.1; Illumina Inc.) was used to extract DNA methylation signals from the scanned arrays (module version 1.9.0; Illumina Inc.) and to calculate methylation intensity (β-value) as the ratio of the methylated signal to the sum of the methylated and unmethylated signals at each CpG according to the manufacturer's guide, without additional background correction. Data were normalized to internal controls provided by Illumina (Illumina normalization). Methylation intensities at the nine CpGs were extracted from the 450K data.
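The β-value described above can be illustrated with a minimal Python sketch. It is not the GenomeStudio implementation; the function name, the optional `offset` parameter, and the signal values are hypothetical and serve only to show the ratio of methylated signal to total signal.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Return methylated / (methylated + unmethylated + offset).

    `offset` is a hypothetical stabilizing constant for the denominator;
    the default of 0 gives the plain ratio described in the text.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("total signal must be positive")
    return methylated / total


# Made-up signal intensities for one CpG in one sample.
print(beta_value(3000.0, 1000.0))
```

A β-value near 0 indicates a largely unmethylated site and a value near 1 a largely methylated site, so the lower methylation reported for smokers corresponds to lower β-values.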
Statistical analysis. Median methylation intensities at the nine CpGs were determined for strata of sociodemographic characteristics, lifestyle factors, and prevalent diseases; differences in methylation intensities between strata were examined by Kruskal-Wallis tests. Correlations between methylation intensities at the nine CpGs were assessed by Spearman rank correlation coefficients. The associations between smoking indicators (including smoking status, current intensity of smoking, cumulative dose of smoking, and time since cessation of smoking) and methylation intensity at the nine CpGs were assessed by linear regression models, controlling for batch effect, age (years), sex, body mass index (BMI; < 25, 25.0 to < 30.0, ≥ 30.0 kg/m²), physical activity (inactive, insufficient, sufficient), and prevalence of CVD (ICD-10 codes I20-I16, I60-I69), diabetes (ICD-10 codes E10-E14), and cancer (ICD-10 codes C00-C99 except C44) at baseline. Dose-response relationships of current and lifetime smoking intensity and of time since smoking cessation with methylation intensity were assessed using restricted cubic spline (RCS) regression (Desquilbet and Mariotti 2010), controlling for the aforementioned confounders.
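To make the restricted cubic spline idea concrete, the sketch below builds the spline terms for a single exposure value using the commonly cited truncated-power construction, in which the curve is constrained to be linear beyond the outer knots. It is an illustration only and not the SAS macro used in the study; the function name and the knot locations are assumptions.

```python
def rcs_basis(x, knots):
    """Return [x, s_1, ..., s_(k-2)], the restricted cubic spline terms for x.

    Uses the truncated-power construction with k >= 3 knots; terms are
    scaled by (last knot - first knot)**2 to keep them on the scale of x.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least three knots")

    def cube_plus(u):
        # Truncated cubic: u**3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2
    terms = [float(x)]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / gap
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / gap)
        terms.append(term / scale)
    return terms


# Hypothetical knots at 5, 20, and 40 pack-years.
for pack_years in (0, 10, 30, 50):
    print(pack_years, rcs_basis(pack_years, [5, 20, 40]))
```

The resulting terms would enter the regression model as ordinary covariates alongside the confounders; with three knots, each exposure contributes one linear and one nonlinear term.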
The associations of methylation intensities at each of the nine CpGs with all-cause mortality were first examined by Kaplan-Meier plots and log-rank tests. Then Cox regression models were fit adjusting for age (years), sex, and batch effect (model 1). Further models were additionally adjusted for smoking status (never, former, current smoker) (model 2) and for systolic blood pressure (millimeters of mercury), total cholesterol level (milligrams per deciliter), BMI (< 25, 25.0 to < 30.0, ≥ 30.0 kg/m²), physical activity (inactive, insufficient, sufficient), and prevalence of CVD (ICD-10 codes I20-I16, I60-I69), diabetes (ICD-10 codes E10-E14), and cancer (ICD-10 codes C00-C99 except C44) at baseline (model 3). Methylation intensity was entered into the models either as a categorical variable (using the highest quartile as the reference level) or as a continuous variable [calculating hazard ratios (HRs) for a decrease in methylation intensity by one standard deviation]. In parallel, the associations between smoking at baseline and all-cause mortality were also estimated by Cox regression, with and without controlling for methylation intensities, to explore the role of DNA methylation in smoking-related mortality. The proportional hazards assumption was assessed by martingale-based residuals (Lin et al. 1993). These preliminary analyses showed methylation at two of the nine CpGs (cg05575921, cg06126421) to be most strongly associated with all-cause mortality, whereas much weaker or nonsignificant associations were observed for the other seven CpGs. Additional preliminary analyses were conducted by an L1-penalized Cox model (Benner et al. 2010; Goeman 2010) with the nine CpGs and other risk factors as covariates; in that model, only cg05575921 and cg06126421 were selected among the nine CpGs. 
We therefore carried out analyses on all-cause and cause-specific mortality, including CVD, cancer, and other mortality, using a methylation-based score developed according to these two CpGs. Categories of the score were 2, 1, and 0 for participants in the lowest quartile of methylation at both CpGs, at one of the two CpGs, and at neither of the two CpGs, respectively. In addition, the analyses were repeated after joint classification of participants according to both methylation score and sex.
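The scoring rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the quartile cut-off uses the "inclusive" method of `statistics.quantiles`, values equal to the cut-off are counted as being in the lowest quartile, and the β-values are made up.

```python
from statistics import quantiles


def lowest_quartile_cutoff(values):
    """First-quartile cut-off of a list of beta-values.

    The interpolation method ("inclusive") is an assumption; the text
    does not state how quartile boundaries were computed.
    """
    return quantiles(values, n=4, method="inclusive")[0]


def methylation_score(beta_cg05575921, beta_cg06126421, cutoff_a, cutoff_b):
    """Score 2, 1, or 0: number of the two CpGs in the lowest quartile.

    Treating values equal to the cut-off as "lowest quartile" is an assumption.
    """
    return int(beta_cg05575921 <= cutoff_a) + int(beta_cg06126421 <= cutoff_b)


# Made-up beta-values for five participants at each CpG.
cg05575921 = [0.60, 0.70, 0.80, 0.85, 0.90]
cg06126421 = [0.50, 0.55, 0.65, 0.70, 0.75]
cut_a = lowest_quartile_cutoff(cg05575921)
cut_b = lowest_quartile_cutoff(cg06126421)
scores = [methylation_score(a, b, cut_a, cut_b)
          for a, b in zip(cg05575921, cg06126421)]
print(scores)
```

Because the score counts low-methylation sites, higher scores correspond to stronger smoking-associated hypomethylation.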
To further assess the potential contributions of the smoking-associated CpGs to fatal cardiovascular risk prediction, methylation intensity at the nine CpGs was added, individually and jointly, to a Cox regression model consisting of the variables of the SCORE (Conroy et al. 2003): age (years), sex, systolic blood pressure (millimeters of mercury), current smoking (yes, no), and total cholesterol (milligrams per deciliter). Cardiovascular mortality was the dependent variable, and the models additionally controlled for batch effect. Model fit was compared using the Akaike information criterion (AIC) and likelihood ratio (LR) tests. Discrimination of the models was evaluated by Harrell's C-statistics (Harrell et al. 1996), and overoptimism was corrected using .632 bootstrap analysis with 1,000 replications [for this purpose, a SAS macro was adapted from Miao's work (Miao et al. 2013)]. Bootstrapping is a well-established approach for validating a predictive model by quantifying the degradation in predictive accuracy expected when the model is applied to other data sources; this degradation is known as overoptimism. The improvement in model performance by adding methylation intensity was examined by both net reclassification improvement (NRI) and integrated discrimination improvement (IDI). The NRI assesses whether participants are more appropriately classified into clinically relevant risk categories when a new factor (e.g., a methylation marker) is added to the risk prediction model (e.g., the SCORE model). Absolute risk predictions were first calculated for each individual by Cox regression models with and without the methylation marker, followed by assigning risk categories according to the recommended 10-year risk categories: 0-5%, > 5-10%, > 10-20%, and > 20% of predicted probability for a cardiovascular event (Cook 2007; Pencina et al. 2008). 
Movements are considered separately for cases (deaths) and controls (survivors) and are deemed to be in the correct direction if cases move into a higher risk category and controls move into a lower risk category: NRI = [(no. of cases up - no. of cases down)/no. of cases] - [(no. of controls up - no. of controls down)/no. of controls]. The IDI estimates the mean difference in predicted probability between cases and controls, over all possible cut-off points, compared between models with and without the methylation marker (Cook 2010; Pencina et al. 2008). Calibration of all assessed models was examined by May-Hosmer's simplification of the Gronnesby-Borgan test (May and Hosmer 2004). The study population was divided into five subgroups according to the quintiles of the ranks based on their estimated risk probability, and model calibration was deemed satisfactory if p-values were > 0.05 for comparison of the observed and expected cases in each subgroup. Potential multicollinearity when simultaneously adding both CpGs to the model was assessed by variance inflation factor (VIF) and tolerance values, which did not indicate any relevant multicollinearity (e.g., VIF = 1.46 and tolerance = 0.69 when adding cg05575921 and cg06126421). Sensitivity analyses were carried out by excluding participants with prevalent CVD at baseline (n = 216).
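The category-based NRI defined above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names and the predicted risks are hypothetical, and the category boundaries follow the 10-year risk categories given in the text, with upper bounds treated as inclusive.

```python
from bisect import bisect_left

# Upper bounds of the first three risk categories: 0-5%, >5-10%, >10-20%, >20%.
RISK_CUTS = [0.05, 0.10, 0.20]


def risk_category(p):
    """Map a predicted 10-year risk to category 0, 1, 2, or 3."""
    return bisect_left(RISK_CUTS, p)


def net_reclassification_improvement(old_risk, new_risk, died):
    """Category-based NRI from predicted risks without/with the new marker."""
    up_cases = down_cases = up_controls = down_controls = 0
    n_cases = sum(1 for d in died if d)
    n_controls = len(died) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    for p_old, p_new, d in zip(old_risk, new_risk, died):
        move = risk_category(p_new) - risk_category(p_old)
        if d:
            up_cases += move > 0
            down_cases += move < 0
        else:
            up_controls += move > 0
            down_controls += move < 0
    return ((up_cases - down_cases) / n_cases
            - (up_controls - down_controls) / n_controls)


# Made-up predicted risks for four participants (two deaths, two survivors).
old = [0.04, 0.08, 0.15, 0.12]
new = [0.07, 0.08, 0.08, 0.25]
died = [True, True, False, False]
print(net_reclassification_improvement(old, new, died))
```

In this toy example one of the two cases moves up a category, while one control moves down and one moves up, so only the case component contributes to the NRI.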
The penalized Cox regression analyses were conducted using the R package "penalized" (version 0.9-42; Goeman et al. 2014), and all other analyses were carried out in SAS 9.3 (SAS Institute Inc., Cary, NC, USA).

Results
Of the 1,000 participants included in the present analysis, mortality follow-up was available for 999. Of the nine CpG sites assayed, cg21566642, cg23576855, and cg21161138 had 3, 1, and 1 missing values, respectively; all other CpGs had complete data. Characteristics of the study population at baseline are shown in Table 1. Equal numbers of men and women of German nationality were included. The mean age was 62 years, and 33.9% of participants were younger than 60 years. More than half of the participants had ever smoked, and 19% still smoked at the time of recruitment, among whom male (61.3%) and younger (< 60 years, 45.2%) participants were somewhat overrepresented. During a median follow-up time of 10.3 years, 143 participants died. Among 135 participants with death certificates (94.4%), 50 died from CVD, 49 died from cancer, and 36 died from other diseases.
Methylation intensities by demographic and behavioral factors. Methylation intensities across various strata of characteristics of the study population are shown in Table 1 for AHRR cg05575921 and 6p21.33 cg06126421 (see Supplemental Material, Table S1, for all other CpGs). Men had lower methylation intensities than women at all nine CpG sites (all p < 0.0001). Methylation was not significantly associated with age (p > 0.05) except at 2q37.1 cg06644428 (p < 0.0001). Major differences were observed between never, former, and current smokers. Methylation levels at all nine CpGs were lower in current smokers than in never smokers and intermediate in former smokers, and all of the differences across the three groups were statistically significant (p < 0.0001).
Correlations of methylation intensities at the nine CpGs. Mutual Spearman correlation coefficients for methylation intensities at all CpGs except cg06644428 were 0.46-0.93; Spearman correlation coefficients between cg06644428 and other CpGs were 0.18-0.66 (see Supplemental Material, Table S2).
Methylation intensities by smoking characteristics. Table 2 shows the association between smoking behavior and methylation intensities at cg05575921 and cg06126421 estimated by linear regression (results for the other seven CpGs, which showed very similar patterns, are presented in Supplemental Material, Table S3). Relative to never smokers, current smokers had the lowest methylation levels at both CpGs and former smokers had intermediate levels. Methylation intensities were inversely associated with both current and lifetime smoking intensity and were positively associated with time since cessation. Estimated dose-response curves for smoking behavior with methylation intensity at the two CpGs are shown in Figure 1. A steep decrease in methylation intensity was observed with increasing [...] Figure S1)].

Methylation intensities and mortality. Supplemental Material, Figure S2 depicts the survival experience according to quartiles of methylation intensity at the nine CpGs: a gradient of lower survival among participants with lower methylation levels was observed for seven of the nine CpGs (all except cg23576855 and cg06644428). The associations of methylation intensity at the individual CpGs with all-cause mortality are further presented in Supplemental Material, Table S4. After multivariate adjustment, the strongest and statistically significant associations were estimated for two CpGs (cg05575921 and cg06126421), with HR = 2.45 [95% confidence interval (CI): 1.26, 4.79] and HR = 2.34 (95% CI: 1.27, 4.30), respectively, for the lowest quartile compared with the highest quartile. 
In addition, a 1-SD decrease in methylation intensity was associated with higher all-cause mortality for seven CpGs (HR 1.15-1.59, with p < 0.05 for five CpGs); HRs for cg23576855 and cg06644428 were 0.97 and 1.00, respectively. Table 3 shows the associations of the methylation score with all-cause and cause-specific mortality. Multivariate-adjusted HRs for cardiovascular, cancer, and other mortality were 7.41 (95% CI: 2.81, 19.54), 2.48 (95% CI: 1.01, 6.08), and 2.78 (95% CI: 0.97, 7.98), respectively, for participants in the lowest quartile of methylation for both cg05575921 and cg06126421 compared with participants who were not in the lowest quartile of methylation for either CpG. By contrast, the strong associations between current smoking and all mortality outcomes were substantially attenuated or disappeared after adjustment for the methylation-based score. Joint classification by sex and methylation demonstrated clear dose-response relationships of the methylation score with mortality in both sexes (see Supplemental Material, Table S5).

Table 4 and Supplemental Material, Table S6 present the increment in the performance indicators of the SCORE in prediction of fatal CVD by adding methylation intensity. The largest improvement was observed when including cg05575921 and cg06126421: Harrell's C-statistics increased from 0.754 for the SCORE-only model to 0.822, and from 0.736 to 0.779 after correction for overoptimism (Table 4). Adding the two CpGs also resulted in 18 cases and 82 controls moving up and 11 cases and 151 controls moving down, which resulted in an NRI of 21.92% (p = 0.049) and a significant IDI of 3.73% (p = 0.005). Additionally adding methylation at other CpGs did not lead to a further improvement in the prediction of fatal CVD (see Supplemental Material, Table S6). 
Even though NRI and IDI increased when additional CpGs were included in the model, a substantial proportion of controls, who were supposed to move to lower risk categories, moved to higher risk categories together with the cases. The improvement in risk prediction became larger after excluding participants with CVD at baseline (n = 216; see Supplemental Material, Table S7). The Gronnesby-Borgan test indicated that the new model was also well calibrated in both full and sensitivity analyses (all p > 0.05).

Discussion
In this population-based cohort study, we found clear dose-response relationships of current and lifetime smoking exposure and time since smoking cessation with site-specific methylation, which were consistent among six CpGs located in AHRR (cg05575921, cg21161138), F2RL3 (cg03636183), 2q37.1 (cg21566642, cg01940273), and 6p21.33 (cg06126421). Methylation at seven CpGs (all above + cg05951221) was also associated with mortality outcomes to various extents. A score based on methylation at the top two CpGs (cg05575921 and cg06126421) showed very strong associations with all-cause, cardiovascular, and cancer mortality. Moreover, adding methylation at these two CpGs to the conventional risk factors substantially improved the accuracy of predicting fatal cardiovascular risk and reclassified a substantial proportion of individuals to higher or lower risk categories.
A biomarker reflecting long-term past smoking exposure is desirable for accurate evaluation of smoking cessation and for assessment of smoking-related disease risk (CDC 2010). DNA methylation biomarkers might be promising candidates for this purpose. Methylation at the nine loci targeted in our study was reported to be strongly associated with smoking exposure by both previous genome-wide methylation studies (Shenker et al. 2013a; Zeilinger et al. 2013). In the present study, distinct and rather consistent dose-response patterns of methylation with respect to both lifetime cumulative smoking exposure and time since cessation were observed for six of the nine CpGs; of note, these patterns are similar to the dose-response patterns observed between smoking and smoking-related diseases. For example, cardiovascular risk increases sharply at low levels of cigarette consumption and then plateaus at higher levels of smoking (CDC 2010); the reduction of cardiovascular risk becomes evident within the initial years after quitting smoking, although risk remains slightly elevated for more than a decade (CDC 2010; Kramer et al. 2006; Lightwood and Glantz 1997). The observed dose-response pattern of these six CpGs with current and lifetime smoking behavior was also consistent with dose-response patterns of methylation at the F2RL3 gene previously identified by our group in a large study specifically focusing on this site (Zhang et al. 2014).

Our present study, in which we addressed associations of methylation patterns with both smoking and smoking-related mortality, suggested that the identified DNA methylation biomarkers might be markers of cumulative smoking exposure-associated risk. The AHRR gene, known as a tumor suppressor (Zudaire et al. 2008), encodes a protein involved in multiple pathophysiological pathways, such as metabolism of tobacco smoke components (Kasai et al. 2006; Moennikes et al. 2004) and regulation of cell proliferation and differentiation (Haarmann-Stemmann et al. 2007; Pot 2012). 
Hypomethylation of cg05575921 at AHRR has been reported to be associated with increasing lymphoblast AHRR gene expression in vivo (Monick et al. 2012). It has also been observed that AHRR expression in human lung tissues was inversely correlated with methylation levels of cg23576855 and cg21161138 at AHRR, with 5.7-fold increased expression in five current smokers compared with five nonsmokers (Shenker et al. 2013a). AHRR and the aryl hydrocarbon receptor (AHR) constitute a feedback loop in which the AHR heterodimer activates the expression of the AHRR gene, and the expressed AHRR inhibits the function of AHR in oncogenesis (Mimura et al. 1999). Tobacco smoking has been shown to trigger the production of AHR that mediates dioxin toxicity and other pathological effects (Martey et al. 2005; Meek and Finch 1999). Therefore, it is plausible to assume that demethylation/overexpression of the AHRR gene may result from a smoking-induced increase in AHR activation. The gene product of F2RL3, thrombin protease-activated receptor-4 (PAR-4), plays roles in inflammatory reactions and blood coagulation (Leger et al. 2006) and in other pathophysiology commonly described in smoking-induced conditions (Leone 2007; Rahman and Laher 2007). Hypomethylation at F2RL3 has been suggested to be strongly associated with mortality in a cohort of 1,206 patients with stable CVD (Breitling et al. 2012). Interestingly, methylation at four CpGs assessed in our study [AHRR (cg05575921), F2RL3 (cg03636183), 2q37.1 (cg21566642), and 6p21.33 (cg06126421)] was recently found to be associated with a metabolic indicator of complex disorders, 4-vinylphenol sulfate (Petersen et al. 2014). Of note, this metabolic marker has also been reported to be associated with smoking (Manini et al. 2003). Although the potential joint or independent epigenetic role of the various loci remains to be clarified, these findings, as well as the disappearance or attenuation of the association between smoking and mortality outcomes after adjustment for methylation at these CpGs in the present study, suggest that multiple DNA methylation sites are involved in mediating smoking-related adverse effects.

Figure 1. Dose-response relationships between smoking behavior and methylation intensity (results from restricted cubic spline regression adjusted for potential confounding factors). CL, confidence limit. (A) Dose-response relationship between current intensity of smoking and methylation intensity at AHRR (cg05575921; left) and 6p21.33 (cg06126421; right); never and former smokers were defined as reference, with current smoking intensity = 0. (B) Dose-response relationship between cumulative dose of smoking and methylation intensity at AHRR (cg05575921; left) and 6p21.33 (cg06126421; right); never smokers were defined as reference, with pack-years = 0. (C) Dose-response relationship between time since cessation of smoking and methylation intensity at AHRR (cg05575921; left) and 6p21.33 (cg06126421; right) among former smokers; current smokers were defined as reference, with time since cessation = 0.
Table footnotes. Abbreviations: HR, hazard ratio; IR, incidence rate; PY, person-years. (a) Score was based on methylation intensity at cg05575921 and cg06126421, defined as follows: 2, methylation intensity in the lowest quartile at both CpG sites; 1, methylation intensity in the lowest quartile at one of the two CpG sites; 0, other. (b) Incidence rate per 100 person-years. (c) Model 1: adjusted for age, sex, and batch effect. (d) Model 2: model 1 plus adjustment for smoking status and methylation score. (e) Model 3: model 2 plus adjustment for BMI, physical activity, systolic blood pressure, total cholesterol, hypertension, and prevalent CVD, diabetes, and cancer at baseline.

The much stronger associations of the methylation markers with mortality outcomes, compared with those of commonly studied molecular and genetic biomarkers, and the attenuation or disappearance of the association between current smoking and mortality after adjustment for the methylation markers observed in our study suggest that DNA methylation biomarkers may more accurately summarize individuals' smoking-related risks accumulated through past and current exposure, and thus may be more informative in risk assessment than self-reported smoking history. To our knowledge, this is the first study to evaluate the improvement in risk assessment of fatal CVD when adding DNA methylation biomarkers to conventional risk factors. The increment in C-statistics by adding the methylation intensity at cg05575921 and cg06126421 (approximately 0.04) was much larger than the increment seen by adding a multimarker score in the Framingham Heart Study (C-statistics for the model of major cardiovascular events increased by 0.01) (Wang et al. 2006). In another large population-based cohort, the investigators evaluated six novel biomarkers for cardiovascular risk prediction along with the conventional markers and reported that the NRI was 0.00% and 4.70% for cardiovascular events and coronary events, respectively (Melander et al. 2009). 
They obtained improved NRI by restricting the analyses to individuals with intermediate risk; the reclassification, however, was essentially confined to down-classification of participants without events. Of note, the proportion of reclassified participants was substantial in our study, and consisted of not only down-classification of individuals without events but also up-classification of individuals with events. Given that nearly 22% of participants were reclassified, inclusion of smoking-associated methylation markers in routine screening programs, such as the SCORE risk estimation system, would benefit a substantial proportion of individuals in the population setting and could greatly promote the cost-effectiveness of CVD prevention and therapy. On the other hand, our study was an exploratory investigation of CVD risk prediction using methylation markers, based on a limited number of cardiovascular deaths; thus, our findings need to be validated in an independent population. The performance of these methylation markers for predicting risk of nonfatal CVD or subtypes of fatal CVD, such as coronary and non-coronary heart disease, needs to be evaluated in further studies with high-quality assessment of CVD risk factors as well as CVD events. In addition, to examine the generalizability of the current finding, the performance of methylation markers should also be assessed in relation to other well-established risk scores, such as the Framingham score, and in geographically different populations.
Our study has specific strengths and limitations. Strengths of our study are the population-based prospective study design with comprehensive information on smoking exposure and a variety of covariates, as well as long-term complete mortality follow-up data.
A limitation is that the limited numbers of cause-specific deaths prevented more detailed analyses, such as sex-specific examination of CVD risk prediction or investigation of deaths from well-known smoking-associated subtypes of cancer (CDC 2014; Ezzati et al. 2005a). Future studies with larger numbers of participants would be desirable to further validate our findings. Information on cause of death was based on death certificates, which are known to be imperfect. However, potential misclassification between the broad categories of causes of death assessed in our study is likely to be much less relevant than potential misclassification between specific causes; given the rather consistent findings of an inverse association with methylation intensity for all categories of causes of death, such misclassification is likely to have had only a small impact on the observed results. An additional limitation of our study is that methylation was measured in whole blood, without the possibility of differentiating DNA methylation between cell types. It is therefore conceivable that differences in methylation partly reflect differences in the distribution of leukocyte cell types. However, even if the differences in methylation we observed were primarily or partly due to shifts in leukocyte distribution, the use of these markers for characterizing smoking exposure or for risk prediction would not be invalidated. On the contrary, given that DNA from whole blood is more readily obtainable in most clinical and epidemiological settings, biomarkers based on whole blood may be more relevant for clinical practice. Finally, our results are based on a single study and might be overoptimistic because only the CpG sites that performed best in the exploratory phase of the study were used to create the model and outcome classification. Validation in independent studies should therefore be a priority for future research.
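For readers who wish to apply the two-CpG score in a validation cohort, the score definition given in the table footnote (2 if methylation is in the lowest quartile at both cg05575921 and cg06126421, 1 if at one of them, 0 otherwise) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the example beta values, the quantile convention, and the use of "less than or equal to" at the cutoff are all assumptions.

```python
from statistics import quantiles


def lowest_quartile_cutoff(values):
    """Return the first quartile (25th percentile) of methylation values.

    Assumption: the "inclusive" quantile convention; the paper does not
    state which convention was used.
    """
    return quantiles(values, n=4, method="inclusive")[0]


def methylation_score(cg05575921, cg06126421, cutoff_ahrr, cutoff_6p21):
    """Count how many of the two CpGs fall in the lowest quartile (0, 1, or 2).

    Assumption: a value equal to the cutoff counts as being in the
    lowest quartile.
    """
    in_lowest_ahrr = cg05575921 <= cutoff_ahrr
    in_lowest_6p21 = cg06126421 <= cutoff_6p21
    return int(in_lowest_ahrr) + int(in_lowest_6p21)


# Hypothetical methylation intensities (beta values), for illustration only.
ahrr_values = [0.5, 0.6, 0.7, 0.8, 0.9]
locus_6p21_values = [0.40, 0.45, 0.50, 0.55, 0.60]

cutoff_ahrr = lowest_quartile_cutoff(ahrr_values)
cutoff_6p21 = lowest_quartile_cutoff(locus_6p21_values)

scores = [
    methylation_score(a, b, cutoff_ahrr, cutoff_6p21)
    for a, b in zip(ahrr_values, locus_6p21_values)
]
```

Because the cutoffs are cohort-specific quartiles, a validation study would need to decide whether to reuse the cutoffs of the derivation cohort or to recompute them in the new population.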
Despite its limitations, our study strongly supports the potential utility of DNA methylation markers as indicators of both current and lifetime smoking exposure and as predictors of mortality outcomes, in particular cardiovascular mortality. Adding methylation biomarkers to conventional risk factors might be a promising approach to improving cardiovascular risk assessment and disease prevention; this approach needs to be validated and confirmed in additional studies with large numbers of participants and detailed assessment of known determinants of CVD.
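The reclassification comparison discussed above (up-classification of individuals with events and down-classification of individuals without events) corresponds to the standard categorical net reclassification improvement, NRI = [P(up | event) − P(down | event)] + [P(down | no event) − P(up | no event)]. The sketch below shows that calculation under stated assumptions: the function name, the integer coding of risk categories, and the toy data are hypothetical, and it is not claimed to reproduce the exact procedure used in this study.

```python
def net_reclassification_improvement(old_category, new_category, event):
    """Categorical NRI from risk categories under an old and a new model.

    old_category, new_category: ordered risk categories coded as integers
        (higher means higher predicted risk); coding is an assumption.
    event: truthy if the participant had the event (e.g., CVD death).
    Returns (total NRI, NRI among events, NRI among non-events).
    """
    up_events = down_events = n_events = 0
    up_nonevents = down_nonevents = n_nonevents = 0

    for old, new, had_event in zip(old_category, new_category, event):
        if had_event:
            n_events += 1
            up_events += new > old
            down_events += new < old
        else:
            n_nonevents += 1
            up_nonevents += new > old
            down_nonevents += new < old

    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")

    # Moving up is correct for events; moving down is correct for non-events.
    nri_events = (up_events - down_events) / n_events
    nri_nonevents = (down_nonevents - up_nonevents) / n_nonevents
    return nri_events + nri_nonevents, nri_events, nri_nonevents


# Toy data, for illustration only: 4 participants with events, 4 without.
old = [0, 0, 1, 1, 0, 1, 1, 0]
new = [1, 1, 1, 0, 0, 0, 0, 1]
had_event = [1, 1, 1, 1, 0, 0, 0, 0]

total, among_events, among_nonevents = net_reclassification_improvement(
    old, new, had_event
)
```

Reporting the event and non-event components separately makes it visible whether an improvement is driven only by down-classification of participants without events or also by up-classification of participants with events.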