Associations between Maternal Tobacco Smoke Exposure and the Cord Blood CD4+ DNA Methylome

Background: Maternal tobacco smoke exposure has been associated with altered DNA methylation. However, previous studies largely used methylation arrays, which cover a small fraction of CpGs, and focused on whole cord blood.

Objectives: The current study examined the impact of in utero exposure to maternal tobacco smoke on the cord blood CD4+ DNA methylome.

Methods: The methylomes of 20 Hispanic white newborns (n=10 exposed to any maternal tobacco smoke in pregnancy; n=10 unexposed) from the Maternal and Child Health Study (MACHS) were profiled by whole-genome bisulfite sequencing (median coverage: 6.5×). Statistical analyses were conducted using the Regression Analysis of Differential Methylation (RADMeth) program because it performs well on low-coverage data (it minimizes false positives and false negatives).

Results: We found that 10,381 CpGs were differentially methylated by tobacco smoke exposure [neighbor-adjusted p-values additionally corrected for multiple testing using the Benjamini-Hochberg method for controlling the false discovery rate (FDR) (pFDR)<0.05]. From these CpGs, RADMeth identified 557 differentially methylated regions (DMRs) that were overrepresented (p<0.05) in important regulatory regions, including enhancers. Of nine DMRs that could be queried in a reduced representation bisulfite sequencing (RRBS) study of adult CD4+ cells (n=9 smokers; n=10 nonsmokers), four replicated (p<0.05). Additionally, a CpG in the promoter of SLC7A8 (percent methylation difference: −9.4% comparing exposed to unexposed) replicated (p<0.05) in an EPIC (Illumina) array study of cord blood CD4+ cells (n=14 exposed to sustained maternal tobacco smoke; n=16 unexposed) and in a study of adult CD4+ cells across two platforms (EPIC: n=9 smokers; n=11 nonsmokers; 450K: n=59 smokers; n=72 nonsmokers).

Conclusions: Maternal tobacco smoke exposure in pregnancy is associated with cord blood CD4+ DNA methylation in key regulatory regions, including enhancers. Although we used a method that performs well on low-coverage data, we cannot exclude the possibility that some results are false positives. However, we identified a highly reproducible differentially methylated CpG in the amino acid transporter gene SLC7A8, which may be sensitive to cigarette smoke in both cord blood and adult CD4+ cells. https://doi.org/10.1289/EHP3398


Table of Contents
Replication Look-Up Analyses

Table S1. Demographic Characteristics of the WakeMed SMKE EPIC Study Participants.

Bauer et al. 2016.

Figure S1. Dot and boxplots for the 10 CpGs with the smallest p-values and absolute %methylation differences between 10% and 11%. The CpG position is shown on the x-axis and the %methylation level is shown on the y-axis. Horizontal lines within each boxplot indicate the median %methylation level for each CpG. Interquartile ranges are represented by the upper and lower boundaries of the boxplots. The vertical lines ("whiskers") at the top and bottom of the boxplots indicate the boundaries of 1.5 times the interquartile range. Points beyond these whiskers are outliers. Newborns who were exposed to maternal tobacco smoke in utero are indicated in blue, while unexposed newborns are indicated in pink.

Figure S2. Circos plot showing the 10,381 differentially methylated CpG sites and the 557 differentially methylated regions identified in cord blood CD4+ samples from the Maternal and Child Health Study (n=10 exposed, n=10 unexposed to any maternal tobacco smoke during pregnancy) by chromosome and genomic location. The outermost ring consists of chromosome ideograms, which show the relative size of each chromosome in megabases (Mb) and its banding patterns (darker black and gray bands indicate heterochromatin, white bands indicate euchromatin, red bands indicate centromeres, and blue bands indicate stalks for acrocentric chromosomes). The middle ring shows the differentially methylated CpG sites and the innermost ring shows the differentially methylated regions. CpGs and regions that were hypermethylated in the maternal tobacco smoke exposed group, compared with the unexposed group, are shown in dark blue and dark red, respectively. CpGs and regions that were hypomethylated in the exposed group, compared with the unexposed group, are shown in light blue and pink, respectively. The height of each bar indicates the %methylation difference between groups.

Figure S3. Histograms showing distributions for the A) %methylation differences within differentially methylated regions (DMRs), B) base pair lengths of DMRs, and C) number of differentially methylated CpG sites (raw and false discovery rate-adjusted p<0.05) within DMRs.

Figure S4. (A) Proportion of differentially methylated regions (DMRs) that were hypermethylated (black) and hypomethylated (gray) in the maternal tobacco smoke exposed, compared with unexposed, group by genomic region, (B) corresponding enrichment tests (Fisher's exact test), comparing the number of DMRs overlapping each genomic region with a set of similar-sized regions randomly selected from the genome, and (C) median (range) %methylation differences by genomic region.

Figure S5. Dot and boxplots for the 33 CpGs that were identified as differentially methylated in MACHS (A), which replicated (%methylation difference in the same direction and raw p-value<0.05) in the WakeMed SMKE EPIC array study of cord blood CD4+ cells (B). The name of the EPIC array CpG is shown on the x-axis, and the %methylation level is shown on the y-axis. Horizontal lines within each boxplot indicate the median %methylation value for each CpG. The interquartile range is represented by the upper and lower boundaries of the boxplot. The vertical lines ("whiskers") at the top and bottom of each boxplot indicate the boundaries of 1.5 times the interquartile range. Points beyond these whiskers are outliers. Newborns who were exposed to maternal tobacco smoke in utero are indicated in blue, while unexposed newborns are indicated in pink.

Replication Look-Up Analyses
Replication analyses were conducted using data from two studies of tobacco smoke exposure that also profiled DNA methylation patterns in CD4+ cells.
The first study, SMKE, profiled DNA methylation using Illumina's Infinium MethylationEPIC array in cord blood CD4+ cells isolated from a subset of 30 newborns who were exposed (n=14) or unexposed (n=16) to maternal tobacco smoke and who were recruited from WakeMed hospital in Raleigh, North Carolina. Non-smoking mothers in this study were lifetime non-smokers. Cord blood mononuclear cells were isolated using Ficoll-Paque PLUS (Sigma-Aldrich), and CD4+ antibody-coated magnetic beads (Invitrogen Dynabeads) were used to isolate CD4+ T cells. DNA/RNA was extracted using the Qiagen AllPrep DNA/RNA/miRNA kit (Qiagen, 80224), according to the manufacturer's instructions. Participant characteristics are shown in Table S1. DNA methylation was analyzed on Illumina's Infinium MethylationEPIC array as per the manufacturer's instructions. Raw methylation image files were processed using the minfi package in R (Aryee et al. 2014). Background correction and dye-bias equalization were performed via the normal-exponential out-of-band (noob) correction method. The methylation level at each CpG was reported as the beta-value, β = M / (U + M + 100), where M and U are the intensities of the methylated and unmethylated alleles, respectively. Beta-values were then transformed to obtain the log ratio, defined as log[β/(1 - β)] and referred to below as the M-value. Robust linear regression was used to evaluate the association between DNA methylation (M-value) at each CpG and smoking status while adjusting for potential confounders, including gestational age, infant sex, mother's ethnicity (non-Hispanic black, non-Hispanic white, and Hispanic other), and sample batch. A total of 485 CpGs that were identified as differentially methylated (false discovery rate-adjusted p<0.05) in the Maternal and Child Health Study (MACHS) are represented on the EPIC array and could therefore be queried in the SMKE study.
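As an illustration of the two transformations just described, the following minimal Python sketch computes a beta-value with the offset of 100 and converts it to the log ratio. It is not the minfi implementation used in the study; the function names are hypothetical, and the logarithm base is an assumption (the text states only "log"; base 2 is the usual convention for M-values and is used as the default here).

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Beta-value: methylated intensity / (unmethylated + methylated + offset)."""
    return meth / (unmeth + meth + offset)


def m_value(beta, base=2.0):
    """Log ratio log[beta / (1 - beta)].

    The base is an assumption; the source text does not specify it.
    """
    return math.log(beta / (1.0 - beta), base)


# Hypothetical probe intensities, for illustration only.
b = beta_value(meth=4000.0, unmeth=900.0)
print(round(b, 3), round(m_value(b), 3))
```

The offset keeps the beta-value stable when both intensities are low, and the log ratio is unbounded, which is why it is the quantity modeled in the regression rather than the beta-value itself.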
The second set of replication analyses was conducted using results from a study of adult smokers (n=59) and lifetime non-smokers (n=72) who were recruited at the National Institute of Environmental Health Sciences Clinical Research Unit (NIEHS CRU), which has been described previously (Wan et al. 2018). DNA methylation was measured in CD4+ cells from all participants using Illumina's Infinium HumanMethylation450 (450K) array, from a subset of participants using Illumina's EPIC array (n=9 smokers, n=11 non-smokers), and from a subset of female participants using reduced representation bisulfite sequencing (RRBS) (n=9 smokers, n=10 non-smokers). The 450K and EPIC array data were processed and analyzed using the same methods as those described above for the SMKE study, and statistical models were similarly adjusted for age, sex, and ethnicity. RRBS was carried out as reported previously (Wan et al. 2018). Briefly, DNA from CD4+ cells was digested with MspI, bisulfite converted, made into RRBS libraries, and sequenced on the Illumina NextSeq platform. Bismark version 0.14.3 was used to align the reads to the hg19 assembly, and methylation percentages were derived for each CpG, excluding any sites that had fewer than 10 reads or occurred at a single nucleotide polymorphism, as reported previously (Su et al. 2016). RRBS DMRs were determined using the method of Wan et al. (2018). Demographic characteristics for each set of NIEHS CRU participants are shown in Tables S2-S4.
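The per-CpG filtering step described above (at least 10 reads, no overlap with a single nucleotide polymorphism) can be sketched in Python as follows. This is an illustrative sketch only, not the pipeline of Su et al. (2016) or Wan et al. (2018); the function name, the input layout (a mapping from position to methylated and unmethylated read counts), and the example coordinates are all hypothetical.

```python
def methylation_percentages(counts, snp_sites=frozenset(), min_reads=10):
    """Return %methylation per CpG after coverage and SNP filtering.

    counts: {(chrom, pos): (methylated_reads, unmethylated_reads)}
    snp_sites: set of (chrom, pos) keys to exclude
    """
    result = {}
    for site, (meth, unmeth) in counts.items():
        total = meth + unmeth
        # Drop low-coverage sites and sites that coincide with a SNP.
        if total < min_reads or site in snp_sites:
            continue
        result[site] = 100.0 * meth / total
    return result


# Hypothetical read counts, for illustration only.
counts = {
    ("chr1", 100): (8, 2),    # 10 reads: kept
    ("chr1", 200): (3, 2),    # 5 reads: dropped for low coverage
    ("chr2", 50): (10, 10),   # 20 reads but at a SNP: dropped
}
print(methylation_percentages(counts, snp_sites={("chr2", 50)}))
```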
CpG sites that were found to be differentially methylated (pFDR<0.05) by maternal tobacco smoke exposure status in MACHS and that are represented on the 450K (n=399) and EPIC (n=485) arrays were queried in the SMKE and NIEHS CRU EPIC array results using a Perl (Practical Extraction and Reporting Language) script to match chromosomal positions. Raw p-values<0.05 were considered statistically significant for the replication study, and directions of association were also compared. Similarly, DMRs that were represented in the NIEHS CRU RRBS results (n=9) were identified using chromosome position; raw p-values<0.05 were again considered statistically significant, and directions of effect were compared.

(a) DMRs were identified by merging neighboring CpGs with a false discovery rate-adjusted p<0.05, identified by beta-binomial regression models adjusted for maternal working status and infant sex, using the RADMeth program.
(b) The FANTOM5 consortium predicts targets of enhancers by examining correlations between enhancers and all robust FANTOM5 promoter pairs within 500 kb and then filters these pairs for correlations with p-values<1.0x10
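The position-based look-up described in this section, combined with the replication criterion (same direction of effect and raw p<0.05), can be sketched as follows. The study used a Perl script; this Python version is a stand-in for illustration, and the function name, data layout, coordinates, and values are hypothetical.

```python
def replicated_sites(discovery, replication, alpha=0.05):
    """Match sites by chromosomal position and apply the replication rule.

    discovery:   {(chrom, pos): methylation_difference}
    replication: {(chrom, pos): (methylation_difference, raw_p)}
    A site replicates if both differences have the same sign and raw_p < alpha.
    """
    hits = []
    for site, disc_diff in discovery.items():
        if site not in replication:
            continue  # site not represented in the replication data set
        rep_diff, rep_p = replication[site]
        same_direction = disc_diff * rep_diff > 0
        if same_direction and rep_p < alpha:
            hits.append(site)
    return sorted(hits)


# Hypothetical inputs, for illustration only.
discovery = {("chr1", 1000): -8.0, ("chr2", 2000): 10.2, ("chr3", 3000): -10.5}
replication = {
    ("chr1", 1000): (-3.1, 0.01),   # same direction, p<0.05: replicates
    ("chr2", 2000): (-2.0, 0.03),   # opposite direction: does not replicate
    ("chr4", 4000): (1.5, 0.001),   # not among the discovery sites
}
print(replicated_sites(discovery, replication))
```

Keying both data sets on (chromosome, position) assumes that they share a genome assembly; coordinates from different assemblies would need to be converted before matching.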