Cadmium-Associated Differential Methylation throughout the Placental Genome: Epigenome-Wide Association Study of Two U.S. Birth Cohorts

Background: Cadmium (Cd) is a ubiquitous environmental toxicant that can accumulate in the placenta during pregnancy, where it may impair placental function and affect fetal development. Objectives: We aimed to investigate Cd-associated variations in placental DNA methylation (DNAM) and associations with gene expression; we also aimed to identify novel pathways involved in Cd-associated reproductive toxicity. Methods: Using placental DNAM and Cd concentrations in the New Hampshire Birth Cohort Study (NHBCS, n=343) and the Rhode Island Child Health Study (RICHS, n=141), we performed an epigenome-wide association study (EWAS) between Cd and DNAM, adjusting for tissue heterogeneity using a reference-free method. Cohort-specific results were aggregated via inverse variance weighted fixed effects meta-analysis, and variably methylated CpGs were associated with gene expression. We then performed functional enrichment analysis and tests for associations between gene expression and birth size metrics. Results: We identified 17 Cd-associated differentially methylated CpG sites with meta-analysis p-values<1×10−5, two of which were within a 5% false discovery rate (FDR). DNAM levels at 9 of the 17 loci were associated with increased expression of 6 genes (5% FDR): TNFAIP2, EXOC3L4, GAS7, SREBF1, ACOT7, and RORA. Higher placental expression of TNFAIP2 and ACOT7 and lower expression of RORA were associated with lower birth weight z-scores (p-values<0.05). Conclusion: Cd-associated differential DNAM and corresponding DNAM-expression associations were observed at loci involved in inflammatory signaling and cell growth. The expression levels of genes involved in inflammatory signaling (TNFAIP2, ACOT7, and RORA) were also associated with birth weight, suggesting a role for inflammatory processes in Cd-associated reproductive toxicity. https://doi.org/10.1289/EHP2192


Introduction
Cadmium (Cd) is a toxic heavy metal that is released into the environment from mining and industrial processes, the application of phosphate fertilizers, and fossil fuel combustion; it accumulates in soils and is readily taken up by plants (ATSDR 2012). Human exposure primarily occurs via consumption of contaminated foods or through using tobacco products (Järup and Åkesson 2009). Cadmium serves no biological role in humans and can cause severe health consequences such as kidney damage, bone and joint problems, and various cancers (ATSDR 2012). Trace exposures to cadmium may also have toxic effects on pregnant mothers and on the developing fetus (Thompson and Bannigan 2008).
Recent epidemiologic studies have observed associations between Cd concentrations in maternal and/or fetal tissues with restricted fetal growth and pregnancy complications (Al-Saleh et al. 2014, Kippler et al. 2012bLaine et al. 2015;Menai et al. 2012;Wang et al. 2016a). The toxic mechanisms that underlie these associations are not clear, although Cd is known to induce oxidative stress, interfere with cell-cycle regulation, alter apoptotic signaling at the cellular level (Valko et al. 2016), and to act as an endocrine disruptor (Jiménez-Ortega et al. 2012). Additionally, the placenta provides a partial barrier against maternal-fetal Cd transfer, resulting in Cd accumulation in the placenta throughout pregnancy, although its accumulation within the placenta is likely not without consequence (Gundacker and Hengstschläger 2012). Animal models and in vitro studies have suggested that maternal Cd exposure during pregnancy can perturb maternal-fetal nutrient and waste transfer (Kippler et al. 2010;Wang et al. 2016b), increase oxidative stress in placental tissue (Wang et al. 2012), interfere with placental glucocorticoid synthesis (Wang et al. 2014), and alter trophoblast proliferation and apoptosis (Erboga and Kanter 2016). Thus, the placenta appears to be a critical target tissue for the toxic effects of Cd during human pregnancies.
We hypothesize that various Cd-associated perturbations to placental functions are likely linked to the selective expression or repression of specific biological pathways. Similarly, maternal Cd exposure during pregnancy has been associated with variations in DNA methylation (DNAM), an epigenetic mechanism that regulates gene expression potential, in maternal and fetal blood samples (Vilahur et al. 2015). To date, only one study has investigated associations between placental Cd and DNAM throughout the placental genome in humans. Mohanty et al (2015) explored Cd-associated DNAM in a small study of 24 placentae, identifying multiple differentially methylated loci nearby or within genes involved in cell damage, angiogenesis, cell differentiation, and organ development; these associations differed by fetal sex (Mohanty et al. 2015). Studies with larger samples using recent methodological advances to estimate and adjust for tissue heterogeneity are needed to understand how Cd exposure influences the placental epigenome of human pregnancies and whether such variations are related to fetal growth restriction.
We conducted a study of the associations between placental Cd and placental DNAM throughout the epigenome across two well-characterized independent cohorts while examining the potential impact of tissue heterogeneity on these associations. Secondarily, we investigated whether the top hits from our analysis were enriched for biological pathways consistent with findings from other studies; whether differentially methylated loci were associated with the expression levels of nearby genes; and whether these variations in expression were associated with standardized measures of birth weight, birth length, and head circumference.

New Hampshire Birth Cohort Study
This study included mother-infant pairs from the New Hampshire Birth Cohort Study (NHBCS), an ongoing birth cohort initiated in 2009. Women who were currently pregnant, were between 18 and 45 y of age, were receiving prenatal care from one of the study clinics in New Hampshire, who reported that the primary source of drinking water at their residence was from an unregulated well, and who had resided in the same household since their previous menstrual period with no plans to move before delivery were enrolled in the cohort. All participants provided written informed consent in accordance with the requirements of the Institutional Review Board (IRB) of Dartmouth College. The sample for this study consisted of NHBCS participants who were recruited between February 2012 and September 2013, whose placenta were sampled to conduct genetic and epigenetic assays (n = 343). Placental gross measures such as placental diameter (cm) and placental weight (g) were collected immediately after delivery. Interviewer-administered questionnaires and medical record abstraction were utilized to collect sociodemographic, lifestyle, and anthropometric data.

Rhode Island Child Health Study
The Rhode Island Child Health Study (RICHS) enrolled motherinfant pairs with nonpathologic pregnancies at the Women & Infants Hospital in Providence, Rhode Island, between September 2010 and February 2013. Exclusion criteria consisted of mothers being <18 y of age or having life-threatening conditions, pregnancies with gestational time <37 wk, or infants with congenital/ chromosomal abnormalities. All protocols were approved by the institutional review boards at the Women & Infants Hospital of Rhode Island and Dartmouth College, and all participants provided written informed consent. Infants who were born small for gestational age [≤10th birth weight (BW) percentile] or large for gestational age (≥90th BW percentile) were oversampled; then, infants who were adequate for gestational age (between the 10th and 90th BW percentiles) that matched on gestational age and maternal age were coincidentally enrolled. This study included all mother-infant pairs for whom placental metal concentrations and placental DNAM arrays had been conducted (n = 141). Interviewer-administered questionnaires were utilized to collect sociodemographic and lifestyle data; anthropometric and medical history data were obtained via structured medical records review. Samples with complete covariate, exposure, and outcome data were utilized for this study.

Anthropometric Measures
Birth weight (g), head circumference (cm), and birth length (cm) were abstracted from medical records. z-Scores were calculated for each, standardized by gestational time and infant sex, via Fenton growth curves (Fenton and Kim 2013). We classified infants with standardized BW percentiles below the 10th percentile as small for gestational age (SGA), those above the 90th percentile as large for gestational age (LGA), and those between the 10th and 90th percentile as adequate for gestational age (AGA).

Potential Covariates
Highest achieved maternal education level was self-reported and was defined as those with education beyond the high school level versus those who did not go beyond high school. Maternal smoking during pregnancy was defined as any self-reported smoking during any point of pregnancy versus those who reported no smoking during pregnancy. Fetal sex was abstracted from medical records. Samples with complete covariate, placental Cd, and placental DNAM data were utilized for this study.

Sample Collection
Placenta samples from RICHS and NHBCS were biopsied from the fetal side adjacent to the cord insertion site within 2 h of delivery, and maternal decidua was removed. Samples were placed in RNAlater (Life Technologies) and then frozen at −80 C. Both RNA and DNA were extracted (Norgen Biotek) and quantified using a Qubit Flourometer (Life Technologies) and were subsequently stored at −80 C. Both cohorts followed the same sampling protocol.

DNA Methylation, Quality Control, and Normalization
Both cohorts measured DNAM via Illumina Infinium Human Methylation450K BeadArray (Illumina) at the University of Minnesota Genomics Center. Bisulfite modification was performed with an EZ Methylation kit (Zymo Research, Irvine, CA), samples were randomized across multiple batches, batch variables were recorded to allow for batch-effect corrections, and data were assembled using BeadStudio (Illumina). The raw array data are available via the NCBI Gene Expression Omnibus (GEO) for NHBCS and RICHS via accession numbers GSE71678 and GSE75248, respectively. For quality control (QC), we excluded probes with poor detection p-values (p-value >0:001), measuring DNAM at X-and Y-linked CpG sites, with a single nucleotide polymorphism (SNP) within 10 bp of the target CpG or single base extension (SBE) (and allelic frequency >1%), or crosshybridizing to multiple genomic regions (Chen et al. 2013). Background correction, dye bias, and functional normalization were performed via the minfi package (Fortin et al. 2014), and standardization across probe types was performed via beta mixture quantile (BMIQ) normalization (Teschendorff et al. 2013). Batch effects were corrected for using empirical Bayes via the combat function in R (Johnson et al. 2007). We examined whether principal components from the methylation array were associated with batch variables and found no indications of remaining batch effects after this correction. We also confirmed that batch variables were not associated with continuous [analysis of variance (ANOVA) p-values: NHBCS = 0:326, RICHS = 0:762] or dichotomous Cd (v 2 p-values: NHBCS = 0:980, RICHS = 0:688) in either cohort. DNAM data were reported as b-values, representing the proportion of methylated alleles for each individual CpG site. For the epigenome-wide association (EWAS) models described below we utilized M-values, logit-transformation of the beta values, which better approximate a normal distribution (Du et al. 2010).

RNA Sequencing, QC, and Normalization
Transcriptome-wide sequencing was performed on 200 placental samples from RICHS. We used an RNeasy Mini Kit (Qiagen) to isolate total RNA, which was then stored at −80 C until analysis. We quantified RNA using a Nanodrop Spectrophotometer (Thermo Scientific), assessed RNA integrity via Agilent Bioanalyzer (Agilent), removed ribosomal RNA with a Ribo-Zero Kit (Huang et al. 2011), converted to cDNA using random hexamers (Thermo Scientific), and performed transcriptomewide RNA sequencing via the HiSeq 2500 platform (Illumina) (Bentley et al. 2008). Raw reads have been deposited at the NCBI sequence read archive (SRP095910). QC was performed in FastQC; then, reads were mapped to the human reference genome (h19) using the Spliced Transcripts Alignment to a Reference (STAR) aligner. Low-expressed genes were excluded, and read counts were adjusted for GC content (Risso et al. 2011) and then normalized via the trimmed mean of M-values (TMM) (Robinson et al. 2010); final data are normalized log 2 counts per million (logCPM) reads.

Cadmium Quantification
Trace element concentrations were quantified in RICHS and NHBCS placental samples at the Dartmouth Trace Elements Analysis Core using inductively coupled plasma mass spectrometry (ICP-MS); details of the processing are described elsewhere (Punshon et al. 2016). None of the RICHS samples had nondetectable Cd concentrations, and only four of the NHBCS samples received nondetects, which were assigned a value equal to the 0.5th percentile of detectable values. Owing to the skewed nature of the Cd concentration distributions and to the infeasibility of exploring modeling assumptions for all EWAS models, we dichotomized cadmium concentrations by the cohort-specific median concentrations (NHBCS median = 3:13 ng=g) and (RICHS median = 4:37 ng=g) for the EWAS analysis. Cd concentrations were log 2 -transformed for follow-up analyses of significant findings, which utilized continuous data.

Estimation of Placental Tissue Heterogeneity
We addressed the potential issue of confounding due to DNAMassociated tissue heterogeneity via the RefFreeEWAS package in R (Houseman et al. 2016). This method utilized nonnegative matrix factorization to estimate the proportions of putative cellular mixtures and has been demonstrated to yield reliable estimates of the constituent cell types and the underlying methylomes that define them (Houseman et al. 2016). Using the 10,000 CpGs with the greatest variability in DNAM, we identified four and five putative constituent cell mixtures in the RICHS and NHBCS cohorts, respectively, and then used the full set of CpGs for each cohort (NHBCS p = 408,367 and RICHS p = 397,040) to estimate the relative proportions of each putative cell type per placental sample. We examined whether the 10,000 CpGs with the largest variance across putative cell-type-specific DNAM values were enriched for gene ontology (GO) terms or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways via the gometh function in the missMethyl package in R (Phipson et al. 2015). This method adjusts for the potential bias introduced by some genes having a greater probability of being included in the gene set because they have greater numbers of CpGs on the 450K array. We examined the consistency of the CpGs with high cell-typespecific DNAM as well as the consistency in the top 100 GO terms and KEGG pathways associated with those highly variable CpGs.

Epigenome-Wide Association Study
Robust linear regressions (rlm) were used to estimate the associations between CpG-specific DNAM levels and high versus low placental concentrations of Cd. We produced two covariateadjusted models for each CpG within each cohort, first regressing M-values on dichotomized Cd concentrations (cohort-specific median as the reference) while adjusting for infant sex, maternal smoking during pregnancy, and highest achieved maternal education level. Second, we produced the same models while also adjusting for the estimated proportions of putative placental cell types. Genomic inflation was examined via the genomic inflation factor (k), which was calculated using the genABEL package in R (Aulchenko et al. 2007). We selected the 250 CpGs with the smallest Cd-associated meta-analysis p-values and assessed these for functional enrichment with GO terms and KEGG pathways as described above. Statistical significance for enrichment was determined at a 5% false discovery rate (FDR).

Meta-Analyses
We used METAL to meta-analyze the cohort-specific Cdassociations via inverse variance weighted fixed effects models, and we evaluated interstudy heterogeneity via Cochran's Q-test p-values <0:05. (Willer et al. 2010). We produced FDR-adjusted p-values using the qvalue package in R. Statistical significance was determined at an FDR of 5%. Volcano plots were produced to visualize the overall magnitude and direction of associations, and Manhattan plots were produced using the qqman package to visualize the genomic distribution of significant associations. Plots for specific regions of the genome were visualized using the coMET package (Martin et al. 2015). Top hits from the study were also evaluated for dose-response relationships via modeling associations between log 2 -transformed continuous Cd and DNAM, using linear mixed models, allowing for random intercepts by study. Plots of FDR-significant hits utilized DNAM b-values as the dependent variable for more intuitive interpretation.

Expression Quantitative Trait Methylation Analysis
We then tested whether any of the Cd-associated CpGs (p-values <1 × 10 −5 ) could be expression quantitative trait methylation (eQTM) loci within the 200 RICHS samples, 85 of which overlap with the samples used in the EWAS. We tested for cisgene-CpG associations (within 100 kb of the target CpG site) by regressing gene expression levels (log 2 ðCPMÞ) on CpG b-values using robust linear models. Statistical significance for eQTM associations was determined at a 5% FDR.

Associations with Fetal Growth
Genes that were identified as potential eQTMs at a 5% FDR threshold were then tested for associations between birth metrics and gene expression using Kendal correlations, a nonparametric correlation test that allows for ties. Because birth metric z-scores were standardized by fetal sex and by weeks of gestation, no additional adjustments were made to these association tests.

Replication of Previous Studies and Sex-Specific Effects
We then investigated whether we could reproduce the results from a previous study of sex-specific associations between placental Cd and placental DNAM (Mohanty et al. 2015). Because multiple studies of various fetal and maternal tissues have suggested that associations between Cd and DNAM may differ by fetal sex (Kippler et al. 2013;Mohanty et al. 2015;Virani et al. 2016), we also screened the CpGs from our study with metaanalysis p-values <1 × 10 −5 for interactions with fetal sex. For these analyses, we utilized linear mixed models with DNAM Mvalues as the outcome, allowing for random intercepts by study, and we included an interaction term between DNAM levels and fetal sex while adjusting for maternal education and maternal smoking. We utilized the Wald test to produce p-values and determined statistically significant interactions at a p-value threshold of 0.05. Any models producing interaction p-values <0:05 were then rerun stratified by fetal sex.

Results
The EWAS for this study included 343 mother-infant pairs drawn from the New Hampshire Birth Cohort (NHBCS) and 141 mother-infant pairs from the Rhode Island Child Health Study (RICHS). The most notable differences between the NHBCS and RICHS samples (Table 1) were with regard to racial diversity and birth size. Because RICHS was over-sampled for LGA and SGA infants by design, we observed a higher proportion of SGA and LGA infants compared with NHBCS. Placental samples from the RICHS cohort also had slightly higher median Cd concentrations (4:34 ng=g) than samples from the NHBCS cohort (3:13 ng=g), and concentrations were approximately log-normally distributed in both cohorts (see Figure S1).
We regressed DNAM M-values (at 408,367 CpGs in NHBCS and 397,040 CpGs in RICHS) on dichotomized placental Cd concentrations (median split) while adjusting for fetal sex, maternal smoking during pregnancy, and maternal education level in each cohort. We then combined parameter estimates and p-values via inverse variance weighted fixed effects meta-analysis (390,583 CpGs overlapped between the cohorts), which yielded three loci with Cd associations at a 5% FDR and 18 additional loci with p-values <1 × 10 −5 (Table 2), in which Cd was predominantly associated with hypermethylation.
Cell-specific DNAM is a well-recognized confounder in epigenetic epidemiology, leading many studies to estimate the proportions of constituent cell types from the DNAM array data (Houseman et al. 2012) and to adjust for these proportions in their regression models. However, estimation of cell-type proportions requires access to cell-type-specific methylomes for the individual cells that constitute the heterogenous tissue under study, and no reference methylomes currently exist for placental tissue. Thus, we used a reference-free method that deconvoluted the major axes of variation in DNAM, which are likely associated with tissue heterogeneity, to estimate the proportions of the putative constituent cell types (PCTs) in our placental samples (Houseman et al. 2016). With this method, we estimated five and four PCTs in NHBCS and RICHS, respectively. We then tested whether the 10,000 CpGs with the greatest variability in PCTspecific DNAM were enriched for biologically relevant GO terms or KEGG pathways, and we checked for consistency between the two cohorts. We observed 31.6% overlap among these 10,000 PCT-defining CpGs and strong concordance in the biological functions of the top 100 GO terms (60% overlap) and KEGG pathways (76% overlap) across the two cohorts. The top 10 KEGG pathways from both cohorts included metabolic pathways, neuroactive ligand-receptor interaction, axon guidance, calcium signaling, olfactory transduction, cell adhesion molecules (CAMs), and cAMP signaling (see Excel Tables S1 and S2; see also Figure S2).
We then tested whether the distribution of these PCTs varied with placental Cd concentrations. Higher Cd was associated with increased proportions of PCT-1 (3.4% difference, T-test p-value = 0:010) and PCT-2 (4.7% difference, T-test p-value = 0:022) in NHBCS, whereas higher Cd was associated with increased proportions of PCT-2 (7.3% difference, T-test p-value = 0:047) in RICHS. Because these associations may confound the relationships between Cd and locus-specific DNAM, we then reproduced the cohort-specific multivariable models and meta-analysis while also adjusting for the estimated PCT proportions. These adjustments substantially reduced inflation in NHBCS (k = 1:47 before adjustment and k = 0:96 after adjustment), and only modestly increased inflation in RICHS (k = 1:03 before adjustment and k = 1:10 after adjustment). Furthermore, the PCT adjustments reduced the proportion of small heterogeneity p-values (<0:05) in the meta-analysis from 7.9% in the non-PCT-adjusted models to 4.6% in the PCT-adjusted models. (See Excel Table S3 and Excel Table S4 for full results from all cohort-specific and meta-analyses).
We compared the meta-analysis parameter estimates and p-values before and after PCT adjustment across all CpGs via plots, LOESS curves, and Spearman correlations. We found that overall, the parameter estimates tended to become attenuated by PCT adjustment (see Figure S3) and produced fewer small p-values (see Figure S4), indicating an overall confounding by estimated cellular heterogeneity. For example, the parameter estimates for our top three hits from the original meta-analysis (cg15345741, cg16768966, and cg05888894) were reduced by 11%, 16%, and 45%, respectively, after adjustment for PCTs, indicating that cell mixture likely explained much of the Cd-associated variation for cg05888894, but only a modest proportion of the Cd-DNAM associations at cg15345741 and cg16768966. Additionally, the coefficients for cg15345741 and cg16768966 retained highly significant p-values (<1 × 10 −5 ) after PCT adjustment. Despite a tendency towards attenuation, the estimated effects before and after adjustment were highly correlated (q = 0:81, p-value = less than 2:2 × 10 −16 ), although p-values were only moderately correlated (q = 0:60, p-value <2:2 × 10 −16 ). Overall, higher Cd was more commonly associated with higher DNAM, and our suggestive and FDR-significant hits were spread throughout the genome (Figure 1). The PCT-adjusted top hit from NHBCS was cg20656525 on chromosome 14, downstream of the genes encoding TNF alpha induced protein 2 (TNFAIP2) and exocyst complex component 3 like 4 (EXOC3L4), whereas the top hit from RICHS was cg07708653 on chromosome 1, within the body of the 4-hydroxyphenylpyruvate dioxygenase like (HPDL) gene. The PCT-adjusted meta-analysis yielded 17 CpGs with Cd-association p-values <1 × 10 −5 (Table 3), two of which were significant at a 5% FDR (cg15345741 and cg11707084). Both of the FDRsignificant CpGs also had significant linear associations with log-Cd (cg15345741 estimate = 0:17, p-value = 0:00019; cg11707084 estimate = 0:16, p-value = 0:000014) in linear mixed models. For these two CpGs, we regressed DNAM b-values over dichotomized Cd, continuous Cd, and log-transformed Cd, using the pooled Table 2. Results from cohort-specific and meta-analyses of Cd-associated DNA methylation, for probes with meta-analysis p-values <1 × 10 −5 ; adjusted for maternal smoking during pregnancy, maternal education, and fetal sex.

CpG annotations
Meta-analysis NHBCS (n = 343) RICHS (n = 141) CpG ID Ch. Genomic position Gene  Figure 1. Volcano plots from each cohort-specific epigenome-wide association study (EWAS) and the meta-analysis, and Manhattan plot of meta-analysis results. All models adjusted for fetal sex, maternal smoking during pregnancy, mother's highest achieved education level, and putative placental cell types; dashed line represents the false discovery rate (FDR) 5% threshold; dotted line represents suggestive association (p-values <1 × 10 −5 ).
data adjusted for the same covariates and an indicator for study to visualize the approximate percent change in DNAM associated with Cd ( Figure 2). The meta-analyses again revealed a clear trend towards higher DNAM levels associated with higher Cd concentrations; 94% of Cd-associated loci with p-values <1 × 10 −5 had positive beta coefficients. Of note, 41% of CpGs with suggestive p-values (<1 × 10 −5 ) were on chromosome 17, five of which were within the growth arrest specific 7 (GAS7) gene. Given the high proportion of GAS7 CpGs among the top hits from the meta-analysis, we explored the comethylation structure within GAS7 and all Cd-DNAM associations within and around this gene. The strongest Cd-associated DNAM sites tended to congregate around the first exons of three of the four different GAS7 variants (GAS7-a, -b, and -d) and exhibited a strong comethylation structure as shown by strong positive Spearman correlation p-values (Figure 3). We also investigated the underlying biology that may be affected by Cd-associated variations in the placental epigenome. Because we had too few FDR-significant findings to perform an enrichment analysis on only those hits, we tested for GO term and KEGG pathway enrichment among the top 250 CpGs associated with placental Cd after PCT adjustment. The top GO terms included dimethylarginase activity, eye and heart valve development, and heparan sulfate proteoglycan and polysaccharide biosynthetic processing (see Excel Table S5), although none of these enrichments was significant at an FDR threshold of 5%. In contrast, our gene set was enriched with genes from numerous KEGG pathways (see Excel Table S6), 12 of which yielded p-values within an FDR threshold of 5% (Table 4); these included cell adhesion molecules (CAMs) and tight junctions, multiple cancer pathways, multiple G-protein and nitric oxide signaling pathways, and cytotoxic processes.
To explore whether gene expression levels might be associated with variations in DNAM levels at Cd-associated CpGs with meta-analysis p-values <1 × 10 −5 , we performed an expression quantitative trait methylation (eQTM) (Teschendorff and Relton 2017) analysis. We regressed the expression levels of any gene within 100 kb of a candidate CpG site on the DNAM levels of that CpG. Three of these CpGs (cg24000528, cg03917020, and cg23193177) were not within 100 kb of any genes that yielded detectable RNA from our placental samples. Thus, 14 CpGs and 32 genes were included in this analysis; only GAS7 was cis to multiple CpGs within our set of candidates. We produced 36 linear models ranging between 1 and 7 genes tested per CpG (Table  5). Of the 36 eQTM models, 10 yielded FDR-significant linear associations between DNAM and expression of six unique genes (see Figure S5). Of particular note were the associations between our top hit from the meta-analysis and the expression levels of EXOC3L4 (b 1 = 1:95, p-value = 0:0043) and TNFAIP2 (b 1 = 1:88, p-value = 0:0055). Additionally, all five of the Cd-associated GAS7 CpGs (within or nearby the first exons of variants a, b, and d) were associated with increased GAS7 expression (Table 5). In contrast, DNAM surrounding the first exon of GAS7 variant c tended to be inversely correlated with mRNA levels (Figure 3).
We then investigated whether the expression levels of the FDR-significant genes from the eQTM analysis were associated with fetal growth metrics standardized by sex and gestational age (Fenton and Kim 2013) (Table 6). We found that higher placental expression levels of TNFAIP2 and acyl-CoA thioesterase 7 (ACOT7) were modestly associated with decreasing birth weight z-scores, and higher expression of retinoic acid receptor-related orphan receptor alpha (RORA) was modestly associated with increased birth weight z-scores (p-values <0:05). Higher expression of ACOT7 was also associated with decreasing birth length and head circumference z-scores (p-values <0:05). None of the other gene expression levels was associated with birth metrics.
Finally, we explored whether top hits from a previous small-sample study (n = 24) of placental Cd and DNAM levels (Mohanty et al. 2015) demonstrated associations in our study. In our meta-analysis, one of the top six loci from that study, cg04528060 on chromosome 4 within the ADP-ribosylation factor-like 9 (ARL9) gene, demonstrated a nominally significant association with Cd (b 1 = − 0:49, p-value = 0:014), with the same direction of effect. However, we did not observe any evidence of sex-specific effects at this or the other 5 CpG sites. We also explored whether any CpGs from our meta-analysis with p-values <1 × 10 −5 yielded statistically significant interactions between Cd and fetal sex (see Excel Table S5). Of the 17 models, only cg02600679 within the gene body of piezo type mechanosensitive ion channel component 1 (PIEZO1) produced a significant interaction (p-value = 0:031). Stratifying by fetal sex revealed that Cd was associated with increased DNAM at Table 3. Results from cohort-specific and meta-analyses of cadmium-associated DNA methylation, for probes with meta-analysis p-values <1 × 10 −5 ; adjusted for maternal smoking during pregnancy, maternal education, fetal sex, and estimated tissue heterogeneity. CpG Sites that yielded meta-analysis p-values <1 × 10 −5 in models before and after heterogeneity adjustments.

Sensitivity Analyses
We performed sensitivity analyses to a) evaluate the linear relationships between log 2 -Cd and DNAM, and b) evaluate the potential confounding effects of maternal smoking during pregnancy. To evaluate the linear relationships, we tested the associations between DNAM and log-Cd at the 17 CpG sites from Table  3 while adjusting for sex, maternal education, maternal smoking, and PCTs within each cohort; then, we meta-analyzed the results (see Table S1). All 17 models produced nominally significant (p-values <0:05) linear associations with log-Cd with the same direction of effect, although p-values for these associations (ranging between 0.014 and 0.000019) did not produce suggestive significance (p-values <1 × 10 −5 ). To evaluate whether our observed associations between dichotomized Cd and DNAM were driven by the effects of maternal smoking during pregnancy, we reproduced the fully adjusted study-specific models and meta-analyses after excluding all mothers who reported smoking during pregnancy. The associations we observed between Cd and DNAM Figure 2. Differences in average DMA methylation (DNAM) by cadmium (Cd) as a dichotomous, continuous, and log-transformed variable (panels A, B, and C, respectively), at the two false discovery rate (FDR)-significant CpGs (left, cg11701084; right, cg15345741). Robust linear regression models were adjusted for fetal sex, maternal smoking during pregnancy, mother's highest achieved education level, putative placental cell types, and an indicator for study [New Hampshire Birth Cohort Study (NHBCS) and Rhode Island Child Health Study (RICHS)]. For models of dichotomous Cd, values were classified as low or high according to the median value for each cohort (3.13 and 4:37 ng=g for NHBCS and RICHS, respectively.). did vary somewhat after excluding smokers, but there was no strong attenuation of b-coefficients towards the null (see Figures  S6 and S7). Among our top hits, the Cd-DNAM associations were only modestly attenuated (see Table S2): b coefficients changed by >20% for only 3 CpGs and by >10% for an additional 5 CpGs; however, only 3 CpGs retained meta-analysis p-values <1 × 10 −5 . In addition, maternal smoking was not significantly associated with estimated placental heterogeneity in either cohort (see Table S3; minimum T-test p-value = 0:09) or with the expression of our top genes (see Table S4; minimum T-test p-value = 0:07). Our top two hits from the original analysis retained strong associations with Cd when utilizing log-Cd Figure 3. Manhattan plot of Cd meta-analysis results surrounding the GAS7 gene. Annotated with Ensembl gene names and RefSeq IDs for GAS7 variants, Spearman correlations between GAS7 CpG sites, and CpG-expression correlations. Cadmium (Cd) associations at CpGs outside of the GAS7 coding region ( ± 1,500 bp) have triangles, whereas CpGs within GAS7 have squares; hits with meta-analysis p-values <1 × 10 −5 are indicated with an asterisk (*). Directions of association between DNA methylation (DNAM) and GAS7 expression are indicated with + and − signs for those relationships whose Spearman correlation p-values were nominally significant (p-values <0:05).

Discussion
We conducted a large EWAS study across two cohorts to examine associations between Cd concentrations and DNAM from human placentae while considering the potential confounding effects of placental tissue heterogeneity and maternal smoking. We further assessed eQTM and associations with birth metrics. We used identical measurement techniques for both DNAM and trace metal measurement in the two cohorts, and the placental concentrations of Cd were similar to the 3:53 ng=g previously reported in the full NHBCS cohort (Punshon et al. 2016), and towards the lower end but consistent with other cohorts that have reported concentrations ranging from as low as 1:2 ng=g to as high as 53:3 ng=g (Esteban-Vasallo et al. 2012). We identified Cd-associated variations at two CpGs within a 5% FDR threshold and an additional 15 CpGs with meta-analysis p-values <1 × 10 −5 . At all but one of these loci, higher Cd was associated with increased DNAM levels, and each of these sites produced linear associations with log-Cd and in the same direction of effect at a nominal significance level (p-values <0:05) We also found that the DNAM levels at many of these CpGs were associated with the expression patterns of six genes, and the expression of three genes (TNFAIP2, ACOT7, and RORA) was associated with birth metrics. The observed preference for Cd-associated hypermethylation, as opposed to hypomethylation, is consistent with the findings of previous studies that observed higher Cd to be associated with a higher proportion of hypermethylation at gene promoters from fetal and maternal blood samples (Sanders et al. 2014) and with increased DNAM at regulatory regions for imprinted genes from cord blood samples (Vidal et al. 2015). In contrast, others have observed Cd-associated hypomethylation in placental DNA; however, this finding was from a relatively small study that focused on sex-specific associations (Mohanty et al. 2015). Cd has been shown to enhance DNA methyltransferase activity and thus to increase the preponderance of hypermethylated loci, although these effects on DNAM may vary by timing and duration of exposure (Takiguchi et al. 2003).
The top hit from our study, cg15345741 is on chromosome 14, approximately 70 kb and 97 kb downstream of the TNFAIP2 and EXOC3L4 genes, respectively, which encode tumor necrosis factor alpha-induced protein 2 and the exocyst complex component 3 like 4 protein. Higher DNAM at cg15345741 was associated with higher expression of both TNFAIP2 and EXOC3L4 and thus may be involved in the regulation of these genes or as a possible marker of their expression, although our findings are not consistent with the promoter methylation paradigm. Other investigators have observed that arsenic-associated DNAM variations in the first exon, particularly within CpG islands, have been associated with increased gene expression in cord blood leukocytes (Rojas et al. 2015). Furthermore, environmental toxicant-associated DNAM   variations in nonpromoter regions have been suggested to represent transcription factor binding activity (Martin and Fry 2016). Interestingly, the location of our top CpG does overlap with a DNase hypersensitivity region and with multiple putative transcription factor binding sites, which may explain the apparent relationship between DNAM and the expression of TNFAIP2 and EXOC3L4. We also identified the Cd-associated locus cg11707084 on chromosome 19, which is cis to six genes, although the expression levels of most of these genes were not detectable in placental tissue, and those that were did not yield associations with DNAM at our candidate loci. Thus, the potential functional role of DNAM at cg11707084 is unclear. Both TNFAIP2 and EXOC3L4 also serve other biological roles that may be important to Cd-associated toxicity. For example, the expression of TNFAIP2 can be up-regulated by TNFa, retinoic acid, interleukin-1b, and other pro-inflammatory and cytotoxic signals (Rusiniak et al. 2000;Sarma et al. 1992). SNPs within EXOC3L4 have been implicated in affecting c-glutamyl transferase (GGT) activity (Middelberg et al. 2012), and elevated levels of GGT are recognized markers of oxidative stress (Fentiman 2012), which plays a central role in Cd toxicity. Disrupted epigenetic regulation of these genes and their involvement in placental growth, development, inflammation, and/or oxidative stress responses present compelling potential mechanisms through which Cd may elicit some of its reproductive toxicity.
Our top 250 sites included genes that were significantly enriched for nitric oxide (NO) and G-protein signaling pathways (RAS, apelin, oxytocin, and cGMP-PKG), cancerous processes, cellular adhesion, cellular metabolism, and cytotoxicity. These pathways are consistent with currently recognized mechanisms of Cd-associated toxicity that have primarily been studied in animal or in vitro models: altered signal transduction, impaired cellular adhesion, disruptions to the cell cycle, increased oxidative stress, and cytotoxic/apoptotic signaling (Thompson and Bannigan 2008). We provide evidence that these biologic processes in the human placenta may be perturbed by Cd-associated differential DNAM, despite overall Cd exposures being low and our study samples primarily consisting of healthy pregnancies.
Expression of the GAS7 gene, which encodes the growth arrest specific 7 protein, was strongly associated with DNAM at all five candidate CpGs that were associated with Cd (p-values <1 × 10 −5 ). Interestingly, the strongest Cd associations were localized near the first exons of transcript variants a, b, and d for GAS7, and thus these variations may play roles in Cd-associated alternative splicing of this gene. However, the read-depth of our RNA-seq data was not sufficient to accurately test for variant-specific expression levels; this should be investigated further in future studies given the potential for variantspecific functions that is described below. The GAS7 gene, which is highly expressed in embryonic cells (Moorthy et al. 2005), mature brain cells, and Purkinje neurons, plays an essential role in neurite outgrowth (You and Lin-Chao 2010) and in inducing neuronal cell death (Hung and Chao 2010); specific functions may be isoform dependent. In the mouse embryo, levels of GAS7 decrease as the embryo grows, except in neuronalspecific epidermal cells, in which expression remains high (Moorthy et al. 2005). GAS7 expression also plays an essential role in other nonneuronal growth functions, such as osteoblast differentiation and bone development (Chao et al. 2013). Furthermore, neural outgrowth and arborization are shared functions between the exocyst and GAS7 (Martin-Urdiroz et al. 2016;You and Lin-Chao 2010), whereas TNFAIP2 expression has been associated with some neurodegenerative conditions (Brohawn et al. 2016;Colangelo et al. 2002). Relatively few studies have investigated relationships between perinatal Cd exposure and neurodevelopment, although some studies suggest an association with impaired cognition (Kippler et al. 2012a;Sanders et al. 2015), which raises some questions about the possible roles of the above-mentioned genes in Cd-associated cognitive and neurobehavioral deficits. However, the functional roles of these genes in human placental tissue are not well understood. Thus, it is unclear how, or whether, Cd-associated differential placental DNAM and expression may affect early-life cognitive function; this should be the focus of additional research.
Genes known to contribute to neurite outgrowth are sometimes also involved in regulating placental angiogenesis, as has been shown for vascular endothelial growth factor (VEGF) and neurotrophins (Sahay et al. 2015). Interestingly, sterol regulatory element binding proteins (SREBPs) such as SREBF1, whose expression was strongly correlated with one of our top hits, play key roles in VEGF-induced angiogenesis (Zhou et al. 2004). Additionally, inappropriate placental vascularization can result in fetal growth restriction or other pregnancy complications (Sahay et al. 2015). In our study, the expression of GAS7, EXOC3L4, and SREBF1 was not associated with standardized percentiles for birth weight, head circumference, or birth length and thus may not be directly related to fetal growth. In contrast, expression of TNFAIP2, ACOT7, and RORA was associated with fetal growth. ACOT7, which encodes acyl-CoA thioesterase 7, is involved in lipid metabolism in neuronal cells and may prevent neurotoxicity (Ellis et al. 2013); it also promotes pro-inflammatory activities by up-regulating arachidonic acid production (Brocker et al. 2010). RORA, which encodes the retinoic acid receptor-related orphan receptor alpha, is involved in numerous developmental processes including innate immune system development, regulation of inflammatory responses, and neuronal survival and growth, as well as in lipid and glucose metabolism (Cook et al. 2015). One common biological thread that differentiates TNFAIP2, ACOT7, and RORA from GAS7, EXOC3L4, and SREBF1 is their role in mediating inflammatory and immune responses. Thus, although we observed numerous Cd-associated variations in DNAM among genes involved in neurodevelopment and cellular growth processes, variations involved in inflammatory and immune signaling disruption may affect fetal growth. Additionally, these genes are involved in glucose and insulin signaling, which is consistent with the endocrine-disruptive effects of cadmium, and may have implications for metabolic health. GAS7 influences adiposity, fat metabolism, and insulin signaling (Yang et al. 2009). SREBF1 mediates insulin action, and polymorphisms and DNA methylation variations within SREBF1 increase predisposition to type II diabetes (Grarup et al. 2008;Muftah et al. 2016). ACOT7 is a key regulator of glucose response and is more highly expressed in pancreatic b cells of those with type II diabetes (Martinez-Sanchez et al. 2016). RORA is involved in lipid metabolism and insulin sensitivity, and it is associated with obesity and type II diabetes (Jetten et al. 2013).
Other epidemiologic studies of maternal Cd and birth/pregnancy outcomes have found that higher Cd may affect essential metals transport (Kippler et al. 2010); reduce expression of a protocadherin involved in neuronal development (Everson et al. 2016); and affect the DNAM of genes involved in cell death, lipid metabolism, and cancer pathways (Sanders et al. 2014). Similarly, electronic waste (e-waste) exposure, which is associated with higher placental concentrations of toxic metals including Cd, may influence differential expression of proteins involved in immune and metabolic processes (Xu et al. 2016). The top hits from a small-sample EWAS of placental Cd and DNAM indicated differential methylation of genes involved in cellular metabolism, growth, and cell damage response (Mohanty et al. 2015). We were able to reproduce Cd-associated DNAM variations for one top hit from that study, cg04528060 in the 5 0 untranslated region (UTR) of the ARL9 gene (meta-analysis p-value <0:05), with the same direction of effect. Notably, the ARL9 gene encodes a GTP-binding protein that is similar to members of the RAS superfamily (Louro et al. 2004), and RAS signaling was the most highly enriched KEGG pathway from our study.
Several studies have investigated relationships between maternal Cd biomarkers and DNAM in cord blood. One study found that maternal blood Cd was associated with higher cord blood DNAM at the paternally expressed gene 3 (PEG3) differentially methylated region (DMR) and observed indiciations of sexspecific effects and possible antagonism of effects by Zn (Vidal et al. 2015). In our study, the four CpGs within the PEG3 DMR (Chr. 19: 57,351,352,096; UCSC Genome Browser, GRCH37/hg19) (Nye et al. 2013) produced p-values >0:05. A genome-wide study found that genes involved in regulating expression, cell cycle, cell death, and nervous system development were differentially methylated by Cd exposure (Sanders et al. 2014); many of the same processes were identified in our enrichment analysis. Another genome-wide study observed higher DNAM at nine of their top loci associated with maternal Cd, consistent with our observation of hypermethylation among top hits (Kippler et al. 2013). The CpG with the second-strongest effect in that study, cg16667631, showed a strong effect in the same direction of effect in our meta-analysis.
The preponderance of similar biological processes associated with Cd across multiple studies with different technologies, populations, and tissues, is striking and suggests that common mechanisms may be affected by Cd exposure during pregnancy. The findings from our study should be interpreted within the context of this study's limitations. This was an observational study in which placental Cd concentrations and DNAM levels were both measured in placenta at term. Thus, temporality between the Cd concentrations and variations in DNAM could not be established, and the observed associations at term may not be representative of Cd and DNAM associations throughout development. Although we did estimate and adjust for cellular heterogeneity, we used a reference-free approach that was data-driven rather than biology-driven. Future studies of placental DNAM would benefit greatly from the production of an appropriate cell-type-specific methylome for placental tissues. Although we adjusted for likely confounders in our study, we also cannot rule out the possibility that unmeasured or residual confounding may have affected our observations; one of the most likely confounders of these associations was maternal smoking, which is a common source of high Cd exposure. However, we adjusted for maternal smoking in our primary analysis and further evaluated potential bias with a sensitivity analysis in which we excluded mothers who reported smoking during pregnancy. These analyses indicated that the observed associations between Cd and DNAM were largely independent of maternal smoking during pregnancy. We utilized a 5% FDR cutoff to identify our top statistically significant loci; however, given the small number of significant hits, we performed follow-up DNAM expression analyses on all loci with raw p-values <1 × 10 −5 . Thus, there are likely some false positive findings among these hits. Power is a common problem in epigenetic epidemiology given the high dimensionality of the data and the small effect sizes (Bakulski and Fallin 2014); therefore, we recommend further replication in other independent study samples. Further, both samples utilized in this study were from the New England region of the United States and consisted predominantly of healthy white mothers. Thus, the generalizability of our findings to populations in different regions of the world and to those with different racial and ethnic backgrounds may be limited. However, to our knowledge, this was the largest study yet to examine the relationships between placental Cd and DNAM and the first to account for and evaluate associations with estimated cellular heterogeneity.
The loci that we did identify are involved in biological processes known to be affected by Cd toxicity and thus are strong candidates for future studies. We observed these associations in two healthy populations with relatively low exposure levels. In the United States, exposure to Cd has declined since the late 1980s, which is primarily attributed to declining smoking rates (Tellez-Plaza et al. 2012). However, Cd is still detectable in numerous commonly consumed foods (Awata et al. 2016), and exposure prevalence remains quite high (Satarug et al. 2017). Additionally, e-waste recycling and disposal are emerging sources of environmental Cd (and other toxic metals) contamination, particularly in developing countries (Sthiannopkao and Wong 2013). These sources of exposure raise some concerns about the potential threshold for Cd-associated reproductive toxicity and whether current dietary intake guidelines and efforts to reduce tobacco smoke exposure during pregnancy adequately protect against adverse birth and pregnancy outcomes. We encourage further molecular epidemiologic studies of the associations between Cd and DNAM in placental tissue, particularly in populations with higher exposure levels and different ethnic backgrounds; we also encourage studies of the associations between dietary intake-and smoking-related Cd exposure with the placental accumulation of Cd. Understanding these relationships can help us determine whether changes to current guidelines and recommendations could reduce Cd-associated adverse reproductive outcomes. Finally, the genes whose expression levels were associated with DNAM at our top hits are multifunctional and are involved in neurodevelopmental and/or neurodegenerative processes, cellular growth processes, and inflammatory/immune signaling. We also found that the expression levels of three of these genes were associated with infant birth weight. Thus, additional future studies should investigate whether differential placental epigenetic regulation and expression of GAS7, EXOC3L4, SREBF1, RORA, ACOT7, and TNFAIP2 are associated with cognitive/neurobehavioral, growth, and immune outcomes in children.

Conclusion
Cadmium is a toxic environmental pollutant that can impair fetal development, although mechanisms underlying this developmental toxicity are unclear. Because cadmium can accumulate in the placenta during pregnancy, we hypothesized that some of this toxicity may be due to disrupted placental functions. We showed that placental cadmium was associated with differential placental DNAM at multiple loci that are involved in cellular growth processes, metabolism, and inflammatory signaling. We also observed corresponding DNAM-expression associations at many of the genes in close genomic proximity to these loci. The expression of the genes involved in inflammatory and immune signaling (RORA, ACOT7 and TNFAIP2) was also associated with birth size. The present study provides supporting evidence that disrupted placental epigenetic regulation of cellular growth and immune/inflammatory signaling could play a role in Cd-associated reproductive toxicity in human pregnancies. the grantee and do not necessarily represent the official views of the U.S. EPA. Further, the U.S. EPA does not endorse the purchase of any commercial products or services mentioned in the presentation.