Prenatal Exposure to Nonpersistent Chemical Mixtures and Fetal Growth: A Population-Based Study

Background: Prenatal exposure to mixtures of nonpersistent chemicals is universal. Most studies examining these chemicals in association with fetal growth have been restricted to single exposure models, ignoring their potentially cumulative impact. Objective: We aimed to assess the association between prenatal exposure to a mixture of phthalates, bisphenols, and organophosphate (OP) pesticides and fetal measures of head circumference, femur length, and weight. Methods: Within the Generation R Study, a population-based cohort in the Netherlands (n=776), urinary concentrations of 11 phthalate metabolites, 3 bisphenols, and 5 dialkylphosphate (DAP) metabolites were measured at <18, 18–25, and >25 weeks of gestation and averaged. Ultrasound measures of head circumference, femur length, and estimated fetal weight (EFW) were taken at 18–25 and >25 weeks of gestation, and measurements of head circumference, length, and weight were performed at delivery. We estimated the difference in each fetal measurement per quartile increase in all exposures within the mixture with quantile g-computation. Results: The average EFW at 18–25 wk and >25 wk was 369 g and 1,626 g, respectively, and the average birth weight was 3,451 g. Higher exposure was associated with smaller fetal and newborn growth parameters in a nonlinear fashion. At 18–25 wk, fetuses in the second, third, and fourth quartiles of exposure (Q2–Q4) had 26 g [95% confidence interval (CI): −38, −13], 35 g (95% CI: −55, −15), and 27 g (95% CI: −54, 1) lower EFW compared with those in the first quartile (Q1). A similar dose–response pattern was observed at >25 wk, but all effect sizes were smaller, and no association was observed comparing Q4 to Q1. At birth, we observed no differences in weight between Q1–Q2 or Q1–Q3. However, fetuses in Q4 had 91 g (95% CI: −258, 76) lower birth weight in comparison with those in Q1. Results observed at 18–25 and >25 wk were similar for femur length; however, no differences were observed at birth.
No associations were observed for head circumference. Discussion: Higher exposure to a mixture of phthalates, bisphenols, and OP pesticides was associated with lower EFW in the midpregnancy period. In late pregnancy, these differences were similar but less pronounced. At birth, the only associations observed appeared when comparing individuals from Q1 and Q4. This finding suggests that even low levels of exposure may be sufficient to influence growth in early pregnancy, whereas higher levels may be necessary to affect birth weight. Joint exposure to nonpersistent chemicals may adversely impact fetal growth, and because these exposures are widespread, this impact could be substantial. https://doi.org/10.1289/EHP9178


Introduction
Pregnant mothers are ubiquitously exposed to a plethora of chemicals found in commonly used consumer products and via their diet (Traoré et al. 2016;Wong and Durrani 2017). Phthalates and bisphenols are synthetic compounds incorporated in many personal care products and food-packaging materials (Gunderson 1995;Muncke 2009;Wong and Durrani 2017). Organophosphate (OP) pesticides are insecticides that are commonly used for pest control, and exposure occurs mostly through the diet (Gunderson 1995;Lu et al. 2008;Muncke 2009). All these chemicals are able to cross the placental and blood-brain barriers (Bradman et al. 2003;Chou et al. 2011;Jensen et al. 2012;Philippat et al. 2013;Schönfelder et al. 2002;Silva et al. 2004), have the potential to cause permanent developmental changes to the fetus as shown by experimental animal studies (Zoeller et al. 2012), and are thus a growing public health concern. Furthermore, because these chemicals are nonpersistent, their contribution to adverse health effects may be preventable via interventions in a relatively short period of time.
Although not conclusive, animal studies have linked prenatal exposure to phthalates, bisphenols, and OP pesticides with fetal growth alterations (Breslin et al. 1996;Chanda et al. 1995;Kim et al. 2001; Rubin et al. 2001; Srivastava and Raizada 1996;Tanaka 2005). Human studies also suggest associations between these exposures and anthropometric parameters at delivery (Chou et al. 2011;Harley et al. 2016;Huang et al. 2014;Miao et al. 2011;Philippat et al. 2012;Suzuki et al. 2010;Wolff et al. 2008;Zhang et al. 2009) as well as fetal growth ultrasound measures during pregnancy (Casas et al. 2016; Ferguson et al. 2016, 2018, 2019; Lee et al. 2008;Philippat et al. 2014;Snijder et al. 2013). Perturbations in normal fetal development could have significant public health impact, because they are related to numerous adverse health effects in children (Miller et al. 2016) and later life (Barker 2006).
Exposures to these compounds do not occur in isolation; instead, each woman's exposure is a complex combination of multiple chemicals. Most studies examining maternal exposure to these compounds in relation to fetal growth have assessed associations between single exposures and growth outcomes rather than the overall effect of the chemical mixture. Restricting analyses to single pollutants may ignore health effects that would be detected if chemical mixtures as a whole were assessed (Bopp et al. 2018). For example, coexposures may combine in various ways to elicit health effects, even when effects of individual exposures are below concentrations considered harmful (Carpenter et al. 2002;Kortenkamp 2014;Zoeller et al. 2012). Other key limitations of single-chemical models are the potential for biased effect estimates in the presence of copollutant confounding and inflated false discoveries when correlated exposures are modeled separately. Focusing on the joint mixture effect provides results that more closely correspond to real-world exposures and may thus directly inform potential public health interventions (Robins et al. 2004). For instance, reducing prenatal exposures by way of behavioral interventions (e.g., avoiding packaged foods) is unlikely to affect exposure to only a single phthalate. Hence, the observed health benefits of intervention may actually reflect the benefits of reducing exposure to multiple chemicals that exist in such products.
To address this research gap, we used quantile-based g-computation to estimate the overall effect (i.e., joint impact) of prenatal exposure to phthalates, bisphenols, and OP pesticides on fetal growth measured by ultrasound at two time points during pregnancy and at delivery.

Study Population
The Generation R Study is a prospective population-based birth cohort designed to identify early environmental and genetic determinants of development, with participants recruited between 2002 and 2006 (Kooijman et al. 2016). In total, 8,879 women were enrolled during pregnancy. Of those, 2,083 women provided three spot urine samples at the time of routine ultrasound examinations: <18, 18-25, and >25 weeks of gestational age. Of the 2,083 mother-child pairs, 1,405 provided data at the follow-up visit at child age 6 y. The present analysis was restricted to individuals with age 6 y follow-up because the primary aim of the studies of prenatal chemical exposures was to examine associations with child health outcomes. Of the 1,405, 776 mother-child pairs had complete data on all three nonpersistent chemical exposure groups across three time points during pregnancy and complete birth weight data. These 776 mother-child pairs therefore comprised the study sample. Mothers provided written informed consent at the time of enrollment. The study protocol underwent human subjects review at Erasmus Medical Center, Rotterdam, Netherlands (IRB Registration no.: IRB00001482, MEC 198.782.2001.31, MEC 217.595/2002/202, MEC-2007-413, MEC-2012.

Chemical Biomarker Measurements
Details of urine specimen collection and the analytical procedure for the measurement of phthalate, bisphenol, and OP pesticide exposure biomarkers are given elsewhere (Kruithof et al. 2014;Philips et al. 2018;van den Dries et al. 2018). Briefly, 18 phthalate metabolites were measured using a solid-phase extraction method followed by enzymatic deconjugation of the conjugated (i.e., glucuronidated) phthalate monoesters coupled with high performance liquid chromatography electrospray ionization-tandem mass spectrometry (HPLC-ESI-MS/MS) (Asimakopoulos et al. 2016). Eight bisphenols were quantified using a liquid-liquid extraction method followed by enzymatic deconjugation coupled with HPLC-ESI-MS/MS (Asimakopoulos et al. 2016). Limits of detection (LOD) for phthalate metabolites and bisphenols were in the range of 0.008-1.11 μg/L. The concentrations below the LOD were not estimated by the lab and were therefore imputed by LOD divided by the square root of 2 (Hornung and Reed 1990).
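The substitution rule for values below the LOD can be sketched in a few lines. This is an illustrative Python sketch, not the laboratory's or the authors' code; the function name, the LOD, and the concentrations are hypothetical.

```python
import math


def impute_below_lod(concentrations, lod):
    """Replace values below the limit of detection with LOD / sqrt(2).

    `None` marks a value the lab reported only as "<LOD" (an assumption
    about how such values might be coded).
    """
    fill = lod / math.sqrt(2)
    return [fill if c is None or c < lod else c for c in concentrations]


# Hypothetical urinary concentrations in micrograms per liter.
measured = [0.50, None, 0.02, 1.20]
imputed = impute_below_lod(measured, lod=0.05)
```

Values at or above the LOD pass through unchanged; only the censored observations receive the fixed fill-in value.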
Regarding OP pesticides, six nonspecific dialkylphosphate (DAP) metabolites of OP pesticides were measured using gas chromatography coupled with tandem mass spectrometry, which resulted in a rapid detection of three dimethyl (DM) metabolites and three diethyl (DE) metabolites with LODs in the range of 0.06-0.50 μg/L (Health Canada 2010). For these compounds, the machine-measured concentrations below the LOD were provided by the lab and therefore used (Butler 1975

Ultrasound and Delivery Measures of Size
Ultrasound scans were carried out to assess gestational age and to quantify fetal growth on the full study population (described in detail elsewhere). Femur length and head circumference were measured at 18-25 and >25 weeks of gestation. With the use of the Hadlock formula, estimated fetal weight for each measurement was calculated (Hadlock et al. 1985). At birth, weight, length, and head circumference were measured. For each measure, the standard deviation scores (SDS) were computed using longitudinal growth curves that accounted for gestational age at measurement. Birth measurement SDS accounted for gestational age at delivery as well as sex (Niklasson et al. 1991).

Potential Confounders
Potential confounders were selected a priori using a directed acyclic graph based on previous studies of prenatal phthalate, bisphenol, and OP pesticide exposure and fetal growth and on biologically plausible covariate-exposure and covariate-outcome associations observed in our data (see Figure S1). Data on maternal age, parity (0, 1, or 2+), marital status (married/partner vs. single), folic acid intake (none, started in first 10 wk of pregnancy, started preconception), household total net income (<€1,200 per month [i.e., below the Dutch social security level], €1,200-2,000 per month, >€2,000 per month), and highest completed educational level (low: <3 y of high school; intermediate: 3+ y of secondary education; and high: university degree or higher vocational training) were collected by questionnaires during pregnancy. Also, information on ethnicity was collected by questionnaires during pregnancy using the ethnicity categorization of the central bureau of statistics, Netherlands (Alders 2001) (Dutch, Indonesian, Cape Verdean, Moroccan, Dutch Antilles, Surinamese, Turkish, African, American Western, American non-Western, Asian Western, Asian non-Western, European, Oceania). This information was combined and reclassified into Dutch, other-Western (Indonesian, American Western, Asian Western, and European), and non-Western (Cape Verdean, Moroccan, Dutch Antilles, Surinamese, Turkish, African, American non-Western, Asian non-Western, and Oceania). Next, with the use of a questionnaire in each trimester, mothers were asked whether they smoked during pregnancy [never, until pregnancy was known (smoking in the first trimester), or continued smoking during pregnancy]. Similarly, data on maternal alcohol consumption were obtained by questionnaires in each trimester. Mothers were asked whether they consumed alcohol in the past 3 months (with answer categories: "no," "until pregnancy was known," and "after pregnancy was known").
When participants reported that they consumed alcohol, they were asked to classify their average intake in drinks per week. This information was combined and reclassified into the following categories: no alcohol consumption during pregnancy, alcohol consumption until pregnancy recognized, continued occasionally (<1 glass/wk), and continued frequently (1+ glass/wk). Maternal prepregnancy weight was self-reported in the first trimester of pregnancy, and height was measured at enrollment.

Statistical Analyses
Few DAP metabolite concentrations were missing due to insufficient sample volume or machine error. Also, very few fetal growth measures during pregnancy (<1%) and several birth measures of head circumference (39%) and length (28%) were missing. These missing values and missing covariate data were imputed 10 times with the multivariate imputation by chained equations (MICE) method in R (version 3.5.3; R Development Core Team; van Buuren and Groothuis-Oudshoorn 2011). DAP metabolites were log10-transformed prior to the imputation procedure. Missing values were assumed to be missing at random. The predictors used to impute missing data included fetal sex, maternal age, prepregnancy weight, height, education level, ethnicity, income, marital status, parity, smoking, alcohol use, folic acid use, phthalic acid, BPA, DMP, fetal and birth weight, head circumference, and length. Missing data of continuous variables were imputed using predictive mean matching, and missing data of categorical variables were imputed using logistic regression (factors with 2 levels) and multinomial or ordered logit models (factors with >2 levels). Convergence plots of the MICE imputation were inspected and showed a healthy convergence (data not shown). The final imputed data set was used for the main analyses.
Prior to the main analyses, urinary exposure biomarker concentrations were expressed on a creatinine basis (micrograms per gram creatinine), log-transformed (base 10), and averaged across pregnancy. Biomarker concentrations in this study may vary from day to day within each subject, resulting in high (within-subject) temporal variability (Perrier et al. 2016;Spaan et al. 2015). Thus, the average is a better indicator of exposure across gestation, and our focus is on identifying fetal growth patterns related to longer-term, rather than recent, exposures.
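The preprocessing order described above (creatinine standardization, then log10 transformation, then averaging across visits) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name and all numeric values are hypothetical, and creatinine is assumed to be expressed in grams per liter so that the ratio is in micrograms per gram creatinine.

```python
import math


def pregnancy_average(conc_ug_per_l, creatinine_g_per_l):
    """Mean of log10 creatinine-standardized concentrations across visits.

    Each visit's concentration (ug/L) is divided by that visit's urinary
    creatinine (g/L), giving ug/g creatinine, before taking log10.
    """
    logs = [
        math.log10(c / cr)
        for c, cr in zip(conc_ug_per_l, creatinine_g_per_l)
    ]
    return sum(logs) / len(logs)


# Hypothetical measurements at <18, 18-25, and >25 weeks of gestation.
conc = [20.0, 50.0, 400.0]   # ug/L
creat = [2.0, 0.5, 0.4]      # g/L
avg_log10 = pregnancy_average(conc, creat)
```

Because the averaging is done on the log scale, the result corresponds to the log10 of the geometric mean of the standardized concentrations, which damps the influence of a single high spot sample.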
To estimate the joint effect of phthalate metabolite, bisphenol, and DAP concentrations (i.e., the overall chemical mixture) on fetal growth, we used the quantile-based g-computation (hereafter, quantile g-computation) approach from the qgcomp package in R (Keil et al. 2020). Quantile g-computation estimates the joint effect of increasing all exposures within the mixture by a single quantile (Keil et al. 2020). This method also allows estimation of the joint effect of a specific subset of compounds from the measured mixture (e.g., phthalate metabolites) while still controlling for possible confounding from other chemicals in the mixture (e.g., bisphenols and DAP metabolites).
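The core idea of quantile g-computation in its simplest (linear, additive) form can be sketched as: score each exposure by quartile, regress the outcome on the scores, and sum the exposure coefficients to obtain the joint effect of raising every exposure by one quartile. The Python sketch below is an illustration only and does not reproduce the qgcomp package; it omits confounders, nonlinear terms, and confidence intervals. The helper names, the rank-based quartile scoring, and the toy data are assumptions made for the example.

```python
def quartile_scores(values):
    """Map each value to an integer quartile score 0..3 based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 4 // n
    return scores


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[j] * r[k] for r in X) for k in range(p)]
        + [sum(r[j] * yi for r, yi in zip(X, y))]
        for j in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[j][p] for j in range(p)]


def mixture_effect(exposures, outcome):
    """Return (psi, betas): psi is the sum of the quartile-score coefficients."""
    names = list(exposures)
    quantized = [quartile_scores(exposures[name]) for name in names]
    X = [[1.0] + [float(q[i]) for q in quantized] for i in range(len(outcome))]
    coefs = ols(X, outcome)
    betas = dict(zip(names, coefs[1:]))
    return sum(betas.values()), betas


# Toy data: two hypothetical exposures and an outcome built from their scores.
exposures = {
    "chem_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "chem_b": [8, 3, 6, 1, 7, 2, 5, 4],
}
score_a = quartile_scores(exposures["chem_a"])
score_b = quartile_scores(exposures["chem_b"])
outcome = [1.0 + 0.5 * a - 0.2 * b for a, b in zip(score_a, score_b)]
psi, betas = mixture_effect(exposures, outcome)
```

In this toy example the two coefficients have opposite signs, and the joint effect is their sum, which illustrates why the method does not require all exposures to act in the same direction.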
We first estimated the joint effect of the overall mixture (averaged phthalate metabolites, bisphenols, and DAP metabolites across pregnancy) on fetal size measured at 18-25, >25 wk, and birth. Associations with size end points were examined in separate models because results from previous studies of DAP and phthalate metabolites and fetal growth showed that results differed based on timing of outcome measurement (Ferguson et al. 2019;Santos et al. 2021).
To explore nonlinearity, we used a stepwise procedure, examining whether model fit was improved with inclusion of square terms for each exposure biomarker concentration. Quadratic terms for individual biomarkers were included if they improved model fit based on significantly lower Akaike information criteria (AIC) (likelihood ratio test p < 0.05). If one or more square terms are included in the model, the overall mixture effect is determined by a quadratic term coefficient as well as the coefficient for the lower-order mixture effect, as for a traditional linear regression model. In addition, we estimated differences between each quartile and the lowest quartile based on the predicted SDS for each quartile, $\mathrm{SDS}_{Q_q} = B_0 + B_1 q + B_2 q^2$, where $q$ is the integer score assigned to each quartile ($q = 0, 1, 2, 3$), $B_0$ is the model intercept, $B_1$ is the mixture coefficient, and $B_2$ is the quadratic term coefficient (when included). Effect estimates for each quartile $Q_q$ relative to the lowest (reference) quartile $Q_0$ were then derived as the difference between the predicted SDS for each quartile, $\mathrm{SDS}_{Q_q} - \mathrm{SDS}_{Q_0}$. To facilitate interpretation of associations, we converted quartile-specific differences in SDS scores to grams by multiplying the estimated difference by the standard deviation (SD) of the mean fetal weight/birth weight at each time point, where means (SD) were 369 g (74 g), 1,626 g (238 g), and 3,451 g (506 g) at 18-25 wk, >25 wk, and birth, respectively. Although a spline term may be more appropriate for capturing nonlinear associations, this approach would be problematic in quantile g-computation since exposure is "quantized," leaving only a few distinct values for exposure and little information to inform knot placement; thus, the overall mixture effect is defined using simpler polynomial models. We did not include interaction terms within our model and thus did not formally evaluate nonadditive effects.
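The quartile contrasts and the conversion to grams can be illustrated with a short calculation. The Python sketch below assumes the quadratic form implied by the definitions above (intercept B0, mixture coefficient B1, quadratic coefficient B2, quartile score q), so that the intercept cancels in each contrast. The coefficient values are hypothetical and are not the fitted estimates from this study; only the SD of 74 g at 18-25 wk is taken from the text.

```python
def quartile_differences_in_grams(b1, b2, sd_grams):
    """Difference in predicted weight (grams) for quartiles 2-4 vs. quartile 1.

    With predicted SDS = B0 + B1*q + B2*q**2 and q = 0 for the reference
    quartile, the contrast is B1*q + B2*q**2 (B0 cancels). Multiplying by
    the outcome SD converts the SDS difference to grams.
    """
    diffs = {}
    for q in (1, 2, 3):
        sds_diff = b1 * q + b2 * q ** 2
        diffs[q] = sds_diff * sd_grams
    return diffs


# Hypothetical coefficients; SD = 74 g is the value reported for 18-25 wk.
diffs = quartile_differences_in_grams(b1=-0.5, b2=0.125, sd_grams=74.0)
```

With a negative linear coefficient and a positive quadratic coefficient, the contrast is largest in magnitude at an intermediate quartile and shrinks again at the highest quartile, the kind of nonmonotonic pattern a quadratic mixture term can produce.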
Confounders for this and subsequent models, unless otherwise stated, included: fetal sex (categorical), maternal age (continuous), prepregnancy weight (continuous), height (continuous), education level (categorical), ethnicity (categorical), income (categorical), marital status (categorical), parity (categorical), smoking (categorical), alcohol use (categorical), and folic acid use (categorical).
Second, for each chemical class, we estimated the joint (mixture) effects of compounds within each class while adjusting for log10-transformed pregnancy-average concentrations of the individual compounds in each of the other two chemical classes as covariates. Square terms identified in the overall mixture model were retained for these models, both in the mixture term and in the confounder terms.

Sensitivity Analyses
We performed several sensitivity analyses. First, some of the concentrations included in subject-specific exposure averages (e.g., DAP metabolites at >25 weeks of gestation) were measured after the assessment of fetal growth parameters (e.g., fetal weight at 18-25 weeks of gestation). We therefore conducted a sensitivity analysis in which we separately explored the mixture effect of the biomarker concentrations measured at <18, 18-25, and >25 weeks of gestation. In this analysis we only examined outcomes at the same or subsequent visits to exposure measurements. In other words, for birth weight and weight at >25 wk of gestation we included all biomarker concentrations (measured at <18, 18-25, and >25 weeks of gestation) as separate components in the mixture, whereas for weight at 18-25 weeks of gestation, we only included biomarker concentrations measured at <18 weeks of gestation and 18-25 weeks of gestation. As in our primary analyses, the same quadratic terms for exposure biomarkers were included if the model fit was improved as determined by the AIC. These models retained the same confounders as the primary analyses.
Second, we created single-pollutant regression models to estimate the individual chemical biomarker associations with fetal growth parameters. These estimates have been previously published (Ferguson et al. 2019;Santos et al. 2021;Sol et al. 2021) but used different model structures, exposure and outcome parameterizations, and covariate sets based on the research questions of each study. Thus, we replicated those analyses here with the same primary model structure (i.e., averaged biomarkers over pregnancy in association with growth measurements at each time point) and with the same covariates included in our primary model for more direct comparison with our results.
Third, because of the high proportion of head circumference (39%) and length (28%) measurements at birth that were imputed for our primary analyses, we refit models among complete outcome cases in which missing confounder data remained imputed to test the robustness of these results to differing assumptions about missing data. Fourth, because the concentrations below the LOD for BPF and BPS were high, we investigated the association between the mixture and fetal growth in which we excluded BPF and BPS from the analyses. Fifth, Philippat and Calafat (2021) observed that, for BPA, combining repeated urine measures into an average based on creatinine-standardized concentrations may not be suitable (Philippat and Calafat 2021). We therefore carried out a sensitivity analysis in which we reran the main models for weight with the inclusion of the averaged nonstandardized concentrations for BPA. Finally, for models where we observed significant associations, we wanted to investigate which chemical biomarker concentrations within the mixture were contributing the most to observed effect estimates. Because the qgcomp package does not currently allow for estimation of weights when there are nonlinear terms in the model, we reduced models to their linear forms and examined weights as an exploratory sensitivity analysis.

Results
The median age of the mothers at enrollment was 31 y (Table 1). The majority of mothers participating in this study were Dutch (57%), highly educated (55%), had an income of >€2,000 (71%), were nulliparous (63%), and did not smoke during pregnancy (77%). Women included in the present analysis were older, more often Dutch, more highly educated, and less likely to smoke during pregnancy in comparison with those in the overall study population (Table S1). See Table S2 for the distribution of participants among the 14 individual ethnic groups in the present study sample and in the Generation R cohort as a whole.
Phthalate metabolite, bisphenol, and DAP metabolite concentration distributions of the biomarkers included in the analyses are presented in Table S3 (descriptive statistics of biomarkers excluded from the analyses can be found in Table S4). Phthalate metabolites were well detected with 0%-1% of observations below the LOD for most metabolites. Regarding bisphenols, BPA had the highest concentrations and was relatively well detected; however, the percentage <LOD for BPS (<18 wk = 32%, 18-25 wk = 56%, and >25 wk = 78%) and BPF (<18 wk = 65%, 18-25 wk = 85%, and >25 wk = 69%) were relatively high. For DAP metabolites, we observed higher concentrations among the DM metabolites as compared to the DE metabolites, and 0%-20% of observed values were below the LOD. The intraclass correlation coefficients calculated using a mean of three measurements, absolute-agreement, and 2-way mixed-effects model, ranged from 0.2 to 0.7 for phthalate metabolites, 0.0 to 0.2 for bisphenols, and 0.3-0.5 for DAP metabolites. Table 2 presents the distributions of pregnancy-averaged chemical biomarker concentrations. Similar to the distributions of the separate time points (Table S5), MEP had the highest averaged concentrations across pregnancy among the phthalate metabolites. Regarding the other chemical groups, BPA had the highest median concentration among bisphenols and DMP was the DAP metabolite with the highest average concentration across pregnancy.
Correlations were positive and generally higher within classes of phthalate metabolites, bisphenols, and DAP metabolites (e.g., Pearson correlation between MEHHP and MEOHP metabolites = 0.9, between BPS and BPF = 0.4, and between DMDTP and DMTP metabolites = 0.7) (Figure 1 and Table S6; see Table 2 for biomarker abbreviations). The few DAP metabolite concentrations that were missing at each time point were imputed prior to the correction for creatinine and the averaging across pregnancy. Correlations
across chemical classes were lower in magnitude. Furthermore, most phthalate metabolites and bisphenols had low to moderate inverse correlations with DAP metabolites. Regarding fetal growth outcomes, Pearson correlations varied between 0.1 and 0.7 (Table S7).
Models of associations between mixtures of pregnancy-averaged biomarkers and fetal weight (estimated during pregnancy by ultrasound) and birth weight (measured at delivery) included quadratic terms for pregnancy-averaged MBzP in models of weight at 18-25 wk and >25 wk, and a quadratic term for pregnancy-averaged MnBP in the model of weight at birth (Table S8). To facilitate interpretation of the nonlinear model estimates we also compared the predicted SDS for fetal weight or birth weight for each quartile of the overall exposure mixture relative to the predicted SDS for the first quartile (Q1) (Figure 2). In addition, we converted the SDS differences by quartile into estimated differences in grams using the standard deviation of mean weight at each time point (Table S8). At 18-25 wk, fetuses in Q2, Q3, and Q4 for the total mixture had lower predicted SDS for fetal weight compared with those in Q1. When converted to grams (based on SD = 74 g for mean fetal weight at 18-25 wk), estimated differences were −26 g (95% CI: −38, −13), −35 g (95% CI: −55, −15), and −27 g (95% CI: −54, 1), for Q2, Q3, and Q4 vs. Q1, respectively (Figure 2, Table S8). At >25 weeks of gestation, fetuses in Q2 and Q3 for the total mixture had slightly lower estimated fetal weight relative to the Q1 group (−43 g; 95% CI: −83, 0 g and −43 g; 95% CI: −112, 24 g based on SD = 238 g), but there was no association with the highest quartile of exposure (−5 g; 95% CI: −100, 88 g).
At birth, compared with the lowest quartile of total mixture exposure, estimated birth weight was slightly higher for those in Q2 (30 g; 95% CI: −35, 101 g based on SD = 506 g), the same for Q3 (0.0 g; 95% CI: −111, 111 g), and lower for those in the highest quartile (−91 g; 95% CI: −258, 76 g).
Next, we estimated the mixture effect for each individual chemical group (e.g., phthalate metabolites) on fetal weight while adjusting for the other chemical groups (e.g., bisphenols and DAP metabolites). The individual biomarkers from other chemical classes, as well as their square terms, where appropriate, were treated as separate confounders in the models. In comparison with associations for the overall chemical mixture, we observed a somewhat similar nonlinear pattern for the phthalate metabolite mixture (Figure 3; Table S8). For example, those in the Q4 exposure group had 142 g (95% CI: −258, −20) lower birth weight in comparison with those in Q1. However, the differences in weight at 18-25 wk and >25 wk between the Q1 and Q4 were essentially null. Nonlinear (i.e., quadratic) terms did not improve model fit for models of the DAP metabolites; therefore, associations were linear, with inverse associations with fetal weight at 18-25 wk and >25 weeks of gestation, and weaker positive associations with birth weight were observed. Models of bisphenols also included linear terms only, and associations were close to the null for weight at 18-25 wk and at birth, and positive for weight at >25 wk.
Similar associations for the overall mixture and for individual chemical classes were observed for femur length at 18-25 wk and >25 wk (Table S9). This finding included the shape of the dose-response relationship, with a nonlinear dose-response association for the overall mixture and the phthalate metabolite mixture and a linear pattern for the bisphenol and DAP metabolite mixtures. However, the following differences compared with models of weight were noted. The inverse associations for the overall mixture, the phthalate metabolite mixture, and the DAP metabolite mixture were all greater in magnitude (i.e., more negative) for femur length models at 18-25 and >25 weeks of gestation. Associations with the bisphenol mixture at these time points remained statistically null, although effect estimates trended positive. Further, for models of length at birth, the shape of the dose-response relationships differed slightly (e.g., the association for the phthalate mixture was linear, but that of the bisphenol mixture was nonlinear), and all effect estimates were close to the null. For models of head circumference, quadratic terms were included for all mixtures and phthalate metabolites at 18-25 wk and all mixtures and DAP metabolites at >25 wk, and no statistically significant associations were observed (Table S10). However, albeit not statistically significant, the estimate for the DAP metabolite mixture and head circumference was slightly below the null at 18-25 wk and slightly above the null at birth.
Figure caption (fragment): [...] Table S8. Models are adjusted for fetal sex (categorical), maternal age (continuous), prepregnancy weight (continuous), height (continuous), education level (categorical), ethnicity (categorical), income (categorical), marital status (categorical), parity (categorical), smoking (categorical), alcohol use (categorical), folic acid use (categorical). Quadratic terms for individual biomarkers were included if they improved model fit based on significantly lower Akaike information criteria (AIC) (likelihood ratio test p < 0.05): Weight at 18-25 wk = monobenzyl phthalate metabolite, weight at >25 wk = monobenzyl phthalate metabolite, and weight at birth = mono-n-butyl phthalate metabolite. SDS, standard deviation scores.
Figure caption (fragment): [...] Table S8. See Table 2 [...] and bisphenols (BPA, BPS, BPF). Quadratic terms for the phthalate mixture model were included because they improved model fit based on significantly lower Akaike information criteria (AIC) (likelihood ratio test p < 0.05): Weight at 18-25 wk = MBzP, weight at >25 wk = MBzP, and weight at birth = MBP. Nonlinear (i.e., quadratic) terms did not improve model fit for models of the DAP metabolites and bisphenols.

Sensitivity Analyses
First, we compared our results to a model where all visit-specific chemical concentrations, as opposed to averages, were modeled in association with fetal weight or birth weight. In this analysis we only examined outcomes at the same or subsequent visits to exposure measurements, as described above. In comparison with the main analyses, the estimates for the overall mixture effect on weight at each time point were similar (Table S11). Second, the results of the analyses in which we excluded BPF and BPS were similar to the main results (Table S11). Third, the results of the analyses in which we included the averaged nonstandardized concentrations for BPA were similar to the main results (Table S11). Fourth, although not directly comparable because separate exposure models do not account for the joint effects of individual exposure biomarkers, consider nonlinear dose-response relations, or correct for copollutant confounding, single-pollutant model results showed patterns similar to those we observed in our mixtures analysis (Table S12). However, there were some discrepancies compared with our primary results. For example, individual phthalate metabolite associations with fetal weight at 18-25 wk were around the null, which is different from the mixture analyses in which we observed an inverse association between the phthalate mixture and weight at 18-25 wk. By modeling co-occurring chemical exposures, associations with a certain outcome can be identified that are missed in single regression models that do not account for joint effects or adjust for coexposures. However, the sensitivity analyses only modeled linear associations, which could also have contributed to differences in results. Fifth, when we reanalyzed models in which we only used complete cases for head circumference and length, results were similar to those from the main analyses where these measures were imputed (Table S13).
Finally, although we could not estimate weights from models with nonlinear terms, we examined weights in reduced models with linear terms only as a sensitivity analysis (Table S14). These estimates provide information on the contribution of each chemical biomarker to the overall effect in both the negative and positive directions. We used these weights to examine the most important contributors to the negative effects for interpretation of our primary results, where negative effects outweighed the positive (i.e., because associations were inverse). Phthalate metabolites had the highest negative contribution (18-25 wk = 55%, >25 wk = 68%, birth = 68%), followed by DAP metabolites (18-25 wk = 41%, >25 wk = 28%, birth = 23%). Among these, MEOHP (18-25 wk = 25%, >25 wk = 32%, birth = 23%) was the metabolite that contributed the most to the negative association. The DAP metabolite DEP also had a substantial contribution to the negative association with weight at 18-25 wk (26%). On the other hand, MEHHP (18-25 wk = 49%, >25 wk = 30%, birth = 19%) contributed most to the positive association. Independent effects from these models should be interpreted cautiously, because our statistical evidence is for the joint effects noted in our primary results.
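The directional weights discussed above can be illustrated with a small calculation. In a linear quantile g-computation model, each biomarker's weight is its coefficient divided by the sum of all coefficients with the same sign, so positive and negative weights each sum to 1. The Python sketch below is an illustration under that assumption; the function name is hypothetical, and the coefficients are invented for the example and are not the estimates in Table S14.

```python
def directional_weights(coefs):
    """Split coefficients by sign and scale each group to sum to 1."""
    pos_total = sum(b for b in coefs.values() if b > 0)
    neg_total = sum(b for b in coefs.values() if b < 0)
    positive = {k: b / pos_total for k, b in coefs.items() if b > 0}
    negative = {k: b / neg_total for k, b in coefs.items() if b < 0}
    return positive, negative


# Hypothetical coefficients (SDS per quartile increase) for four biomarkers.
coefs = {"MEOHP": -0.10, "DEP": -0.06, "MEHHP": 0.08, "BPA": 0.02}
positive, negative = directional_weights(coefs)
```

A weight describes a biomarker's share of the effect in one direction only, so a large positive weight and a large negative weight cannot be compared without also looking at the size of the positive and negative partial sums.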

Discussion
In this large population-based study, we observed that prenatal exposure to a mixture of phthalates, bisphenols, and OP pesticides was associated with lower fetal weight estimated by ultrasound during pregnancy, and lower weight at birth. These associations appeared to be driven by phthalates and OP pesticides. An important finding was that associations were nonlinear, and the nature of the nonlinearity differed by the period when weight was measured. For estimated fetal weight at 18-25 wk, the differences between the lowest exposure quartile and each of the higher exposure quartiles (i.e., first vs. the second, third, and fourth quartile) were of approximately equal size. For weight at >25 wk, the largest differences were observed between the lowest exposure quartile and the second and third quartiles. For birth weight, however, the largest difference was observed comparing the lowest exposure quartile with the highest exposure quartile (i.e., first vs. the fourth quartile). These results suggest that the susceptibility of fetal weight gain to nonpersistent chemical mixtures differs across periods of pregnancy.
Most epidemiological studies of nonpersistent chemicals have focused on the health effects of individual chemical exposures (Lazarevic et al. 2019). Generally, results from these previous studies have been inconclusive (Casas et al. 2016; Dalsager et al. 2018; Ferguson et al. 2016; Harley et al. 2016; Philippat et al. 2012, 2014; Shoaff et al. 2016; Wolff et al. 2008; Zhu et al. 2018). However, the assessment of individual exposure effects on health outcomes may not be ideal to study these compounds. Single-chemical models can bias effect estimates in the presence of copollutant confounders and increase false positives when correlated exposures are modeled separately (Kortenkamp 2007). Further, chemicals can act additively, synergistically, antagonistically, or they may be inert with respect to the health outcomes of interest (Gaudriault et al. 2017; Kortenkamp 2007). Thus, single-chemical models may over- or underestimate the risks of exposure when exposures are modeled separately (Kortenkamp 2007). By using quantile g-computation to model nonpersistent chemical exposure, we were able to account for the joint effect of the exposure, substantially reduce the number of tests (and thus the potential for false discoveries), and account for copollutant confounding. Further, quantile g-computation can account for nonlinear dose-response relationships and provides simplicity of inference by presenting one or two estimates for the joint mixture effect. Weighted quantile sum (WQS) regression shares this simplicity of inference and also allows for the estimation of joint effects. However, WQS regression is limited by the assumption of directional homogeneity (i.e., effects of all exposures are zero or in the same direction) (Keil et al. 2020; Lazarevic et al. 2019). Another innovative method that can estimate the joint effects of exposure to mixtures is Bayesian kernel machine regression (BKMR). BKMR has many benefits, such as the ability to concurrently estimate, among highly correlated chemicals, joint effects, nonlinear relationships, and exposure interactions (Bobb et al. 2015). However, the quantile g-computation method provides one or two parameters for a dose-response estimation, whereas the dose-response parameters of BKMR are not as easily interpretable.
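The core idea of quantile g-computation can be sketched in a few lines under simplifying assumptions: a linear, additive model with no covariates, rank-based quartile scoring, and ordinary least squares solved by hand. This is an illustrative toy, not the implementation used in this study, which also included nonlinear terms and adjustment for confounders. All function names and the simulated data are hypothetical.

```python
import random
from typing import List


def quartile_scores(values: List[float]) -> List[int]:
    """Score each value 0-3 by rank-based quartile (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 4 // n
    return scores


def ols(X: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[j] * row[k] for row in X) for k in range(p)]
        + [sum(row[j] * yi for row, yi in zip(X, y))]
        for j in range(p)
    ]
    for col in range(p):
        # Partial pivoting, then eliminate this column from every other row.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][p] / A[j][j] for j in range(p)]


def mixture_effect(exposures: List[List[float]], outcome: List[float]) -> float:
    """psi: expected change in outcome when every exposure rises one quartile."""
    scored = [quartile_scores(col) for col in exposures]
    X = [[1.0] + [float(s[i]) for s in scored] for i in range(len(outcome))]
    beta = ols(X, outcome)
    return sum(beta[1:])  # sum the exposure coefficients, skipping the intercept


# Simulated, noise-free toy data so that the expected answer is known.
rng = random.Random(0)
n = 200
exposures = [[rng.lognormvariate(0.0, 1.0) for _ in range(n)] for _ in range(3)]
scores = [quartile_scores(col) for col in exposures]
true_betas = [-2.0, 1.0, -3.0]  # made-up per-quartile effects, mixed directions
outcome = [10.0 + sum(b * s[i] for b, s in zip(true_betas, scores)) for i in range(n)]

psi = mixture_effect(exposures, outcome)
print(psi)  # close to sum(true_betas) = -4.0
```

Note that the toy coefficients have mixed signs: the summed estimate remains well defined, whereas a method that assumes directional homogeneity would have to constrain all effects to one direction.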
Comparing our results to prior studies is challenging because of the differences in chemical exposure biomarkers included in the mixture and variation in the methods used to assess the chemical mixture-fetal growth association. Moreover, there are vast differences in exposure assessment approaches (number and timing of urine sample collections) and approaches for outcome assessment (birth measures vs. ultrasound). All but one (Ouidir et al. 2020) of the previous studies investigating chemical mixtures and fetal growth have focused only on birth weight as an outcome, and none have focused on jointly estimating the effects of the nonpersistent exposures examined in our study (Chiu et al. 2018; Kalloo et al. 2020; Lenters et al. 2016; Philippat et al. 2019; Woods et al. 2017). Furthermore, previous analyses have addressed different research questions pertaining to chemical mixtures, such as identifying the most potent compound within the mixture, which requires different statistical methods (Lenters et al. 2016; Philippat et al. 2019).
The results from these studies can best be summarized by research question. First, several studies have used methods such as elastic net penalized regression (ENET) to identify the most toxic compounds in the mixture. Philippat et al. (2019) estimated the association of prenatal exposure to 9 phenols and 11 phthalate metabolites with birth weight. Benzophenone-3 was the only biomarker selected by the multipollutant ENET model that was associated with birth weight, and no associations were observed for phthalate metabolites (Philippat et al. 2019). Similarly, Lenters et al. (2016) assessed associations using ENET between multiple biomarkers of persistent and nonpersistent chemicals and birth weight, identifying MEHHP and several persistent organic pollutants to be associated with lower birth weight, and mono(oxoisononyl) phthalate to be associated with higher birth weight. Second, a commonly used approach for reducing the dimensionality of exposures is the use of principal components analysis (PCA). Chiu et al. (2018) used this approach and identified two principal components [bis(2-ethylhexyl) phthalate (DEHP) metabolites and other phthalate metabolites] that were both associated with lower birth weight. Kalloo et al. (2020) also used PCA and observed that a component with loadings from DAP metabolites and per- and polyfluoroalkyl substances (PFAS), but not phthalate metabolites, was associated with lower birth weight. Lee et al. (2020) identified a component composed of bisphenols and phthalate metabolites but did not observe an association between that component and birth weight. Finally, several studies have used approaches to estimate joint effects of the mixture, as was our goal in the present analysis. The study by Chiu et al. (2018) mentioned above used BKMR and noted a decrease in birth weight in association with the overall mixture (including phthalate metabolites). Woods et al. (2017) also observed an overall mixture effect when the mixture was composed of PFAS, lead, and DAP metabolites.
In summary, with regard to the chemicals examined in the present study, OP pesticide exposure appears to be the most consistently associated with birth weight in studies using a mixture approach. Regarding phthalate metabolites and bisphenols, studies using a mixture approach have produced conflicting results. Although we did not observe associations for DAP metabolites and birth weight, we did observe associations with growth measures in pregnancy. On the other hand, we did observe associations between the phthalate mixture and birth weight for individuals in the fourth quartile of exposure in comparison with those of the first quartile. Given the major differences in design and methods across studies, these incongruences are not necessarily surprising. Additional work to harmonize across methodologies for better comparability is needed in mixtures research. However, effects of mixtures also depend on the specific mixture to which each population is exposed; inconsistencies across studies may therefore reflect different exposure characteristics of each population, not just differences in study design and methodology.
Most previous studies of these chemicals and fetal growth use growth parameters at birth only, whereas our study also included ultrasound measures in pregnancy to examine fetal growth. Fetal growth changes during specific periods of pregnancy may differentially affect childhood health outcomes (Gishti et al. 2014; Henrichs et al. 2010) and may also exhibit differences in susceptibility to environmental chemical exposures (Braun and Gray 2017). We observed that differences in the prenatal exposure mixture (i.e., first vs. the second, third, and fourth quartile) were associated with lower estimated fetal weight at 18-25 wk and that differences in the prenatal exposure mixture between the exposure in the lowest quartile and the exposure in the second and third quartile were associated with lower estimated fetal weight at >25 wk. Decreased growth in this period may be critical for health outcomes later in life. For example, first-trimester fetal growth restriction is linked to faster weight gain and adverse cardiovascular profiles in school-age children (Mook-Kanamori et al. 2010). On the other hand, it is notable that differences in birth weight were only apparent at high levels of exposure (e.g., being in the fourth quartile of exposure) in our analysis. Lower birth weight is associated with numerous health outcomes such as increased rates of obesity, insulin resistance, and type 2 diabetes (Barker 2004; Jornayvaz et al. 2016). Based on this pattern of associations, we hypothesize that certain fetal compartments (e.g., organs, adipose, skeleton) are differentially vulnerable to environmental exposures. For example, organs and skeleton, which constitute most of the fetal weight in the first half of pregnancy, may be more sensitive to low levels of these exposures than adipose, which accumulates in the second half of pregnancy (Orsso et al. 2020; Toro-Ramos et al. 2015).
Findings of this study should be interpreted considering the following limitations. First, phthalates, bisphenols, and OP pesticides have short half-lives. We used urinary measurements, which are preferred for assessment of exposure to these compounds (Nieuwenhuijsen 2015). However, concentrations measured in spot urine samples might not accurately reflect pregnancy exposure because concentrations vary from day to day, depending on diet and lifestyle. We therefore created subject-specific averages based on three measurements during pregnancy that may be a more stable reflection of exposure over time. Despite this improvement, measurement error may have resulted in imprecise effect estimates (Perrier et al. 2016). This study was also limited by the high percentages below the LOD for BPF and BPS, which resulted in less variability to measure the exposure on a continuous scale and may have affected the interpretability of the effect estimate (i.e., when exposure levels are the same for individuals in the first and second quartiles of BPF or BPS exposure). However, results were similar when BPF and BPS were excluded from the main models in the sensitivity analyses. Finally, we did not formally evaluate nonadditive effects using quantile g-computation and thereby may have missed synergistic effects. However, under certain circumstances, e.g., when exposures are highly correlated, nonlinear terms could also capture nonadditivity within the joint effect (Belzak and Bauer 2019). In our more realistic setting, where correlations ranged from moderate to high, it is unlikely that we captured all nonadditivity in the joint effect estimates; however, the finding from Belzak and Bauer (2019) is an important reminder that nonlinear joint effect estimates of mixtures may capture nonadditive associations even if interactions are not explicitly modeled.
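The benefit of averaging repeated spot urine samples can be made concrete with a short sketch based on the Spearman-Brown formula, which gives the reliability of the mean of k measurements from the intraclass correlation coefficient (ICC) of a single measurement. It assumes a classical measurement error model with independent errors across visits; the function name and the ICC value used are hypothetical and were not estimated from this cohort.

```python
def reliability_of_mean(icc: float, k: int) -> float:
    """Reliability of the mean of k repeated measures (Spearman-Brown formula).

    Under classical measurement error, this ratio approximates the factor by
    which a regression slope on the error-prone exposure is attenuated.
    """
    if not 0.0 < icc <= 1.0:
        raise ValueError("icc must be in (0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * icc / (1.0 + (k - 1) * icc)


# Hypothetical ICC for a single spot urine sample; illustration only.
icc_single = 0.25
for k in (1, 3):
    print(k, reliability_of_mean(icc_single, k))
```

With this illustrative ICC, averaging three samples doubles the reliability relative to a single sample but still leaves it well below 1, which is consistent with the caution above that residual measurement error may remain.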
However, this study has several major strengths, such as the large sample size, the availability of three urinary measurements of chemical concentrations to assess exposure, and the repeated ultrasound scans that captured fetal size at multiple time points in pregnancy and for different parameters (e.g., femur length in addition to weight). This study also used quantile g-computation, a novel method to assess the joint effect of a chemical mixture, which provides easily interpretable and parsimonious effect estimates for the mixture as a whole. Such joint effects may more closely resemble real-world effects of exposures than adjusted independent effects when the exposures co-occur (Keil et al. 2020).
In conclusion, we observed that higher exposure to a mixture of phthalates, bisphenols, and OP pesticides was associated with lower fetal weight at 18-25 wk. At >25 weeks of gestation, differences in weight were observed between the first and second quartiles and between the first and third quartiles of the mixture exposure, and differences at birth were observed between the first and fourth quartiles of the mixture exposure. These associations appeared to be driven by phthalates and OP pesticides. Growth earlier in pregnancy appeared to be susceptible to lower levels of exposure, whereas higher levels of exposure were needed for an association with birth weight. The chemical exposures in this mixture are widespread in pregnant women; thus, the impact of these chemical mixtures on fetal and neonatal health could be substantial.