Urinary Concentrations of Organophosphate Flame Retardant Metabolites and Pregnancy Outcomes among Women Undergoing in Vitro Fertilization

Background: Evidence from animal studies suggests that exposure to organophosphate flame retardants (PFRs) can disrupt endocrine function and impair embryo development. However, no epidemiologic studies have been conducted to evaluate effects on fertility and pregnancy outcomes. Objectives: We evaluated associations between urinary concentrations of PFR metabolites and outcomes of in vitro fertilization (IVF) treatment among couples recruited from an academic fertility clinic. Methods: This analysis included 211 women enrolled in the Environment And Reproductive Health (EARTH) prospective cohort study (2005–2015) who provided one or two urine samples per IVF cycle. We measured five urinary PFR metabolites [bis(1,3-dichloro-2-propyl) phosphate (BDCIPP), diphenyl phosphate (DPHP), isopropylphenyl phenyl phosphate (ip-PPP), tert-butylphenyl phenyl phosphate (tb-PPP), and bis(1-chloro-2-propyl) phosphate (BCIPP)] using negative electrospray ionization liquid chromatography tandem mass spectrometry (LC-MS/MS). Molar concentrations of the urinary PFR metabolites were summed. We used multivariable generalized linear mixed models to evaluate the association of the PFR metabolites with IVF outcomes, accounting for multiple IVF cycles per woman. Results: Detection frequencies were high for BDCIPP (87%), DPHP (94%), and ip-PPP (80%), but low for tb-PPP (14%) and BCIPP (0%). We observed decreased success for several IVF outcomes across increasing quartiles of both summed and individual PFR metabolites (DPHP and ip-PPP) in our adjusted multivariable models. Significant declines in adjusted means from the lowest to highest quartile of ΣPFR were observed for the proportion of cycles resulting in successful fertilization (10% decrease), implantation (31%), clinical pregnancy (41%), and live birth (38%). 
Conclusions: Using IVF to investigate human reproduction and pregnancy outcomes, we found that concentrations of some urinary PFR metabolites were negatively associated with proportions of successful fertilization, implantation, clinical pregnancy, and live birth. https://doi.org/10.1289/EHP1021


Introduction
One in six couples worldwide is affected by infertility, which is defined as the inability to get pregnant after 1 y or more of unprotected intercourse (Chandra et al. 2005), and a recent U.S. study found that pregnancy loss (miscarriage) affected approximately 28% of couples planning a pregnancy (Buck Louis et al. 2016). These figures will likely rise as the postponement of childbearing increases in developed regions of the world. Infertility has an associated health-care cost in the billions of dollars per year, not including the physical and psychological burden placed on the couple (Connolly et al. 2010). The high rates of infertility, together with the associated high costs, highlight the need to improve our understanding of risk factors that impair the ability to have a child.
One potential risk factor is environmental exposure. Several classes of endocrine-disrupting chemicals (EDCs) with widespread general population exposure, including pesticides and phthalates, have been linked to infertility and adverse pregnancy outcomes (Di Renzo et al. 2015). While there are hundreds or more EDCs, only a fraction have been evaluated for effects on infertility and pregnancy (Gore et al. 2015). Organophosphate flame retardants (PFRs) are a class of EDCs with ubiquitous exposure that have been detected in 90–100% of adult urine samples (Butt et al. 2014, 2016; Carignan et al. 2013a; Hammel et al. 2016; Hoffman et al. 2014; Meeker et al. 2013a). Over the past decade, PFRs have been used widely in the polyurethane foam of upholstered furniture as replacements for pentabromodiphenyl ether, a flame retardant mixture that was phased out of use in 2005 (Stapleton et al. 2009). These chemicals are not chemically bonded to foam and have been shown to migrate into the air and dust of indoor environments (van der Veen and de Boer 2012). The PFR triphenyl phosphate (TPHP) is used as part of flame retardant mixtures in polyurethane foam as well as in a variety of other applications, including as a plasticizer (WHO 1991).
Animal studies indicate that exposure to PFRs can disrupt endocrine function through altered thyroid action, steroidogenesis, or estrogen metabolism, and can also impair embryo development (Farhat et al. 2013; Liu et al. 2013; Wang et al. 2015). Reductions in sperm motility and increased serum total T3 levels were associated with increasing PFR exposures in a small study of men (Meeker et al. 2013b); however, no studies have investigated the effect of PFRs on pregnancy outcomes. Therefore, we explored the association between urinary concentrations of PFRs and pregnancy outcomes among women in a prospective cohort study, the Environment And Reproductive Health (EARTH) study, using assisted reproductive technologies (ART) as a model to study early developmental endpoints and pregnancy outcomes.

Participants
Study participants were women recruited into the EARTH study, which was established in June 2004 to evaluate environmental and dietary determinants of fertility from patients undergoing ART at the Massachusetts General Hospital (MGH) Fertility Center. Women between the ages of 18 and 46 were eligible to participate, and approximately 60% of those contacted by the research nurses enrolled in the study. The EARTH study was approved by the Human Studies Institutional Review Boards of the MGH and Harvard T.H. Chan School of Public Health. Participants signed an informed consent after the study procedures were explained by trained study staff and any questions were answered. Demographic information including race/ethnicity, smoking history, and education, as well as whether they had a previous pregnancy, were recorded at study entry by the participant using a questionnaire. To be included in the present analysis, women must have contributed their own oocytes and at least one urine sample for the measurement of flame retardant metabolites during an in vitro fertilization (IVF) cycle. From the 212 women (298 IVF cycles) who met these criteria, we excluded 1 woman (1 IVF cycle) with incomplete outcome data. Our final data set included 211 women with 297 IVF cycles who had complete information on the exposure and outcome variables.

Clinical Data and in Vitro Fertilization Outcomes
At study entry, the participant's date of birth was collected, and her weight and height were measured by trained study staff. Body mass index (BMI) was calculated as weight (in kilograms) divided by height (in meters) squared. Clinical information was collected or abstracted from the patient's electronic medical record by trained study staff at enrollment into the study and after each IVF cycle. Follicle stimulating hormone (FSH) was measured in a blood sample drawn on the third day of the menstrual cycle and analyzed with an automated electrochemiluminescence immunoassay at the MGH Core Laboratory as previously described (Mok-Lin et al. 2010). Cause of infertility was diagnosed by a physician at the MGH Fertility Center according to the Society for Assisted Reproductive Technology (SART) using standard infertility definitions (SART 2014; Mok-Lin et al. 2010). IVF treatment protocols were assigned by a physician at the MGH Fertility Center based on clinical indications and factors such as age and infertility diagnosis. Treatment protocols included: a) luteal phase gonadotropin-releasing hormone (GnRH) agonist (low-, regular-, or high-dose leuprolide acetate; Lupron); b) follicular phase GnRH agonist/flare stimulation; or c) GnRH antagonist. Lupron dose was reduced at, or shortly after, the start of ovarian stimulation with FSH/human menopausal gonadotropin (hMG) in the luteal phase GnRH agonist protocol. Serum peak estradiol (pmol/L) was measured on the day of ovulation trigger with human chorionic gonadotropin (hCG) [estradiol (E2) trigger levels] using an automated electrochemiluminescence immunoassay at the MGH Core Laboratory. Oocytes were counted and classified by embryologists after egg retrieval as germinal vesicle, metaphase I, metaphase II, or degenerated. Fertilization was obtained by conventional IVF or intracytoplasmic sperm injection (ICSI).
Fertilization was confirmed 17–20 h after insemination by the presence of a fertilized oocyte with two pronuclei. Embryos were monitored for cell number and morphological quality [1 (best) to 5 (worst)] on days 2 and 3, and considered of best quality if they had a score of 1 or 2. Transfer occurred rarely on day 2; most transfers occurred on day 3 or day 5 and comprised cleavage-stage and blastocyst-stage embryos, respectively. Most women had 1–3 embryos transferred (range = 0–5), with the number dependent on the woman's age, cycle number, and the day of transfer. Implantation was defined as a serum β-hCG level >6 mIU/mL approximately 17 d (range = 15–20 d) after egg retrieval, clinical pregnancy as the presence of an intrauterine pregnancy confirmed by ultrasound at approximately 6-wk gestation, and live birth as the birth of a neonate on or after 24-wk gestation.

Organophosphate Flame Retardant Assessment in Urine Samples
Urine samples were collected from participants enrolled in the EARTH study between May 2005 and January 2015. Each sample was collected into a sterile polypropylene cup, and up to two urine samples were collected during each IVF cycle. Urine samples were collected at a geometric mean (GM) [95% confidence interval (CI)] of 7.6 (7.3, 8.0) d apart with 178 (141, 215) d between the first and second IVF cycle. Following collection of each sample, specific gravity (SG) was measured using a Protometer (hand held) Model 100B refractometer (National Instrument Company, Inc.); the urine sample was divided into aliquots and frozen at −80°C. Samples were shipped on dry ice overnight to H.M. Stapleton's lab at Duke University (Durham, NC) for the quantification of the PFR metabolites (Figure 1).
Samples were analyzed by LC-MS/MS in 10 separate batches, and unique method detection limits (MDLs) were calculated for each analysis batch. In the urine samples, the mean recovery of the mass-labeled standards was 119% (standard error = 0.75%) for d10-DPHP and 152% (2.2%) for d10-BDCIPP. One laboratory blank (5 mL MilliPore water only) sample was extracted with every batch (n = 95). An in-house standard reference material (SRM) was prepared from pooled urine that was collected during previous studies. SRM samples were periodically analyzed during the extraction batches (n = 18) and were generally within 10% for DPHP, 15% for BDCIPP, and 20% for ip-PPP. Two of the individual subsamples were analyzed in duplicate to assess method precision and were generally within 15% for DPHP, 25% for ip-PPP, and 35% for BDCIPP. Very low levels of DPHP (mean = 0.58 ng) and ip-PPP (mean = 0.21 ng) were commonly detected in the laboratory blanks, and analyte values were blank corrected using the mean laboratory blank values. MDLs were calculated as three times the standard deviation of laboratory blanks normalized to the volume of water extracted (5 mL). Across the 10 analysis batches, MDLs ranged from 68–180 pg/mL for BCIPP, 31–300 pg/mL for BDCIPP, 25–130 pg/mL for DPHP, 23–120 pg/mL for ip-PPP, and 10–150 pg/mL for tb-PPP.
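The MDL and blank-correction rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the laboratory's actual procedure: the function names and the blank masses are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def method_detection_limit(blank_masses_ng, volume_ml=5.0):
    """MDL in pg/mL: three times the SD of the laboratory blank masses (ng),
    normalized to the extracted volume (mL). Sample SD is an assumption."""
    sd_ng = statistics.stdev(blank_masses_ng)
    return 3.0 * sd_ng / volume_ml * 1000.0  # ng/mL converted to pg/mL

def blank_correct(sample_mass_ng, blank_masses_ng):
    """Subtract the mean laboratory blank mass from a measured analyte mass."""
    return sample_mass_ng - statistics.mean(blank_masses_ng)

# Hypothetical blank masses (ng); these are not values from the study.
blanks = [0.50, 0.60, 0.70]
mdl = method_detection_limit(blanks)   # about 60 pg/mL for these inputs
corrected = blank_correct(2.0, blanks) # about 1.4 ng for these inputs
```

Because a separate MDL was calculated for each analysis batch, a function like this would be applied per batch rather than once for the whole data set.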

Data Analysis
Demographic and clinical characteristics were reported using mean ± interquartile range or percentages. Unquantified concentrations <MDL were substituted with a value equal to $\mathrm{MDL}/\sqrt{2}$ (Hornung and Reed 1990). To account for urinary dilution, we adjusted to SG as described by Pearson et al. (2009): $C_{SG} = C \times [(SG_m - 1)/(SG_i - 1)]$, where $C_{SG}$ = SG-adjusted urinary metabolite concentration, $C$ = urinary metabolite concentration, $SG_m$ = mean SG for the population, and $SG_i$ = SG for an individual sample. Cycle-specific urinary metabolite concentrations were calculated using the GM of the two urinary metabolite concentrations from each IVF cycle. Cycle-specific concentrations were divided into quartiles for use in regression models. All analyses used the SG-adjusted urinary metabolite concentrations.
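The three exposure-processing steps described above (substitution below the MDL, SG adjustment, and the within-cycle geometric mean) can be illustrated with a short Python sketch. The function names and all numeric inputs are hypothetical, and applying the substitution before the SG adjustment is an assumption based on the order in which the steps are described; the study itself used SAS.

```python
import math

def substitute_below_mdl(conc, mdl):
    """Replace a concentration below the MDL with MDL / sqrt(2)."""
    if conc is None or conc < mdl:
        return mdl / math.sqrt(2)
    return conc

def sg_adjust(conc, sg_individual, sg_mean):
    """C_SG = C * [(SG_m - 1) / (SG_i - 1)]."""
    return conc * (sg_mean - 1.0) / (sg_individual - 1.0)

def geometric_mean(values):
    """Geometric mean of the positive concentrations from one IVF cycle."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical inputs: two samples from one cycle (ng/mL), their SGs,
# a population mean SG of 1.015, and a batch MDL of 0.1 ng/mL.
raw = [1.0, 0.05]
sgs = [1.010, 1.020]
filled = [substitute_below_mdl(c, mdl=0.1) for c in raw]
adjusted = [sg_adjust(c, sg, sg_mean=1.015) for c, sg in zip(filled, sgs)]
cycle_value = geometric_mean(adjusted)
```

A dilute sample (SG below the population mean) is scaled up and a concentrated sample is scaled down, which is why the first hypothetical sample rises from 1.0 to about 1.5 ng/mL.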
To evaluate the associations between the urinary metabolites and IVF outcomes, we fit multivariable generalized linear mixed models with random intercepts to account for multiple IVF cycles in the same woman. These models allow for the use of multiple outcome observations per individual while accounting for within-person correlations in outcomes. These models are also appropriate and can provide unbiased estimates in the presence of an unbalanced design (e.g., different number of cycles contributed per woman) when imbalance in the number of IVF cycles is not completely random (e.g., women with more cycles are having more difficulty getting pregnant), and the lack of balance can be accurately predicted by all measured covariates in the adjusted model. A normal distribution and identity link function were specified for peak estradiol and endometrial lining thickness. A Poisson distribution and log link function were specified for the number of mature and total oocytes as well as best quality embryos. A binomial distribution and logit link function were specified for fertilization and the proportion of mature to total oocytes. Finally, a binary distribution with a logit link function was specified for the clinical outcomes (implantation, clinical pregnancy, and live birth). Tests for trend were conducted across quartiles using the median log-transformed urinary metabolite concentration in each quartile as a linear variable in the regression models. To allow for better interpretation of the results, all results are presented as population marginal means adjusted for covariates. Percent decrease was calculated as the difference in marginal means from Q1 to Q4 divided by the marginal mean from Q1.
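As a worked example of the percent-decrease calculation, the sketch below uses the adjusted live-birth proportions given in the Discussion (0.48 in the lowest ΣPFR quartile and 0.30 in the highest). The function name is hypothetical, and because the inputs are already rounded, the result may differ slightly from a value computed on unrounded marginal means.

```python
def percent_decrease(q1_mean, q4_mean):
    """Difference in marginal means from Q1 to Q4, divided by the Q1 mean."""
    return (q1_mean - q4_mean) / q1_mean * 100.0

# Adjusted proportions of live birth reported in the Discussion.
live_birth_decrease = percent_decrease(0.48, 0.30)
print(f"{live_birth_decrease:.1f}%")  # approximately 37.5%, reported as 38%
```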
We evaluated confounding using prior knowledge and descriptive statistics from our cohort. The following covariates were considered for inclusion in the final model: maternal age (continuous), race/ethnic group (black/Asian/other, white/Caucasian), BMI (continuous), smoking history (ever, never), education (high school, college graduate, graduate degree), year of IVF treatment cycle (continuous), and primary infertility diagnosis (female factor, male factor, and unexplained). Variables were included in the final model if they were associated with any of the individual PFR metabolites (BDCIPP, DPHP, ip-PPP) in our population, were suspected to be associated with exposure based on previous research, or were strong predictors of the outcome. In our primary analysis of the early developmental outcomes, we excluded seven women (16 IVF cycles) with unsuccessful oocyte retrieval, and applied this restriction in a sensitivity analysis for the clinical outcomes. We tested whether the associations of urinary metabolite concentrations with fertilization were modified by ICSI by entering a product of the metabolite quartiles and a binary variable representing the presence or absence of ICSI into the models. Given our limited sample size for detecting interactions, a suggestion of interaction was considered if the p-value for this interaction term was <0.10. Finally, we tested for a potential cohort effect by controlling for maternal year of birth, rather than maternal age, in our adjusted model.
We evaluated the relationship between urinary PFR metabolites using Spearman's correlation and estimated the variability of urinary metabolite concentrations within a cycle and for all cycles by calculating the intraclass correlation coefficient (ICC). We conducted all statistical analyses using SAS (version 9.2; SAS Institute Inc.), and all tests other than interaction with ICSI considered two-sided significance levels less than 0.05 as statistically significant.
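To make the ICC concrete, the sketch below computes a one-way random-effects ICC from a balanced set of repeated measures. This is an assumption-laden illustration: the text does not state which ICC estimator was used or whether concentrations were log-transformed, the analysis itself was run in SAS, and the function name and data are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for balanced data.

    Each inner list holds the repeated measurements for one unit
    (e.g., the two urine samples from one IVF cycle).
    """
    n = len(groups)       # number of units
    k = len(groups[0])    # measurements per unit (assumed equal for all units)
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    unit_means = [sum(g) / k for g in groups]
    # Between-unit and within-unit mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in unit_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, unit_means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical paired measurements for three cycles.
example_icc = icc_oneway([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
```

An ICC near 1 indicates that most of the variability lies between units, so a single sample characterizes exposure well; a low ICC indicates substantial within-unit variability, which is one reason repeated samples are useful for short-lived biomarkers.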

Study Population
Our analysis included 211 women who were on average 35.0 y of age; 87% were Caucasian, 75% had never smoked, and 34% had a prior pregnancy (Table 1). The primary SART diagnosis was approximately equally distributed between female factor (36%), male factor (29%), and unexplained (35%).
Demographic variables associated with urinary PFR concentrations included age (BDCIPP and DPHP), BMI (BDCIPP), and year (ip-PPP). For every unit increase in age, there was a 4% (p = 0.05) and 3% (p = 0.04) decline in mean urinary BDCIPP and DPHP concentrations, respectively. For every unit increase in BMI, there was a 4% (p = 0.03) increase in urinary BDCIPP, and every year, there was a 7% (p = 0.002) decline in urinary ip-PPP concentrations. There were no other significant associations with the demographic variables, and no associations of urinary PFR quartiles with reproductive/cycle characteristics were observed.

In Vitro Fertilization Outcomes
We observed significant declines in adjusted means from the lowest to highest quartile of ΣPFR for the proportion of cycles resulting in successful fertilization (10% decrease, p-trend = 0.04), implantation (31%, p-trend = 0.02), clinical pregnancy (41%, p-trend = 0.004), and live birth (38%, p-trend = 0.05), adjusted for age, BMI, race/ethnicity, year of cycle, and infertility diagnosis (Figure 2, Tables 3 and 4). There were also significant declines in adjusted means from the lowest to highest quartile of DPHP for the proportion of cycles resulting in successful implantation (28% decrease, p-trend = 0.02) and clinical pregnancy (36%, p-trend = 0.01), as well as of ip-PPP for the proportion of cycles resulting in successful fertilization (16%, p-trend = 0.0006), implantation (27%, p-trend = 0.05), and live birth (34%, p-trend = 0.05). Unadjusted results were similar (Table S1). Results for the clinical outcomes were similar when restricted to IVF cycles with successful oocyte retrieval, restricted to the first cycle in the EARTH study, restricted to nulliparous women, or controlling for maternal year of birth (Table S2).
The proportion of fertilized oocytes was significantly lower in the highest quartile (Q4) of urinary ip-PPP compared to the lowest (Q1) with an adjusted difference in proportions (95% CI) of 0.10 (0.02, 0.19) using conventional fertilization that was smaller and nonsignificant when restricted to fertilization using ICSI [adjusted difference in proportions = 0.04 (95% CI = −0.02, 0.10)]. However, the interaction term by ICSI on the association of urinary ip-PPP with fertilization was not statistically significant (p of interaction = 0.53).
There was a significant positive association between urinary BDCIPP and the number of total oocytes (18% increase, p-trend = 0.04) and a nonsignificant increase in the number of mature oocytes (15% increase, p-trend = 0.10) retrieved in an IVF cycle. For endometrial lining thickness, we observed a nonsignificant decline in adjusted means from the lowest to highest quartile of urinary DPHP (4% decrease, p-trend = 0.15). Similar trends with oocyte count were not observed for DPHP, ip-PPP, or ΣPFR (p-trends >0.3). No other associations were observed between urinary metabolites and the other early developmental outcomes, and unadjusted results were similar (Table S3).

Discussion
As far as we are aware, this is the first epidemiologic study to explore associations between PFR exposure and female reproduction. We used the model of IVF to investigate human reproduction and early pregnancy outcomes, ranging chronologically from oocyte retrieval, oocyte fertilization, embryo quality, and implantation to clinical pregnancy and live birth. The sum of the urinary PFR metabolites was associated with reduced probability of successful fertilization, implantation, clinical pregnancy, and live birth. These findings are clinically relevant, as the adjusted proportion of live births in the highest quartile of ΣPFR was 0.30 as compared to 0.48 in the lowest quartile (adjusted difference in proportions = 0.18). GM concentrations of the PFR metabolites in our study population were similar to or lower than those in other adult populations in the United States, Norway, and Australia; thus, these exposure levels are not abnormal or high (Butt et al. 2014; Carignan et al. 2013b; Cequier et al. 2015; Cooper et al. 2011; Dodson et al. 2014; Hoffman et al. 2015; Meeker et al. 2013b; Van den Eede et al. 2015) (Table S4).
Animal studies indicate that PFRs may adversely affect female reproduction through disruption of regulatory pathways mediated by the hypothalamus-pituitary-gonadal axis. For example, studies in zebrafish have observed decreased fecundity (hatching and survival) with independent exposures to TDCIPP and TPHP (Wang et al. 2015). Evidence of hormone disruption in zebrafish includes increased plasma E2, testosterone, and vitellogenin as well as increased triiodothyronine (T3) and decreased thyroxine (T4) (Wang et al. 2015, 2013). Studies of chicken embryos have observed delayed hatching as well as declines in plasma thyroxine and cholesterol (Farhat et al. 2014, 2013). Human studies have also observed an association between low pregnancy levels of estradiol and fetal loss (Schindler 2004), and it is well known that subclinical hypothyroidism can adversely affect fertility (Abdel Rahman et al. 2010; Bussen et al. 2000; Scoccia et al. 2012; Velkeniers et al. 2013).
Strengths of our study include the prospective study design, preconception measurement of exposure using repeated urine samples (which is necessary for short-lived biomarkers like the PFR metabolites), state-of-the-art measurement of PFR exposure biomarkers, assessment of early developmental outcomes (i.e., fertilization, implantation) that are not observable in non-IVF populations, clinical outcome data obtained from electronic medical records, and control for potential confounders. One limitation of this analysis is that we did not consider the male partner's exposure, which may be correlated with that of his female partner and could contribute to the observed association. Our findings are generalizable to the infertility clinic population, which is sizable (Thoma et al. 2013), and may be more broadly generalizable, assuming that women undergoing IVF have similar biological responses to PFR exposure as women not undergoing IVF. Our findings may also be more relevant and generalizable to older women, as the mean age of women in our study population was 35 y. As each outcome is dependent on the previous one, a larger sample would be required to differentiate between independent effects on implantation, clinical pregnancy, and live birth.