Effects of BPA in Snails: Oehlmann et al. Respond

We welcome critical appraisals that help to provide balance; however, Dietrich et al. gave an unjustified reproach. We feel that Dietrich’s position is severely compromised because he serves as an expert for the bisphenol A (BPA) Industry Group (Brussels, Belgium). We would like to respond to the issues raised by Dietrich et al., as well as to their oversights and inappropriate interpretations of our findings. 
 
The source of test animals was clearly provided in our “Materials and Methods” (Oehlmann et al. 2006). All animals were dissected and sexed; thus, sex distribution was known for each time-point of the experiment. We supposed a 1:1 sex ratio for dead snails, although historical data (n > 14,000) indicate a slight prevalence of females (1.13:1); therefore, our assumption was conservative. Egg production was corrected for the number of females in the tanks, and snail densities were equal for all groups at each time-point. 
 
Semistatic designs are widely applied in scientific and regulatory ecotoxicology [Organisation for Economic Co-operation and Development (OECD) 1998]. The actual exposure concentrations of BPA were measured and clearly communicated in our Tables 1 and 2 (Oehlmann et al. 2006). Because 17α-ethinylestradiol (EE2) is more stable than BPA (Larsson et al. 1999), exposure to the positive control is also guaranteed in our 24-hr renewal test. Interestingly, Dietrich himself coauthored a semistatic study on snails (Czech et al. 2001) with several shortcomings: they used no analytical verification of exposure concentrations, no replicates, and inconsistent group size. 
 
Analyses of covariance (ANCOVA) of fecundity, development, and other cumulative data are widely used (Bochdansky and Bollens 2004; Dziminski and Alford 2005; Schärer and Wedekind 1999). In our experiment 2 with replicates (Oehlmann et al. 2006), ANOVA confirmed the ANCOVA results (Figure 2A,2C). A BPA Industry Group–sponsored statistical reevaluation of our raw data (Ecostat 2005) concluded that “at 20°C the mean egg production increased compared to the control in the BPA-exposed females at all applied concentrations (0.25, 0.5, 1 and 5 μg/L), and decreased in the BPA+faslodex- or tamoxifen-exposed females.” 
 
We achieved an association for a steady state of specific binding in three independent time-course studies (Oehlmann et al. 2006). We determined nonspecific binding using a 1,000-fold excess of unlabeled ligands resulting in clear specific binding for testosterone and estradiol. At higher concentrations, nonspecific binding was 70%, comparable with findings of Chou and Dietrich (1999), who also performed their experiments in duplicate. This percentage might be due to homogenization of large amounts of tissue with high protein content but a limited degree of specific cytosolic binding sites. In our study (Oehlmann et al. 2006), we did not intend to deliver a complete binding study in which saturation experiments with Scatchard analysis are needed, but to provide indications for the presence of estrogen receptors by a specific binding of ligands to cytosolic extracts (a widely used practice). Tamoxifen was not disqualified as an antiestrogen because it elicited a binding higher than that of BPA. Furthermore, in vitro ligand affinities have a limited predictive value for biologic potencies in vivo (Kloas et al. 1999). In summary, the binding study was performed appropriately for the desired purpose and provides initial evidence for specific estrogen binding sites with high affinity for BPA. 
 
Data presented in our Figure 1B (Oehlmann et al. 2006) were published in Schulte-Oehlmann et al. (2001) without EE2 because the focus of that work was comparing responses to BPA in four prosobranch species, including Marisa. Because the article was published in German, the distribution was not large enough to bring the issue to a wider audience. In the current article (Oehlmann et al. 2006), EE2 data were included to demonstrate the masking of BPA effects during the spawning season. Because future BPA industry-sponsored studies intend to investigate BPA effects under conditions maximizing reproduction, the problem of masked effects and an associated loss of sensitivity is of vital importance. 
 
Responses in Marisa (ruptured oviducts, increased spawning) are estrogen specific and opposite of androgenic effects (imposex, reduced spawning). This and evidence communicated in our article (Oehlmann et al. 2006) justify the use of EE2 to demonstrate the responsiveness of organisms. Non-monotonic concentration responses have also been reported for estrogen-regulated end points in EE2-exposed fish (Pawlowski et al. 2004), supporting our view that estrogen-specific binding sites in Marisa may represent functional receptors. 
 
Dietrich et al.’s charges that our “Introduction” and “Discussion” were “imbalanced and indeed alarmist” and that we selectively used literature are unjustified. 
 
We hope that the evidence presented here serves to refute the unjustified claims made by Dietrich et al. We leave it to the readers to make final judgment, but we feel that with the ever-increasing body of evidence showing effects of BPA on reproduction in various animal species, common sense will eventually prevail on this issue.

In contrast to what Wålinder et al. (2005) concluded in their article "Acute Effects of a Fungal Volatile Compound," I interpret the article to report essentially no effects beyond chance. In all, the authors carried out some 76 comparisons (each one representing a time point and an exposure vs. control measurement) if blink frequency is counted as a single comparison. The authors reported finding 5 "significant differences" out of 76 comparisons. Of the reported significant differences, one (blink frequency) is misleading, as discussed below. Of the remaining 75 comparisons, 4 differences at a p-value of < 0.05 might be expected by random chance. This is without applying Bonferroni's adjustment for multiple comparisons; using this adjustment, a p-value of approximately < 0.0007 would be required for a single comparison to be statistically significant. None of the differences reported reached this level.
Wålinder et al. (2005) reported that the subjects showed increased "blink frequency" during 3-methylfuran (3-MF) exposure (Table 1), but the frequency was higher in the exposure phase at time 0, about 9 for 3-MF exposure versus 6.5 for the control air phase. From the data in Table 1, it appears that both groups had fewer overall blinks per minute compared with baseline during the trial (Figure 2). Reporting that blinking was higher during exposure and not noting that it was higher at baseline is disingenuous.
Wålinder et al. (2005) may have mislabeled tear break-up, but as it reads in the legend for Table 1, a negative value indicates a decrease; therefore, the 6 sec given for measured break-up time after 3-MF exposure (Table 1) indicates that it was increased (i.e., longer to tear break-up), which is better. Is this correct? Also, was the observer who measured the tear break-up blinded to the exposure?
Finally, of the four lung measurements taken, the only comparison with a p-value of < 0.05 was the small 100-mL change for forced vital capacity (FVC) right after exposure (Table 4). How do the authors interpret this change in FVC in view of the fact that there was no significant change in forced expiratory volume in 1 sec (FEV1)?
The author has provided expert testimony in mold litigation.
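The multiple-comparison arithmetic in the letter above can be reproduced in a few lines. This is only a sketch of the two figures the letter quotes (the Bonferroni-adjusted threshold for 76 comparisons and the number of p < 0.05 results expected by chance among 75); the function and variable names are illustrative and do not come from either letter.

```python
# Sketch of the arithmetic quoted in the letter; names are illustrative.
ALPHA = 0.05  # conventional per-test significance level used in the letter


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-comparison p-value cutoff under Bonferroni's adjustment."""
    return alpha / n_tests


def expected_chance_findings(alpha: float, n_tests: int) -> float:
    """Expected count of p < alpha results if every null hypothesis is true."""
    return alpha * n_tests


threshold_76 = bonferroni_threshold(ALPHA, 76)      # letter: "approximately < 0.0007"
expected_75 = expected_chance_findings(ALPHA, 75)   # letter: about 4 by chance

print(f"Bonferroni threshold for 76 comparisons: {threshold_76:.5f}")
print(f"Expected chance findings among 75 comparisons: {expected_75:.2f}")
```

The expected count is a long-run average under the assumption that no true effects exist; it does not by itself say how likely any particular number of significant results is.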
There are different opinions on the use of Bonferroni's corrections. Everitt (1995) stated that it gives too conservative estimates if there are more than five tests performed. In environmental medicine, one exposure can have different health effects, so it is reasonable to test for different types of effects on different organs. We prefer to perform conventional statistical tests, without Bonferroni correction, and look at the pattern of significant effects and their biologic plausibility.
We did not perform 76 comparisons (Wålinder et al. 2005); we actually performed 25 tests on 13 physiologic variables based on differences before and after exposure. A fourteenth variable (vital staining) was tested only after exposure. Repeated measurement analysis was performed on blink frequency (60 consecutive measurements of 2 min each) and one questionnaire with 10 questions was administered at six different times. This is a total of 37 tests performed on 25 variables, having five significant values, of which one was highly significant (p < 0.001).
Moreover, all tests point in the same direction: mucosal effects of the exposure. We did not find the same effects over time for control exposure. There is, of course, the possibility that some of the significant effects were due to chance, and we were quite modest in our conclusions, using the words "may" or "might be." Because our study is the first exposure-chamber study on 3-methylfuran (3-MF), more studies are needed to determine final conclusions.
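One way to put the "due to chance" question in numbers is a binomial tail probability for the test count given in the response (37 tests, five significant at p < 0.05). The sketch below assumes the 37 tests are independent and that every null hypothesis is true. Neither letter makes that assumption, and repeated measures on the same subjects may not satisfy it, so this is an illustration and not a reanalysis; the function name is hypothetical.

```python
from math import comb

# Illustrative only: treats the 37 tests as independent, which is an assumption.
N_TESTS = 37        # total tests stated in the response
N_SIGNIFICANT = 5   # significant results stated in the response
ALPHA = 0.05


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


expected_37 = N_TESTS * ALPHA
p_five_or_more = prob_at_least(N_SIGNIFICANT, N_TESTS, ALPHA)

print(f"Expected chance findings among {N_TESTS} tests: {expected_37:.2f}")
print(f"P(at least {N_SIGNIFICANT} significant by chance): {p_five_or_more:.4f}")
```

Positive correlation between outcomes would change this tail probability, which is one reason the pattern and biologic plausibility of the findings matter alongside any count.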
Blink frequency was measured only during the 2 hr of exposure in the chamber. Therefore, there is no preexposure baseline data available at time 0. Eye effects occur quickly; a rapid effect of the exposure on blinking frequency can occur during the first 2 min of exposure to 3-MF inside the chamber, possibly followed by later adaptation (8.8 blinks/min during the first 2 min, compared with a mean of 7.6 blinks/min during the whole period of exposure).
It is true that there was a numerical increase in break-up time at exposure, which could be in agreement with increased blink frequency. The fatty layer on the tear film is produced by the glands of the eyelids. Therefore, an increased blinking frequency could produce more secretion from the meibomian glands and therefore a longer break-up time.
Regarding lung function, transient effects of environmental exposure (as well as diurnal variation, which we controlled for by performing the experiment at the same time) may affect either forced vital capacity (FVC) or forced expiratory volume in 1 sec (FEV1), or both. Physiologically and numerically, the decrease was of the same order, but statistically the outcome was different. The decreases were 0.1 L for FVC and 0.08 L for FEV1 after exposure to 3-MF. The magnitude of the effect was clinically small, but it was significant at group level for FVC. Small pulmonary effects may have large health effects in a population (Künzli et al. 2000).
children's health. However, when considering the current scientific weight of evidence, their conclusions cannot be supported for two reasons: first, they did not demonstrate a particular concern for children based on their results, and second, they cited incomplete and inappropriate literature to support the notion that recent toxicologic and epidemiologic studies indicate a health concern.
Harnly et al. (2005) did not conduct a risk assessment to demonstrate a concern for children. Rather, they measured pesticide concentrations in air in agricultural areas and then suggested there may be a concern for children because of recent toxicologic and epidemiologic studies. Their observed median exposures of all three active ingredients (chlorpyrifos, diazinon, and malathion) were all low and well within established regulatory limits. A risk assessment approach would have been quite useful. For chlorpyrifos, Harnly et al. (2005) detected a 20-day median concentration in air of 0.000033 mg/m³. A tier-1 risk assessment assuming an air concentration of chlorpyrifos at 0.000033 mg/m³, the mean body weight of a 1- to 2-year-old child of 12.3 kg, a child inhalation rate of 6.8 m³/day, and 24-hr outdoor respiration results in a chlorpyrifos inhalation exposure of 0.0000182 mg/kg/day. Margins of exposure (MOE) would be 5,495 [acute inhalation no observed adverse effect level (NOAEL) = 0.1 mg/kg/day], 27,473 (acute NOAEL = 0.5 mg/kg/day), and 1,648 (chronic NOAEL = 0.03 mg/kg/day). All MOEs are greater than the U.S. Environmental Protection Agency (EPA) target MOE of 1,000 for infants, children, and females 13-50 years of age (U.S. EPA 2002).
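The tier-1 calculation above can be retraced as follows. This sketch uses only the inputs stated in the letter; the function names are illustrative. It assumes the dose is rounded to 0.0000182 mg/kg/day before the margins are computed, because that rounding reproduces the letter's figures (the unrounded dose gives slightly smaller margins).

```python
# Inputs as stated in the letter; function and variable names are illustrative.
AIR_CONC_MG_M3 = 0.000033    # 20-day median chlorpyrifos concentration in air
INHALATION_M3_DAY = 6.8      # child inhalation rate
BODY_WEIGHT_KG = 12.3        # mean body weight, 1- to 2-year-old child
TARGET_MOE = 1000            # U.S. EPA target MOE cited in the letter

NOAELS_MG_KG_DAY = {
    "acute inhalation": 0.1,
    "acute": 0.5,
    "chronic": 0.03,
}


def inhalation_dose(conc_mg_m3: float, rate_m3_day: float, weight_kg: float) -> float:
    """Daily inhaled dose (mg/kg/day), assuming 24-hr exposure at conc_mg_m3."""
    return conc_mg_m3 * rate_m3_day / weight_kg


def margin_of_exposure(noael: float, dose: float) -> float:
    """MOE = NOAEL / estimated exposure."""
    return noael / dose


dose = inhalation_dose(AIR_CONC_MG_M3, INHALATION_M3_DAY, BODY_WEIGHT_KG)
dose_rounded = round(dose, 7)  # assumption: the letter rounds to 0.0000182

moes = {
    label: margin_of_exposure(noael, dose_rounded)
    for label, noael in NOAELS_MG_KG_DAY.items()
}

for label, moe in moes.items():
    print(f"{label}: MOE = {moe:,.0f} (exceeds target: {moe > TARGET_MOE})")
```

A larger MOE means the estimated exposure lies further below the NOAEL, so the comparison of interest is whether each margin exceeds the target of 1,000.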
More problematic is that Harnly et al. (2005) stated that "Recent cellular, animal, and human evidence of toxicity, particularly in newborns, supports the public health concern indicated by initial risk estimates."
The authors did not provide a sufficiently thorough review of the literature relevant to risk assessment to support or refute their statement. In the case of chlorpyrifos, this statement cannot be supported by the available evidence. The principal problem with Harnly et al.'s approach is not unique to their article. Appropriate risk assessment requires appropriate data, and, as simple as this relationship sounds, it is often ignored. Harnly et al. (2005) cited toxicity and epidemiologic studies, but these particular studies are not appropriate for use in risk assessment. This problem has become so pervasive that Conolly et al. (1999) clarified the basic features of toxicology studies that are and are not appropriate for use in risk assessment. Harnly et al. (2005) should not have included the findings of Qiao et al. because the high doses, subcutaneous route of administration, and carrier were inappropriate for toxicologic risk assessment (Conolly et al. 1999; Zhao et al. 2005). Indeed, Slotkin (2004), a coauthor of Qiao, has written that there is little academic interest in relevant routes of exposure or pharmacokinetics. He stated that "Practical issues that are critical to standardized testing are de-emphasized, such as pharmacokinetics/toxicokinetics, the matching of routes of exposure to those of humans in industrial, agricultural or domestic settings, or the development of biologically-based dose response models of established hazards. In that sense, the academic approach is entirely deficient in those attributes that are necessary components of the application of research findings to regulatory science."
Harnly et al. (2005) cited  as a source of concern for adverse consequences of organophosphate exposure. To be complete,  provide more information. They stated, "We failed to demonstrate an adverse relationship between fetal growth and any measure of in utero organophosphate pesticide exposure."
An association was found for a couple of variables and decreased gestational duration, but the conclusion was that these potential pesticide effects appear to have "little clinical impact at the population level." Finally, air concentrations have been shown to translate poorly into systemic exposure.  showed that children in houses treated with chlorpyrifos had no detectable increase in urinary 3,5,6-trichloropyridinol (TCP), whereas median peak ambient air chlorpyrifos increased > 10-fold (median of 14 ng/m³ pretreatment, 196 ng/m³ on day of treatment). If a 10-fold increase in air chlorpyrifos does not cause a detectable increase in urinary TCP, then the 1-fold background air cannot be contributing measurably to the children's background levels of urinary TCP.

Organophosphates and Outdoor Air: Harnly et al. Respond
Peterson is incorrect in stating that we did not conduct a risk assessment. We summarized in our article (Harnly et al. 2005) and detailed in a previous article (Lee et al. 2002) a human health risk assessment demonstrating elevated acute and subchronic risks for children's exposures to ambient chlorpyrifos air concentrations in agricultural communities. Compared with the assessment by Peterson, we used a more refined probability distribution analysis, included air levels of the degradation product (chlorpyrifos oxon), and presented exposures relative to reference values. The chlorpyrifos reference values, however, were based on the same no-observed-adverse-effect levels (NOAELs) and the same 10-fold intraspecies, interspecies, and child uncertainty factors used in Peterson's calculations.
In our article (Harnly et al. 2005), we suggested that our risk assessment may underestimate risks for several reasons, such as a) risk assessments that use NOAELs, and not the entire dose-response curve, tend to underestimate risks (Castorina and Woodruff 2003); and b) the true ranges of intraspecies and interspecies variability are unknown and may be larger than the factors used (Faustmann et al. 2000). In a very recent study, some newborns were 65-130 times less able to metabolize diazoxon and chlorpyrifos oxon than their mothers (Furlong et al. 2006). To further support the concern for children indicated by our quantitative risk assessment, we cited toxicologic studies establishing that in addition to cholinesterase inhibition, on which the NOAEL for chlorpyrifos is established, chlorpyrifos and chlorpyrifos oxon have other neurodevelopmental toxicity mechanisms (Huff et al. 1994). We also noted that cell death has been induced at the reference dose for drinking water (Greenlee et al. 2005).
Environmental Health Perspectives • VOLUME 114 | NUMBER 6 | June 2006
Peterson argues that the toxicologic studies we cited (Castorina and Woodruff 2003; Faustmann et al. 2000; Greenlee et al. 2005; Huff et al. 1994) are an insufficient review of the "literature relevant to risk assessment" and that these studies are not appropriate for use in risk assessment. However, in missing the fact that we conducted a quantitative risk assessment, Peterson is misinterpreting our citations as the only basis for our public health concern. We consider it our public health responsibility to at least qualitatively consider recent toxicologic data in addition to a quantitative risk assessment based on established reference values. Others have argued for a complete restructuring of risk assessment for children, including toxicokinetic modeling and assessment of cellular and molecular outcomes over the entire lifespan of experimental subjects (Landrigan et al. 2004).
For many reasons, we disagree with the suggestion that the epidemiologic fetal growth and gestational duration findings of  may be used to disregard concern for in utero and child organophosphate exposure highlighted by . The associations of reduced gestational duration with dimethyl organophosphate urinary metabolites and cholinesterase inhibition were not clinically significant in the California population studied (recent Mexican immigrants who tend to have very healthy birth outcomes). However, a shortened gestational age of a half-week would represent, for some women, a risk of preterm delivery. Clearly, this finding and the absence of any adverse association between fetal growth and measures of in utero pesticide exposure need to be confirmed or refuted. To be complete, however, we also cited the association found in a New York City population between low birth weight and length and cord plasma levels of chlorpyrifos and diazinon (n = 314) (Whyatt et al. 2004). Further, effects of organophosphate pesticide exposure on early child neurodevelopment have been found (Young et al. 2005) and are continuing to be evaluated in the California and the New York City cohorts. Finally, public health policy is typically developed to protect against a 1 in 1,000, or lower, risk, and the epidemiologic studies cited here are below the sample size necessary to detect such risks.
Peterson notes that a study of children in 10 homes did not demonstrate an association with child urine metabolite levels of chlorpyrifos and ambient air levels following crack and crevice treatment. Yet, the authors of that study were careful to note a number of study limitations, including the variability and accuracy of the child urinary metabolite readings. We also note that chlorpyrifos oxon, which also breaks down into the measured urinary metabolite, was not measured in air; air concentrations in four of the study homes were not elevated compared to pretreatment levels; and personal air samples were not collected. Among mothers in New York City (n = 314) in another study, 48-hr personal air samples collected during pregnancy were associated with cord and maternal blood levels of chlorpyrifos (Whyatt et al. 2004). This is the same study population within which an association with adverse birth outcomes and pesticide cord blood levels has been demonstrated, and the chlorpyrifos air levels are in the same (average, 15 ng/m³) range, if not lower, as those evaluated in our health risk assessment (Whyatt et al. 2004).

Effects of BPA in Snails
It is an ethical requirement that new findings be presented in light of and in conjunction with a balanced evaluation of the current knowledge and published literature. We believe that Oehlmann et al. (2006) violated this general principle in several ways. For example, the authors inferred that prosobranch snails have a functional estrogen receptor and therefore a much higher sensitivity to estrogens and endocrine-disrupting compounds (EDCs) than other species previously reported in the literature. We found several other problems in their article.
First, Oehlmann et al. (2006) did not reveal the source of the animals used in their study, thus prohibiting independent repetition of the experiments by others.
Second, the authors stated that male and female Marisa cornuarietis cannot be distinguished morphologically without killing the animals. Therefore, the lack of data on the sex distribution of the animals sampled at each time-point leads us to question the stability of the experimental conditions with regard to sex ratios and thus reproductive conditions. Furthermore, the rapidly changing snail density, and hence the sex distribution at each sampling time point, certainly influenced the remaining animals with respect to mortality and fecundity.
Third, the experimental design and the lack of replication (Experiment 1) did not allow for sound statistical analysis; the statistical methods used were inappropriate, making correct interpretation impossible. Of most concern to us was the analysis of data by analysis of covariance (ANCOVA), mainly because the ANCOVA-inherent assumption of independency of the dependent variable (i.e., total number of eggs) is violated. Thus, small differences among aquaria (treatment groups) might have been propagated over time, resulting in the impression of large differences.
Fourth, we believe that carrying out receptor binding experiments only in duplicate and without Scatchard analysis is questionable per se. The number of concentrations tested was extremely limited and consequently cannot allow accurate description of binding curves. Oehlmann et al.  the assessment of unspecific binding and the reported IC50 values (concentration causing 50% inhibition) are approximately three orders of magnitude higher than what would be expected if this were a real sex-steroid receptor interaction. Because tamoxifen did not elicit a typical and highly specific receptor binding curve (Figure 3), we question the use of tamoxifen as an "antiestrogen" in this in vivo study.
Finally, the data in Figure 1B were published previously (Schulte-Oehlmann et al. 2001), yet the originally published data did not incorporate 17α-ethinylestradiol (EE2) as positive control. Moreover, the EE2 curve in Figure 1B appears identical to the one on slide 14 from a slide presentation available on Oehlmann's website (Schulte- ). The use of a positive control is commendable when the mode of action is known [National Toxicology Program (NTP) 2001]; however, as in the study of  knowledge precludes the inclusion of a positive control as proof-of-principle. Slide 14 (Schulte- ) demonstrates that EE2 does not have a monotonic mode of activity in M. cornuarietis, but rather appears to stimulate egg laying at 10-25 ng EE2/L, inhibit egg laying at 50 ng EE2/L and has no effect at 1 and 100 ng EE2/L. On the basis of in vitro and in vivo  we question the presence of any estrogen receptor-like interaction. In view of the NTP (2001) definitions and use of controls, the use of EE2 as a "positive" control, with its nonmonotonic and nonhormetic dose-response curve in comparison with BPA (which has a presumably monotonic response curve), as well as the use of an antiestrogen (tamoxifen), is inappropriate.
In conclusion, the data presented by  Flaws in the experimental design, data presentation, and interpretation as well as statistical analyses render their findings untenable. Furthermore, the "Introduction" and "Discussion" of their article were written in a way that could be considered highly imbalanced and indeed alarmist. The highly selective inclusion/omission and discussion of previously published research that contradicts the authors' opinion (e.g., Pickford et al. 2003) is particularly disturbing. It is our opinion that our evaluation of the Oehlmann et al. work serves as a useful reminder to scientists that we must constantly strive to formulate clear hypotheses, use sound experimental designs, employ appropriate statistics, and draw conclusions that are supported by the available data and that reflect a balanced assessment of the scientific literature to avoid jumping to erroneous conclusions.
