Arsenic contamination in West Bengal and Bangladesh: statistical errors.

In their paper “Groundwater Contamination in Bangladesh and West Bengal, India,” Chowdhury et al. (1) address arsenic contamination in groundwater from two countries in Asia. Arsenic contamination is a serious concern worldwide, and Chowdhury et al. estimate the proportion of people at risk in these areas by measuring arsenic levels using various biochemical parameters; however, there are errors and missing information in their statistical presentation. As far as possible, I would like to clarify the statistical presentation of their data.
Chowdhury et al. (1) did not classify all the cases for the Bangladesh data in Figure 2 of their paper. The total of the percentages shown for Bangladesh is 98.9; thus 1.1% is missing. This 1.1% represents 121 cases that were not classified. Fortunately, the percentage of tube wells affected by arsenic (100-1,000 µg/L) was provided in the text. These 121 cases may have been omitted from the first two class intervals. Therefore, I distributed these cases between the first two class intervals for further analysis, namely 61 cases to the first interval (10-50 µg/L) and 60 cases to the second class interval (51-99 µg/L).
The mean arsenic level (± SD) was 186.16 ± 225.23 µg/L for tube well water in Bangladesh and 67.00 ± 107.84 µg/L in West Bengal. The difference between these mean levels of arsenic is statistically significant (p < 0.001). This significant difference reveals that, on average, the groundwater arsenic contamination in Bangladesh is about 2.8 times that in West Bengal. Similarly, the proportion of tube wells containing water contaminated with arsenic at concentrations > 50 µg/L differs significantly (p < 0.001) between these countries.
The second error is in Table 2 of Chowdhury et al.’s paper (1), under the skin scale of West Bengal. The given SD of 4,750 is not consistent with other results shown in the table. The skin scale arsenic level ranged from 1,280 to 1,550 (µg/L), a range of 270. Thus, the given SD of 4,750 is not possible. From Chowdhury et al.’s (1) Table 2, I calculated the mode value and obtained approximate values of the first and third quartiles (2). Chowdhury et al. used seven values in the analysis of their data, but because the mode was not well-defined for urine data from West Bengal and hair and urine data from Bangladesh, only six values could be used to calculate the mode. I used these values for further analysis. I calculated the correlation matrix (3) for West Bengal (Table 1) and Bangladesh (Table 2) to determine the linear relationship of arsenic concentrations among the biochemical parameters. For the Bangladesh data (Table 2), the nail arsenic level and the skin scale arsenic level have perfect correlation. Moreover, the nail arsenic level includes the normal range shown by Chowdhury et al. in their Table 2. Although Chowdhury et al. (1) declared that there is no normal arsenic level for skin scale, it is possible to use these data to determine the corresponding skin scale arsenic level (micrograms per kilogram) by simple regression analysis (4); that is, for a given nail arsenic level, it is possible to determine the skin scale arsenic level using the following linear regression equation:

Skin scale arsenic (µg/kg) = 180.75 + 0.663 × nail arsenic (µg/kg).
The regression coefficient is statistically significant (p < 0.001). Because the correlation is 1, R² = 1; that is, 100% of the variance of the dependent variable (skin scale arsenic) is explained by the independent variable (nail arsenic). The analysis of variance for the fitted model is also significant (p < 0.001).
If nail arsenic is 430 µg/kg, skin scale arsenic will be 466 µg/kg; if nail arsenic is 1,080 µg/kg, skin scale arsenic will be 897 µg/kg. Therefore, when the nail arsenic level is in the normal range, the skin scale arsenic will be 466-897 µg/kg on average.
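The two endpoint values can be reproduced from the regression equation with a few lines of Python. This is only an illustrative sketch: the function name `predict_skin_scale` is hypothetical, and the coefficients and the nail arsenic endpoints (430 and 1,080 µg/kg) are the ones quoted in this letter.

```python
# Coefficients as quoted in the regression equation of this letter.
INTERCEPT = 180.75  # µg/kg
SLOPE = 0.663       # µg/kg skin scale arsenic per µg/kg nail arsenic


def predict_skin_scale(nail_arsenic: float) -> float:
    """Predicted skin scale arsenic (µg/kg) for a given nail arsenic (µg/kg)."""
    return INTERCEPT + SLOPE * nail_arsenic


# Endpoints of the normal nail arsenic range used in the text (µg/kg).
NORMAL_NAIL_RANGE = (430.0, 1080.0)

low, high = (predict_skin_scale(x) for x in NORMAL_NAIL_RANGE)
print(f"{low:.2f} to {high:.2f} µg/kg")  # 465.84 to 896.79 µg/kg
```

Rounded to whole numbers, the two endpoints are 466 and 897 µg/kg.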

Statistical Errors: Chakraborti's Response
Marimuthu reported two errors in our paper (1). First, I would like to respond to his comment that we had omitted 1.1% of the values (121 cases). In our original manuscript, we provided actual values to one decimal place, but because the numbers overlapped in our Figure 2, we rounded off the values. The actual values are 27.7, 14.2, and 10.2; the decimals dropped from these values (0.7 + 0.2 + 0.2) make up the missing 1.1% that Marimuthu reported as our first error. However, one can easily see on the y-axis of our Figure 2 that the bar for < 10 µg/L arsenic in Bangladesh is nearer 28% than 27% (the actual value is 27.7%), even though the number above the bar is 27.
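The arithmetic behind the missing 1.1% can be checked as follows. This is a minimal sketch: the three actual percentages are those quoted above, while the whole-number labels are an assumption for 14.2 and 10.2 (only the label 27 is stated explicitly).

```python
# Actual percentages quoted in the response.
actual = [27.7, 14.2, 10.2]

# Assumed figure labels: the same values with the decimals dropped.
shown = [int(value) for value in actual]  # [27, 14, 10]

# Total percentage lost by dropping the decimals.
deficit = sum(a - s for a, s in zip(actual, shown))
print(round(deficit, 1))  # 1.1
```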
The second error reported by Marimuthu came about when we converted values from milligrams per kilogram to micrograms per kilogram. The actual value is 15,500 (not 1,550). However, it is easy to see that the number is in error because the maximum value cannot be 1,550 when the mean and median values are 6,820 and 4,460, respectively. Correlation coefficients between hair, nail, urine, and water arsenic were discussed by Biswas et al. (2), which we referenced in our paper.
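The consistency argument made in both letters can be written as a simple check on the summary statistics. The sketch below is an illustration, not a method used by either author; the function name `summary_is_feasible` is hypothetical. It relies on two general facts: the mean and median must lie between the minimum and maximum, and a sample SD can never exceed the range divided by √2 (the extreme case of two observations at opposite ends of the range).

```python
import math


def summary_is_feasible(minimum, maximum, mean, median, sd):
    """Return True if the summary statistics could describe one data set."""
    value_range = maximum - minimum
    # Largest possible sample SD for any n >= 2 values within the range.
    sd_bound = value_range / math.sqrt(2)
    return (
        minimum <= mean <= maximum
        and minimum <= median <= maximum
        and sd <= sd_bound
    )


# Skin scale values as printed (maximum 1,550) and as corrected (15,500).
as_printed = summary_is_feasible(1280, 1550, 6820, 4460, 4750)
as_corrected = summary_is_feasible(1280, 15500, 6820, 4460, 4750)
print(as_printed, as_corrected)  # False True
```

With the printed maximum, both the SD and the mean fail the check; with the corrected maximum, the reported values are mutually consistent.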
Finally, we appreciate Marimuthu's mode approach for statistical evaluation of data, and we will keep his suggestion in mind.
All of the correlation coefficients are statistically significant (p < 0.001). Skin scale data was not used due to the inconsistency of the data (n = 6 because the mode was not well-defined).

Scientific Theory versus Legal Theory
I would like to respond to John Cairns' editorial on the developing role of ecotoxicology (1). Cairns' belief that a paradigm shift to assess the value of natural capital is crucial for this planet's sustained use is understandable and warranted; however, in pursuing this goal, Cairns fails to realize that the shift, especially in a democracy, must be directed from below, from the body politic, and not from scientists, lawyers, or politicians. Cairns (1) correctly assesses lawyers and politicians (their knowledge of science is limited and self-serving), but after years of hearing complaints about lawyers and politicians, and years of working with scientists and engineers, I have concluded that the narrow-minded, egotistical, and self-centered disposition attributed to lawyers is more a result of human nature than inherent to the profession, meaning that scientists and engineers suffer from the same qualities. The belief that science offers an absolute, irrefutable answer to the problems of the planet reflects an overwhelming faith in science and neglects the impact of other dynamics such as human nature, the natural world, and the physical world itself.
Legal theories undergo the same torturous scrutiny as scientific theories: they are subjected to peer review, challenges, and political judgments, and they ultimately survive by the test of time. Occasionally, theories are placed before the Supreme Court, the final arbiter in the United States; the court makes rulings that may later be overruled, modified, or accepted. The Supreme Court, though loath to admit it, is made up of individuals: individuals who study law and apply it equally based on the case or theory before the court, and who more often than not disagree with one another. These are individuals who have been chosen by a political process; they have been appointed by a government made up of individuals who have their own particular views of legal theory.
How can a science court be different? Is science so immune from controversy that any scientific theory would be incontrovertible by science judges? Any judge, ultimately, will be appointed by some political process, and to claim that any set of individuals, even scientists and engineers from the National Academy of Sciences, could absolutely determine scientific theories without disagreement is a farce. Debates will occur; this is ultimately the strength, not the weakness, of scientific theory as well as legal theory.
Although scientific principle is irrefutable, scientific theory is not, and just as with political and legal theory, scientific theory becomes justified through debate, analysis, and time. Scientific theory is no more absolute than law, and although a judiciary more enlightened on science may be warranted, the idea of a science court is fraught with the same inconsistencies as a court of law.
For a scientific theory, no matter how legitimate, to move forward legally and politically, the theory must be subjected to continuous scrutiny and must be able to stand before the courts of law and politics before it can actually change human nature.

Doug Farquhar

Doubtless, science courts would be as flawed as other human institutions. Anyone who believes sustainable use of the planet is possible also believes that human institutions of all varieties can be improved. It was reassuring to note Farquhar's statement that "… a judiciary more enlightened on science may be warranted." My purpose in writing the editorial (1) was to encourage debate by suggesting that the ways in which environmental judgments are being made can be improved. Knowledge of science is as important in making scientific judgments as legal knowledge is in making legal judgments. Throughout my career, I have been careful to note that risk or hazard is a probabilistic determination that requires scientific evidence, whereas risk management involves both probabilities of harm and knowledge of societal tolerance for risks. The quality of the evidence of risk is best judged by persons with appropriate academic qualifications. Value judgments involved in managing risks, on the other hand, are best made in a democratic fashion by an informed citizenry and/or their elected representatives. Neither scientists nor lawyers should make value judgments for other people. I also affirm that lawyers are no more narrow-minded, egotistical, and self-centered than any other group of professionals.
It is my understanding that precedent is very important in matters of law. In this regard, it is worth noting that natural law preceded human societal law by a substantial temporal span. Those who study these laws are categorized as natural scientists. Because natural science is a dynamic field, disputes and paradigm shifts are the norm. In the field of ecotoxicology, the subject of my editorial (1), the rate of change in the last 50 years has been astonishing. Still, no theory is incontrovertible, and the best judges of the confidence that may be placed in it are those who are well acquainted with validating or confirming evidence as well as the uncertainties; for scientific theories, this would be scientists with peer-reviewed publications in the area of concern.
Elsewhere (2), I have noted that sustainable use of the planet will require a) compassion for other living humans who may be less fortunate than we are, b) compassion for future generations, and c) compassion for other living creatures with whom we share the planet. To this I now add compassion and empathy with other professions and disciplines (e.g., science, law, engineering, philosophy, sociology, economics), so that each is permitted to judge the adequacy and robustness of the evidence it is best equipped to judge in a way that will best represent the current knowledge in the field. Once a probabilistic determination has been made, the value judgments should be left to the citizens and/or their representatives. In my editorial (1), I merely suggested that scientists should be permitted to evaluate scientific evidence on their own terms, not on those of another profession. If my health were concerned, I would hope that medical professionals would be permitted to judge the evidence, unhampered by the constraints of another profession. I hope the health of the earth's ecologic life support system will get comparable consideration.

Inaccurate Models for Mixtures
The data presented by Payne et al. (1) do not appear to support the authors' conclusion that their models, based on simple additivity of concentrations, accurately predicted the effects of mixtures of xenoestrogens.
The concentration-response data reported for the mixtures do not correspond with those predicted by the models (1). This is especially noticeable at the higher concentrations. For all of the mixtures, the responses decline at the higher concentrations [Figures 2-4, Payne et al. (1)]. Indeed, for the mixtures reported in their Figures 2 and 3, the highest response at the highest concentration used is lower than the lowest response at the second highest concentration. Competition for a common receptor would be expected in this system when the chemicals are present at sufficiently high concentrations. Such competition should result in antagonism and a decline in response. This expected result, confirmed by their data, is not included in either of their additive models, which would explain the deviation from the predicted concentration-response curves in the cited figures. Many models have been developed for competition at common receptors and/or enzymes [e.g., (2-6)], and these would be expected to predict more accurately the results of the mixtures of xenoestrogens.
Furthermore, in Figure 4, Payne et al. (1) report an unacceptably high range of responses for some concentrations of the mixture. Several specific concentrations of this mixture have a reported range of responses that exceeds one-fourth of the full range of responses for all of the concentrations tested (which span several orders of magnitude). At the highest concentration, moreover, the range of responses was more than one-half the full range of responses; indeed, the lowest response at the highest concentration is approximately the same as the lowest response at a concentration almost 100-fold lower.
Payne et al. (1) based the models for the mixtures on data derived from the concentration-response curves for the individual chemicals. For two of these chemicals, however, o,p´-DDT and 4-nonylphenol [Figures 1A and 1D, respectively (1)], a substantial number of the data points are outside the 95% confidence limits of the predicted response for each chemical alone. For one concentration of 4-nonylphenol (approximately 3 µM), the range of responses (approximately 0.3-1.2) reported is about one-half of the range of responses (approximately 0-1.6) reported for the full range of concentrations tested, which again spanned several orders of magnitude.
Taken together, these observations of the reported results require more explanation than the simple additive models considered by Payne et al. (1). All of their mixtures demonstrated a decline in response at high concentrations, and both of their models require an increasing or constant response as concentrations increase. Therefore, at a minimum, the models cannot be accurate at high concentrations.

Inaccurate Models for Mixtures: Kortenkamp's Response
The aim of our paper (1) was to explore whether the combined estrogenic effects of mixtures of xenoestrogens could be predicted successfully on the basis of the potency of individual mixture components. Putzrath's criticism of our study focuses primarily on the responses we observed with high concentrations of xenoestrogen mixtures. At these concentrations marked reductions of estrogenic effects became noticeable. This phenomenon is frequently observed in the YES (yeast estrogen screen) assay (2) and is a manifestation of toxic effects on the yeast cells, as was clearly pointed out on page 986 of our paper (1). It is definitely not due to competition for the estrogen receptor in yeast cells, as suggested by Putzrath.
During the assessment of estrogenic effects of chemicals in yeast cells, toxic effects introduce anomalies to the concentration-response curves for estrogen receptor activation, and these confound the assessment of estrogenic effects. Therefore, toxic effects must be carefully distinguished from estrogenic responses, and the assay should not be run with concentrations of test agents the yeast cells cannot tolerate. Furthermore, no dosimetric model is able to deal with estrogen receptor activation and toxic effects at the same time. For these reasons, the data points at high mixture concentrations could not be included in the regression analysis for estrogenic effects, which was also clearly stated in our paper. We have nevertheless chosen to present these observations because we (like Putzrath) were intrigued by the toxicity that occurred at high concentrations of all mixtures. However, toxicity was not, and could not be, the object of our analysis. We maintain that our data show decisively that the combined estrogenic effect of all four xenoestrogens is additive. There was good agreement between the various predictions made on the basis of the individual effects of each mixture component and the observed combination effects. Therefore, in emphasizing the discrepancies between observed and predicted effects at high mixture concentrations, Putzrath misses the point of our work entirely. Our models are not accurate at these concentrations, nor are they intended to be.
Putzrath also criticizes the spread of data points observed with the single agents and with some of the mixtures. Again, we disagree with his notion that this represents an unacceptably high variation. Our data were from different experiments, performed by different operators over a period of 3 months. Given the biological variation inherent in living organisms and the other possible sources of experimental error, we feel that the variation in our data is nothing out of the ordinary. The confidence limits shown in our figures are 95% confidence bands of the best estimate of the regression models, not the population means, and data points are bound to lie outside these limits.
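The distinction between a confidence band for a fitted regression curve and the scatter of individual observations can be illustrated numerically. The sketch below is not the analysis from our paper: it assumes a straight-line least-squares fit and uses invented data, and the function name `band_standard_errors` is hypothetical. It compares the standard error of the fitted mean response with the standard error for a single new observation at the same point; the same t multiplier would apply to both, so the second band is always the wider one.

```python
import math


def band_standard_errors(xs, ys, x0):
    """Standard errors at x0 for (fitted mean, single new observation)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard deviation with n - 2 degrees of freedom.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (n - 2))
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = s * math.sqrt(leverage)      # band for the fitted line
    se_new = s * math.sqrt(1 + leverage)   # band for an individual point
    return se_mean, se_new


# Invented illustration data (not taken from any experiment).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 1.9, 3.4, 3.8, 5.3, 5.7, 7.2, 7.9]

se_mean, se_new = band_standard_errors(xs, ys, x0=4.5)
print(se_new > se_mean)  # True
```

At the mean of the x values the ratio of the two standard errors is √(n + 1), so with eight points the band for individual observations is three times as wide as the band for the fitted line.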
Taken together, our studies have encouraged us to attempt the prediction and assessment of the effects of more than four xenoestrogens. It remains to be seen which of the two prediction models used in our paper can be applied productively to such mixtures.

Andreas Kortenkamp
The School of Pharmacy, Centre for Toxicology, London, United Kingdom
E-mail: A.Kortenkamp@cua.ulsop.ac.uk