Correspondence

Re: "Have Sperm Densities Declined? A Reanalysis of Global Trend Data"

Last year Swan et al. (1) published a reanalysis of data from 61 studies originally compiled and analyzed by Carlsen et al. (2). Just prior to the appearance of the Swan et al. article, we published a reanalysis in another journal (3).
Regional differences were considered in both reanalyses, but we examined only the effect of year in the final models (fertility status was also considered in the initial model), whereas they included several additional indicators culled from each study. Although the results in the two papers for the U.S. studies were very similar (coefficients for the effect of year of -1.3 and -1.5 in their paper and ours, respectively), Swan et al. (1) reported a significant decline in sperm counts over time for Europe, whereas we found a nonsignificant decline. We doubted that this difference was due to confounding with the additional covariates that they included, so we decided to explore.
We found the reason for the difference to be that Swan et al. reanalyzed only a subset of the studies from the Carlsen et al. compilation (2). While dropping "two studies that included men who conceived only after an infertility workup" (1) seems justified on scientific grounds, dropping three non-English language studies was arbitrary, inappropriate, and led to the different results.
Two of the three non-English papers were from Europe and were written in Danish and German in decades before English dominated the scientific literature as it does today. These two studies, contrary to an assertion of Swan et al. (1) in their discussion, have sperm count values that are low relative to later studies done in Europe, so the slope is nonsignificant when they are included (in our analyses). Swan et al. also included an Australian study with the European ones; this would make sense if one had a hypothesis that there was a genetic or cultural cause of differences in sperm counts, but would be inappropriate if counts were hypothesized to vary with climate or environmental factors. Actually, the inclusion or exclusion of the Australian study influences the fits only trivially. Figure 1 shows the linear regression fits for the data used in the Becker and Berhane [...] Table 1. The value for 1944 was excluded because it is clearly not part of the quadratic pattern.
[Figure 1. Horizontal axis: Year.]
In conclusion, the significant and very marked decline that Swan et al. (1) found for Europe was an artifact of their inappropriate sampling from the original studies. If the two non-English studies from 1944 and 1971 are included, there is no significant decline over the entire period. However, a significant nonlinear pattern is found, with an increase until about 1980 followed by a decrease. Such a significant quadratic pattern was not found in either the United States or in the other regions combined (not shown). We lack an explanation for the observed pattern in Europe, but since the Carlsen paper appeared, a number of other papers with more recent data from Europe have been published [see references in Becker and Berhane (3)].
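The contrast between a nonsignificant linear trend and a clear quadratic pattern can be sketched numerically. The example below is only an illustration under stated assumptions: the values are synthetic (a rise followed by a fall, plus small fixed perturbations), not the sperm density data discussed here, and the helper names `solve`, `polyfit`, and `rss` are hypothetical. It fits a straight line and a quadratic by ordinary least squares and compares the residual sums of squares.

```python
# Illustrative sketch only: synthetic values, NOT the data from the studies discussed.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] from the normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)

def rss(xs, ys, coefs):
    """Residual sum of squares of the fitted polynomial."""
    return sum(
        (y - sum(c * x ** i for i, c in enumerate(coefs))) ** 2
        for x, y in zip(xs, ys)
    )

# Hypothetical study years, centered and scaled to 5-year units for numerical stability.
years = list(range(1940, 1995, 5))
t = [(year - 1965) / 5 for year in years]

# Assumed rise-then-fall pattern with small fixed perturbations (made-up numbers).
noise = [1, -1, 0, 2, -2, 1, -1, 0, 2, -2, 0]
values = [90 - 2 * ti ** 2 + e for ti, e in zip(t, noise)]

linear = polyfit(t, values, 1)      # [intercept, slope]
quadratic = polyfit(t, values, 2)   # [intercept, slope, curvature]

print("linear slope per 5 years:", linear[1])
print("quadratic curvature term:", quadratic[2])
print("RSS linear vs quadratic:", rss(t, values, linear), rss(t, values, quadratic))
```

In this synthetic series the fitted straight line is nearly flat while the curvature term is clearly negative, which is the kind of pattern a linear-only analysis over the whole period would miss. A formal comparison would add a significance test for the quadratic term, which this sketch omits.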
There are several methodological morals to this story. First, single data points can have considerable influence in linear regression, particularly when the total number of sample points is small. Only very careful inspection of residuals from the linear regression over the entire period would allow one to spot the nonlinearity in this case. Second, it is inappropriate and parochial to only accept English-language studies in scientific meta-analyses.
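The first point, the influence of single data points in a small sample, can be illustrated with a leave-one-out check. The sketch below uses made-up numbers (seven hypothetical points, one of them an unusually high early value) and hypothetical helper names `ols_slope` and `leave_one_out_slopes`; it is not a reanalysis of any of the studies cited. It refits the slope with each point omitted in turn and reports which omission changes the slope most.

```python
# Illustrative sketch only: synthetic values, NOT data from any cited study.

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def leave_one_out_slopes(xs, ys):
    """Slope refitted with each observation removed in turn."""
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]

# Hypothetical small sample: years since the first study, and a made-up outcome.
xs = [0, 10, 20, 25, 30, 35, 40]
ys = [140, 82, 80, 79, 81, 78, 80]  # the first value is deliberately extreme

full_slope = ols_slope(xs, ys)
loo = leave_one_out_slopes(xs, ys)

# Index of the observation whose removal shifts the slope the most.
most_influential = max(range(len(xs)), key=lambda i: abs(loo[i] - full_slope))

print("slope with all points:", full_slope)
print("slopes with one point removed:", loo)
print("most influential index:", most_influential)
```

With so few points, dropping the single extreme early value turns a steep apparent decline into a nearly flat line, so the conclusion rests almost entirely on one observation. Standard diagnostics such as Cook's distance formalize the same idea.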

Response: Sperm Density Declines
Becker and Berhane take issue with the exclusion of three non-English language studies (1-3) from our reanalysis of the 61 studies on sperm density (4) that were included by Carlsen et al. (5). This objection raises two issues.
First, could we have used these studies in our analysis? We would argue that we could not. Unlike Becker and Berhane, whose own reanalysis (6) did not require any data other than what was published in Carlsen et al. (5), our multivariate analysis (4) required that we read the underlying studies. Otherwise, we would not have been able to abstract the detailed information on variables, such as age, abstinence time, and method of sample collection, that we included in our multivariate analysis. Moreover, not being fluent in German, Spanish, and Danish, we were not able to ascertain the eligibility of these studies.
A 420 • Volume 106, Number 9, September 1998 • Environmental Health Perspectives
Second, should we have used these three studies in our analysis even if we were able to read them? We would argue that they should not have been included because these few studies are unlikely to represent all eligible non-English language studies published between 1938 and 1990. To determine the volume of non-English language articles in this field, we reviewed the Medline listing for 1989 publications obtained by Carlsen et al. (5). We selected 1989 for this review because this was the last complete year included by Carlsen et al. and was therefore likely to have the fewest non-English publications during the study period if, as stated by Becker and Berhane, "English dominated the scientific literature" in recent decades. Of the 244 studies included, 58 (24%) were in languages other than English, with sixteen languages represented. Our Medline review suggested that Becker and Berhane's perceived dominance of the scientific literature by the English language may be the "parochial" view, rather than ours. This review also suggested that it is unlikely that the three non-English language studies included by Carlsen et al. (all published before 1972) represented all eligible non-English studies; thus, there was no reason that these three alone should have been included. This application of our exclusionary criteria appears better justified than Becker and Berhane's post hoc exclusion of the study by Varnek (1) simply because "It is clearly not part of the quadratic pattern."
Finally, as noted in our paper, data from additional European studies suggested that sperm densities in Europe tended to be high early in the study period. Davidson (4), not included by Carlsen et al. (5) although eligible, reported a mean density of 143 × 10⁶/ml in 1949.
Further, the mean sperm density from five studies published in 1944-1962, which included 2,456 infertile European men (8-12), was 98.5 × 10⁶/ml. It is reasonable to assume that sperm densities from fertile European men would have been at least as high and therefore would not support the quadratic model with low sperm counts in Europe prior to 1975, as proposed by Becker and Berhane in their letter [although not in their own analysis (6)].
Since its publication in 1992, the analysis by Carlsen et al. (5) has been widely discussed; our recent Medline search found it cited 231 times. It is unlikely that further discussion will resolve all remaining disagreements. Nevertheless, the conclusion of a mean decline in sperm density of about 1% per year is quite robust and is the same whether the analysis is based on the original 61 studies or only the 56 studies we included (4). Therefore, we suggest that at this point, efforts might best be spent elsewhere. Studies to rigorously estimate cross-sectional differences in semen quality are currently ongoing in several countries; these should provide reliable information about geographic variation in semen quality. Comparable data on temporal variation must await the results of prospective longitudinal studies.