Risk of internal cancers from arsenic in drinking water.

The U.S. Environmental Protection Agency is under a congressional mandate to revise its current standard for arsenic in drinking water. We present a risk assessment for cancers of the bladder, liver, and lung from exposure to arsenic in water, based on data from 42 villages in an arseniasis-endemic region of Taiwan. We calculate excess lifetime risk estimates for several variations of the generalized linear model and for the multistage-Weibull model. Risk estimates are sensitive to the model choice, to whether or not a comparison population is used to define the unexposed disease mortality rates, and to whether the comparison population is all of Taiwan or just the southwestern region. Some factors that may affect risk could not be evaluated quantitatively: the ecologic nature of the data, the nutritional status of the study population, and the dietary intake of arsenic. Despite all of these sources of uncertainty, however, our analysis suggests that the current standard of 50 microg/L is associated with a substantial increased risk of cancer and is not sufficiently protective of public health. ImagesFigure 1Figure 2Figure 3

A metal found in rocks and mineral formations in the earth's crust, arsenic has long been associated with the development of cancer in humans. Exposure can occur via inhalation, primarily in industrial settings, or through ingestion. Because drinking water is one of the primary routes of exposure, standards set in 1942 established a maximum contaminant level (MCL) of 50 pg/L in drinking water. In 1975, 50 pg/L was adopted as the interim standard in response to the 1974 Safe Drinking Water Act (1). In a 1984 health assessment, the U.S. Environmental Protection Agency (EPA) dassified arsenic as a dass A human carcinogen, based primarily on epidemiologic evidence, and produced quantitative risk estimates for both ingestion and inhalation routes of exposure (2). Although the EPA assessment for the inhalation route is well accepted, the risk assessment for ingestion remains controversial. The 1984 risk assessment for arsenic in drinking water was based on an epidemiologic study in Taiwan that examined an association between arsenic exposure via drinking water and skin cancer (nonmelanoma) (3). EPA investigators estimated that the lifetime risk of skin cancer for individuals who consumed 2 L water per day at 50 psg/L could be as high as 2 in 1,000. This high value prompted questions about the 1984 risk assessment, including applicability of the risk assessment to the U.S. population, the role of arsenic as an essential nutrient, the relevance of skin lesions as the basis for the risk assessment, and the role of arsenic intake via food. In 1988, the EPA Risk Assessment Forum published a revised skin cancer risk assessment and focused attention on these questions (4). The EPA is currently under a congressional mandate to finalize a new rule for arsenic in drinking water by 1 January 2001 (5).
There has been substantial focus on the association between arsenic and skin cancer, and there is also substantial evidence that exposure to arsenic in drinking water increases the mortality risk for several internal cancers. Increases in bladder and lung cancer mortality were found in a region of northern Chile (6). An association was also found between bladder cancer mortality and arsenic in drinking water in Argentina (7). Significant increased mortality was observed for males and females in Taiwan due to lung, liver, skin, kidney, and bladder cancer (8). The National Research Council presents a more detailed summary of the evidence linking arsenic exposure to internal cancer (1).
The purpose of this article is to present a risk assessment for mortality due to several internal cancers based on a reanalysis of the data reported by Chen et al. (8). Brown (9) discussed the limitations of the data available for analysis when the current EPA risk assessment (4) was prepared. For several reasons, it can be argued that the risk assessment of internal cancers presented in this paper yields more convincing results than the previous EPA assessment based on skin cancer. First, the current study focuses on mortality from bladder, lung, and liver cancers identified through national death records. In addition, unlike the Tseng et al. (3) study that was used in the EPA analysis, which grouped data into three broad exposure intervals [low (< 300 pg/L), medium (300-600 pg/L), and high (> 600 pg/L)], data now available provide exposure at the individual village level.
This paper is a follow-up to a preliminary study that focused only on bladder cancer and examined model sensitivity (10). The current analysis is expanded to include lung and liver cancers and examines issues of dose-response modeling by Poisson regression, in addition to application of the multistage-Weibull (MSW) model, in more detail.

Materials and Methods
Internal cancer data. Data used in this analysis were derived from a study in an arseniasis-endemic area of Taiwan (11)(12)(13). Cancer mortality data were collected from death certificates of residents of 42 villages during [1973][1974][1975][1976][1977][1978][1979][1980][1981][1982][1983][1984][1985][1986]. These data were originally collected in 1987, so only records up to 1986 were available. Causes of death were classified according to the Eighth Revision of International Classification ofDiseases, 1965 Revision (ICD) (14). The data consisted of person-years at risk and the number of deaths due to bladder (ICD code 188), lung (ICD code 162), and liver (ICD code 155) cancer in 5-year age increments for both males and females. Table 1 summarizes the internal cancer data and provides person-years at risk and observed number of cancer deaths by age, sex, and arsenic level. Although individual village arsenic levels are available and will be used in subsequent analyses, exposure levels are grouped in Table 1 for convenience of presentation. The numbers of bladder, liver, and lung cancers are given, along with the number of person-years at risk. For example, males between the ages of 50 and 69 contributed 21,040 person-years at risk and 6, 17, and 12 deaths were observed from bladder, liver, and lung cancer, respectively. Exposure data. Drinking water samples were collected from wells of 42 villages in 1964-1966 (12). The artesian wells were gradually closed; the last one dosed in mid-1970. Although mortality data were collected for a later time period (1973)(1974)(1975)(1976)(1977)(1978)(1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986), it is likely that arsenic levels in well water remained relatively unchanged over this time period. It could also be argued that because of the long latency of the cancers of interest, it is appropriate for exposure to be based on a time period 10 to 20 years before death. A strength of the currently available exposure data is that individual well concentration levels are available for each village. Physical and chemical characteristics of drinking water such as pH value and levels of arsenic, sodium, calcium, magnesium, manganese, iron, mercury, chromium, lead, nitrite and nitrate nitrogen, fluoride, and bicarbonate have been intensively studied in both Blackfoot disease-endemic and -nonendemic areas (15,16). Arsenic level was the only level that was significantly higher than the maximal allowable limit and strikingly different in water from shallow wells and artesian wells. The data also have some limitations. The drinking water was not tested for levels of dissolved radon and other a-emitters. Fluorescent compounds, especially humic acids, have been found in the well water. These fluorescent substances result from the decomposition of organic matter, particularly dead plants. However, it is unlikely that their presence causes confounding in this analysis because widespread contamination is not confined to the arseniais-endemic area.
Standardized mortality ratio. We used standardized mortality ratios (SMRs) to summarize the observed patterns of mortality in data. SMRs provide a popular approach to comparing mortality in a specific population with mortality from a suitable comparison population (17). SMRs correspond to ratios of observed and expected number of events and are calculated by 10/XEi, where 0, is the observed number of deaths in the ih age group and Ei is the corresponding expected number of deaths, calculated by multiplying the study population size (P,) by the age-specific cancer death rate (M) in a comparison population (i.e., Ei = P. x M). Usually SMRs are expressed as a percentage so that the value 100 x 12Oi l/Ei is the number reported.
There are concerns with using all of Taiwan as a comparison population because of the potential for bias associated with differences in the populations (e.g., rural vs. urban). For this reason, we considered two comparison populations in this analysis: all of Taiwan and the southwestern region of Taiwan (18). The latter is expected to provide a more suitable comparison basis for the study population, which is largely rural and fairly poor. Table 2 contains the data from the two comparison populations. The number of deaths due to bladder, lung, and liver cancers and person years at risk (PYR) were extracted by age group and sex for 1973-1986. Generalized linear model, Poisson modeling is often used in epidemiologic analysis, particularly for rare events such as cancer deaths. In fact, SMRs can correspond to maximum likelihood estimates of risk ratios from a Poisson model (17). In our analysis, we assumed that the number of deaths due to cancer follows a Poisson distribution with parameter equal to the person-years at risk multiplied by the hazard of dying of cancer. The hazard is often modeled as a function of age (t) and exposure (x). As described by Breslow and Day (17), a broad class of models can be characterized using the following general form, h(x,t) = ho(t) X g(x), [1] where h0(4 denotes the baseline hazard function that only depends on age, t, and describes the instantaneous hazard of dying of cancer for the unexposed population. The risk ratio attributed to exposure level x is denoted by g(x). Of course, it is likely that a variety of factors, including cigarette smoking, use of bottled water, and dietary intake of inorganic arsenic, could influence or even confound the model. The model described in Equation 1 will allow consideration of other covariates. Unfortunately, measurements for these and other potentially important factors were not available for our study. Rather, this is an ecologic study wherein only relatively simple exposure and population characteristics could be measured. It will be important to consider this and other sources of uncertainty when interpreting the results. Although not discussed extensively here, it is possible for the risk ratio g(x) to also depend on age, t. For example, older people may be more susceptible to exposure. We did in fact explore such age-dependent risk models and found that in general, it was adequate to model the relative risk as a function of exposure only. A wide range of models was obtained by varying a) the use of comparison populations; b) the way age is modeled in ho(4, e.g., linear, quadratic, or the use of regression splines; c) transformations of exposure concentrations; and a) the way exposure is modeled. Table 3 summarizes the various modeling options considered in this analysis. Each model corresponds to choosing one option from each column. For example, the model excluding the comparison population, with a linear age effect, an exponential linear dose effect, and no transformation on dose, is characterized by ho(t) = exp(ao + clt) and g(x) = exp(,lx). Note that the linear and quadratic dose-effect models (generally referred to as additive models) do not fit into the usual dass of generalized linear models (GLMs) and require special programming. Exponential linear and exponential quadratic models fall under the general dass of multiplicative models. The spline age effect was modeled using natural splines because of the ease of obtaining predicted values (19). There are three options for the baseline hazard: model the hazard without including a comparison population, treat the comparison population as an unexposed group, or replace the baseline hazard function with empirical "Values in parentheses are number of deaths from bladder, liver, and lung cancer, respectively. VOLUME 108 1 NUMBER 7 1 July 2000 * Environmental Health Perspectives estimates based on the comparison population (not included in Table 1). The third option can be accomplished by fitting a Poisson model containing indicators corresponding to the age categories observed in the comparison population. This approach essentially corresponds to the traditional SMR approach. Because there were no villages with zero concentration levels, the method used to model the baseline hazard had a fairly strong influence on the results. In particular, the choice of whether to include a comparison population had a strong influence. The use of an unexposed comparison population has the potential to provide more information about the shape of the model at low concentrations.
Although not a member of the usual GLM class, the MSW model was also considered because it was used in the previous risk assessment (4). The MSW corresponds to letting g(x) = PO + IX + 2x2 and ho(t) = C(t-To)+ (10), where tdenotes age and x denotes exposure concentration. The plus sign (+) indicates a truncation on the (t -To) term (i.e., if To > t then the term is set to zero). Results based on the MSW model are only presented for comparison. The GLM approach has several advantages over the MSW model. First, the MSW model appears to be more sensitive to outliers than the GLM model (1J). Also, the hazard function for the MSW model involves a truncation in t that complicates estimation. Finally, the inclusion of the power parameter k (for our purposes, k = 2) tends to give the fitted model a relatively sublinear shape that leads,  in general, to higher benchmark doses than the GLM models. Quantitative risk assessment. Because the risk of dying from cancer is age dependent, it is common to base risk assessment on the excess risk of dying from cancer over the course of a typical lifetime. The adjusted lifetime death risk can be calculated by integrating the death hazard over the typical lifetime in the population of interest, where S(t) is the probability of surviving until age t and h(x,t) is the hazard for dying of the cancer of interest at age t for someone exposed at level x. Applying integration by parts, Idr(x) can also be written as  (Table 4) (20).
Traditionally, standards for carcinogenic compounds have been set by finding the exposure level that yields a rate of 10-over background. As suggested by Brown (9) and discussed by Smith and Sharp (21), this estimate is probably unreliable for epidemiologic data, where exposure is not typically measured accurately enough to extrapolate to such low risk levels. The new EPA guidelines for cancer risk assessments (22) suggest the use of a point-of-departure analysis for settings where the mode of action is supportive of linearity or there is insufficient support for a nonlinear mode of action. The idea is to often in epidemiologic studies, an excess risk of 10% is fairly large and occurs only at relatively high doses. We will use both 1% and 5% excess risks for the point of departure (EDO0 and ED05, respectively). We computed confidence intervals for excess lifetime risk using the Delta method (23). Bootstrap methods were also used for models with nonparametric age effects, yielding similar results (24). The new guidelines also suggest a margin-of-exposure analysis (MOE), defined as the point of departure divided by the environmental exposure of interest. This approach is the proposed default mode of action when linearity is not the most reasonable assumption (22). For subsequent discussion we will use MOE01(50) to represent the margin of exposure using the EDO1 point of departure and 50 1ig/L as the environmental exposure of interest. Table 5 contains a descriptive summary of the internal cancer data, showing person-years at risk, observed number of cancers, and the SMRs for age, sex, and exposure grouped into the same intervals used by the EPA in the skin cancer risk assessment. As in Wu et al. (12), the analysis is limited to persons . 20 years of age because there were essentially no cancer deaths observed in those younger than 20 years of age. Note that the entire Taiwanese population was used to calculate the expected deaths used in the computation of SMRs in Table 5. Although the computed SMRs display a large amount of noise, there appear to be higher SMRs at high exposure levels compared to exposures in the lower range, especially for bladder and lung cancer. There is no observed tendency in SMRs with respect to age, which suggests no age dependency on the risk ratio, g(x), defined in Equation 1. Overall, females have higher SMRs than males. Liver cancer mortality is generally higher than expected, although there is no particularly strong exposure-response relationship. The GLM analysis began by fitting all possible models and comparing the Akaike information criterion (AIC) to narrow the model choice (25). The AIC is a commonly used criterion for comparing nonnested models. It penalizes the model deviance by adding twice the number of parameters to it. Thus a model with a low AIC will be one that describes the observed data well (a low deviance) yet with relatively few parameters (small penalty). The models with the lowest AIC provide the best fit. Because it was not appropriate to compare AIC for the models based on different data sets, we provide separate analyses with and without comparison populations. Table 6 identifies models 1-9 and the MSW model that appear in Tables 7-10 and Figures 1-3. For simplicity we will refer to the model numbers. Table 7 compares the four top performing models based on AMC for the models with and without comparison populations for male bladder cancer. Several other models fit reasonably well, but we chose to present only four (see "Discussion"). It is important to note, however, that models induding exposure concentration were highly significant compared to models excluding -concentration. We also present the MSW model. Although detailed results are shown here only for male bladder cancer, the same general patterns apply to females and to all cancer outcomes except for the combined analysis (see "Discussion"). In general, models with no transformation on dose and an exponential linear dose effect fit well when we used no comparison population. When we used population data from the southwestern region of Taiwan or the entire Taiwanese population, models with the square root and log transformation fit well. This is most likely  For a description of models, see Table 6.  Table 6. of models, see Table 6. VOLUME 108 1 NUMBER 71 July 2000 * Environmental Health Perspectives due to the relatively low cancer death rates in the comparison population. The log-transformation allowed the fitted curve to rise more quickly from zero to accommodate this difference. Using the log-transformation without the comparison population gave a good model fit, according to AIC, but risk estimates were not easy to interpret because of instability of the fitted model at low dose. For this reason, we chose not to pursue the log-transformation without the comparison population any further. A few additive models gave a good fit, but in most cases, the multiplicative models did a better job, so we chose not to continue with the additive models. Also note that the MSW model fit reasonably well (Figures 1-3 show graphical representations). Each dot in Figures 1-3 corresponds to the estimated lifetime risk of dying of bladder cancer for villages, grouped by 50-pg/L exposure levels (0-50, 50-100, etc.). The grouping is for presentation purposes only because village-specific estimates were highly variable. The idea of grouping for the purpose of graphical presentation of a fitted model has been widely used in the logistic regression context as well (26). Fitted curves for the models without the comparison population are very similar in shape, whereas there is a considerable amount of variability in the models with a comparison population. Tables 8-10 contain risk statistics for the best-fitting GLM models and the MSW model with and without comparison population data. Concentrations are reported in U.S. equivalent concentrations of arsenic in drinking water, based on conversions that account for the average weight and average water intake for a male living in the United States compared to a male living in Taiwan. For models 1 and 2, which have no transformation on dose, EDOI estimates equal 395 and 351 pg/L, respectively, for male bladder cancer. For models 3, 4, and 5, which have a log-transformed dose effect, EDO0 estimates for male bladder cancer range from 21 to 54 pg/L. Models 7 and 8, which have a square root dose effect, give higher estimates (156 and 108 pg/L, respectively). Results for model 9 are similar to models 7 and 8. When Table 7. AIC for best-fitting models.

Results
All of Southwestern 333.8307 a comparison population is used (Tables 9 Estimates of ED and ED05 based on using and 10), there is more variability in the pre-the southwestern region ofTaiwan tended to dicted lifetime risk from model to model. It be much lower than those based on using appears that the inclusion of a large unex-the Taiwanese-wide population. The MSW posed comparison population had a relative-model implies a lower risk when no comparly strong influence on estimation of risk.
ison population was used (ED01 = 633 pg/L  for male bladder cancer) compared to estimates when a comparison population is used (164 and 185 jig/L).

Discussion
In contrast to the 1988 EPA risk assessment that focused on skin cancer incidence (4), this study examines cancer mortality in a setting where exposure is measured at village level. Although there is an advantage to having individual village measurements, there also appears to be variability in the exposure assessment, causing high variability in the risk estimates. Depending on the model and whether or not a comparison population is used in the analysis, EDO1 estimates range in value from 21 to 633 jig/L for male bladder cancer. For males, the lung cancer risk tends to be slightly higher than the risk for bladder cancer, with EDOI values ranging from 10 to 364 pg/L. Although this result seems in contrast to the high SMRs for bladder cancer in Our results show that exposure-response assessments depend highly on the choice of model, as well as whether or not a comparison population is used in the analysis. As discussed by Morales et al. (10), one possible explanation is the uncertainty associated with an ecologic study design. We assumed the same arsenic concentration for all persons in the same village and individual exposures can vary widely in a village. Mortality records are available for individuals, but their individual exposures are not. The National Academy of Sciences (1) provides a good discussion on this subject.
Although one might argue that the appropriate strategy would be to select the best model based on accepted statistical criteria, several models gave essentially the same quality fit (as measured by AIC), yet yielded substantial differences in risk estimates. For example, for the models without a comparison population, the MSW model gave a fit comparable to some of the GLM models, but produced EDOI estimates almost twice as high. Despite the comparably good fit, we preferred the GLM models to the MSW model. For example, sensitivity analysis revealed that the MSW model was influenced strongly by the removal of various subsets of villages, whereas the GLM was not (10). The poor nutritional status of the Taiwanese in the Blackfoot disease region could be another contributing factor of uncertainty. We could not account for dietary intake of inorganic arsenic in food for either population, or for other confounders in this analysis.
Differences in EDOI estimates were particularly affected by whether or not a comparison population was used. There is reason to believe that the urban Taiwanese population is not a comparable population for the poor rural population used in this study. Thus, risk estimates using the Taiwanese population may be biased. As an alternative, we used the southwestern region of Taiwan; we found very different risk estimates based on the two different comparison populations (Tables 9 and 10). We could have done other analyses. For example, we could have calculated lifetime death rates for the unexposed group [Idr(0)] using U.S. population data. It would be of interest to see how the unexposed death rates in the United States compare to the death rates in Taiwan.
Despite the considerable variation in estimated EDO0, the results are sobering and indicate that current standards are not adequately protective against cancer. For the combined analysis with no comparison population and identity transformation on dose, the MOE01(50) values range from 0.4 to 16.9 for both males and females. When we include a comparison population, the MOE01(50) values range from 0.2 to 3.4.
The current arsenic standard of 50 j.g/L (4) is actually below the estimated EDO0, which suggests that the risk at the current standard is higher than 1 in 100. Note, however, this estimate is likely to be overly conservative because the data suggest that the log-transformations lead to somewhat unstable results. Even considering the identity transformation, which tended to give less extreme results, the risk associated with a concentration of 50 pg/L is approximately 1 in 300, based on linear extrapolation from the point of departure. Risks of a similar magnitude were reported by Smith et al. (27). This is an extremely high value. We could argue that if indeed the risk were this high, we would expect to find epidemiologic evidence even within the U.S. population. The SEER Cancer Statistics Review (28) estimates that the age-adjusted U.S. mortality rates for bladder, lung, and liver cancer are 3.2, 49.5, and 2.8 per 100,000, respectively. It is also estimated that approximately 5% of large and small regulated water supply systems in the United States have arsenic concentrations > 20 )ig/L (291). Thus, if the excess cancer risk associated with 50 pg/L arsenic is on the order of magnitude 1 in 1,000, we would expect an increase of approximately 0.05 per 1,000 or 5 per 100,000 in the population. It is not surprising that epidemiologic studies in the United States have not so far been able to identify clear associations.
Thus, we conclude that arsenic in drinking water may indeed be contributing to excess cancer mortality in the United States. Each month we delve into the issues you care about such as: water, air, and soil pollution urban sprawl population and world health toxic pesticides environmental products and services Call today to subscribe and visit us online at S -S Environmental Health Perspectives * VOLUME 1081 NUMBER 7 1 July 2000