Use of mechanism-based structure-activity relationships analysis in carcinogenic potential ranking for drinking water disinfection by-products.

Disinfection by-products (DBPs) are formed when disinfectants such as chlorine, chloramine, and ozone react with organic and inorganic matter in water. The observations that some DBPs such as trihalomethanes (THMs), di-/trichloroacetic acids, and 3-chloro-4-(dichloromethyl)-5-hydroxy-2(5H)-furanone (MX) are carcinogenic in animal studies have raised public concern over the possible adverse health effects of DBPs. To date, several hundred DBPs have been identified. To prioritize research efforts, an in-depth, mechanism-based structure-activity relationship analysis, supplemented by extensive literature search for genotoxicity and other data, was conducted for ranking the carcinogenic potential of DBPs that met the following criteria: a) detected in actual drinking water samples, b) have insufficient cancer bioassay data for risk assessment, and c) have structural features/alerts or short-term predictive assays indicative of carcinogenic potential. A semiquantitative concern rating scale of low, marginal, low-moderate, moderate, high-moderate, and high was used along with delineation of scientific rationale. Of the 209 DBPs analyzed, 20 were of priority concern with a moderate or high-moderate rating. Of these, four were structural analogs of MX and five were haloalkanes that presumably will be controlled by existing and future THM regulations. The other eleven DBPs, which included halonitriles (6), haloketones (2), haloaldehyde (1), halonitroalkane (1), and dialdehyde (1), are suitable priority candidates for future carcinogenicity testing and/or mechanistic studies.

A majority of U.S. households are supplied with disinfected water. Disinfection is necessary to destroy pathogenic organisms and prevent the outbreak of waterborne infectious diseases. Such diseases are largely under control in the United States, but waterborne outbreaks resulting in disease and mortality continue to occur (1). Although the benefits of water disinfection are well recognized, there is an undesirable side effect of producing various disinfection by-products (DBPs) when disinfectants such as chlorine and ozone react with natural inorganic and organic matter in the water.
The public health risks associated with DBPs are not fully understood. In 1974 it was discovered that some DBPs are carcinogenic in laboratory animals (2,3). This raised public concern about the possible adverse health effects from exposure to DBPs, and in 1979 led the U.S. Environmental Protection Agency (U.S. EPA) to regulate the level of certain DBPs, trihalomethanes (THMs), in the water supply.
In November 1979, the U.S. EPA set an interim maximum contaminant level (MCL) for total THMs (the combination of chloroform, bromodichloromethane, chlorodibromomethane, and bromoform) of 0.10 mg/L as an annual average (4). This standard applied to any public water system that serves at least 10,000 people and uses a disinfectant. In December 1998, the U.S. EPA finalized the Stage 1 Disinfectants/Disinfection Byproduct Rule (DBPR) (5) in conjunction with the Interim Enhanced Surface Water Treatment Rule (6). Together, these rules attempt to balance the control of health risks from DBPs against the risks from pathogenic microbial organisms. As a part of the Stage 1 DBPR, the U.S. EPA lowered the total THM standard (the MCL) to 0.080 mg/L and set MCLs of 0.060 mg/L for five haloacetic acids (monochloro-, dichloro-, trichloro-, monobromo-, and dibromoacetic acids [HAA5]), 0.010 mg/L for bromate, and 1.0 mg/L for chlorite. THMs and HAA5 are by-products of chlorination. Bromate is a by-product of disinfection with both ozone and chlorine dioxide, whereas chlorite is a chlorine dioxide by-product. The 1996 Safe Drinking Water Act (7) amendments require the U.S. EPA to publish a Stage 2 DBPR. The content of the Stage 2 DBPR, including which DBPs will be regulated, has not yet been finalized.
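The Stage 1 limits quoted above can be collected into a small lookup for illustration. The following is a minimal Python sketch, not an EPA compliance procedure: the function names, dictionary keys, and sample concentrations are hypothetical, and the inputs are assumed to be annual averages already expressed in mg/L.

```python
# Stage 1 DBPR maximum contaminant levels as quoted in the text (mg/L).
STAGE1_MCL_MG_PER_L = {
    "total_THMs": 0.080,
    "HAA5": 0.060,
    "bromate": 0.010,
    "chlorite": 1.0,
}

# The four species that make up "total THMs" in the text.
THM_SPECIES = (
    "chloroform",
    "bromodichloromethane",
    "chlorodibromomethane",
    "bromoform",
)


def total_thms(annual_avg_mg_per_l):
    """Sum the annual-average concentrations of the four THMs."""
    return sum(annual_avg_mg_per_l[name] for name in THM_SPECIES)


def exceeds_mcl(group, annual_avg_value_mg_per_l):
    """True if the annual average is above the Stage 1 MCL for the group."""
    return annual_avg_value_mg_per_l > STAGE1_MCL_MG_PER_L[group]


# Hypothetical annual averages (mg/L), for illustration only.
sample = {
    "chloroform": 0.045,
    "bromodichloromethane": 0.020,
    "chlorodibromomethane": 0.010,
    "bromoform": 0.002,
}
tthm = total_thms(sample)
print(round(tthm, 3), exceeds_mcl("total_THMs", tthm))
```

In this illustration the summed THMs fall below the 0.080 mg/L limit, so the check returns False.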
The U.S. EPA has an extensive research program to better characterize the potential health effects and occurrence levels of several DBPs, including those regulated under the Stage 1 DBPR. It is preferable to have both toxicity and occurrence data when setting maximum contaminant level goals (MCLGs) for drinking water standards to better define public health risk. The U.S. EPA, in a collaborative testing program with the U.S. National Toxicology Program (NTP), is conducting 2-year cancer rodent bioassays, transgenic mouse cancer assays, and medaka fish cancer assays, as well as tests for reproductive, developmental, immunologic, and neurologic toxicities on several DBPs. In addition, an information collection rule was promulgated (8) to collect national occurrence information on 32 DBPs (Table 1).
The chemicals in the Stage 1 DBPR are among the DBPs with the highest occurrence in drinking water. However, hundreds of other DBPs formed from treatment with various disinfectants, including chlorine, have been identified. There is a limited amount of information on most of these DBPs beyond their identification in water. In addition, there are many unidentified DBPs, as evidenced by measurements of total organic halides compared with known halogenated DBPs (5). The U.S. EPA believes that the standards in the Stage 1 DBPR will, to some extent, control these other known and unknown DBPs. The U.S. EPA must better define the risk from the DBPs identified in drinking water before it can determine whether they are adequately controlled by current standards. The two most important factors needed to characterize risk are occurrence and toxicity. For most DBPs that have been identified, few or no data are available in either area. Further research on these DBPs is therefore necessary. Because there are hundreds of DBPs for which there are few or no health or occurrence data, there is a need for prioritization before expensive toxicologic tests and occurrence monitoring studies are initiated.
To achieve this goal, a tiered approach to prioritization was designed for evaluating which DBPs, if any, present a health concern sufficient to warrant additional research to better characterize the risk. Those DBPs considered to present a health concern would first be tested in a battery of appropriate in vitro or in vivo screening assays. The results could be used to decide which DBPs deserve further studies such as acute/subchronic specialized animal tests for neurotoxicity, immunotoxicity, developmental, reproductive, or system toxicity, and medium-term tests for cancer (e.g., transgenic mouse, medaka fish). Results from these studies could help prioritize more expensive long-term tests and mechanistic studies. In addition, occurrence studies and research on the development of analytical methods for these high-priority DBPs would need to be addressed. In this article we discuss the process the U.S. EPA used to prioritize DBPs for future testing, in which DBPs were examined by expert structure-activity relationship (SAR) judgment with emphasis on genotoxic cancer potential.
The SAR analysis in this study addresses only the carcinogenic potential of DBPs. There are other ongoing efforts for predicting noncancer effects, including reproductive and developmental toxicity (9,10). Cancer has been raised as an end point of concern in epidemiologic studies (11-19). In addition, cancer is often the most sensitive health end point that is used to set drinking water standards. Most important, cancer concerns were addressed first because the predictive tools for evaluating the potential cancer risk are more developed than those for other end points such as reproductive and developmental effects.

Prioritization Approach
The U.S. EPA designed a simple prioritization scheme for determining which DBPs may require additional research (Figure 1). First, the U.S. EPA compiled a list of DBPs to consider for prioritization. More than 600 DBPs from various disinfectant combinations that have been identified and cataloged by the U.S. EPA (20) served as an important reference. Additional DBPs were subsequently added as new information became available (21,22). Of these, the U.S. EPA considered only those DBPs found or detected in actual drinking water samples. DBPs found only through laboratory experiments were excluded because these experiments are often performed under conditions that are not representative of actual water treatment practices. Thus, there is uncertainty as to whether DBPs identified in laboratory experiments can actually be found in drinking water samples. Additional criteria eliminated DBPs with incomplete chemical structure characterizations. In addition, chemicals believed to be impurities from processes other than disinfection, such as leachates from treatment plant materials and laboratory equipment (e.g., naphthalene, 3-ethyl styrene), were eliminated. The list of 252 remaining DBPs was peer reviewed by chemists with expertise in DBP formation and identification to ensure, to the extent possible, that the chemicals in the list were all actual or probable DBPs. After these criteria were applied, 239 DBPs remained for research prioritization.
In the next step, the U.S. EPA identified those DBPs that have or will have 2-year cancer bioassay data and occurrence data sufficient for making a hazard assessment, and those DBPs for which sufficient bioassay data are/will be available but insufficient occurrence data currently exist. The criteria for judging if sufficient toxicity data exist to conduct a cancer assessment were as follows: a) there is an MCLG from the Stage 1 DBP rule or past drinking water rules; b) the NTP, the U.S. EPA, or others have conducted or will conduct a 2-year cancer bioassay; or c) there is an oral slope factor on the agency's Integrated Risk Information System (IRIS) (23). The criteria for judging if sufficient occurrence data exist to derive a national estimate of exposure were as follows: a) there is an MCLG from the Stage 1 DBP rule or past drinking water rules, or b) the DBP is included in the information collection rule for DBPs that is collecting national occurrence data. Thirty DBPs (Table 2) were identified in this step and eliminated from SAR consideration.
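The screening criteria in this step amount to two simple "any of" tests. The sketch below expresses them in Python under stated assumptions: the record type, field names, and example entries are hypothetical placeholders, and the final filter reflects our reading that a DBP with sufficient (existing or planned) cancer data was set aside from SAR analysis whether or not its occurrence data were sufficient.

```python
from dataclasses import dataclass


@dataclass
class DbpRecord:
    """Hypothetical record of the data status of one DBP."""
    name: str
    has_mclg: bool = False                      # MCLG from Stage 1 or earlier rules
    has_two_year_bioassay: bool = False         # conducted or planned
    has_iris_oral_slope_factor: bool = False    # oral slope factor on IRIS
    in_information_collection_rule: bool = False


def sufficient_toxicity_data(r):
    """Toxicity criteria a), b), or c) from the text."""
    return r.has_mclg or r.has_two_year_bioassay or r.has_iris_oral_slope_factor


def sufficient_occurrence_data(r):
    """Occurrence criteria a) or b) from the text."""
    return r.has_mclg or r.in_information_collection_rule


def goes_to_sar(r):
    """DBPs lacking sufficient cancer data proceed to SAR analysis."""
    return not sufficient_toxicity_data(r)


# Placeholder entries, not real chemicals.
candidates = [
    DbpRecord("DBP-A", has_mclg=True),
    DbpRecord("DBP-B", has_two_year_bioassay=True),
    DbpRecord("DBP-C", in_information_collection_rule=True),
    DbpRecord("DBP-D"),
]
for_sar = [r.name for r in candidates if goes_to_sar(r)]
print(for_sar)
```

Applied to the counts in the text, this step removes 30 of the 239 DBPs, leaving 209 for SAR analysis.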
As discussed in detail below, the remaining 209 DBPs were analyzed by expert judgment SAR analysis. After DBPs of low concern were identified, a literature search for mutagenicity and other toxicity data was performed for the remaining DBPs to provide additional input to the SAR analysis. The DBPs were categorized by a semiquantitative ranking scale of high (H), high-moderate (HM), moderate (M), low-moderate (LM), marginal (Mar), and low (L) concern. Because these concern levels are based on expert judgment relative to known carcinogens, there is no exact definition. As a guideline, the following narrative descriptions have been used (24): a) H = highly likely to be a potent multispecies, multitarget carcinogen even at low doses; b) HM = highly likely to be an active multispecies/target carcinogen at moderate doses; c) M = likely to be a moderately active multispecies/target carcinogen at relatively high doses or active single species/target carcinogen at low doses; d) LM = likely to be weakly carcinogenic, or carcinogenic toward a single species/target at relatively high doses; e) Mar = likely to have marginal carcinogenic activity or may be weakly carcinogenic at doses at or exceeding maximum tolerated doses; and f) L = unlikely to be carcinogenic. Table 3 lists all 209 DBPs together with their assignment to their most appropriate structural chemical classes and categorization of concern level. The basic principles of mechanism-based SAR analysis used for categorizing concern levels of DBPs are discussed below.
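Because the six concern levels form an ordered scale, they can be represented as an ordered enumeration. The following Python sketch is illustrative only: the class and function names are hypothetical, the numeric values carry no meaning beyond ordering, and the priority cutoff (moderate or above) follows the article's statement that DBPs ranked moderate, high-moderate, or high are the priority candidates for testing.

```python
from enum import IntEnum


class Concern(IntEnum):
    """Semiquantitative concern scale; integers encode order only."""
    L = 1     # low
    MAR = 2   # marginal
    LM = 3    # low-moderate
    M = 4     # moderate
    HM = 5    # high-moderate
    H = 6     # high


def is_priority(level):
    """Priority candidates are those rated moderate or above."""
    return level >= Concern.M


ratings = [Concern.L, Concern.LM, Concern.M, Concern.HM]
priority = [r.name for r in ratings if is_priority(r)]
print(priority)
```

Using an ordered type rather than free-text labels makes comparisons such as "at least moderate" explicit, but it does not make the underlying expert judgment any more quantitative.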

Overview of Basic Principles of the Structure-Activity Relationship Approach
SAR analysis has been used to predict toxic potential of chemicals for which test data are limited or not available. SAR is an indispensable tool to help prioritize research and development on a compound by providing valuable initial information on its hazard potential. For organic chemicals the predictive capability of SAR analysis combined with other toxicity information has been demonstrated (25-28). Currently, SAR analysis is best developed for chemicals and metabolites believed to initiate carcinogenesis through covalent interaction with DNA (i.e., DNA-reactive, -mutagenic, -electrophilic, or -proelectrophilic chemicals). At a more limited scale there is some SAR experience for predicting carcinogenicity that does not involve DNA-reactive mechanisms but rather involves cellular toxicity, pathophysiologic parameters, or receptor-mediated mechanisms such as Ah receptor, peroxisome proliferation, and endocrine disruption (24,29-31).
Mechanism-based SAR analysis has been effectively used by the U.S. EPA for many years to assess the potential carcinogenic hazard of new chemicals, for which there are no or scanty data, under the Premanufacture Notification program of the Toxic Substances Control Act (32). The same approach has been used in design of safer chemicals (33) and pollution prevention (34). An expert system (OncoLogic) has been developed to systematize and codify the agency's SAR expertise in predicting carcinogenic potential of chemicals (26). The principal authors of this present article have been involved in these program activities for more than a decade. The SAR predictions of the cancer potential of DBPs in this article are based mainly on human expert judgment, with some input from the OncoLogic expert system. A similar approach has been applied to prospective prediction of the outcome of NTP cancer bioassays (27). The predictive performance of our approach relative to other predictive methods has been affirmed by an independent evaluation (28).

Mechanism-Based Structure-Activity Analysis
Essentially, mechanism-based SAR analysis involves comparison of an untested chemical with structurally related compounds for which carcinogenic activity is known. Considering the most probable mechanism(s) of action, the structural features and functional properties of the untested compound are evaluated and compared with reference compounds. All available knowledge and data relevant to evaluation of carcinogenic potential of the untested chemical are considered. These include a) SAR knowledge base of the related chemicals; b) toxicokinetics and toxicodynamics parameters (including physicochemical properties, route of potential exposure, and mode of activation or detoxification) that affect the delivery of biologically active intermediates to target tissue(s) for interaction with cellular macromolecules or receptors; and c) supportive noncancer screening or predictive data known to correlate to carcinogenic activity. A prediction of carcinogenic potential involves integration of all this available information with human expert intuition and judgment.
In evaluating the DBPs both structural and functional criteria were applied. The structural criteria and methodology for assessing carcinogenic potential of chemicals have been discussed in detail in previous reviews (26,27). Basically, the structural moieties or fragments that may contribute to carcinogenic activity through a perceived or postulated mechanism are identified, and the modifying role of the rest of the molecule to which the structural moiety/fragment is attached is evaluated. Whenever possible, comparison is made to a structurally related reference compound with known carcinogenic activity (tested preferably by the same route of administration as the chemical in question) to evaluate whether the difference in chemical structures may lead to an increase or decrease in carcinogenic activity.
Electrophiles can interact with DNA and potentially lead to mutagenesis. The identification of electrophiles and their precursors is thus fundamental to the prediction of mutagenic carcinogens. Some of the commonly encountered electrophiles or electrophilic intermediates in carcinogenesis include carbonium ions (alkyl-, aryl-, benzylic), nitrenium ions, epoxides and oxonium ions, aldehydes, polarized double bonds (α,β-unsaturated carbonyls or carboxylates), peroxides, free radicals, and acylating intermediates (27).
For compounds that are metabolically activated, resonance stabilization provides reactive intermediates a longer reactive lifetime. Structural features that may furnish resonance stabilization include conjugated double bonds, an aryl moiety (especially those capable of providing long resonance pathways), ring positions that allow several resonance forms, and structures that allow reversible cyclization of reactive intermediates.
The molecule to which a reactive moiety is attached may significantly affect its carcinogenic potential. Many potent carcinogens (e.g., aflatoxin B 1 , benzo[a]pyrene) have a relatively planar molecular size and shape favorable for DNA intercalation in addition to having a reactive functional group. Attachment of a reactive electrophilic group to normal cellular molecular constituents may also enhance carcinogenic activity (e.g., attaching the moderately carcinogenic nitrogen mustard to uracil yields a more potent carcinogen, uracil mustard), probably by serving as a carrier to reach the target macromolecule. Conversely, the presence of highly hydrophilic groups or bulky substituents that may affect metabolic activation or molecular planarity tends to decrease or eliminate carcinogenic activity.
The structural basis for identifying receptor-mediated carcinogens is considerably less understood and is dependent on the type of receptor believed to be involved. Some of the structural features useful in identifying these carcinogens include a) planar tricyclic molecule with lateral ring substitution for Ah receptor-mediated 2,3,7,8-tetrachlorodibenzo-p-dioxin-related chemicals (35,36); b) nonmetabolizable acids (such as branching at the carbon next to the acid-bearing carbon) for peroxisome proliferator-type carcinogens (29); and c) a molecular descriptor containing a phenolic group 6 angstroms away from a lipophilic moiety for at least some types of hormonal carcinogens (30).
In addition to the information on structural basis of receptors, functional criteria using short-term test data can also be used. Functional criteria involve consideration of all the available short-term noncancer predictive data and pharmacologic and toxicologic capabilities correlated or associated with carcinogenic activity. Functional criteria complement structural criteria because structural considerations alone cannot forecast entirely new types of carcinogens. Furthermore, functional criteria may serve as a means to confirm or cast doubt on the mechanistic assumptions made in applying structural criteria. Information that is highly useful for predicting carcinogenic potential includes data on oncogenes, tumor suppressor genes, genotoxicity and/or ability to bind covalently to DNA, apoptosis, cellular proliferation, immunosuppression, and subchronic toxicity end points that are indicative or suggestive of carcinogenic potential. Ideally, all of the available data should be evaluated with respect to predictive capability, strength of evidence, and relevance to the carcinogenic process and then integrated. Positive predictive tests and data covering all aspects of the carcinogenic process (initiation, promotion, and progression) should be given more weight than multiple tests detecting the same mechanistic end point (24).

Conditions of Hazard Expression (Routes of Exposure)
An individual may be exposed to DBPs by different routes of exposure (e.g., inhalation from showering, dermal from bathing, oral from tap water consumption). In evaluating the carcinogenic potential of a compound, it is important to consider the route of exposure because the hazard and risk posed by a compound may vary by exposure route (37). Delivery of the reactive intermediate to target macromolecules such as DNA is crucial for carcinogenic activity, and exposure routes such as inhalation and injection are often required for maximal activity for direct-acting reactive chemicals. For example, electrophiles such as aldehydes are DNA reactive, but this reactivity also means they are readily detoxified by cellular-protective nucleophiles such as glutathione (GSH). Their toxicity, therefore, tends to be localized to the port of entry. Thus aldehydes, which are of cancer concern via inhalation, pose a lower cancer concern via the oral route because they are readily oxidized to acids before they can react with DNA. However, subpopulations with genetically diminished capability to detoxify aldehydes may be at higher risk. The SAR predictions presented in this document focus mainly on the hazard potential via ingestion of drinking water, a major route of exposure to DBPs. Inhalation exposure to some volatile DBPs may occur through bathing or showering. In general, for the purpose of ranking hazard potential, DBPs that require metabolic activation (e.g., THMs) should have similar hazard potential whether via oral or inhalation, whereas DBPs that are highly reactive direct-acting chemicals (e.g., α-haloethers if they could actually remain reactive through the water delivery system) are expected to have higher concern via inhalation than via oral route.

Literature Search Approach
In support of the SAR analysis of DBPs of greater than low concern, a literature search was performed using Chemical Abstracts Service (CAS) registry numbers. For several DBPs no literature search was performed because CAS registry numbers could not be found in the CAS Scientific and Technical Network online database (38). For some DBPs, information on closely related compounds was searched.
Because the present SAR study emphasized predicting genotoxic carcinogens, selected databases were used. Both the Environmental Mutagen Information Center-Front and Back Files (EMIC/EMICBACK) (39) were searched. EMICBACK, developed and maintained by the Oak Ridge National Laboratory, is a bibliographic database on compounds tested for genotoxic activity. The database contains literature published from 1950 to 1990 and includes some references published before 1950. EMIC covers publications from 1989 to the present. The Chemical Carcinogenesis Research Information System (CCRIS) (40), developed and sponsored by the National Cancer Institute, was also searched. CCRIS contains information from carcinogenicity, mutagenicity, tumor promotion, and tumor inhibition studies that have been evaluated for acceptability by experts in carcinogenesis. CCRIS contains 7,000 chemical records. Additionally, the NTP (41), IRIS (23), and the Agency for Toxic Substances and Disease Registry (ATSDR) toxicological profiles (42) were searched for availability of cancer bioassay data. Although the EMIC/EMICBACK and CCRIS databases were searched, some information on mutagenicity and carcinogenicity may have been missed, in particular, information on metabolism and mode of action (e.g., cell proliferation, apoptosis).
Mutagenicity and carcinogenicity data were gathered from either abstracts or actual publications and compiled into a summary table listing the chemical name, CAS number, test, strain, method, result, dose, and publication reference. This information was then used to assist in the SAR predictions for DBPs of greater than low concern.
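The summary table described above has a fixed set of columns, which can be sketched as a simple record type. This Python example is a hypothetical illustration of such a layout, not the format actually used in the study: the class name, field names, and the placeholder row are assumptions.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class AssayRecord:
    """One row of the literature summary table (columns as listed in the text)."""
    chemical_name: str
    cas_number: str
    test: str
    strain: str
    method: str
    result: str
    dose: str
    reference: str


def to_csv(records):
    """Serialize records to CSV text with one header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(AssayRecord)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))
    return buf.getvalue()


# Placeholder row; every value here is illustrative, not data from the study.
rows = [
    AssayRecord("example DBP", "000-00-0", "example assay", "example strain",
                "example method", "positive", "example dose", "example ref"),
]
table_text = to_csv(rows)
print(table_text.splitlines()[0])
```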

Structure-Activity Relationships Cancer Prediction for Disinfection By-Products
Prior to the SAR analysis the U.S. EPA determined that those DBPs ranked as moderate, high-moderate, or high would be the priority candidates for future testing. This was decided because of the large number of DBPs involved. If a chemical with few occurrence data was determined to be of a higher concern, then further toxicity research on the chemical might be justified. If, however, a chemical was determined to be of a lower concern, some occurrence data beyond mere identification would have to be obtained before testing would be warranted.
SAR predictions were made for 209 DBPs (Table 3). The DBPs were first reviewed to identify chemicals with low concern. Judgments of low cancer concern were based on structural similarity to chemicals with negative cancer data, a lack of structural alert for genotoxicity, or presence of structural features suggestive of low cancer risk via the oral route (26,27,33). Once the DBPs of low concern were removed from the list, a literature search was done for the remaining DBPs. It should be noted that literature was not found in EMIC/CCRIS/NTP/IRIS/ATSDR databases for many of the DBPs. Thus, the mechanism-based SAR predictions relied heavily on expert judgment and experience. SAR assumptions and conclusions for concern levels and specific classes of DBPs are discussed below.
Reviews, 2002 • Woo et al.

Distribution of Disinfection By-Products within Structure-Activity Relationship Concern Levels and Structural Classes
Table 4 summarizes the structural class and concern level distribution of the 209 DBPs. Of the 209 DBPs examined, none are considered to be of high concern. Only 20 (<10%) are predicted to have a concern level of moderate or high-moderate. With one exception, all these compounds are halogenated, with most of them belonging to the structural classes of halofuranones, haloalkanes/alkenes, halonitriles, and haloketones. A detailed analysis of these four classes will be presented. Haloacids would have constituted a major class of concern. However, because several haloacids have already been tested or selected for testing (Tables 1 and 2), they are not considered in detail in the present study. Outside of the four major classes of concern, one haloaldehyde (dichloroacetaldehyde), one halonitroalkane (dibromonitromethane), and one nonhalogenated aldehyde (butanedial) are considered of moderate concern. Dichloroacetaldehyde has been given a moderate concern because it is a potential cross-linking agent. It can also be readily oxidized to dichloroacetic acid, which has been shown to be a rodent carcinogen with multiple mechanisms of action (43-45). Dibromonitromethane has been given a moderate concern because the corresponding dichloronitromethane is believed to be the proximate mutagen of chloropicrin (46). The replacement of chlorine by bromine should make it a more potent mutagen because bromine is a better leaving group. The structurally related nitromethanes, particularly tetranitromethane, are carcinogenic, whereas chloropicrin (trichloronitromethane) is noncarcinogenic in mice and inconclusive in rats (47). Butanedial is the only nonhalogenated DBP given a moderate concern in the present study. This compound has two terminal reactive aldehydes separated by two methylene groups, which should make it a highly favorable cross-linking agent.
The majority (131/209) of the DBPs in this study are considered to have low (98/209) or marginal (33/209) cancer concern. Most of these compounds are nonhalogenated carboxylic acids, ketones, aldehydes, and miscellaneous organic compounds. Nonhalogenated hydrophilic carboxylic acids are not of concern because they are unlikely to be absorbed and, even if absorbed, are rapidly excreted. High-molecular-weight nonhalogenated carboxylic acids are also a low concern because they have no structural alerts, and many are natural products and nutrients, U.S. Food and Drug Administration food additives, and synthetic flavorings. Several medium-size (6-10 carbons) carboxylic acids with branching at the carbon next to the carboxylic group (omega-1 carbon) were considered potential rodent carcinogens because of potential peroxisome-proliferating activity but were given a marginal concern rating because of uncertain human significance. A number of nonhalogenated aldehydes, particularly those with high molecular weight, are given low or marginal concern because they are unlikely to have significant dose via drinking water; this subject will be further discussed below. With the exception of α,β-unsaturation or closely spaced dicarbonyl groups, nonhalogenated ketones are mostly of low concern because they lack electrophilic activity and are generally not associated with carcinogenicity. Halogenated aliphatic amines are a low concern because of structural analogy to chloramine, which has negative cancer bioassay data (47).
The remainder (58/209) of the DBPs fall into the low-moderate concern category and represent a wide variety of classes, both halogenated and nonhalogenated. In general, these DBPs are considered to have a concern level lower than moderate because they have a less active chlorine/bromine group or contain structural features that are not as favorable for carcinogenic activity. These DBPs include certain haloacids, haloaldehydes, haloethers, haloamides, nonhalogenated aromatics, and reactive ketones. Additionally, a large number (35/209) of haloketones, halofuranones, haloalkanes, halonitriles, and nonhalogenated aldehydes are considered of low-moderate concern. The rationale for their assignments as well as the SAR information available on these classes are discussed in more detail below.
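The concern-level counts reported in this section can be checked for internal consistency with a few lines of arithmetic. The Python sketch below uses only the counts stated in the text; the dictionary keys and variable names are illustrative, and the 35 low-moderate DBPs mentioned separately are treated as a subset of the 58, not an additional group.

```python
# Counts per concern level as stated in the text (209 DBPs in total).
counts = {
    "high": 0,
    "moderate_or_high_moderate": 20,
    "low_moderate": 58,
    "marginal": 33,
    "low": 98,
}

total = sum(counts.values())
low_or_marginal = counts["low"] + counts["marginal"]
priority_share = counts["moderate_or_high_moderate"] / total

print(total, low_or_marginal, f"{priority_share:.1%}")
```

The counts sum to 209, the low and marginal groups together give the quoted 131, and the priority group is just under 10% of the total, in line with the "<10%" stated in the text.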

Halofuranones, MX, and Related Compounds
Within the halofuranones class, 3-chloro-4-(dichloromethyl)-5-hydroxy-2(5H)-furanone (MX) is the most well-known chemical. MX is the most potent, direct-acting mutagenic DBP ever tested in the Ames test (48). On a molar basis MX alone can account for up to 30-50% of the mutagenicity of chlorinated water (49). It is also a potent multitarget carcinogen in the rat (50). The upper-bound cancer risk per unit dose (oral slope factor) for lifetime exposure to MX (based on thyroid follicular adenomas in the rat) was estimated (51) to be 3.7 (mg/kg-day)⁻¹. This number is not as high as would be expected from its bacterial mutagenic potency, indicating that MX may be readily detoxified in the body. The structure-mutagenicity relationships of MX and related compounds have been extensively studied using the Ames Salmonella assay (49,52,53). Although MX is direct acting, its mutagenic activity can be substantially decreased by inclusion of S-9 mix. MX can undergo reversible cyclization between its closed-ring and open-ring forms, depending on the pH of the aqueous medium. In general, MX and related compounds that are capable of undergoing cyclization reactions are considerably more mutagenic than the corresponding compounds that remain predominantly in the open-ring form. For example, MX is at least 10 times more potent than (E)-2-chloro-3-(dichloromethyl)-4-oxobutenoic acid (EMX), the geometric isomer of the open-ring form of MX with limited capacity to cyclize (49). The hydroxy group at the 5 position, which facilitates the cyclization reaction, also has a profound effect on determining the mutagenicity. Elimination of the 5-OH group from MX (yielding 3-chloro-4-(dichloromethyl)-2(5H)-furanone [red-MX]) reduces the mutagenicity by 100-fold (49). Apparently, the closed-ring form, which is less hydrophilic than the open-ring form, may be required for optimal membrane penetration.
It appears that the ultimate mutagen of MX-related compounds inside the cells may be their open-ring form, but they need to cyclize to the closed-ring form outside the cells to facilitate membrane penetration. Substitution of chlorine by bromine has no appreciable effect on mutagenicity, as indicated by comparable mutagenicity among MX, 3-chloro-4-(bromochloromethyl)-5-hydroxy-2(5H)-furanone (BMX-1), 3-chloro-4-(dibromomethyl)-5-hydroxy-2(5H)-furanone (BMX-2), and 3-bromo-4-(dibromomethyl)-5-hydroxy-2(5H)-furanone (BMX-3) (53), whereas replacement of the 4-dichloromethyl group of MX by 4-chloromethyl generates a less potent mucochloric acid (52,54,55). On the basis of this SAR information, the cancer concern levels of the 10 MX-related DBPs in this study are summarized in Table 5, along with rationale and available genotoxicity data. The three chlorobromo analogs of MX (BMX-1, BMX-2, and BMX-3) are all given a high-moderate rating, whereas mucochloric acid is given a moderate rating. On the basis of weaker mutagenicity and less favorable cyclizing capacity, EMX and red-MX are considered to be of low-moderate concern. Despite the lack of toxicity data, 2-chloro-3-(dichloromethyl)-butenedioic acid (ox-MX) and (E)-2-chloro-3-(dichloromethyl)butenedioic acid (ox-EMX) are given a marginal concern because the oxidation of the aldehyde group is expected to eliminate cyclizing capacity and may render the compounds too hydrophilic.
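To make the oral slope factor quoted above more concrete, the following Python sketch shows how such a value is conventionally turned into an upper-bound risk estimate. It is an illustration under explicit assumptions, not a calculation from the study: it assumes low-dose linearity (risk = slope factor × dose), a 70-kg adult drinking 2 L of water per day, and a purely hypothetical water concentration. The function names are likewise hypothetical.

```python
MX_ORAL_SLOPE_FACTOR = 3.7  # (mg/kg-day)^-1, value quoted in the text (51)


def daily_dose_mg_per_kg(conc_mg_per_l, intake_l_per_day=2.0, body_weight_kg=70.0):
    """Average daily dose from drinking water; defaults are assumptions."""
    return conc_mg_per_l * intake_l_per_day / body_weight_kg


def upper_bound_risk(conc_mg_per_l, slope_factor=MX_ORAL_SLOPE_FACTOR):
    """Upper-bound lifetime excess cancer risk, assuming low-dose linearity."""
    return slope_factor * daily_dose_mg_per_kg(conc_mg_per_l)


def conc_at_risk(target_risk, slope_factor=MX_ORAL_SLOPE_FACTOR,
                 intake_l_per_day=2.0, body_weight_kg=70.0):
    """Water concentration (mg/L) corresponding to a target risk level."""
    return target_risk / slope_factor * body_weight_kg / intake_l_per_day


# Hypothetical concentration (mg/L), illustrative only; not a measured value.
hypothetical_conc = 5e-5
risk = upper_bound_risk(hypothetical_conc)
print(f"{risk:.2e}", f"{conc_at_risk(1e-6):.2e}")
```

The second function simply inverts the first, so a chosen risk level can be translated back into a drinking water concentration under the same assumptions.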

Haloalkanes and Haloalkenes
Numerous haloalkanes and haloalkenes have been tested for carcinogenic and mutagenic activities; the SARs have been extensively studied (56). In general, the genotoxic potential is dependent on the nature, number, and position of halogen(s) and the molecular size of the compound. Short-chain monohalogenated (excluding fluorine) alkanes and alkenes are potential direct-acting alkylating agents, particularly if the halogen is at the terminal end of the carbon chain or at an allylic position. Dihalogenated alkanes are also potential alkylating or cross-linking agents (either directly or after GSH conjugation), particularly if they are vicinally substituted (e.g., 1,2-dihaloalkane) or substituted at the two terminal ends of a short to medium-size (e.g., 2-7 carbon) alkyl moiety (i.e., α,ω-dihaloalkane). Fully halogenated haloalkanes tend to act by free radical or nongenotoxic mechanisms (such as generating peroxisome-proliferative intermediates) or undergo reductive dehalogenation to yield haloalkenes that in turn could be activated to epoxides. Haloalkenes are of concern because of their potential to generate genotoxic intermediates after epoxidation. The concern for haloalkenes may be diminished if the double bond is internal or sterically hindered.
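The qualitative rules in the preceding paragraph can be sketched as a small set of structural flags. This is only an illustration of how such rules might be written down, not the procedure used in this study, which relied on expert judgment. The `Halocarbon` schema, its field names, and the function `mechanistic_flags` are hypothetical, and the "particularly if" conditions in the text are simplified here into required conditions. Because the text gives no carbon cutoff for "short-chain" monohalogenated compounds, that judgment is left to the user as a boolean.

```python
from dataclasses import dataclass

ALKYLATING = "potential direct-acting alkylating agent"
CROSS_LINKING = "potential alkylating or cross-linking agent"
RADICAL = "free-radical or nongenotoxic mechanism, or reductive dehalogenation"
EPOXIDE = "possible genotoxic intermediates after epoxidation"
EPOXIDE_LOW = "epoxidation concern diminished (internal or hindered double bond)"


@dataclass
class Halocarbon:
    """Hand-entered structural features (hypothetical schema)."""
    n_carbons: int
    n_halogens: int                      # Cl, Br, or I only; fluorine excluded
    short_chain: bool = True             # user judgment; no cutoff given in the text
    terminal_or_allylic: bool = False    # halogen at chain end or allylic position
    vicinal: bool = False                # 1,2-dihalo pattern
    alpha_omega: bool = False            # halogens on both terminal carbons
    fully_halogenated: bool = False
    is_alkene: bool = False
    hindered_double_bond: bool = False   # internal or sterically hindered C=C


def mechanistic_flags(c):
    """Return the mechanistic concerns suggested by the simplified rules."""
    flags = []
    if c.n_halogens == 1 and c.short_chain and c.terminal_or_allylic:
        flags.append(ALKYLATING)
    if c.is_alkene:
        flags.append(EPOXIDE_LOW if c.hindered_double_bond else EPOXIDE)
    elif c.fully_halogenated:
        flags.append(RADICAL)
    elif c.n_halogens == 2 and (c.vicinal or (c.alpha_omega and 2 <= c.n_carbons <= 7)):
        flags.append(CROSS_LINKING)
    return flags


# 2,3-Dichlorobutane: vicinal dichloro substitution on a four-carbon chain.
dichlorobutane = Halocarbon(n_carbons=4, n_halogens=2, vicinal=True)
# 1-Chlorooctane: terminal chlorine, but entered here as not short-chain.
chlorooctane = Halocarbon(n_carbons=8, n_halogens=1, short_chain=False,
                          terminal_or_allylic=True)
print(mechanistic_flags(dichlorobutane))
print(mechanistic_flags(chlorooctane))
```

The flags only name a plausible mechanism. They do not assign a concern level, which in this study also reflects position effects, molecular size, and the available screening data.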
On the basis of the above SAR information, the cancer concern levels of the 14 haloalkanes and haloalkenes in this study are summarized in Table 6 along with rationale and available screening (pulmonary adenoma assay) and genotoxicity data. Five brominated and iodinated methane and ethane derivatives are given a moderate rating. Beyond the fact that bromine and iodine are better leaving groups than chlorine, there is also evidence that brominated THMs may be preferentially activated by a theta-class glutathione S-transferase (GSTT1-1) to mutagens in Salmonella even at low substrate concentrations (57,58). Furthermore, there are human carcinogenicity implications because of polymorphism in GSTT1-1. Human subpopulations that express GSTT1-1 may be at greater risk from brominated THMs than humans who lack the gene (57). Six, two, and one haloalkanes/haloalkenes are given low-moderate, marginal, and low concern ratings, respectively, with detailed rationale summarized in Table 6.

Table 5 (excerpt)
2,3-Dichloro-4-oxobutenoic acid (mucochloric acid; 3,4-dichloro-5-hydroxy-2(5H)-furanone). Concern: M. Structural analogy to MX, with Cl expected to be less reactive. Positive genotoxicity data (Ames, E. coli, sister chromatid exchange in Chinese hamster ovary cells) but less active than MX (54,55,74).

Halonitriles
There are basically three types of halonitriles detected as DBPs: a) halogenated acrylonitrile and higher congeners, b) halogenated acetonitriles and higher congeners, and c) cyanogen halides. The predicted concern levels of these compounds are summarized in Table 7 along with rationale and available screening data. Three DBPs in this class are chlorinated acrylonitriles (cis- and trans-2,3,4-trichloro-2-butenenitrile and trichloropropenenitrile); they have all been given a moderate concern rating. Acrylonitrile is a well-known genotoxic rodent carcinogen (59). The introduction of halogens to acrylonitrile may reduce the potential to undergo Michael addition or epoxidation, but the terminal chlorine in cis- and trans-2,3,4-trichloro-2-butenenitrile may provide an additional reactive site. Trichloropropenenitrile is of concern because of its structural analogy to tetrachloroethene.
Acetonitrile is not carcinogenic in rodents and is only weakly or marginally mutagenic (60). The introduction of halogen to α- and terminal carbons is expected to increase genotoxic potential by making it an alkylating/cross-linking agent. Halogenated acetonitriles have been tested in various cancer and genotoxicity screening assays. Table 8 summarizes and compares the available data. On the basis of alkylating activity, the brominated compounds are expected to be more reactive than chlorinated compounds. On the basis of data for chlorinated acetonitriles, and consistent with chemistry of halogenated compounds, increasing halogenation tends to decrease alkylating activity. Essentially mixed results have been observed in the screening assays. Despite their higher alkylating activity, monohalogenated acetonitriles tend to be inactive in a number of in vitro genotoxicity assays, probably because of complication by their higher cytotoxicity. There is some evidence that, in Comet, Chinese hamster ovary, and newt micronucleus assays, increasing chlorination increases the genotoxic potency (Table 8). However, this pattern is not seen in lung adenoma assay and skin tumor initiation studies in SENCAR mice. Probably the only consistent pattern seen across various assays is the higher activity of dibromoacetonitrile and bromochloroacetonitrile. Dibromoacetonitrile has already been selected for testing (Table 2). In this study, bromochloroacetonitrile has been given a moderate concern, whereas all other halogenated acetonitriles have been given a low-moderate concern. Two higher homologs of bromochloroacetonitriles (2,3-dichloro-3-bromopropanenitrile and 3,4-dichlorobutanenitrile) have also been considered of moderate concern because of SAR consideration, although they should be at the low end of the moderate category.

Table 6 (excerpt)
Structural analogy to dichloromethane, which is a rat carcinogen (47). The brominated compound is expected to be more hazardous than the chlorinated compound because of more favorable leaving tendency and GSH-mediated activation. Positive genotoxicity (Ames, ara forward mutation, E. coli) data (75-78).
2. Bromochloromethane, CH2BrCl. Concern: M. Structural analogy to dichloromethane, which is a rat carcinogen (47). The brominated compound is expected to be more hazardous than the chlorinated compound because of more favorable leaving tendency and GSH-mediated activation.
Bromochloroiodomethane, CHBrClI. Concern: M. Structural analogy to bromodichloromethane, which is a rodent carcinogen (47). The iodo group is expected to be a better leaving group than chloro group.
Dichloroiodomethane, CHCl2I. Concern: M. Structural analogy to bromodichloromethane and chloroform, which are both carcinogenic (47). The iodo group is expected to be an even better leaving group than the chloro/bromo group.
2,3-Dichlorobutane, CH3CH(Cl)CH(Cl)CH3. Concern: LM. Vicinal dichloro substitution may lead to GSH-mediated activation, but internal location of chlorine may limit its genotoxic potential.
11. Tetrachlorocyclopropene. Concern: LM. Limited structural analogy to hexachloropentadiene, which has negative bioassay data (47). However, this compound may have some genotoxic potential. One of the chlorines at the bridged carbon may leave and generate a carbonium ion that can be stabilized by the ring by resonance stabilization.
12. 1,1,5,5-Tetrachloropentane, Cl2CH(CH2)3CHCl2. Concern: Mar. Potential alkylating agent, but its genotoxic potential may be reduced because the potentially reactive terminal carbons are both dichlorinated, making them not as favorable as mono chlorine as leaving groups.
13. 1-Chlorooctane, ClCH2(CH2)6CH3. Concern: Mar. Despite the presence of a terminal chlorine, this compound is expected to be a weak alkylating agent because of its high molecular weight and its saturated chain.
14. 2-Chlorododecane, CH3CH(Cl)(CH2)9CH3. Concern: L. Expected to be a very weak alkylating agent because of its high molecular weight and its saturated chain.
Cyanogen chloride and cyanogen bromide have been given a low concern. They are known or expected to be metabolized to cyanide in the body. The expected high acute toxicity should limit significant exposure. There are also no structural alerts suggestive of carcinogenic potential.

Haloketones
Haloketones with monosubstitution with chlorine or bromine at the α-carbon or terminal carbon are expected to be potential alkylating agents. Haloketones with active halogen at both ends of the aliphatic chain are expected to be cross-linking agents. The leaving tendency of halogen tends to decrease with an increase in the degree of halogenation as the electron-withdrawing effect of the second and/or third halogen diminishes the leaving potential of the first halogen (22,48,56). On the other hand, haloketones with multiple halogenation at both α-carbons may lead to unstable compounds. The stability of several chlorinated ketones in aqueous solutions follows this order: 1,3-dichloro > pentachloro >> hexachloro (61).
A variety of haloketones have been tested in various screening assays. Consistent with their potential chemical reactivity as alkylating agents, three chloropropanones have been shown to react directly with GSH. Their relative potency follows this order: 1,3-dichloro > monochloro > 1,1-dichloro (62). Among five chloropropanones (mono-, 1,1-, 1,3-, 1,1,1-, and 1,1,3-) tested for skin tumor-initiating activity in SENCAR mice, only 1,3-dichloropropanone showed clearly positive results (63). With the exception of 1,1,1,3-tetrachloropropanone, all congeners of chloropropanones have been tested for mutagenicity in the Ames test. Among the mutagenic chloropropanones (mostly direct-acting), the relative mutagenic potency follows this order: 1,3- > 1,1,3,3- > penta- > 1,1,3- > 1,1,1- > 1,1-, with the potency of 1,3- being about 100 to 1,000 times higher than that of 1,1- (48,62). Inconsistent results have been observed in the Ames test on monochloropropanone because of its high cytotoxicity (which to some extent can be attenuated by inclusion of S9 mix) and on hexachloropropanone, which is relatively unstable in water (61). 1,3-Dichloro and, to a lesser extent, 1,1,3-trichloro congeners have also been consistently found to be more mutagenic than mono-, 1,1-, and 1,1,1- congeners in the E. coli SOS chromotest for DNA damage (SOS), Ames fluctuation, and newt micronucleus tests (64). Based on the above SAR and screening data, the cancer concern levels of 19 haloketones are summarized in Table 9 along with rationale and available data. Only 1,3-dichloropropanone and, to a lesser extent, 1,1,3-trichloropropanone have been given a moderate concern. Most of the other haloketones have been given a low-moderate concern, although there may be slight differences within the low-moderate category as detailed in the rationale of individual compounds.

Table 7 (excerpt)
... (79). Although the substitutions may reduce potential to undergo Michael addition or epoxidation, the terminal active chlorine may provide additional genotoxic potential.
trans-2,3,4-Trichloro-2-butenenitrile. Concern: M. This compound is a substituted acrylonitrile, a known rodent carcinogen (79). Although the substitutions may reduce potential to undergo Michael addition or epoxidation, the terminal active chlorine may provide additional genotoxic potential.
Bromoacetonitrile, BrCH2CN. Concern: LM. This compound has an active bromine but negative or mixed genotoxicity data (69,83,84), due possibly to its cytotoxicity.
Chloroacetonitrile, ClCH2CN. Concern: LM. This compound has an active chlorine. Positive skin tumor initiator (80) and positive in lung adenoma assay (81) but negative or mixed genotoxicity data (69,83,84), due possibly to its cytotoxicity.
9. Dichloroacetonitrile, Cl2CHCN. Concern: LM. This compound has a somewhat active chlorine. Negative in skin tumor initiation and lung adenoma assays (80,81); some positive and some equivocal genotoxicity data (69,83,84).
14. Cyanogen chloride, ClCN. Concern: L. This compound is known to be readily metabolized to cyanide in the body. The expected high acute toxicity should limit significant exposure. There is also no structural alert suggestive of cancer concern.
15. Cyanogen bromide, BrCN. Concern: L. This compound is expected to behave in the same way as cyanogen chloride.
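The two potency orders reported for the chloropropanones (direct reactivity with GSH and Ames mutagenicity) can be written down as ordered lists and compared. The sketch below is only an illustration: the list and function names are hypothetical, and the orders carry rank information only, because the text gives no magnitudes apart from the roughly 100- to 1,000-fold gap between the 1,3- and 1,1- congeners.

```python
# Potency orders as stated in the text, most potent first.
AMES_ORDER = ["1,3-", "1,1,3,3-", "penta-", "1,1,3-", "1,1,1-", "1,1-"]  # (48,62)
GSH_ORDER = ["1,3-", "mono-", "1,1-"]                                    # (62)


def rank(order, congener):
    """Position in the order (0 = most potent); ValueError if not listed."""
    return order.index(congener)


def more_potent(order, a, b):
    """True if congener a is ranked above congener b in the given order."""
    return rank(order, a) < rank(order, b)


def orders_agree(order_a, order_b):
    """True if the congeners present in both orders are ranked the same way."""
    shared_a = [c for c in order_a if c in order_b]
    shared_b = [c for c in order_b if c in order_a]
    return shared_a == shared_b


print(more_potent(AMES_ORDER, "1,3-", "1,1-"))  # True
print(orders_agree(AMES_ORDER, GSH_ORDER))      # True
```

Monochloropropanone and hexachloropropanone are absent from the Ames list because the text reports inconsistent Ames results for both. Only the 1,3- and 1,1- congeners appear in both orders, and both orders rank 1,3- above 1,1-.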

Nonhalogenated Aldehydes
As a class, aldehydes have been given special attention tailored to drinking water consideration. Essentially, aldehydes are electrophilic, reactive chemicals that may form DNA-protein cross-links and induce carcinogenesis/mutagenesis. A variety of aldehydes have been tested for carcinogenic activity (65). By the inhalation route formaldehyde and, to a much lesser extent, acetaldehyde are carcinogenic, whereas isobutyraldehyde is not carcinogenic even at doses that cause irritation to the respiratory tract. There is some suggestive evidence that acetaldehyde may be a potential ultimate carcinogen in alcoholics with genetically deficient detoxifying capabilities; however, the subject remains to be resolved. By the oral route, the α,β-unsaturated aldehyde, crotonaldehyde, is carcinogenic, whereas acrolein is equivocal, probably because it is too reactive.

Table 9 (excerpt)
... (48,89). The extensive chlorine substitution makes the compound unstable even at near neutral pH (89). Concern level at low end of LM.
... (63). Weak or mixed genotoxicity (+SOS, w+Ames, -newt micronucleus) data (48,86,87). Concern level at low end of LM.
The pentenedione chlorines at 2-position may be somewhat active. Concern level at low end of LM.
2-Chlorocyclohexanone. Concern: LM. The unsubstituted cyclohexanone is a weak to marginally active carcinogen (90). The introduction of active chlorine at the α-carbon is expected to increase genotoxic potential, but the rigid ring may limit its potential. Concern level at low end of LM.
Numerous aldehydes have been tested for mutagenic activity. In general, only short-chain aldehydes (e.g., formaldehyde, acetaldehyde) have been clearly shown to be mutagenic. The genotoxic potential of aldehydes decreases substantially with an increase in molecular size. The introduction of hydrophilic groups generally decreases activity, whereas α,β-unsaturation tends to increase the genotoxic potential provided that the β-position is not sterically hindered (66).
Although short-chain aldehydes such as formaldehyde and acetaldehyde are carcinogenic in animals by inhalation, their carcinogenic potential by the oral route may be limited unless exposure occurs at doses high enough to overwhelm the detoxification mechanisms or involves susceptible individuals. There is some evidence that hexamethylenetetramine, which is known to be hydrolyzed to formaldehyde and shown to induce local sarcomas by injection, has no carcinogenic activity when tested by the oral route (65). With the exception of α,β-unsaturated aldehydes, our assessment of the cancer hazard potential of aldehydes is based on the assumption that the principal route of exposure is oral and that the general population has adequate capacity to detoxify environmental levels of aldehydes. Humans are known to have genetic polymorphism in aldehyde dehydrogenase-2 (ALDH-2), and there is some suggestive (67) but inconsistent (68) evidence that subpopulations with deficient ALDH-2 may be at a higher cancer risk from acetaldehyde generated from consuming alcohol.
Among the nonhalogenated aldehydes considered in this study, butanedial is the only compound that has been given a moderate concern. Despite the lack of toxicity data, butanedial has been given a higher concern than the rest of the compounds because it has two terminal reactive aldehydes separated by two methylene groups, which should make it a highly favorable cross-linking agent. Four aldehydes (methyl glyoxal, cyanoformaldehyde, 2-hexenal, and propanal) are considered to be of low-moderate concern if they could be found in water in significant amounts. Higher molecular-weight aldehydes are not of significant concern by SAR consideration and comparison to isobutyraldehyde, which is not carcinogenic even by the inhalation route.

Summary and Conclusions
Determining appropriate drinking water DBP regulations is a complex problem. Disinfectants are necessary to protect against waterborne pathogens, and thus DBPs are unavoidable. Source water quality and constituents vary widely throughout the United States. Combined with the assortment of disinfectants available, this means that DBPs differ from site to site in both occurrence and concentration. Along with a number of DBPs that have some occurrence data, there are hundreds of chemicals that have been identified as DBPs but that have no quantitative occurrence data beyond this single identification. The conundrum presented by these hundreds of identified DBPs is how to determine research priorities. Two important factors to consider in setting regulations are the toxicity of the chemical and the concentration at which the chemical is found. For the majority of the chemicals in this article, no data were available on either factor. Gathering occurrence data and toxicity testing are both expensive and time-consuming activities. SAR analysis is essential in narrowing down health research priorities because it is time- and cost-effective. U.S. EPA efforts are ongoing to gather occurrence data for a number of DBPs of higher concern.
It is encouraging from a public health standpoint that although more than 200 DBPs were analyzed, only 20 were of moderate or higher concern for carcinogenic potential. Of these, four are structurally related to MX, which is believed to occur at very low levels (nanograms per liter), and are thus likely not of great concern. Five others are halogenated alkanes, which presumably will be controlled by existing and future THM regulations. As a result of this analysis, the most suitable candidates for testing are the halonitriles and haloketones in the moderate concern category, together with dibromonitromethane and butanedial.
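The priority count can be checked with a short tally. The sketch below treats the six-level concern scale as an ordinal list and sums the per-class counts of priority DBPs given in the abstract (209 analyzed; 4 MX analogs, 5 haloalkanes, 6 halonitriles, 2 haloketones, and 1 each of haloaldehyde, halonitroalkane, and dialdehyde). The variable and function names are hypothetical.

```python
# Ordinal concern scale used in this study, lowest to highest.
SCALE = ["low", "marginal", "low-moderate", "moderate", "high-moderate", "high"]


def is_priority(rating):
    """Priority concern means moderate or higher on the ordinal scale."""
    return SCALE.index(rating) >= SCALE.index("moderate")


TOTAL_ANALYZED = 209
PRIORITY_BY_CLASS = {
    "MX analogs": 4,
    "haloalkanes": 5,
    "halonitriles": 6,
    "haloketones": 2,
    "haloaldehyde": 1,
    "halonitroalkane": 1,
    "dialdehyde": 1,
}

n_priority = sum(PRIORITY_BY_CLASS.values())
# Candidates left after setting aside MX analogs and haloalkanes.
n_candidates = n_priority - PRIORITY_BY_CLASS["MX analogs"] - PRIORITY_BY_CLASS["haloalkanes"]
fraction = n_priority / TOTAL_ANALYZED

print(n_priority, n_candidates, f"{fraction:.1%}")
```

The tally reproduces the 20 priority DBPs and the 11 remaining candidates, which is a little under 10% of the 209 compounds analyzed.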