Use of QSARs in international decision-making frameworks to predict ecologic effects and environmental fate of chemical substances.

This article is a review of the use, by regulatory agencies and authorities, of quantitative structure-activity relationships (QSARs) to predict ecologic effects and environmental fate of chemicals. For many years, the U.S. Environmental Protection Agency has been the most prominent regulatory agency using QSARs to predict the ecologic effects and environmental fate of chemicals. However, as increasing numbers of standard QSAR methods are developed and validated to predict ecologic effects and environmental fate of chemicals, it is anticipated that more regulatory agencies and authorities will find them to be acceptable alternatives to chemical testing.


Aims, Goals, and Scope of Review
The manufacture, storage, distribution, and release to the environment of xenobiotic substances worldwide are controlled and regulated at local, national, and international levels. Control and regulation of chemical substances are mandated by legislation and implemented by regulatory agencies and authorities. To regulate the use of chemicals successfully, authorities require reliable data to be produced on the effects and fate of chemicals in the environment. Traditionally, these data have been provided by testing chemical substances by a number of well-defined protocols. However, until the passage of the Toxic Substances Control Act (TSCA 1976) in the United States and similar legislation elsewhere, such data were publicly available for only a fraction of industrial chemicals. Since the enactment of TSCA and creation of the TSCA Interagency Testing Committee (ITC 2002), the ITC has recommended testing for hundreds of chemicals, the U.S. Environmental Protection Agency (U.S. EPA) has implemented these recommendations, and the producers of these chemicals have conducted more than 900 tests on them (Walker 1993a, 1993b). The development of these data, in response to the ITC's recommendations, proved advantageous for subsequent nonstatutory programs designed to encourage the voluntary development of data. For example, the Organisation for Economic Cooperation and Development's (OECD) Screening Information Data Set (SIDS) and the U.S. EPA High Production Volume (HPV) Challenge programs benefited because most of the chemicals in the early phases of those programs had already been recommended for testing by the ITC (Walker 1993a). However, even with the creation of these nonstatutory programs, there are potentially thousands of non-HPV industrial chemicals that will continue to go untested.
Prioritization of these industrial chemicals for screening and testing requires the development and validation of standard methods to predict the ecologic effects and environmental fate of chemicals using quantitative structure-activity relationships (QSARs) (Russom et al. In press). Guidelines have been published for developing and using QSARs [Walker In press (a); Walker et al. In press (a)] and for predicting ecologic effects (Bradbury et al. In press). Nonetheless, further development and validation of standard QSAR methods are needed to gain widespread acceptance by the regulatory and regulated communities.
The aim of this article is to review the worldwide regulatory use of QSARs for predicting the ecologic effects and environmental fate of chemical substances (the regulatory use of QSARs for predicting human health effects forms the basis of a second review; Cronin et al. 2003). The use of QSARs by a number of regulatory agencies to prioritize chemicals for testing and to fill data needs is described. Although QSARs are applied by agencies worldwide, this review focuses on their use in North America and by the European Union (EU) and its member states. It should be emphasized that the purpose of this article is not to provide an extensive review of the use of QSARs per se but to review their regulatory application; further details on this complex and evolving topic may be obtained from a recent review and from a web-based database developed by the OECD (2002). QSARs themselves have been the subject of a number of excellent recent reviews [Boethling and Mackay 2000; ECETOC 1998; Karcher 1995; Lyman et al. 1990; Nendza 1997; Walker 2003, In press (a-e); Walker and Schultz 2002].

Regulatory Use-Europe
European Union. Risk assessment of the ecologic effects of chemicals is well described by van Leeuwen and Hermens (1995). The potential application of QSARs in this process is described in Nendza and Hermens (1995), Comber (In press), and Comber et al. (In press). In the EU, risk assessment of chemical substances is driven by the requirements of Commission Directive 93/67/EEC on Risk Assessment for New Notified Substances and Commission Regulation (EC) No. 1488/94 on Risk Assessment for Existing Substances (EEC 1993a, 1994; Joint Research Centre 1998). To ensure consistency of application of the environmental risk assessment (ERA) process, in 1996 the EU produced a comprehensive technical guidance document (TGD) to support the Directive on New Substances and the Regulation on Existing Substances (EEC 1996). This document includes a substantial chapter providing guidance on the use of QSARs in the ERA process in terms of where they should be used, how they should be used, and which ones should be used. Four specific uses for QSARs in ERA are identified in the EU TGD, and these are presented in more detail below.

Data Evaluation
Acceptable QSARs may be used as a supporting tool to evaluate the adequacy of the available experimental data, for example, when the validity of the test data is not obvious. This may occur when incomplete data are available on the test and/or the test differs in some way from current methods, for example, OECD test guidelines.

Decision for Further Testing/Testing Strategies
In those cases where a predicted environmental concentration (PEC)/predicted no effect concentration (PNEC) ratio, established using test data, is greater than 1, there will be a requirement to determine whether additional testing is needed to allow a refinement of the PEC/PNEC ratio. To facilitate this decision, all available test data should be reconsidered along with estimates established using acceptable QSARs. If PEC/PNEC ratios derived using the QSARs suggest that further testing is required, then generally a chronic test should be conducted on the species that showed the lowest estimated no-observable-effect concentration (NOEC).
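The decision logic in this step can be illustrated with a minimal sketch. The function names and the numerical values below are hypothetical and chosen only for illustration; the TGD itself defines how PEC and PNEC are derived, which is not reproduced here.

```python
from typing import Dict


def risk_quotient(pec_mg_per_l: float, pnec_mg_per_l: float) -> float:
    """Return the PEC/PNEC ratio; both values must be in the same units."""
    if pnec_mg_per_l <= 0:
        raise ValueError("PNEC must be positive")
    return pec_mg_per_l / pnec_mg_per_l


def needs_refinement(pec_mg_per_l: float, pnec_mg_per_l: float) -> bool:
    """A ratio greater than 1 triggers the decision on further testing."""
    return risk_quotient(pec_mg_per_l, pnec_mg_per_l) > 1.0


def most_sensitive_species(estimated_noec_mg_per_l: Dict[str, float]) -> str:
    """Return the species with the lowest estimated NOEC,
    i.e., the candidate for a chronic test."""
    return min(estimated_noec_mg_per_l, key=estimated_noec_mg_per_l.get)


# Hypothetical QSAR-estimated NOECs (mg/L), for illustration only.
noecs = {"fish": 0.8, "daphnia": 0.12, "algae": 0.5}

if needs_refinement(pec_mg_per_l=0.05, pnec_mg_per_l=0.012):
    print("Consider a chronic test on:", most_sensitive_species(noecs))
```

The sketch only captures the comparison against 1 and the choice of the most sensitive species; the judgment about whether test data or QSAR estimates should drive the ratio remains with the assessor.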

Establishing Specific Parameters
Acceptable QSARs can be used for the estimation of specific (input) parameters used in the risk assessment, particularly in the exposure assessment when no measured data are available to enable derivation of the PEC.

Identifying Data Needs on Effects of Potential Concern
Acceptable QSARs can be used for preliminary assessment of endpoints that are not part of the base set of data and for which information is not available.
Thus QSARs are used to estimate missing physicochemical parameters in environmental exposure assessment. Rules for the use of QSARs in environmental effects assessment are also provided in the TGD. These simply state that a QSAR is considered to be acceptable for a particular use within the risk assessment process if a) the QSAR applied has been validated by an appropriate process, and b) the estimate possesses the necessary accuracy for the intended use.
At the time of writing, the TGD is nearing the end of a major revision that, among other things, will incorporate a risk assessment of the marine environment into the process. This will include manual comparison of persistence, bioconcentration, and toxicity properties against specific criteria; the use of QSARs is allowed where experimental data are not available. This is an implementation of the European Commission's interim strategy for management of persistent bioaccumulative and toxic substances (PBTs) and very persistent, very bioaccumulative substances (vPvB) (CEC 2001b), and the criteria used are shown in Table 1.
The Convention for the Protection of the Marine Environment of the Northeast Atlantic: DYNAMEC prioritization of hazardous substances. The Commission for the Protection of the Marine Environment of the Northeast Atlantic (OSPAR Commission) has developed a strategy regarding hazardous substances that involved the development and application of a dynamic selection and prioritization mechanism (DYNAMEC) to produce an updated list of chemicals for priority action. The prioritization mechanism was based on persistence, toxicity, and bioaccumulation properties of chemicals and was applied to a data set of more than 180,000 substances (OSPAR Commission 2000). For many of the chemicals, no experimental test data were available, and the data used for the prioritization exercise were generated using QSARs, including the Syracuse Research Corporation (SRC, Syracuse, New York, USA) BIOWIN model for the prediction of persistence, KOWWIN and BCFWIN for bioaccumulation, and ECOSAR for toxicity.
REACH initiative. The European Commission (CEC 2001b) has recently published a white paper setting out a "strategy for a future chemicals policy," which discusses the registration, evaluation, and authorisation of chemicals, the so-called REACH initiative, for new and existing chemical substances marketed in quantities of more than 1 metric ton/enterprise/year. The 30,000 existing substances affected will be processed on a phased basis over a period of 11 years (ending 2012), starting with those marketed in the highest volumes, as well as those with very high hazard (once data have been registered). There is a desire to decrease the time taken to assess the risk of thousands of existing chemicals that are in current use. An important part of this chemicals policy is the fostering of research on development and validation of testing methods as alternatives to animal testing, including QSAR models.
This theme has been taken up by the European Parliament (2001a, 2001b), which has requested the use of screening procedures based on simplified risk assessment using data modeling, for example, QSARs and use patterns, to prioritize substances of possible concern "in order to speed up risk assessments" (European Parliament 2001a). In addition, the European Commission (CEC 2001c) has recently published an "Interim Strategy for Management of PBT and vPvB Substances," which states that the identification and verification of PBTs and vPvBs among new and existing substances will use QSAR models "where experimental data do not exist." The Danish Environmental Protection Agency (Danish EPA) has proposed that its QSAR database be used for this purpose.
EU Scientific Committee on Toxicity, Ecotoxicity and the Environment. The EU Scientific Committee on Toxicity, Ecotoxicity and the Environment (CSTEE) recommended, in its general data requirements for regulatory submissions, that QSAR data may also be used. However, the committee cautioned that, for predictions, chemicals should be assessed not only to ensure they are of similar structure to the chemicals used to derive the QSAR but also to ensure that they operate by a similar mode of action. Moreover, to take into account the higher level of uncertainty of predicted data compared with experimental data, different rules (i.e., stricter application of safety factors and triggers) should be applied to QSAR information.
Danish self-classification of dangerous substances. The Danish EPA (2001) developed an advisory list for self-classification of dangerous substances using QSAR models. Of approximately 47,000 substances that were examined, 20,624 substances were identified as requiring classification for one or more of the following dangerous properties: acute oral toxicity, sensitization by skin contact, mutagenicity, carcinogenicity, and danger to the aquatic environment. The Danish EPA stated that "the [QSAR] models used here are now so reliable that they are able to predict whether a given substance has one or more of the properties selected with an accuracy of approximately 70-85%" (Danish EPA 2001). With regard to environmental effects, specific endpoints predicted included bioconcentration, biodegradation, and fish acute toxicity.

Table 1. PBT and vPvB criteria.
Criterion | PBT criteria | vPvB criteria
P | Half-life > 60 days in marine water or > 40 days in fresh water (a), or half-life > 180 days in marine sediment or > 120 days in freshwater sediment (a) | Half-life > 60 days in marine water or fresh water, or > 180 days in marine or freshwater sediment
B | BCF > 2,000 | BCF > 5,000
T | Chronic NOEC < 0.01 mg/L, or CMR, or endocrine-disrupting effects | Not applicable
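The comparison of substance properties against the Table 1 criteria can be sketched as a simple screening function. This is a minimal illustration, not an implementation of any agency's procedure: the class and field names are hypothetical, missing values are treated as "criterion not met," and the input values may be measured or QSAR-estimated.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SubstanceData:
    # Half-lives in days; None means no measured or estimated value is available.
    half_life_marine_water: Optional[float] = None
    half_life_fresh_water: Optional[float] = None
    half_life_marine_sediment: Optional[float] = None
    half_life_fresh_sediment: Optional[float] = None
    bcf: Optional[float] = None                    # bioconcentration factor
    chronic_noec_mg_per_l: Optional[float] = None
    cmr: bool = False
    endocrine_disrupting: bool = False


def _exceeds(value: Optional[float], threshold: float) -> bool:
    """True only when a value is available and above the threshold."""
    return value is not None and value > threshold


def is_persistent(s: SubstanceData) -> bool:
    return (_exceeds(s.half_life_marine_water, 60)
            or _exceeds(s.half_life_fresh_water, 40)
            or _exceeds(s.half_life_marine_sediment, 180)
            or _exceeds(s.half_life_fresh_sediment, 120))


def is_very_persistent(s: SubstanceData) -> bool:
    return (_exceeds(s.half_life_marine_water, 60)
            or _exceeds(s.half_life_fresh_water, 60)
            or _exceeds(s.half_life_marine_sediment, 180)
            or _exceeds(s.half_life_fresh_sediment, 180))


def is_toxic(s: SubstanceData) -> bool:
    low_noec = s.chronic_noec_mg_per_l is not None and s.chronic_noec_mg_per_l < 0.01
    return low_noec or s.cmr or s.endocrine_disrupting


def screen(s: SubstanceData) -> Dict[str, bool]:
    """Apply the Table 1 thresholds; the T criterion does not apply to vPvB."""
    return {
        "PBT": is_persistent(s) and _exceeds(s.bcf, 2000) and is_toxic(s),
        "vPvB": is_very_persistent(s) and _exceeds(s.bcf, 5000),
    }


# Hypothetical substance, for illustration only.
example = SubstanceData(half_life_fresh_water=50, bcf=3000, chronic_noec_mg_per_l=0.005)
print(screen(example))
```

Treating a missing value as "not met" is a design choice made for the sketch; a real screening exercise would more likely flag such substances for data generation or QSAR estimation.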
Danish QSAR database. The Danish EPA has made extensive use of QSARs and has developed a QSAR database that contains predicted data on more than 166,000 substances (OSPAR Commission 2000). A recent publication from the Danish EPA (Tyle et al. 2001) reports the use of QSARs for identification of potential PBT and vPvB substances from among the HPV and medium-production-volume chemicals used in the EU.
German use of QSARs. Lange and Vormann (1995) of the German Umweltbundesamt (UBA) have reported on the use of well-designed QSARs both to fill missing data needs and to give some assurance of the quality of the available experimental test data. They described how the German Regulatory Authority developed a QSAR-based software system to assess the validity of predictions made for data supplied with newly notified chemicals. The system predicted a wide range of fate and effect endpoints, including fish and Daphnia acute toxicity. However, the authors explained that for 64% of the substances assessed, some of the QSARs were not applicable because the substances were ionic, contained heavy elements, hydrolyzed rapidly, or were reaction mixtures. Generally, the German UBA uses these QSAR approaches and software within the framework of the assessment of chemicals.
Use of QSARs in the Netherlands. Many of the applications of QSARs in the Netherlands are reported by van Leeuwen and Hermens (1995). An approach was developed to classify environmental pollutants according to mode (or mechanism) of action and subsequently to make predictions of toxicity (Verhaar et al. 1992, 1996, 2000). The approach allows for the calculation of a "baseline" toxicity and then a classification scheme for four classes of toxicity: class 1 (inert chemicals, baseline toxicity, typically nonpolar narcosis), class 2 (relatively inert chemicals, typically polar narcosis), class 3 (reactive chemicals, typically compounds capable of covalent electrophilic and nucleophilic interactions with biological macromolecules), and class 4 (specifically acting chemicals such as pesticides). The classification approach was applied to predict the toxicity of HPV chemicals (Bol et al. 1993; Verhaar et al. 1994). For inert toxicants (i.e., nonpolar narcotics), QSARs were published for the no-observable-effect concentration values for 19 different species of bacteria, algae, fungi, protozoa, fish, amphibians, and other invertebrates. The information from the QSARs was used to calculate hazard concentrations for 5% of the species, with a low probability of having an effect at the ecosystem level (van Leeuwen 1995).
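To illustrate how a hazard concentration for 5% of species (HC5) can be obtained from a set of species NOECs, the following sketch fits a log-normal species sensitivity distribution and takes its 5th percentile. This is an assumption made for illustration: the source does not state which distribution or extrapolation constants were used, and the function name and input values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Sequence


def hc5_lognormal(noecs_mg_per_l: Sequence[float], fraction: float = 0.05) -> float:
    """Estimate the concentration below which `fraction` of species NOECs fall,
    assuming log10(NOEC) is normally distributed across species."""
    if len(noecs_mg_per_l) < 2:
        raise ValueError("at least two NOEC values are needed")
    logs = [math.log10(x) for x in noecs_mg_per_l]  # raises for non-positive input
    mu = mean(logs)
    sigma = stdev(logs)                     # sample standard deviation
    z = NormalDist().inv_cdf(fraction)      # negative for fraction < 0.5
    return 10 ** (mu + z * sigma)


# Hypothetical QSAR-estimated NOECs (mg/L) for several species.
noecs = [1.0, 10.0, 100.0]
print(hc5_lognormal(noecs))
```

The sketch omits the confidence-level corrections that formal extrapolation methods apply for small sample sizes, so it should be read as the basic idea only.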

Regulatory Use-United States
ITC. The ITC is not a regulatory organization; however, many of the 16 U.S. government organizations represented on ITC have regulatory responsibilities. The ITC was created under section 4(e) of TSCA as an independent advisory committee to the U.S. EPA administrator (U.S. EPA 2002e). The ITC was created to identify industrial chemicals in need of testing and to add them to the priority testing list in May and November reports to the U.S. EPA administrator (Walker 1993a). The ITC has a statutory responsibility to use SARs to identify these chemicals (Walker 2003). U.S. governmental organizations represented on the ITC that apply QSARs include the U.S. EPA, Agency for Toxic Substances and Disease Registry (ATSDR), and the U.S. Food and Drug Administration (U.S. FDA) (Walker 2003).
U.S. EPA. The U.S. EPA has received about 38,000 premanufacture notifications (PMNs) for new chemicals and currently receives about 2,000 per annum. Because TSCA does not require testing before submission of a PMN, SARs and QSARs are used to predict the environmental fate and ecologic effects (Walker 2003). To assess the risk of a new chemical, the U.S. EPA makes predictions concerning chemical identity, physical/chemical properties, environmental transport and partitioning, environmental fate, environmental toxicity, engineering releases to the environment, and environmental concentrations. The agency uses a variety of methods to make predictions that include SARs, nearest analogue analysis, chemical class analogy, mechanisms of toxicity, chemical industry survey data, and professional judgment. The agency uses these methods to fill data gaps in an assessment and to validate submitted data in notifications (Nabholz 2001). Predictions are made by the U.S. EPA Office of Pollution Prevention and Toxics (OPPT) under TSCA (Zeeman et al. 1995). The OPPT has routinely used QSARs to predict ecologic hazards, fate, and risks of new industrial chemicals, as well as to identify new chemical testing needs, for more than two decades.
OPPT (Q)SARs for physical/chemical properties used for new chemical assessments are available (U.S. EPA 2002b). Many of the QSARs applied by the U.S. EPA to make predictions are based on estimates of the logarithm of the octanol-water partition coefficient (log Kow). Log Kow is predicted by the KOWWIN software that is available within the EPIsuite software (which can be down-

Regulatory Use-Canada
The Canadian Environmental Protection Act (1999) requires that the 23,000 substances on the Canadian Domestic Substance List (DSL) (MacDonald et al. 2002) should be categorized and screened for persistence or bioconcentration and inherent toxicity. To assist with the categorization process, Environment Canada formed a technical advisory group (TAG) in December 1998. The TAG consists of scientific and technical experts from government, industry, environmental organizations, and consultant groups and was formed to act as a resource to Environment Canada for identifying and resolving scientific and technical issues that emerge from the implementation of the categorization program (Environment Canada 2002). As recommended by the TAG, an international workshop on QSARs hosted by Environment Canada was held in Philadelphia in November 1999 to establish "rules of thumb" for predicting properties and effects of the structurally diverse DSL chemicals (MacDonald et al. 2002).

Regulatory Use-Australia
Use of SARs and QSARs by Australian government organizations has been described previously. Australian regulatory authorities do not currently use QSARs in relation to the listing of chemicals on the Australian Inventory of Chemical Substances (http://www.nicnas.gov.au/obligations/aics.htm). With regard to new chemical assessments, limited use may be permitted in the National Industrial Chemicals Notification and Assessment Scheme (http://www.nicnas.gov.au). For example, data from structurally similar chemicals (analogues) may be used if toxicologic and ecotoxicologic data are unavailable on the notified chemical. From the analogue data, conclusions may be drawn on the health and environmental effects of the notified chemical. However, Australian authorities do not routinely accept SAR predictions without support from actual test data, for example, physicochemical properties and ecotoxicologic effects.

Regulatory Use-Japan
Japanese regulatory authorities do not currently make predictions of ecologic effects and environmental fate using QSARs. However, this situation is under review by the Japanese Ministry of the Environment.

Organizations Involved with the Validation of QSARs
Despite not being formal regulatory agencies, two bodies, the European Centre for Validation of Alternative Methods (ECVAM) in the EU and the Interagency Coordinating Committee on the Validation of Alternative Methods in the United States, have responsibility for the validation of alternative methods to the use of animals in the safety evaluation of chemical substances. Alternative methods include in vitro tests as well as QSARs and other computer modeling techniques. ECVAM has evaluated the development and the potential for the validation of expert systems, including those using QSARs for predicting toxicity.
The OECD has also been involved in the assessment of QSARs to predict toxicity and physicochemical properties. Workshop reports made recommendations for the use of QSARs in aquatic toxicity prediction (OECD 1992) and for physicochemical properties (OECD 1993a) and biodegradation (OECD 1993b). Many of these recommendations were incorporated into the EU TGD. The recommendations of the 1993 report for aquatic toxicity prediction have been updated recently (OECD 2000).
The OECD was also responsible for collating the results from a tripartite (United States, EU, Japan) assessment of QSARs to predict toxicity (Comber and Feijtel 1997; Feijtel 1995; Karcher et al. 1995; OECD 1994a, 1994b; U.S. EPA 1994a). This study compared ecologic effects, physical properties, and environmental fate predictions made by the U.S. EPA with the EC's minimum premarket data (MPD) for 144 chemicals. The results of the study were quite useful in judging many of the strengths and weaknesses of the U.S. EPA approach, as well as determining the utility of MPD-type data in improving the U.S. EPA predictive methods. There were some limitations to the exercise; namely, only small data sets were available, the endpoints used for comparison were limited to the tests included in the MPD data set, different approaches were used to ascertain certain parameters, and indirect measurement in some MPD data sets was used for one or more physical/chemical properties (i.e., extrapolation), which may or may not give a "true" result [for more details on this comparative study, refer to U.S. EPA (2002f)].

QSARs Based on Chemical Classes
Toxicologic QSARs were based originally on congeneric series of chemicals (Schultz et al. 2003). Such QSARs are based on the assumption that compounds with the same functional group (e.g., an aliphatic alcohol) have the same mode of action. This approach can be limited, however, by the difficulty of identifying the correct class for chemicals with more than one functional group and more than one mode of action. The ecotoxicity structure activity (ECOSAR) database developed by the U.S. EPA (2002b) includes more than 150 QSARs for more than 50 chemical classes. These chemical classes are listed in Table 2.
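The structure of a class-based QSAR lookup can be sketched as follows, assuming the common case of a linear relationship between log LC50 and log Kow within each class. The class names, slopes, and intercepts are placeholders invented for illustration; they are not ECOSAR values and must not be used for prediction.

```python
from typing import Dict, Tuple

# Hypothetical (slope, intercept) pairs for log10 LC50 = slope * log Kow + intercept.
# These numbers are placeholders, not coefficients from ECOSAR or any published QSAR.
CLASS_QSARS: Dict[str, Tuple[float, float]] = {
    "aliphatic_alcohols": (-0.9, 1.7),
    "example_class_b": (-0.7, 1.0),
}


def predict_log_lc50(chemical_class: str, log_kow: float) -> float:
    """Apply the linear QSAR registered for the given chemical class."""
    try:
        slope, intercept = CLASS_QSARS[chemical_class]
    except KeyError:
        # No QSAR exists for this class, so no prediction is returned.
        raise ValueError(f"no QSAR available for class {chemical_class!r}")
    return slope * log_kow + intercept


print(predict_log_lc50("aliphatic_alcohols", 2.0))
```

The lookup makes the limitation noted above concrete: a chemical with two functional groups would match two entries, and the sketch offers no rule for choosing between them.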

QSARs Based on Modes of Action
QSARs based on mode of action represent an alternative to QSARs for classes of chemicals. Several modes of action have been predicted based on fish toxicity studies (Russom et al. 1997). These predicted modes of action have been incorporated into the U.S. EPA Assessment Tools for the Evaluation of Risk (ASTER) system, which links two empirical databases: the Aquatic Toxicity Information Retrieval (AQUIRE) database (Pilli et al. 1989) component of the U.S. EPA ECOTOX database (U.S. EPA 2002a), and the physical/chemical properties database (Russom et al. 1991). Further tools exist for classifying compounds according to mechanisms of action, including nonpolar narcosis, polar narcosis, reactive, and specific categories (Boxall et al. 1997; Verhaar et al. 1992, 1996, 2000).

Expert Systems
A number of commercially available QSAR-based "expert systems" have been developed to predict different toxicity endpoints. For general reviews of expert systems, the reader is referred to Dearden et al. (1997) and Hulzebos et al. (1999).

TOxicity prediction by Komputer-Assisted Technology (TOPKAT). TOPKAT contains a suite of predictive models for both environmental and human health effects. It is marketed currently by Accelrys Inc. (Cambridge, UK). It has two models suitable for the prediction of acute toxicity. These ecotoxicity models are based on the fathead minnow and Daphnia magna toxicity data. The TOPKAT methodology is based on the collation of the toxicity data for large, chemically heterogeneous groups of chemicals. The chemicals may then be subdivided into any number of groups on the basis of chemical structure (e.g., aromatic, aliphatic, heterocyclic compounds), and then QSARs can be developed for each group. The QSARs are typically developed using topological, or atom-based, descriptors. An integral part of the prediction is an assessment of whether the chemical for which the prediction is made falls within the optimum prediction space (OPS) of the model (Gombar 1999). This is the "coverage" of the model as determined by the physicochemical descriptors from which it was derived. Predictions for compounds that fall outside of this range should be discarded.
The Fathead Minnow LC50 module of the TOPKAT package is composed of eight statistically significant and cross-validated quantitative structure-toxicity relationship (QSTR) models, and the data from which the models are derived. Each QSTR model assesses acute median lethal concentration (LC50) to fathead minnows in weight/volume units for a specific class of chemicals. The Daphnia magna module of the TOPKAT package is composed of a) four statistically significant and cross-validated QSTR models, and b) a database of 252 uniform experimental acute median effective concentration (EC50) values selected after critical review of the open literature and the AQUIRE database.
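The idea of checking whether a query chemical lies within a model's coverage can be illustrated with a deliberately simple range check on the training-set descriptors. This is not the TOPKAT OPS algorithm, whose details are not given here; it is a bounding-box stand-in, and the descriptor names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Descriptors = Dict[str, float]  # descriptor name -> value for one chemical


def descriptor_ranges(training_set: List[Descriptors]) -> Dict[str, Tuple[float, float]]:
    """Record the minimum and maximum of each descriptor over the training set."""
    names = training_set[0].keys()
    return {
        name: (min(c[name] for c in training_set), max(c[name] for c in training_set))
        for name in names
    }


def within_coverage(query: Descriptors, ranges: Dict[str, Tuple[float, float]]) -> bool:
    """True if every descriptor of the query lies inside the training range."""
    return all(low <= query[name] <= high for name, (low, high) in ranges.items())


# Hypothetical training descriptors, for illustration only.
training = [
    {"log_kow": 1.2, "mol_weight": 90.0},
    {"log_kow": 3.5, "mol_weight": 210.0},
    {"log_kow": 2.1, "mol_weight": 150.0},
]
ranges = descriptor_ranges(training)

query = {"log_kow": 5.8, "mol_weight": 180.0}
if not within_coverage(query, ranges):
    print("Query is outside the model's coverage; discard the prediction.")
```

A bounding box ignores correlations between descriptors, so a query can pass this check while still being far from any training chemical; more refined coverage measures address that weakness.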
Computer-automated structure evaluation (CASE) methodology. CASE methodology, and all its variants, has been developed by Klopman and colleagues (Klopman 1992; Klopman and Rosenkranz 1991). There are a multitude of models available for a variety of endpoints and hardware platforms [more details are available from the manufacturer: MultiCASE Inc. (2002)]. The CASE technique is based on the development of regression-type models after the analysis of the toxicity of a large and chemically heterogeneous data set. The technique splits the molecules into all possible fragments and seeks to find those two-dimensional fragments that are associated with toxicity (either promoting or reducing it). A regression model is then developed from those fragments, normally without the use of electronic or three-dimensional structural descriptors. There are many forms of the CASE models; the software is variously called CASE, MultiCASE (MCASE), CASE-TOX, and TOXALERT (MultiCASE Inc. 2002), depending on the endpoint being modeled and the hardware platform. For the successful prediction of acute toxicity, the baseline activity identification algorithm (Klopman 1998) simulates the "baseline" effect of narcosis. In this manner, CASE-TOX models have been developed for the fathead minnow on the basis of 479 test values (Klopman et al. 2000) and for the guppy on the basis of 219 test values (Klopman et al. 1999).
Optimized approach based on structural indices set (OASIS). OASIS is a shell system for screening chemical inventories for physicochemical and toxic endpoints accounting for conformational flexibility of chemicals. It was developed and defined by Mekenyan et al. (1990, 1994). The OASIS Forecast software (designed for personal computers with Microsoft Windows; Laboratory of Mathematical Chemistry, University "Prof. As. Zlatarov," Bourgas, Bulgaria) is an interfacing program providing screening of chemicals by making use of QSAR models. Presently, models are available for bioavailability, log Kow, acute toxicity, phototoxicity, estrogen and androgen binding affinity, and mutagenicity. The system is flexible in terms of its ability to be employed as an expert-system shell for implementing subsequently derived QSARs or QSARs derived by other laboratories. QSAR models are described as logistical decision trees that consist of multiple hierarchically ordered rules based on parameter ranges that comprise reactivity patterns associated with toxic or physicochemical endpoints. These patterns could be expressed by steric, electronic, and two-dimensional (fragment) structural requirements. Boolean logic operators are used to establish "rules" in the decision tree. If the value of a parameter calculated for a conformer of a chemical is found in a range of the molecular descriptors defined by certain confidence limits, it is assumed that a chemical meets the specific requirement for eliciting the endpoint with a certain probability. OASIS Forecast is unique in terms of accounting for conformational flexibility of chemicals during the screening. It provides two approaches for that purpose.
When applied to large databases of three-dimensional structures where a chemical is represented by a single conformer, one can use the "tweak" algorithm, which is based on a directed conformational search that aims to bring the distance between the fragments within the specified limits. In effect, the rotatable bonds of the structures are adjusted to produce a conformation that matches as closely as possible a given three-dimensional requirement. The second alternative, to account for conformational flexibility of the screened molecules, employs the recently developed method for conformational coverage based on a genetic algorithm. The generation of sets of energetically reasonable conformers, providing coverage of conformational space, however, requires subsequent quantum chemical assessment of their electronic structures. Hence, the second alternative is much more time-consuming than the directed (tweak) conformational analysis. The conformational analysis and subsequent quantum chemical calculations are performed on the fly during the screening process. The OASIS expert system has been developed for the prediction of acute aquatic toxicity of noncongeneric chemicals using a two-step SAR approach (Karabunarliev et al. In press). It employs a) a computerized rule-based discrimination of chemicals and b) correlative QSARs using log Kow and quantum-chemical descriptors. The Duluth database (Russom et al. 1997) of acute toxicities to fathead minnow for about 660 organic chemicals served as the data source to develop an explicit rule-based discrimination scheme and QSARs for the separate chemical categories. The OASIS system has been assessed for its capabilities to predict fish toxicity (Moore et al. In press).
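The rule structure described for OASIS (hierarchically ordered rules, each a Boolean combination of descriptor ranges evaluated over the conformers of a chemical) can be sketched as follows. This is a schematic reading of the description above, not the OASIS implementation; the descriptor names, ranges, and category labels are hypothetical, and the probabilistic interpretation of a match is omitted.

```python
from typing import Dict, List, Optional, Tuple

Conformer = Dict[str, float]            # descriptor name -> value for one conformer
Rule = Dict[str, Tuple[float, float]]   # descriptor name -> (lower, upper) limits


def conformer_matches(conformer: Conformer, rule: Rule) -> bool:
    """All descriptor ranges in the rule must be satisfied (logical AND)."""
    return all(
        name in conformer and low <= conformer[name] <= high
        for name, (low, high) in rule.items()
    )


def chemical_matches(conformers: List[Conformer], rule: Rule) -> bool:
    """A chemical meets the rule if at least one conformer does (logical OR)."""
    return any(conformer_matches(c, rule) for c in conformers)


def classify(conformers: List[Conformer],
             ordered_rules: List[Tuple[str, Rule]]) -> Optional[str]:
    """Walk the rules in hierarchical order and return the first matching label."""
    for label, rule in ordered_rules:
        if chemical_matches(conformers, rule):
            return label
    return None


# Hypothetical rules and conformers, for illustration only.
rules = [
    ("reactive", {"e_lumo_ev": (-2.0, -0.5), "fragment_distance_angstrom": (4.0, 6.0)}),
    ("baseline", {"e_lumo_ev": (-0.5, 3.0)}),
]
chemical = [
    {"e_lumo_ev": -1.1, "fragment_distance_angstrom": 7.2},
    {"e_lumo_ev": -1.0, "fragment_distance_angstrom": 5.1},
]
print(classify(chemical, rules))
```

Evaluating every conformer is what makes the screening sensitive to conformational flexibility: in the example, the first conformer fails the distance requirement but the second satisfies it.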
CATABOL. CATABOL is a mechanistic modeling approach for quantitative assessment of biodegradability. The system generates the most plausible biodegradation products and provides quantitative assessment for their solubility and toxic endpoints. The core of CATABOL is the biodegradability simulator including a library of hierarchically ordered individual transformations (catabolic steps). The catabolic steps are derived from a set of the most plausible metabolic pathways predicted by experts for each chemical from the Japanese Ministry of International Trade and Industry (MITI) database training set. The data in the training set agreed well with the calculated biological oxygen demands (r² = 0.90) in the entire range; that is, a good fit was observed for readily degraded, intermediate degraded, and difficult-to-degrade chemicals. After introducing 60% theoretical oxygen demand as a cutoff value, the model correctly predicted 98% of the readily biodegradable structures and 96% of the not readily biodegradable structures. Cross-validation by leaving out 25% of the data and recalculating the model 4 times resulted in Q² = 0.88 between observed and predicted values.
CATABOL is a hybrid of a knowledge-based expert system for predicting biotransformation pathways, working in tandem with a probabilistic model that calculates probabilities of the individual transformations and overall biological oxygen demand and/or extent of CO2 production. The novelty of the model is that the biodegradation extent is assessed on the basis of the entire pathway and not, as with all other models, on the parent structure alone. The second novelty is that CATABOL explicitly considers effects of adjacent fragments before executing each transformation step. CATABOL was described in detail by Jaworska et al. (2002).
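The two evaluation steps reported for CATABOL, a cutoff-based classification and a leave-25%-out cross-validation, can be sketched generically. The one-variable least-squares line below is only a stand-in for the actual CATABOL model, the inclusive treatment of the 60% cutoff is an assumption, and all names and data are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def is_readily_biodegradable(bod_fraction_of_thod: float, cutoff: float = 0.60) -> bool:
    """Classify by the cutoff on oxygen demand expressed as a fraction of the
    theoretical oxygen demand (assumed inclusive at the cutoff)."""
    return bod_fraction_of_thod >= cutoff


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_q2(xs: Sequence[float], ys: Sequence[float], folds: int = 4) -> float:
    """Leave out one fold (25% when folds=4), refit, predict the held-out points,
    and return Q2 = 1 - PRESS / total sum of squares."""
    predictions: List[float] = [0.0] * len(xs)
    for k in range(folds):
        train = [i for i in range(len(xs)) if i % folds != k]
        held_out = [i for i in range(len(xs)) if i % folds == k]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            predictions[i] = slope * xs[i] + intercept
    press = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    total = sum((y - mean(ys)) ** 2 for y in ys)
    return 1.0 - press / total


# Hypothetical descriptor values and observed oxygen-demand fractions.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.05, 0.20, 0.28, 0.45, 0.52, 0.70, 0.78, 0.95]
print(cross_validated_q2(x, y))
print([is_readily_biodegradable(v) for v in y])
```

Because each point is predicted by a model that never saw it, Q² is normally somewhat lower than the fitted r², which is consistent with the 0.88 versus 0.90 reported above.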
Substructure-based computerized chemical selection expert system (SuCCSES). After years of recommending chemical classes based on SARs, the ITC determined that a computerized system was needed to use historical information and expert opinions to facilitate the ITC's review of large groups of chemicals with similar substructures (and modes of action, if available). Historical information and expert opinions used to identify ecologic and health effects associated with chemicals were captured in SuCCSES. Historical information was obtained from the ITC scoring exercises 1, 2, 3, 4, and 5 that were convened from 1978 to 1983 (Walker 1993a, 1993c, 1995). For each workshop the ITC invited expert toxicologists, pharmacologists, oncologists, geneticists, and ecologists to score (within their discipline) several hundred chemicals using criteria or thresholds for potential to cause acute, subchronic, mutagenic, carcinogenic, developmental, or reproductive effects to humans or acute effects to aquatic organisms, and for bioconcentration potential (Walker 1995).
Expert opinions were based on SARs, empirical data, and professional judgment. Opinions were developed by first convening separate panels of health effects experts and ecologic effects experts. These experts were asked to identify chemical substructures that were associated with chemicals likely to cause health or ecologic effects, respectively. Each panel was instructed to a) identify appropriate toxicologic endpoint categories for health effects or ecologic effects that could be used to classify chemicals with identical substructures and similar modes of toxic action, b) describe chemical substructures that could be associated with toxicologic endpoint categories for health effects or ecologic effects, c) develop a questionnaire that could be sent to health effects or ecologic effects experts to solicit their opinions on the reliability of the toxicologic endpoint categories and associated chemical substructures, and d) provide names of health effects or ecologic effects experts to whom the questionnaire then could be sent. Additional information on health effects that are coded in SuCCSES is provided in the companion article.
For ecologic effects, numerous international experts were sent a questionnaire listing more than 100 different chemical substructures and asked to predict (based on their field of expertise related to ecologic effects and knowledge of modes of action) the potential for chemicals containing any of the substructures to cause effects to algae, aquatic invertebrates, birds, fish, mammals, microorganisms, plants, or terrestrial invertebrates. Opinions from these ecologic effects experts were converted to codes that identified chemical substructures and indicated potential of chemicals containing one or more substructures to cause effects to the previously listed organisms.
SuCCSES is a relational database with fields indexed on Chemical Abstracts Service (CAS) Registry numbers. It was created using Molecular Design Limited Information Systems' Integrated Scientific Information System. For chemicals in SuCCSES, chemical formulas, molecular weights, two-dimensional chemical structures, and simplified molecular input line entry system (SMILES) notations (Weininger 1988; Weininger et al. 1989) are provided. Features of SuCCSES promote substructure searching to identify structurally related classes of chemicals and their potential ecologic or human health effects. For each record in SuCCSES, a computer screen displays fields for CAS Registry number, molecular weight, molecular formula, SMILES notation, two-dimensional chemical structure, chemical name, predicted mode of action, and potential health or ecologic effects. SuCCSES is not available to the public because it contains confidential business information.
SuCCSES is used to facilitate the ITC's review of large groups of chemicals (Walker 1991, 1995; Walker and Brink 1989). The aldehyde substructures in SuCCSES that were associated with potential to cause acute effects to aquatic organisms were included in recent publications [Walker and Printup In press; Walker et al. In press (b)]. A forthcoming book chapter summarizes the development of SuCCSES and its applicability to the ITC's statutory mandate to use SARs before recommending chemicals for testing in May and November reports to the U.S. EPA administrator (Walker and Gray In press).
Other approaches to toxicity prediction. Other approaches to predict toxicity include those for predicting water quality objectives (Vighi et al. 2001). This was a multivariate QSAR approach that involved the assessment of water quality objectives for a set of 125 chemicals [derived from the European priority list in compliance with Directive 76/464/EEC (EEC 1976)]. Predictions from classification models (based on algae, Daphnia, and fish toxicity values) were shown to perform satisfactorily compared with the classifications using literature toxicity data.

EU Regulatory Use
Technical guidance documents. In Europe, data obtained from QSARs can be used according to the guidance on the use of QSARs for specific groups of substances found in Part IV of the TGD (EEC 1996). The TGD provides recommendations for the use of QSARs to predict acute toxicity to fish (96-hr LC50), Daphnia (48-hr EC50), and algae (72- to 96-hr EC50). In particular, QSARs are provided for chemicals acting by nonpolar narcosis and polar narcosis mechanisms of action. No QSARs have been recommended for substances that act by more specific modes of action.
Danish EPA. Tyle et al. (2001) reported that toxicity is considered only for substances with a bioconcentration factor (BCF) between 2,000 and 5,000. For such compounds, the QSAR predictions of the Danish EPA are used. The Danish EPA also used the MCASE fathead minnow model to predict acute aquatic toxicity of substances. The toxicity predictions were then used to assign EU risk classifications (e.g., R50, R51, and R53) (CEC 2001a).
U.K. Department of Environment, Food, and Rural Affairs (DEFRA). In the United Kingdom, DEFRA has established a Chemical Stakeholders Forum (CSF) (DEFRA 2001), which gives a voice to those in society with an interest in chemicals and their effects on the environment and, through the environment, on human health. The CSF has established criteria for identifying chemicals of concern based on specific persistence, bioaccumulation, and toxicity values. The approach recommended by the CSF implies that appropriate QSARs can be used for estimation of persistence, bioaccumulation, and toxicity properties where experimental data are lacking. The proposed approach has been endorsed by DEFRA's Advisory Committee on Hazardous Substances (ACHS 2001), which also has agreed that appropriate QSAR models can be used to fill data gaps. The Environment Agency has carried out an initial screen of the International Uniform Chemical Information Database (Heidorn et al. 2003) for substances that fulfill the CSF PBT criteria and reported the initial findings from this study to the ACHS in September 2001 (ACHS 2001).
German Umweltbundesamt (UBA). A QSAR-based software system has been developed by the UBA (Lange and Vormann 1995). Its use is intended both to fill missing data needs and to give some assurance of the quality of the available experimental test data. The system is capable of predicting a wide range of fate and effect endpoints, including fish and Daphnia acute toxicity. The ECOSAR software is used to make predictions for these effects.
Environment Canada. Environment Canada recently funded a study to assess and evaluate six modeling packages to predict acute toxicity, with particular application to prioritizing chemicals within the Canadian DSL (Moore et al. In press). The six packages assessed were ECOSAR, TOPKAT, a probabilistic neural network (PNN) model, a computational neural network model, the QSAR components of the ASTER system, and the OASIS system. Of these, the PNN model provided the best predictions on the basis of an external test set of compounds. The TOPKAT model also provided excellent predictions, but only for compounds classified as falling within the OPS of the model.

QSARs
In general few QSARs for chronic toxicity have been published. Those available are generally for unreactive, narcotic chemicals.

EU Regulatory Use
Technical guidance documents. The TGD provides recommendations for the use of QSARs to predict long-term toxicity to fish (NOEC, 28 days) and to Daphnia (NOEC, 21 days). In particular QSARs are provided for chemicals acting by nonpolar narcosis and polar narcosis mechanisms of action. No QSARs have been recommended for substances that act by more specific modes of action.

North American Regulatory Use
U.S. EPA. The U.S. EPA uses the chronic toxicity QSARs from the ECOSAR system.

QSARs to Predict Estrogen Receptor Binding
The issue of identifying chemicals that may elicit endocrine disruption has grown immeasurably in importance in the last decade. Interest in the prediction of endocrine disruption by QSAR methods has also been increased by the lack of historical data for this endpoint. This has made the need to screen the databases of existing chemicals a priority. The fact that endocrine disruption is an effect that could be initiated by a receptor-binding event, as well as being an immensely complex response that may be brought about by a number of interactions (e.g., interactions with estrogen, androgen hormones, etc.), has changed the challenges faced by modelers. There are thus two predictions that should be made: the first is whether a compound has the capability to elicit a response; the second is the magnitude of that response. With these challenges in mind, modelers in the environmental field have turned to many of the techniques traditionally used in drug design and optimization. It should be noted that "endocrine disruption" covers a broad range of physiologic effects in vivo and may result in a number of events such as estrogenicity, androgenicity, and so forth (Zacharewski 1997). QSARs for predicting estrogen receptor binding affinity of structurally diverse chemicals have recently been reviewed (Schmieder et al. In press).
Mini-Monograph | Regulatory use of QSARs for ecologic effects and environmental fate. Environmental Health Perspectives • VOLUME 111 | NUMBER 10 | August 2003

QSARs for Classes
Given that a compound has the capability for binding at a receptor site, altering the structure of the compound should alter its affinity for the binding site. This phenomenon has been well demonstrated by a number of researchers; for instance, ltz et al. (2000) demonstrated that the binding of a series of phenols to the human estrogen receptor was related to the size of the phenol. Binding to this receptor is known to require a relatively bulky, hydrophobic group. These types of models provide an accurate prediction of potency but are not applicable outside the training set.

Common reactivity pattern (COREPA).
Pattern recognition methods are statistical techniques that enable the classification and prediction of activity. Schmieder et al. (2000) reported the successful application of COREPA [implemented in Mekenyan's OASIS software; Mekenyan et al. (1990, 1994)]. COREPA has been used to predict a chemical's potential to bind to the estrogen receptor or androgen receptor.

Comparative Molecular Field Analysis
Much of the early progress in the prediction of estrogenicity was spawned as an offshoot of pharmaceutical studies. The study of Waller et al. (1996) indicated that three-dimensional techniques such as comparative molecular field analysis (CoMFA) were able to predict potency of estrogens. These techniques are limited to series of similar compounds (most commonly congeneric series), such as is often observed in classic drug optimization studies. As such, they are limited in their applicability outside of the training set. CoMFA techniques also remain controversial regarding the success, or otherwise, of the fitting (molecular overlaying) of the compounds. Nevertheless, they have been used to predict estrogen receptor binding affinity (Shi et al. 2002).

Tiered Assessment Approach
The Integrated 4-Phase model was developed at the U.S. FDA's National Center for Toxicological Research (Shi et al. 2002). The model is composed of four sequential phases. Phase I employs two rejection filters: chemicals with a molecular weight less than 94 or greater than 1,000, and chemicals with no ring structure, are eliminated with high confidence as extremely unlikely to bind to the estrogen receptor. Chemicals that pass phase I enter phase II, where they are assigned as yes/no for estrogen receptor binding by 11 models based on three different methods: three structural alerts, seven pharmacophore searching methods, and one classification model. These 11 models are complementary and were designed to distinguish active from inactive chemicals. Only chemicals identified as inactive by all 11 models were eliminated from further evaluation in phase III. In phase III, three-dimensional chemical structure assessments and molecular alignment were used to develop a CoMFA model (Shi et al. 2001). Chemicals with higher predicted binding affinity are given higher priority for further evaluation in phase IV.
Phase IV is a knowledge-based system that can be used to foster definitive decision making and facilitate priority assessments of chemicals with high predicted estrogen relative binding affinities (Perkins et al. In press).
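The first two phases of this tiered scheme can be sketched in Python as follows. This is a minimal illustration, not the published implementation: the `Chemical` record, the function names, and the stand-in phase II models are hypothetical, and the sketch assumes that failing either phase I filter is sufficient for rejection.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Chemical:
    name: str
    molecular_weight: float
    ring_count: int  # assumed to be precomputed from the structure


# A phase II model is any callable returning True for "predicted active".
BindingModel = Callable[[Chemical], bool]


def passes_phase_one(chem: Chemical, min_mw: float = 94.0, max_mw: float = 1000.0) -> bool:
    """Apply the two rejection filters: molecular weight range and ring presence."""
    if chem.molecular_weight < min_mw or chem.molecular_weight > max_mw:
        return False
    return chem.ring_count > 0


def passes_phase_two(chem: Chemical, models: Sequence[BindingModel]) -> bool:
    """Keep a chemical unless every model calls it inactive."""
    return any(model(chem) for model in models)


def screen(chemicals: Sequence[Chemical], models: Sequence[BindingModel]) -> List[Chemical]:
    """Return the chemicals that would proceed to phase III."""
    return [c for c in chemicals if passes_phase_one(c) and passes_phase_two(c, models)]


if __name__ == "__main__":
    # Hypothetical stand-ins for the 11 phase II models.
    toy_models = [lambda c: c.ring_count >= 2, lambda c: c.molecular_weight > 500]
    candidates = [
        Chemical("too small", 60.0, 1),
        Chemical("acyclic", 250.0, 0),
        Chemical("two rings", 270.0, 2),
        Chemical("one small ring", 120.0, 1),
    ]
    print([c.name for c in screen(candidates, toy_models)])
```

Because a chemical is dropped in phase II only when all models agree that it is inactive, the combination is deliberately conservative: a single positive call is enough to carry a chemical forward.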

EU Regulatory Use
TGD. The TGD makes no recommendations for the use of QSARs for endocrine disruption.  [Walker et al. In press (c)].

North American Regulatory Use
After the development and implementation of EDPSD1, EDPSD2 was created with substantially more resources [Walker In press (e)]. EDPSD2 is a decision support tool that will be used to select chemicals for endocrine disruption screening assays.

QSARs for Predicting Degradation (or Persistence)
The persistence of a chemical is a vital area in establishing the ecologic effects and environmental fate of chemicals. A compound that is not persistent is generally considered to pose less risk than a persistent chemical with a similar toxic profile. However, persistence in the environment is a complex phenomenon to model because it depends on chemical structure, environmental conditions and, for biodegradation, the ability of available microorganisms to degrade a chemical. A number of excellent reviews have been published recently in the general area of biodegradation (Raymond et al. 2001; Jaworska et al. 2002, In press). General models of persistence are also provided by Gramatica et al. (1999, 2001).
Biodegradation QSARs for classes. The vast majority of published quantitative structure-biodegradation relationships (QSBRs) rely on octanol-water partition coefficients (Kow), van der Waals radii, alkaline (abiotic) hydrolysis rate constants, and various molecular connectivity indices. Classes of chemicals covered by such models include chlorophenols and chloroanisoles, n-alkyl phthalates, alcohols, (2,4-dichlorophenoxyacetic acid) esters, para-substituted phenols, meta-substituted anilines, esters, carbamates, ethers, and ketones (Howard 2000; Howard and Banerjee 1984; Howard et al. 1987). Generally, the correlation coefficients between physical/chemical properties or molecular descriptors and biodegradation rates have been good, but overall these models have not seen much use. Their applicability was limited to a very specific class, and it was inappropriate to predict biodegradation rates for chemicals outside of that class. These models were reviewed by Howard (2000) and Raymond et al. (2001).
Generally applicable QSARs. A number of multivariate QSAR approaches, including linear and nonlinear regression, partial least squares, and neural networks, have been applied to predict biodegradability. These were reviewed recently by Raymond et al. (2001) and Jaworska et al. (2002, In press).
Expert systems. The most recently developed class of QSBRs consists of expert systems that represent artificial intelligence approaches. These so-called knowledge-based expert systems tend to act as a repository of expert knowledge about phenomena or a process that, like biodegradation, can be described by a set of rules. The library of rules or transformations is organized in a hierarchy that orders the rules by their likelihood of being executed. Because they predict (or attempt to predict) the biodegradation pathway, such models are clearly mechanistic. Expert systems are qualitative in nature, but they can be linked to other models to provide a quantitative assessment. Most recently, an expert system, CATABOL (described previously) was developed (Jaworska et al. 2002). The available expert systems to predict biodegradation have been reviewed by Jaworska et al. (2002).

Other Persistence Endpoints
Methods to predict the persistence of chemicals are well reviewed, for example, for the oxidation (Canonica and Tratnyek In press) and reduction of chemicals in water (Tratnyek et al. In press).

Expert Systems That May Be Applied to Predictions of Persistence
SRC Software. SRC's BIODEG (Biodegradation Probability Program for Windows, BIOWIN) program. BIOWIN calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and nonlinear regressions and data from SRC's database of evaluated biodegradation data (Howard et al. 1992). The previous version (version 3) of the program added new expert survey data. The model uses a slight revision of the previous fragments and molecular weight to a) calculate the probability of rapid biodegradation from the experimental data and b) estimate the primary and ultimate biodegradation times for complete degradation (days, weeks, months, longer) using the evaluations of 200 chemicals by 17 biodegradation experts. A description is available in Boethling et al. (1994).
SRC's BIOWIN (version 4.00). BIOWIN version 4 adds two new predictive models for assessing a chemical's biodegradability in the MITI biodegradation test. The new models use an approach similar to the linear/nonlinear regression models already included in BIOWIN. A description of the MITI biodegradation models is available (Tunkel et al. 2000). Under its Chemical Substances Control Law (CSCL 1973), Japan has tested approximately 900 discrete substances using the MITI test. This protocol, which determines ready biodegradability, is among six guidelines officially approved by the OECD. A total data set of 884 chemicals was compiled to derive the fragment probability values that are applied in this MITI biodegradability method. The data set consists of 385 chemicals critically evaluated as "readily degradable" and 499 chemicals critically evaluated as "not readily biodegradable." BIOWIN produces two separate MITI probability estimates for each chemical. The first estimate is based upon the fragments derived through linear regression. The second estimate is based upon the fragments derived through nonlinear regression.
Other SRC models. SRC's atmospheric oxidation program. The Atmospheric Oxidation Program for Microsoft Windows (AOPWIN) estimates the rate constant for the atmospheric, gas-phase reaction between photochemically produced hydroxyl radicals and organic chemicals. It also estimates the rate constant for the gas-phase reaction between ozone and olefinic/acetylenic compounds. The rate constants estimated by the program are then used to calculate atmospheric half-lives for organic compounds on the basis of average atmospheric concentrations of hydroxyl radicals and ozone. The estimation methods used by the Atmospheric Oxidation Program are based upon the SAR methods developed by Atkinson and colleagues (Atkinson 1987; Atkinson and Carter 1984; Kwok and Atkinson 1995; Kwok et al. 1996). In addition, SRC has derived some new fragment and reaction values from new experimental data. The AOPWIN program is described by Meylan and Howard (1993). AOPWIN comes with a database of experimental values for 780 chemicals for reaction rates with hydroxyl radicals, ozone, and nitrate radicals.
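The conversion from a second-order rate constant to an atmospheric half-life can be sketched with the standard pseudo-first-order relation, half-life = ln 2 / (k × [oxidant]). The Python function below is an illustration only; its name and the default hydroxyl radical concentration are assumptions, not values stated in this review, and the concentration should be replaced with whatever average the assessment specifies.

```python
import math


def atmospheric_half_life_hours(
    rate_constant_cm3_per_molecule_s: float,
    oxidant_conc_molecules_per_cm3: float = 1.5e6,  # assumed average OH level
) -> float:
    """Half-life for gas-phase reaction with an oxidant at constant concentration.

    The second-order rate constant times the oxidant concentration gives a
    pseudo-first-order rate constant (per second); ln 2 divided by that
    constant is the half-life in seconds, converted here to hours.
    """
    if rate_constant_cm3_per_molecule_s <= 0 or oxidant_conc_molecules_per_cm3 <= 0:
        raise ValueError("rate constant and concentration must be positive")
    pseudo_first_order = rate_constant_cm3_per_molecule_s * oxidant_conc_molecules_per_cm3
    return math.log(2) / pseudo_first_order / 3600.0


if __name__ == "__main__":
    # Hypothetical rate constant of 1e-11 cm3/molecule-s.
    print(round(atmospheric_half_life_hours(1e-11, 1e6), 2), "hours")
```

The same relation applies to the ozone reaction when the ozone rate constant and an average ozone concentration are supplied instead.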
SRC's HydroWIN Program. The HydroWIN program calculates the hydrolysis rate constant for esters, carbamates, halomethanes, alkyl halides, and epoxides using a method developed by the U.S. EPA OPPT. It calculates a second-order acid- or base-catalyzed hydrolysis rate constant at 25°C. Acid- and base-catalyzed half-lives are calculated for pH 7 and/or pH 8. The prediction methodology was developed for the U.S. EPA and is outlined by Mill et al. (1987).
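How a second-order base-catalyzed rate constant translates into half-lives at pH 7 and pH 8 can be shown with a short Python sketch. It uses only the textbook relations [OH-] = 10^(pH - pKw) and half-life = ln 2 / (k × [OH-]); the function name, the pKw of 14 at 25°C, and the example rate constant are assumptions for illustration rather than details of HydroWIN.

```python
import math


def base_catalyzed_half_life_days(k_b_per_molar_s: float, ph: float, pkw: float = 14.0) -> float:
    """Half-life (days) for base-catalyzed hydrolysis at a fixed pH.

    The hydroxide concentration follows from the pH, and multiplying it by
    the second-order rate constant gives a pseudo-first-order rate constant.
    """
    if k_b_per_molar_s <= 0:
        raise ValueError("rate constant must be positive")
    hydroxide_molar = 10.0 ** (ph - pkw)
    k_observed_per_s = k_b_per_molar_s * hydroxide_molar
    return math.log(2) / k_observed_per_s / 86400.0


if __name__ == "__main__":
    k_b = 1.0  # hypothetical second-order rate constant, L/mol-s
    for ph in (7.0, 8.0):
        print(ph, round(base_catalyzed_half_life_days(k_b, ph), 2), "days")
```

Each unit increase in pH raises the hydroxide concentration tenfold, so the base-catalyzed half-life at pH 8 is one tenth of that at pH 7.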
TOPKAT. The Aerobic Biodegradability Module of the TOPKAT package is a single, self-contained module, consisting of four structurally based submodels. A single study that reported the biodegradability of 894 compounds, as assessed by the Japanese MITI test protocol, was used to develop these models. The discriminant models compute the probability of a submitted structure being capable of aerobic biodegradation (probability > 0.7) or incapable of being degraded aerobically (probability < 0.3). Probability values between 0.3 and 0.7 refer to an indeterminate region in which decisions should not be made except in special circumstances or under further analytical assessments.
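The three-way reading of the discriminant probability can be written out as a small Python sketch. The function and label names are hypothetical, and treating values exactly equal to 0.3 or 0.7 as indeterminate is an assumption, because the text gives only strict inequalities.

```python
def classify_biodegradability(probability: float, lower: float = 0.3, upper: float = 0.7) -> str:
    """Map a discriminant-model probability onto three outcome classes."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability > upper:
        return "biodegradable"
    if probability < lower:
        return "not biodegradable"
    # Boundary values and everything in between are left undecided.
    return "indeterminate"


if __name__ == "__main__":
    for p in (0.85, 0.50, 0.10):
        print(p, classify_biodegradability(p))
```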
MultiCASE. There are two (aerobic and anaerobic) MCASE models to assist in the prediction of biodegradability. The anaerobic biodegradation model was developed based on only 79 chemicals, and its applicability is limited. The aerobic biodegradation model is combined with an expert system that predicts a biodegradation pathway. It was developed on the training set of 200 chemicals.

EU Regulatory Use
TGD. For persistence, the TGD recommends two of the SRC BIOWIN models, namely, the BIOWIN2 nonlinear model and the BIOWIN3 survey model for ultimate biodegradation. The exact cutoff points for these models have been "calibrated" on the basis of the model score for 1,2,4-trichlorobenzene, a substance that is known to be relatively persistent under environmental conditions. In model terms the cutoff values for identifying potentially persistent substances are [BIOWIN2] < 0.5 and [BIOWIN3] < 2.2.
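A screening rule built on these cutoffs might look like the following Python sketch. It assumes that both conditions must hold for a substance to be flagged, which is one reading of the "and" in the text; the function name and the example scores are hypothetical.

```python
def potentially_persistent(
    biowin2_score: float,
    biowin3_score: float,
    biowin2_cutoff: float = 0.5,
    biowin3_cutoff: float = 2.2,
) -> bool:
    """Flag a substance when both model scores fall below their cutoffs.

    Assumption: the two criteria are combined with a logical AND.
    """
    return biowin2_score < biowin2_cutoff and biowin3_score < biowin3_cutoff


if __name__ == "__main__":
    # Hypothetical (BIOWIN2, BIOWIN3) scores for three substances.
    for scores in ((0.12, 1.9), (0.80, 2.0), (0.30, 2.9)):
        print(scores, potentially_persistent(*scores))
```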
The European Commission (2000) reported the use of QSARs to predict persistence for compounds that were to be assessed for endocrine disruption capability.
Selection of "high-production-volume" chemicals was based on the HPV list from Regulation (EEC) No. 793/93 (EEC 1993b), of chemicals with a production volume of more than 1,000 metric tons per year. For selection of persistent chemicals the SRC Biodegradation program is used as a first indication of the persistence of a substance. Two SRC models were used: the linear regression method and the ultimate degradation method. The linear regression method leads to the definition of classes of biodegradation probability. The ultimate degradation model predicts the time for ultimate degradation (complete mineralization) of a substance. This model is based on the results of a survey of 17 biodegradation experts who were asked to evaluate 200 chemicals in terms of the time required to achieve ultimate biodegradation. The substances were rated to time units: 5 = hours; 4 = days; 3 = weeks; 2 = months; 1 = more than months. The results were averaged per substance and formulated to 36 fragments and a molecular weight parameter as with the probability estimation on linear regression. Substances that take more than months (level 1) to biodegrade, combined with a biodegradation probability of < 0.1, are considered highly persistent. Substances not fulfilling both criteria are not considered to be highly persistent.
Danish EPA. A recent publication from the Danish EPA (Tyle et al. 2001) reports the use of QSARs for identification of potential PBTs and vPvBs from among the HPV and medium-production-volume chemicals used in the EU. Proposals are made on which QSAR models should be used for persistence (the nonlinear BIOWIN model). The SRC BIOWIN program was used to assist in the advisory classification of substances according to the EU risk phrase R53 (may cause long-term effects to the aquatic environment).
OSPAR. The briefing document on the OSPAR DYNAMEC exercise (OSPAR Commission 2000) describes how two of the models incorporated in the SRC BIOWIN software were used to provide information on persistence for those substances for which experimental data were not available. The models used were those for estimating that a substance is not rapidly biodegradable in the environment and for estimating the likely time scale of ultimate biodegradation in the environment. The cutoff point was "calibrated" on the basis of experimental data for 1,2,4-trichlorobenzene because it is a substance that is known to be rather persistent under environmental conditions.

North American Regulatory Use
U.S. EPA. The U.S. EPA has routinely applied the BIOWIN software as well as other SRC software in the assessment of persistence of substances submitted for marketing approval.

Bioaccumulation
The ability of a chemical to bioaccumulate is also an important part of an ERA. Bioaccumulation is usually assessed by measurement of the BCF. On the whole, this has been assumed to be a simple passive accumulation process driven by the thermodynamic capability to leave an aqueous environment and enter the lipid tissue of an aquatic species. Because of this, many, if not most or all, QSARs for estimating bioaccumulation have been based upon estimates of hydrophobicity.

General QSARs
A number of general models for bioaccumulation have been proposed, including linear and nonlinear approaches (Meylan et al. 1999; Dimitrov et al. 2002a, 2002b, 2003). Generally, because bioaccumulation has been considered to be a partitioning process into an organism, there has been considerable emphasis on the use of log Kow.

Expert Systems: SRC's BCFWIN Program
BCFWIN estimates the BCF of an organic compound using the compound's log Kow. The estimation methodology (developed at SRC in conjunction with the U.S. EPA) is described by Meylan et al. (1999). Measured BCF data and other key experimental details were collected for 694 chemicals. Log BCF was then regressed against log Kow, and chemicals with significant deviations from the line of best fit were analyzed by chemical structure. The resulting algorithm classifies a substance as either nonionic or ionic, the latter group including carboxylic acids, sulfonic acids, and their salts, and quaternary nitrogen compounds. Log BCF for nonionics is estimated from log Kow and a series of correction factors if applicable; different equations apply for log Kow 1.0-7.0 and > 7.0. For ionics, chemicals are categorized by log Kow and a log BCF in the range of 0.5-1.75 is assigned. Organometallics (tin and mercury), nonionics with long alkyl chains, and aromatic azo compounds receive special treatment. The correlation coefficient (r² = 0.73) and mean error (0.48) for log BCF (n = 694) indicate that the new method is a significantly better fit to existing data compared with other methods. Hansen et al. (1999) reported the use of a log Kow-based QSAR to predict bioaccumulation factors of chemicals for use within the EU with regard to priority setting for environmental exposure of existing methods.
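The piecewise structure of the nonionic estimate, with one linear equation for log Kow 1.0-7.0 and another above 7.0 plus an optional structural correction, can be sketched in Python as follows. All slopes and intercepts below are illustrative placeholders chosen only so that the two segments meet at log Kow = 7; they are not given in this review and should be replaced with the published regression coefficients from Meylan et al. (1999) before any real use.

```python
def estimate_log_bcf(
    log_kow: float,
    correction: float = 0.0,      # sum of structural correction factors, if any
    slope_low: float = 0.8,       # placeholder coefficient
    intercept_low: float = -0.7,  # placeholder coefficient
    slope_high: float = -1.4,     # placeholder coefficient
    intercept_high: float = 14.7, # placeholder coefficient
    breakpoint: float = 7.0,
) -> float:
    """Piecewise-linear log BCF estimate for a nonionic compound.

    Below log Kow 1.0 the review does not describe an equation, so this
    sketch refuses to extrapolate.
    """
    if log_kow < 1.0:
        raise ValueError("sketch covers log Kow >= 1.0 only")
    if log_kow > breakpoint:
        return slope_high * log_kow + intercept_high + correction
    return slope_low * log_kow + intercept_low + correction


if __name__ == "__main__":
    for value in (2.0, 4.0, 7.0, 8.0):
        print(value, round(estimate_log_bcf(value), 2))
```

The negative slope above the breakpoint reflects the general observation that very hydrophobic chemicals accumulate less than a simple partitioning model would predict.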

EU Regulatory Use
TGD. For bioaccumulation, BCF values may be estimated from Kow using QSAR models where experimental data are not available. For highly hydrophobic substances, for example, substances with log Kow > 6, the available BCF models can lead to very different results. Hence, an assessment must be done on a case-by-case basis taking into account what is known about the BCF QSAR models and the specific properties of the substance, in particular, what is known to affect uptake and the potential for metabolism in aquatic organisms.
Danish EPA. The Danish EPA reported the use of SRC BCFWIN software to assist in the classification of chemical substances. A recent publication from the Danish EPA (Tyle et al. 2001) reports the use of QSARs for identification of potential PBTs and vPvBs from among the HPV and medium-production-volume chemicals used in the EU. Proposals are made on which QSAR models should be used for bioaccumulation (the SYRACUSE BCFWIN model). Because the report shows that toxicity is only relevant for substances with BCFs between 2,000 and 5,000, it is suggested that available experimental and QSAR data be examined and expert evaluation made on a case-by-case basis. Hasse diagrams were used to prioritize 50 PBTs identified by the Danish EPA based on their PBT characteristics (Carlsen and Walker 2003).

North American Regulatory Use
U.S. EPA. The U.S. EPA routinely applies the SRC BCFWIN software for risk assessment purposes of chemicals.

Soil and Sediment Sorption
Soil and sediment sorption is a further fundamental property to assess with regard to risk assessment of xenobiotic chemicals in the environment. The main approaches used predict the organic carbon partition coefficient (Koc; the ratio of the chemical adsorbed per unit weight of organic carbon in the soil or sediment to the concentration of the chemical in solution at equilibrium), which is widely used, in conjunction with the fractional organic carbon content, to model the partitioning of organics to sediments and soils.
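The way Koc is combined with the fractional organic carbon content is commonly expressed as Kd = Koc × foc, where Kd is the soil-water (or sediment-water) distribution coefficient. A minimal Python sketch of that relation is given below; the function name and the example values are hypothetical.

```python
def distribution_coefficient(koc_l_per_kg: float, fraction_organic_carbon: float) -> float:
    """Soil- or sediment-water distribution coefficient Kd (L/kg).

    Kd scales the organic carbon partition coefficient by the mass fraction
    of organic carbon in the solid phase.
    """
    if koc_l_per_kg < 0:
        raise ValueError("Koc cannot be negative")
    if not 0.0 <= fraction_organic_carbon <= 1.0:
        raise ValueError("organic carbon fraction must lie between 0 and 1")
    return koc_l_per_kg * fraction_organic_carbon


if __name__ == "__main__":
    # Hypothetical chemical with Koc = 1,000 L/kg in a soil with 2% organic carbon.
    print(distribution_coefficient(1000.0, 0.02), "L/kg")
```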

General QSARs
As with bioaccumulation, soil sorption is a physical process that involves the partitioning of chemicals between soil and sediment, water, and the other organic fractions. It is assessed with measures such as the soil or sediment adsorption coefficient (Koc). Models for Koc based on log Kow have normally been developed for separate chemical classes (Sabljic et al. 1995). Other general models for the prediction of Koc are also available (Gramatica et al. 2000).

Expert Systems: SRC's PCKOC Program
The PCKOC program calculates the soil or sediment adsorption coefficient from a correlation to the first-order molecular connectivity indices and a series of statistically derived fragment contribution factors for polar compounds.

EU Regulatory Use: TGD
The TGD discusses a number of QSARs to predict soil and sediment sorption coefficients. In total 19 QSARs based solely on log Kow are reported. These QSARs are for distinct chemical classes and were taken from the article by Sabljic et al. (1995).

North American Regulatory Use
U.S. EPA. The U.S. EPA routinely uses the PCKOC program to estimate soil sorption coefficients.

Physicochemical Properties
Physicochemical properties are important in the ERA of chemicals, both in their own right (for registration for specific purposes, e.g., explosivity) and as descriptors in QSARs. There are numerous software programs to predict physicochemical properties (Lyman et al. 1982). Those described below are provided by the SRC and have been developed in collaboration with the U.S. EPA. Although these programs may not be the most accurate available, they are provided free of charge to interested parties (U.S. EPA 2002b) and are probably the most widely applied in a regulatory framework. Detailed compilations of descriptor software are discussed in individual sections above.

n-Octanol-Water Partition Coefficient (Kow)
Kow is a physical property used extensively to describe a chemical's lipophilic or hydrophobic properties. It is the ratio of a chemical's concentration in the octanol phase to its concentration in the aqueous phase of a two-phase system at equilibrium. Because measured values range from < 10⁻⁴ to > 10⁸ (> 12 orders of magnitude), the logarithm (log Kow) is commonly used to characterize its value. Log Kow is a valuable parameter in numerous QSARs that have been developed for the pharmaceutical, environmental, biochemical, and toxicologic sciences.
As a consequence of the widespread use of log K ow values, there are probably more models available for predicting log K ow than for any other endpoint. This is of fundamental importance because log K ow is essential for many of the QSARs described in this review. A number of recent reports have assessed the capabilities of the various methods to predict log K ow (cf. Buchwald and Bodor 1998; Mannhold and Dross 1996). The comparisons of methods tend to suggest that fragment-based methods [e.g., KOWWIN and ClogP (BioByte Corp., Claremont, CA, USA)] perform better than atom-based and conformation-dependent approaches.
SRC's KOWWIN program. The KOWWIN program estimates log K ow of organic chemicals using an atom/fragment contribution method (Meylan and Howard 1995). The KOWWIN program has a unique feature that allows the user to adjust the estimation using an experimental K ow from a structurally related chemical compound.
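The idea of adjusting an estimate with an experimental value from a related compound can be illustrated with a short sketch. It assumes that the adjustment simply shifts the target estimate by the difference between the analogue's measured and estimated log K ow; this is an assumption made for illustration, not a description of KOWWIN's internal procedure, and the function name and numbers are hypothetical.

```python
def adjust_with_analogue(est_target, est_analogue, measured_analogue):
    """Shift the target estimate by the analogue's known estimation error.

    All three arguments are log Kow values (dimensionless).
    """
    correction = measured_analogue - est_analogue
    return est_target + correction


# Hypothetical values for illustration only (not real chemicals).
adjusted = adjust_with_analogue(est_target=3.40,
                                est_analogue=2.90,
                                measured_analogue=2.70)
print(f"adjusted log Kow estimate: {adjusted:.2f}")
```

The closer the analogue is in structure to the target, the more likely it is that the two estimates share the same error, which is why the choice of analogue matters more than the arithmetic.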

Water Solubility
The prediction of water solubility is closely related to prediction of log K ow . However, there are fewer practical approaches to water solubility prediction than for the prediction of log K ow . Solubility remains an important property to estimate, especially for determining cutoffs for acute toxicity and for risk assessment to aquatic species in general.

SRC's WsKow program.
WsKow estimates the water solubility of an organic compound using its log K ow . The estimation methodology is described by Meylan et al. (1996). A total of 1,450 compounds (941 solids/509 liquids) that had measured values for K ow , melting point, and water solubility were used in the development of the linear regressions used by WsKow. The correlation coefficient for the method was 0.93, with a mean error of 0.47. The method was evaluated using a separate validation set of 817 chemicals (482 solids, 335 liquids) with measured water solubilities and estimated log K ow values (from SRC's LogKow program), with a resulting correlation coefficient of 0.90 and a mean error of 0.48.

Henry's Law Constant
Henry's law constant (HLC) is essentially an air-water partition coefficient. As such, it is an essential parameter for understanding and modeling the distribution of chemicals throughout the environment. Methods for the estimation of HLC have been reviewed recently by Dearden and Schüürmann (In press).
SRC's Henry program. This program calculates HLC using both the group contribution and the bond contribution methods of Hine and Mookerjee (1975). SRC has updated and expanded the bond and group contribution methods by developing new fragment constants from experimental data. The methodology is described by Meylan and Howard (1991). Henry version 3 extends the methodology to allow estimation of HLCs over a temperature range (0-50 °C). In addition, version 3 includes an experimental database of HLC values for 1,350 compounds. The Henry program also allows the user to adjust the estimation using an experimental HLC from a structurally related chemical compound.
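Because HLC is essentially an air-water partition coefficient, a value reported in pressure-volume units can be converted to a dimensionless concentration ratio by dividing by RT. The sketch below shows that conversion only; the function name and the example value are hypothetical, and the calculation does not model the temperature dependence of HLC itself.

```python
R_ATM_M3 = 8.205736e-5  # gas constant in atm·m3/(mol·K)


def dimensionless_hlc(hlc_atm_m3_per_mol, temp_c=25.0):
    """Convert an HLC in atm·m3/mol to a dimensionless air/water ratio."""
    temp_k = temp_c + 273.15
    return hlc_atm_m3_per_mol / (R_ATM_M3 * temp_k)


# Hypothetical HLC value for illustration only.
k_aw = dimensionless_hlc(1.0e-3)
print(f"dimensionless air-water partition coefficient: {k_aw:.4f}")
```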

Other Physicochemical Properties
Methods to predict melting point, boiling point, and vapor pressure are well reviewed by Dearden (In press).
SRC's MPBPVP program. This program estimates the melting point, boiling point, and vapor pressure of organic compounds. The estimation methodology for boiling point is an adaptation of the Stein and Brown (1994) method. Melting point is estimated by two different methods; the first is an adaptation of the Joback group contribution method described in detail by Dearden (1999), and the second is a method that correlates melting point to boiling point. The two melting points are then weighted according to chemical structure and magnitude of the difference between the two estimates to yield a preferred (suggested) melting point. For a description of methods to predict melting point, the reader is referred to the excellent review by Dearden (1999). Vapor pressure is estimated by three methods, all of which use the normal boiling point: the Antoine method, the modified Grain method, and the Mackay method (Neely and Blau 1985). The MPBPVP program will then select or calculate a preferred (suggested) vapor pressure (e.g., the Grain method is recommended for solids).

Regulatory Use of Estimation Methods for Physicochemical Properties
European Union. Many regulatory agencies will accept predicted values for physicochemical properties, especially log K ow. The use of reliable and well-validated methods such as KOWWIN is preferred over less well-known methods.
United States. The U.S. EPA and other agencies will accept predicted values for log K ow . In addition, the SRC software described above is used routinely by the U.S. EPA to calculate the physicochemical properties of new chemical substances for PMNs.

Prediction for Mixtures
The use of QSARs to make toxicity predictions for mixtures has been reviewed recently by Altenburger et al. (In press). Generally speaking, the making of predictions for mixtures or for different formulations is still in its infancy. QSARs are rarely applied to mixtures by regulatory agencies, and the possible additive and/or synergistic effects of mixtures are not formally taken into account in risk assessment.

Use of (Q)SARs to Predict the Ecologic Effects and Environmental Fate of HPV Chemicals
Under the U.S. EPA HPV Chemical Challenge Program (Challenge Program) the chemical industry is being challenged to voluntarily compile a SIDS for chemicals on the U.S. HPV list (U.S. EPA 2002c). The SIDS, which has been internationally agreed to by member countries of the OECD, provides basic screening data needed for an initial assessment of the physicochemical properties, environmental fate, and human and environmental effects of chemicals. The information used to complete the SIDS can come from either existing data or from new tests conducted as part of the Challenge Program. The Challenge Program chemicals list, available online (U.S. EPA 2002d), consists of about 2,800 HPV chemicals reported under the TSCA's 1990 and 1994 (http://www.epa.gov/opptintr/iur/) inventory update rules. Because of the large number of chemicals on the list, it is important to reduce the number of tests to be conducted, where this is scientifically justifiable. SARs may be used to reduce testing in at least three different ways: First, by identifying a number of structurally similar chemicals as a group, or category, and allowing selected members of the group to be tested with the results applying to all other category members. Second, by applying SAR principles to a single chemical that is closely related to one or more better characterized chemicals (analogues). The analogue data are used to characterize the specific endpoint value for the HPV candidate chemical. Third, a combination of the analogue and category approaches may be used for individual chemicals, for example, searching for a "nearest chemical class" as opposed to a nearest single chemical analogue to estimate a SIDS endpoint. Such an approach is used in ECOSAR, a SAR-based computer program that generates ecotoxicity values.
The U.S. EPA has developed a guidance document to assist sponsors and others in constructing and supporting SAR arguments for potential application in the Challenge Program. The guidance will draw on experience from the OECD SIDS program, the U.S. EPA PMN program, and other sources available in the literature. OECD guidance on the use of QSARs is also available (OECD 2001).

Scope and Application of (Q)SARs in the U.S. HPV Challenge Program
The environmental fate and aquatic toxicity SARs rely heavily on physicochemical properties as inputs and are similarly structured in terms of models, chemical classes, and regression equations. However, "accepted QSARs" (cases in which ample data are available for a given chemical class) are not available for certain chemical classes for either ecotoxicity endpoints estimated using ECOSAR or biodegradation endpoints estimated using BIOWIN.
Mini-Monograph | Regulatory use of QSARs for ecologic effects and environmental fate Environmental Health Perspectives • VOLUME 111 | NUMBER 10 | August 2003
The use of SAR/QSAR in the HPV Challenge Program is expected to decrease the number of new tests that will be required to develop a SIDS for each HPV chemical. Their use, by either the category or individual chemical approach, will necessarily be limited by the nature of the SIDS endpoint, the amount and adequacy of the existing data, and the type of SAR/QSAR analysis performed. Measured data developed using acceptable methods are preferred over estimated values. The development and use of SAR/QSAR in the Challenge Program will be different for each of the major categories of SIDS (Table  3). It has been suggested that data from HPV chemicals be used to develop and validate QSARs to make predictions of base-set data endpoints for non-HPV chemicals (Walker et al. 2003c).
Physicochemical properties. It is anticipated that melting point, boiling point, vapor pressure, K ow , and water solubility data will be available for most HPVs. In some cases, this will be in the form of values taken from standard reference books [e.g., the Merck Index (O'Neil et al. 2001) and the CRC Handbook of Chemistry and Physics (Lide 2002)]. In the event that neither measured data nor reference book values are available, estimations using an appropriate model will be accepted for all physicochemical endpoints.
Environmental fate. Acceptable estimation techniques are available for photodegradation and hydrolysis, whereas biodegradation models are less available and less well accepted. The fourth SIDS endpoint in this category is a model (fugacity models to estimate transport/distribution), and so there is no measured data requirement to fulfill. Thus, estimations will be acceptable in lieu of photodegradation and hydrolysis tests but not for biodegradation.
Ecotoxicity. ECOSAR is an established QSAR program that estimates toxicity to fish, invertebrates, and algae. Even though this approach represents a screening-level characterization, it is of a higher order than either physicochemical or environmental fate tests. This is not to diminish the importance of physicochemical/environmental fate tests, but there are layers of complexity not present in these endpoints when toxicity is the entity being measured/estimated. Therefore, some measured data must be available to strengthen the use of ECOSAR to characterize aquatic toxicity for an HPV chemical in the Challenge Program. For example, if an ECOSAR (or other aquatic toxicity SAR) estimation procedure is to be presented for any one endpoint, it must be accompanied by experimental data on that endpoint with a close analogue.
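The acceptance rules described above for the three SIDS categories can be summarized as a small decision function. This is a minimal sketch of one reading of the text, not an official U.S. EPA procedure: the endpoint labels, the function name, and the choice to return "new test" when no analogue data support an ecotoxicity estimate are assumptions made for illustration.

```python
# Endpoint groupings follow the SIDS categories discussed above;
# the labels themselves are illustrative.
PHYSCHEM = {"melting point", "boiling point", "vapor pressure",
            "log Kow", "water solubility"}
FATE_ESTIMABLE = {"photodegradation", "hydrolysis"}
FATE_MODEL_ONLY = {"transport/distribution"}
FATE_TEST_ONLY = {"biodegradation"}
ECOTOX = {"fish toxicity", "invertebrate toxicity", "algal toxicity"}


def sids_data_route(endpoint, measured=False, analogue_measured=False):
    """Return how a SIDS endpoint could be filled under the rules above."""
    if endpoint in FATE_MODEL_ONLY:
        return "fugacity model"            # the endpoint is itself a model
    if measured:
        return "measured data"             # measured data are preferred
    if endpoint in PHYSCHEM or endpoint in FATE_ESTIMABLE:
        return "estimate"                  # estimation accepted on its own
    if endpoint in ECOTOX:
        # An estimate must be backed by analogue data on the same endpoint.
        return "estimate plus analogue data" if analogue_measured else "new test"
    if endpoint in FATE_TEST_ONLY:
        return "new test"                  # estimates not accepted
    raise ValueError(f"unknown SIDS endpoint: {endpoint!r}")


for ep in ("log Kow", "hydrolysis", "biodegradation", "fish toxicity"):
    print(ep, "->", sids_data_route(ep))
```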

SARs for Individual Chemicals
For individual chemicals, SAR is applied in two ways: a) by the use of (usually quantitative) predictive models based on well-validated data sets (QSAR); or b) by comparing the chemical with one or more closely related chemicals, or analogues, and using the analogue data in place of testing the chemical. In the case of models, the comparison has essentially been incorporated into the model.
In developing a SAR, proposers need to consider the following steps for each HPV chemical they are interested in sponsoring:

Physicochemical estimation techniques.
Methods exist for estimating most of the physicochemical properties required to develop a basic understanding of the behavior of a chemical released to the environment and its potential environmental exposure pathways. Some of the methods require input as simple as chemical structure, whereas others require much less readily available information such as water solubility values, K ow, and so forth. Estimation methods for key physicochemical properties have been reviewed by Howard and Meylan (1997) and are discussed briefly below.
Boiling point, melting point, and vapor pressure. Most comprehensive estimation methods for boiling point, melting point, and vapor pressure are "group contribution" methods, where values assigned to atoms, bonds, and their placement in a molecule are used to estimate their contribution to the inherent physicochemical properties of that molecule.
The Stein and Brown (1994) method for estimating boiling points was developed and validated on a large database (> 10,000 chemicals) and has been integrated into a computer program (MPBPVP) used by OPPT. In contrast, melting points are not very well estimated by this method, so the group contribution method is combined with an algorithm that relates melting point to boiling point to estimate melting point. This method is used in MPBPVP. Recently, attempts have been made to use molecular symmetry (Simamora and Yalkowsky 1994; Krzyzaniak et al. 1995), but the methods have not been well documented or validated.
A limited number of methods are available for estimating vapor pressure. Most rely on estimating the vapor pressure from the boiling point and use melting points when the chemical is a solid at room temperature, which is the method used by OPPT in MPBPVP.
Octanol-water partition coefficient. The literature contains many methods for estimating log K ow. The most common are classified as "fragment constant" methods in which a structure is divided into fragments (atom or larger functional groups), and values of each group are summed together (sometimes with structural correction factors) to yield the log K ow estimate (Meylan and Howard 1995; Leo 1979, 1995). The OPPT KOWWIN model is based on the fragment constant method. General estimation methods based upon molecular connectivity indices (Niemi et al. 1992), UNIFAC-derived activity coefficients (Banerjee and Howard 1988), and properties of the entire solute molecule (charge densities, molecular surface area, volume, weight, shape, and electrostatic potential) (Bodor et al. 1989; Bodor and Huang 1992; Sasaki et al. 1991) have also been developed.
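The summation at the heart of a fragment constant method can be sketched in a few lines. The fragment values and the correction factor below are hypothetical placeholders chosen for illustration; they are not the KOWWIN coefficients, and a real program would also have to identify the fragments from the chemical structure.

```python
from collections import Counter

# Hypothetical fragment values and correction factors (illustration only).
FRAGMENT_VALUES = {"CH3": 0.5, "CH2": 0.5, "OH": -1.4}
CORRECTION_VALUES = {"example_correction": 0.3}


def estimate_log_kow(fragments, corrections=()):
    """Sum fragment values (times their counts) plus any correction factors."""
    counts = Counter(fragments)
    unknown = set(counts) - set(FRAGMENT_VALUES)
    if unknown:
        raise KeyError(f"no value for fragment(s): {sorted(unknown)}")
    total = sum(FRAGMENT_VALUES[frag] * n for frag, n in counts.items())
    total += sum(CORRECTION_VALUES[name] for name in corrections)
    return total


# A four-fragment structure written as a plain list of fragment labels.
estimate = estimate_log_kow(["CH3", "CH2", "CH2", "OH"])
print(f"estimated log Kow: {estimate:.2f}")
```

Raising an error for an unknown fragment, rather than silently skipping it, mirrors the point made throughout this review that an estimate is only meaningful inside the chemical domain on which the method was developed.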
Water solubility. Water solubility is a determining factor in the fate and transport of a chemical in the environment as well as the potential toxicity of a chemical. Yalkowsky and Banerjee (1992) have reviewed most of the recent literature on aqueous solubility estimation and concluded that, at present, the most practical means of estimating water solubility involves regression-derived correlations using log K ow. OPPT uses the log-K ow-based WsKow model to estimate water solubility. Recently, direct fragment constant approaches to estimating water solubility have been developed (Kuhne et al. 1995; Myrdal et al. 1995).
Environmental fate estimation techniques.
Biodegradation. Biodegradation (i.e., complete mineralization, or conversion to carbon dioxide and water) is an important environmental degradation process for organic chemicals. Prediction of biodegradability is severely limited because of the lack of reproducibility of biodegradation data (Howard et al. 1987) as well as the numerous protocols that have been used (Howard and Banerjee 1984). As a result, quantitative prediction of biodegradation rates has only been attempted on very limited numbers of structurally related chemicals (Howard et al. 1992). A number of comprehensive approaches using fragment constants have been attempted to qualitatively predict biodegradability. Many of the models have used a weight-of-evidence biodegradation database (BIODEG) that was specifically developed for structure-biodegradability correlations (Howard et al. 1986). Boethling et al. (1994) used the experimental BIODEG database as well as results of an expert survey to develop four models (these models are in the OPPT program BIOWIN).
Hydrolysis rates. Hydrolysis is the reaction of a substance with water in which the water molecule or the hydroxide ion displaces an atom or group of atoms in the substance. Chemical hydrolysis at a pH normally found in the environment (i.e., pH 5-9) can be important for a variety of chemicals that have functional groups that are potentially hydrolyzable, such as alkyl halides, amides, carbamates, carboxylic acid esters and lactones, epoxides, phosphate esters, and sulfonic acid esters (Neely 1985). A method to predict hydrolysis rate constants using linear free energy relationship (Taft and Hammett constant) methodology has been developed only for esters, carbamates, epoxides, and halogenated alkanes. A computer program (HYDROWIN) that uses this methodology is available and is used by OPPT.
Also, Ellenrieder and Reinhard (1988) have developed a spreadsheet program that allows hydrolysis rates to be calculated at different pH values and temperatures if adequate data are available in the companion database.
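The dependence of hydrolysis rate on pH is commonly expressed as the sum of acid-catalyzed, neutral, and base-catalyzed contributions. The sketch below assumes pseudo-first-order kinetics at 25 °C, with an ion product of water of 10⁻¹⁴; the function name and the rate constant are hypothetical and are not taken from HYDROWIN or from the spreadsheet of Ellenrieder and Reinhard (1988).

```python
import math

KW_25C = 1.0e-14  # ion product of water at 25 °C (mol2/L2)


def hydrolysis_half_life_hours(ph, k_acid=0.0, k_neutral=0.0, k_base=0.0):
    """Half-life in hours from k_obs = k_acid[H+] + k_neutral + k_base[OH-].

    k_acid and k_base are in L/(mol·h); k_neutral is in 1/h.
    """
    h_conc = 10.0 ** (-ph)
    oh_conc = KW_25C / h_conc
    k_obs = k_acid * h_conc + k_neutral + k_base * oh_conc
    if k_obs <= 0.0:
        return math.inf  # no hydrolysis pathway specified
    return math.log(2) / k_obs


# Hypothetical base-catalyzed rate constant, for illustration only.
for ph in (5, 7, 9):
    t_half = hydrolysis_half_life_hours(ph, k_base=1.0e4)
    print(f"pH {ph}: half-life = {t_half:.3g} h")
```

With a purely base-catalyzed pathway, each unit increase in pH shortens the half-life tenfold, which is why the pH 5-9 range mentioned above can span several orders of magnitude in persistence.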
Atmospheric oxidation rates. For most chemicals in the vapor phase in the atmosphere, reaction with photochemically generated hydroxyl radicals is the most important degradation process (Atkinson 1989). Methods for estimating reactivity with hydroxyl radicals have generally relied on fragment constant approaches or molecular orbital calculations. The method validated on the largest number of chemicals (641) is the Atkinson fragment and functional approach method (the method used in AOPWIN, the model used by OPPT), although molecular orbital methodology gives promising results on a much more limited number of chemicals (Meylan and Howard. In press).
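Once a hydroxyl radical rate constant has been estimated, it is usually converted to an atmospheric half-life by assuming a constant hydroxyl radical concentration. The sketch below illustrates that conversion; the default concentration, the 12-hr daylight convention, the function name, and the example rate constant are assumptions chosen for illustration and should be replaced by the values appropriate to a given assessment.

```python
import math


def atmospheric_half_life_days(k_oh, oh_conc=1.5e6, daylight_hours=12.0):
    """Half-life in daylight days for reaction with hydroxyl radicals.

    k_oh    : second-order rate constant, cm3/(molecule·s)
    oh_conc : assumed hydroxyl radical concentration, molecules/cm3
    """
    if k_oh <= 0.0 or oh_conc <= 0.0:
        raise ValueError("rate constant and concentration must be positive")
    k_pseudo = k_oh * oh_conc                 # pseudo-first-order rate, 1/s
    half_life_s = math.log(2) / k_pseudo
    return half_life_s / (daylight_hours * 3600.0)


# Hypothetical rate constant for illustration only.
print(f"{atmospheric_half_life_days(1.0e-11):.2f} daylight days")
```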
Ecologic endpoint estimation techniques. (Q)SARs for aquatic toxicity to fish, aquatic invertebrates, and algae have been developed and used by OPPT since 1979 (U.S. EPA 1994b, 1994c). These (Q)SARs have been incorporated into a software program (ECOSAR) available free from the U.S. EPA. ECOSAR uses molecular weight and structure and log K ow to predict aquatic toxicity.
The predictions are based on actual data of at least one member of a chemical class. The data (measured toxicity values) are correlated with molecular weight and log K ow to derive a regression equation that may be used to predict aquatic toxicity of another chemical that belongs to the same chemical class. ECOSAR contains equations for many chemical classes (> 50; the full list is shown in Table 2) that can be categorized into the following areas of the chemical universe (Nabholz JV. Personal communication):
• Neutral organics that are nonreactive and nonionizable
• Organics that are reactive and/or ionizable and that exhibit "excess toxicity" (toxicity beyond narcosis associated with neutral organic toxicity)
• Surfactants, which are further divided by charge: nonionic, cationic, anionic, and amphoteric
• Polymers, which are further divided by charge: nonionic, cationic, anionic, and amphoteric
• Dyes, which are further divided by charge: nonionic, cationic, anionic, and amphoteric
• Inorganic compounds, which are separated from organometallics; both inorganics and organometallics are divided into classes via the periodic table
ECOSAR is being constantly updated and so is only partially programmed for these chemical classes. For instance, SARs for classes with excess toxicity are added as they become available; surfactants are only partially programmed; polymer SARs are not yet programmed but will be as they become available; dyes are not yet programmed, although neutral dyes are assessed with SARs for neutral organics or excess toxicity as their structure dictates; and inorganics and organometallics are not yet programmed, although many SARs have been developed for both and await programming. The first step in assessing organometallics will be reactivity (pyrophoricity, hydrolysis, or both). Stable organometallics will be assessed on the basis of predicted log K ow.
To make predictions, ECOSAR now has a programmed decision tree based on SMILES, which selects the SAR that U.S. EPA experts would select. This decision tree is continually refined to reflect how OPPT/U.S. EPA performs new chemical assessments (Nabholz JV. Personal communication).
Therefore, to use ECOSAR for a particular chemical, an appropriate SAR is selected based on the following: chemical structure, chemical class, predicted log K ow , molecular weight, physical state, water solubility, number of carbons, ethoxylates or both, and percentage of amine nitrogen or number of cationic charges or both, per 1,000 molecular weight. Because the regression equations are chemical specific, and because they may vary by species (fish vs. daphnids vs. algae), the most important factor is the identification of the chemical class (U.S. EPA 1994b).
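The class-specific regressions described above can be sketched as a lookup of a linear equation in log K ow followed by a unit conversion using molecular weight. The sketch assumes that the regression predicts the logarithm of the effect concentration in mmol/L; the slope, intercept, class name, and endpoint label are hypothetical placeholders and are not ECOSAR coefficients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassSar:
    """Linear SAR for one chemical class and endpoint (log10 of mmol/L)."""
    slope: float
    intercept: float


# Hypothetical coefficients for illustration only.
SARS = {
    ("neutral organics", "fish 96-hr LC50"): ClassSar(slope=-0.9, intercept=1.7),
}


def predict_mg_per_l(chem_class, endpoint, log_kow, mol_weight):
    """Predict an effect concentration in mg/L for a class member."""
    sar = SARS[(chem_class, endpoint)]        # KeyError if no SAR is available
    log_mmol_per_l = sar.slope * log_kow + sar.intercept
    return (10.0 ** log_mmol_per_l) * mol_weight   # mmol/L x mg/mmol = mg/L


lc50 = predict_mg_per_l("neutral organics", "fish 96-hr LC50",
                        log_kow=3.0, mol_weight=150.0)
print(f"predicted LC50: {lc50:.1f} mg/L")
```

Keying the lookup on both chemical class and endpoint reflects the point made above that the equations are class specific and may differ by species, so selecting the class is the step that most affects the result.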
The following presents some guidance on the approach for evaluating the aquatic toxicity (to fish, plants, and invertebrates) of a candidate HPV chemical using ECOSAR:
• Identify the chemical structure and convert it to SMILES notation
• Identify appropriate physicochemical properties: physical state, melting point, water solubility, vapor pressure, and K ow are required to predict effect concentrations (i.e., EC50). If a chemical is highly water reactive (e.g., a hydrolysis half-life < 1 hr), consider estimating toxicity for the hydrolysis products
• Decide what ECOSAR chemical class best fits your chemical
• Run the ECOSAR program to develop an aquatic toxicity profile for the candidate chemical
Table 4 provides a summary of the SAR models discussed above.

Main Findings
The use of QSAR for regulatory purposes has been growing steadily and a framework of QSARs has been established by regulatory agencies worldwide (Table 5). By far the greatest use and application of QSARs has resulted from the TSCA and the efforts of the U.S. EPA. The regulatory use of QSARs in Europe and elsewhere in the world is less widespread and formalized. Within the EU, the use of QSARs is underpinned by the TGD and is being developed and encouraged by some of the EU environmental protection agencies. Generally, QSARs have been applied by regulatory agencies in two main ways: priority setting for existing chemicals and classification for new chemicals.

Future Outlook
The future will almost certainly bring an increased use of QSARs by regulators for estimating the ecologic effects and environmental fate of chemical substances. Such activities may include the prioritization of existing chemical databases (the examples of the Danish and Canadian authorities are pertinent in this regard). In such cases, well-designed strategic use of QSARs may be capable of flagging compounds of concern. It should also be noted that there will be an increasing requirement for the use of alternative methods, including QSARs, if the EU REACH initiative is to be successful.
The increase in the use of QSARs for the registration of new chemicals will also continue, particularly spurred on by legislation and public pressure against animal testing, and the cost of environmental assessment. However, in the foreseeable future, for the assessment of new chemicals a great reliance will continue to be placed on the use of experimental data.
Wider application and acceptance of the use of QSARs for regulatory purposes will require further development and more thorough validation and assessment of their use. There must be greater appreciation of QSAR "quality" and the appropriateness of their use, both in terms of the chemical domain described, and in terms of the precision of the estimates that are produced.