Evaluating quantitative formulas for dose-response assessment of chemical mixtures.

Risk assessment formulas are often distinguished from dose-response models by being rough but necessary. The evaluation of these rough formulas is described here, using the example of mixture risk assessment. Two conditions make the dose-response part of mixture risk assessment difficult: the lack of data on mixture dose-response relationships, and the need to address risk from combinations of chemicals because of public demands and statutory requirements. Consequently, the U.S. Environmental Protection Agency has developed methods for carrying out quantitative dose-response assessment for chemical mixtures that require information only on the toxicity of single chemicals and of chemical pair interactions. These formulas are based on plausible ideas and default parameters but minimal supporting data on whole mixtures. Because of this lack of mixture data, the usual evaluation of accuracy (predicted vs. observed) cannot be performed. Two approaches to the evaluation of such formulas are to consider fundamental biological concepts that support the quantitative formulas (e.g., toxicologic similarity) and to determine how well the proposed method performs under simplifying constraints (e.g., as the toxicologic interactions disappear). These ideas are illustrated using dose addition and two weight-of-evidence formulas for incorporating toxicologic interactions.

One troublesome part of the risk assessment of chemical mixtures concerns the need for quantitative methods to support regulatory decisions, particularly when most of the desired information is missing yet resources and time for obtaining that information are insufficient. For the dose-response assessment step, the desired information includes the dependence of toxicity on the total mixture dose as well as its component chemical proportions, the magnitude and nature of toxicological interactions, and (nearly always) the methods for extrapolating from animal studies to human dose response in terms of interactions or of the whole-mixture toxicity. A few regulatory agencies have developed formulas that help identify the nature or degree of possible public health concern for specific mixture exposures. Because key information is usually absent for those mixtures, the formulas require several default steps or parameters. Thus we have the dilemma: a) formulas are developed that usually rely on simplifications and defaults; b) data are usually lacking for judging the accuracy of these formulas; c) regulatory agencies such as the U.S. Environmental Protection Agency (U.S. EPA) are required to be clear about the accuracy and uncertainties of their assessment methods.
Because the usual comparison of predicted to observed is rarely possible, some other approaches are needed for evaluating the quality of these formulas. It must be noted that these evaluations of plausibility, often called "groundtruthing," should also be performed for any regulatory approach. Even when the desired data are available for predicted-observed comparisons, they will represent only a snapshot of a generally complicated setting, often involving a virtually infinite number of combinations of factors. Evaluations of the risk formula based on fundamental properties and less dependent on a particular data set are then helpful and may in fact be more relevant to the adoption of a risk formula for general use.
Dose-response assessment methods for mixtures can be evaluated for plausibility, if not accuracy, by judging consistency with several desirable characteristics. This approach is somewhat similar to the judgments that are made of confidence in the risk values of the U.S. EPA's Integrated Risk Information System (IRIS), the U.S. EPA online database of risk-based exposure limits and measures of toxic potency (1). The U.S. EPA reference dose (RfD) is an estimate (with uncertainty spanning perhaps an order of magnitude) of a daily oral exposure to the human population (including sensitive subgroups) likely to be without an appreciable risk of deleterious effects during a lifetime. It can be derived from a no observed adverse effect level (NOAEL), lowest observed adverse effect level, or benchmark dose, with uncertainty factors generally applied to reflect limitations of the data. The RfD is generally used in the U.S. EPA's noncancer health assessments. For example, the RfD for a single chemical is judged to be a bounding value of high confidence if the supporting data sets are consistent, require minimal extrapolation to the case of human chronic exposure, and represent the main toxic effects of concern. We examine in this article some ideas used or proposed by the U.S. EPA to evaluate simple risk assessment formulas. In particular, we demonstrate how default-based risk assessment formulas for chemical mixtures can be judged for plausibility and usefulness in a health-protective regulatory context. The purpose of this article is not to derive conclusions on the accuracy of any of these formulas but to use them only to clarify the difficulty in evaluating such formulas. First, general concepts are presented related to toxicologic interaction and mixture toxicity. Then the noninteractive hazard index (HI), the sum of the hazard quotients (HQs) of component chemicals in a mixture, is discussed, followed by two versions of an interaction-based HI. 
Finally, we propose and demonstrate steps for evaluating risk formulas when direct comparison with actual risk measures is not feasible.

U.S. EPA Component Methods for Mixture Risk Assessment
The focus of this article is on component-based dose-response methods for use in mixture risk assessment. For this article, we define a mixture to be the set of environmental chemicals that jointly contribute to the same toxicity in the same exposed population. The chemicals need not cause the same toxicity from individual exposure but must have joint influence from their combined exposure. For example, they could cause different effects during single-chemical exposure but influence each other's metabolism, and hence each other's internal dose, during simultaneous exposure. The chemicals need not be spatially or temporally coincident as long as they jointly play a role in toxic effects in the exposed individual (2). This definition assumes, however, that there is indeed some time-based overlap of exposure or toxicity. Examples of overlapping exposure include the internal human doses reaching the same sites of metabolism or target tissues, or simply the co-occurrence of the chemicals in the same physiological location (e.g., two ingested chemicals combining chemically under acidic conditions in the stomach to form a new chemical). Overlapping toxic effects may be caused by persistence of effects beyond the exposure time, so there is joint toxicity in spite of no overlap of the external exposures. An example of toxicity overlap is the classic description of carcinogenic initiation and promotion. Chemicals that have no overlap and no commonality of metabolic pathways or toxic effects would usually be treated separately (3).
Component-based methods for joint dose response require consideration of toxicologic interactions among the pairs and higher combinations of the mixture component chemicals. Nearly all published toxicologic interaction studies involve only chemical pairs (4). Consequently, the two U.S. EPA databases that address toxicologic interactions include only studies on two-chemical combinations (5,6), and the interaction-based risk approaches in this article require information only on two-chemical interactions.
Toxicologic interaction has no intrinsic characteristic or measure that points toward a unique definition. Instead, interactions are determined by departure from what would be expected under normal circumstances, i.e., if no interaction occurred. Unfortunately, there is no consensus on what defines "no interaction" (7-9). The advantage to a regulatory agency is that it can then choose a definition that facilitates the assessment of mixture risk. Dose addition has an easy interpretation, is the original regulatory approach first proposed in 1963 by the American Conference of Governmental Industrial Hygienists (10), and has been used by the U.S. EPA more than any other component-based mixture risk approach, primarily in the assessment of health risk at hazardous waste sites (11). Consequently, we propose dose addition as the preferred no-interaction model for chemicals that contribute to a common toxic effect, and define toxicologic interactions as deviations from dose addition. Synergism is then indicated by data showing a greater response than that predicted by dose addition. When adequate interaction information is not available, the U.S. EPA usually applies dose addition as a no-interaction default approach.
Biologically based mathematical models of the exposure-joint toxicity relationship are the preferred basis of quantitative risk assessment, but they are extremely rare when compared with the large number of potential chemical combinations. Physiologically based pharmacokinetic models are by far the most frequent model type, but even those have been developed for only a few combinations, mostly chemical pairs (4,12,13). Most of the literature on toxicologic interactions presents qualitative discussions of observed effects and a classification of the joint toxicity as either consistent with additivity (usually dose or risk additivity) or suggestive of greater or less than additivity, which we will call synergism or antagonism, respectively. These judgmental interaction classifications, however vague or unevenly described, can be useful. Two benefits from making judgments of interaction (there may be more) are to indicate the consequences of using a default no-interaction regulatory model and to facilitate investigations of a possible mode of action and mode of interaction that would explain the joint toxic effects.
This article focuses on the first benefit. The second motivation seems to require more understanding of basic principles of toxicologic interaction and the degree to which certain modes of action are unique in causing observed toxicity. We are not convinced that those principles have been adequately identified for general application to mixture risk and so leave that discussion to future work.
Noninteractive hazard index. The U.S. EPA has concerns for thousands of chemicals but has regulations only on hundreds of them. In addition to the time it takes to draft risk-based standards, there is the constraint of lack of key information on which to base those risk estimates. The operating procedure is then to develop default methods that would be used when the desired data are missing. For mixtures, the U.S. EPA has two default approaches. If the mixture's component chemicals cause different effects with no suggestion of toxicologic interaction, separate risk assessments are performed. If the chemicals cause the same effect, or at least damage the same target organ, then the default component-based approach is dose addition, most often implemented using the dimensionless HI (3,11), which is defined for oral exposures by

$$HI = \sum_{j} \frac{E_j}{RfD_j} = \sum_{j} HQ_j \qquad [1]$$

where E_j = exposure level of chemical j, RfD_j = RfD of chemical j, and HQ_j = HQ for chemical j (dimensionless). Note that the exposure must represent the same quantity as the RfD: if the RfD represents a lifetime daily ingested dose in units of milligrams per kilogram per day, then E must also represent the lifetime daily ingested dose and be in the same units. The HI is consistent with dose addition as long as 1/RfD is viewed as a rough estimate of toxic potency. Under dose addition, each component chemical behaves as a dilution or concentration of the other components, so except for dose scaling, the dose-response curves are identical. The mixture dose is then the sum of the component doses once each is scaled for its potency. The HI formula is also consistent with Berenbaum's zero interaction equation (14), where his equitoxic or isoeffective dose, e.g., a single chemical's ED10, in the denominator is replaced by the RfD. (EDx = effective dose associated with x% response rate in the exposed group.) It must be noted that the HI is a very rough application of dose addition.
In both of the above analogies, the RfDs are viewed as equitoxic doses. In the best of circumstances, they are estimates of toxicity thresholds, i.e., maximum doses with no response. The actual situation is more complex. For example, an RfD may be an experimental NOAEL divided by the product of several uncertainty factors (UFs) that depend on the underlying toxicity database. Whereas the UFs are usually conservative, i.e., overestimates of equitoxic scaling factors that make the RfD smaller than it should be, the NOAEL is anticonservative, i.e., it overestimates the true threshold dose, making the RfD higher than it should be. Consequently, the ratio of these two values, with unknown counterbalancing errors, is difficult to evaluate. Because the HI involves several RfDs with different NOAELs and UFs, the bias in the HI is even more difficult to characterize. For the remainder of this article, the HI will be assumed to be based on RfDs that are equally uncertain and equally biased, so that modifications of this formula can be judged on their conceptual properties.
An HQ is a component dose scaled by the inverse of its RfD. For the risk characterization of a single chemical, the decision point is HQ = 1, i.e., when a chemical exposure is at its RfD. Any smaller exposures are considered to pose no significant health risk. For the mixture, the corresponding decision point is HI = 1. One interpretation is that HI = 1 represents the situation where the mixture is at its RfD. The complication is that the mixture RfD is actually an infinite number of component combinations, not a single point. For example, with a mixture of only two chemicals, one can draw the dose addition isobole for a response of 1%. If the RfD were defined in terms of a very small response rate (1%, for example) of a nonadverse effect (perhaps a precursor to toxicity), then the 1% isobole would be the set of all mixture RfDs, i.e., the two-chemical dose combinations producing the 1% mixture response.
The HI in Equation 1 is constrained to combinations of chemicals that are toxicologically similar. That similarity is not precisely defined, and the evidence can range from identical cellular mechanisms to a judgment of rough similarity in the impact on the same target organ. Usually it is viewed as a neutral approach for addressing potential joint toxicity because it does not reflect synergism or antagonism.
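The dose-additive index described above (each exposure divided by its RfD, then summed and compared with the decision point of 1) can be sketched in a few lines. The chemical names, exposures, and RfDs below are hypothetical values chosen only for illustration; they are not taken from IRIS or from this article.

```python
from typing import Dict, Mapping


def hazard_quotients(exposure: Mapping[str, float],
                     rfd: Mapping[str, float]) -> Dict[str, float]:
    """HQ_j = E_j / RfD_j. Exposure and RfD must be in the same units."""
    return {chem: exposure[chem] / rfd[chem] for chem in exposure}


def hazard_index(exposure: Mapping[str, float],
                 rfd: Mapping[str, float]) -> float:
    """Noninteractive (dose-additive) HI: the sum of the component HQs."""
    return sum(hazard_quotients(exposure, rfd).values())


# Hypothetical two-chemical mixture, both in mg/kg-day.
exposure = {"chem_A": 0.002, "chem_B": 0.010}
rfd = {"chem_A": 0.010, "chem_B": 0.040}

hi = hazard_index(exposure, rfd)  # 0.2 + 0.25
print(f"HI = {hi:.2f}; above decision point of 1: {hi > 1}")
```

Keeping the HQs as a separate step mirrors the text: the same quotients are reused when the index is later modified for interactions.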
Interaction-based hazard index. In the original U.S. EPA mixture guidelines (15) and the recent supplement (3), the recommendation is to use interaction data when available. The practical approach adopted by the U.S. EPA is to modify the HI according to the available evidence on pairwise interactions. In the first approach (16), a judgmental weight of evidence (WOE) evaluation of the interaction studies was converted into a numerical score, then inserted into a formula multiplied by the HI. The formula for that interaction-based HI is

$$HI_{INT} = HI \times UF_I^{\,WOE_N} \qquad [2]$$

where UF_I is an uncertainty factor for interactions with the default value of 10. The exponent, WOE_N, is a normalized value, further defined by

$$WOE_N = \frac{WOE_S}{WOE_{max}} \qquad [3]$$

where the denominator is the maximum value of the numerator function, i.e., the value if the WOE data were perfect. These two right-hand functions are

$$WOE_S = \sum_{j=1}^{n} \sum_{k \ne j} B_{jk} \sqrt{HQ_j \, HQ_k} \qquad [4]$$

$$WOE_{max} = \sum_{j=1}^{n} \sum_{k \ne j} \sqrt{HQ_j \, HQ_k} \qquad [5]$$

where n = the number of chemicals in the mixture, j, k = indices for the pair of chemicals whose interaction is being considered, and B_jk = the interaction WOE score for influence of chemical j on the toxicity of chemical k.
The WOE score is negative for less-than-additive interactions and positive for greater-than-additive interactions. B is the weight-of-evidence score reflecting a judgment of the potential for toxicologic interaction in humans based on observed interactions in toxicologic studies. The range of B is [-1,1]. The value of B = 0 is used for those chemicals where pairwise exposures are shown to be dose additive or are presumed so because of inadequate interaction data. It should be noted that the UF in this formula serves a different purpose than the UF in the RfD formula. For the RfD, the UF errs on the side of conservatism when data are weak, i.e., the UF is large, causing a reduction in the estimated safe dose. With interactions, however, the UF reflects the quality of the evidence for an interaction. With weak evidence, the UF reduces the influence of the reported interactions, so the formula approaches the noninteractive (dose-additive) HI in Equation 1. Weak data do not make the formula more conservative or protective.
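A minimal sketch of Equation 2 follows. It assumes the formula has the form HI x UF_I^WOE_N, with WOE_N equal to the B-weighted sum of geometric means of the HQ pairs divided by the same sum with every B set to 1; that form is inferred from the definitions in the text and should be checked against the cited sources (16). The function name and the example values are hypothetical.

```python
import math
from itertools import permutations
from typing import Sequence


def interaction_hi_1992(hq: Sequence[float],
                        b: Sequence[Sequence[float]],
                        uf_i: float = 10.0) -> float:
    """Assumed form of Equation 2: HI * UF_I ** WOE_N.

    hq[j]   -- hazard quotient of chemical j
    b[j][k] -- WOE score in [-1, 1] for chemical j acting on chemical k
    uf_i    -- interaction uncertainty factor (default 10, per the text)
    """
    hi = sum(hq)
    # Geometric-mean weight for every ordered pair (j, k) with j != k.
    weights = {(j, k): math.sqrt(hq[j] * hq[k])
               for j, k in permutations(range(len(hq)), 2)}
    woe_max = sum(weights.values())   # numerator's value if all B = 1
    if woe_max == 0.0:                # fewer than two chemicals present
        return hi
    woe = sum(b[j][k] * w for (j, k), w in weights.items())
    return hi * uf_i ** (woe / woe_max)


# Hypothetical two-chemical mixture with moderate evidence of synergism.
hq = [0.3, 0.2]
b = [[0.0, 0.5],
     [0.5, 0.0]]
print(f"HI = {sum(hq):.2f}, interaction-based HI = {interaction_hi_1992(hq, b):.2f}")
```

With every B equal to zero the exponent vanishes and the function returns the dose-additive HI, which is the limiting behavior the text describes for weak interaction evidence.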
In 1999 (2), the U.S. EPA published a refined formula for an interaction-based HI. Equation 2 was devised as a simple way to alter the standard HI based on evidence that toxicologic interactions were plausible. Certainly, the simplest modification involves one additional factor. Even though the modifying factor is derived from several pairwise evaluations, the final formula is easy to follow. This simplification, however, was one of the key issues motivating the refined formula: use of a single multiplicative factor with the additive formula to account for the composite influence of all pairwise interactions. The refined formula differs by having an adjustment factor for each HQ. This updated version then represents toxicologic interaction by a change in each component's toxic potency.
The revised interaction HI is (2)

$$HI_{INT} = \sum_{j=1}^{n} HQ_j \sum_{k \ne j} f_{jk} \, M_{jk}^{\,B_{jk} g_{jk}} \qquad [6]$$

where M_jk = magnitude of the interaction, B_jk = the WOE score for the interaction of chemical j affecting toxicity of chemical k (see below for more explanation), and f and g, the two exposure-dependent functions, are defined as

$$f_{jk} = \frac{HQ_k}{HI - HQ_j} \qquad [7]$$

$$g_{jk} = \frac{\sqrt{HQ_j \, HQ_k}}{(HQ_j + HQ_k)/2} \qquad [8]$$

with HI in Equation 7 denoting the dose-additive HI of Equation 1. The function f is a normalizing function that ensures the modifying summation is numerically constrained. For example, if all chemical pairs are dose additive, f makes Equation 6 equal to the dose-additive HI in Equation 1. The function g is based on the concept that the interactive influence should be maximal when both chemicals are equitoxic. This means that as one chemical dominates the mixture, the interactive influence diminishes, so the mixture toxicity becomes that of the dominant chemical (2).
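The revised index of Equation 6 can be sketched as below. The sketch assumes that f_jk is HQ_k divided by the sum of all HQs other than HQ_j (so the f values for a given j sum to 1) and that g_jk is the geometric mean of HQ_j and HQ_k divided by their arithmetic mean; the second form is stated later in the text, while the first is an assumption consistent with the normalizing role described above. The example magnitudes and scores are hypothetical.

```python
import math
from typing import Sequence


def interaction_hi_revised(hq: Sequence[float],
                           m: Sequence[Sequence[float]],
                           b: Sequence[Sequence[float]]) -> float:
    """Assumed form of Equation 6: sum_j HQ_j * sum_{k != j} f_jk * M_jk ** (B_jk * g_jk)."""
    hi_add = sum(hq)
    total = 0.0
    for j, hq_j in enumerate(hq):
        others = hi_add - hq_j
        if hq_j == 0.0 or others == 0.0:
            total += hq_j             # nothing to interact with: HQ unchanged
            continue
        factor = 0.0
        for k, hq_k in enumerate(hq):
            if k == j:
                continue
            f_jk = hq_k / others                                   # normalizing weight
            g_jk = math.sqrt(hq_j * hq_k) / ((hq_j + hq_k) / 2.0)  # 1 when equitoxic
            factor += f_jk * m[j][k] ** (b[j][k] * g_jk)
        total += hq_j * factor
    return total


# Hypothetical equitoxic pair with a 5-fold interaction and perfect evidence.
hq = [0.5, 0.5]
m = [[1.0, 5.0], [5.0, 1.0]]
b = [[0.0, 1.0], [1.0, 0.0]]
print(f"Dose-additive HI = {sum(hq):.2f}, revised HI = {interaction_hi_revised(hq, m, b):.2f}")
```

Because the adjustment is applied inside the sum over j, each component's HQ is modified separately, which is the structural difference from the single multiplicative factor of Equation 2.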
Most of the toxicologic interaction studies describe the interaction in terms of altered pharmacokinetics of one or more of the mixture chemicals, where the change in toxicity is caused by changes in the active chemical's concentration in the target tissue (3,17). Equation 6 was largely motivated by this interaction concept, and so presents each pairwise interaction as an incremental alteration in the toxicity of each chemical (i.e., effectively changing its HQ).
A second motivation for Equation 6 was the desire to include the interaction magnitude (M; the ratio of the observed EDx to the EDx predicted from dose addition), a quantity missing from Equation 2. There is no commonly used definition of M. Interactions are often described qualitatively in terms of altered response, such as an increase in severity of the histopathology, or quantitatively in terms of a change in the numbers of animals affected. In the U.S. EPA mixture guidance (3), the M for Equation 6 is preferably given as the proportional change in ED. For example, the isobologram analysis of a mixture response uses this concept by displaying the measured isoeffective dose combinations relative to the predicted line of dose additivity (18). As a second example, Mehendale (19) used x-fold changes in the lethal dose with 50% response rate to show a range of potentiation from 1.6- to 67-fold. A 67-fold dose reduction can be applied to any selected response rate, whether an ED01 or an ED90. The corresponding increase in response, however, is not as useful a measure of potentiation magnitude. For example, the response at an ED01 (1%) can be potentiated to increase up to 100-fold, but the response at an ED20 can only increase 5-fold. The M is assumed to be roughly constant over the dose range of interest, varying mostly because of changes in component proportions, not total dose. Because most measures of toxic response (e.g., enzyme activity, relative increase in organ weight, fraction of animals responding) are bounded, an M defined by a change in measured response is not likely to be constant. In the application of Equation 6, the M is recommended to represent the change in effective dose.
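The arithmetic behind the preference for a dose-based M can be made explicit. A response expressed as a percentage of animals affected cannot exceed 100%, so the largest possible fold increase depends on the starting response rate, whereas a fold change in effective dose can be applied at any response level. The function names and the ED value of 134 mg/kg below are hypothetical.

```python
def max_response_fold_increase(baseline_response_percent: float) -> float:
    """Largest possible fold increase in a response that is capped at 100%."""
    if not 0.0 < baseline_response_percent <= 100.0:
        raise ValueError("baseline response must be in (0, 100] percent")
    return 100.0 / baseline_response_percent


def potentiated_effective_dose(ed_x: float, fold_reduction: float) -> float:
    """Effective dose after a fold reduction; valid at any response level x."""
    return ed_x / fold_reduction


for x in (1.0, 20.0, 90.0):
    print(f"ED{int(x):02d}: response can rise at most "
          f"{max_response_fold_increase(x):.2f}-fold")

# A 67-fold potentiation applied to a hypothetical ED of 134 mg/kg.
print(potentiated_effective_dose(134.0, 67.0), "mg/kg")
```

The first loop reproduces the 100-fold and 5-fold ceilings quoted in the text for an ED01 and an ED20.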
The binary WOE classification and its score, B, are almost identical to their counterparts in the 1992 formula of Equation 2. In both versions, the value of B is negative for antagonism and positive for synergism, with -1 and 1 indicating the strongest evidence for each interaction, respectively. The WOE classification scheme for Equation 2 (Table 3) has more steps but is similar to that for Equation 6 in that the evidence is judged according to the extent of extrapolation or inference required. The Agency for Toxic Substances and Disease Registry discusses this WOE scheme in detail in its mixture risk guidance (20). In both Equations 2 and 6, the WOE judgment gives a score closer to zero as the quality or relevance of the interaction information diminishes (or evidence for dose addition strengthens). Note that the U.S. EPA scores are not symmetric. To err on the side of increased protection, the U.S. EPA approach requires stronger evidence for antagonism before allowing a relaxation of expected toxicity.

During the discussion of the two formulas (Equations 2, 6), we evaluated their numerical values for a few plausible simplified conditions and discovered a dramatic difference. For the conditions involving perfect evidence for synergy, i.e., where B = 1, the value of Equation 2 became constant, regardless of changes in mixture composition (Figure 1). This is easily seen in the formula for the exponent: when every B equals 1, the numerator of WOE_N equals its denominator (the maximum value of the numerator), so WOE_N = 1 whatever the component proportions. This unintended property of the 1992 formula is obvious in hindsight. Yet this formula has been published at least twice (16,21) following internal U.S. EPA review and journal peer review. Why was this problem not discovered? We believe the explanation lies in the formula being a decision index. Although the response addition and relative potency factor methods produce quantitative estimates of a measurable mixture response, the HI and interaction-based HI provide only numerical indicators of the degree of concern for potential mixture toxicity.
Consequently, instead of performing the usual observed versus expected comparison, such formulas are judged on their plausibility, an evaluation that can be quite subjective. In fact, although the U.S. EPA has guidance for the evaluation of physically based mathematical models, it presently has no quality assurance process for these kinds of risk-based decision formulas. We now propose some general guidance and illustrate the steps with these two interaction-based HI formulas.
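The contrast described above can be reproduced numerically for a two-chemical mixture. The sketch assumes symmetric scores and magnitudes (B_12 = B_21 = B, M_12 = M_21 = M), under which the assumed forms of Equations 2 and 6 collapse to HI x UF_I^B and HI x M^(B g), respectively. The value M = 10 is hypothetical and is chosen only so that both formulas agree at the equitoxic point.

```python
import math


def eq2_two_chemicals(hq1: float, hq2: float, b: float, uf_i: float = 10.0) -> float:
    """Two-chemical Equation 2 (assumed form) with B_12 = B_21 = b."""
    w = math.sqrt(hq1 * hq2)                      # geometric-mean weight
    woe_n = (2 * b * w) / (2 * w) if w > 0 else 0.0
    return (hq1 + hq2) * uf_i ** woe_n


def eq6_two_chemicals(hq1: float, hq2: float, b: float, m: float) -> float:
    """Two-chemical Equation 6 (assumed form) with symmetric b and m; f_12 = f_21 = 1."""
    g = math.sqrt(hq1 * hq2) / ((hq1 + hq2) / 2.0)
    return (hq1 + hq2) * m ** (b * g)


# Fixed HQ_1 + HQ_2 = 1 with perfect evidence of synergism (B = 1).
for share in (0.1, 0.3, 0.5, 0.7, 0.9):
    hq1, hq2 = share, 1.0 - share
    print(f"HQ1 share {share:.1f}: "
          f"Eq. 2 = {eq2_two_chemicals(hq1, hq2, 1.0):6.2f}, "
          f"Eq. 6 = {eq6_two_chemicals(hq1, hq2, 1.0, 10.0):6.2f}")
```

In the first function the geometric-mean weight cancels out of the exponent, which is why the composition has no influence even though the weight appears in the formula's construction.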

Methods
These two formulas (Equations 2, 6) were designed from general concepts of interaction, not from extensive data or mechanistic principles. As discussed before, evaluations of the quality of such formulas cannot use common statistical tests and procedures such as goodness-of-fit calculations. Instead, for such formulas we recommend simple rules based on a formula's structure and its numerical behavior. First, the formula must reflect most of the basic concepts or principles believed to apply to the environmental situation being assessed. Second, the formula must accurately track the numerical behavior of simplified or trivial conditions, i.e., situations where the correct result is known. Because these formulas are simple approximations, it is likely not all underlying concepts will be reflected, and not all trivial conditions will be accurately tracked. For component-based mixture assessment, we suggest the following properties for use in evaluating the mixture formulas.
Basic concepts:
1) If a chemical is not involved in any toxicologic interactions, then as that chemical's exposure level increases, the mixture index formula also increases in magnitude. This property assumes the chemical has a monotonically increasing dose-response curve.
2) The formula must be symmetric for a chemical pair in that it makes no difference which chemical is denoted chemical 1.
3) The impact of an interaction on HI modified to reflect pairwise toxicologic interactions (HI_INT) increases if the WOE for the interaction increases, all other factors held constant.
4) The formula must reflect the relative proportions of the components so the contribution of any pairwise interaction, for HQ_1 + HQ_2 fixed, is strongest when the two chemicals are at equitoxic exposure levels (i.e., when HQ_1 = HQ_2). This concept applies regardless of the strength of evidence for the interaction.
5) The interaction magnitude is defined in terms of a change in isoeffective dose (e.g., change in the ED10).
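Basic concepts 1 through 4 are numerical properties, so they can be written as small checks that accept any two-chemical index function. The sketch below is one way to do this; the checker names, the grid of test points, and the toy index (a dose-additive sum multiplied by 5 raised to B times the geometric-to-arithmetic mean ratio) are all hypothetical and serve only to exercise the checks. Concept 5 concerns how the magnitude is defined and is not a numerical property.

```python
import math
from typing import Callable

Index = Callable[[float, float, float], float]   # (hq1, hq2, b) -> index value


def toy_index(hq1: float, hq2: float, b: float, m: float = 5.0) -> float:
    """Hypothetical two-chemical index used only to exercise the checks."""
    g = math.sqrt(hq1 * hq2) / ((hq1 + hq2) / 2.0)
    return (hq1 + hq2) * m ** (b * g)


def increases_with_dose(index: Index, hq2: float = 0.4) -> bool:
    """Concept 1: with no interaction (b = 0), more exposure gives a larger index."""
    values = [index(hq1, hq2, 0.0) for hq1 in (0.1, 0.2, 0.4, 0.8)]
    return all(lo < hi for lo, hi in zip(values, values[1:]))


def is_symmetric(index: Index, hq1: float = 0.3, hq2: float = 0.7, b: float = 0.5) -> bool:
    """Concept 2: relabeling the two chemicals leaves the index unchanged."""
    return math.isclose(index(hq1, hq2, b), index(hq2, hq1, b))


def grows_with_woe(index: Index, hq1: float = 0.3, hq2: float = 0.7) -> bool:
    """Concept 3: stronger evidence of synergism gives a larger index."""
    values = [index(hq1, hq2, b) for b in (0.0, 0.25, 0.5, 0.75, 1.0)]
    return all(lo < hi for lo, hi in zip(values, values[1:]))


def peaks_at_equitoxic(index: Index, total: float = 1.0, b: float = 1.0) -> bool:
    """Concept 4: for fixed HQ1 + HQ2, the index is largest when HQ1 = HQ2."""
    shares = [i / 100.0 for i in range(1, 100)]
    best = max(shares, key=lambda p: index(p * total, (1.0 - p) * total, b))
    return math.isclose(best, 0.5)


for check in (increases_with_dose, is_symmetric, grows_with_woe, peaks_at_equitoxic):
    print(f"{check.__name__}: {check(toy_index)}")
```

Passing the index as a callable lets the same checks be run against any candidate formula; a formula whose value is flat across component proportions would fail the last check.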

Results
Basic concepts 1-3 are satisfied by both formulas, Equations 2 and 6, as is the consistency with trivial case 8:

Table 1. WOE classification for toxicologic interactions. Data from U.S. EPA (3).
Category | Description
I | The interaction has been shown to be relevant to human health effects and the direction of the interaction is unequivocal.
II | The direction of the interaction has been demonstrated in vivo in an appropriate animal model, and relevance to potential human health effects is likely.
III | An interaction in a particular direction is plausible, but the evidence supporting the interaction and its relevance to human health effects is weak.
IV | The information is a) insufficient to determine the direction of any potential interaction; b) insufficient to determine whether any interaction would occur; or c) adequate as evidence that no toxicologic interaction between/among the compounds is plausible.

Table 2. WOE scores (B) by category and direction of interaction.
Category | Description | Greater than additive | Less than additive
I | The interaction has been shown to be relevant to human health effects and the direction of the interaction is unequivocal. | 1.0 | -1.0
II | The direction of the interaction has been demonstrated in vivo in an appropriate animal model, and the relevance to potential human health effects is likely. | 0.75 | -0.50
III | An interaction in a particular direction is plausible, but the evidence supporting the interaction and its relevance to human health effects is weak. | 0.50 | 0.0
IV | The assumption of additivity has been demonstrated or must be accepted. | 0.0 | 0.0

[9]

Equation 6 satisfies this property. In the interaction factor, M^(Bg), the only part involving the exposures is the function g. For chemicals 1 and 2, g_12 is the geometric mean of HQ_1 and HQ_2 divided by half of (HQ_1 + HQ_2), which is fixed at H. So the maximum of M^(Bg) is attained at the maximum of the geometric mean.
Properties 5, 6, and 7 require a parameter for M that is present only in Equation 6, so Equation 2 will not be considered further.

Table 3. WOE classification scheme used with Equation 2.
Direction of interaction
Symbol | Classification | Factor
= | Additive | 0
> | Greater than additive | +1
< | Less than additive | -1
? | Indeterminate | 0
Quality of the data: mechanistic understanding
Category | Description | Weighting
I | Direct and unambiguous mechanistic data: the mechanism(s) by which the interactions could occur has been well characterized and leads to an unambiguous interpretation of the direction of the interaction. | 1.0
II | Mechanistic data on related compounds: the mechanism(s) by which the interactions could occur have not been well characterized for the chemicals of concern, but structure-activity relationships, either quantitative or informal, can be used to infer the likely mechanism(s) and the direction of the interaction. | 0.71
III | Inadequate or ambiguous mechanistic data: the mechanism(s) by which the interactions could occur has not been well characterized or information on the mechanism(s) does not clearly indicate the direction that the interaction will have. | 0.32

We have shown the plausibility that such formulas cannot be judged on the basis of the general formula structure or on the numerical properties of pieces of the formula. Equation 2 failed to have a property that appears to be present in the formula's construction (the use of the geometric mean of HQ_j and HQ_k). Somehow these risk formulas must be designed so they produce plausible numerical values. One approach is to require that each formula adequately describe simple conditions that are well understood. For mixtures, we suggest that these simple conditions include the limit as interactions disappear and the limit as the component chemicals become more similar in their interactions. We also recommend that the formulas behave properly under the best of conditions, such as when the interactions data are excellent.
Last, we recommend that these formulas have default parameters and functions so when the desired mixture data are weak, the defaults can be implemented. Mixture risk formulas can occasionally be tested in the standard manner by goodness-of-fit comparisons over several whole-mixture data sets of varying composition. We recommend that data for such evaluations be generated, at least for representative simple mixtures for each of the major types of environmental chemicals such as pesticides, volatile organics, inorganics, petroleum fractions, and other commonly occurring chemical groups. For example, U.S. EPA researchers have designed a set of experiments exploring hepatotoxicity in female CD-1 mice for the four trihalomethanes (THMs), including assays on each single chemical, all six binary combinations, and eight 4-THM mixture combination points (22). The experimental design for one of the six binary combinations is shown in Figure 2. The doses and mixing ratios were selected so interaction effects could be investigated at several total dose levels and at different proportions in the 2-THM mixture. These data will help quantify the M factor for the interaction-based HI and can be used with the eight 4-THM combination points to adjust other functions and parameters so the binary information adequately reflects the toxicity of the whole mixture.
Mixture risk assessment formulas should improve in the near future as more pharmacokinetic models are developed and as more principles of interaction are identified and related to individual chemical properties. In practice, the variety of environmental mixtures will ensure that most will have some missing or weak information. As a result, the HI formulas discussed in this article may be enhanced by the new information, but likely will be forced to include several defaults. Until extensive data on complete mixtures become available, judgments of the plausibility of such formulas will continue to be at least partly subjective. Using a structured evaluation such as presented here will help ensure acceptable quantitative behavior of the risk formulas.