Charting a Path Forward: Assessing the Science of Chemical Risk Evaluations under the Toxic Substances Control Act in the Context of Recent National Academies Recommendations

Background: In 2016, Congress enacted the Frank R. Lautenberg Chemical Safety for the 21st Century Act (“the Lautenberg Act”), which made major revisions to the main U.S. chemical safety law, the 1976 Toxic Substances Control Act (TSCA). Among other reforms, the Lautenberg Act mandates that the U.S. Environmental Protection Agency (U.S. EPA) conduct comprehensive risk evaluations of chemicals in commerce. The U.S. EPA recently finalized the first set of such chemical risk evaluations. Objectives: We examine the first 10 TSCA risk evaluations in relation to risk science recommendations from the National Academies to determine consistency with these recommendations and to identify opportunities to improve future TSCA risk evaluations by further implementing these key approaches and methods. Discussion: Our review of the first set of TSCA risk evaluations identified substantial deviations from best practices in risk assessment, including overly narrow problem formulations and scopes; insufficient characterization of uncertainty in the evidence; inadequate consideration of population variability; lack of consideration of background exposures, combined exposures, and cumulative risk; divergent approaches to dose–response assessment for carcinogens and noncarcinogens; and a flawed approach to systematic review. We believe these deviations result in underestimation of population exposures and health risks. We are hopeful that the agency can use these insights and have provided suggestions to produce chemical risk evaluations aligned with the intent and requirements of the Lautenberg Act and the best available science to better protect health and the environment—including the health of those most vulnerable to chemical exposures. https://doi.org/10.1289/EHP9649


Introduction
The 1976 Toxic Substances Control Act (TSCA) underwent major revisions in 2016 with enactment of the Frank R. Lautenberg Chemical Safety for the 21st Century Act ("the Lautenberg Act"), establishing for the first time a mandate for the U.S. Environmental Protection Agency (U.S. EPA) to evaluate the risks of chemicals in commerce (Schmidt 2016; U.S. Congress 2016). These evaluations determine whether a chemical presents an "unreasonable risk of injury to health or the environment. . .including an unreasonable risk to a potentially exposed or susceptible subpopulation. . .under the conditions of use. . .without consideration of costs or other nonrisk factors" (U.S. Congress 2016). The Lautenberg Act created a clear demarcation between risk evaluation, with determinations to be based exclusively on consideration of risk, and risk management, which must consider costs and other nonrisk factors in selecting among options that are sufficient to mitigate any unreasonable risks that the U.S. EPA identifies (U.S. Congress 2016).
As a first step in the risk evaluation process, TSCA requires the U.S. EPA to establish the "scope" of a risk evaluation by specifying the chemical's conditions of use, hazards, exposures, and relevant potentially exposed or susceptible subpopulations. Definitions of key terms and scientific standards in TSCA are provided in Table S1. The TSCA amendments also required the U.S. EPA to identify, within 6 months of enactment, the first 10 chemicals to undergo risk evaluation, which the U.S. EPA did in December 2016 (Table 1).
The National Academies has put forth recommendations for best practices in chemical risk assessment to ensure that these assessments incorporate the best available science and provide information that is useful for decision-making. Specifically, the National Academies has recommended increased attention to planning, scoping, and problem formulation; characterizing and communicating uncertainty and variability; using a unified approach to dose-response assessment for cancer and noncancer effects; developing standards and criteria for the use of defaults; conducting cumulative risk assessments by incorporating chemical and nonchemical stressors; and adopting robust systematic review methods (NASEM 2017, 2018, 2021; NRC 2008, 2009, 2014a). The development of the first chemical risk evaluations under the Lautenberg Act was an opportunity for the U.S. EPA to put these recommendations into practice. This paper examines the first 10 TSCA risk evaluations specifically in relation to recommendations on risk assessment science from the National Academies (Table 2). Where relevant, we also refer to input that the U.S. EPA received from the Science Advisory Committee on Chemicals (SACC), a statutorily created committee that provides independent scientific peer review of TSCA risk evaluations. Through this analysis, we highlight ways in which the U.S. EPA deviated from best practices (including, for example, defining overly narrow risk evaluation scopes that exclude relevant exposures; applying different approaches to dose-response characterization for cancer and noncancer effects; and developing and implementing a systematic review approach that deviates from fundamental elements of the practice) and the associated consequences for public health protection. We provide recommendations to improve future TSCA risk evaluations and discuss the opportunities that TSCA, appropriately implemented, creates for advancing the science of risk assessment.

Planning, Scoping, and Problem Formulation
Defining the scope of a risk assessment, including the hazards, exposures, and populations to be considered, is a critical early step in the process, though it can evolve as the risk assessment is developed (NRC 2009). TSCA directs the U.S. EPA to comprehensively evaluate chemical risks, taking into consideration known and potential hazards and the potential for multiple pathways of exposure, including to subpopulations that are more highly exposed or susceptible (U.S. Congress 2016). This comprehensive approach is aligned with recommendations of the National Academies, which argued that a narrow scope may distort the validity and applicability of a chemical assessment (NRC 2009); by excluding relevant sources of exposure, the U.S. EPA may underestimate the aggregate risks faced by exposed populations. Despite these recommendations and TSCA's requirements, nearly all of the U.S. EPA's final risk evaluations to date have excluded known, often significant, sources of chemical releases and exposures (Table 3). The U.S. EPA's stated basis for these exclusions was that TSCA gives the agency wide discretion to choose which activities, releases, exposures, hazards, and subpopulations to include or exclude from the scope of a risk evaluation. The U.S. EPA asserted this discretionary authority in rules the agency issued in July 2017 establishing procedures to prioritize and conduct risk evaluations on chemicals in commerce (U.S. EPA 2017b, 2017d). These procedures were promptly challenged, and in 2019 the Ninth Circuit Court of Appeals issued a decision that found illegal or questionable several of the exclusions the U.S. EPA had asserted the agency had authority to invoke and concluded that the U.S. EPA's rules did not grant the agency discretion to exclude any of a chemical's conditions of use from a risk evaluation. The Court deemed it premature to rule on whether the U.S. EPA must under TSCA evaluate combinations of exposures from multiple conditions of use because at the time the U.S. EPA had yet to release any final risk evaluations for specific chemicals (U.S. Court of Appeals for the Ninth Circuit 2019). Subsequently, the U.S. EPA released the first 10 final risk evaluations, which do not examine combined exposures, prompting additional, active legal challenges of several of the evaluations.
The types of exclusions of uses, releases, and exposure pathways that the U.S. EPA applied in its first 10 risk evaluations include: a) statute-based exclusion of environmental releases and human exposure pathways; b) general exclusion of exposures to a chemical when present as a by-product or impurity; c) exclusion or isolated analysis of "legacy" uses and associated disposal; and d) exclusion of background exposures (Table 3). Each of these exclusions and isolated analyses would have contributed to an underestimation of exposure and risk to public health.

Characterization and Communication of Uncertainty and Variability
Uncertainty. The National Academies called for systematic analysis of "the sources, nature, and implications of the uncertainties" in a given risk assessment (NRC 2009). More than two decades earlier, former U.S. EPA Administrator William Ruckelshaus also highlighted this issue, emphasizing that "We must insist on risk calculations being expressed as distributions of estimates and not as magic numbers that can be manipulated without regard to what they really mean. We must try to display more realistic estimates of risk to show a range of probabilities" (Ruckelshaus 1984).
In the first 10 TSCA risk evaluations, the U.S. EPA did not conduct sufficient sensitivity analyses to characterize and communicate the implications of its reliance on various models in the absence of measured data or of its use of uncertain model inputs, such as when characterizing chemical exposures. In its review of the draft methylene chloride (DCM) risk evaluation, the SACC highlighted this issue, citing as an example the U.S. EPA's use of a single value of 57% for the removal of DCM during wastewater treatment without considering any potential variance. The SACC urged the U.S. EPA to better document uncertainties and assumptions associated with exposure model inputs and to employ sensitivity analyses to assess their impact (U.S. EPA 2020g). Similarly, in reviewing the draft 1-bromopropane (1-BP) risk evaluation, the SACC noted gaps in the data used to inform environmental and human exposure assessments (such as limited data on toxicity to aquatic organisms and on vapor capture efficiency and extent of use of personal protective equipment in occupational settings) and recommended that the U.S. EPA use sensitivity analyses to characterize the extent to which the gaps affect risk estimates (U.S. EPA 2020c).
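The kind of sensitivity analysis the SACC called for can be illustrated with a minimal one-at-a-time sketch. It assumes a deliberately simple fractional-removal model of wastewater treatment; the function names, the influent concentration, and the alternative removal fractions are hypothetical, and only the 57% value comes from the DCM example above.

```python
def effluent_concentration(influent_mg_per_l: float, removal_fraction: float) -> float:
    """Concentration leaving treatment under a simple fractional-removal assumption."""
    if not 0.0 <= removal_fraction <= 1.0:
        raise ValueError("removal_fraction must be between 0 and 1")
    return influent_mg_per_l * (1.0 - removal_fraction)


def one_at_a_time_sensitivity(influent_mg_per_l: float, removal_fractions) -> dict:
    """Map each candidate removal fraction to the resulting effluent concentration."""
    return {r: effluent_concentration(influent_mg_per_l, r) for r in removal_fractions}


# Hypothetical influent (1.0 mg/L) and hypothetical low/high removal fractions;
# 0.57 is the single point value cited in the text.
results = one_at_a_time_sensitivity(1.0, [0.40, 0.57, 0.75])
for removal, conc in results.items():
    print(f"removal={removal:.2f} -> effluent={conc:.2f} mg/L")
```

Even in this toy model, varying the removal fraction across a plausible range changes the estimated release, which is the information a single point value cannot convey.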
One means of reducing uncertainty is to fill data gaps. An area of significant bipartisan consensus in TSCA reform was the need to enhance the U.S. EPA's authority to readily obtain more and better information on chemicals from chemical manufacturers and processors. Under the 2016 reforms the U.S. EPA can now, through the issuance of an order, require the development of real-world exposure information and information on a chemical's hazards, which would help reduce uncertainties in the risk assessment process. The U.S. EPA used this authority to obtain limited information for only one of the first 10 chemicals to undergo risk evaluation, and it did so only late in the risk evaluation process (U.S. EPA 2020b). For the risk evaluation of Pigment Violet 29 (PV29), the U.S. EPA required its manufacturers to conduct solubility testing and measurements of workplace dust levels and particle size distributions (U.S. EPA 2020b), some of which were incorporated into the final risk evaluation. However, deficiencies in the workplace monitoring data that the U.S. EPA received forced it to rely more heavily on assumptions in its occupational exposure assessment (U.S. EPA 2021a). The U.S. EPA also lacked any data on acute and chronic toxicity of PV29 by inhalation, its primary exposure route of concern. By not requiring testing earlier in the risk evaluation process, the agency was forced in the end to rely exclusively on analog data from carbon black (U.S. EPA 2021a). The U.S. EPA's explanation for its choice of carbon black included similarity with respect to particle size, physical-chemical properties (e.g., solubility and density), chemical composition, and structure (i.e., planar with multiple carbon rings). Comments submitted to the agency by some authors described multiple concerns with the U.S. EPA's selection of carbon black as an analog to characterize PV29 toxicity (EDF 2019a).
To date, no inhalation toxicity or air concentration data specific to PV29 have been developed (EDF 2019a; U.S. EPA 2021a). Across the first 10 risk evaluations, the U.S. EPA has been criticized for relying on limited information to draw firm risk conclusions, especially without conducting and integrating robust uncertainty analyses into risk conclusions (EDF 2019b, 2019c, 2019d; SACC 2019a, 2019b, 2020a). Common critical gaps across risk evaluations identified by authors include insufficient or deficient information on releases of chemicals into various environmental media and the resulting concentrations; concentrations in and releases from industrial, commercial, and consumer products and materials; the degree, frequency, and duration of exposures in workplaces; human hazard end points of key concern; and hazards to sediment- and soil-dwelling and terrestrial and aquatic organisms. A consequent overreliance on modeled physical-chemical and environmental fate data is pervasive across the risk evaluations.

Table 2. Key recommendations on risk assessment from the National Academies and examples of departures from this guidance among the first 10 chemicals to undergo risk evaluation under the Lautenberg Act. [Table body not recovered; first column heading begins: Recommendations for best practices in risk assessment from the National Academies (NASEM 2021; NRC 2008, 2009 ...]

Table 3. [Caption and some cells not recovered.]

Statute-based exclusion of environmental releases and human exposure pathways
• In its peer review report on the 1,4-D draft, the U.S. EPA Science Advisory Committee on Chemicals (SACC) noted the unscientific nature of the U.S. EPA's statute-based exclusions: "Unfortunately, many of the inadequacies of the draft Evaluation have their genesis in a faulty problem formulation. There are several areas where the problem formulation strayed from basic risk assessment principles by omitting well known exposure routes such as water consumption by all occupationally and non-occupationally exposed humans as well as similar exposures to other biological receptors. . . The decision by the EPA to defer concerns of consumer exposure, or exposure of the general public, through ambient water or air because 'other environmental statutes administered by EPA adequately assess and effectively manage these exposures' was not deemed acceptable by many of the Committee members" (SACC 2019c).
• The Ninth Circuit, in its opinion cited earlier on the legal challenge to the U.S. EPA's underlying Risk Evaluation Rule (U.S. Court of Appeals for the Ninth Circuit 2019), found that in the Toxic Substances Control Act (TSCA) "[t]he phrase 'the conditions of use within the scope of' an evaluation simply refers to the conditions of use that are applicable to any particular substance-and that therefore are included in the scope of that substance's evaluation-without excluding any conditions of use in forming that list" (U.S. Court of Appeals for the Ninth Circuit 2019). Thus, the Court found that the U.S. EPA's rule does not allow the U.S. EPA to exclude conditions of use when preparing a risk evaluation.

Exclusion of exposures to a chemical when present as a by-product or impurity
In its risk evaluation of 1,4-D, the U.S. EPA stated: "EPA has exercised its authority in TSCA Section 6(b)(4)(D) to exclude from the scope of this risk evaluation conditions of use associated with 1,4-D generated as a by-product in manufacturing, industrial and commercial uses" (U.S. EPA 2020f). As a result, the U.S. EPA excluded all exposures and risks to workers arising from such conditions of use, based on a finding that 1,4-D is present in those settings and products as a by-product rather than being intentionally used (U.S. EPA 2020f). In response to requests from the formulated products industry, however, the U.S. EPA conducted a supplemental analysis limited to eight categories of consumer products in which 1,4-D is present as a by-product. That analysis failed to examine exposures to workers using the same types of products, and it did not consider releases and exposures associated with down-the-drain disposal of such products after use. Moreover, the U.S. EPA deferred any such considerations to future risk evaluations it may someday conduct for the ethoxylated chemicals that give rise to 1,4-D as a by-product: "EPA will consider other conditions of use where 1,4-D is a by-product as part of the future risk evaluations for chemicals that produce it as by-product" (U.S. EPA 2020f).
• TSCA makes no distinction between exposures to chemicals based on whether they are intentionally present or added to a formulation or are formed and present as a by-product. TSCA requires the U.S. EPA to evaluate the risks of all known and reasonably foreseen, as well as intended, conditions of use of a chemical, which clearly encompass its presence as a by-product.
• This distinction lacks a scientific basis, as people and the environment can be exposed to by-products just as they can to intentionally used chemicals. The U.S. EPA's exclusion leads to an underestimation of human and environmental exposure and risk and is counter to best practice in risk assessment.
• The U.S. EPA's SACC criticized this approach, stating that the U.S. EPA had not provided an adequate scientific basis for this policy decision (SACC 2019c).
• There are potentially hundreds of ethoxylated chemicals that give rise to 1,4-D as a by-product used in hundreds of types of consumer, commercial and industrial products (U.S. EPA 2017c). The U.S. EPA's approach would mean that it would evaluate the risks of such exposure piecemeal rather than comprehensively and would not evaluate even those risks for many years. Hence, there would not be an evaluation of risks from all exposure sources.

Exclusion or isolated analysis of "legacy" uses and associated disposal
In its Risk Evaluation Rule, the U.S. EPA stated it had authority to and would exclude known exposures from ongoing uses and disposal of a chemical where it is no longer produced for those uses. In its draft risk evaluation for asbestos, the U.S. EPA excluded all exposure to installed building materials containing asbestos as well as exposures arising from the disposal of such materials. In response to widespread criticism of this exclusion and the Ninth Circuit Court of Appeals decision ruling such exclusions illegal, the U.S. EPA announced its intention to conduct a separate, supplemental risk evaluation limited to the legacy conditions of use (U.S. EPA 2020i).
• There is no basis in TSCA for distinguishing between exposures to chemicals from known uses and disposal activities based on whether or not the chemicals are currently manufactured for such uses.
• A basic principle of chemical risk assessment is that risk is informed by the total extent of exposure to a substance. Such exclusions underestimate the totality of chemical exposure, thereby leading to lower estimates of risk.
• The U.S. EPA's SACC was critical of the U.S. EPA's initial exclusion of "legacy" exposures and its later stated intention to address them separately, noting: "Risks from asbestos for disease is cumulative. Thus, the Committee suggested that calculations of the risk estimates for cancer should consider legacy asbestos exposures. In addition, this would require incorporation of aggregate exposures, as these are essential to understand how humans may be affected by multiple sources/pathways of legacy use. . . . Members noted that the statement 'risk could be underestimated' because legacy exposures were not included is an understatement. Almost all the existing sources of exposure come from 'legacy exposure'; the so called 'bystander exposure' is limited in scope and much focused, and as such it is not generalizable. An important feature is that legacy exposures could impact some exposures more than others and thus differentially impact the risk estimates. Some effort to quantify this, or at least characterize differential impacts of legacy exposures across categories should be considered. The Committee has recommended the Agency to include legacy exposure in the calculation of cancer risk from asbestos exposure" (SACC 2020b).
• The Ninth Circuit Court of Appeals "vacated" those portions of the rule that allowed the U.S. EPA to exclude legacy uses and their associated disposal (U.S. Court of Appeals for the Ninth Circuit 2019). In response, the U.S. EPA acknowledged it must revise its approach to address such exposures (U.S. EPA 2020i). However, its decision to do so through a separate, isolated evaluation will mean that the U.S. EPA will still exclude the contribution of legacy exposures to overall exposures.

Background exposures
"Background levels of TCE in indoor and outdoor air are not considered or aggregated in this assessment; therefore, there is a potential for underestimating consumer inhalation exposures, particularly for populations living near a facility emitting TCE or living in a home with other sources of TCE, such as TCE-containing products stored in the home" (U.S. EPA 2020h).
• Although the U.S. EPA at least acknowledges this decision as a source of risk underestimation, its failure to evaluate or account for such exposures also means that it will not adequately assess risks to people living in proximity to sources of such background exposures, including environmental justice communities, who represent potentially exposed or susceptible subpopulations that must be considered under TSCA.

In March 2020, the U.S. EPA deferred making a regulatory determination on whether even to initiate a national primary drinking water regulation for 1,4-dioxane.
In light of these concerns and given the length of time some types of studies or testing take to conduct, authors believe it is essential that the U.S. EPA invoke its information authorities early in the risk evaluation process to allow sufficient time for the information to be developed (with any deficiencies addressed) and incorporated into risk evaluations. In using these authorities, we recommend that the U.S. EPA require companies to adhere to specific testing protocols and certify as to the accuracy and completeness of the information they submit. All submitted data should be rigorously evaluated, including through systematic review procedures that include conflict of interest as a risk of bias element.
Variability. Accounting for the true variability in vulnerability across the population by identifying subpopulations that may be uniquely susceptible and/or highly exposed is a critical component of risk assessment (NRC 2009). The first 10 risk evaluations did not fully consider or characterize risks to key groups of potentially exposed or susceptible subpopulations, such as workers; individuals living near conditions of use or disposal sites; individuals with preexisting conditions; fetuses, children, and pregnant women; or groups who may be disproportionately exposed or susceptible in multiple ways (e.g., pregnant workers, individuals with preexisting conditions living near disposal sites). In most cases, vulnerable subpopulations were not included (select examples are provided in Table 4), resulting in an underestimation of risk to the most vulnerable persons. For example, in the risk evaluation for 1-BP, the U.S. EPA did not address the heightened risk to children and pregnant women as potentially susceptible subpopulations. Despite the U.S. EPA's acknowledgment that reliance on the default human variability uncertainty factor is not sufficient to protect these and other more susceptible groups (U.S. EPA 2020a) and evidence of perinatal vulnerability to 1-BP observed in rodent studies (WIL Research 2001), more appropriate approaches (such as use of uncertainty factor distributions better capable of capturing population variation in susceptibility) (Hattis et al. 2002; Zeise et al. 2013) were not considered in calculating risk estimates.

Unified Approach to Dose-Response Assessment
The U.S. EPA has traditionally evaluated risk from carcinogenic and noncarcinogenic substances using distinct methodologies. Chemicals believed to cause cancer via genotoxicity or an undefined mechanism are assumed to have a linear, no-threshold dose-response relationship (unless robust data support an alternative relationship), and a quantitative risk estimate is developed. Chemicals eliciting noncancer effects regardless of mechanism are assumed to have thresholds below which adverse effects are not expected to occur, yielding a binary indication of hazard (e.g., the hazard quotient), but no risk estimate (Table S2). Instead, the U.S. EPA defines a reference dose (RfD) or reference concentration (RfC) as daily exposure estimates "likely to be without an appreciable risk of deleterious effects" over a lifetime of exposure "with uncertainty spanning perhaps an order of magnitude" (U.S. EPA 2002). Relatedly, for noncarcinogenic substances the U.S. EPA calculates a margin of exposure (MOE), the ratio of the point of departure (POD) to an anticipated exposure, to characterize the potential risk (Table S2). A large MOE is considered to indicate minimal risk, whereas a low MOE reflects a higher level of risk (U.S. EPA 2002).
The National Academies recommended a unified approach to dose-response assessment for both cancer and noncancer effects. The approach is characterized by the definition of risk-based doses for all end points and the assumption of a linear, no-threshold dose-response relationship in the absence of evidence to the contrary. Such an approach improves risk characterization and risk management decisions for chemicals with noncancer effects by quantifying the excess population risk at any specified dose, rather than relying on the bright-line MOE approach.
The TSCA risk evaluations to date, however, do not follow this guidance and instead evaluate the potential for noncancer end points using the MOE approach. The calculated MOE is compared to a predetermined "benchmark MOE" (the MOE deemed to be acceptable, typically ranging from 10 to 300) to evaluate whether the substance presents unreasonable risk. As raised by others in the scientific community, we believe this binary benchmark does not address the potential for differential risk as a result of varying individual susceptibilities and coexposures across the population (NRC 2009). Further, we believe this approach limits the assessment's utility for risk managers, effectively hindering cost-benefit analyses of various risk management options.
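The contrast between a bright-line MOE comparison and a risk-based estimate can be sketched in a few lines. This is an illustration under stated assumptions, not the U.S. EPA's or the National Academies' method: the POD, benchmark MOE, exposure values, and slope are all hypothetical, and the linear function is only the simplest possible stand-in for a risk-based dose-response model.

```python
def margin_of_exposure(pod: float, exposure: float) -> float:
    """MOE = point of departure / anticipated exposure (same units)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return pod / exposure


def presents_concern(moe: float, benchmark_moe: float) -> bool:
    """Bright-line screen: an MOE below the benchmark is flagged."""
    return moe < benchmark_moe


def linear_excess_risk(slope: float, dose: float) -> float:
    """Illustrative linear, no-threshold excess risk, capped at 1."""
    return min(1.0, slope * dose)


POD = 10.0         # hypothetical POD, mg/kg-day
BENCHMARK = 100.0  # hypothetical benchmark MOE
SLOPE = 1e-3       # hypothetical excess risk per mg/kg-day

for exposure in (0.09, 0.11):
    moe = margin_of_exposure(POD, exposure)
    print(
        f"exposure={exposure} MOE={moe:.0f} "
        f"flagged={presents_concern(moe, BENCHMARK)} "
        f"excess risk={linear_excess_risk(SLOPE, exposure):.1e}"
    )
```

The two hypothetical exposures differ only slightly, yet the binary screen flags one and not the other, whereas the risk-based estimate changes smoothly with dose and could be carried into a comparison of risk management options.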

Criteria for and Selection of Defaults
The U.S. EPA has historically relied on uncertainty factors (UF; also called adjustment factors) to account for uncertainties arising from gaps or deficiencies in a chemical's toxicity information (Table S2). For example, the U.S. EPA calculates RfD and RfC values by dividing a chemical's POD by UFs. Likewise, benchmark MOEs are derived by multiplying relevant UFs (Table S2). The default values assigned to the UFs are typically set at 10; any deviations require clear justification. Although older U.S. EPA workgroup proceedings are available to guide the use of these factors (U.S. EPA 2002), their application is largely expert driven, and no formal, objective framework has been proposed by the U.S. EPA to fully standardize their use. The National Academies has indicated that, in practice, UFs attempt to account for both uncertainties and variabilities, though the accuracy of these adjustments remains undercharacterized. As a result, the National Academies recommended use of probability distributions to transparently account for elements of uncertainty and variability in place of default values for UFs (NRC 2009).
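The arithmetic described above (a reference value equals the POD divided by the UFs, and a benchmark MOE equals the product of the UFs) can be sketched as follows. The factor names, the POD, and the choice of which factor to lower are hypothetical and serve only to show how a single departure from a default of 10 propagates.

```python
from math import prod


def benchmark_moe(uncertainty_factors: dict) -> float:
    """Benchmark MOE as the product of the applied uncertainty factors."""
    return float(prod(uncertainty_factors.values()))


def reference_value(pod: float, uncertainty_factors: dict) -> float:
    """RfD- or RfC-style value: POD divided by the product of the UFs."""
    return pod / benchmark_moe(uncertainty_factors)


# Hypothetical factor set with every factor at the default of 10.
default_ufs = {"interspecies": 10, "intraspecies": 10, "LOAEL_to_NOAEL": 10}
# Same set with one factor lowered from 10 to 3.
reduced_ufs = {**default_ufs, "LOAEL_to_NOAEL": 3}

pod = 30.0  # hypothetical POD, mg/kg-day
for label, ufs in (("default", default_ufs), ("reduced", reduced_ufs)):
    print(
        f"{label}: benchmark MOE={benchmark_moe(ufs):.0f}, "
        f"reference value={reference_value(pod, ufs):.3f} mg/kg-day"
    )
```

Lowering one factor from 10 to 3 raises the resulting reference value, and lowers the benchmark MOE, by a factor of about 3.3, which is why each such choice warrants explicit justification.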
In the first 10 risk evaluations, we observed that the U.S. EPA employed neither probability distributions nor an objective, transparent framework for use of UFs in deriving toxicity reference values and benchmark MOEs. Instead, an examination of these science policy decisions suggests a pattern of choosing less-protective UFs. For example, in applying the lowest observed adverse effect level (LOAEL)-to-no observed adverse effect level (NOAEL) UF (UF_L) to certain PODs for trichloroethylene (TCE) and DCM, the U.S. EPA employed a value of 3, rather than the more common value of 10. In the case of the TCE toxicity reference value for immunotoxicity, the U.S. EPA justified the smaller UF_L by characterizing changes in antibody levels as a subclinical effect rather than a frank measure of autoimmune toxicity (U.S. EPA 2020h). In the case of DCM, the U.S. EPA considered only the "small magnitude" of the effect, while ignoring the nature of the response at the LOAEL (U.S. EPA 2020d). Earlier U.S. EPA guidance indicated that both of these components (addressing the severity and burden of the effects in terms of the degree of change in a measured parameter at the LOAEL and the fraction of the population affected at the LOAEL) must be adequately considered before lowering the UF_L (U.S. EPA 2002). Further, the application of a UF_L of 3 to the acute DCM POD for decreased visual peripheral performance conflicts with earlier U.S. EPA and Office of Environmental Health Hazard Assessment (OEHHA) assessments (California Office of Environmental Health Hazard Assessment 2008a; U.S. EPA 2014a), which used a value of 6 for the UF_L.
The risk evaluations did not employ a database UF (UF_D), which accounts for deficiencies in the study database that may have otherwise resulted in the identification of a more sensitive effect. More specifically, based on guidance from the 2002 U.S. EPA Risk Assessment Forum (RAF) report, the UF_D accounts for "the uncertainty associated with extrapolation from animal data when the database is incomplete" or when there are deficiencies in the database related to "particular organ systems as well as life stages" (U.S. EPA 2002). In the DCM risk evaluation, the U.S. EPA chose not to apply a UF_D for the selected developmental neurotoxicity and hematological effects (U.S. EPA 2020d), whereas an earlier Agency assessment of the chemical used a UF_D to address limitations in data for these end points (U.S. EPA 2011b). Similarly, the Agency did not use any UF_D values in the risk evaluations for 1-BP and 1,4-D, despite a limited evidence base of toxicological data.
The U.S. EPA conducted route-to-route extrapolation when dermal toxicity data were deficient or entirely unavailable. In the risk evaluations for TCE, DCM, and 1-BP, the U.S. EPA used previously developed physiologically based pharmacokinetic (PBPK) models to extrapolate from the inhalation to dermal routes (U.S. EPA 2020a, 2020h, 2020k). Oral-to-dermal extrapolation was performed for 1,4-D (U.S. EPA 2019b). Although cross-route extrapolation may sometimes be necessary, it introduces considerable uncertainty into risk estimation (Schröder et al. 2016). Despite this, the U.S. EPA did not perform quantitative adjustments or additional sensitivity analyses to account for uncertainties arising from this process.
Taken together, across the reviewed risk evaluations, we observed that the U.S. EPA systematically applied UFs that yielded less-protective risk estimates (or did not use UFs at all) and did not sufficiently characterize the quantitative impacts of doing so. In future risk evaluations, the U.S. EPA should employ a more rigorous and formal approach to quantifying these uncertainties, one that minimizes subjectivity and employs evidence-driven distributional analyses to show the range of impacts associated with various choices, as recommended by the National Academies (NRC 2009).
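A distributional analysis of the kind recommended can be sketched with a simple Monte Carlo simulation in which each adjustment factor is drawn from a distribution rather than fixed at a default. The sketch below assumes independent lognormal factors; the medians, geometric standard deviations, and POD are hypothetical placeholders rather than values from any risk evaluation or from the National Academies.

```python
import math
import random
import statistics


def sample_adjusted_doses(pod, factor_params, n=10_000, seed=0):
    """Divide the POD by the product of sampled lognormal factors.

    factor_params holds (median, geometric standard deviation) pairs,
    one pair per adjustment factor.
    """
    rng = random.Random(seed)
    doses = []
    for _ in range(n):
        total = 1.0
        for median, gsd in factor_params:
            total *= rng.lognormvariate(math.log(median), math.log(gsd))
        doses.append(pod / total)
    return doses


# Two hypothetical factors, each with median 3 and GSD 2; hypothetical POD of 10.
doses = sample_adjusted_doses(pod=10.0, factor_params=[(3.0, 2.0), (3.0, 2.0)])
cut_points = statistics.quantiles(doses, n=20)  # 5th, 10th, ..., 95th percentiles
print(f"5th percentile:  {cut_points[0]:.3f}")
print(f"median:          {statistics.median(doses):.3f}")
print(f"95th percentile: {cut_points[-1]:.3f}")
```

Reporting a lower percentile alongside the median makes the range of plausible adjusted doses visible, in contrast to a single value obtained by multiplying fixed defaults.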

Cumulative Risk Assessment
The National Academies has provided recommendations to advance cumulative risk assessment, which considers risks to individuals and the population from coexposures to chemical and nonchemical stressors (NRC 2008, 2009). For example, in Science and Decisions, the National Academies noted that "A narrow focus does not accurately capture the risks associated with exposure, given simultaneous exposure to multiple chemical and nonchemical stressors and other factors that could influence vulnerability" (NRC 2009). Nonchemical stressors that can modify chemical exposure-related health risks include socioeconomic deprivation and other psychosocial stressors as well as preexisting health conditions and differential susceptibility across the life span (Chari et al. 2012; Fox et al. 2017; Payne-Sturges et al. 2021; Schwartz et al. 2011). Cumulative risk concepts relating to life-stage susceptibility have been applied in the cancer risk assessment context (age-dependent adjustment factors) (Barton et al. 2005), and coexposures have been recognized in the occupational risk assessment and management context for DCM (i.e., carbon monoxide coexposure), and also for noise and chemical or drug coexposure on ototoxicity (National Institute for Occupational Safety and Health 1976; Occupational Safety and Health Administration 2018).
TSCA does not include an explicit mandate for the U.S. EPA to assess cumulative risk, yet such an approach is certainly authorized and argued for by three other TSCA requirements: a) TSCA's requirement that the U.S. EPA identify, assess, and protect against risks to subpopulations subject to greater exposure or greater susceptibility to an effect from a given chemical exposure, relative to the general population-which calls for consideration of all factors that could exacerbate exposure of or a chemical's effect in an individual or group; b) TSCA's requirement that the U.S. EPA use best available science, which argues for a cumulative risk approach where the scientific evidence is sufficiently developed to do so; and c) the broad reach of TSCA to encompass risks across the full life cycle and from all intended, known, or reasonably foreseen conditions of use of a chemical (see definitions in Table S1).
Table 4. Selected examples of potentially exposed or susceptible subpopulations absent or not sufficiently accounted for in recent TSCA risk evaluations. [Leading rows of the table are not recoverable from the source text.]
• Individuals with diabetes (Pastino et al. 2000)
• Individuals with obesity (Pastino et al. 2000)
• Individuals with preexisting conditions that affect liver or kidney (ATSDR 2012)
• Individuals with elevated alcohol intake (ATSDR 2012)
Workers, TCE (U.S. EPA 2020h):
• Workers with compromised health (Pastino et al. 2000)
• Workers with alcohol coexposure (Pastino et al. 2000)

We observed that the first 10 TSCA risk evaluations did not use the best available science, i.e., apply a cumulative risk approach (NRC 2009), as it relates to consideration of potentially exposed or susceptible subpopulations and coexposure to multiple chemicals and nonchemical stressors. For example, preexisting health conditions such as chronic liver or kidney diseases afflict millions of Americans and are expected to increase in prevalence over the next 10 y (Estes et al. 2018; Hoerger et al. 2015). These conditions may impair an exposed person's ability to detoxify and eliminate a chemical, making them more susceptible (Sheehan et al. 2012).
Yet, this type of information was not incorporated in the final risk evaluations for 1,4-D, likely resulting in decisions for the chemical that deviate from those using the best available science. We saw no effort to consider coexposures and nonchemical stressors in the first 10 risk evaluations (Table 2); in fact, we observed systematic undervaluing of epidemiological studies (see below) that have the advantage of encompassing real-world coexposures. Broader adoption of cumulative risk concepts would improve risk estimates and, we believe, overall public health protections of TSCA including those for susceptible subpopulations.

Application of Systematic Review
The application of systematic review (SR) to chemical assessment has gained substantial traction in the environmental health field. Arising from use in the clinical sciences, SR employs a structured approach to identifying, evaluating, and synthesizing evidence to enhance scientific rigor; promote consistency, transparency, and objectivity; and reduce bias. Prominent SR methods and tools in medicine, particularly Cochrane (Higgins et al. 2019) and GRADE (Guyatt et al. 2011), have influenced the approaches that have emerged in environmental health (Rooney et al. 2014; Woodruff and Sutton 2011).
The National Academies has published several reports in recent years recommending the application of systematic review in chemical assessment (NASEM 2017, 2018; NRC 2014), including a recent report evaluating the TSCA program's use of systematic review (NASEM 2021). Across these reports, the Academies have highlighted several best practices to improve the rigor and conduct of SRs in the context of chemical assessment; among these are distinguishing appropriate from inappropriate uses of scoring and adopting a structured approach to the integration of evidence across multiple evidence streams (i.e., toxicological, epidemiological, mechanistic data).
Per the TSCA risk evaluation rule, the U.S. EPA established that it would apply SR to meet the Lautenberg Act's requirement to use weight of the scientific evidence to make determinations of chemical risk (U.S. EPA 2017a). In 2018, the U.S. EPA's Office of Chemical Safety and Pollution Prevention (OCSPP) released Application of Systematic Review in TSCA Risk Evaluations ("TSCA SR document") (U.S. EPA 2018). This document presents the Office of Pollution Prevention and Toxics' (OPPT) SR approach under TSCA, which it has applied to the chemical risk evaluations it has completed to date. The bulk of the document is devoted to specifying criteria for evaluating study quality and methods for converting study quality evaluations into overall study quality scores. The U.S. EPA subsequently published and applied revisions to the data quality criteria for epidemiological studies in June 2019 (U.S. EPA 2019a). Review of the TSCA SR document and its application to risk evaluations to date reveals several deviations from recommendations of the Academies. In the following sections, we will focus on the issues arising from the TSCA SR approaches used for study scoring and evidence integration.
Quantitative study scoring. The TSCA SR document defines a framework to assign study quality scores by evaluating studies across various metrics. The use of quantitative scoring systems and summary scores in SR has been repeatedly criticized in established SR guidance (Jüni et al. 1999), in newer risk-of-bias frameworks tailored for use with observational studies (Dekkers et al. 2019), and by the National Academies (NASEM 2021). In its 2014 review of the Integrated Risk Information System (IRIS) program, the National Academies noted that SR methodologies have moved away from calculating quality scores because of numerous shortcomings. These shortcomings include the inherent subjectivity in assigning weights to various criteria, evolving requirements for reporting of study details, and empirical findings that quality scores have significant limitations in assessing risk of bias in clinical research (Higgins et al. 2008; Jüni et al. 1999; NRC 2014). Of particular concern, the TSCA SR document dictates that if any single metric is scored as "unacceptable," the entire study is excluded from further consideration in the risk evaluation.
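The two concerns raised above, the subjectivity of weights and the single-metric exclusion rule, can be made concrete with a small Python sketch. It is not the scoring scheme in the TSCA SR document: the metric names, the 1 to 3 scale (lower is better), the weights, and the three studies are all hypothetical, chosen only to show how a weighted summary score behaves.

```python
UNACCEPTABLE = None  # hypothetical marker for a metric rated "unacceptable"


def overall_score(metric_scores, weights):
    """Weighted mean of metric scores (lower is better in this sketch).

    Returns None when any metric is rated unacceptable, so the whole
    study drops out regardless of how it scored on every other metric.
    """
    if any(score is UNACCEPTABLE for score in metric_scores.values()):
        return None
    total_weight = sum(weights[metric] for metric in metric_scores)
    weighted_sum = sum(weights[m] * s for m, s in metric_scores.items())
    return weighted_sum / total_weight


# Three hypothetical studies scored on three hypothetical metrics.
study_a = {"exposure": 1, "outcome": 3, "confounding": 2}
study_b = {"exposure": 3, "outcome": 1, "confounding": 2}
study_c = {"exposure": 1, "outcome": 1, "confounding": UNACCEPTABLE}

# Two equally defensible weightings that differ only in emphasis.
weights_1 = {"exposure": 2, "outcome": 1, "confounding": 1}
weights_2 = {"exposure": 1, "outcome": 2, "confounding": 1}

for label, weights in (("weights_1", weights_1), ("weights_2", weights_2)):
    scores = [overall_score(s, weights) for s in (study_a, study_b, study_c)]
    print(label, scores)
```

Under the first weighting study A outranks study B, and under the second the order reverses, although neither study changed. Study C is excluded under both weightings despite having the best rating on the two metrics that were scored.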
Regarding epidemiological studies, the TSCA SR method, without justification, precludes numerous metrics from receiving a rating of "high confidence," diminishing a priori the use of epidemiological evidence in risk evaluation (U.S. EPA 2020e). This practice resulted in the systematic down-weighing of the line of evidence that throughout the U.S. EPA's history has been the preferred means of assessing risks to populations and informing efforts to protect human health and the environment. Epidemiological studies provide information that increases our understanding of the causes of disease, elucidates factors that influence the susceptibility of certain groups, and characterizes the levels of exposure leading to health effects. Integration of such evidence, along with that from in vivo and in vitro studies, can reduce the uncertainties and limitations of each type of study and strengthen the basis for scientific conclusions about risks (Deener et al. 2018).
The value of epidemiological data for human health risk assessment has been stated and reinforced by the U.S. EPA and others over many years (ATSDR 2005; Nachman et al. 2011; NRC 2009; U.S. EPA 1991, 2005). Nevertheless, in multiple TSCA risk evaluations, we observed that the U.S. EPA dismissed or downgraded this evidence stream through the study scoring scheme set in the TSCA SR method, as well as through general criticism or flawed interpretation of study methods and results. For example, in the 1-BP risk evaluation, the U.S. EPA selected a rat study for its POD modeling of neurotoxicity, rather than human epidemiological studies. This decision came as a result of the study scoring scheme established in the TSCA SR method, as well as flawed interpretation of study methods described in public comments by the Environmental Defense Fund (EDF 2019b; U.S. EPA 2020a, 2020d). In the DCM risk evaluation, the U.S. EPA discussed general limitations of epidemiological studies (i.e., potential for bias due to healthy worker effect or exposure misclassification; concerns about study sensitivity) (U.S. EPA 2020d). Yet there is no analogous discussion of the general limitations of experimental toxicological studies, suggesting a biased approach to evidence review. The U.S. EPA ultimately based the chronic inhalation hazard evaluation of DCM on an experimental rat study (U.S. EPA 2020d), rather than a human study that the state of California has used to set its chronic reference exposure level (California Office of Environmental Health Hazard Assessment 2008b); in our view the U.S. EPA dismissed the value of the human study in accounting for important coexposures (U.S. EPA 2020d). For both 1-BP and DCM, using toxicological studies rather than available human data resulted in risk evaluation findings (considering the POD and UFs) that allow higher exposures and therefore less public health protection.
Approach to evidence integration. A structured approach to evidence integration is a key element of applying a consistent and transparent approach to SR. In its 2014 review of the U.S. EPA IRIS program, the National Academies identified evidence integration as fundamental to determining whether a chemical poses a hazard and recommended that the agency develop templates to structure this process and resulting conclusions (NRC 2014). Although the TSCA SR document calls for evidence integration (U.S. EPA 2018), it does not provide a specific approach for evidence synthesis or integration (NASEM 2021); as a result, the final disposition of evidence is handled inconsistently both within risk evaluations (different approaches across hazard end points for the same chemical evaluation) and across them.
Apart from an ad hoc approach used in the TCE assessment, the U.S. EPA's risk evaluations have lacked structured approaches for evidence integration (NASEM 2021). For TCE, the U.S. EPA applied the Risk Assessment Forum's Weight of Evidence in Ecological Assessment approach ("RAF approach") to the evidence base for congenital heart defects (U.S. EPA Risk Assessment Forum 2016). Not a formal SR method, the RAF approach involves assessing individual studies across three main criteria: reliability (quality), relevance, and strength. Weighted scores are assigned for each criterion and used to develop individual study "grades," "summary scores" for each line of evidence (i.e., epidemiological, animal, and mechanistic), and ultimately an overall integrated "summary score" combining all lines of evidence. Appendix F of the final TCE risk evaluation provides a detailed discussion of the RAF approach, the U.S. EPA's basis for selecting this approach to evidence integration relative to other options considered, and its application in the TCE risk evaluation for evaluating evidence for congenital heart defects. Using the RAF approach, the U.S. EPA concluded that scientific evidence supports TCE-induced congenital heart defects (CHDs): "Overall, an association between increased congenital cardiac defects and TCE exposure is supported by the weight of the evidence, in agreement with previous U.S. EPA analyses" (U.S. EPA 2020i).
Notably, the U.S. EPA applied this structured approach only for the CHD end point, using narrative summaries of the evidence to reach conclusions for all other end points (U.S. EPA 2020i). These narrative summaries briefly describe the available literature for various toxicity end points and in a nonstructured manner draw a conclusion for the evidence of an effect. In contrast, SR methods provide a structured approach to reaching weight of evidence determinations, from protocol development to evidence integration. We believe the U.S. EPA's disparate treatment of the evidence for CHDs in comparison with other TCE effects, including immunotoxicity, the end point the U.S. EPA used to reach determinations of risk, introduces bias and uncertainty with regard to the conclusions of the risk evaluation. The TCE risk evaluation may have yielded different conclusions had the U.S. EPA applied the same approach to evidence integration for all end points. This disparate treatment is especially significant given the U.S. EPA's decision not to use CHDs, the most sensitive end point, as the basis for its determinations of TCE risk. That decision-to not use the most sensitive end point-is at odds with its previous assessments (U.S. EPA 2011a, 2014b) and decades of scientific policy and practice as well as TSCA's requirement that the U.S. EPA identify risks to potentially exposed and susceptible subpopulations (U.S. Congress 2016)-in this case, pregnant women.
The U.S. EPA's approach to SR for TSCA risk evaluations is immensely consequential because it dictates which data will be included in the risk evaluation and the weight these data will be given in the overall characterizations of chemical hazard and risk. It is concerning, then, that the TSCA SR approach deviates from established best practices for SR as documented in several National Academies reports (NASEM 2017, 2018, 2021; NRC 2014).

Characterizing Risk Based on the Most Sensitive End Point
National Academies reports (NRC 1994, 2009) as well as the U.S. EPA guidance (U.S. EPA 1991, 1995, 2002) direct the agency to base chemical risk assessments on the most sensitive end point-a basic scientific approach to ensuring health-protective characterizations of chemical risk. The U.S. EPA deviated from this long-standing practice in the TCE risk evaluation by using immunosuppression rather than CHDs in the assessment of acute and chronic risks from TCE exposure (U.S. EPA 2020i). The U.S. EPA made this decision despite the robust evidence base for CHDs, as described in the TCE risk evaluation and prior assessments and publications (Makris et al. 2016; Runyan et al. 2019; U.S. EPA 2014b). The U.S. EPA's risk evaluation found that CHDs provide the lowest POD value for modeling. The immunosuppression end point, by contrast, was orders of magnitude less sensitive and was not subjected to the same rigorous weight of evidence analysis as CHDs.
A 2021 internal memo from Dr. Michal Freedhoff, Acting Assistant Administrator of the U.S. EPA's Office of Chemical Safety and Pollution Prevention, indicated that political interference by the Trump administration was responsible for the shift away from the fetal cardiac end point (Freedhoff 2021). The U.S. EPA's decision not to use the most sensitive end point in evaluating TCE risk (U.S. EPA 2020h) results in an underestimation of TCE's true risks to the population, including to susceptible subpopulations and in particular pregnant women, and defies best practices in risk assessment.

Conclusion
Through our review of the first set of TSCA risk evaluations, we found that the agency did not employ the best available scientific methods and data in characterizing the risks of these chemicals to population health. As a result, we believe that decisions based on the conclusions of these risk evaluations do not measure up to the intent of the Lautenberg Act, which explicitly requires that the agency use best available science and reasonably available information (Table S1).
As documented in this commentary, closer adherence to guidance from the National Academies and other credible voices (Children's Health Protection Advisory Committee 2021; UCSF Program on Reproductive Health and Development 2020) could have aided the agency in avoiding numerous substantial problems. For many of the evaluated chemicals, limited and narrow problem formulations excluded significant sources and amounts of known releases (see Tables 1 and 3). We observed that evaluations and characterization of uncertainty and variability were limited and lacked rigor, often resulting in overreliance on low-quality data. In the absence of reliable data, the agency often applied inadequate default assumptions, a practice that is inconsistent with the language of TSCA. Baseline population exposures, combined exposures, and cumulative risks were not considered for any TSCA chemical. Unified approaches to dose-response assessment (NRC 2009) were not employed; rather, the agency relied on the MOE approach, which does not produce a quantitative estimate of risk (McGartland et al. 2017). Finally, the risk evaluations used a fatally flawed approach to SR (NASEM 2021). We also observed that epidemiological studies, long considered the gold standard for characterization of human risks, were marginalized. Perhaps most troubling was the selection of an insufficiently sensitive end point (in the TCE risk evaluation) as a result of reported nonscientific political intervention.
New U.S. EPA leadership, bolstered by a presidential memorandum on scientific integrity and evidence-based policymaking (The President of the United States of America 2021), has reasserted the importance of employing the best available science and data in agency activities, including TSCA implementation (Regan 2021).
In June 2021, the U.S. EPA's Office of Chemical Safety and Pollution Prevention (OCSPP), which administers TSCA, announced a "Path Forward for TSCA Chemical Risk Evaluations" (U.S. EPA 2021b). That announcement indicated the U.S. EPA's intention to revisit most of the previously issued risk evaluations to address some of the key issues discussed in this paper. Among the changes the U.S. EPA announced are ones that expand consideration of exposure sources and pathways that were previously excluded because they also fall under another U.S. EPA-administered statute. For the 1,4-D risk evaluation, the U.S. EPA intends to include exposures through drinking water and ambient air and to expand its consideration of exposures to the chemical when present as a by-product. For six other chemicals, the U.S. EPA will conduct screening-level assessments of exposures to fenceline communities from air and water releases of the chemicals; if potential unreasonable risks are indicated, the U.S. EPA expects to revise the risk evaluations to thoroughly evaluate such risks. If implemented, these changes will go far to remedy flaws resulting from unscientific exclusions of chemicals' conditions of use and resulting exposures that we have described in this paper. These proposed actions are suited to address concerns with narrow risk evaluation scopes (as we identified for the first 10 risk evaluations in this commentary) that resulted in the exclusion of relevant sources and pathways of exposure and inadequate consideration of potentially exposed or susceptible subpopulations.
Additionally, in February 2021 the National Academies published a peer review report of the TSCA SR method (NASEM 2021), which was promptly followed by a public announcement by the U.S. EPA that it would no longer apply the method (U.S. EPA 2021h). In December 2021, the U.S. EPA released a revised TSCA SR approach for public comment.
Beyond these improvements, we identify additional opportunities to enhance the rigor of the risk evaluations and improve the scientific basis for management of identified chemical risk. These include:
• Greater use of TSCA's significantly strengthened information authorities, afforded by the Lautenberg Act, early in the risk evaluation process
• Application of more robust approaches to account for sources of uncertainty and variability, namely employing probabilistic approaches that consider distributions of uncertainty and variability that could also address limitations associated with uncertainty factors
• Adoption of a unified approach to dose-response modeling and risk characterization for chemicals' cancer and noncancer effects that abandons the outdated MOE paradigm in favor of estimations of population risk
• Application of cumulative risk approaches that, at a minimum, address coexposures to other relevant chemical stressors (e.g., consideration of coexposures to multiple phthalates when conducting the phthalate risk evaluations currently in development), while implementing over time increasingly sophisticated methods to account for coexposures to nonchemical stressors
• Use of established systematic review methods [i.e., Office of Health Assessment and Translation (OHAT), IRIS, Navigation Guide (NTP 2019; U.S. EPA 2020k; Woodruff and Sutton 2014)] for evaluating and integrating evidence where they exist (i.e., for animal and human studies), and focusing method development efforts where accepted approaches are nascent (e.g., for consideration of exposure and mechanistic studies)
• Characterization and determination of risk based on the most sensitive end point.

We have contrasted the deviations from best practice persistent across the first 10 TSCA risk evaluations (Table 2) with recommendations from the National Academies that would improve the rigor of subsequent risk evaluations under TSCA. Coupled with the U.S. EPA's affirmation of its commitment to scientific integrity and its course-correction on the first 10 risk evaluations, we are confident that the agency can use this opportunity to produce chemical risk evaluations aligned with the spirit as well as the letter of the Lautenberg Act amendments aimed at better protecting our population, including those most vulnerable to chemical exposures.