Environmental Health Perspectives

Correspondence March 2014 | Volume 122 | Issue 3

Environ Health Perspect; DOI:10.1289/ehp.1307727R

Instruments for Assessing Risk of Bias and Other Methodological Criteria: Krauth et al. Respond

David Krauth,1 Tracey J. Woodruff,2,3 Lisa Bero1,4

1Department of Clinical Pharmacy, and 2Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Francisco, San Francisco, California, USA; 3Program on Reproductive Health and the Environment, Oakland, California, USA; 4Institute for Health Policy Studies, University of California, San Francisco, San Francisco, California, USA
Citation: Krauth D, Woodruff TJ, Bero L. 2014. Instruments for assessing risk of bias and other methodological criteria: Krauth et al. respond. Environ Health Perspect 122:A67; doi: 10.1289/ehp.1307727R.


The authors declare they have no actual or potential competing financial interests.

Published: 1 March 2014

Related EHP Correspondence

Instruments for Assessing Risk of Bias and Other Methodological Criteria

Nancy B. Beck, Richard A. Becker, Alan Boobis, Dean Fergusson, John R. Fowle III, Julie Goodman, Sebastian Hoffmann, Manoj Lalu, Marcel Leist, and Martin L. Stephens

Beck et al. criticize our systematic review (Krauth et al. 2013) because we included instruments derived from preclinical animal research. Assessment instruments developed for preclinical animal models have criteria that are relevant to hazard and risk assessment because risk of bias in animal studies depends not on the data stream or the question being asked, but on the design of the study. Many of the instruments that have been developed (including those for evaluating animal toxicology studies) include criteria that have not been shown to bias research outcomes (see Supplemental Material, Table S1, of Krauth et al. 2013).

Furthermore, Table 1 of our paper (Krauth et al. 2013) lists the criteria found in most instruments we identified. In the “Discussion,” we described the empirical evidence supporting the use of some of these criteria and cited the relevant references with the empirical data. By empirical evidence, we mean that a criterion (e.g., randomization) has been shown to be associated with overestimation or underestimation of effect (this could be an efficacy or harm outcome).

Beck et al. note several publications in environmental chemical health hazard assessment [Ågerstrand et al. 2011; Food and Drug Administration (FDA) 2003; Hulzebos et al. 2010; Organisation for Economic Co-operation and Development (OECD) 1998; Schneider et al. 2009; U.S. Environmental Protection Agency (EPA) 1999a, 1999b, 2013]. All of these publications, except OECD (1998), were identified in our search; however, they did not meet the a priori inclusion criteria for our systematic review. As noted in our “Methods” (Krauth et al. 2013), we included the earliest publication of an instrument when it was used in subsequent reports. The article by Ågerstrand et al. (2011) was based on four earlier published papers (i.e., Durda and Preziosi 2000; Hobbs et al. 2005; Klimisch et al. 1997; Schneider et al. 2009). We cited three of these in our review, but excluded Schneider et al. (2009) because it appeared to be a description of software that could be used to operationalize the Klimisch criteria. After reviewing the criteria described by Schneider et al. (2009) in their supplemental file, we found no unique additional criteria that were not already included in our Table 1 and Supplemental Material, Table S1. The reports from the U.S. EPA (1999a, 1999b) and FDA (2003) were neither indexed in Medline nor found in screening of bibliographies. In addition, U.S. EPA (2013) was published after we ended our study. Because we did not find the OECD document (OECD 1998), we cannot conclude whether or not it should have been included in our study.

The comment by Beck et al. that the National Toxicology Program is relying on criteria that have not been “transparently empirically tested” is not correct. In our paper (Krauth et al. 2013), we recommended the use of empirically tested criteria, and we pointed out the criteria that have been shown to be associated with risk of bias.

We caution against gathering judgments on how to assess study quality and propose instead that evidence should guide such evaluations. An empirically based approach, as opposed to the consensus-based opinion of experts, would provide a less biased evaluation of the data.


Ågerstrand M, Breitholtz M, Ruden C. 2011. Comparison of four different methods for reliability evaluation of ecotoxicity data: a case study of non-standard test data used in environmental risk assessments of pharmaceutical substances. Environ Sci Eur 23:17; doi: 10.1186/2190-4715-23-17.

Durda JL, Preziosi DV. 2000. Data quality evaluation of toxicological studies used to derive ecotoxicological benchmarks. Hum Ecol Risk Assess 6(5):747–765.

FDA (Food and Drug Administration). 2003. General Guidelines for Designing and Conducting Toxicity Studies. In: Guidance for Industry and Other Stakeholders, Toxicological Principles for the Safety Assessment of Food Ingredients, Redbook 2000. Available:​on/GuidanceDocumentsRegulatoryInformatio​n/IngredientsAdditivesGRASPackaging/ucm0​78315.htm [accessed 15 October 2013].

Hobbs DA, Warne MSJ, Markich SJ. 2005. Evaluation of criteria used to assess the quality of aquatic toxicity data. Integr Environ Assess Manag 1(3):174–180.

Hulzebos E, Gunnarsdottir S, Rila JP, Dang Z, Rorije E. 2010. An Integrated Assessment Scheme for assessing the adequacy of (eco)toxicological data under REACH. Toxicol Lett 198(2):255–262.

Klimisch HJ, Andreae M, Tillmann U. 1997. A systematic approach for evaluating the quality of experimental toxicological and ecotoxicological data. Regul Toxicol Pharmacol 25(1):1–5.

Krauth D, Woodruff TJ, Bero L. 2013. Instruments for assessing risk of bias and other methodological criteria of published animal studies: a systematic review. Environ Health Perspect 121:985–992; doi: 10.1289/ehp.1206389.

OECD (Organisation for Economic Co-operation and Development). 1998. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring, No 1: OECD Principles on Good Laboratory Practice. ENV/MC/CHEM(98)17. Paris:OECD. Available:​/displaydocumentpdf/?doclanguage=en&cote​=env/mc/chem(98)17 [accessed 13 February 2014].

Schneider K, Schwarz M, Burkholder I, Kopp-Schneider A, Edler L, Kinsner-Ovaskainen A, et al. 2009. “ToxRTool”, a new tool to assess the reliability of toxicological data. Toxicol Lett 189(2):138–144.

U.S. EPA (U.S. Environmental Protection Agency). 1999a. Auditing General Toxicology Studies. Available:​policies/monitoring/fifra/sop/glp-da-09.​pdf [accessed 15 October 2013].

U.S. EPA (U.S. Environmental Protection Agency). 1999b. Determining the Adequacy of Existing Data. Available:​dfin.htm [accessed 15 October 2013].

U.S. EPA (U.S. Environmental Protection Agency). 2013. OCSPP Harmonized Test Guidelines. Series 870: Health Effects Test Guidelines. Available:​ations/Test_Guidelines/series870.htm [accessed 15 October 2013].
