Chapter 13. Description of Ideal Evaluation Methods: Assessing the Strength of Evidence Across Studies of Patient Safety Practices

Assessing the Evidence for Context-Sensitive Effectiveness and Safety

A key step when conducting a systematic review is assessing the strength of the evidence across the studies of a particular topic. An extended discussion of this is included in Appendix H.

One of the most widely used methods is that developed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group (www.gradeworkinggroup.org).1 GRADE has tools for grading the quality of evidence and the strength of practice guideline recommendations. These tools or related ones are already in widespread use by the American College of Physicians, the British Medical Journal's Clinical Evidence, the Society of Critical Care Medicine, the Scottish Intercollegiate Guidelines Network, and more than 35 other organizations. An adaptation of GRADE has been published for diagnostic tests.2 

AHRQ's Evidence-based Practice Center (EPC) program has developed its own method for assessing the strength of evidence, which started with GRADE but was adapted for the particular needs of the EPC program. The two methods share much in common, but they differ in the name they use for this construct ("quality of evidence" in GRADE versus "strength of evidence" in the EPC approach) and in the labels and descriptors for the levels of evidence (Table 6). Also, GRADE suggests explicit weights for determining the level of evidence, while the EPC approach says that other methods, in addition to the GRADE weights, are acceptable as long as the method is transparent.

The rationale for adapting GRADE or the AHRQ EPC system for patient safety practices (PSPs) is that PSP interventions, as detailed in this report, differ sufficiently from the kinds of interventions for which the existing GRADE and EPC systems are most commonly used (drugs, surgery, etc.) that a modified tool may be more relevant to stakeholders than the existing GRADE or EPC criteria applied as they stand.

In an adaptation for PSPs, we propose using descriptive categories similar to these. Using the GRADE and AHRQ EPC tools as a starting point, a tool to assess the strength of evidence across studies of PSPs might look like Table 7. This tool uses the EPC labels, the GRADE system of weights (+1, -1, etc.), and domains from both the GRADE and EPC schemes, and it adds key domains that we identified during this project as relevant to evaluations of PSPs.

This approach takes into account many of the points made by this project. For example, when RCT evaluations of a PSP do not report theory, context, implementation, etc., the strength of evidence decreases to moderate or even low. Likewise, a body of evidence about a PSP that comes entirely from studies that are not RCTs can be considered high-quality evidence if the studies use observational designs of stronger internal validity (such as statistical process control or controlled before-and-after); if they inform theory and measure and report contexts; or if they show very strong effects or obtain consistent results across many studies. Our suggestion here is preliminary and would benefit from refinement by a varied group of PSP stakeholders. For example, one concept not yet incorporated into this scheme that deserves discussion is proportionality, meaning that interventions that are low cost and low risk (e.g., hand-washing) may be accepted with a lower strength of evidence than interventions that have higher costs or risks (e.g., CPOE/DSS).
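
The weighting logic described above can be illustrated with a short sketch. It is not a reproduction of Table 7: the starting levels (RCTs start high and observational studies start low, following the usual GRADE convention), the domain names, the size of each weight, and the mapping of scores to EPC labels are all assumptions chosen for illustration, and an actual tool would need to be agreed on by PSP stakeholders.

```python
# Illustrative sketch only: domain names, weights, and the score-to-label
# mapping are hypothetical and are NOT taken from Table 7.

# EPC-style labels, ordered from weakest to strongest.
LEVELS = ["insufficient", "low", "moderate", "high"]

# Assumed starting points (GRADE convention: RCTs high, observational low).
START = {"rct": 3, "observational": 1}


def grade_strength(design, adjustments):
    """Return an EPC-style label after applying +1/-1 domain weights.

    design:      "rct" or "observational"
    adjustments: dict mapping a (hypothetical) domain name to an integer weight
    """
    score = START[design] + sum(adjustments.values())
    # Clamp the score so it always maps to one of the four labels.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


# RCTs that do not report theory or context are downgraded.
rct_body = grade_strength(
    "rct",
    {"theory_not_reported": -1, "context_not_reported": -1},
)

# Observational studies with a stronger design and a very strong effect
# are upgraded.
obs_body = grade_strength(
    "observational",
    {"stronger_observational_design": +1, "very_strong_effect": +1},
)

print(rct_body)  # low
print(obs_body)  # high
```

Keeping the weights in a plain dictionary makes each upgrade or downgrade visible, which is in keeping with the EPC requirement that the grading method be transparent.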

References for Chapter 13

  1. Owens DK, Lohr KN, Atkins D, et al. Grading the strength of a body of evidence when comparing medical interventions. Agency for Healthcare Research and Quality and the Effective Health Care Program. J Clin Epidemiol 2009 Jul 10. [Epub ahead of print]. Also available as: Owens DK, Lohr KN, Atkins D, et al. Grading the strength of a body of evidence when comparing medical interventions. In: Methods guide for comparative effectiveness reviews [posted July 2009]. Rockville, MD: Agency for Healthcare Research and Quality; 2009. Available at: http://effectivehealthcare.ahrq.gov/healthInfo.cfm?infotype=rr&ProcessID=60.
  2. Schünemann HJ, Oxman AD, Brozek J, et al., GRADE Working Group. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. Br Med J 2008; 336(7653):1106-10.
Page last reviewed December 2010
Internet Citation: Chapter 13. Description of Ideal Evaluation Methods: Assessing the Strength of Evidence Across Studies of Patient Safety Practices: Assessing the Evidence for Context-Sensitive Effectiveness and Safety. December 2010. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/research/findings/final-reports/contextsensitive/context13.html