Agency for Healthcare Research and Quality www.ahrq.gov

Appendix Table. U.S. Preventive Services Task Force Hierarchy of Research Design and Quality Rating Criteria*

Hierarchy of research design

I: Properly conducted randomized, controlled trial (RCT).
II-1: Well-designed controlled trial without randomization.
II-2: Well-designed cohort or case–control analytic study.
II-3: Multiple time series with or without the intervention; dramatic results from uncontrolled experiments.
III: Opinions of respected authorities, based on clinical experience; descriptive studies or case reports; reports of expert committees.
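
Because the hierarchy is ordinal, it can be handled as an ordered enumeration when comparing the designs behind a body of evidence. The Python sketch below is illustrative only: the names `DesignLevel` and `highest_level` and the integer values are assumptions introduced here, not part of the Task Force criteria.

```python
from enum import IntEnum


class DesignLevel(IntEnum):
    """Hypothetical encoding: a smaller value sits higher in the hierarchy."""
    I = 1      # properly conducted randomized, controlled trial
    II_1 = 2   # well-designed controlled trial without randomization
    II_2 = 3   # well-designed cohort or case-control analytic study
    II_3 = 4   # multiple time series; dramatic uncontrolled results
    III = 5    # expert opinion, descriptive studies, case reports


def highest_level(levels):
    """Return the design level that ranks highest (smallest value)."""
    return min(levels)


body_of_evidence = [DesignLevel.II_2, DesignLevel.III, DesignLevel.I]
print(highest_level(body_of_evidence).name)  # prints "I"
```

The design level describes only the type of study; the quality ratings defined in the sections that follow are assessed separately.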

Design-specific criteria and quality category definitions

Systematic reviews
  Criteria:
    Comprehensiveness of sources considered/search strategy used
    Standard appraisal of included studies
    Validity of conclusions
    Recency and relevance are especially important for systematic reviews
  Definition of ratings from above criteria:
    Good: Recent, relevant review with comprehensive sources and search strategies; explicit and relevant selection criteria; standard appraisal of included studies; and valid conclusions.
    Fair: Recent, relevant review that is not clearly biased but lacks comprehensive sources and search strategies.
    Poor: Outdated, irrelevant, or biased review without systematic search for studies, explicit selection criteria, or standard appraisal of studies.
Case–control studies
  Criteria:
    Accurate ascertainment of cases
    Nonbiased selection of cases/controls with exclusion criteria applied equally to both
    Response rate
    Diagnostic testing procedures applied equally to each group
    Measurement of exposure accurate and applied equally to each group
    Appropriate attention to potential confounding variables
  Definition of ratings based on criteria above:
    Good: Appropriate ascertainment of cases and nonbiased selection of case and control participants, exclusion criteria applied equally to cases and controls, response rate equal to or greater than 80%, diagnostic procedures and measurements accurate and applied equally to cases and controls, and appropriate attention to confounding variables.
    Fair: Recent, relevant, without major apparent selection or diagnostic work-up bias but with response rates less than 80% or attention to some but not all important confounding variables.
    Poor: Major selection or diagnostic work-up biases, response rates less than 50%, or inattention to confounding variables.
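
The response-rate thresholds and bias criteria for case–control studies can be read as a simple decision rule. The Python sketch below is one hypothetical encoding, not an official scoring tool: the function name `rate_case_control`, the single `major_bias` flag (which collapses the ascertainment, selection, and measurement criteria), and the `"all"`/`"some"`/`"none"` labels are assumptions made for illustration. It also leaves out the "recent, relevant" wording of the Fair rating, and real appraisal depends on reviewer judgment.

```python
def rate_case_control(response_rate: float, major_bias: bool, confounding: str) -> str:
    """Map the case-control criteria to 'good', 'fair', or 'poor'.

    response_rate: fraction of eligible participants who responded (0.0 to 1.0).
    major_bias:    True if major selection or diagnostic work-up bias is present
                   (assumed flag summarizing the reviewer's judgment).
    confounding:   'all', 'some', or 'none' important confounders addressed
                   (assumed labels).
    """
    if not 0.0 <= response_rate <= 1.0:
        raise ValueError("response_rate must be between 0.0 and 1.0")
    if confounding not in {"all", "some", "none"}:
        raise ValueError("confounding must be 'all', 'some', or 'none'")

    # Poor: major bias, response rate under 50%, or inattention to confounders.
    if major_bias or response_rate < 0.50 or confounding == "none":
        return "poor"
    # Good: response rate of at least 80% and appropriate attention to confounders.
    if response_rate >= 0.80 and confounding == "all":
        return "good"
    # Fair: no major bias, but a lower response rate or partial attention to confounders.
    return "fair"
```

For example, under these assumptions a study with a 70% response rate, no major bias, and full attention to confounders would be rated fair.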
Randomized, controlled trials and cohort studies
  Criteria:
    Initial assembly of comparable groups
    For RCTs: adequate randomization, including allocation concealment and whether potential confounders were distributed equally among groups
    For cohort studies: consideration of potential confounders with either restriction or measurement for adjustment in the analysis; consideration of inception cohorts
    Maintenance of comparable groups (includes attrition, crossovers, adherence, and contamination)
    Important differential loss to follow-up or overall high loss to follow-up
    Measurements: equal, reliable, and valid (includes masking of outcome assessment)
    Clear definition of the interventions
    All important outcomes considered
  Definition of ratings based on above criteria:
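    Good: Meets all criteria: comparable groups are assembled initially and maintained throughout the study (follow-up at least 80%); reliable and valid measurement instruments are used and applied equally to the groups; interventions are spelled out clearly; all important outcomes are considered; and appropriate attention is given to confounders in the analysis. In addition, for RCTs, intention-to-treat analysis is used.
    Fair: Any or all of the following problems occur, without the fatal flaws noted in the "poor" category below: generally comparable groups are assembled initially, but some question remains whether some (although not major) differences occurred with follow-up; measurement instruments are acceptable (although not the best) and generally applied equally; some but not all important outcomes are considered; and some but not all potential confounders are accounted for. Intention-to-treat analysis is done for RCTs.
    Poor: Any of the following fatal flaws exists: groups assembled initially are not close to being comparable or maintained throughout the study; unreliable or invalid measurement instruments are used or not applied equally among groups (including not masking outcome assessment); and key confounders are given little or no attention. For RCTs, intention-to-treat analysis is lacking.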
Diagnostic accuracy studies
  Criteria:
    Screening test relevant, available for primary care, adequately described
    Study uses a credible reference standard, performed regardless of test results
    Reference standard interpreted independently of screening test
    Handles indeterminate results in a reasonable manner
    Spectrum of patients included in study
    Sample size
    Administration of reliable screening test
  Definition of ratings based on above criteria:
    Good: Evaluates relevant available screening test, uses a credible reference standard, interprets reference standard independently of screening test, assesses reliability of test, has few indeterminate results or handles them in a reasonable manner, and includes large number (more than 100) and broad spectrum of patients with and without disease.
    Fair: Evaluates relevant available screening test, uses reasonable although not best standard, interprets reference standard independently of screening test, and includes moderate sample size (50–100 subjects) and a "medium" spectrum of patients.
    Poor: Has fatal flaw, such as use of inappropriate reference standard, screening test improperly administered, biased ascertainment of reference standard, or very small sample size or very narrowly selected spectrum of patients.
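
The sample-size bands and reference-standard wording for diagnostic accuracy studies can likewise be sketched as a decision rule. The Python sketch below is hypothetical: the function name `rate_diagnostic_study` and the labels `"credible"`/`"reasonable"`/`"inappropriate"` and `"broad"`/`"medium"`/`"narrow"` are assumptions made for illustration. The criteria give no numeric cutoff for a "very small" sample, so the sketch leaves that judgment, along with the other fatal flaws, to a reviewer-supplied `fatal_flaw` flag.

```python
def rate_diagnostic_study(
    n_subjects: int,
    reference_standard: str,
    spectrum: str,
    fatal_flaw: bool = False,
) -> str:
    """Map the diagnostic accuracy criteria to 'good', 'fair', or 'poor'.

    reference_standard: 'credible', 'reasonable', or 'inappropriate' (assumed labels).
    spectrum:           'broad', 'medium', or 'narrow' (assumed labels).
    fatal_flaw:         reviewer judgment covering flaws the other arguments do not
                        capture, such as an improperly administered test, biased
                        ascertainment of the reference standard, or a very small sample.
    """
    if n_subjects < 0:
        raise ValueError("n_subjects must be non-negative")
    if reference_standard not in {"credible", "reasonable", "inappropriate"}:
        raise ValueError("unknown reference_standard label")
    if spectrum not in {"broad", "medium", "narrow"}:
        raise ValueError("unknown spectrum label")

    # Poor: any fatal flaw, including an inappropriate standard or narrow spectrum.
    if fatal_flaw or reference_standard == "inappropriate" or spectrum == "narrow":
        return "poor"
    # Good: credible standard, more than 100 subjects, broad spectrum.
    if reference_standard == "credible" and n_subjects > 100 and spectrum == "broad":
        return "good"
    # Fair: everything else without a fatal flaw (assumption of this sketch).
    return "fair"
```

Treating every study without a fatal flaw that falls short of Good as Fair is a simplification of this sketch; the criteria themselves describe Fair in terms of a reasonable standard, 50–100 subjects, and a "medium" spectrum.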

* See references 6 and 7.

