Agency for Healthcare Research and Quality www.ahrq.gov

U.S. Preventive Services Task Force (USPSTF)


Appendix: Detailed Methods

Under guidance from the USPSTF, we created and received USPSTF approval for an analytic framework and key questions adapted from the 2002 USPSTF report.130 The scope of this targeted review differed from the 2002 USPSTF report in several ways:

  1. We did not update the direct evidence that standard FOBT screening is effective in improving health outcomes, except in addressing longer-term follow-up from the original trials included in the 2002 report; this evidence was considered established for the 2002 report and was foundational for the last recommendation.
  2. We did not update evidence on colorectal cancer screening methods not recommended after the last review (such as digital rectal examination) or omitted from this review at the workplan stage by the USPSTF because of poor test performance characteristics (such as double-contrast barium enema). A single study (n = 580) from the previous 2002 evidence report found that double-contrast barium enema used as a surveillance method after adenomatous polypectomy (with comparison to colonoscopy as the gold standard) showed a sensitivity of only 48% (CI, 24% to 67%) for polyps larger than 10 mm. A more recent study in a high-risk screening and diagnostic evaluation population comparing double-contrast barium enema to both optical and CT colonoscopy showed similarly low sensitivity estimates for large polyps.131 Given its confirmed low sensitivity for the targets of screening (lesions ≥10 mm), double-contrast barium enema as a primary colorectal cancer screening test was removed from the review.
  3. Systematic review of the adherence, acceptability, and feasibility of the screening tests was not part of this updated report. Similarly, the USPSTF judged that a thorough review of cost-effectiveness analyses was beyond the scope of our review, particularly because the USPSTF was conducting a simultaneous decision analysis.24 The decision analysis focused on projected benefits to a cohort that began colorectal cancer screening at age 40 years or later for different screening strategies, different beginning and ending ages, and different intervals for rescreening after a normal test result, with varying screening test adherence.24 These 2 reports were used together by the USPSTF to make its updated recommendation on colorectal cancer screening, and the decision analysis affected the scope of our updated evidence review.

Data Sources and Searches

We first searched PubMed, Database of Abstracts of Reviews of Effects, the Cochrane Database of Systematic Reviews, Institute of Medicine, National Institute for Health and Clinical Excellence, and Health Technology Assessment databases for recent systematic reviews (1999-2006) for all key questions. We also searched the National Guideline Clearinghouse™, Institute of Medicine, and National Institute for Health and Clinical Excellence Web sites for relevant reports.

For each key question, we used already synthesized literature to identify all appropriate primary studies to the extent possible, supplementing with new literature searches corresponding with the end-of-search windows of relevant good-quality systematic reviews and meta-analyses. We developed literature search strategies and terms for each key question,25 with search dates guided by existing systematic reviews (including the 2002 USPSTF report) and the development of screening technology.

We conducted 5 separate literature searches, 1 for each key question (except that we combined searching for harms for key questions 3a and 3b, but conducted 2 separate combined harms searches) in both MEDLINE and the Cochrane Central Register of Controlled Trials. Although each search was designed for a particular key question, all abstracts were reviewed for inclusion in all key questions. All searches covered reports published through January 2008. For all key questions, we supplemented literature searches by reviewing bibliographies of relevant articles (including systematic reviews) and considering studies recommended by experts during and after peer review.

For key question 2a (accuracy of flexible sigmoidoscopy and colonoscopy), we found no systematic reviews conforming to our inclusion and exclusion criteria more recent than the 2002 USPSTF review and therefore searched MEDLINE and the Cochrane Library from January 2000 through January 2008 for primary literature.

Key question 2b (test performance characteristics of newer screening tests) covered 3 tests: CT colonography, fecal immunochemical tests, and fecal DNA tests. We found 11 systematic reviews relevant to newer colorectal cancer screening tests: 6 of CT colonography screening,27,28,132-135 3 of fecal DNA screening,29,136,137 and 2 of fecal immunochemical screening tests.31,37 On the basis of their use of comprehensive search strategies, recent search dates (last search date at least within the last 3 years or no older than 2005), and use of quality assessment of articles as quality indicators, we selected 3 reviews (2 of CT colonography27,28 and 1 of fecal DNA29) to substitute for a portion of the comprehensive search strategy necessary to locate primary studies for key question 2b.26 We searched MEDLINE and the Cochrane Library for additional primary studies for CT colonography and for fecal DNA (January 2006 through January 2008) beginning after the latest systematic review search date. We considered all studies examining CT colonography screening in average-risk patients from the selected reviews,27,28 supplemented by studies in average-risk patients located through our literature search; as a final check, we examined the included studies in other relevant systematic reviews of CT colonography. No additional eligible studies were identified. Although we found several reviews of fecal immunochemical tests (key questions 2 and 3b), none met our standards for methods and reporting. We therefore searched MEDLINE and the Cochrane Library from 1990, when these tests began to be described, through January 2008. We checked our search results against 2 systematic reviews located during our review process to supplement with any potentially relevant studies not already identified.31,37

For key questions 3a and 3b (harms of screening tests), we found no systematic reviews more recent than the 2002 USPSTF review and therefore searched MEDLINE and the Cochrane Library from January 2000 through January 2008 and coded abstracts from both approaches.

Study Selection

In total, we evaluated 3948 abstracts and 490 full-text articles. Abstracts and articles were reviewed against specified inclusion criteria (see below) and required agreement of 2 reviewers. Eligible studies reported on the performance of colorectal cancer screening tests (sensitivity and specificity) or health outcomes. We excluded studies that did not address average-risk populations for colorectal cancer screening, unless an average-risk subgroup was reported. We excluded case-control studies of screening accuracy because these may overestimate sensitivity as a design-related source of bias,30 a problem recently demonstrated clearly for FOBTs.31 To avoid biases related to reference standards, we excluded studies of test accuracy that incompletely applied a valid reference standard or used an inadequate reference standard.32 For CT colonography, we considered only technologies that were compared against colonoscopy in average-risk populations, used a multidetector (not single-detector) scanner,27 and reported per-patient sensitivity and specificity.

Quality Assessment and Data Abstraction

Two investigators critically appraised and quality-rated all eligible studies by using design-specific USPSTF criteria (see below)33 supplemented by National Institute for Clinical Excellence138 and Oxman and Guyatt139 criteria for systematic reviews and QUADAS criteria for diagnostic accuracy studies.140 Only good-quality systematic reviews were used as sources for primary articles, and all poor-quality studies were excluded from the review. One investigator abstracted key elements of all included studies into standardized evidence tables. A second reviewer verified these data. Disagreements about data abstraction or quality appraisal were resolved by consensus. Evidence tables and excluded studies tables for each key question are available in the full report.25

Data Synthesis and Analysis

We primarily report qualitative synthesis of the results for most key questions because of study heterogeneity. Results of key questions 2b and 3b were judged to be too heterogeneous in terms of populations, settings, and study designs for meta-analysis and were therefore qualitatively synthesized. The performance of screening tests is preferentially described per person (sensitivity and specificity), supplemented by per-polyp analysis (miss rates). Ninety-five percent CIs are reported when available.
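
As a minimal sketch of the per-person and per-polyp measures named above, the following Python snippet computes sensitivity, specificity, and a miss rate from raw counts. The function names and all counts are hypothetical and chosen only for illustration; they are not data from any included study.

```python
def per_person_accuracy(tp, fn, tn, fp):
    """Per-person test accuracy from a 2x2 table against the reference standard.

    tp/fn: persons with the target lesion who test positive/negative.
    tn/fp: persons without the target lesion who test negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def per_polyp_miss_rate(missed_polyps, total_polyps):
    """Per-polyp miss rate: share of reference-standard polyps the test missed."""
    return missed_polyps / total_polyps


# Hypothetical counts, for illustration only.
sens, spec = per_person_accuracy(tp=45, fn=5, tn=900, fp=50)
miss = per_polyp_miss_rate(missed_polyps=6, total_polyps=40)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} miss rate={miss:.3f}")
```

The per-person view counts each screened person once, whereas the per-polyp view counts each lesion, so the two measures can diverge when some persons have several polyps.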

Because of the stringency of our inclusion criteria for key question 3a (complications of endoscopy), which focused on estimates of harms in the community practice setting, the studies we included were thought to be clinically homogeneous enough to allow pooling of complication rates. Meta-analysis was performed to estimate combined complication rates for major or serious bleeding, perforation, and total serious adverse events that require hospital admission or result in death, including perforation, major bleeding, severe abdominal symptoms, and cardiovascular events. Several studies reported that their patients experienced no adverse events, and therefore we used a logistic random-effects model35,36 to include studies without any adverse events and estimate the combined complication rates. The model is described briefly as follows.

Suppose that there are $n$ studies, indexed by $i = 1, \ldots, n$, and that the number of complications and the total number of procedures in study $i$ are $x_i$ and $n_i$, respectively. Denoting the complication rate in study $i$ by $p_i$, we have

$$x_i \sim \mathrm{Binomial}(n_i, p_i)$$
$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \mu_i$$
$$\mu_i \sim N(0, \tau^2)$$

where $\mu_i$ is the random effect for study $i$ and $\tau^2$ estimates the heterogeneity among studies on the logit scale. The combined complication rate, $p_{com}$, is estimated by

$$p_{com} = \frac{\exp(\beta_0)}{1 + \exp(\beta_0)}$$

This model allows inclusion of studies with no adverse events, and the random effects incorporate variation among studies into the combined estimate. A P value less than 0.05 for $\tau^2$ is considered to represent statistically significant heterogeneity.
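
The model above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS NLMIXED code used for the review: the random effect is integrated out with a simple trapezoid rule, the maximum is found by a coarse grid search, and the function names, grid ranges, and study counts are all hypothetical. Dedicated mixed-model software uses more accurate integration and optimization.

```python
import math


def inv_logit(t):
    """Map a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def log_binom_pmf(x, n, p):
    """Log binomial probability of x events in n procedures (finite when x = 0)."""
    log_coef = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_coef + x * math.log(p) + (n - x) * math.log1p(-p)


def marginal_loglik(beta0, tau, studies, nodes=41):
    """Log-likelihood with mu_i = tau * z integrated out numerically.

    Assumption: trapezoid rule over z in [-5, 5] with z ~ N(0, 1).
    """
    step = 10.0 / (nodes - 1)
    total = 0.0
    for x, n in studies:
        terms = []
        for k in range(nodes):
            z = -5.0 + k * step
            weight = step * (0.5 if k in (0, nodes - 1) else 1.0)
            log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
            p = inv_logit(beta0 + tau * z)
            terms.append(math.log(weight) + log_phi + log_binom_pmf(x, n, p))
        peak = max(terms)  # log-sum-exp guards against underflow
        total += peak + math.log(sum(math.exp(v - peak) for v in terms))
    return total


def fit_pooled_rate(studies):
    """Coarse grid search for the (beta0, tau) pair with the highest likelihood."""
    beta0_grid = [-9.0 + 0.1 * k for k in range(81)]      # logit scale, -9 to -1
    tau_grid = [0.01] + [0.15 * k for k in range(1, 11)]  # 0.01, then 0.15 to 1.5
    _, beta0, tau = max(
        (marginal_loglik(b, t, studies), b, t)
        for b in beta0_grid
        for t in tau_grid
    )
    return inv_logit(beta0), beta0, tau


if __name__ == "__main__":
    # Hypothetical (complications, procedures) pairs; two studies report no events.
    example = [(0, 500), (3, 1200), (1, 800), (5, 2500), (0, 300)]
    p_com, beta0, tau = fit_pooled_rate(example)
    print(f"p_com={p_com:.5f} beta0={beta0:.1f} tau={tau:.2f}")
```

Because the binomial likelihood is well defined when a study reports zero events, such studies contribute to the estimate without any continuity correction, which is the property the text relies on.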

Exploratory meta-regressions were conducted by using logistic random-effects models to examine the association of complication rate with important study-level characteristics: study design, study setting by country, and population characteristics, including age range, and indication for endoscopy. To do this, we need to add only one more term to the second equation (the logit equation) of the logistic random-effects model:

$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 z_i + \mu_i$$

where $z_i$ represents any study-level characteristic of study $i$, and the association of this characteristic with complication rate is investigated through $\beta_1$.
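
To illustrate how the added coefficient is read, the following Python sketch evaluates the meta-regression equation for a binary study-level characteristic. The coefficient values and the interpretation of the characteristic are hypothetical and serve only to show that the exponentiated coefficient is the odds ratio between the two groups of studies.

```python
import math


def predicted_rate(beta0, beta1, z, mu=0.0):
    """Complication rate implied by logit(p) = beta0 + beta1 * z + mu."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * z + mu)))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical coefficients, for illustration only (not estimates from the review).
beta0, beta1 = -6.0, 0.7

# z = 0 and z = 1 could encode, for example, two study designs.
rate_ref = predicted_rate(beta0, beta1, z=0)
rate_cmp = predicted_rate(beta0, beta1, z=1)
odds_ratio = odds(rate_cmp) / odds(rate_ref)

print(f"rate (z=0)={rate_ref:.5f} rate (z=1)={rate_cmp:.5f} OR={odds_ratio:.3f}")
```

With the random effect held at zero, the odds ratio between the two groups equals the exponential of the added coefficient, so a coefficient near zero corresponds to no detectable association.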

The analysis was performed by using the NLMIXED procedure in SAS software, version 9.1 (SAS Institute Inc., Cary, North Carolina), with the code listed in Appendix Table 3.

Review Oversight and Peer Review

The Agency for Healthcare Research and Quality (AHRQ) funded this work, provided project oversight, and assisted with internal and external review of the draft evidence synthesis but had no role in the design, conduct, or reporting of the review. The authors worked with 4 USPSTF liaisons at key points throughout the review process to develop and refine the analytic framework questions, set the review scope, and resolve methodologic issues during the conduct of the review. A draft of the evidence synthesis was reviewed by 8 experts, including experts in the fields of gastroenterology and radiology, and several experts who have written systematic evidence reviews on one or more aspects of colorectal cancer screening.
