Future Directions for the National Healthcare Quality and Disparities Reports
Chapter 4: Adopting a More Quantitative and Transparent Measure Selection Process (pt. 2)
Changing the Status Quo
The NAC provides AHRQ with advice on "the most important questions that AHRQ's research should address in order to promote improvements in the quality, outcomes, and cost-effectiveness of clinical practice" (AHRQ, 2010). The committee considered whether the existing NAC could perform the necessary assessment of performance measures recommended by the Future Directions committee and concluded that it could not.
The NAC's advice is solicited for all of AHRQ's activities and is not solely directed to the content and presentation of the NHQR and NHDR (AHRQ, 2009b). Private sector members are appointed for three-year terms, and members of seven federal agencies also serve in an ex-officio capacity. The NAC currently meets three times a year for one day each time. The NAC, as currently constituted, does not have sufficient technical expertise to systematically apply constructs of clinically preventable burden (CPB), cost effectiveness (CE), and other valuation techniques to measurement prioritization and selection. Adequate expertise is necessary to evaluate any staff or contract work that supports the evaluation exercises; other prioritization and evaluation processes for guidelines and measures have found the need for such technical expertise on the decision-making body itself when employing rigorous grading of recommendations (Baumann et al., 2007; Guyatt et al., 2006). Additionally, the workload associated with quality measure selection and prioritization would be substantial and could interfere with current NAC duties. A new body to advise AHRQ with no affiliation with the NAC could be formed with the requisite expertise, but this approach raised concerns about lines of communication with AHRQ and disengagement from AHRQ's overall portfolio of work. Instead, building on precedent, the committee decided to recommend a technical advisory subcommittee to the NAC.
Proposed NAC Technical Advisory Subcommittee for Measure Selection
The recommended NAC Technical Advisory Subcommittee for Measure Selection would differ from the current informal NAC subcommittee that provides general advice on the NHQR and NHDR. The current subcommittee is made up of NAC members and has limited face time with AHRQ staff (e.g., approximately one hour prior to the overall NAC meeting). The Technical Advisory Subcommittee for Measure Selection should have a more formal structure and will need more days per year to do its work, as well as the ability to commission and fund studies through AHRQ to support its deliberations.
A precedent for this more formal relationship is the NAC Subcommittee on Quality Measures for Children's Healthcare in Medicaid and Children's Health Insurance Programs, which was formed for a specific task: the identification of an initial core measure set for children under the Children's Health Insurance Program Reauthorization Act.7 This NAC subcommittee includes two members from the NAC but meets separately from the NAC for detailed working sessions. The relationship of the NAC Subcommittee on Quality Measures for Children's Healthcare in Medicaid and Children's Health Insurance Programs is shown in Figure 4-1, and the Future Directions committee envisions the same relationship for the NAC Technical Advisory Subcommittee for Measure Selection for the NHQR and NHDR. Other NAC subcommittees have previously been formed for specific substantive tasks (e.g., safety).
Individuals chosen to serve on the proposed subcommittee should include people with responsibilities for performance measurement and accountability; experts in measure design and data collection; health services researchers; and subject matter experts in applying quantitative techniques to evaluate gaps between current and desired performance levels, and on issues of disparities, economics, and bioethics. The subcommittee should ensure that membership accounts for both consumer and provider perspectives. A subject matter expert in disparities need not be limited to health services researchers but could also include representation, for example, from communities of color to ensure sensitivity to the concerns of smaller population groups when determining high impact areas. It would also be useful to have an individual with expertise in quality improvement in fields other than health care to share the challenges faced and overcome. The committee believes that the NAC Subcommittee for Measure Selection should have approximately 10 to 15 persons in order to encompass all of these areas of expertise. The emphasis in the skill set of the subcommittee is technical expertise; the NAC will balance this out with its broader stakeholder representation.
The NAC Technical Advisory Subcommittee for Measure Selection will need staff and resources to help carry out its work in quantifying which areas of measurement constitute the greatest quality improvement impact considering value (health outcome for resource investment or net health benefit)8 and population and geographic variability. The committee believes that AHRQ's current NHQR and NHDR staff would play an important role in identifying content areas where there are actionable quality problems. However, the committee concludes that AHRQ would need to supplement its current report staff with other in-house technical experts, and/or seek assistance from entities such as the AHRQ-sponsored Evidence-Based Practice Centers or other outside contractors. Such additional experts could provide much of the detailed quantitative analyses to support the measure prioritization and selection process for review by the subcommittee. The Evidence-Based Practice Centers might be an attractive model because they could develop a core of expertise and then gear up and down using contracting mechanisms according to the review workload (AHRQ, 2008b). Even with this additional expertise available, the NAC Technical Advisory Subcommittee for Measure Selection should include individuals with sufficient expertise to evaluate technical materials in areas such as cost-effectiveness analysis, statistics, assessment of clinically preventable burden, and valuation from a bioethics as well as an economic perspective.
The NAC Technical Advisory Subcommittee for Measure Selection might want to use a variety of approaches in soliciting measures for the reports and in refining its selection criteria. Possible approaches include (1) issuing a public call for measures for inclusion/exclusion and areas needing measurement development or refinement, as well as suggestions for data support; (2) commissioning studies (e.g., comparison of different valuation techniques on the prioritization scheme, development of systematic reviews of presumed high-impact areas, valuation of disparities); (3) forming strategic partnerships with entities doing measurement development and endorsement applicable to the reports (e.g., NQF, the National Committee for Quality Assurance, the National Priorities Partnership, the American Medical Association's Physician Consortium for Performance Improvement, other HHS agencies such as CMS) to reduce duplication of effort; and (4) working with the Centers for Disease Control and Prevention (CDC) on those areas of health care improvement closely linked to priority public health outcomes and goals as well as the similar application of valuation techniques recommended for community-based prioritization in conjunction with Healthy People 2020 (go to Box 4-3).
Enhancing Transparency in the Selection Process
The committee believes that transparency in AHRQ's process for selecting performance measures for the NHQR and NHDR is extremely important. In 2008, an IOM report stressed that transparency is a key to building public trust in decisions by having "methods defined, consistently applied, [and] available for public review so that observers can readily link judgments, decisions or actions to the data on which they are based" (IOM, 2008, p. 12). Transparent processes for decision-making bodies have been described as:
- documenting decision-making by providing a public rationale;
- reviewing the effects of the prioritization (Downs and Larson, 2007; Sabik and Lie, 2008); and
- establishing and applying clear principles and criteria on which prioritization is based.
Each of these aspects of transparency is examined in the discussion that follows. The NAC and its subcommittees—which would include the proposed NAC Technical Advisory Subcommittee for Measure Selection—conduct their business in public under the Federal Advisory Committee Act.9 The fact that these bodies operate in public under this law is an attractive facet of their operation.
Documenting Decision-Making by Providing a Public Rationale
Documentation of the rationale behind the NAC subcommittee prioritization decisions, the evidence supporting the decisions, and an understanding of the role that data or resource constraints play in the decisions should be transparent. Furthermore, that information should be readily available for public access and in a timely fashion (Aron and Pogach, 2009). Such documentation should include analyses and syntheses of data and evidence produced by staff or obtained through other means. The Future Directions committee is particularly interested in this level of documentation because of its potential value in stimulating creation of an agenda for measure and data source development (including testing additional questions on existing data collection surveys or inclusion of elements in electronic health records) when desirable measures or data are not yet available (Battista and Hodge, 1995; Gibson et al., 2004; Whitlock et al., 2010). Documentation would also support why certain measures might either no longer be included in the print version of reports or removed from tracking altogether.
Reviewing the Effects of Prioritization
Prioritization is not a static activity but an "iterative process that allows priority setting to evolve" (Sabik and Lie, 2008, p. 9). With respect to the 46 core measures used in the print versions of the NHQR and NHDR, the process for selecting performance measures recommended by this committee could result in extensive changes in the measure set; the process, however, will be an iterative one. The existing measures displayed in the reports or the State Snapshots would not necessarily all be replaced. It would be logical for the NAC Technical Advisory Subcommittee for Measure Selection to begin its work by determining the relative prioritization within the existing core measure group, as currently there is no priority hierarchy within selected measures as all are given equal weight in assessing progress.
It is not known to what extent the existing measures within the NHQR, NHDR, or Web-based State Snapshots are specifically adopted as action items in whole or part by various audiences. This makes it difficult to evaluate the impact of changing the current measures on aspects other than report production within AHRQ. The committee posits that making public the conversation about which measures will or will not have national or state data provided for them will enable AHRQ to begin to document in a more systematic fashion who uses the reports, how the data are used, and the potential impact of keeping or deleting measures.
Principles and Criteria for Selection
In order to establish a transparent process for creating a hierarchy among performance measures being considered by AHRQ, the articulation of principles and criteria is necessary.
Before outlining the steps in the measure selection process, the Future Directions committee defined two principles that would guide the design. The first guiding principle is the use of a quantitative approach, whenever feasible, for assessing the value of closing the gap between current health care practice and goal levels (i.e., aspirational goal of 100 percent or other goal such as one derived from the relevant benchmark).10 To date, AHRQ's measure selection process has not focused on evaluating what it would take to close the performance gap, or the potential benefits that could accrue to the nation in doing so for the reported measures. The committee's second principle in prioritizing measures is taking specific note of significant, unwarranted variation in health care performance with regard to disparities across population groups, geographic areas, and other contextual factors such as types of providers or payment sources. Application of these principles can result in reducing the burden of reporting to those areas that are deemed most important (Romano, 2009).
In applying these principles to the measure selection process, the following points provide further guidance:
- Simply stated, measures should be prioritized and selected based on their potential for maximizing health care value and equity at the population level.
- Priority should be given to selecting measures that maximize health benefit, improve equity, and minimize costs within a context that is respectful of and responsive to patient needs and preferences.
- Measures that are principally relevant to a particular group even if they have less significance to the U.S. population as a whole (e.g., quality measures for treatment of sickle cell anemia) should be considered in measure selection.
- The process, to the extent feasible, should be operationalized using formal quantitative methods and transparent decision-making.
Thus, the emphasis is on investing in measures of conditions with the most impact while considering the ethical principle of fairness. Siu and colleagues (1992) used such quantitative approaches to recommend measures for health plans in recognition that "limited resources [are] available for quality assessment and the policy consequences of better information on provider quality, priorities for assessment efforts should focus on those areas where better quality translated into improved health" (Siu et al., 1992).
Steps in the Process and Criteria
Figure 4-2 provides a schematic outline of the steps in the Future Directions committee's proposed process for reviewing performance measurement areas—both for currently reported measures and new measures—for inclusion in the NHQR and NHDR. Inherent in relative ranking would be the identification of measures that could be dropped by AHRQ from tracking if they rank at a low level. Additionally, the process builds in specific steps for identification of measure and data source needs that should be formally captured for inclusion in a strategy for research and data acquisition for future national reporting.
Previous IOM guidance regarding the selection of performance measures for the NHQR and NHDR gave greater prominence to the criterion of importance, noting that measures not meeting this criterion "would not qualify for the report regardless of the degree of feasibility or scientific soundness" (IOM, 2001, p. 83). NQF similarly stresses that every candidate measure for the NQF endorsement process "must be judged to be important to measure and report in order to be evaluated against the remaining criteria" (NQF, 2009a). To date, NQF has endorsed more than 500 measures. Although each of these measures may be useful for a specific quality improvement circumstance, there is a need to prioritize among the many possible measures for national reporting purposes. This committee recommends refining the pre-existing AHRQ-, IOM-, and NQF-recommended measure selection and endorsement criteria of importance to include consideration of recommended national priority areas, and an evaluation of the relative value of closing quality gaps, including consideration of equity (go to Criteria A, B, C, D, E, and F).
Environmental Scan for Importance
Identifying which areas should be considered important to monitor for performance improvement is a first step and could be undertaken by AHRQ staff prior to the Technical Advisory Subcommittee meeting. An environmental scan to identify those potential areas would include the type of factors that AHRQ has previously considered (go to Appendix E), as well as the potential effects of changing population dynamics on overall national health status, the burden of disease, and appropriate health care utilization. Additionally, ideas for possible candidate measurement areas for review could come from staff review of the literature for presumed high-impact areas and from nominations of areas for consideration from sources internal and external to HHS, including the assessment of measures for leading conditions under Medicare by the NQF (HHS, 2009c). This process could include a public call for measure priorities, including measures specific to priority populations. Given that the national healthcare reports have Congress as a major audience, querying the staff of pertinent committees about their areas of interest would be advisable; some of these interests are expressed in existing and proposed legislation (e.g., high cost conditions under Medicare; insurance coverage; child health).
In general, the measurement areas that are important for the nation's population as a whole tend to be equally important for smaller population groups; disparities can be found in most of the standard quality measures included in sets such as HEDIS (Fiscella, 2002; Lurie et al., 2005; Nerenz, 2002). Thus, it is useful to have the same measures in both the NHQR and NHDR. However, the NHDR also reports on priority populations, and the environmental scan should note if there are specific measures that should be considered and ranked for individual populations (e.g., racial and ethnic groups, rural areas, individuals with disabilities). There are conditions and circumstances that disproportionately affect minority and other priority populations, and consideration should be given to developing measures for those areas if they do not yet exist (The Commonwealth Fund, 2002).
Criterion A, improvability, contains several aspects: one is whether a higher level of quality is feasible, as evidenced by high performance in some sectors or among some populations, and another is whether methods of improvement are available, and as applicable, whether the barriers to that improvement can be identified.11 The cost of implementing quality improvement activities can be a realistic barrier. That should not preclude further evaluation of a measurement area for national reporting, but it may ultimately affect its ranking. The Technical Advisory Subcommittee may encounter areas that are considered very important but have an insufficient evidence base for reliable and perhaps lower cost interventions; in that case, the topic areas should be considered for further implementation research to improve the evidence base. Most implementation research, however, does not rise to the level of rigor of randomized controlled trials (RCTs); the Future Directions committee believes other types of trustworthy study designs can be utilized to establish the evidence base.
Scientifically Sound Measure Availability
Application of Criterion B, scientific soundness, follows identification of importance and improvability because if the area is not one that is meaningful and important, it will not matter how scientifically sound a measure is. Furthermore, valid measures may not yet be ready for all areas considered very important, and thus these measurement areas should be considered as part of a measure development strategy.
Under the process outlined in Figure 4-2, the actual ranking of measures weighs their applicability to national priorities (Criterion C), the value of closing the gap between current and desired performance levels (Criterion D), and equity concerns (Criterion E for disparities among sociodemographic groups, and Criterion F for disparities among geographic regions or health systems/payers).
Major questions face AHRQ with regard to measures in the NHQR and NHDR:
- Are the 46 measures in its core set for the NHQR and NHDR the right ones to accelerate health care quality improvement in the Nation?
- Would a different set of measures offer a better yield on investment in interventions to close quality gaps?
In a similar vein, when thinking about selection of measurement areas for tracking and improvement in Healthy People 2020, the question was asked, "If I have my last dollar what should I spend it on?" (Secretary's Advisory Committee on National Health Promotion and Disease Prevention Objectives for 2020, 2008a). The implicit notion in selecting a measure for national reporting should be that there is a significant quality gap that needs to be addressed/closed. The elevation of an area and its measure to national prominence would likely mean that resources would ultimately follow to implement quality measurement as well as provide interventions to eliminate those gaps. Making choices among measures has consequences for influencing national quality improvement efforts.
Thus, to answer the first question, the NAC Technical Advisory Subcommittee for Measure Selection could begin by evaluating AHRQ's current core measure set to determine how much improved performance in those areas would contribute to the nation's health. Any newly considered measurement areas could be compared with the existing set, as might some measures in the expanded measure set featured in the State Snapshots or NHQRDRnet. The committee believes that the subcommittee could be formed immediately to begin this work. To answer the second question, measures would be ranked according to their potential contribution; depending on the focus of any national strategy or realignment of investment, there can be differentials in outcomes (see discussion later in the chapter of the work of Tengs and Graham and in Appendix F contributed by Meltzer and Chung).
Candidate measures would then be screened for their applicability to national priority areas (Criterion C). Go to Box 2-3 in Chapter 2 for the Future Directions committee's recommended priority areas; other priorities may emerge in establishing a national health reform quality improvement strategy. The AHRQ measure selection process can help inform which measures should be highlighted as part of any such national strategy. Numerous measures might pass through the screen of being applicable to national priority areas; not all of these should be automatically included in the national healthcare reports. Other ranking criteria need to be taken into account. Nevertheless, applicability to national priorities is an important factor for inclusion of measures in the reports. There may be measures that have been tracked by AHRQ that do not directly correspond to the national priority areas, and if deemed desirable, these could continue to be tracked in other report-related formats (e.g., online appendixes, State Snapshots, or NHQRDRnet) or through links to more extensive datasets (e.g., the National Health Interview Survey or Centers for Medicare and Medicaid Services' analyses or datasets) so that interested stakeholders could continue to track those data. The committee recognizes that priority areas may change over time, and so encourages flexibility in maintaining additional measures.
The next step involves screening candidate measures for their relative quality improvement impact (go to boxes with Criteria D, E, and F in Figure 4-2). Measures would be assessed according to the potential to increase health care value (Criterion D), and this step also recognizes inequities along demographic lines and the possibilities of geographic and health systems variance (Criteria E and F).
Criterion D. A value is assigned to a measurement area based on a quantitative expression of the outcome of closing the gap between the current average U.S. performance and the desired performance level. The simplest approach would be to assess all measures against the aspirational level of 100 percent performance. While it might be desirable to have all appropriate persons receive a service, alternate fixed points other than 100 percent could also be used for analyses to further establish rankings for interventions; for instance, comparing the quality improvement impact if 90 percent versus 100 percent received care, as there may be only a very marginal impact after achieving a certain level of performance. Similarly, there may be a better yield when quality improvement interventions are focused on certain populations or age groups (go to Appendix F for further discussion of assessing the value of quality improvement). Goal levels could also be informed by the benchmarks achieved by best-in-class performers. Several scenarios of performance may need to be assessed for each measure to determine how best to focus resources and how to ultimately rank measures. Quantitative techniques for valuation are discussed later in this chapter.
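The comparison of alternate goal levels described above can be illustrated with a minimal Python sketch. The function name, the eligible population, and the performance rates are hypothetical values chosen only for illustration; they are not drawn from the NHQR, the NHDR, or the committee's analyses, and the sketch counts persons served rather than health benefit.

```python
def additional_persons_served(eligible_population, current_rate, goal_rate):
    """Persons newly receiving a service if performance rose from
    current_rate to goal_rate (both expressed as proportions)."""
    if not (0.0 <= current_rate <= 1.0 and 0.0 <= goal_rate <= 1.0):
        raise ValueError("rates must be proportions between 0 and 1")
    # There is no gap to close when current performance already meets the goal.
    return max(0.0, goal_rate - current_rate) * eligible_population


# Hypothetical inputs (assumptions, not report data).
eligible = 1_000_000
current = 0.60

for goal in (0.90, 1.00):
    gained = additional_persons_served(eligible, current, goal)
    print(f"goal {goal:.0%}: {gained:,.0f} additional persons served")
```

Running several such scenarios per measure shows how much of the total gap is closed at each goal level; judging whether the last increment yields only marginal health impact would require outcome data that this sketch does not model.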
In some areas of quality measurement, the applicability of techniques such as net health benefit and cost effectiveness analysis may work less well, or sufficient data may not be available. However, it is rarely the case that one has all the necessary information to do these estimates; invariably the analyst has to make some assumptions for analysis. It would be possible to consider, at a minimum, for most measures:
- What is the size of the population affected by the performance or equity gap (e.g., number of persons who would benefit if current levels of performance were improved to best performer [benchmark] or goal level)? This can readily be calculated based on the difference between the number of persons who would benefit under optimal versus current conditions. This factor often drives estimations of net health benefit.
- Existing measures can be ranked based on the relative size of the population affected. When considering equity, the number of additional persons within particular disparity populations who would receive the intervention (if equity in performance were achieved) should be compared.
- What is the potential impact of the intervention or care process reflected by the measure on health, well being, patient-centeredness, and/or costs? Interventions and care processes differ in their available evidence base, but numbers needed to treat (NNT) or a comparable measure of population impact are feasible. Effectiveness, safety, and timeliness measures can be prioritized based on interventions/processes that maximize population impact (e.g., lower NNT) while minimizing costs. Efficiency measures should be prioritized based on interventions/processes that minimize costs while maximizing health benefit. Access measures can be similarly assessed based on the evidence base for health benefit or linkage to key interventions associated with health benefit and health care system costs (e.g., avoidable hospital admissions). Patient/family-centeredness measures can be evaluated based on estimations of the potential impact for improving the responsiveness of the system to patient/family needs, values, and preferences related to care processes and interventions.
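The two minimum questions above, the size of the population in the gap and the population impact of the intervention, can be combined in a rough calculation. The following Python sketch is one possible way to do so, assuming that each measure has a single NNT and that the adverse outcomes prevented can be approximated as persons in the gap divided by NNT. The class name, measure names, and all numbers are hypothetical, and costs are omitted.

```python
from dataclasses import dataclass


@dataclass
class MeasureEstimate:
    name: str
    eligible_population: int  # persons for whom the service is appropriate
    current_rate: float       # current proportion receiving the service
    goal_rate: float          # benchmark or goal proportion
    nnt: float                # number needed to treat to prevent one adverse outcome

    def persons_in_gap(self) -> float:
        # Difference between persons served under goal and current conditions.
        return max(0.0, self.goal_rate - self.current_rate) * self.eligible_population

    def outcomes_prevented(self) -> float:
        # Roughly one adverse outcome is prevented per NNT additional persons treated.
        return self.persons_in_gap() / self.nnt


# Hypothetical measures (assumptions, not report data).
measures = [
    MeasureEstimate("hypothetical measure A", 2_000_000, 0.70, 0.90, 50),
    MeasureEstimate("hypothetical measure B", 500_000, 0.40, 0.90, 20),
]

# Rank from greatest to lowest estimated population impact.
ranked = sorted(measures, key=lambda m: m.outcomes_prevented(), reverse=True)
for m in ranked:
    print(f"{m.name}: {m.persons_in_gap():,.0f} persons in gap, "
          f"{m.outcomes_prevented():,.0f} outcomes prevented")
```

In this illustration the measure with the smaller gap population ranks first because its lower NNT yields more outcomes prevented, which is why population size alone may not settle a ranking.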
Criteria E and F. Once value assessments are made, rankings can be established from greatest to lowest impact and then the impact on equity would be taken into account. What does taking equity into account mean? Having evidence of large disparities and variation would give greater weight for inclusion in the NHQR and NHDR to measures that are otherwise equal in the valuation step. Equity differences (both Criteria E and F) can be separately ranked by applying quantitative techniques such as net health benefit and cost effectiveness analysis; however, data are often not available to stratify every measure by sociodemographic variables, payers, and small area geography. Furthermore, ranking each measure by 15 sociodemographic categories and by multiple geographic variables, for example, may not lead to a consistent ranking pattern. However, available studies and data can inform the expected degree of disparity and allow assumptions about whether disparities exist at all, are relatively minor in degree of difference, or are of major concern.
AHRQ has chosen its current measure set, in part, based on the availability of subpopulation data to be able to report differences among population groups in the NHDR; thus, equity rankings may be more feasible with these measures. Incorporating new and better measures may mean that subpopulation data are not yet always available, but this factor alone should not preclude selection of such a measure.
If equity is hard to determine, why should it be part of the measure selection process? The need to pay specific attention to equity has been noted in other health-care prioritization practices (Bleichrodt et al., 2004, 2008; Stolk et al., 2005). Because populations at risk of disparities may have a small number of members, prioritizing measures based on overall national health impact or burden alone is unlikely to allow measures that capture some disparity gaps, even significant ones, to rise to the top of a ranking for inclusion in the NHQR and NHDR. At times, equity considerations may need to trump the overall valuation (for instance, if there is a large disparity gap, but the overall difference between national performance and the aspirational performance level is relatively small). Additionally, there may be measurement areas where the impact of a condition for one of the priority populations is profound. In these cases, the needs of the population could have precedence even if the overall valuation did not rank the measure highly for the entire population of the nation; then, the measure may be most appropriate to feature in the priority population section of the NHDR.
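One simple reading of the equity step, in which evidence of disparity gives greater weight to measures that are otherwise equal in the valuation step, can be sketched in Python as a tie-breaking sort. The three disparity levels mirror the distinction drawn above between no disparity, minor differences, and major concern; the numeric coding, the function name, and the candidate measures are assumptions made for illustration.

```python
# Ordinal coding of the expected degree of disparity (an assumption of this sketch).
DISPARITY_LEVELS = {"none": 0, "minor": 1, "major": 2}


def rank_with_equity(measures):
    """Sort by value (descending); among equal values, larger disparities rank first."""
    return sorted(
        measures,
        key=lambda m: (m["value"], DISPARITY_LEVELS[m["disparity"]]),
        reverse=True,
    )


# Hypothetical candidates with illustrative value scores.
candidates = [
    {"name": "measure X", "value": 10, "disparity": "minor"},
    {"name": "measure Y", "value": 10, "disparity": "major"},
    {"name": "measure Z", "value": 12, "disparity": "none"},
]

for m in rank_with_equity(candidates):
    print(m["name"], m["value"], m["disparity"])
```

This sketch covers only the tie-breaking case. The situations in which equity considerations override the overall valuation would need an explicit rule or subcommittee judgment that is not modeled here.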
Measure and Data Development Strategy
The committee envisions the measure selection process as not only prioritizing measures but also informing a strategy for measure and data development (the boxes with asterisks in Figure 4-2). The results of applying the criteria of improvability (Criterion A) and availability of a validated measure (Criterion B) are steps in the selection process to inform the measurement research agenda when the answer to those criteria is not affirmative. A final consideration is the availability of national data to support reporting. A measure need not be excluded if national or subpopulation data are not currently available; alternative sources such as subnational data may be useful (go to Chapter 5). It is realistic that the cost of acquiring data will remain a feasibility consideration, although the committee recommends that sufficient resources be available to AHRQ to revamp its products and acquire data to support important measurement areas (go to Chapter 7).
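The routing of measurement areas described in this section can be summarized as a short decision function. The Python sketch below is a simplified reading of the text, not a reproduction of Figure 4-2: the function name, argument names, and returned labels are hypothetical, each criterion is reduced to a yes/no answer, and data availability and cost are left out.

```python
def triage_measurement_area(important, improvable, sound_measure_exists,
                            matches_national_priority):
    """Return where a candidate measurement area goes next in the process."""
    if not important:
        # Areas failing the importance screen do not qualify for the reports.
        return "not considered further"
    if not improvable:
        # Criterion A not met: candidate for implementation research.
        return "implementation research agenda"
    if not sound_measure_exists:
        # Criterion B not met: candidate for the measure development strategy.
        return "measure development agenda"
    if not matches_national_priority:
        # Criterion C not met: may still be tracked outside the print reports.
        return "consider tracking in other report-related formats"
    # Criteria A to C met: proceed to valuation and equity ranking.
    return "rank on Criteria D, E, and F"


print(triage_measurement_area(True, True, False, True))
```

In practice the subcommittee would weigh these criteria with judgment rather than as strict binary gates; the sketch only makes explicit that a negative answer on Criterion A or B feeds the research and development agenda instead of ending consideration of the area.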
7 Children's Health Insurance Program Reauthorization Act, Public Law 111-3, 111th Cong., 1st sess. (January 6, 2009).
8 Health outcome for resource investment and net health benefit reflect quantitative concepts and are aspects of the concept of value discussed in Chapter 3.
9 Federal Advisory Committee Act, Public Law 92-463, 92nd Cong., 2nd sess. (October 6, 1972).
10 The terms aspirational goal, benchmark, and target as used in this report are defined in Box 2-1 in Chapter 2. An aspirational goal is the ideal level of performance in a priority area (e.g., no patients are harmed by a preventable health care error; all diabetes patients receive a flu shot—unless contraindicated). Benchmark is the quantifiable highest level of performance achieved so far (e.g., the benchmark among states would be set at 66.4 percent of diabetes patients received a flu shot because that represents the highest performance level of any state). Target is a quantifiable level of actual performance to be achieved relative to goal, usually by a specific date (e.g., by January 1, 2015, 75 percent of diabetes patients will receive an annual influenza shot).
11 The IOM report Priority Areas for National Action: Transforming Health Care Quality (IOM, 2003) set about identifying priorities for quality improvement, specifically to identify areas for actionability; the report used the term improvability: the extent of the gap between current practice and evidence-based best practice and the likelihood that the gap can be closed and conditions improved through change in an area.