Future Directions for the National Healthcare Quality and Disparities Reports
Appendix F: The Expected Population Value of Quality Indicator Reporting
EPV-QIR Calculations for Selected NHQR Measures
Table F-1 presents the results of attempts to estimate or bound EPV-QIR calculations for 14 NHQR measures for which we were able to obtain information on costs, effectiveness (in QALYs), denominator population, and current implementation rate. Appendix C lists the sources of data elements used in our calculations for each measure. Because of resource limitations, our primary goal in developing these estimates was to illustrate potential issues that could arise in the application of the EPV-QIR approach rather than to develop the best possible estimate for any one of these indicators. To facilitate discussion, we assigned a brief mnemonic to each NHQR measure in this report, listed in Column 1 of Table F-1. Column 2 provides the measure definition for each NHQR measure. Column 3 shows the denominator population for each measure, i.e., the total number of individuals in the U.S. who should receive the standard of care for a given measure. Column 4 presents the total number of QALYs that can be achieved if all individuals in the denominator population received the standard of care; this is the population value of perfect implementation (PVPI). Column 5 presents the total number of QALYs currently achieved given existing patterns of care in the population; this is the population value of current implementation (PVCI). Column 6 presents the total number of QALYs that can be gained by improving performance on a measure to 100% compliance; this is the maximum population value of quality improvement (MaxPVQI), and it is equal to the difference between PVPI and PVCI.
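The relationship among Columns 3 through 6 can be sketched in a few lines of Python. This is a minimal illustration, not the calculation used to build Table F-1: it assumes that PVPI is the denominator population multiplied by a per-person net health benefit, and that individuals not receiving the standard of care gain zero net health benefit. Only the identity MaxPVQI = PVPI - PVCI comes from the text; the class name, function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureInputs:
    """Inputs for one quality measure (all values used below are hypothetical)."""
    mnemonic: str
    denominator_population: float  # people who should receive the standard of care
    nhb_per_person: float          # assumed net health benefit per person, in QALYs
    current_rate: float            # share currently receiving the standard (0 to 1)


def pvpi(m: MeasureInputs) -> float:
    """Population value of perfect implementation: everyone receives the standard."""
    return m.denominator_population * m.nhb_per_person


def pvci(m: MeasureInputs) -> float:
    """Population value of current implementation, assuming people who do not
    receive the standard gain zero net health benefit (a simplifying assumption)."""
    if not 0.0 <= m.current_rate <= 1.0:
        raise ValueError("current_rate must lie between 0 and 1")
    return m.denominator_population * m.current_rate * m.nhb_per_person


def max_pvqi(m: MeasureInputs) -> float:
    """Maximum population value of quality improvement: PVPI minus PVCI."""
    return pvpi(m) - pvci(m)


# Hypothetical measure: 1,000,000 eligible people, 0.05 QALYs each, 60% compliance.
example = MeasureInputs("HYPOTHETICAL_MEASURE", 1_000_000, 0.05, 0.60)
print(pvpi(example), pvci(example), max_pvqi(example))
```

Where non-compliant care has its own non-zero net health benefit, the PVCI term would need to weight each pattern of care separately.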
Table F-2 sorts the 14 NHQR measures by descending order of PVPI. Perfect implementation of all 14 measures would yield a total of 17,852,224 QALYs. Nearly 40% of this total can be obtained by achieving perfect implementation of blood pressure control among adults with diagnosed diabetes (NHQR_DMHTN measure). More than half of the total number of QALYs achievable can be obtained by perfecting implementation of both blood pressure control for adults with diabetes and ensuring annual optimal foot care for adults with diabetes. Examining these 14 NHQR measures alone, we see that perfect implementation of the top 7 measures would yield over 90% of total QALYs possible. Moreover, these high-impact measures are all concentrated in public health domains—diabetes, cervical cancer screening, breast cancer screening, and HIV testing.
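The kind of cumulative-share statement made above (for example, that the top 7 measures account for over 90% of achievable QALYs) can be reproduced from a column of PVPI values. The sketch below is illustrative only; the function names and the four PVPI values are hypothetical and are not taken from Table F-2.

```python
def cumulative_shares(pvpi_by_measure):
    """Return (mnemonic, cumulative share of total PVPI) pairs, largest PVPI first."""
    total = sum(pvpi_by_measure.values())
    if total <= 0:
        raise ValueError("total PVPI must be positive")
    ranked = sorted(pvpi_by_measure.items(), key=lambda kv: kv[1], reverse=True)
    running = 0.0
    result = []
    for name, value in ranked:
        running += value
        result.append((name, running / total))
    return result


def measures_needed(pvpi_by_measure, target_share):
    """Smallest number of top-ranked measures whose combined PVPI reaches target_share."""
    for count, (_, share) in enumerate(cumulative_shares(pvpi_by_measure), start=1):
        if share >= target_share:
            return count
    return len(pvpi_by_measure)


# Hypothetical PVPI values in QALYs (not the Table F-2 figures).
hypothetical = {"M1": 500.0, "M2": 300.0, "M3": 150.0, "M4": 50.0}
for name, share in cumulative_shares(hypothetical):
    print(name, round(share, 3))
print(measures_needed(hypothetical, 0.90))
```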
Table F-3 lists the 14 NHQR measures in descending order of MaxPVQI. This table provides important complementary insights to Table F-2. Whereas Table F-2 identifies those measures with the greatest net health benefit at the population level, Table F-3 identifies those measures promising the greatest returns to additional quality improvement in terms of net health benefit. For example, as shown in Table F-2, biennial mammography is associated with large health benefits; however, additional investment to improve mammography may not be warranted. As shown in Table F-3, further improvement on this measure is expected to yield only 120,833 extra QALYs, less than 2% of the total additional QALYs that can be potentially gained from improving quality on the full set of 14 indicators.
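The contrast between the two orderings can be illustrated with a short sketch. The measure names and QALY figures below are hypothetical and chosen only to show that a measure ranking first on PVPI can rank lower on MaxPVQI when current implementation is already high.

```python
def rank_by(measures, key):
    """Return measure names sorted in descending order of key(values)."""
    ordered = sorted(measures.items(), key=lambda kv: key(kv[1]), reverse=True)
    return [name for name, _ in ordered]


# Hypothetical PVPI and PVCI values in QALYs (not taken from Tables F-2 or F-3).
hypothetical = {
    "SCREENING": {"pvpi": 5_000_000, "pvci": 4_800_000},
    "CHRONIC_CARE": {"pvpi": 2_000_000, "pvci": 500_000},
    "COUNSELING": {"pvpi": 1_000_000, "pvci": 900_000},
}

# Ordering by total achievable benefit (the Table F-2 view).
by_pvpi = rank_by(hypothetical, lambda v: v["pvpi"])

# Ordering by remaining achievable gain, MaxPVQI = PVPI - PVCI (the Table F-3 view).
by_max_pvqi = rank_by(hypothetical, lambda v: v["pvpi"] - v["pvci"])

print(by_pvpi)
print(by_max_pvqi)
```

In this hypothetical example the screening measure has the largest PVPI but only the second-largest MaxPVQI, which mirrors the mammography observation in the text.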
IV. Scope of Application, Limitations, and Additional Areas for Future Development
Scope of Application
A key determinant of the value of the EPV-QIR approach to selecting and/or prioritizing measures is the extent to which it is applicable across a broad range of measure types. To assess the scope of the approach, it is useful to consider several broad classes of quality indicators:
Process Measures. For process measures defined explicitly on the basis of some standard of care, EVQI can be estimated as long as the net health benefit of S can be estimated using data from published studies.
Composite Process Measures. The 2008 NHQR/NHDR reports on 10 composite process measures. These composites are constructed as "all-or-none" aggregates of individual process measures that measure whether an individual received all standards of care for a given condition. Individuals receiving only some of the enumerated standards are considered to have not received appropriate care, and are scored as such. The EVQI of the composite requires an estimate of the NHB associated with receiving all components of care in the composite measure. Although NHBs may be calculated for each component in the composite, one cannot sum NHBs across components to calculate the total NHB associated with the composite. The reason for this is that one cannot assume additive separability across components. There may be, for example, complementarities across components of care.
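The "all-or-none" scoring rule described above can be sketched as follows. The record layout (one dictionary of component flags per individual) and the component names are hypothetical; the sketch shows only the scoring logic, not how NHQR composites are actually computed from survey data.

```python
def all_or_none_rate(patients):
    """Share of individuals who received every component of care.

    `patients` is a list of dicts mapping a component name to True/False.
    An individual missing any one component is scored as not having
    received appropriate care.
    """
    if not patients:
        raise ValueError("at least one patient record is required")
    passed = sum(1 for components in patients if all(components.values()))
    return passed / len(patients)


# Hypothetical records for a three-component composite.
records = [
    {"hba1c_test": True, "eye_exam": True, "foot_exam": True},
    {"hba1c_test": True, "eye_exam": False, "foot_exam": True},
    {"hba1c_test": False, "eye_exam": False, "foot_exam": False},
    {"hba1c_test": True, "eye_exam": True, "foot_exam": True},
]
print(all_or_none_rate(records))
```

Because the composite is scored jointly, its NHB would also need to be estimated for the joint bundle of care rather than assembled from component-level NHBs.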
Outcomes Measures. A number of intermediate- and final-outcomes measures are reported in the NHQR/NHDR, and vary substantially in the way that they are defined. The primary problem with these measures is the lack of a specific treatment or intervention that can be identified as a target for improvement, which makes it impossible to estimate net health benefits of a standard of care, intervention, or treatment.
Access/Utilization Rates. The NHQR/NHDR includes several measures defined as population utilization rates. A utilization-based measure is intended to track desirable or appropriate use of health services. These measures may be evaluated using the EVQI approach if the net health benefit for an appropriate unit of access to care can be constructed. However, some of these measures are indirect: they capture otherwise-avoidable utilization of health services that results from a failure to provide unspecified interventions or services. Such measures face the same challenge as mortality-based measures and clinical intermediate outcomes measures, in that net health benefits cannot be constructed.
Overuse and Inappropriate Use Measures. As noted above, the EPV-QIR approach can be extended to consider overuse. The cervical cancer screening example discussed in Appendix A provides a good example of how overuse might be addressed. As discussed in Appendix A, inappropriate use measures work similarly, with effects applied over the relevant populations in which inappropriate use is occurring.
Patient Experience Measures. Finally, NHQR/NHDR contains a number of measures of patient experience/satisfaction. If these measures are assumed to reflect interpersonal quality of care, then the EPV-QIR approach can be applied if net health benefits can be constructed for dimensions of interpersonal relations between patients and providers. If the motivation for patient experience measures is instead driven by interest in promoting patient-centered or preference-concordant care, then estimating the EPV-QIR is more complicated. The EPV-QIR for communication will itself depend on the expected value of perfect information in a specific decision-making context, or what Basu and Meltzer (2007) term the expected value of individualized care (EVIC). Moreover, the expected value of perfect information may vary considerably depending on the amount of financial risk-sharing that a patient faces (Basu and Meltzer, 2007). In general, the EPV-QIR for communication will tend to be greater in the context of preference-sensitive care where alternative treatment modalities or plans of care present "significant tradeoffs affecting the patient's quality and/or length of life" (Dartmouth Center for the Evaluative Clinical Sciences, 2007). Although methods for estimating EVIC have been proposed (Basu and Meltzer, 2007), the communication-themed quality indicators in NHQR/NHDR do not measure individualized care, but rather the potential for obtaining individualized care. Valuation of the benefits of communication in this regard will require not only some estimate of the expected value of individualized care in the context of a specific preference-sensitive clinical care context, but also a patient's willingness-to-pay for communication that will result in preference-concordant care.
Disparities. The EVQI approach could be adapted to calculate measures appropriate for the study of disparities. For example, one could evaluate the value of equal implementation (VEI), or the elimination of disparities across groups. EPV-QIR also lends itself to methods for summarizing disparities across discrete groups. The PVCI of a standard of care in a population comprised of a certain number of groups, denoted by G, can be calculated as the sum of PVCI across all G groups. Each group's share of population health benefits can be calculated as the fraction of PVCI in the gth group, divided by the population total PVCI. The level of disparity in a measure might then be measured using a concentration index to determine the extent to which health benefits are concentrated in a single or a few groups within the population.
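The group-share calculation described above can be sketched as follows. The text does not specify which concentration index is intended, so this sketch substitutes a Herfindahl-Hirschman-style index (the sum of squared group shares) as one possible choice; that choice, the function names, and the group values are all assumptions made for illustration.

```python
def group_shares(pvci_by_group):
    """Each group's PVCI divided by the population total PVCI."""
    total = sum(pvci_by_group.values())
    if total <= 0:
        raise ValueError("population total PVCI must be positive")
    return {group: value / total for group, value in pvci_by_group.items()}


def herfindahl_index(shares):
    """Sum of squared shares: 1/G when benefits are spread evenly across
    G groups, and 1.0 when a single group receives all of the benefit.
    (An assumed stand-in for the unspecified concentration index.)"""
    return sum(share * share for share in shares.values())


# Hypothetical PVCI by group, in QALYs.
pvci_by_group = {"group_1": 60_000.0, "group_2": 30_000.0, "group_3": 10_000.0}
shares = group_shares(pvci_by_group)
print(shares)
print(herfindahl_index(shares))
```

An index of this kind does not adjust for differences in group size, so a large group will hold a large share of PVCI even when per-person benefits are equal; comparing PVCI shares against population shares would be one way to address this.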
Limitations and Implementation Issues
The scope of applicability of the EPV-QIR framework, delineated in the preceding section, also defines the limitations of our approach. The EPV-QIR approach to prioritizing quality indicators may not be feasible for measures where data on costs and benefits of a standard of care within a population (or sub-population) of interest are not available. Practically speaking, it may not be feasible to use EPV-QIR for prioritizing some of the outcomes, access/utilization, and patient experience measures.
In our limited efforts to date to apply the EPV-QIR framework to the current set of 250+ NHQR quality measures, the main challenges we have observed are: 1) lack of data on costs and effectiveness; 2) multiple standards of care or comparators implicit in the quality measure; 3) undefined standards of care/comparators in the quality measure; and 4) lack of data on the size of the eligible population.
Lack of Data on Costs, Effectiveness, and the Value of Health. A large number of NHQR quality measures focus on processes or standards of care for which we have not been able to find published studies providing usable estimates of costs and effectiveness. A prime example of a measure with no known cost or effectiveness data is the NHQR Patient Experience of Care Measure, "Children who had a doctor's office or clinic visit in the last 12 months whose health providers showed respect for what they had to say." While we appreciate the intuitive value of this measure, we are not aware of any study measuring the costs and health effects of provider demonstration of respect for patient communication.
Also, the existence of cost-effectiveness studies for a standard or process of care in a measure does not necessarily imply the existence of usable estimates of costs and effectiveness. It is not uncommon for cost-effectiveness studies to publish (incremental) cost-effectiveness ratios only, without a separate table of costs and effects. Unfortunately, cost-effectiveness ratios alone are insufficient inputs into the EPV-QIR calculations. Furthermore, the EPV-QIR technically requires that effects be measured in QALYs, because the NHB calculation involves dividing incremental costs by the cost-effectiveness threshold, which is denominated in units of dollars per QALY.
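A short sketch shows why a cost-effectiveness ratio alone is an insufficient input. It uses the standard net health benefit expression, incremental QALYs minus incremental costs divided by the threshold. The two interventions, their costs and effects, and the $50,000-per-QALY threshold are hypothetical values chosen for illustration.

```python
def net_health_benefit(delta_qalys, delta_cost, threshold):
    """NHB in QALYs: incremental effect minus incremental cost divided by
    the cost-effectiveness threshold (dollars per QALY)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return delta_qalys - delta_cost / threshold


def icer(delta_qalys, delta_cost):
    """Incremental cost-effectiveness ratio, in dollars per QALY."""
    if delta_qalys == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_qalys


THRESHOLD = 50_000  # hypothetical threshold, dollars per QALY

# Two hypothetical interventions with the same ICER but different scale.
small = {"delta_qalys": 0.1, "delta_cost": 2_000}
large = {"delta_qalys": 1.0, "delta_cost": 20_000}

for label, x in (("small", small), ("large", large)):
    print(label, icer(**x), net_health_benefit(threshold=THRESHOLD, **x))
```

Both hypothetical interventions have an ICER of $20,000 per QALY, yet their per-person NHBs differ by a factor of ten, which is why separate estimates of costs and effects are needed.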
Also, there may be cost-effectiveness evaluations of a standard or process of care in a measure, but they may not have been conducted in the same population (or a similar population) as that in the denominator of a measure. In these cases, one must judge whether it may be reasonable or valid to use these estimates of costs and effectiveness from dissimilar populations in EPV-QIR calculations, if they are the only estimates available.
Finally, uncertainty about how to value health will surely change estimates of the magnitude and even sign of NHB calculations and all calculations that rely on them. Given this, the robustness of the results of analyses should be examined across the range of values plausible for the setting (e.g., at least $50,000 to $200,000 per QALY in the United States).
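The effect of the value placed on health on both the magnitude and the sign of NHB can be illustrated with a simple sweep over thresholds. The incremental cost and effect below are hypothetical; the thresholds span the $50,000 to $200,000 per QALY range mentioned in the text.

```python
def net_health_benefit(delta_qalys, delta_cost, threshold):
    """NHB in QALYs for a given cost-effectiveness threshold (dollars per QALY)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return delta_qalys - delta_cost / threshold


def nhb_over_thresholds(delta_qalys, delta_cost, thresholds):
    """Map each threshold to the NHB it implies."""
    return {t: net_health_benefit(delta_qalys, delta_cost, t) for t in thresholds}


# Hypothetical standard of care: 0.1 QALYs gained at an incremental cost of $10,000.
results = nhb_over_thresholds(0.1, 10_000, [50_000, 100_000, 200_000])
for threshold, nhb in results.items():
    print(threshold, nhb)
```

In this hypothetical case the NHB is negative at $50,000 per QALY and positive at $200,000 per QALY, so a ranking of measures built on NHB could change with the threshold chosen.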
Multiple Standards of Care/Comparators. The EPV-QIR framework requires assigning an estimate of NHB to the standard of care as well as the comparator or "non-standard" care. For some measures, a single standard of care and comparator may be identified, but for the majority of measures, there are multiple treatment patterns that may be compliant with the standard of care, and/or multiple comparator treatment patterns that are non-compliant with the standard of care. In theory, all treatment patterns that are compliant and non-compliant should be identified, treatment-specific NHBs should be used in the calculations, and the proportion of individuals in the eligible population receiving each treatment pattern needs to be known.
In practice, for measures with multiple standards of care and/or comparators, simplifying assumptions must be made to restrict the analysis to a limited set of treatment patterns that will be considered "compliant" with the standard of care and "non-compliant" with the standard of care. Identification of this set of treatment patterns will hinge on the availability of usable estimates of costs and effectiveness, the prevalence and evidence base for these patterns, and the availability of data on the proportion of the eligible population receiving these different patterns of care. For example, for the NHQR measure, "Percent of women (age 40+) who report they had a mammogram within the past 2 years," any screening occurring at intervals of 2 years or less can be considered compliant with the measure. All other screening schedules (triennial, quadrennial, or intervals of 5 years or longer, in addition to no screening at all) are non-compliant. Each of these non-compliant schedules is associated with different lifetime costs and effectiveness, and thus, a separate NHB should be estimated for each screening schedule. Calculations for this measure are discussed in detail in Appendix F-B, Calculation 2.
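With several patterns of care, the population values can be sketched as a proportion-weighted sum of pattern-specific NHBs. The sketch below is an illustration under stated assumptions, not the method of Calculation 2: it assumes every NHB is measured against a common baseline (here, no screening), and the pattern names, proportions, and NHB values are hypothetical.

```python
def population_values(population, standard_nhb, patterns):
    """Compute PVPI, PVCI, and MaxPVQI when care follows several patterns.

    `patterns` maps a pattern name to (proportion of the eligible
    population, NHB per person in QALYs relative to a common baseline).
    Proportions must sum to 1.
    """
    total_share = sum(share for share, _ in patterns.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("pattern proportions must sum to 1")
    pvpi = population * standard_nhb
    pvci = population * sum(share * nhb for share, nhb in patterns.values())
    return {"PVPI": pvpi, "PVCI": pvci, "MaxPVQI": pvpi - pvci}


# Hypothetical screening patterns: (proportion, NHB per person in QALYs).
patterns = {
    "biennial_or_more_frequent": (0.7, 0.10),  # compliant with the measure
    "triennial": (0.2, 0.08),                  # non-compliant, partial benefit
    "no_screening": (0.1, 0.00),               # non-compliant, baseline
}
values = population_values(population=1_000, standard_nhb=0.10, patterns=patterns)
print(values)
```

Treating all non-compliant care as yielding zero benefit would overstate MaxPVQI in this example, since the triennial pattern already captures part of the achievable benefit.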
Outcomes-Based Measure with No Defined Standard of Care/Comparator. A related problem exists for many outcome measures when there is no explicit standard or process of care referenced in the quality measure. In some cases, it may be reasonable to identify one or a few interventions or standards of care with direct links to the outcomes of interest. EPV-QIR calculations can be carried out if cost-effectiveness studies for the identified standards of care exist. An example of a quality measure with undefined standards and comparators is the NHQR measure, "Number of bloodstream infections (BSIs) per 1,000 central venous catheter (CVC) placements." Bloodstream infection rates are influenced only in part by processes of care by healthcare providers, of which there are many. To calculate the EPV-QIR of this measure and other outcomes-based measures with no defined standards of care, it is necessary to identify an intervention or group of interventions to be considered; find estimates of the NHB of each intervention and comparator under consideration; and find estimates of the proportion of the population receiving each intervention/comparator. An example of these calculations is presented in Appendix C, Calculation 3.
Lack of Data on Population Estimates. For some measures, it may be difficult to obtain population estimates of the number of individuals eligible for the standard of care in a measure. For example, for the NHQR measure, "Percent of hospital patients with heart attack and left ventricular systolic dysfunction who were prescribed ACE inhibitor or ARB at discharge," determining the number of individuals eligible for the standard of care requires estimating the number of individuals hospitalized with AMI, as well as the prevalence of LVSD among hospitalized AMI patients. Data from national healthcare utilization surveys can be used to obtain population estimates of the number of discharges for AMI each year; however, information in these surveys may not be sufficient to determine whether a patient had LVSD. Prevalence of LVSD among AMI inpatients may be obtained from reviewing clinical literature. This measure is discussed in detail in Calculation 4, Appendix C.
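The two-step denominator estimate described above amounts to multiplying a discharge count by a prevalence. The sketch below also carries a low and high prevalence through the calculation, since literature-based prevalence estimates may vary. The discharge count and prevalence values are hypothetical placeholders, not estimates from any survey or study.

```python
def eligible_population(annual_discharges, prevalence):
    """Estimated number eligible for the standard of care: discharges for the
    condition multiplied by the prevalence of the qualifying finding."""
    if annual_discharges < 0:
        raise ValueError("annual_discharges must be non-negative")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    return annual_discharges * prevalence


def eligible_range(annual_discharges, prevalence_low, prevalence_high):
    """Low and high estimates of the eligible population."""
    if prevalence_low > prevalence_high:
        raise ValueError("prevalence_low must not exceed prevalence_high")
    return (
        eligible_population(annual_discharges, prevalence_low),
        eligible_population(annual_discharges, prevalence_high),
    )


# Hypothetical inputs: 100,000 AMI discharges, LVSD prevalence between 20% and 40%.
low, high = eligible_range(100_000, 0.20, 0.40)
print(low, high)
```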
Uncertainty in Estimates. All the inputs into the above framework may be uncertain and finding ways to reflect this may be important when considering the use of this framework for decision making. When the consequences of a decision are small, a case can be made for making policy based only on expected value (Arrow and Lind, 1970; Claxton, 1999; Meltzer, 2001).
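One common way to reflect input uncertainty is to draw the inputs from assumed distributions and summarize the resulting NHB values. The sketch below is a minimal Monte Carlo illustration, not a method prescribed by the framework: the uniform distributions, their bounds, the fixed threshold, and the function name are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_nhb(n_draws, threshold, seed=0):
    """Draw per-person NHB values (QALYs) under assumed input distributions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = []
    for _ in range(n_draws):
        delta_qalys = rng.uniform(0.05, 0.15)   # assumed range of incremental effect
        delta_cost = rng.uniform(1_000, 3_000)  # assumed range of incremental cost ($)
        draws.append(delta_qalys - delta_cost / threshold)
    return draws


draws = simulate_nhb(n_draws=10_000, threshold=100_000)
expected_nhb = statistics.mean(draws)
share_positive = sum(d > 0 for d in draws) / len(draws)
print(expected_nhb, share_positive)
```

The mean of the draws approximates the expected value on which a decision could be based when the consequences are small, while the spread of the draws indicates how much the estimate could move.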
Additional Directions for Future Development
The current formulation of the EPV-QIR framework considers expected value of quality improvement based on net health benefits accruing to a single cohort at a given time point. However, quality improvement undertaken at a single point in time will alter the quality of care for succeeding cohorts. A more elaborate model can be constructed that estimates the expected value of quality improvement based on discounted streams of net health benefits that may be realized over a specific time horizon. Prioritization of measures based on such a model would result in selection of measures offering the greatest rate of return on investment over a fixed period. Similarly, analyses could examine the value of quality improvement research in multiple settings, and through numerous diverse strategies for quality improvement, whether through indicator reporting or other mechanisms.
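The multi-cohort extension described above can be sketched as a discounted sum of the net health benefits realized by successive annual cohorts. The 3% discount rate, the three-year horizon, the constant annual benefit, and the convention that the first cohort is undiscounted are all assumptions made for illustration.

```python
def discounted_value(annual_nhb, discount_rate):
    """Present value of a stream of annual net health benefits (QALYs).

    `annual_nhb[t]` is the benefit realized by the cohort in year t, with
    t = 0 treated as the present and left undiscounted (an assumption).
    """
    if discount_rate <= -1:
        raise ValueError("discount_rate must be greater than -1")
    return sum(nhb / (1 + discount_rate) ** t for t, nhb in enumerate(annual_nhb))


# Hypothetical stream: 100 QALYs gained by each of three successive cohorts.
stream = [100.0, 100.0, 100.0]
print(discounted_value(stream, 0.03))
print(discounted_value(stream, 0.0))
```

Measures could then be compared on discounted value over a common horizon rather than on the single-cohort MaxPVQI.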
References
Bradley EH, Herrin J, Mattera JA, Holmboe ES, Wang Y, Frederick P, et al. Quality improvement efforts and hospital performance: rates of beta-blocker prescription after acute myocardial infarction. Med Care 2005;43(3):282-92.
Dartmouth Center for the Evaluative Clinical Sciences. Preference-sensitive care. A Dartmouth Atlas Topic Brief 2007-01-05. Available at: www.dartmouthatlas.org/downloads/reports/preference_sensitive.pdf. Accessed December 22, 2010.
Farmer AP, Légaré F, Turcot L, Grimshaw J, Harvey E, McGowan JL, Wolf F. Printed educational materials: effects on professional practice and health care outcomes. Cochrane Database Syst Rev 2008 Jul 16;(3):CD004398.
Hoomans T, Fenwick E, Palmer S, Claxton K. Value of information and value of implementation: application of an analytic framework to inform resource allocation decisions in metastatic hormone-refractory prostate cancer. Value in Health 2009;12(2):315-24.
Meltzer DO. Addressing uncertainty in medical cost-effectiveness analysis: implications of expected utility maximization for methods to perform sensitivity analysis and the use of cost-effectiveness analysis to set priorities for medical research. Journal of Health Economics 2001;20:109-129.
National Healthcare Quality Report, 2003. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/qual/nhqr03/nhqr03.htm.
O'Brien MA, Rogers S, Jamtvedt G, Oxman AD, Odgaard-Jensen J, Kristoffersen DT, Forsetlund L, Bainbridge D, Freemantle N, Davis DA, Haynes RB, Harvey EL. Educational outreach visits: effects on professional practice and health care outcomes. Cochrane Database Syst Rev 2007 Oct 17;(4):CD000409. Review.
Oostenbrink JB, Al MJ, Oppe M, Rutten-van Molken MP. Expected value of perfect information: an empirical example of reducing decision uncertainty by conducting additional research. Value in Health 2008;11(7):1070-80.
Renders CM, Valk GD, Griffin S, Wagner EH, Eijk JT, Assendelft WJ. Interventions to improve the management of diabetes mellitus in primary care, outpatient and community settings. Cochrane Database Syst Rev 2001;(1):CD001481. Review.
Rodriguez HP, von Glahn T, Elliott MN, Rogers WH, Safran DG. The effect of performance-based financial incentives on improving patient care experiences: a statewide evaluation. J Gen Intern Med 2009;Oct 14 [Epub ahead of print].
Rogowski W, Burch J, Palmer S, Craigs C, Golder S, Woolacott N. The effect of different treatment durations of clopidogrel in patients with non-ST-segment elevation acute coronary syndromes: a systematic review and value of information analysis. Health Technology Assessment 2009;13(31):1-77.
Tengs TO, Graham JD. The opportunity cost of haphazard social investments in life-saving. In: Hahn RW (editor). Risks, Costs, and Lives Saved: Getting Better Results from Regulation. New York: Oxford University Press, 1996, pp. 167-182.