Selecting Quality and Resource Use Measures: A Decision Guide for Community Quality Collaboratives
Part II. Introduction to Measures of Quality (continued)
Question 11. What is "risk adjustment" and how is it best applied?
Risk adjustment involves using statistical methods to "level the playing field" by adjusting for the effects of patient characteristics that may vary across providers. Without risk adjustment, users can easily draw incorrect conclusions, because the hospitals or physician organizations that appear to have the worst outcomes may simply have the sickest patients. Risk adjustment is particularly important for outcome measures, because patient outcomes are driven not just by quality of care but also by age, gender, medical history, comorbid illnesses, behavioral and social factors, and physiologic factors. Risk adjustment is not used for structural measures of quality, such as whether hospitals have implemented appropriate error prevention practices, according to the Leapfrog Safe Practices Survey, because implementation of these desirable structures is not related to patient characteristics.
Limitations to Risk Adjustment
The major limitation of risk adjustment is that it can only account for measurable and reported risk factors. Unfortunately, many important risk factors for adverse patient outcomes are either not measurable using available data (e.g., preoperative functional status) or are not consistently reported (e.g., obesity). For some outcome measures, such as heart attack mortality at the hospital level, classification of hospital performance is reasonably robust despite these unmeasurable or unreported factors, because between-hospital variation in outcomes is relatively large.87,88 In addition, unmeasured risk factors tend to be randomly distributed across hospitals.89 For other outcome measures, the limitations of risk adjustment are likely to be more problematic, particularly at the physician level, because certain types of patients cluster in certain physicians' practices.
Another problem with risk-adjusted outcomes is that they are often misinterpreted. Most risk-adjustment approaches involve estimating indirectly standardized outcome ratios, also referred to as ratios of observed to expected outcomes. These ratios compare the actual outcomes of the specific set of patients treated at each hospital with their expected outcomes had they been treated by an average hospital in the population. If a hospital is identified as a poor outlier, then its outcomes were significantly worse than what would have been expected if the same patients had been treated at a hypothetical average hospital. In other words, each hospital is compared with the hypothetical average hospital treating the same patients, not with any specific hospital treating different patients.90 Therefore, community quality collaboratives should avoid ranking hospitals based on their risk-adjusted outcomes, even though such rankings are easy for users to interpret. If Hospital A's outcomes are significantly better than expected, while Hospital B's are not, then we are more confident that Hospital A offers high quality of care, but we cannot assume that Hospital A is actually better than Hospital B. Therefore, hospitals should be placed into a limited number of "bins" (typically 3-5) based on statistical criteria, and ordered alphabetically or geographically (not ranked) within those categories.
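The observed-to-expected comparison and binning described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular reporting program: it assumes each patient already has a predicted probability of the outcome from some risk-adjustment model, and the function names, the normal approximation, the 1.96 cutoff, and the example hospitals are all assumptions made for the example.

```python
import math

def observed_to_expected(outcomes, predicted_risks):
    """Return (observed count, expected count, O/E ratio) for one hospital.

    outcomes: 0/1 flags for each patient treated at the hospital.
    predicted_risks: each patient's predicted probability of the outcome
    (assumed to come from a risk model fitted on the whole population).
    """
    observed = sum(outcomes)
    expected = sum(predicted_risks)
    return observed, expected, observed / expected

def classify(outcomes, predicted_risks, z=1.96):
    """Assign one of three bins by comparing observed with expected events."""
    observed, expected, _ = observed_to_expected(outcomes, predicted_risks)
    # Variance of the event count if every patient had the predicted risk
    variance = sum(p * (1 - p) for p in predicted_risks)
    half_width = z * math.sqrt(variance)
    if observed > expected + half_width:
        return "worse than expected"
    if observed < expected - half_width:
        return "better than expected"
    return "as expected"

def bin_hospitals(hospitals):
    """Group hospitals into bins, listed alphabetically (not ranked) in each."""
    bins = {"better than expected": [], "as expected": [], "worse than expected": []}
    for name, (outcomes, risks) in hospitals.items():
        bins[classify(outcomes, risks)].append(name)
    return {label: sorted(names) for label, names in bins.items()}

# Hypothetical data: 100 patients per hospital, each with a 10% predicted risk
hospitals = {
    "Hospital B": ([1] * 10 + [0] * 90, [0.10] * 100),
    "Hospital A": ([1] * 2 + [0] * 98, [0.10] * 100),
    "Hospital C": ([1] * 20 + [0] * 80, [0.10] * 100),
}
print(bin_hospitals(hospitals))
```

Each hospital is compared only with its own expected count, which is why the output is a set of bins rather than a ranked list.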
Risk adjustment may be implemented in a wide variety of ways, but most community quality collaboratives use one of the following approaches:
- Adopt indicators that have already been risk adjusted by an intermediary (e.g., Centers for Medicare & Medicaid Services [CMS] measures of 30-day mortality after heart attack, heart failure, or pneumonia).91
- Use off-the-shelf methods that are built into readily available software programs (e.g., AHRQ Inpatient Quality Indicators and Patient Safety Indicators, 3M Health Information Systems' all patient refined-diagnosis related groups (APR-DRGs), Chronic Disability & Illness Payment System for the Medicaid population).
- Use risk-adjusted rates calculated by State health agencies that develop their own risk-adjustment models for selected conditions or procedures, such as coronary artery bypass mortality in Massachusetts, New York, New Jersey, California, and Pennsylvania. This type of customized modeling is most important when community quality collaboratives want to take advantage of particular strengths of their local data, such as "present on admission" coding of every diagnosis in California (and other States) and "key clinical findings" in Pennsylvania. Both of these data features dramatically improve risk adjustment;92 adding fewer than 15 laboratory findings has been shown to eliminate more than 75% of the estimated bias in hospitals' expected mortality rates for major medical conditions.93
Community quality collaboratives that are interested in customized modeling, similar to what has been done in California and Pennsylvania, should refer to a standard text in the field before undertaking such analyses. These texts explain the standard methods for estimating, assessing, and validating customized models.94
Community quality collaboratives may not have access to the data needed for risk adjustment, even when risk adjustment is desirable. For example, HCAHPS® (Hospital Consumer Assessment of Healthcare Providers and Systems) survey results regarding hospital care are adjusted for the effects of both mode of survey administration and patient mix before they are publicly reported by the Hospital Quality Alliance. Generally speaking, HCAHPS® adjustments for survey mode are larger than adjustments for patient mix.95 The factors included in patient-mix adjustment include respondent education, age, self-rated health status, emergency room admission, primary language, and service line (i.e., maternity, surgical, medical).96 Although risk adjustment is also useful for patients' assessments of health plans and clinicians,97,98 and the CAHPS Analysis Programs downloadable from the AHRQ Web site include optional risk adjustment, most community quality collaboratives do not receive the respondent-level data necessary for risk adjustment.
Alternatives to Risk Adjustment
Not all quality measures require risk adjustment. Two alternative approaches, which are more commonly used for process measures, are risk stratification and exclusion. Under risk stratification, patients are divided into two or more groups according to their expected risk of the process or outcome of interest. For example, CMS's Nursing Home Compare system includes the "Percentage of High-Risk Long-Stay Residents Who Have Pressure Sores" and the "Percentage of Low-Risk Long-Stay Residents Who Have Pressure Sores" as separate measures (www.medicare.gov/NHCompare/). Occasionally, stratification is used to support numerator definitions of process measures that differ according to the patient's risk status; for example, a broader time window for the process may be allowed if the patient is classified as low risk.
Risk stratification can be applied to reporting of CAHPS® data when risk adjustment is impossible; for example, plans could be asked to report separately on the experiences of healthy and sick members,99 members in different markets,100 or members with different benefit designs. Stratification may be particularly helpful for exposing disparities in care and for rewarding plans and physician groups that reduce disparities.101,102 However, reporting stratified data typically requires larger sample sizes than reporting aggregated data, or else stratum-specific estimates of performance are unreliable. Community quality collaboratives may need to provide additional resources to support collecting and reporting stratified data at the local level.
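The sample-size caution above can be made concrete with a small sketch of stratified reporting. It is illustrative only: the function name, the record layout, and the minimum stratum size of 30 are assumptions chosen for the example, not a published reporting threshold.

```python
from collections import defaultdict

def stratified_rates(records, min_n=30):
    """Compute a rate per risk stratum, suppressing strata that are too small.

    records: iterable of (stratum label, event flag) pairs, event being 0 or 1.
    min_n: smallest stratum size reported (an assumed cutoff).
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [events, patients]
    for stratum, event in records:
        counts[stratum][0] += event
        counts[stratum][1] += 1
    report = {}
    for stratum, (events, n) in counts.items():
        # A rate from only a handful of patients is too unstable to publish
        rate = events / n if n >= min_n else None
        report[stratum] = {"n": n, "rate": rate}
    return report

# Hypothetical data: 40 high-risk and 10 low-risk patients
records = ([("high risk", 1)] * 10 + [("high risk", 0)] * 30
           + [("low risk", 1)] * 1 + [("low risk", 0)] * 9)
report = stratified_rates(records)
print(report)
```

In this example the low-risk stratum has only 10 patients, so its rate is withheld even though the aggregated rate across all 50 patients could have been reported.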
A more widely used approach, however, is simply to exclude patients who do not qualify for the process of care in question, or for whom the process of care has not been shown to confer a clear benefit. For example, all of The Joint Commission's Core Measures of hospital quality for heart attack, heart failure, pneumonia, and surgical care have carefully defined denominators that exclude patients for whom the therapy in question is documented as medically inadvisable (www.jointcommission.org/PerformanceMeasurement/PerformanceMeasurement/default.htm).
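A denominator exclusion of this kind amounts to filtering patients before the rate is computed. The sketch below is a simplified illustration, not an actual Core Measure specification; the field names `medically_inadvisable` and `received_therapy` and the example patients are hypothetical.

```python
def process_measure_rate(patients):
    """Return (rate, denominator) after excluding ineligible patients.

    patients: list of dicts with boolean keys 'medically_inadvisable'
    and 'received_therapy' (hypothetical field names).
    """
    # Patients with a documented contraindication leave the denominator
    eligible = [p for p in patients if not p["medically_inadvisable"]]
    if not eligible:
        return None, 0
    received = sum(1 for p in eligible if p["received_therapy"])
    return received / len(eligible), len(eligible)

# Hypothetical patients: one is excluded because therapy was inadvisable
patients = [
    {"medically_inadvisable": False, "received_therapy": True},
    {"medically_inadvisable": False, "received_therapy": True},
    {"medically_inadvisable": False, "received_therapy": False},
    {"medically_inadvisable": True, "received_therapy": False},
]
rate, denominator = process_measure_rate(patients)
print(rate, denominator)
```

Without the exclusion, the fourth patient would count against the provider even though withholding the therapy was appropriate.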
Question 12. What are the opportunities and challenges to using patient experience surveys to measure hospital or physician performance at the regional or State level?
Both clinical treatments and patient experiences are important facets of the overall quality of care. In the absence of a standardized set of tools to assess patient experience, many providers in the 1990s designed their own surveys or contracted with leading vendors (e.g., Press Ganey, PRC) to administer vendor-specific surveys. Additionally, Web-based patient experience sites, such as Vitals.com and AngiesList.com, have been established. Some of these sites report licensure/certification information as well as the opinions of a nonrepresentative sample of patients who initiate a posting. Although these sites may demonstrate consumers' desire for data to inform their health care decisionmaking, there is generally no scientific rigor supporting the conclusions.
Today, the premier tool for measuring patient experiences with care is the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) series of surveys created by AHRQ (https://cahps.ahrq.gov). This standardized survey series has been constructed carefully, tested rigorously, endorsed by the National Quality Forum, and accepted by stakeholders nationally. The two surveys most relevant to community quality collaboratives are the Hospital CAHPS® and the Clinician and Group (C/G) CAHPS®. In addition, the National CAHPS® Benchmarking Database (NCBD) provides national benchmarks for many of the surveys.
Hospitals: HCAHPS
Hospitals contract with a Centers for Medicare & Medicaid Services (CMS)-approved vendor to administer the HCAHPS® survey to a sample of all inpatients (not just Medicare beneficiaries), which makes the results relevant to community quality collaboratives. To encourage participation, CMS now links annual hospital payment updates to submission of HCAHPS® survey results. The results are publicly reported on the CMS Web site: www.HospitalCompare.hhs.gov. Some CVEs, such as the Washington-Puget Sound Health Alliance and the Maine Chartered Value Exchange Alliance, incorporate the CMS HCAHPS® results into their public reports.
Physicians: C/G CAHPS
In 2009, the NCBD will provide preliminary benchmarks for the Clinician-Group survey, which will offer useful comparisons for CVEs and local collaboratives already measuring patient experience with physician care. Preliminary evidence from CAHPS® surveys indicates that providing feedback to physicians and their practices improves their quality of care.103 Supplemental items are now being developed to address issues of particular concern to the Medicaid and Children's Health Insurance Program (CHIP) populations, including care for children with chronic conditions and people with impaired mobility, reduced health literacy, and other special health care needs.
Patient Experience Surveys: Challenges and Possible Solutions
- Challenge: Cost of survey administration
Possible Solution: The resource challenges of administering the C/G CAHPS® survey for small group practices can generally be overcome. A study by a small practice in Pennsylvania found that for about $1,000, or $200 per physician, reliable data could be captured.5 The CAHPS® Consortium estimates a cost of $8 per completed survey for mail administration, or $360 per clinician. Practices of all sizes can obtain data appropriate for benchmarking and goal setting from the NCBD. The CAHPS® User Network recommends that practices contact researchers at local universities for help with statistical analysis, although a better long-term strategy may be to create a permanent infrastructure for data management and analysis through CVEs or similar collaboratives.
- Challenge: Difficulty of gaining provider buy-in
Possible Solution: Some pay-for-performance programs reward providers for participating in the survey. For example, California's Integrated Healthcare Association (IHA) sponsors a pay-for-participation incentive program to encourage physicians to participate in California's version of C/G CAHPS®. Nearly universal hospital buy-in for HCAHPS® has been achieved through a 2% annual payment update incentive from CMS.
- Challenge: Concerns about differences in case mix across providers
Possible Solution: Transparent methodologies for case-mix adjustment have been carefully tested by the CAHPS® Consortium. The CAHPS® Consortium recommends adjusting for self-reported general health status (i.e., excellent, very good, good, fair, poor), age, and education. Older individuals and those in better health tend to rate their care, plans, and providers higher than younger individuals and those in worse health. There is also evidence from a number of studies that education affects ratings, with more educated individuals giving lower ratings. However, users of CAHPS® Analysis software can specify an unlimited number of adjuster variables or choose not to adjust the data at all, depending on their preferences and data quality.
- Challenge: Duplication of data collection effort
Possible Solution: Another barrier to physician participation is the duplication of effort that occurs when a single physician's patients are surveyed multiple times by different organizations (e.g., health plans, State agencies). To minimize that redundancy and use resources efficiently, AHRQ is developing a strategy for administering the survey for a physician only once in a given period.53 California avoids redundancy through its California Cooperative Healthcare Reporting Initiative (CCHRI), which coordinates the annual survey of medical groups statewide. CCHRI also coordinates public reporting of results with the State's Office of the Patient Advocate. Massachusetts Health Quality Partners, part of the Massachusetts Chartered Value Exchange, coordinates a similar effort.104 Others are considering administering the survey biennially.
- Challenge: Poor response rates
Possible Solution: In general, response rates to mail surveys have been declining over the past two decades. Variation in response rates across provider organizations may lead to bias in estimating comparative performance. Three modes of survey administration are available (mail only, telephone only, mixed mode); mail with telephone followup achieves the best response rates (i.e., about 40%, or about 10% higher than mail only) in most settings. Regardless of which mode is selected, survey sponsors should always follow recommended protocols to improve contact rates and response rates and should report their results. According to CAHPS® reports, patients may give more positive reports and ratings of care when the data are collected by telephone as opposed to mail, but there is not yet a standard method to adjust for this difference.
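The cost and response-rate figures cited in the list above lend themselves to a quick planning calculation. The sketch below is illustrative only: the function name and the idea of fixing a target number of completed surveys per clinician are assumptions. The inputs reuse the $8 per completed mail survey and the roughly 40% response rate mentioned above; the target of 45 completed surveys per clinician is inferred from $360 divided by $8, and the five-clinician practice is hypothetical.

```python
def survey_plan(clinicians, completes_per_clinician, cost_per_complete,
                response_rate_pct):
    """Estimate completed surveys, surveys to send, and total cost.

    response_rate_pct is a whole-number percentage (e.g., 40 for 40%).
    """
    completes = clinicians * completes_per_clinician
    # Ceiling division in integers: surveys needed to reach the target
    surveys_to_send = -(-completes * 100 // response_rate_pct)
    total_cost = completes * cost_per_complete
    return {
        "completes": completes,
        "surveys_to_send": surveys_to_send,
        "total_cost": total_cost,
    }

# Hypothetical five-clinician practice using the figures cited above
plan = survey_plan(clinicians=5, completes_per_clinician=45,
                   cost_per_complete=8, response_rate_pct=40)
print(plan)
```

Because the cited cost is per completed survey, a lower response rate raises the number of surveys that must be sent but, under this simplified assumption, not the estimated total cost.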
Community Collaborative Example
In a nationwide effort to report patient experience with physician care, Consumers' Checkbook/Center for the Study of Services is testing a modified version of C/G CAHPS® with three collaboratives in a pilot project to report patient experience at the individual physician level. One of the collaboratives, the Kansas City Quality Improvement Consortium, reported that this pilot survey is very similar to the C/G CAHPS® survey, except for a few "dropped" questions and a few modified demographic questions. Consumers' Checkbook worked with the local communities and health plans to identify the patient and physician populations. Of the seven health plans in the area, three (including the largest) contributed patient and physician contact information.
A community awareness campaign about the survey was implemented to familiarize physicians and consumers with its purpose and to increase response rates. Surveys were mailed in November 2008 and mailed again to nonrespondents in January 2009. The survey achieved a 47% response rate in Kansas, with an average of 57 responses for each of the 713 participating physicians. Preliminary results will be distributed to physicians for a 60-day review and comment period, as required by NCQA. The goal was to create a useful presentation of information, based on consumer focus group feedback, and to publicly report results in 2009.105
Question 13. What is the "Better Quality Information" pilot project, sponsored by the Centers for Medicare & Medicaid Services, and what can be learned from it?
The Better Quality Information to Improve Care for Medicare Beneficiaries (BQI) Project (www.cms.hhs.gov/bqi/) is a CMS-funded pilot project that ended in late 2008. The Delmarva Foundation, CMS's contractor, subcontracted with six pilot sites, five of which were Chartered Value Exchanges (CVEs), to test methods to aggregate Medicare claims data with claims data from commercial health plans and Medicaid to calculate and report quality measures for physician groups and individual physicians. (The pilot sites were California Cooperative Healthcare Reporting Initiative, Indiana Health Information Exchange, Massachusetts Health Quality Partners, Minnesota Community Measurement, Phoenix Regional Healthcare Value Measurement Initiative, and Wisconsin Collaborative for Healthcare Quality.)
The project aims were: (1) to study the challenges and benefits of aggregating Medicare fee-for-service data with other regional quality data, including commercial payer administrative data and provider-submitted data, to calculate quality measures for ambulatory care; (2) to study the benefits of reporting quality measures to physicians and other providers of care; and (3) to study the benefits of reporting quality measures to beneficiaries.
Findings Pertinent to Measure Selection
The final report provides a rich source of information that can guide the selection of measures and ensuing data collection.44 In particular, CVEs and other collaboratives may find Chapter 2 relevant to their needs in selecting measures. The BQI project applied a version of the measure selection criteria outlined in Question 20 of this Decision Guide, including an iterative process that considered several measure sets authored by different developers.
The six sites first considered the AQA Starter Set and later considered Healthcare Effectiveness Data and Information Set (HEDIS) and other measure sets. Their reasons for excluding individual measures ranged from limited relevance to Medicare beneficiaries (e.g., screening for HIV) to the need for medical chart review, which was outside the scope of this project. Three pilot sites chose to retain locally developed measure specifications for the selected BQI measures. The sites also varied in their use of data sources, in how they defined their target population, and in how they included medical professionals. The BQI report states that they were "encouraged by both the consistency and the variation" in measures as it allowed analysis of the effects of minor differences on measure outcomes.
Findings Pertinent to Data Sources and Attribution
Most of the sites included administrative data from commercial payers, and many included clinical data reported from physicians, hospitals, and other providers. Most sources were electronic but some were from paper records. Some sites had experience with electronic transmission of laboratory data and pharmacy/Medicare Part D data. Generally, their choices were guided by what was available and practical.
Proper attribution of patients (to physician and group) and physicians (to medical group) is critical to ensuring that physician and group scores are calculated and interpreted correctly. Attribution proved to be challenging because physician identifiers (Unique Physician Identification Numbers [UPINs] and Taxpayer Identification Numbers [TINs]) were either not consistently available on all claims or could only link patients to large, corporate groups, rather than clinic sites.
All three methods (UPIN, TIN, and physician group roster) of assigning physicians into groups using claims data were found to have inaccuracies of 10% or more. The project participants concluded that it is preferable to aggregate data at the individual provider level; results can "then be combined using consistent rosters for medical group-level reporting." By contrast, merging data sets at the higher level of medical groups introduced complex errors due to nonstandard assignment of physicians to groups. The report stated, "assigning physicians to groups using TINs is easiest because the TIN is available in all encounter forms while group identifiers and UPINs are not. Methods to allow individual providers to correct their medical group memberships were found to be effective by the BQI pilot sites" (Appendix 6 of the BQI report provides detail).
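The recommended order of operations (aggregate at the individual provider level first, then roll up with one consistent roster) can be sketched as follows. This is a hedged illustration rather than the BQI sites' actual code; the function names, the tuple layout, the physician identifiers, and the group names are all hypothetical.

```python
from collections import defaultdict

def aggregate_by_physician(payer_results):
    """Sum numerators and denominators per physician across data sources.

    payer_results: iterable of (physician_id, numerator, denominator).
    """
    totals = defaultdict(lambda: [0, 0])
    for physician_id, numerator, denominator in payer_results:
        totals[physician_id][0] += numerator
        totals[physician_id][1] += denominator
    return {pid: tuple(pair) for pid, pair in totals.items()}

def roll_up_to_groups(physician_totals, roster):
    """Combine physician totals into group rates using a single roster.

    roster: dict mapping physician_id -> medical group name.
    Physicians missing from the roster are returned for manual review.
    """
    groups = defaultdict(lambda: [0, 0])
    unassigned = []
    for pid, (numerator, denominator) in physician_totals.items():
        group = roster.get(pid)
        if group is None:
            unassigned.append(pid)
            continue
        groups[group][0] += numerator
        groups[group][1] += denominator
    rates = {g: num / den for g, (num, den) in groups.items() if den > 0}
    return rates, sorted(unassigned)

# Hypothetical results from Medicare and one commercial payer
payer_results = [
    ("dr_1", 8, 10), ("dr_1", 12, 20),   # same physician, two payers
    ("dr_2", 5, 10),
    ("dr_3", 9, 10),                     # not on the roster
]
roster = {"dr_1": "Group North", "dr_2": "Group North"}
totals = aggregate_by_physician(payer_results)
rates, unassigned = roll_up_to_groups(totals, roster)
print(rates, unassigned)
```

Keeping the roster as a single, separately maintained mapping means a physician's corrected group membership changes the group-level result without touching the underlying physician-level totals.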
All six pilot sites thought that inclusion of Medicare fee-for-service data gave "a more complete picture of care quality" because Medicare beneficiaries represent a major population segment in all pilot communities. Moreover, increasing the "N" (denominator) for individual physicians helped to stabilize the measure results, which would not have been feasible without the Medicare data. Harmonizing data standards and measure specifications is critical to meeting regional and national comparative reporting needs, and the BQI project contributed to accomplishing that difficult task. In future followup projects, proper attribution of claims to individual physicians or physician practice sites is the critical issue that will need to be addressed.
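The stabilizing effect of a larger denominator can be illustrated with the standard error of a proportion under a simple binomial assumption. The sketch is generic statistics rather than anything taken from the BQI report; the function name and the example rate and denominators are hypothetical.

```python
import math

def standard_error(rate, n):
    """Standard error of a proportion, assuming independent patients."""
    return math.sqrt(rate * (1 - rate) / n)

# Hypothetical physician with a 50% measure rate
commercial_only = standard_error(0.5, 25)    # 25 eligible patients
with_medicare = standard_error(0.5, 100)     # 100 eligible patients
print(commercial_only, with_medicare)
```

Quadrupling the denominator halves the standard error, so a physician's score moves around less from one measurement period to the next.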