Pay for Performance: A Decision Guide for Purchasers

Phase 2. Contemplation

In this section, we discuss Questions 4-13, which purchasers need to address once they have decided that they will undertake a P4P initiative. They are:

Question 4. Which providers should we target first? Hospitals or physicians? Specialists or primary care providers?
Question 5. For physicians, what are the advantages and disadvantages of targeting individual clinicians versus medical groups? In the case of hospitals, what are the advantages and disadvantages of targeting individual hospitals versus hospital systems?
Question 6. Should provider participation be voluntary or mandatory?
Question 7. Should we use carrots or sticks—bonuses or penalties—or a combination?
Question 8. How should the bonus be structured?
Question 9. Should we use relative or absolute performance thresholds?
Question 10. What are our options for phasing in P4P?
Question 11. Where do we find the money?
Question 12. How much money should we put into performance pay?
Question 13. What measure characteristics make them attractive candidates for inclusion in an initial measure set?

Question 4. Which providers should we target first? Hospitals or physicians? Specialists or primary care providers?

A recent study suggested that the majority of P4P programs now target both primary care physicians (PCPs) and specialists and about 25 percent target hospitals.1 Three key factors help determine which types of providers should be the initial focus of P4P programs:

  • Most significant performance (quality or cost) problems. All else equal, payment incentives should be introduced where the greatest gains may be achieved. Uncovering local quality problems might require claims data analysis but also could be informed by reviewing existing data such as HEDIS data (health plan report cards); the Dartmouth Atlases, which report a variety of utilization, cost, and quality measures by geographic area; and the National Healthcare Quality Report (NHQR),20 which annually tracks nearly 200 measures on a nationwide basis and includes measures of care in a variety of settings. In addition, the online State Snapshots based on the NHQR identify potential areas for quality improvement in every State in the Nation.
  • Share of covered services delivered by different categories of providers. If few covered beneficiaries ever use a type of provider (e.g., rehabilitation facilities), then the value of changing practice patterns may be small.
  • Available performance measures and existing data for each type of provider. A prerequisite for P4P is that there must be valid and reliable performance measures to capture the relevant dimensions of provider behavior and/or patient outcomes. The existence of a set of validated measures is important, not only for the effective design of the payment system but also for securing the support of providers. There has been a great deal of collective investment in quality measurement focused on certain areas, e.g., preventive care. For some specialist physicians and hospital departments, however, there are few accepted measures of clinical quality of care. Structural measures—such as those found in the NCQA’s Physician Practice Connections tool—and patient experience measures—such as CAHPS®—may be applicable to a wide range of physician specialties.

Resources for identifying performance measures that are in use and have been validated include:

  • Joint Commission on Accreditation of Healthcare Organizations.
  • National Quality Measures Clearinghouse.
  • National Quality Forum.
  • National Committee for Quality Assurance.
  • Hospital Quality Alliance.
  • Ambulatory Care Quality Alliance.

In addition, the CAHPS® family of measures offers several validated instruments for measuring patient experience with physicians, medical groups, hospitals, hemodialysis centers, and nursing homes in addition to health plans.

Purchasers should be actively looking to augment these nationally accepted measure sets particularly in many specialty areas. P4P should apply to all high-volume specialists and not purely PCPs. Most of the real cost and quality drivers involve chronic disease processes that are more often managed by specialists.

—Nicholas Bonvicino, M.D., M.B.A., Senior Medical Director
Clinical Network Management, Horizon Blue Cross Blue Shield of New Jersey

Question 5. For physicians, what are the advantages and disadvantages of targeting individual clinicians versus medical groups? In the case of hospitals, what are the advantages and disadvantages of targeting individual hospitals versus hospital systems?

Some payers may have a choice of whether to institute P4P at the level of an individual hospital or physician rather than a hospital system or medical group.iii There are pros and cons of targeting incentives at the individual provider versus the medical group or hospital system.

The advantages of targeting incentives at the individual provider are:

  • Incentive schemes that directly link payment to those responsible for improving care provide stronger motivation than incentives linked to group behavior. If an individual physician is paid a bonus for the quality of care provided to her own patients, she has more opportunity to influence the chances that she receives a bonus than if she is 1 of 10 physicians whose practice patterns are aggregated for bonus determination.
  • Measuring and rewarding performance at the individual physician or hospital level may provide more actionable feedback than relying on more aggregate data and may enhance accountability.

The advantages of targeting incentives at the medical group or hospital system are:

  • Many believe that system failures are the key to quality problems and that system reforms are needed to overcome the problems.21 Moreover, many medical groups and independent practice associations would argue that they exist in large part to improve the coordination and quality of care. Providing incentives to "systems" so they can invest in improvement would be more consistent with this idea than paying individuals. For example, Blue Cross Blue Shield of Alabama and the Dean Health Plan in Wisconsin offer incentives for the adoption of electronic medical records (EMRs). Given the investment required to introduce an EMR, targeting individual physicians with an incentive is less likely to drive behavior than targeting groups. This "bigger is better" notion may not extend to hospital systems, where the evidence suggests that the advantages of larger scale operations are limited.
  • A second reason that rewarding groups of providers may be advantageous relates to errors in the measurement of performance. In some instances, particularly where the clinical process or outcome of interest occurs relatively rarely, there will be random variation in performance (and therefore payment), unrelated to the actions of the provider. With larger numbers of patients, there is less of this type of uncontrollable variation. This problem will be of greater concern for outcome measures generally but can be relevant for process measures that apply to small populations.
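The effect of patient numbers on this random variation can be illustrated with a short calculation. The sketch below is an illustration only and does not come from any program described in this guide. It assumes that each eligible patient independently receives the recommended service with the same underlying probability (a simple binomial model). The function name and the figures (an 80 percent true rate, 30 patients for one physician, 300 patients for a 10-physician group) are hypothetical.

```python
import math


def rate_standard_error(true_rate: float, n_patients: int) -> float:
    """Standard error of an observed compliance rate.

    Assumes a simple binomial model: every patient independently
    receives the service with probability `true_rate`.
    """
    return math.sqrt(true_rate * (1.0 - true_rate) / n_patients)


# Hypothetical figures: same underlying quality, different panel sizes.
individual_se = rate_standard_error(0.80, 30)    # one physician
group_se = rate_standard_error(0.80, 300)        # ten such physicians pooled

# Approximate 95 percent range of chance variation around the true rate.
print(f"Individual physician: +/- {1.96 * individual_se:.1%}")
print(f"Medical group:        +/- {1.96 * group_se:.1%}")
```

Under these assumptions, pooling ten physicians' patients shrinks the chance variation by a factor of about three (the square root of 10), which is why a fixed threshold is less likely to be met or missed by luck alone at the group level.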


iii. Some purchasers may want to consider rewarding clinical teams rather than contracting entities, such as medical groups or independent practice associations, because many chronic care models rely on the concept of a clinical team as the locus of care management. Although such an approach would be more consistent with how care is delivered, it would likely pose challenges for data collection and payment since these entities are not generally recognized for contracting or billing purposes.

Question 6. Should provider participation be voluntary or mandatory?

Many pilot P4P and public reporting programs, such as the CMS/Premier demonstration, are voluntary, which is generally preferred by providers.22 The major advantage of a voluntary program is the relative ease with which it can be implemented because not all providers need be ready and willing to participate. Voluntary programs will be likely to attract those providers who expect to perform well—usually those that are already performing well23—while the poor performers remain on the sideline, which may limit the potential of a voluntary program to improve care among poor or mediocre performers.

Other programs mandate participation in the sense that it becomes a requirement for contracting, such as with most P4P programs implemented broadly (as opposed to pilots) by health plans. The main advantage of a mandatory program is fairness and the ability to promote quality across the market or network. (We note that a mandatory program where P4P takes the form of a bonus may be, in practice, exactly the same as a voluntary program because not all providers will find it worthwhile to respond.)

In practice, the decision of whether to make a program voluntary or mandatory is intertwined with considerations of data availability, the respective clout of providers relative to purchasers in the community, and the basic structure of the P4P program. A mandatory withhold, for example, appears much different from a mandatory bonus, as described above.

Question 7. Should we use carrots or sticks—bonuses or penalties—or a combination?

There is disagreement among researchers and industry leaders on whether threats or rewards are more effective motivators. Some analysts argue that penalties may be more effective motivational tools than bonuses because people view potential losses differently from potential gains.24,25 Although some documented evidence supports this theory, the conclusions are somewhat mixed.25-27 Others argue that providers dislike penalty-based approaches and, when faced with such negative incentives, they "game" the system.28,29

In practice, only a few P4P programs, such as the new general practitioner contract in the United Kingdom (UK) and the CMS/Premier hospital P4P demonstration, incorporate penalties for consistent poor performance. And even these programs plan for only very rare use of the penalties. In the first year of the UK program, almost 90 percent of physicians attained the program’s maximum rewards, to a large extent because performance goals were set very low.30

Similarly, in the CMS/Premier demonstration, CMS agreed that there would be no penalties in the first 2 years and that the penalty threshold for the third year would be set as quality at or below the 10th percentile of performance in the baseline year. Avoiding the penalty requires a relatively low level of quality improvement and all providers have at least 2 years to accomplish this goal.31,32

The impact of these strategies on quality of care is not yet known. Use of penalties to set a floor for performance expectations may prove to be an effective strategy. As overall performance improves, the floor could be moved upward over time.

Question 8. How should the bonus be structured?

The answer to this question in part depends on the overarching aim—to reward high-performing providers versus to encourage improvement.

At least four options in designing a bonus exist (Table 1):

  • Rewarding only those providers that meet or exceed a single threshold of performance.
  • Differentially rewarding providers for achievements along a continuum of performance thresholds.
  • Rewarding providers that meet or exceed a single threshold of performance combined with incentive rewarding of those that improve, regardless of whether they meet the threshold.
  • Rewarding providers in a continuous manner in proportion to their achievement.

The most common approach to P4P is to set a single benchmark level of performance that represents "good" quality and pay a bonus to providers that meet or exceed this threshold. As noted in Table 1, in its first year, the PacifiCare of California Quality Incentive Program rewarded all medical groups that exceeded a single threshold, which was pegged at the previous year’s 75th percentile for each measure. This approach is consistent with a strategy to reward high-quality providers (rather than to improve performance) and has the advantage of simplicity. This approach does not uniformly provide incentives for improvement, however. High-quality providers may receive bonuses without making any improvements, and low-quality providers may find the single threshold too difficult to meet and opt not to engage. Some early empirical evidence on the impact of recently implemented P4P programs supports this understanding.23


Table 1. Four Strategies for Designing a Bonus Structure, With Purchaser Examples

Bonus to providers that meet or exceed single benchmark level of performance, one benchmark for all providers
  • PacifiCare of California Quality Incentive Program, year 1: All medical groups that score above the prior-year 75th percentile of performance in the network receive per member per month bonus.
Graduated or tiered bonus based on more than one level of performance
  • PacifiCare of California Quality Incentive Program, year 2: All medical groups that score between the prior-year 75th and 85th percentile of performance in the network receive 50 percent of the bonus potential; providers scoring above the 85th percentile receive full bonus.
  • Bridges to Excellence Physician Office Link: Physicians receive per patient bonus for meeting a set of standards related to office systems that promote quality care; incremental rewards are associated with higher levels of achievement (basic, intermediate, advanced).
Combination of bonus for meeting threshold and bonus for improvement
  • Premera Blue Cross of Washington State: Rewards clinics based on process and outcome measures of quality (as well as other efficiency- and access-related metrics). Points, which determine each clinic's allocation, are awarded based both on rank among peers and improvement.
Continuous rewards
  • Hudson Health Plan (a Medicaid managed care plan in New York): Pays $200 for every 2-year-old who receives all recommended immunizations on time.

As an alternative, purchasers may wish to consider tiered awards, in which differential incentives are offered to providers at different performance levels, such as 70 percent, 80 percent, and 90 percent compliance. The more thresholds, the greater the likelihood that providers at different levels of quality performance will have an incentive to engage and improve. Again using the PacifiCare example, in the second year, it offers the full bonus to groups whose performance is above the prior year’s 85th percentile level and 50 percent of that amount to groups that perform above the prior year’s 75th percentile but below the 85th percentile.
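A two-tier rule of the kind described in the PacifiCare example can be written as a short function. This is a minimal sketch, not PacifiCare's actual method: the function name, the example cutoff scores, and the treatment of a score exactly equal to a cutoff (counted as not exceeding it) are all assumptions made for illustration.

```python
def tiered_bonus_share(score: float, lower_cutoff: float, upper_cutoff: float) -> float:
    """Share of the maximum bonus earned under a two-tier rule.

    lower_cutoff: prior-year 75th percentile score for the measure.
    upper_cutoff: prior-year 85th percentile score for the measure.
    Assumption: a score must strictly exceed a cutoff to earn that tier.
    """
    if score > upper_cutoff:
        return 1.0   # full bonus
    if score > lower_cutoff:
        return 0.5   # half of the bonus potential
    return 0.0       # no bonus


# Hypothetical cutoffs: 70 percent (75th percentile), 78 percent (85th percentile).
for group_score in (0.65, 0.75, 0.80):
    share = tiered_bonus_share(group_score, lower_cutoff=0.70, upper_cutoff=0.78)
    print(f"Score {group_score:.0%}: {share:.0%} of maximum bonus")
```

Adding further tiers amounts to adding further cutoffs and shares, which gives groups at more points along the distribution a nearby target to aim for.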

Alternatively, purchasers might explicitly tie payment to improvement either in addition to or instead of a benchmark level of attainment. Premera Blue Cross of Washington State rewards clinics based both on their rank among peers and the degree of improvement over the prior year.

For measures that reflect concerns about underuse of effective services (e.g., retinal exams for patients with diabetes), another alternative would be to pay an additional fee for each appropriately managed patient or for each "recommended" service that the purchaser is targeting. Unlike setting a bonus threshold at a single level, under the additional fees-for-service model, physicians always do better financially by bringing more patients into compliance with the standard.

Although the incentive properties of rewarding improvement or using additional fees each time a service is performed are preferable to a single fixed threshold, some may object in principle to rewarding physicians at levels of performance that are below acceptable norms (whatever these are). To accommodate such concerns, purchasers could set a minimum threshold, such as 60 percent adherence to the evidence-based guideline in question, below which physicians are ineligible for any payment.
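The combination of a per-patient fee and a minimum threshold can be sketched as follows. This is an illustration under stated assumptions rather than any purchaser's actual formula: the function name, the $25 fee, and the patient counts are hypothetical, and the 60 percent floor is taken from the example in the text.

```python
def per_patient_bonus(compliant_patients: int,
                      eligible_patients: int,
                      fee_per_patient: float,
                      minimum_rate: float = 0.60) -> float:
    """Additional payment under a per-patient fee with a minimum floor.

    The physician earns `fee_per_patient` for every appropriately
    managed patient, but nothing if the adherence rate falls below
    `minimum_rate`.
    """
    if eligible_patients <= 0:
        return 0.0
    adherence_rate = compliant_patients / eligible_patients
    if adherence_rate < minimum_rate:
        return 0.0
    return compliant_patients * fee_per_patient


# Hypothetical panel of 100 eligible patients and a $25 fee.
for compliant in (50, 70, 71):
    payment = per_patient_bonus(compliant, 100, fee_per_patient=25.0)
    print(f"{compliant} of 100 patients managed appropriately: ${payment:,.2f}")
```

Above the floor, each additional compliant patient raises the payment, so the incentive to improve never disappears. Below the floor, nothing is paid, which addresses the objection to rewarding performance beneath acceptable norms.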

Question 9. Should we use relative or absolute performance thresholds?

Asked another way, should the incentive be structured such that all providers could theoretically receive some reward, or should we structure the program such that there are only a limited number of winners?

In contrast to Question 8, which examines the relationship between performance and payments, Question 9 addresses whether providers compete against one another or are held to some external standard. Many current P4P programs pay bonuses based on the ranking of performance relative to other providers in the network. For example, Anthem Blue Cross Blue Shield of New Hampshire rewards physicians whose performance on clinical quality measures places them in the top two quartiles of the distribution (with larger bonuses for the top quartile). This type of reward structure is sometimes referred to as a tournament.
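A quartile-based tournament of this general kind can be sketched in a few lines. The code below is a simplified illustration, not Anthem's actual method: the function name, the bonus amounts, the provider scores, and the handling of ties (broken by the order in which providers are listed) are assumptions.

```python
from typing import Dict


def tournament_bonuses(scores: Dict[str, float],
                       top_bonus: float,
                       second_bonus: float) -> Dict[str, float]:
    """Assign bonuses by rank: top quartile, second quartile, or none.

    Assumption: ties are broken by the order in which providers appear
    in `scores`, because Python's sort is stable.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    bonuses = {}
    for position, provider in enumerate(ranked):
        fraction_ranked_ahead = position / n
        if fraction_ranked_ahead < 0.25:
            bonuses[provider] = top_bonus
        elif fraction_ranked_ahead < 0.50:
            bonuses[provider] = second_bonus
        else:
            bonuses[provider] = 0.0
    return bonuses


# Hypothetical network of eight physicians and their quality scores.
example_scores = {"A": 0.90, "B": 0.85, "C": 0.80, "D": 0.75,
                  "E": 0.70, "F": 0.65, "G": 0.60, "H": 0.55}
print(tournament_bonuses(example_scores, top_bonus=1000.0, second_bonus=500.0))
```

Because payment depends only on rank, the same physicians would receive the same bonuses if every score in the network rose by an identical amount. The number of winners is fixed in advance regardless of the absolute level of quality.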

Tournaments may be desirable for the following reasons:

  • Relative performance measures can filter out common sources of uncontrolled variation in performance. For example, if a purchaser who wanted to target flu shots for improvement compared an individual physician’s performance in 2004 against 2003, the physician’s quality might appear to have declined in 2004 due to a decrease in vaccination rates, even though these lower rates primarily reflected vaccine shortages over which the physician had no control. However, examining the change in vaccination rates over time nationally or among physicians in the same market would produce a different picture of physician efforts to improve quality.
  • Tournaments provide strong incentives to improve continuously because there is no level at which it is guaranteed that a provider will be ranked sufficiently high to receive a reward.
  • Because not every provider receives a bonus, a tournament costs less than a program that offers the same maximum bonus but allows all providers to earn it.
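The flu shot example in the first point above can be made concrete with simple arithmetic. The sketch below shows one possible way to net out a market-wide shock, namely subtracting the market's change from the physician's change. It is not a method prescribed by this guide, and the function name and vaccination rates are hypothetical.

```python
def change_relative_to_market(provider_prior: float,
                              provider_current: float,
                              market_prior: float,
                              market_current: float) -> float:
    """Provider's change in rate minus the market-wide change.

    A positive value means the provider did better than the market
    trend, even if the provider's own rate fell.
    """
    provider_change = provider_current - provider_prior
    market_change = market_current - market_prior
    return provider_change - market_change


# Hypothetical vaccination rates during a vaccine shortage.
own_change = 0.60 - 0.70    # physician: 70 percent in 2003, 60 percent in 2004
relative = change_relative_to_market(0.70, 0.60,
                                     market_prior=0.68, market_current=0.52)
print(f"Physician's own change:      {own_change:+.0%}")
print(f"Change relative to market:   {relative:+.0%}")
```

Judged against her own prior year, this hypothetical physician appears to have declined. Judged against the market, which fell further because of the shortage, she outperformed her peers.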

Important disadvantages of tournament-style rewards also exist, such as these:

  • Because providers cannot be certain beforehand what level of performance must be achieved to result in a bonus payment, they may judge investments in quality improvement to be unacceptably risky.
  • Providers that have already determined how to deliver good-quality health care along the targeted dimensions will be at an advantage (the same is true with a non-tournament program with a single high threshold). Providers that are ranked low among their peers are less likely to find it worthwhile to strive for these bonuses because of the low likelihood of surpassing the competition. This rewarding of historical investments in quality, although possibly justified, may not yield as much quality improvement across the population as other approaches.
  • When providers know their payment may be determined by relative performance, they may be less willing to engage in one of the most commonly used quality improvement models—the local collaborative in which successful local providers advise and assist less successful ones.

For Questions 8 and 9, purchasers need to decide whether the primary goal of their P4P program is to improve the quality of care delivered by all eligible providers or to begin paying more to high-quality providers than to low-quality providers. These objectives are not incompatible, but some approaches to P4P (in particular, using tournaments or high fixed benchmarks) will favor the latter. Alternatively, the approach of paying additional fees-for-service achieves both goals, since higher performing providers receive more fees but all providers have a reason to improve.

Question 10. What are our options for phasing in P4P?

Most purchasers that have introduced P4P have started in a limited way and expanded over time. Phasing in P4P permits testing of measures before full-scale implementation, gives providers time to gear up for a P4P initiative, and enables purchasers to evaluate the small-scale impact before applying the program to the larger group of providers.

Options for phasing in P4P include the following:

  • Pilot test a payment scheme in a limited geographic area.
  • Focus on specific provider types or clinical areas.
  • Begin with pre-existing, national measure sets and add measures over time.
  • Rely on existing data (most likely billing data) and incorporate additional data as needed over time.
  • Begin with a voluntary system.
  • Begin with private quality reports and introduce incentives over time.
  • Begin with a modest benchmark for performance and raise the standard over time.
  • Begin with requiring or rewarding data collection and reporting and introduce performance incentives over time.

The CMS experience with hospital incentives illustrates one approach to phasing in a P4P effort. CMS introduced a pay-for-reporting program to encourage hospital participation in the Hospital Quality Alliance, in which participating hospitals receive 0.4 percent of their payment update if they publicly report a set of quality measures; non-participating hospitals lose this revenue stream. Because of the large market share represented by Medicare, more than 98 percent of hospitals nationwide report on the set of measures.

Current as of April 2006
Internet Citation: Pay for Performance: A Decision Guide for Purchasers: Phase 2. Contemplation. April 2006. Agency for Healthcare Research and Quality, Rockville, MD.