Measuring Coding Intensity in the Medicare Advantage Program
Medicare Advantage (MA) remains an attractive option for Medicare beneficiaries, and enrollment has increased substantially, by 66 percent in the past 6 years. Under Medicare Advantage, health plans are paid a monthly capitated amount that covers all health care services used by enrollees. To reward MA plans that enroll sicker-than-average beneficiaries, the program risk-adjusts payment, paying more to plans that enroll people with greater-than-expected health care needs and less to plans that disproportionately serve beneficiaries with fewer-than-average expected health care needs. Diagnostic information is used to assign a risk score to each beneficiary, and MA plans are paid their bid multiplied by the enrollee’s risk score. This payment system creates incentives for MA plans to find and report as many diagnoses as can be supported by the patient’s condition.
Under fee for service (FFS), providers are paid based on the services provided. There is no need to include multiple diagnoses, as long as the diagnoses included support the services provided. By contrast, the MA payment system provides strong incentives for MA plans to identify and report multiple diagnoses. MA plans may review medical records and can report all diagnoses that are supported in the medical record, including those that were not reported by physicians on any health care claim or encounter record. MA plans can also employ nurses to visit enrollees in their homes to conduct health assessments and report the diagnoses they find.
Diagnostic information in FFS is known to be incomplete. Research published by the Medicare Payment Advisory Commission (MedPAC) in 1998 showed that among Medicare beneficiaries diagnosed with quadriplegia at some point in a calendar year, fully 40 percent of them did not have a diagnosis of quadriplegia appear on any health care claim during the subsequent 12-month period. The rate of persistence for many other chronic diagnoses was similarly low. The lack of completeness of diagnostic information in FFS claims data provides ample opportunity for MA plans to report diagnoses more completely than in FFS, and a number of vendors actively market services that help plans to do so. The MA payment system is based on the assumption that the average enrollee receives a 1.0 risk score; the risk adjustment scores are normalized so that the average FFS beneficiary has a 1.0 risk score. If MA plans are finding and reporting more diagnostic information compared with FFS, then the average risk score in MA would be greater than 1.0. If MA plans are coding more completely than in FFS, then a beneficiary who would receive a 1.0 risk score in FFS, and for whom payment should be at the FFS average, might instead receive a risk score of 1.1 in MA and payment would be 10 percent greater.
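The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the function name and the $800 monthly bid are hypothetical, and the only relationship assumed is the one described above, that payment equals the plan's bid multiplied by the enrollee's risk score.

```python
def monthly_payment(bid: float, risk_score: float) -> float:
    """Risk-adjusted monthly payment: the plan's bid scaled by the risk score."""
    return bid * risk_score


# Hypothetical monthly bid in dollars, chosen only for illustration.
bid = 800.0

# The same beneficiary, scored as in FFS (1.0) and as coded in MA (1.1).
ffs_equivalent = monthly_payment(bid, 1.0)
ma_coded = monthly_payment(bid, 1.1)

# Percentage by which the MA-coded payment exceeds the FFS-equivalent payment.
increase_pct = (ma_coded / ffs_equivalent - 1) * 100
print(f"Payment increase: {increase_pct:.1f} percent")
```

Because payment is proportional to the risk score, the percentage increase does not depend on the bid chosen.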
Because of concerns that incentives to aggressively find and report more diagnoses were raising MA plan payment levels, the Deficit Reduction Act of 2005 directed the Centers for Medicare & Medicaid Services (CMS) to measure and adjust for coding intensity, and in the 2010 payment year, CMS adjusted risk scores by 3.41 percent to reflect anticipated differences between MA and FFS coding. The Affordable Care Act directs CMS to increase the coding intensity adjustment to at least 4.71 percent in 2014, and further increase it to at least 5.71 percent by 2018. The American Taxpayer Relief Act of 2012 further increases the minimum coding intensity adjustment to 4.91 percent in 2014 and 5.91 percent in 2018.
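To give a sense of scale, the sketch below applies these percentages to a single risk score. It assumes the adjustment works as a uniform proportional reduction of every MA risk score, which is a simplification made for illustration; the function name and the starting score of 1.1 are hypothetical.

```python
def apply_coding_adjustment(raw_risk_score: float, adjustment_pct: float) -> float:
    """Reduce a risk score by an across-the-board percentage (assumed mechanism)."""
    return raw_risk_score * (1 - adjustment_pct / 100)


# Percentages taken from the text above; the labels are informal.
adjustments = [
    ("initial CMS adjustment", 3.41),
    ("Affordable Care Act minimum for 2014", 4.71),
    ("American Taxpayer Relief Act minimum for 2014", 4.91),
]

raw_score = 1.1  # hypothetical MA risk score before adjustment

for label, pct in adjustments:
    adjusted = apply_coding_adjustment(raw_score, pct)
    print(f"{label}: {raw_score:.2f} -> {adjusted:.4f}")
```

Under this assumption the same percentage is removed from every plan's scores, regardless of how intensively a given plan codes.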
Study Examines Coding Intensity Effects on Risk Scores
In a recently published study, my colleague Pete Welch and I used recent data as well as improved methods to estimate the effects of coding intensity on MA risk scores, with particular attention to variation over time in coding intensity, variation in coding intensity across plans, and variation in the diagnoses most subject to coding intensity methods. We evaluated whether changes in MA risk scores relative to FFS risk scores are due to coding intensity efforts or to changes in the composition of enrollees by analyzing the contribution of changes in risk scores for four types of beneficiaries: stayers, leavers, joiners and switchers. For purposes of our analysis, stayers are beneficiaries in either MA or FFS for two consecutive years, leavers are beneficiaries (primarily decedents) who were in one sector in the first year but not Medicare eligible in the second year, joiners are beneficiaries who were not eligible for Medicare in the first year (primarily those turning 65) and switchers are beneficiaries who move from FFS to MA or vice versa between two consecutive years. If the contribution of leavers, joiners or switchers differed between MA and FFS, we would have evidence that part of the differential growth in risk scores between MA and FFS results from differences in the enrollment decisions or mortality of beneficiaries, otherwise known as “caseload dynamics.” On the other hand, if risk scores increased more rapidly for MA stayers than for similar FFS stayers, we would have evidence that coding intensity accounts for part of the more rapid growth in MA scores. We used Medicare administrative data from 2004 to 2013 for this study.
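The four beneficiary types can be expressed as a small classification rule. The following is a minimal sketch of the definitions above, not the code used in the study; the function name and the representation of a beneficiary's sector as "MA", "FFS", or None (not Medicare eligible that year) are assumptions made for illustration.

```python
from typing import Optional


def classify(sector_year1: Optional[str], sector_year2: Optional[str]) -> str:
    """Classify a beneficiary by sector ("MA", "FFS", or None) in two consecutive years."""
    if sector_year1 is None and sector_year2 is None:
        raise ValueError("beneficiary was not in Medicare in either year")
    if sector_year1 is None:
        return "joiner"    # not eligible in the first year
    if sector_year2 is None:
        return "leaver"    # in one sector in the first year, not eligible in the second
    if sector_year1 == sector_year2:
        return "stayer"    # same sector in both years
    return "switcher"      # moved from FFS to MA or vice versa


for pair in [("MA", "MA"), ("FFS", None), (None, "MA"), ("FFS", "MA")]:
    print(pair, "->", classify(*pair))
```

Only stayers have a risk score in the same sector in both years, which is why their score growth is the natural place to look for coding intensity.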
Risk scores among MA stayers increased more quickly than risk scores among FFS stayers in every 2-year cohort between 2004 and 2013. The difference between the rate of growth in risk scores for MA and FFS is substantial, with average MA scores increasing one-third more rapidly than FFS scores. On average over the 2004–2013 period, caseload dynamics had virtually no net effect on the difference between MA and FFS in the rate of growth of risk scores; caseload dynamics led to more rapid increases in risk scores in MA in the early part of the period, but to slower increases in the latter part and, on balance, made very little contribution to the differential growth in risk scores.
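The stayer comparison amounts to computing risk score growth separately for MA and FFS stayers and taking the difference. The sketch below shows that calculation on made-up records; the data, the function name, and the choice to measure growth as the change in the mean score are all assumptions for illustration and do not reproduce the study's estimates or methods.

```python
from statistics import mean

# Hypothetical stayer records: (sector, risk score in year 1, risk score in year 2).
stayers = [
    ("MA", 1.00, 1.08),
    ("MA", 0.90, 0.96),
    ("FFS", 1.00, 1.05),
    ("FFS", 1.10, 1.14),
]


def stayer_growth(records, sector):
    """Proportional growth in the mean risk score of stayers in one sector."""
    year1 = mean(y1 for s, y1, _ in records if s == sector)
    year2 = mean(y2 for s, _, y2 in records if s == sector)
    return year2 / year1 - 1


ma_growth = stayer_growth(stayers, "MA")
ffs_growth = stayer_growth(stayers, "FFS")

# Growth among MA stayers in excess of FFS stayers is the portion
# this kind of analysis would attribute to coding intensity.
differential = ma_growth - ffs_growth
print(f"MA: {ma_growth:.2%}, FFS: {ffs_growth:.2%}, differential: {differential:.2%}")
```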
There is a striking amount of heterogeneity across MA plans in the extent to which risk scores for stayers increased from 2004 to 2011: in some plans, risk scores for stayers increased at a rate very similar to the rate in FFS, while in others they increased twice as much as in FFS. Further, increases in diagnostic coding are especially large in a relatively small number of highly discretionary diagnostic categories, including drug/alcohol dependence; major depressive, bipolar, and paranoid disorders; vascular disease; chronic obstructive pulmonary disease; diabetes with renal or peripheral circulatory manifestations; renal failure; and polyneuropathy.
Coding Intensity Increasing in MA Plans but Reasons Not Clear
Based on this analysis, it appears that MA risk scores increased more quickly than FFS scores mainly because of increases in coding intensity (measured as increases in risk scores for stayers), with little of the difference accounted for by changes in enrollment mix. There is little sign of coding intensity slowing; in fact, it may be increasing.
The analysis that Pete Welch and I conducted was unable to determine whether the greater rate of increase of risk scores in MA plans was due to more accurate coding or to fraud. I would not be surprised if there is some fraud involved, because this does occur in many areas of human behavior when a lot of money is at stake, but I suspect that much of the increase in risk scores is a result of health plan efforts to more fully document diagnoses that do exist. Regardless of the mechanism underlying the increase in risk scores, the result is that a beneficiary who would have a risk score of 1.0 in FFS will have a higher risk score, on average, in MA and a much higher score in some MA plans than in other plans.
As discussed earlier, CMS and the Congress have responded to the increase in risk scores over time in several ways. However, across-the-board adjustments do not address the substantial heterogeneity in the plan-level results we have found.
It is challenging to accurately measure the effects of coding intensity on MA risk scores and even more challenging to devise optimal policy responses, particularly in the context of the substantial heterogeneity across plans in the level of coding intensity. My colleagues at CMS continue to work actively and productively on responding to these challenges.
Page originally created October 2014