Measuring the Quality of Physician Care
Studies have found that consumers are very interested in information about the quality of physician care.[1-2] To meet that demand, researchers, medical societies, and health care experts have made a concerted effort in the past 10 to 15 years to address the many issues encumbering physician quality measurement. As a result, sponsors of quality reports can find a growing set of valid and reliable physician quality measures and look to a number of successful public reporting projects as examples. That said, many of the available measures are primarily suited for and used in quality improvement activities and pay-for-performance initiatives.
Why Measuring Physician Quality Is Challenging
Both technical and political issues have limited the use of physician measures in public reports.
- Technical challenges include:
  - The difficulty of constructing valid measures with data generated from small patient populations.
  - Data sources that are not comprehensive.
  - Information systems that are not standardized.
- Political issues include:
  - The wariness of physician stakeholders.
  - A lack of consensus about appropriate measures and methods for reporting to consumers.
Measuring Groups Versus Individuals
Physician quality measures can be used to evaluate the performance of individual physicians or of groups of physicians that practice together (such as a pediatric group practice). Although consumers have indicated a preference for quality information about individual physicians, most available information on quality is reported at the level of medical groups or practices.
Whether a measure can be regarded as valid—and therefore appropriate for reporting—is a function of the sample size (i.e., the number of patient visits or services), which depends on the prevalence of the health condition in the physician’s patient population.
For that reason, most measures are used to assess physician groups rather than individuals. The advantage of constructing scores at the group level is the availability of a larger patient population. When measuring quality at the medical group level, you can create a sample by combining patient data from each physician in the group.
It is more difficult to produce adequate sample sizes for individual physicians, who do not necessarily have enough patients with the disease or condition addressed by the measure. The minimum number of observations needed to calculate a score for an individual performance measure varies; recommendations range from 30 to 50 patients per physician. However, a larger sample is often necessary, depending on the characteristics of the measure or data source.
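The sample-size reasoning above can be illustrated with a small sketch. It assumes a minimum of 30 patients (the low end of the 30-to-50 range mentioned above) and uses made-up physician names and patient counts; the function name `reportable` is hypothetical, and a real reporting program would set its own threshold per measure.

```python
# Assumed threshold: the low end of the 30-to-50 range cited in the text.
MIN_PATIENTS = 30


def reportable(patient_counts, minimum=MIN_PATIENTS):
    """Return the set of physicians whose eligible-patient count meets the minimum."""
    return {doc for doc, n in patient_counts.items() if n >= minimum}


# Hypothetical counts of patients with the measured condition, per physician
# in one medical group.
counts = {"dr_a": 12, "dr_b": 18, "dr_c": 41}

# Individual level: only physicians with enough patients can be scored.
individually_reportable = reportable(counts)

# Group level: pooling patients across the group's physicians.
group_total = sum(counts.values())
group_reportable = group_total >= MIN_PATIENTS
```

In this illustration only one of the three physicians clears the threshold individually, while the pooled group does, which is the practical reason most measures are reported at the group level.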
Other Technical Issues in Measuring Physician Performance
Measurement experts are working on various methodological issues to advance physician-level data collection and reporting. Resolving the issues listed below is critical to getting the consistent and valid results necessary for public reporting.
- Rules for attributing patients to individual physicians. Attribution rules determine which physicians will be held accountable for the care provided. For example, visit-based attribution assigns a patient according to the number of visits the patient has with each physician; cost-based attribution assigns the patient to the physician responsible for the greatest share of that patient's health care expenditures; and assignment-based attribution uses the primary care physician or specialist formally assigned to the patient. Determining the “best” method is the subject of considerable debate.
- Methods for aggregating data from different sources. You can attain a valid sample for either a group or an individual by collecting patient data for a particular physician or medical group from each health plan (or hospital or nursing home) where the care was provided. The patient data collected from each source is then aggregated to produce the performance score. However, methods for aggregating physician data across health care organizations are still under development. For example, how do you ensure that data about the Dr. Smith who is affiliated with Health Plan A is combined with data for the same Dr. Smith affiliated with Health Plan B? The lack of standardized information systems and coding practices impedes efforts to aggregate data from different sources.
- Methods for creating composite scores. Composite scores combine results across individual measures to create results for a broad topic, such as a particular aspect of care (e.g., prevention) or a condition (e.g., diabetes care). For example, a composite score for diabetes care could include rates for measures relating to HbA1c testing and control, eye exams, LDL-C screening and control, kidney function tests, blood pressure control, foot exams, and smoking status. Results for a broad topic (rather than individual measures) are more understandable to consumers and can also help offset small sample sizes (i.e., small patient populations), because observations are pooled across measures.
- Calculation of benchmarks and assignment of peer groups for comparing physician performance. Peer groups can generally be defined by the patient population (e.g., Medicaid beneficiaries) and the physician specialty. In many situations, however, specialty cannot be determined from credentialed specialty, stated specialty, or board-certified specialty alone.
- Processes for auditing/validating results. Once the results are collected, aggregated, and analyzed, they need to be validated. Approaches include having physicians review their data to confirm accuracy and using third-party auditors.
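Two of the methods listed above, visit-based attribution and composite scoring, can be sketched in a few lines. This is an illustration under stated assumptions, not a recommended method: the function names, the alphabetical tie-break, and all sample data are hypothetical, and the composite shown pools numerators and denominators across measures, which is only one of several possible weighting schemes.

```python
from collections import Counter, defaultdict


def visit_based_attribution(visits):
    """Attribute each patient to the physician with the most visits.

    visits: iterable of (patient_id, physician_id) pairs, one per visit.
    Ties are broken alphabetically by physician ID (an arbitrary assumption).
    """
    per_patient = defaultdict(Counter)
    for patient, physician in visits:
        per_patient[patient][physician] += 1
    return {
        patient: min(tally, key=lambda doc: (-tally[doc], doc))
        for patient, tally in per_patient.items()
    }


def composite_score(measures):
    """Pool results across measures into one rate.

    measures: dict of measure name -> (patients meeting the measure,
    patients eligible). Returns None when nobody is eligible.
    """
    met = sum(num for num, _ in measures.values())
    eligible = sum(den for _, den in measures.values())
    return met / eligible if eligible else None


# Hypothetical visit log: patient p1 saw dr_a twice and dr_b once.
visits = [("p1", "dr_a"), ("p1", "dr_a"), ("p1", "dr_b"), ("p2", "dr_b")]
attribution = visit_based_attribution(visits)

# Hypothetical diabetes-care results for one physician group.
diabetes = {"hba1c_testing": (18, 20), "eye_exam": (9, 15), "foot_exam": (6, 15)}
score = composite_score(diabetes)
```

Here `attribution` assigns p1 to dr_a and p2 to dr_b, and the composite rests on 50 pooled observations rather than the 15 or 20 available for any single measure. Cost-based or assignment-based attribution would replace the visit tally with expenditures or an assignment lookup.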
[1] Hibbard JH, Jewett JJ. What type of quality information do consumers want in a health care report card? Medical Care Research and Review 1996;53:28-47.
[2] Ranganathan M, Hibbard J, Rodday AM, de Brantes F, Conroy K, Rogers WH, Safran DG. Motivating public use of physician-level performance data: an experiment on the effects of message and mode. Medical Care Research and Review 2009 Feb;66:68-81. Originally published online Oct 15, 2008.
Page originally created February 2015