Appendix E: Blinded Reviewer Comments (continued, 2)

Health Care Efficiency Measures: Identification, Categorization, and Evaluation

Contents - By Section:
Explanation of Interest in Efficiency Measures
General
Executive Summary
Chapter 1 - Introduction
Chapter 2 - Methods
  Typology
Chapter 3 - Results
Chapter 4 - Assessing Measures
Chapter 5 - Discussion
Appendix
  Editorial Comment
  Which measures are ready for use?
  Are there published measures not included?
  Are there vendor developed measures not included?

Typology

Section | Comments | Response
Section: Typology
Comments: RAND has constructed a basic typology to categorize efficiency measures along three dimensions: perspective, outputs, and inputs.

One suggestion is to consider the relevance of time horizon as a fourth dimension. Measures of short-run efficiency may focus on the relationship between inputs and outputs from a given perspective over a short period of time. A short-run perspective does not question a producer's choice of outputs to produce, nor does it question the efficiency of the technology investments that are made by a producer. However, a long-run perspective may be useful if one believes there may be more or less efficient choices of outputs to include in a product line (i.e. the choice to specialize, the scope of conditions a given producer chooses to treat), and/or if one believes that there may be more or less efficient choices of technologies with which to produce a given output, whether a service or a health outcome.

The issue of time horizon may have differential importance depending on the perspective of measurement. Consumers may care more about short-run measures of efficiency (i.e., measures in which there exist "fixed costs"). Healthcare providers at the point of care (nurses, doctors) may also care more about short-run measures. Healthcare administrators may care about short-run measures, but also long-run measures in which all costs are variable. Intermediaries (health plans, purchasers) may care about both long-run and short-run measures, and Society may likewise care more about long-run measures in terms of societal public health planning (this is not to say, however, that short-run measures would not be important to Society).

Another reason why time horizon might be a useful dimension for distinguishing between efficiency measures is that it imposes additional structure on the consideration of costs. What counts as a relevant cost in a long-run analysis of efficiency may not enter into an analysis of short-run efficiency.
Response: We thank the reviewer for this suggestion of an additional dimension to the typology. We are not in a position at this point to make this revision, but we would be open to this suggestion and others that may arise from a broader audience who will read this after dissemination of the final report.
Section: Typology
Comments: Also, the typology didn't mention the use of readmissions as a modifier or adjuster of length of stay (LOS) or other measures of use. Note that the Leapfrog Group model uses readmissions as an adjuster to ensure that hospitals with shorter lengths of stay but high rates of readmission are not considered efficient. The Leapfrog Group solicited comments and received extensive feedback from the provider community about which measures to use, and the providers supported the use of LOS adjusted for severity and readmissions.
Response: No response necessary.
Section: Typology
Comments: I thought that the proposed typology was quite helpful in terms of providing a definition of efficiency and a "big picture" understanding of efficiency in a far more systematic and thoughtful manner than I had come up with independently. The typology does cover our interests in a very general way. The document somewhat meets the purpose outlined in the title: a thorough list of healthcare efficiency measures is identified and categorized. However, the document seems less successful at evaluating the measures.
Response: No response necessary.
Section: Typology
Comments: I like it. I think it's clear and makes good distinctions between the different levels at which efficiency should be viewed/measured.
Response: No response necessary.
Section: Typology
Comments: I found the revised typology somewhat "frustrating" rather than helpful. While I found the draft report well-researched and written, the emphasis that I read was on "absolute efficiency" measures, rather than the more practical "relative efficiency" measures that are already in use. Taking the typology dimensions one at a time:
  • Perspective: isn't there really just a single perspective of making the system less costly? Why do we care how many units of physician time, nursing time, etc. are used, as long as the dollar-denominated results are at an average level or better?
  • Outputs -- what type of product is being evaluated? Again, the main output needs to be dollar costs, with quality measures used where available. One note -- mention is made that quality measures are "further advanced" than efficiency measures. While there may be more research completed, I have personally found that there is a lack of consensus on quality measures, which means that nothing much gets measured or agreed to. I would state that efficiency measures are further along -- solid software by private vendors that is being used on an everyday basis.
  • Inputs -- what Inputs are used to produce the output? I found this to be a less than productive discussion. Again, why do we care what the Inputs are, as long as the quality and dollar-efficiency output is better?
Response: We have tried to provide a better balance in the review between what one might learn from the peer reviewed literature versus the applications that are being used in practice.
Section: Typology
Comments: I am concerned that the current typology focuses mainly on costs. Although the report mentions incorporating quality metrics into efficiency assessments, the general conclusion is that this is too challenging and is part of a future research agenda. As such, I don't believe this current typology will move us forward towards evaluating the “value” of care delivered across the continuum, and it doesn't target high-leverage crosscutting areas such as longitudinal efficiency and outcomes, care coordination, care transitions, patient engagement, and end-of-life care. This report is a good overview of where we are now with existing “efficiency” measures and proprietary grouper methodologies, and how these can be classified.
Response: You are correct that the main focus is on costs. There are many other domains of performance in the health care system that require different types of measures. As IOM has reminded us, we need to look at all of these domains.
Section: Typology
Comments: I found it easy to understand and it provided a useful framework for considering not only the existing measures, but also measures that are in development (e.g. by NCQA) or measures that could be developed in the future. The typology has face validity and any efficiency measures that I'm aware of could easily fit into this typology.

There were, however, a couple of definitional concepts that seemed a bit inconsistent with my understanding of how efficiency measures are currently constructed by health plans. For example, on page 3 in paragraph 5, the authors state, “The main difference between the various measurement approaches is that ratio-based measures can include only single inputs and outputs…” Perhaps what the authors meant is single units, metrics or types of inputs and outputs. When financial inputs are used to create efficiency ratios, they encompass multiple types of inputs (visits, procedures, hospital days, prescriptions, etc.) expressed in terms of a single unit of measure or metric: their dollar expense. What seems to me to distinguish ratios from the other measurement approaches is that when we convert all of these different units of inputs to a common metric (dollars, in my example), we lose the ability to define the optimal mix of different types of inputs to produce a given output. That makes ratios less useful for improvement purposes.
Response: The text now acknowledges that inputs or outputs may be aggregated into a single input or output, with a ratio then applied.
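Illustration (not part of the reviewer's comment or the authors' response): the point about dollar-denominated ratios can be sketched in a few lines of Python. The unit prices, utilization counts, provider names, and function names below are all hypothetical and chosen only so that two different input mixes collapse to the same ratio.

```python
# Hypothetical unit prices and utilization counts, for illustration only.
UNIT_PRICE = {"visits": 100, "hospital_days": 2_000, "prescriptions": 50}

providers = {
    "X": {"visits": 10, "hospital_days": 1, "prescriptions": 20},
    "Y": {"visits": 20, "hospital_days": 0, "prescriptions": 40},
}
EPISODES = 4  # same output for both providers (assumed)


def dollar_input(mix):
    """Collapse a mix of physical inputs into a single dollar figure."""
    return sum(UNIT_PRICE[kind] * count for kind, count in mix.items())


def cost_per_episode(mix, episodes):
    """Ratio measure: one aggregated input divided by one output."""
    return dollar_input(mix) / episodes


ratios = {name: cost_per_episode(mix, EPISODES) for name, mix in providers.items()}
for name, value in ratios.items():
    print(name, value, providers[name])
```

Under these assumed numbers both providers show the same cost per episode even though one uses a hospital day and the other relies on more visits and prescriptions, so the ratio by itself gives no guidance on which mix to change.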
Section: Typology
Comments: RAND/AHRQ's attempts to define and develop a typology of efficiency measures are commendable. However, the proposed typology continues the current measurement of efficiency in terms of resource consumption and associated costs without accounting for quality. The proposed typology fosters a provider/payer perspective rather than a broader provider/payer/patient perspective of care and is disconnected from the principle of quality improvement and value-based purchasing of care.

Quality and efficiency should not be discussed separately. A good example is avoidable readmission, since it is a fact that reducing complications and readmissions will result in greater economic returns. As discussed in the report, the Medicare program has been using readmission rates as a measure of efficiency. However, defining efficiency solely in terms of the relationship between inputs and outputs excludes avoidable readmissions from being classified as an efficiency measure under the proposed typology.
Response: We have tried to make clearer the role of measures of effectiveness in combination with measures of efficiency and the potential to see these on a continuum based on the choice of output measure.
Section: Typology
Comments: Reaction to revised typology: I found the revised typology much more useful. The addition of perspective is a key improvement. One entity's efficiency is often another entity's decreased income.
Response: No response necessary.
Section: Typology
Comments: The proposed typology makes sense and does provide a way in which to classify efficiency measures: perspective, inputs, and outputs. However, when actually implementing measures, it would be challenging to use this typology to classify the IHA measures in terms of inputs, outputs, etc. The way in which we have gone about doing this is illustrated in the table below. I only included some examples. (Listed as Table B at the end of the reviewers' comments.)
Response: No response necessary.
Section: Chapter 2 - Methods
Comments: P25, paragraph 1 - In third sentence from the end, "publication" should be plural.
Response: This change was made.
Section: Chapter 2 - Methods
Comments: Page 25: What is a “purposive reputational sample approach?”
Response: A sample we chose based on reputation.
Section: Chapter 2 - Methods
Comments: P26, paragraph 2 - 1st bullet should read "…treatment or product"
Response: This change was made.
Section: Chapter 2 - Methods
Comments: P27, paragraph 2 - Refers to a list of 12 potential stakeholders who would be interested in using efficiency measures. I assume this will be Table 1B in Appendix D. It was missing from my copy.
Response: This is included in the final report.

Chapter 3 - Results

Section | Comments | Response
Section: Chapter 3 - Results
Comments: P29, paragraph 3 - Next to last sentence: either delete the word "from" in "...using from data USA data sources" or reword as "using data from USA sources".
Response: We deleted the word "from".
Section: Chapter 3 - Results
Comments: P30, figure 2 - Is the asterisked footnote "*submitted after review of draft report" an orphan? I couldn't find the asterisk it relates to in the flowchart.
Response: This change was made.
Section: Chapter 3 - Results
Comments: P31, paragraph 2 - Reads "...our definition presented above." I believe the definition was presented 19 pages earlier (i.e. on page 12) so this was confusing. It could be changed to "...our definition presented earlier."
Response: This change was made.
Section: Chapter 3 - Results
Comments: Page 31: In the last sentence of the second paragraph, "article" should be "articles."
Response: This change was made.
Section: Chapter 3 - Results
Comments: P32, paragraph 1 - Move this paragraph after Box 1.
Response: This change was made.
Section: Chapter 3 - Results
Comments: Page 32, Box, third paragraph, last sentence: the lack of direct information about why the providers are different is exactly the main problem with efficiency indexes, we believe. Another problem is that the ratio removes the size of the problem. Two practitioners may both have an efficiency of 1.20. All other things being equal, if one practitioner has 100 patients in their panel and the other has 1000, it is much more important to work with the latter. If a practitioner has 10, I would ignore the EI of 1.20 completely as it is unlikely to be accurate, stable, actionable, or worth pursuing by itself.
Response: We added a comment about ratios masking differences in order of magnitude.
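Illustration (not part of the reviewer's comment or the authors' response): the way a ratio hides the size of the problem can be shown with a minimal sketch. It assumes the efficiency index (EI) is observed cost divided by expected cost, and it uses a made-up expected cost of $1,000 per patient; neither the definition nor the dollar figure comes from the report.

```python
def efficiency_index(observed_cost, expected_cost):
    """Observed over expected cost; 1.0 means 'as expected' (assumed definition)."""
    return observed_cost / expected_cost


def excess_cost(observed_cost, expected_cost):
    """Absolute dollars above expectation, which the ratio alone does not show."""
    return observed_cost - expected_cost


EXPECTED_PER_PATIENT = 1_000  # hypothetical figure, illustration only
panels = {"A": 100, "B": 1_000}  # panel sizes taken from the comment above

results = {}
for name, n_patients in panels.items():
    expected = n_patients * EXPECTED_PER_PATIENT
    observed = expected * 12 // 10  # 20% above expected, kept in whole dollars
    results[name] = (
        efficiency_index(observed, expected),
        excess_cost(observed, expected),
    )

for name, (ei, excess) in results.items():
    print(f"practitioner {name}: EI={ei:.2f}, excess cost=${excess:,}")
```

Both practitioners come out with the same index, but under these assumptions the excess dollars for the 1,000-patient panel are ten times those for the 100-patient panel, which is why reporting the absolute figure next to the ratio can help with prioritization.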
Section: Chapter 3 - Results
Comments: Page 33, first paragraph: at the top, another reason for many studies of hospitals might be that they are relatively closed systems where one can measure all the inputs, outputs, and outcomes (in theory). At the bottom of the paragraph, one couldn't count all the physicians, but one could count the visits. Similarly, a general problem we've had in managing our network is understanding how many physicians are full time and how many part time (and how part time they are).
Response: This change was made.
Section: Chapter 3 - Results
Comments: P33, table 5 - In the inputs column of this table, is Financial equated to Productive Efficiency and Physical equated to Technical Efficiency? If so, does Both mean that both technical and productive efficiency were addressed in the article?
Response: Financial and physical are similar to productive and technical efficiency, and "Both" does refer to an article addressing financial and physical efficiency.
Section: Chapter 3 - Results
Comments: P36, table 6 - Add "at the Hospital Level" to the title of table 6.
Response: This table was retitled.
Section: Chapter 3 - Results
Comments: Page 36, table 6: I believe you mean to label this "20 Most Frequent Inputs and Outputs for Hospital Efficiency Measures."
Response: This table was retitled.
Section: Chapter 3 - Results
Comments: Page 36, end of first paragraph: Another difficulty in measuring physician efficiency is that pharmacy use is such a key element and may not be readily available (as in Medicare before Part D).
Response: Added this point.
Section: Chapter 3 - Results
Comments: Page 37, middle paragraph: again, in practice efficiency indexes would be very wide spread and this discussion does not bring that out.
Response: We did not make the change because we were not certain how to interpret the comment.
Section: Chapter 3 - Results
Comments: Page 37, towards bottom: again, would be helpful to have simple examples of SFA, DEA, and EI for physician-oriented measures.
Response: DEA example given on p. 37.
Section: Chapter 3 - Results
Comments: Page 39: On the first line, “measures” should be “measure.”
Response: This change was made.
Section: Chapter 3 - Results
Comments: Page 42, end of first paragraph: Pilot is in the eye of the beholder, of course, but I expect some readers would consider PFP activities to be beyond the pilot level.
Response: Deleted “pilot.”
Section: Chapter 3 - Results
Comments: The discussion of episode-based measures vs. population-based measures is a very important one (p. 42). Have you considered that some measures (e.g., treatment of diabetes or acute MIs) might be better assessed using episodes and others (e.g., flu vaccination and flu treatment/prevention) would be better through population measures? Very few conditions/treatments can be evaluated on the basis of population measurements -- too many people are needed for many of the low-incidence diseases/treatments.
Response: Added at the end of the vendor measures.
Section: Chapter 3 - Results
Comments: Page 42, 3rd paragraph. The category you call “population-based” is more properly called “person-level risk adjusters.” In the 3rd line of the paragraph, you need to fix the episode definition so that either you're defining a single episode, or so that your definition describes episodes (plural).
Response: We continue to use population-based.
Section: Chapter 3 - Results
Comments: P42, paragraph 3 - I would insert the paragraph describing ETGs and the MEGs from page 4 of the Executive Summary after paragraph 3.
Response: This change was made.
Section: Chapter 3 - Results
Comments: P42, paragraph 4 - I would insert the paragraph describing ACGs and CRGs (and potentially DxCGs) from page 5 of the Executive Summary after paragraph 4.
Response: This change was made.
Section: Chapter 3 - Results
Comments: Page 42, last paragraph. You should change “The outputs, either episodes or risk-adjusted populations,….” to “The outputs, either episodes or person years of care,…”
Response: This change was made.
Section: Chapter 3 - Results
Comments: P43, paragraph 2 - The acronyms for ACGs and CRGs have not yet been defined in the body of the report, only in the Executive Summary. If you take my suggestion above, this is moot.
Response: Took suggestion above so moot.
Section: Chapter 3 - Results
Comments: Page 43, last paragraph. Line 4: should be “networks” (plural).
Response: This change was made.
Section: Chapter 3 - Results
Comments: Page 43, last paragraph and Table 8, page 44. You specifically mention ACGs and CRGs, but there are several other person-level risk adjusters that you do not mention. In the last sentence of the paragraph, you state that you don't have information “on efforts to validate and test the reliability of these algorithms specifically as efficiency measures,” but that is precisely what we were doing in rejected paper #136 on page E-29. In that paper, we identify a number of person-level risk adjusters that you don't mention. I'm attaching lists of citations for two of these omitted measures (Burden of Illness may be dead by now, but DCGs are widely used. Actually, DCGs and ACGs were developed at the same time during the early 1980s, both with grants from HCFA to develop Medicare HMO capitation instruments.) In Table 8, you list ETGs and MEGs, but you don't list the Cave episode grouper. I've attached citations for a couple of articles by Doug Cave on his episode grouper.
Response: Added DxCGs and Cave. Changed discussion of reliability/validity testing and cited the article mentioned.
Section: Chapter 3 - Results
Comments: P44, table 8 - There are 3 published articles describing ETGs (see attached list) as well as a detailed descriptive document on the Symmetry website.
http://www.ingenix.com/content/attachments/ETG%206.0%20White%20Paper_01-17-07.pdf
I would also add DxCG to the vendor list. http://www.dxcg.com/
Response: Added DxCG and requested ETG cites.
Section: Chapter 3 - Results
Comments: Note: This section (Sample of Stakeholder's Perspectives) is quite valuable and as such should include a more detailed discussion of the key themes that emerged. Tables, etc., can be put in Appendices to save space. Also, was there any feedback regarding the limitations of existing efficiency measures and how stakeholders are trying to overcome them? This could also inform the research agenda.
Response: Incorporated comments received from stakeholders into research agenda section. We added a discussion summarizing the key themes from the stakeholder perspectives near the front of this section.
Section: Chapter 3 - Results
Comments: P. 45, bottom of page, last line: The “desirable attributes” is an important finding of this qualitative analysis and should be discussed and set up in this section. In the next chapter (p. 52-53) these are described very cursorily as compared to other criteria we are more familiar with. For example, risk adjustment is a major concern amongst physicians and very relevant to assessing efficiency across episodes of care. Also, what is the difference between “criteria” and “attributes” as presented?
Response: We've combined the desirable attributes with criteria for evaluation.
Section: Chapter 3 - Results
Comments: P. 46 - Stakeholder feedback emphasized the importance of composite quality-efficiency measurement. Perhaps an explanation is needed here as to why this approach was not incorporated into the original framing of the typology. Also, are there any examples of stakeholders taking this approach, and of success factors/barriers?
Response: Addressing the quality-efficiency issue elsewhere.
Section: Chapter 3 - Results
Comments: P46, paragraph 3 - Under first bullet, I would mention the quality of encounter data (completeness and accuracy) under capitation payment.
Under second bullet, I would reference the accuracy of service-based costs in encounter data. I don't think the issue for cost calculation is the availability of claims data if complete encounter data are available (previous bullet). I do think that the service level price or payment information is potentially incomplete as more costs are likely to be included outside the fee schedule (since payment is not linked to the fee schedule). Other issues in the second bullet include outlier handling (e.g. trim outlier episodes or truncate their costs?) and whether only the ETGs that are relevant to a given specialty should be included (some specialists also serve as PCPs for some of their patients and have a wide range of ETGs with small numbers of episodes that are unrelated to their primary specialty).

I would add a bullet on defining peer groups for comparison. Cardiology is a good example, where there are diagnostic/consulting cardiologists and interventional cardiologists (assuming cardiothoracic surgery is handled as a separate specialty).
Response: Made these changes.
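Illustration (not part of the reviewer's comment or the authors' response): the two outlier-handling options named in the comment, trimming outlier episodes versus truncating their costs, can be contrasted with a minimal sketch. The episode costs, the $2,000 cap, and the function names are hypothetical; real implementations would typically set thresholds per episode type.

```python
def trim(costs, cap):
    """Drop episodes whose cost exceeds the cap."""
    return [c for c in costs if c <= cap]


def truncate(costs, cap):
    """Keep every episode but limit each cost to the cap."""
    return [min(c, cap) for c in costs]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


episode_costs = [400, 500, 600, 500, 10_000]  # hypothetical, one outlier
CAP = 2_000  # hypothetical threshold

raw_mean = mean(episode_costs)
trimmed_mean = mean(trim(episode_costs, CAP))
truncated_mean = mean(truncate(episode_costs, CAP))

print(raw_mean, trimmed_mean, truncated_mean)
```

With these made-up numbers the raw mean is dominated by the single outlier, trimming removes that episode entirely, and truncation keeps it in the count at a capped cost, so the three averages differ noticeably. Which choice is appropriate depends on whether the outlier is thought to reflect the practitioner's behavior or noise.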
Section: Chapter 3 - Results
Comments: P46, paragraph 5 - There are some more mature initiatives, including the Massachusetts Group Insurance Commission's Clinical Performance Improvement project and the efforts of some individual health plans (e.g. BCBS of Texas, Regence BCBS, United Healthcare's Premium Designation Program, Aetna's Aexcel, etc.)
Response: Added these examples.
Section: Chapter 3 - Results
Comments: Page 46, middle of second paragraph, “There is wide recognition of the importance of developing a composite quality-efficiency metric.” This sounds like an endorsement. Is that your intent? Or do you intend simply to make the observation that “many believe it is important to develop a composite quality-efficiency metric.” We would argue against a single quality-efficiency metric (as we argue against a single composite efficiency metric). A single metric would quickly become a judgmental score without action to connect to quality improvement programs. We believe this to be counterproductive, as outlined briefly above in regards to the LASIK surgery example.
Response: Reworded this to reflect this concern. We are not endorsing a composite measure.
Section: Chapter 3 - Results
Comments: I have attached a revised table including updated information on the IHA efficiency measures found in table 9 of the report. Please feel free to contact me at 415-615-6377 with questions. (Tammy Fisher's revised tables are at the end of the reviewers' comments.)
Response: Revised Tables 9 and 10.
Section: Chapter 3 - Results
Comments: Table 10, page 47. Your comment on IHA is out of date. Since this document is still in draft, you may want to correct it. IHA has selected a vendor (it's MedStat), and they are in the process of planning the Beta testing of their efficiency measures.
Response: Revised Tables 9 and 10.

Current as of April 2008
Internet Citation: Appendix E: Blinded Reviewer Comments (continued, 2): Health Care Efficiency Measures: Identification, Categorization, and Evaluation. April 2008. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/research/findings/final-reports/efficiency/hcemappe3.html