February 24, 2010: Morning Session (continued)
Dr. Scholle: Sarah Scholle. So I've been struggling with this morning's discussion, because the National Quality Forum (NQF) endorsement criteria I think are very clear. At the National Committee for Quality Assurance (NCQA) we looked at them, and those criteria are very similar to the criteria that our Committee on Performance Measurement uses to determine what measures get into the Healthcare Effectiveness Data and Information Set (HEDIS), and I know the American Medical Association (AMA) Physician Consortium for Performance Improvement (PCPI) has a set of criteria that are very similar, so we've all been trying to work towards the same set of criteria. And when I see some of the topics that you've listed here, to me they're not criteria for a measure as much as information about the measure that you'd like to have—information that would determine whether the measure would be useful in reporting for CHIPRA under the CHIPRA rules.

So for example, the question about race/ethnicity: having information about race and ethnicity, you could ask, as part of your importance criteria, is this a problem more in a minority population, or you could ask people to discuss that as part of establishing the importance of the measure. But you could also say to the grantees in the States that are developing and testing measures that we want to know what it looks like when you use this measure in different populations. And you want to do that because you want to be able to set sample sizes, and you want to know something about the prevalence of the problem and how well this measure is going to work in those different populations. But whether that measure is actually useful—NQF endorses measures for use in a lot of different settings, I think, and so it would be up to different settings to say that's good. So are we advising AHRQ and the Centers for Medicare & Medicaid Services (CMS) about how a measure should perform in different populations, and what information should be collected, so that you can gauge whether this is going to really be useful to you in evaluating the care for minority populations as well as for children with special health care needs?
Dr. Dougherty: Well, I think what we're trying to do is have some consistency among these awardees—so that every awardee, for every measure, does not specify something differently for dealing with unknowns, or for knowing what quantitative criteria should be used to say whether a particular State or health plan or whatever has such a small population of African-American children that you should disregard those data. We're trying to set some bars for consistency, because without that, when CMS gets the data from States using these measures—hopefully the initial core measure set and the improved core measure set—they have no way to know, even if they're not comparing States, whether one State is actually collecting the data in such a way that it's valid. Is that reasonable to say? Maybe somebody else who's more of a measurement expert can help me here. So we want to be able to give guidance to the people.
Now, maybe for race/ethnicity it's not so much a measure development issue as a data collection issue, but that's important. We want some specifics to say, you know, here's the best we can tell you about collecting data, and how you collect data across four particular racial and ethnic groups. And I think Patrick Romano may be able to help us. The Healthcare Cost and Utilization Project (HCUP), with the State inpatient data system, has done that a lot already. I almost took inpatient measures and disparities off. So when they're doing a national estimate, there are only, you know, 22 States where, when they look at how the data are actually being collected and reported, they feel comfortable including the data—say, in the National Healthcare Quality or Disparities Report in this case. And they've done a lot of work on that, and I would imagine that the State health data organizations and the hospitals hopefully know what those criteria are ahead of time. So if they want to participate and know what their disparities rate is on a particular quality issue for inpatient care, they know that they need to collect the data the same way that the other States are doing it. Does that make sense? Maybe somebody else can explain it better.
Dr. Scholle: So I'm still struggling with this. I don't know if it's taking us off track, but in order to evaluate the measures that could be used as the core set in 2013, you need to have consistent information on each of the measures, so you want it to be in a structured format so you can just go down the list and say, here's the importance—and you need to know what those criteria are.
Dr. Dougherty: Not importance. That's different.
Dr. Scholle: Well, if we're getting into the issues of feasibility and rating of disparities and sample sizes, then it's going to be important to know what level of reporting we should have in mind as we're thinking about these criteria. Are we thinking about it at the State level, evaluating State Medicaid and State CHIPRA programs, or are we evaluating it at a practice or physician organization level, or at an individual clinician level? Because you'd have to have separate criteria for risk adjustment—I mean, we want to consider whether risk adjustment would be different for a population versus for an individual clinician, because we know there's a lot of selection bias. So that's where it would help if you could say how these criteria would be used: is this information that you want every measure developer to provide on each measure, and are the measures going to be used to evaluate State Medicaid programs—at the State level?
Dr. Dougherty: Well, I mean, Congress in its wisdom wanted measures and criteria that would go vertically, right? It's the electronic health record (EHR) idea—and Beth, you've certainly talked about this, maybe you can help out here. You collect it once, then you roll it up, and then you roll it up, and then you roll it up again; that's I think the goal. We're not there yet, but I think CMS is still struggling with that question. Certainly the State Medicaid programs may want to compare their health plans, and the health plans will want to compare the providers they hire, and maybe someday Congress or somebody will want to compare across States—even if it's just for confidential use and for the purpose of saying where we need to provide more technical assistance, more quality improvement money; you know, even if it's not a publicly reported thing, which the legislation says it should be eventually. So I think it's not possible to answer that question right now.
Ms. Dailey: I think that's one of the unknowns. What we have struggled with is that States have had various experiences in terms of the majority being in managed care. When we've worked with CHIP we saw similar challenges in terms of whether they are going to be collecting information that is patient-centered versus practice-oriented, or whether they have a fee-for-service program. We have almost no information in that kind of a delivery system. So ultimately I think what we're looking for are measures that can go across delivery systems and ultimately be patient-centered, because we want to evaluate health outcomes. But I think that's more aspirational, and that's where the EHR comes in as a venue for collecting information—again, that's the aspiration we want to reach.
So our struggle now is States are going to be required to submit information to CMS. In order for them to get their information, how are these measures going to be structured so they can collect information from different types of providers? And so that is our struggle in terms of trying to give you guidance. States are going to have their collection methodologies, and then they have to report to us, ultimately we're hoping in a comparative way, but again, that's aspirational. We're not anywhere near at that point. So when we have to give technical assistance to States and tell them what kind of specifications they need to use to collect information, that's where we're really struggling in terms of, okay, what kind of criteria do we give those States?
Dr. Dougherty: Right now, unless it's a structural measure of certain types, I think most of the data come from the individual provider level. So if you're looking for a level here, that may be it—unless people disagree with me. Yes?
Ms. McColm: I'm Denni McColm, I'm with Citizens Memorial and just looking at the 25 core measures, it's like a mix. I see the confusion. It's a mix of things that the provider wouldn't have to report, that would have to be at the State level, like the whole population of patients who had an ED visit. So those are two different things.
Dr. Scholle: Most of the measures—14 out of the 24—are HEDIS measures, and those are health plan population measures that could be used for other populations, but they largely come from claims data or claims augmented by chart review.
Dr. Dougherty: And so that means in a sense yes, they're coming from individual providers' claims? No. Okay. Yes?
Dr. Scholle: They're not intended to represent an individual provider. They are intended to represent the populations served who are members of that health plan.
Ms. McColm: If a patient doesn't have a visit with their pediatrician, what provider are you going to allocate that one to?
Dr. McIntyre: And I think we did talk about this, as far as the discussion about who is accountable—you know, was the State accountable, or were we talking about the provider—and I think we kind of ended up saying we were really looking at State-level accountability, because even though we pulled some of these measures from information related to a provider, at this point we were really trying to figure out what to do from a State entity standpoint and roll it up to the national side. So even if, from a State standpoint, we ended up using one of the measures to look at providers, that was not how we designed it or why we said we wanted to put a particular measure in there. So from my standpoint—and I've been trying to explain this because I've had a lot of people asking, well, what are you trying to do with the CHIPRA measures?—we're looking not just at Medicaid but also at our private insurers and other groups, because we already had them pulled together, to say that we would ultimately like to get some consistency in what we're looking at when it comes down to children's health care quality, and together work to get some improvement. So it's not so much what's happening at the provider level, but what's happening in the State when it comes down to health care quality.
Dr. Dougherty: But there is some trickle-down. If the State has to collect things one way, then they will have the health plans and the fee-for-service providers collect that data the same way.
Ms. Dailey: And of course we have the churning issue with the children in and out, across programs, between private, between Medicaid and CHIP, and that is why we've been struggling in terms of how do we set up the specifications that apply to not only different providers but different delivery systems if a child is in transition.
Dr. Dougherty: And that's high on the priority list for next time, but it's not on—we're not breaking out to figure out how to do that today.
Dr. Brown: Question? Hi, this is Julie Brown from RAND. You keep talking about providing specifications so that these higher, aggregated entities can hold these lower, more granular entities accountable—and when you say "specifications," do you really mean guidance, instructions, specifically "this is what you must collect and how you must collect it"?
Dr. Dougherty: Well, I think you could probably give us an example from the CAHPS work, right?
Dr. Brown: I guess I'm kind of with—I think I'm with Sarah on this. Where I'm struggling is I'm trying to imagine how you would pull a State-level measure—maybe it's, I don't know, immunizations delivered within pediatric practices over the prior 12 months. How would you do that without conducting chart review, and who would conduct the chart review, and which entities would do it—you know, would they visit each individual practice, and how would you make sure everybody is charting it the same way, too?
Dr. Dougherty: Well, I think NCQA has specifications or guidance on that issue.
Dr. Brown: Well, that's the point. When you say "specifications" you're pretty much saying collect this information the way NCQA requires health plans to collect and report this information.
Dr. Dougherty: Except that here we're not starting from the basis of NCQA or CAHPS or whatever. We're saying, okay, in the best of all possible worlds if we wanted to collect the data across all providers or health plans what would be the guidance that we would give. So specifications—I think Rita, can you—is "guidance" another word for "specifications?"
Dr. Mangione-Smith: I mean, I understand the discomfort here because I think, you know, I have certainly felt that also: what exactly do we mean when we say there need to be good, detailed specifications? It's different when it's for a health plan versus, say, a QA Tools measure. The way we specify one of those is vastly different from the way we would specify a HEDIS measure. So I think it is a little bit daunting for us to decide on consistent criteria when we don't know the level at which we want measures to be developed. What's the unit of analysis? Is it States, is it health plans, is it providers, you know? And I think it's very hard with these specifications.
Dr. Brown: It's like you're being asked to weigh something that could be weighed in grams, or in tens of thousands of pounds, and the specification you give for measuring that is really going to vary if you're measuring a mouse or an elephant.
Dr. Scholle: I wonder if—because what I'm hearing is that you're very interested in accountability at the State level, and whether it's across the delivery system. So that's one level of interest. And then the other thing that I'm reading between the lines is that EHRs are important, and that somehow it's going to be there. So for convenience's sake maybe what we need to do is say: if you're doing it at the State accountability level, what would be the issues that are important? If you're doing it through EHRs, what would be the issues that are important at a provider level, because the EHR reporting would be sort of at the physician organization level? And we could maybe—because I think they're going to be different, and that might help to organize our thinking as we go through this. I'm looking to my colleagues who've worked with us, because that's what we've sort of set up—that we're going to be doing specs at both of those levels.
Dr. Dougherty: And I think to get us moving forward that would be a good way to go.
Ms. Fei: My name's Kerri Fei, and I'm from the AMA PCPI, and we've worked closely with NCQA on some measures. Traditionally we have done provider-level measurement, so the majority of measures we have developed are like those used in the Physician Quality Reporting Initiative (PQRI). One concern that I have—maybe you can clear it up for me. It seems like, when we're talking about these criteria, as a measure developer would there maybe be a separate set of criteria we would need to meet for a pediatric measure, versus the criteria we already have set for ourselves based on NQF? Because, I mean, we follow NQF pretty much to the letter. So if, in developing a measure, we meet the NQF criteria, are there going to be separate criteria that come out of here that we would also have to meet? Because that would be concerning.
Dr. Dougherty: Well, I mean, NQF doesn't have criteria for how you collect the data or design a measure so that it could be applied across, say, the primary care provider, the managed behavioral health plan, and the public mental health system. And for a lot of kids that's an issue. Right now those other settings are pretty much excluded. So, for States—and States, correct me if I'm wrong—in Medicaid, kids are going to a lot of different places for care. And the CHIPRA legislation does not say, you know, use the NQF criteria, though certainly there's lots and lots of overlap, but most of the measure development work in this country has focused on the Medicare population or slightly younger adults. There are similar issues for some of the elderly that we haven't grappled with, like the transition between long-term care and home health care, which are similar to some of these kids' issues, but we certainly haven't grappled at a detailed level with the issues for kids and their fragmented health system. Jeff?
Dr. Thompson: Well, I just wanted to comment on that. I mean, as States, we're just in pandemonium right now. And I think you developed these measures at a time when there was a little bit more stability, but I don't know of a State that isn't sort of waxing or waning in not only benefits but eligibility. So I'd really like you to sort of think about that. And unless health care reform comes around, we're probably looking at, you know, another 3 years of instability at the State level. And so on your idea of being sort of aspirational, I think we've got to tone it down. Because I can tell you, on chart review, we're cutting provider rates, and then we're going to ask them to do chart reviews at no extra payment? It's just—it's not going to happen.
Dr. Dougherty: I mean, that's in part why it's voluntary and in part why the CHIPRA legislation said CMS shall provide technical assistance to the States. At the same time we realize this isn't—
Dr. Thompson:—the financial system. We're barely keeping afloat. So I just think we've got to, you know, sort of tone down a little bit from a State perspective. It's just unbelievable what's going on at the State level.
Dr. Dougherty: Yes, I think you're right, and this is definitely an evolutionary process. We are not going to answer every question and have specifics at the end of today. We want to make some progress so that when people are developing measures or enhancing the measures we have now for the future that they have more guidance than they have now.
Dr. Thompson: And I'm not going to say stop, but I'm saying think about things. I know we can't talk about this, but you know, eligibility for the denominator. You know, 6 or more months of stability in fee-for-service or managed care or whatever would help out, rather than a denominator where you're trying to figure out did they switch three or four times during a year, or even a year of sort of eligibility in one plan or the other would make it a little bit more doable. But if you start cutting it too fine I don't think you're going to get the answers you want at CMS.
Dr. Dougherty: Okay. But let's do what we can, for a brighter future at some point. Yes, Mary. Then we're going to break for lunch.
Dr. McIntyre: And I hate to bring this up. I just wanted to give a concrete example—and this is Mary McIntyre, Alabama Medicaid—of trying to take the measures that currently exist. We looked at the immunization measures, and I'm just going to do the 2-year immunization, okay? And in there it talks about on or before the child's second birthday, and you're looking at the individual measures, and then there's the combo measures—you've actually got combo two, combo three. Well, we went in and looked, because we're looking specifically at Medicaid data, and you're looking at the continuous eligibility requirements as the specs identify them, and allowing for a gap and what that gap is, so that basically you end up with a 30-day gap that we can identify and look at, which means 11 months of eligibility. Well, when we did all of that, and I rolled it up—and I actually wrote the numbers down because I couldn't believe the results—we ended up with 3 percent for the combo three, okay? And a lot of that deals with the fact that, when we look at on or before the second birthday, some of those kids get it but it's 30 days after, 60 days after, and then the fact that we do not have the information for any kind of gap, so that between the population number that you start with of 2-year-olds and the number you end up with that's actually in that timeframe, there is some drop-off. So those are things that need to be considered as far as the specifications and the population that you're looking at. And we've done that with several of the other measures, just to see what we could get if we stuck strictly to the specs. So there's modification that really needs to happen in order to make them usable for the population that we're dealing with.
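To make the arithmetic in that example concrete, here is a minimal sketch of the two pieces of logic described above—continuous enrollment with a single allowable 30-day gap, and counting only shots given on or before the second birthday. The field names, dates, and exact gap rule are illustrative assumptions, not the HEDIS specification.

```python
from datetime import date, timedelta

def continuously_enrolled(enrollment_spans, second_birthday, max_gap_days=30):
    """Check enrollment for the 12 months before the second birthday,
    allowing one gap of at most max_gap_days (the 30-day allowance
    mentioned above, which leaves roughly 11 months of eligibility)."""
    window_start = second_birthday - timedelta(days=365)
    # Clip each coverage span to the measurement window, then sort.
    spans = sorted(
        (max(start, window_start), min(end, second_birthday))
        for start, end in enrollment_spans
        if end >= window_start and start <= second_birthday
    )
    if not spans:
        return False
    gaps = [(spans[0][0] - window_start).days]                  # gap before first span
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        gaps.append(max((next_start - prev_end).days - 1, 0))   # gaps between spans
    gaps.append((second_birthday - spans[-1][1]).days)          # gap after last span
    gaps = [g for g in gaps if g > 0]
    return len(gaps) <= 1 and all(g <= max_gap_days for g in gaps)

def in_numerator(shot_dates, second_birthday):
    """Only shots given on or before the second birthday count; a shot
    given 30 or 60 days late drops the child from the numerator."""
    return bool(shot_dates) and all(d <= second_birthday for d in shot_dates)

# Example: one 20-day enrollment gap (allowed), but the last antigen is 45 days late.
second_bday = date(2009, 6, 1)
coverage = [(date(2008, 6, 1), date(2008, 12, 31)), (date(2009, 1, 21), date(2009, 6, 1))]
shots = [date(2008, 8, 15), date(2009, 7, 16)]
print(continuously_enrolled(coverage, second_bday))  # True: single gap within 30 days
print(in_numerator(shots, second_bday))              # False: one shot after the birthday
```

A child like the one in the example stays in the denominator but drops out of the numerator, which is exactly the kind of drop-off behind the 3 percent result described above.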
Dr. Dougherty: Okay. I think lunch is out where the registration area was. Bring it back here, and then we'll ask you to go into breakout groups, take on one of these challenges, and come back with at least one idea for an improved, specific, concrete specification. Thank you.
February 24, 2010: Afternoon Session
Dr. Dougherty: Just a couple of announcements and then I think we're going to do this a little bit differently than we originally planned. We were going to have all of you give the thumb drives back to us, but now we're going to ask either the facilitator or the reporter to read what you did and sort of summarize it, and then we will—we'll synthesize it as we go or later. We have a public comment period at 2:45 which is one of the announcements.
Okay. Just a couple of announcements. Weather—and I'm biased here. I looked at the local weather, and it seems like by tomorrow sometime there's going to be one inch of slushy mix but high wind, and it's supposed to be 36 degrees tonight and in the upper 30s or low 40s tomorrow. Okay, Linnea—you can get it from her, because I'm biased; I want to keep you all here. She has checked the airports and there are no weather cancellations. Okay, hang on to your flash drives, because you're going to report back using those as quickly as we can. I think some of you had some experiences and changed around some of the criteria and, you know, the assignment, of course. We expected that.
A couple of other things. I don't know if there's anybody here who's planning to make a public comment at the public comment period, which is 2:45 to 3:30, but if you are, could you please go and sign up at the registration desk for that? The other thing is that we said we would organize a dinner if people wanted to go out and eat together. I believe we have made a reservation at the Thai Farm restaurant, which is within easy walking distance of the Sheraton Rockville Hotel, if people want to go there.
So I floated a little bit and saw that you were all having an interesting time on this assignment. So let's get some reports back. Let's start with the easy one, medical home, underlying scientific soundness for the medical home. You know, when will we know we have a valid measure in terms of underlying scientific soundness for medical home, what kind of criteria do we need, so forth. So either the facilitator or the reporter—Gareth, you're on.
Dr. Parry: Yes, this was so easy. So, we had some kind of criteria laid out for us. I'll start with the first one which says a quality measure should be considered valid if it meets the following—well, you've seen it I suppose, the following criteria: scientific soundness, adequate scientific evidence, and so on. I think you've all seen it. What we decided to do was we decided to keep that criterion but also to add some new criteria to it, specifically around what was actually—or how we actually defined adequate scientific evidence. And we suggested that adequate scientific evidence be in the following order of importance, the first thing being professional consensus, then existence of one or more published quality improvement (QI) or QI-related studies in a peer-reviewed journal, followed by the existence of evidence-based guidelines. We put it in that order because we thought that professional consensus is likely to exist if actually there are things like published QI studies already in existence, so that's why it's kind of in that order.
Dr. Brown: Can I just make one point? We just thought the content was so new—medical "hominess," as one of our team members described it—that we might need to start with professional consensus, and then over time you could move to the peer-reviewed journal and then to the highest degree of evidence.
Dr. Parry: Then, under the required documentation for criteria to be kept or added, we put in specifics for a medical home—a definition of a medical home as part of the validity piece—and said that consensus was fine in the absence of actual guidelines themselves, realizing that in pediatrics that's probably a necessity. Trying to think on the others—the idea of evidence linking to improved health and avoidance of harm is something that you have to continue to monitor over time to see if there's a linkage, because it's not always clear. And we would look for required documentation as it relates to combinations of different measures—if you're doing a composite, for example; we talked a lot about that—to make sure you would still tie it to the guidelines.
Dr. Loeb: The only thing I would add to what Barb said is we also talked about the notion that a pediatric measure is not an adult measure that's dialed down. Rather, where appropriate, the scientific evidence should be gathered in the pediatric population, and where not, it needs to be thoughtfully reconstituted with the notion that it would be better if it was in fact within the pediatric literature. But, considering there are large gaps, it might be okay at least on a temporary basis to go ahead and use something that is dialed down. But that's not optimal by any means.
Dr. Dougherty: Okay, let's move on to health outcomes. Patrick Romano.
Dr. Romano: Okay, so I think it's fair to say that generally we were in support of these three criteria that were specified as they apply to health outcome measures. We did think that the third one on the list, which is the evidence of a link to improved health or avoidance of harm, maybe ought to come first, because it's really sort of central to deciding whether an outcome measure is a legitimate health outcome. And we talked about some examples, such as looking at missed days of school or looking at hemoglobin A1cs for kids with diabetes. We agreed that evidence of a relationship between process and outcomes is very important, and that we would ask measure developers to present evidence, if it's available, regarding the ability to improve an outcome through specific changes in the health care delivery system, in how health care is organized, or in what specific treatments are provided. And we also agreed that in some cases the evidence would be insufficient and professional consensus should be relied upon. And here we thought that there needed to be more attention, because professional consensus bodies have generally focused on processes and deciding which processes are evidence-based, but they also need to think about which outcomes are evidence-based. So we suggested that there really should be more focus and more effort to develop professional consensus around what are valid outcomes. We mentioned that the U.S. Preventive Services Task Force is sort of the gold standard; other government task forces and professional societies have also been active in this area. But we did lean toward favoring multidisciplinary professional consensus processes, to avoid perhaps undue influence from people who may have a stake in a particular measure or outcome. Does that cover it?
Dr. Dougherty: Okay, anything on documentation?
Dr. Romano: Nothing specifically.
Dr. Dougherty: Okay, thank you. And let's see. We have disparities. I don't know if you chose a facilitator, but Dr. McIntyre was the—
Dr. McIntyre: Well, I was the reporter, and we ended up with several people acting as facilitator, and if it's okay with the group, I was going to go ahead and report. I can't get this thing to open up, so I'm just going to go from what I have written down here. We had a lot of discussion on the whole issue of what the evidence was when it came down to disparities—that there is evidence relating racial and ethnic disparities and socioeconomic disparities to specifically worse outcomes for specific conditions. But the first criterion talks about a causal relationship, and then it talks about the type of measures as far as structure and process, linkage to structure and outcome, process and outcome, and the real discussion then ended up centering around the fact that there was still not a lot of evidence in those areas, but that if we didn't go in and actually look at disparities, we never would have evidence. So there was discussion about looking at this as not being something that was necessarily initially required, until we could get the systems where they need to be in order to get the information. There was some discussion that a lot of the reports that are out there now do not include that information, and that we are not able to get it from the bottom up, whereas we can generate it at a State level but not necessarily with other entities—and I think with managed care plans and some of the hospitals, that was the information that we were given. So we ended up not really throwing any of these out, but looking at the fact that while there is an evidence link showing worse outcomes, the link to improved health and avoidance of harm is really not there at this point.
Where evidence is insufficient, we did talk about professional consensus to support the stated relationship—that that's probably where we were going to have to be at the very beginning for this. But we don't really have that, and we were kind of confused, I'm just going to tell you, about what we were supposed to be doing. We actually went to the other document and started trying to address it from the standpoint of looking specifically at those areas under the validity part, and that's where we spent a lot of time in the beginning, and then we came back to the specific areas under validity and underlying scientific soundness. So the ultimate result is that we don't think we're really there when it comes down to the evidence, but we really think we need to start pulling the information in order to be able to get the evidence. Group, was that what we said?
Ms. McColm: So a clarification of the next round. Are we supposed to be looking at it relative to this criterion, or just this?
Dr. McIntyre: We were, like—and then somebody ran out and said, find Denise.
Dr. Dougherty: But I saw Ernest in there so I walked on by and went to another room.
Dr. McIntyre: Well, Ernest was trying to help us.
Dr. Dougherty: What we've tried to do here—on this sheet—is to pull out the domains, mostly from the NQF descriptions. So the purpose of having the spreadsheet is, if you have time, to see how other people and NQF have more specifically defined these topics. We couldn't replicate the entire spreadsheet and still give you room to say which ones should be in or out.
Dr. McIntyre: So we should be going through this list right here?