Background Report for the Request for Public Comment on Initial, Recommended Core Set of Children's Healthcare Quality Measures for Voluntary Use by Medicaid and CHIP Programs
The initial core set of Children's Healthcare Quality Measures for Voluntary Use by Medicaid and CHIP Programs was developed using a transparent and evidence-informed process, with broad input from multiple stakeholders. Key components were multiple opportunities for public comment, including a CMS-led listening session for Medicaid and CHIP officials; an AHRQ National Advisory Council Subcommittee that contributed expertise on validity, feasibility, and importance of measures in use; and supportive background work by AHRQ, CMS, and members of the CHIPRA Federal Quality Workgroup.
Creation of the Subcommittee on Children's Healthcare Quality Measures for Medicaid and CHIP Programs
In May 2009, the AHRQ Director approved a Charter creating the Agency for Healthcare Research and Quality's National Advisory Council for Healthcare Research and Quality (AHRQ NAC) Subcommittee on Children's Healthcare Quality Measures for Medicaid and CHIP Programs (SNAC). The AHRQ NAC had agreed to provide advice to AHRQ and CMS to facilitate their work to recommend an initial core set of measures of children's health care quality for Medicaid and CHIP programs. The NAC established the SNAC to provide the requisite expertise and input from the range of stakeholders identified in the CHIPRA legislation (see Appendix A-3 for a list of members).
The SNAC was charged with: (a) providing guidance on criteria for identifying an initial core measurement set; (b) providing guidance on a strategy for gathering additional measures and measure information from State programs and others; and (c) reviewing and applying criteria to a compilation of measures currently in use by Medicaid and CHIP programs to begin to select the initial core measurement set. SNAC recommendations were to be provided to the NAC, which in turn advises the Director of AHRQ.
Nominations for SNAC members to represent the range of stakeholders were sought from CMS and the CHIPRA Federal Quality Workgroup (Appendix A-4). An emphasis was placed on identifying Medicaid and CHIP officials because of their unique role as potential implementers of the initial core set. Although more were invited, four State Medicaid program officials (from Alabama, Minnesota, Missouri, and the District of Columbia) and one State CHIP official were able to participate as SNAC members. Others represented Medicaid, CHIP, and other State programs more generally (i.e., representatives of the National Academy for State Health Policy, National Association of State Medicaid Directors, and the Association of Maternal and Child Health Programs).
Health care provider groups were represented by the American Academy of Family Physicians, American Academy of Pediatrics, American Board of Pediatrics, the National Association of Children's Hospitals and Related Institutions, and the National Association of Pediatric Nurse Practitioners, as well as by a Medicaid health plan representative. The interests of families and children were represented by the March of Dimes. Individual SNAC members provided expertise in children's health care quality measurement, children's health care disparities, tribal health care, dental care, substance abuse and mental health care, adolescent health, and children's health care delivery systems in general. Two members of the NAC also participated in the SNAC.
The SNAC Co-Chairs, Rita Mangione-Smith, MD, MPH, and Jeffrey Schiff, MD, MBA, were selected because of their expertise in children's health care quality measurement and leadership roles in the Medicaid Medical Directors Learning Network, respectively. The SNAC charter expires December 31, 2009.3,4
The SNAC held two public meetings (July 22-23 and September 17-18, 2009) and accomplished a substantial amount of work outside of the meetings in order to help the NAC, AHRQ, CMS, and the Secretary meet the CHIPRA legislative deadline of January 1, 2010. Details are provided later in this section.
Multiple ongoing opportunities for public input were provided as part of this process. In June 2009, AHRQ established a Web site to provide information on the Agency's role in CHIPRA implementation, in close collaboration with CMS, and an E-mail address through which the public could comment on the process. In addition, both SNAC meetings were open to the public and provided opportunities on each day for anyone to make formal public comments. Additional opportunity for public comment came during the July 24, 2009, NAC meeting, at which the SNAC Co-Chairs presented the process used at, and the results of, the July 22-23, 2009, SNAC meeting.5 SNAC Co-Chair Dr. Schiff also arranged a conference call for members of the Medicaid Medical Directors Learning Network (MMDLN) to seek input on the measure identification and recommendation process. Several members of the MMDLN responded by nominating children's health care quality measures in use by their States for consideration for the initial core measure set. Finally, on September 30, 2009, CMS led a listening session for Medicaid and CHIP officials so that they could comment on the initial, recommended core measure set.
Those making public comments through these mechanisms included individual health care practitioners, additional Medicaid and CHIP programs, representatives of industry groups, child and family advocates, and members of the CHIPRA Federal Quality Workgroup. A list of public commenters is included in the Appendix (Appendix A-5).
First SNAC Meeting July 22-23, 2009
The first SNAC meeting was held July 22-23, 2009, in Washington, DC. The meeting was open to the public. This section describes preparation for the first SNAC meeting, the focus of SNAC discussions, presentations to the SNAC, refinements to methodology made during the meeting, and the identification of a preliminary group of measures to further consider for inclusion in the final core set, as well as needs for additional information and work.
AHRQ and CMS staff and the subcommittee Co-Chairs began conferring prior to the first scheduled SNAC meeting. Seventy-seven measures in use by Medicaid and State Children's Health Insurance Program (SCHIP) programs were identified by AHRQ staff with the assistance of CMS, and a process to initially evaluate those measures was agreed upon by AHRQ and CMS.
Prior to the July meeting, the SNAC Co-Chairs, working through AHRQ, provided subcommittee members with standard definitions and criteria recommended for use in evaluating the validity and feasibility of quality measures (Appendix A-6). SNAC members were asked to apply these evaluation criteria to the 77 measures using the RAND Corporation's modified Delphi process.2 Previous work has shown this method of evaluating quality measures to be reliable and to have content, construct, and predictive validity in other applications.3-5
The modified Delphi process involved individual SNAC members scoring the initial identified set of Medicaid and CHIP quality measures for validity and feasibility on a 1- to 9-point scale (with 1 denoting the measure was not valid or feasible and 9 indicating it was definitely valid and feasible). Objective information (e.g., on underlying scientific soundness of the measures) related to both measure validity and feasibility was provided to the extent it was available. However, some measures were scored in this round without adequate identification of numerators, denominators, or measure specifications, even though measure specifications are essential for evaluating feasibility. Instructions to the SNAC for Delphi I noted that scores for validity could be guided by professional consensus when published evidence to support the measure's validity was insufficient.
The RAND modified Delphi method outlines cut-points for passing scores on validity and feasibility. For validity, the median passing score used is more stringent, i.e., 7-9 on the 9-point scale, than the median passing score for feasibility, which requires a median score of 4-9 to pass. The rationale for this difference is that for validity, either the evidence exists to support the measure or it does not, which results in relatively objective information being available to make this assessment. Feasibility is a more subjective assessment than validity. Some Medicaid or CHIP programs may find a measure quite feasible to implement (due to their infrastructure, amount of available funding, etc.), while others will not. For the purposes of the July meeting, measures with a median validity score of 6 or 7 and a median feasibility score of >4 were discussed by the SNAC. Measures with a validity score of 6 or 7 were selected for discussion, as these measures were deemed controversial and in need of further consideration by the group.
Median scores and a display of the distribution of scores across voting members were calculated and prepared for SNAC review by AHRQ staff prior to the July meeting. The median scores summarized the individual scores of SNAC members on these two domains (i.e., validity and feasibility). The median scores and the display of distribution across voting SNAC members were presented at the July SNAC meeting and used to determine whether candidate measures would be discussed further.
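The median-based triage described above can be sketched in Python as follows. This is an illustration only, not the tooling used by AHRQ staff: the function names and the sample scores are hypothetical, and treating a half-point median such as 6.5 as falling inside the "6 or 7" band is an assumption.

```python
from statistics import median


def passes_round_one(validity_scores, feasibility_scores):
    """Apply the passing bands stated in the text to one measure.

    Validity passes with a median of 7-9; feasibility passes with a
    median of 4-9 (both on the 1-9 scale).
    """
    return median(validity_scores) >= 7 and median(feasibility_scores) >= 4


def flagged_for_discussion(validity_scores, feasibility_scores):
    """Flag a measure as 'controversial' for discussion at the meeting.

    Rule from the text: median validity of 6 or 7 and median feasibility > 4.
    Assumption: medians between 6 and 7 (e.g., 6.5) are inside the band.
    """
    return 6 <= median(validity_scores) <= 7 and median(feasibility_scores) > 4


# Hypothetical panel scores for a single measure (one score per voting member).
validity = [4, 5, 6, 6, 6, 7, 7, 8, 9]      # median is 6
feasibility = [5, 5, 6, 6, 7, 7, 7, 8, 9]   # median is 7

needs_discussion = flagged_for_discussion(validity, feasibility)
clearly_passed = passes_round_one(validity, feasibility)
```

Note that, as the text states the rules, a median validity score of exactly 7 satisfies both the passing band and the discussion band; the sketch keeps the two checks separate rather than guessing how such cases were resolved.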
The SNAC spent most of the first day reviewing the criteria for validity and feasibility; identifying criteria for importance; and discussing the measures that were deemed "controversial" after Delphi Round 1, i.e., measures with a median validity score of 6 or 7, median feasibility of >4, and a relatively wide distribution across members, suggesting little consensus among the group. Forty-five of 77 measures met these criteria. On the second day, the SNAC heard presentations by experts commissioned by AHRQ and CMS to provide further input into the overall process.
Additional input and discussion: Presentations to SNAC and the participating public
At the July 22-23, 2009, SNAC meeting, members and the public present at the meeting heard several presentations and engaged in discussions with presenters. Presentations by the AHRQ Director, Carolyn Clancy; CMS's Director of the Center for Medicaid and State Operations (CMSO), Cindy Mann; and the Director of the Division of Evaluation, Quality and Health Outcomes in CMSO, Barbara Dailey, set the stage for the meeting. The AHRQ Director provided the charge to the SNAC, and the CMSO Director expressed a strong desire for the SNAC to recommend a grounded and parsimonious core set that could be implemented voluntarily by State programs, health plans, or provider groups.6,7 Representatives of the National Quality Forum, the National Committee for Quality Assurance, and the Center for Health Care Strategies spoke on the challenges of implementing health care quality measures for children.
In addition, several experts who had been asked to write federally supported white papers on specific aspects of measurement in the legislation presented their early thoughts about their work. These experts addressed the charges to them of conceptualizing and assessing the validity, feasibility, and importance of measures of mental and behavioral health care, family experiences of care, duration of enrollment and coverage, availability of services, and the "most integrated health care setting." AHRQ and CMS also asked that papers be prepared analyzing data sets of the National Academy for State Health Policy, Health Management Associates, and the Child and Adolescent Health Measurement Initiative (CAHMI) database from the 2007 National Survey of Children's Health (NSCH). An additional environmental scan of Medicaid and CHIP Web sites had also been commissioned to identify additional children's health care quality measures that may have been missed in the first effort by AHRQ staff and CMS. Not all authors could participate in the July SNAC meeting. All presentations are included in the transcript of the July meeting.
Refinements to methodology
During the July meeting, the SNAC agreed upon refinements to the methodology to be used for future rounds of the modified Delphi process. Importance was added as a third domain to consider when evaluating potential measures in addition to validity and feasibility. The SNAC worked to establish consensus on the criteria to use to rank the importance of measures under consideration. To be considered important, at least some of the following criteria had to be met by the measure. The criteria are listed in order of decreasing weight as determined through a voting process by SNAC members on July 23, 2009:
- The measure should be actionable. State Medicaid and CHIP programs, managed care plans, and relevant health care organizations should have the ability to improve their performance on the measure with implementation of quality improvement efforts.
- The cost to the Nation for the area of care addressed by the measure should be substantial.
- Health care systems should clearly be accountable for the quality problem assessed by the measure.
- The extent of the quality problem addressed by the measure should be substantial.
- There should be documented variation in performance on the measure.
- The measure should be representative of a class of quality problems, i.e., it should be a "sentinel measure" of quality of care (QOC) provided for preventive care, mental health care, or dental care, etc.
- The measure should assess an aspect of health care where there are known disparities.
- The measure should contribute to a final core set that represents a balanced portfolio of measures and is consistent with the intent of the legislation.
- Improving performance on measures included in the core set should have the potential to transform care for our Nation's children.
Similar to feasibility, the threshold for a passing score on importance was also set at >4 on the 9-point scale, as this was felt to be the most subjective of the three evaluation domains. The SNAC members were asked to score each of the measures that had passed the first round of Delphi scoring for validity and feasibility on the new criterion of importance. AHRQ staff then summarized these scores using the median value. Measures were considered to pass the importance criterion if the median score was >4.
The refinement process further involved reviewing, discussing, and reaching consensus on criteria the SNAC would use to evaluate the validity and feasibility (including reliability) of candidate measures that would be considered for potential inclusion in the recommended core set.
Other steps and decisions
The SNAC's discussion of controversial measures resulted in the recommendation that further information related to measure validity, feasibility and importance would be needed prior to further consideration of these controversial measures. The SNAC asked AHRQ staff to obtain that information.
During their July deliberations, the SNAC also determined that a call for nominations of additional pediatric quality measures in use (either within or outside of the Medicaid and CHIP programs) should be used to identify a larger set of measures to consider for the final core set.
SNAC members expressed a strong desire to recommend a grounded and parsimonious core set of measures that could be implemented voluntarily by State programs, health plans, and provider groups, and agreed on a target number of no more than 25 measures. The SNAC acknowledged that such a core set would be incomplete, but efforts would be made to balance the set to accomplish the legislative goals and the goals articulated in the SNAC discussion of measure importance. The SNAC agreed to bring to the NAC's attention measures not accepted into the core set and aspects of child health for which current measures do not exist.
By the end of the July SNAC meeting, SNAC members had identified a preliminary set of 24 measures that had clearly passed criteria for validity and feasibility in the first round of Delphi scoring and also passed scoring for importance using the criteria agreed to by the SNAC at the July meeting. This preliminary list of measures is available at the AHRQ CHIPRA Web site as part of the SNAC Co-Chairs presentation to the NAC on July 24 (see below).5 The Co-Chairs made clear that this preliminary group of measures would be subject to further research by the AHRQ staff as needed and included in the second round of Delphi scoring prior to the September SNAC meeting. In addition, SNAC members were invited to nominate additional measures for consideration.
First SNAC Report to the NAC
The SNAC Co-Chairs reported to the NAC immediately after the July meeting (on July 24, 2009).5 This presentation included a review of the SNAC-refined criteria for the measure evaluation (validity, feasibility, and importance), as well as the preliminary list of 24 measures passing all three domains after the initial round of Delphi scoring. The SNAC report is available in the form of a slide presentation at https://www.ahrq.gov/policymakers/chipra/overview/background/methods.html.6
Second SNAC meeting September 17-18, 2009
The SNAC held its second meeting on September 17-18, 2009, in Washington, DC. In addition to being open to public participation onsite, the meeting was Webcast, which allowed for greater participation and public comment.
Preparation for the Meeting
Additional Measure Nominations
Shortly after the July meeting, the AHRQ staff in collaboration with the SNAC Co-Chairs developed a measure nomination template. This template was created in order to collect a standardized set of information on all measures nominated for potential inclusion in the core set (see Appendix A-7). The nomination template was made available in early August 2009, and nominations were accepted until August 24, 2009. In addition to measure nominations by SNAC members, public nominators included members of the Medicaid Medical Directors Learning Network, the American Medical Association Physician Consortium for Performance Improvement, the National Partnership for Women and Families, and the Child and Adolescent Health Measurement Initiative on behalf of The Commonwealth Fund. Additional nominations were obtained through E-mail to the AHRQ public comment E-mail address. CHIPRA Federal Quality Workgroup nominations also came from CMS and the Health Resources and Services Administration (HRSA).
In addition to all newly nominated measures, each measure that either (1) passed Delphi round one or (2) was considered controversial by the SNAC during their first meeting in July was entered into the measure template, with required information, by AHRQ staff. Authors of the CHIPRA-commissioned papers also recommended measures for consideration and additional sources of data for quality measurement based on their works in progress. Measures recommended by the contractors included a measure of medical home (for "most integrated health care setting") using items from the Healthcare Effectiveness Data and Information Set (HEDIS) Consumer Assessment of Healthcare Providers and Systems (CAHPS®) surveys, a preliminary measure of availability also using items from the HEDIS CAHPS®, and measures of duration of enrollment based on work done by researchers primarily using Medicaid and CHIP enrollment data. In addition, one of the works in progress focused on the type of data (e.g., race/ethnicity) and measures that could be obtained from the Medicaid Statistical Information System (MSIS) statistics.
At a minimum, nominators were asked to identify the measure numerator and denominator, measure specifications, and current use of the measure. Substantial effort was put into obtaining all of the information requested in the template for every measure under consideration. The nominators entered information into the nomination template. Each template was then supplemented with additional information where necessary by AHRQ staff and the SNAC Co-Chairs. Through this work, a standardized set of information was made available for almost all measures for consideration by the SNAC members during their second round of Delphi scoring. One-page summary sheets that abstracted information from the measure nomination templates were provided for each measure under consideration (see Appendix A-9).
By mid-September 2009, the SNAC had 121 measures to consider during a second modified Delphi process.
Delphi II scoring by the SNAC
Using a second modified Delphi scoring process prior to the September meeting but including the SNAC-identified criteria for importance (Appendix A-8), SNAC members selected 65 of the 121 measures as meeting criteria for validity, feasibility, and importance. As in Delphi I, SNAC members were instructed to use professional consensus on the underlying scientific soundness of the measures in cases of insufficient published evidence.
As at the first SNAC meeting, the SNAC first heard opening remarks from the Directors of AHRQ and CMSO and an overview of the meeting agenda and process.8 Unlike the first meeting, there were no invited presentations (other than during public comment periods on Days 1 and 2). Due to the time constraints and the need to identify for NAC consideration a reasonable core set of measures near the SNAC's target number of 25, the initial plan was to only discuss and consider the 65 measures that passed the second modified Delphi scoring process as candidates for the core set. However, initial discussions at the September 17-18, 2009, SNAC meeting resulted in adding back five measures that did not strictly pass the second Delphi round (i.e., those with high median feasibility and importance scores [>7] and median validity scores of 6 or 6.5 rather than the cutoff of 7) to the list of measures to be discussed and voted on during the meeting. Thus, 70 of the 121 measures scored in Delphi round two were discussed and considered for the core set.
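The way measures reached the September discussion list can be sketched as a small classification function. This is a hedged illustration under stated assumptions, not a record of how scores were actually processed: the function name, status labels, and sample scores are hypothetical; the feasibility pass mark is taken as a median of 4 or higher (the 4-9 band given for the RAND method); and the importance pass mark is a median above 4.

```python
from statistics import median


def delphi_two_status(validity, feasibility, importance):
    """Classify one measure after the second Delphi round.

    Each argument is a list of panel scores on the 1-9 scale.
    """
    v, f, i = median(validity), median(feasibility), median(importance)

    # Strict pass: validity 7-9, feasibility 4-9 (assumed inclusive),
    # importance above 4.
    if v >= 7 and f >= 4 and i > 4:
        return "passed"

    # Add-back rule described in the text: median validity of 6 or 6.5,
    # with median feasibility and importance both above 7.
    if v in (6, 6.5) and f > 7 and i > 7:
        return "added back"

    return "not discussed"


# Hypothetical example: validity falls just short, but feasibility and
# importance are both high, so the measure is added back for discussion.
example = delphi_two_status(
    validity=[6, 6, 7, 7],      # median 6.5
    feasibility=[8, 8, 8, 9],   # median 8
    importance=[7, 8, 8, 9],    # median 8
)
```

Under this reading, the 65 strict passes plus the 5 added-back measures give the 70 measures discussed at the meeting.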
Electronic voting process
Throughout the 1.5-day meeting in September, a method of electronic confidential voting was used extensively by SNAC members. This method was chosen because in small groups some members may dominate a discussion, leading to group decisions that do not reflect the true sense of the group membership.6 Through private electronic voting, the SNAC process was most likely to obtain the candid individual preferences of members, accumulating to a consensus of the SNAC.
Discussion of overlapping measures
On day 1 of the meeting, SNAC members engaged in detailed discussions of measures felt to have substantial overlap. For example, multiple measures pertaining to premature birth passed the criteria for validity, feasibility, and importance, as did multiple dental measures. They also reviewed and prioritized measures based on several characteristics pertaining to legislative and feasibility criteria, including: data source (administrative, medical record, health IT, survey); site of care (primary care, specialty care, inpatient, emergency, mental health, substance abuse, dental); measure type (outcome, process, structural); care continuum (screening, prevention, diagnosis, treatment, care coordination); accountable entity (state program, health plan, provider); child ages to which the measure applied; and availability of data to report disparities.
Elimination of Overlapping Measures, Merging of Some Measures, Voting
After discussions were completed, a series of votes was conducted that resulted in elimination of multiple measures and merging of some measures within a given category. For example, three separate well-child-care visit (WCV) measures that apply to different age groups were combined into one measure for voting purposes. Similarly, multiple measures of premature birth were eliminated, narrowing measures in this area to one measure of low birth weight. Measures in each category (e.g., prevention/health promotion, care of children with chronic disease) were rank-ordered within the category. Lowest scoring measures were eliminated from further consideration. This process resulted in 31 measures for final consideration on the second day of the meeting.
Getting to 25 Measures to Recommend to NAC
On day 2 of the meeting, three rounds of voting were conducted in succession. SNAC members could vote for their top 20 measures out of the 31 that remained. In round one, SNAC members individually voted for their top 10 measures; in round two their next 5 measures; and in round three their final 5 measure choices. Measures voted for in the first round received 3 points per vote, measures voted for in the second round received 2 points per vote, and measures voted for in the third round received 1 point per vote. A priority score was then calculated for each measure representing the total points assigned to that measure by SNAC members after the three rounds of voting. The final rank order of the measures based on priority scores was examined by the SNAC to assess how the acceptance of various cut-points (i.e., 10, 15, 20, 25 total measures) would fulfill the goal of arriving at a grounded, parsimonious, balanced core set of measures. The SNAC voted to recommend the top 25 measures on the list. (Appendix B lists the measures that were discussed during the September SNAC meeting but not included in the SNAC's initial, recommended core measure set, as well as the measures that did not pass the criteria for Delphi II scoring.)
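The priority score arithmetic described above can be sketched as follows. This is a minimal illustration, not the electronic voting system used at the meeting: the ballot structure, function names, and measure labels are hypothetical, and breaking ties alphabetically is an assumption, since the text does not say how ties were handled.

```python
from collections import Counter

# Points per vote in each round, as stated in the text.
ROUND_POINTS = {1: 3, 2: 2, 3: 1}

# Most points one member can distribute: 10 picks x 3, 5 picks x 2, 5 picks x 1.
MAX_POINTS_PER_MEMBER = 10 * 3 + 5 * 2 + 5 * 1


def priority_scores(ballots):
    """Total the points each measure received across all members and rounds.

    ballots: list of dicts, one per member, mapping round number (1, 2, 3)
    to the list of measures that member voted for in that round.
    """
    totals = Counter()
    for ballot in ballots:
        for round_no, picks in ballot.items():
            for measure in picks:
                totals[measure] += ROUND_POINTS[round_no]
    return totals


def rank_measures(totals):
    """Order measures by descending priority score (ties: alphabetical, assumed)."""
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Tiny hypothetical example with two members and far fewer picks than the
# 10/5/5 allowed in the actual rounds.
ballots = [
    {1: ["A", "B"], 2: ["C"], 3: ["D"]},
    {1: ["A", "C"], 2: ["B"], 3: []},
]
ranking = rank_measures(priority_scores(ballots))
top_two = [name for name, _ in ranking[:2]]  # applying a cut-point of 2
```

Examining different cut-points then amounts to slicing the ranked list at 10, 15, 20, or 25 entries and checking the balance of the resulting set.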
The SNAC Co-Chairs delivered a written summary report to the NAC Chair on November 13, 2009.
Additional Consideration of SNAC-Recommended Measures
Several rounds of review prior to posting focused on the SNAC-recommended initial core set. The CHIPRA Federal Quality Workgroup held a conference call during which several questions about the measures were clarified. CMS held a listening session for Medicaid and CHIP officials and other key stakeholders in Medicaid and CHIP health care quality on September 29, 2009, during which comments were made. In addition, participants in the listening session (as well as others on the mailing list who were not able to participate) were invited to send comments to the public comment E-mail address by September 30, so that the comments would be available in time to develop a recommendation to the Secretary. These and other public comments were used to prepare the Results section of this background paper.
Following on these comments, AHRQ and CMS staff examined the SNAC-recommended set and agreed to the following modifications for purposes of public posting: (1) separate the well-child-care visit measures into three separate measures by age group; (2) eliminate from the set the National Committee for Quality Assurance (NCQA) annual dental visit measure; (3) eliminate the measure of suicide risk assessment for children with major depressive disorder; and (4) remove from the set the clinician-group level CAHPS® primary care survey. The annual dental visit measure was removed because two other dental measures were recommended using State/CMS Early and Periodic Screening, Diagnostic, and Treatment (EPSDT) data.9 Suicide risk assessment for children with major depressive disorder was eliminated because of likely feasibility issues; the measure as nominated is not yet in use for children. Similarly, field experience with the clinician-group level CAHPS® primary care survey is limited at this time.10 Although having a measure of family experiences of care at the provider level was seen as important by the SNAC, the cost of an additional survey in tight economic times was an additional concern.