Chapter IV. Data to Support Work on Disparities

Evaluation of a Learning Collaborative's Process and Effectiveness to Reduce Health Care Disparities Among Minority Populations

To reduce disparities, firms need to know what disparities exist and make changes in response. Of course, concepts become more complex in execution. Available firm data on the race and ethnicity of members are limited, making it difficult to measure member disparities in care processes and outcomes by race or ethnicity. In addition, several firms reported that important gaps exist in understanding disparities. For example, there is a limited evidence base to determine how best to improve care to reduce disparities in outcomes associated with members' racial or ethnic characteristics. There is also little agreement on how best to measure effective reductions in disparities, because absolute change in outcomes and relative change in outcomes for one group versus another may yield different conclusions. Given these constraints, there is a tension between taking the time to develop better measures and understanding of disparities and moving more immediately to implement interventions believed to have some promise in reducing disparities, even if the evidence or ability to measure their effects is limited.
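The point about absolute versus relative change can be illustrated with a small worked example. The rates below are hypothetical and are not drawn from any firm's data; they stand for the share of members in a reference group and a comparison group with an adverse outcome (for example, poorly controlled HbA1c) before and after an intervention. The function and variable names are assumptions made for illustration only.

```python
# Hypothetical adverse-outcome rates (share of members with the outcome).
# These numbers are illustrative assumptions, not results from the Collaborative.
before = {"reference": 0.20, "comparison": 0.30}
after = {"reference": 0.10, "comparison": 0.18}


def absolute_gap(rates):
    """Difference in rates between the comparison and reference groups."""
    return rates["comparison"] - rates["reference"]


def relative_ratio(rates):
    """Ratio of the comparison group's rate to the reference group's rate."""
    return rates["comparison"] / rates["reference"]


gap_before, gap_after = absolute_gap(before), absolute_gap(after)
ratio_before, ratio_after = relative_ratio(before), relative_ratio(after)

# The absolute gap shrinks (about 10 points to about 8 points) ...
print(f"absolute gap: {gap_before:.2f} -> {gap_after:.2f}")
# ... while the relative ratio grows (about 1.5 to about 1.8).
print(f"relative ratio: {ratio_before:.2f} -> {ratio_after:.2f}")
```

In this sketch both groups improve, yet the disparity appears to narrow on the absolute measure and to widen on the relative measure, which is the kind of divergence that makes agreement on a single yardstick difficult.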

Measuring disparities was one of the four main areas the Collaborative sought to address (Figure IV.1). A major focus of Phase I involved RAND working with firms to better estimate race and ethnicity for their members in order to assess disparities in diabetes care (using HEDIS indicators) and potentially other care. The results of this analysis helped inform firm leadership and in some cases formed the basis for intervention. Several firms saw weaknesses in what geocoding and surname analysis provided them; such limitations actually encouraged them to begin collecting their own data on the race and ethnic composition of their members. Few firms shared their HEDIS data on diabetes by race and ethnic subgroup with others.

This chapter provides an overview of findings. We review why capturing racial and ethnic data to measure disparities poses a challenge for firms. We discuss why geocoding and surname analysis were an initial focus of the Collaborative, and what these approaches did and did not accomplish. We then summarize, using the available information, firms' current status in collecting patient-level racial and ethnic data. We conclude with a discussion of the Collaborative's generally unsuccessful effort to motivate firms to report HEDIS measures on diabetic members to each other.

Readers should note that Chapter IV focuses primarily on measuring disparities among firms' commercial members, the focus of the Collaborative. 

A. Summary of Findings

Gathering the necessary data to analyze disparities consumed much of Phase I of the Collaborative. Geocoding and surname analysis took much longer than anticipated and were controversial with sponsors for at least that reason, yet many firms found the results beneficial. Few firms had good data on the racial and ethnic characteristics of their members, but most assumed, because of national research, that disparities existed. The majority of firms involved in the effort at geocoding and surname analysis shared their results with firm leadership and said that the findings elevated the disparities issue within the firm. A few firms were disappointed in the results of geocoding and surname analysis because the technique was not sufficiently robust to provide insight relevant to patterns in their market. (A few also expressed disappointment that the geocoding/surname analysis yielded only proxy data that could not be used to target specific members for specific interventions.) Often, however, firms were able to use the results to some end. Although they were disappointed that the work took as long as it did, firms blamed themselves as much as RAND for delays, and perceived that on balance the process had a favorable benefit/cost ratio.

The Collaborative supported presentations of what leading firms were doing to collect race and ethnicity data directly from their members, but did not do more to directly support some firms' desire for assistance in modifying national policy to make it easier for them to obtain data on the race and ethnicity of their members. This omission was a point of contention among some participants in the Collaborative. Phase II will place more emphasis on primary data collection related to disparities, including efforts to define aspects of the way firms approach this to promote consistency.

The Collaborative did not succeed in getting all or most firms to share their data for common HEDIS measures. Such sharing was very important to sponsors and some support organizations, but firm buy-in appears to have been lacking from the beginning. The experience in the area of common measures highlights the challenges of communication and conflicting goals among participants in the Collaborative.


B. The Challenges in Capturing Racial and Ethnic Data

National policy on whether, how, and what to collect about the racial and ethnic characteristics of the population served by the health care system was still evolving over the period in which the Collaborative proceeded, a fact that shaped the opportunities and challenges faced by firms seeking such data (Appendix B). Firms found it easier to capture racial and ethnic data for the Medicare and Medicaid populations than for their commercial members, because the Centers for Medicare and Medicaid Services (CMS) collected some of these data for Medicare beneficiaries and required states to provide them for Medicaid beneficiaries (Bierman, Lurie, Collins, and Eisenberg 2002; AHIP 2004). Firms sometimes maintained race and ethnicity data for particular subgroups of their commercial members. For example, many firms in the Collaborative structured protocols for disease management programs so that such data were collected as part of a health risk appraisal. However, these data were not necessarily stored in ways that made them accessible across the firm.

Despite isolated efforts to secure direct data on the racial and ethnic composition of their membership, few, if any, national health plans had (or currently have) complete data on the racial and ethnic composition of all or even most of their members. Collecting racial and ethnic data requires both a process for obtaining information and a mechanism for maintaining and sharing the information across the organization.

Most commercial members enroll through employer groups. Some employer groups have racial and ethnic data on their employees and may be willing and legally able to share the information. The data tend to be specific to subscribers, not to others covered by the policy, such as a spouse or children. Further, unless employers require subscribers to re-enroll affirmatively each year, new requests for information will generate data only for those filing that year—those new to coverage, those changing family status, or those switching plans. Given the difficulties in reaching agreements with a broad range of purchasers, some participating firms with an interest in disparities have started by obtaining data on their own employees.

Firms can obtain racial and ethnic data by asking members directly, although they must comply with state-level legal restrictions or approval requirements. After member enrollment, the collection of racial/ethnic data is subject to fewer legal constraints, especially if the response is voluntary. Another alternative, especially for firms with strong linkages with providers, is to collect such data at points of service and possibly incorporate them as part of an electronic medical record. Most firms sponsoring health plans, however, do not have such strong linkages to providers. Regardless of their strategy for obtaining data, all firms must meet federal and other requirements that provide appropriate safeguards related to privacy and other concerns. Collaborative firms have found that even when they decide to collect data, there are no perfect strategies for doing so; despite the best intentions, progress is slow.

Firms also face challenges in maintaining and manipulating racial and ethnic data, especially if their systems were not initially designed to support such work. Unless the firm's IT platform has one or more fields for entering data on race and ethnicity, appropriate fields must be added, a process that is typically costly and time-consuming; in fact, such an addition may not be possible if the vendor of an old system no longer maintains it, as one firm found. In addition, there may be more than one IT platform in place across a firm and its affiliates, thereby limiting the pooling of data and access to it. Provider networks are complex; consequently, only a small share of affiliated providers may have racial/ethnic data or be willing to share the information. Willing providers may have IT platforms that are incompatible with those of the firm. Such inconsistencies occur even if the firm has providers integrated with the health plan. Many firms sponsoring health plans were themselves formed from mergers spanning several companies over several years. Each legacy firm may bring its own IT platform. In many cases where integration is a goal, the process is ongoing.

At the Collaborative's inception, only a few participating firms had begun to collect data on the race and ethnicity of all of their members, with a few others planning to do so. Aetna had already started to collect members' race and ethnicity, which helped motivate other firms' interest in the Collaborative. Another regional plan was beginning to collect data, and two firms had policies in place that supported such data collection but found implementation challenging, in part because of competing demands. Of participating firms, only the sole Medicaid dominant plan in the Collaborative had such data for its entire membership—and that was because it could obtain this information from state agencies.

Recognizing that capturing racial/ethnic data would take time, RAND offered to work with interested firms in the early days of the Collaborative to apply geocoding and surname analysis to give participants a preliminary understanding of any disparities in their firm. RAND staff hoped that doing so would reinforce firms' perception that disparities were a problem warranting their attention, and motivate efforts to reduce disparities. Geocoding/surname analysis was also a technique in which RAND's staff were personally interested and experienced (Fremont and Lurie 2004).


C. Experience with Geocoding and Surname Analysis

1. Geocoding and Surname Analysis, and RAND's Approach

The goal of geocoding and surname analysis is to allow firms to generate estimates specific to the race and ethnicity of their members. The estimation technique assumes that firms already have the outcome data of interest for the population—such as membership-based HEDIS measures for diabetes—and lack mainly descriptor information on the racial and ethnic characteristics of members for whom outcomes are reported. In short, geocoding and surname analysis use proxy information that is known for members to estimate racial and ethnic characteristics. These data are then linked to outcome measures, such as HEDIS. HEDIS measures are more likely to be captured for HMOs than for other products because quality improvement goals, measures, and requirements are more developed there than elsewhere. Disparities are thus easier to measure in HMOs and other products that employ such measures.

RAND staff explained that most of the agreements with firms were structured such that firms provided individual surnames and physical addresses for relevant members: specifically, those with diabetes (the Collaborative's target population) and others of interest to the firm. Firms, rather than RAND, defined whom to include in the population of interest. RAND staff then analyzed surnames to identify Latinos and Asians, and converted member addresses to census block groups (of around 1,000 people). RAND next examined data on the census block group of residence for members not classified as Latinos or Asians through surname analysis. While geocoding lends itself to several approaches, RAND's technique for the Collaborative coded as African American those individuals who reside in census block groups with a population that is more than two-thirds African American, and coded all others as white or other.8 Based on its geocoding and surname analyses, RAND classified members into one of four mutually exclusive categories: African American, Asian, Hispanic, or white/other.
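The classification sequence described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not RAND's implementation: the surname sets, the function name, and the order in which Hispanic and Asian surnames are checked are all hypothetical, and the sketch assumes the share of African American residents in each member's census block group has already been looked up. Only the two-thirds threshold and the four output categories come from the text.

```python
from typing import Optional, Set

# Threshold stated in the text: more than two-thirds African American.
AFRICAN_AMERICAN_SHARE_THRESHOLD = 2 / 3


def classify_member(
    surname: str,
    block_group_african_american_share: Optional[float],
    hispanic_surnames: Set[str],
    asian_surnames: Set[str],
) -> str:
    """Assign one of four mutually exclusive proxy categories.

    Surname analysis is applied first; geocoding is applied only to members
    not classified by surname. The surname sets are hypothetical inputs.
    """
    name = surname.strip().upper()
    if name in hispanic_surnames:
        return "Hispanic"
    if name in asian_surnames:
        return "Asian"
    share = block_group_african_american_share
    if share is not None and share > AFRICAN_AMERICAN_SHARE_THRESHOLD:
        return "African American"
    # Everyone else, including members whose address could not be geocoded.
    return "white/other"


# Tiny hypothetical surname lists, for illustration only.
HISPANIC = {"GARCIA", "RODRIGUEZ"}
ASIAN = {"NGUYEN", "KIM"}

print(classify_member("Garcia", 0.90, HISPANIC, ASIAN))  # Hispanic
print(classify_member("Smith", 0.80, HISPANIC, ASIAN))   # African American
print(classify_member("Smith", 0.40, HISPANIC, ASIAN))   # white/other
```

Because the geographic step assigns a category from neighborhood composition rather than from the individual, the output is a proxy and will misclassify some members, which is consistent with the limitations firms reported.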

RAND returned the identifying information to the firm with its racial/ethnic code. In most cases, RAND did not have access to firm HEDIS data, as firms were sensitive about releasing such information. With the information from RAND, firms were to construct HEDIS diabetes indicators for the relevant population. HEDIS includes four process measures for diabetes (HbA1C monitoring, lipid profile, diabetic eye examination, and urine protein) and two outcome measures (HbA1c level controlled and lipid level controlled). Some firms calculated the subset of HEDIS measures that could be computed with administrative data without chart audits, since measures requiring chart audit can be expensive. Some firms provided information on a broader set of members that went beyond just those in the commercial market with diabetes and used the information to develop a broader set of measures about disparities.
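As a rough sketch of the step firms then carried out, the following shows how a proxy racial/ethnic code might be linked to a process measure to produce group-level rates. The record layout, the field names (`group`, `hba1c_tested`), and the sample records are hypothetical; actual HEDIS specifications define eligibility and numerators in far more detail than is shown here.

```python
from collections import defaultdict


def rates_by_group(members, measure):
    """Share of members in each proxy group who met the given measure."""
    met = defaultdict(int)
    total = defaultdict(int)
    for member in members:
        group = member["group"]
        total[group] += 1
        if member[measure]:
            met[group] += 1
    return {group: met[group] / total[group] for group in total}


# Hypothetical member records: proxy code plus one process-measure flag.
members = [
    {"group": "African American", "hba1c_tested": True},
    {"group": "African American", "hba1c_tested": False},
    {"group": "Hispanic", "hba1c_tested": True},
    {"group": "white/other", "hba1c_tested": True},
    {"group": "white/other", "hba1c_tested": True},
]

rates = rates_by_group(members, "hba1c_tested")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%}")
```

Keeping the linkage inside the firm, as in this sketch, matches the arrangement described above in which RAND supplied only the codes and firms retained their HEDIS data.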

For firms that were willing to share HEDIS data, RAND could do more to help them with analysis. For a few firms that expressed interest, RAND incorporated the data into a mapping tool to help firms visually analyze variations in HEDIS outcomes across geographic areas with diverse racial and ethnic characteristics. Based on firm experience in the first round of estimation, some firms contracted with RAND to provide specialized support whereas others either had or built such capacity internally or rejected the geocoding/surname analysis approach entirely. To the best of our knowledge, RAND has not developed a report documenting the work of the Collaborative on geocoding and surname analysis—perhaps because of firm agreements and sensitivities about public reports on their internal processes and data, or other reasons. As a result, information about this process comes from firm presentations to the Collaborative or interviews conducted for the evaluation.

RAND staff members indicated that geocoding works best in highly homogeneous areas—with high concentrations of members in particular racial and ethnic groups—although they believe that it also can be used effectively elsewhere, particularly with recent refinements. Given that geocoding is based on geography rather than on the individual, the technique is best suited for comparing HEDIS or other outcome measures across geographic areas that are known to vary in racial/ethnic composition. Firms can map areas to visually display the diversity therein and identify priorities for interventions. Mapping by geographic coordinates also allows firms to merge many other kinds of data available geographically. The geocoded/surname analyzed data are typically less useful as longitudinal measures of outcomes for person-specific interventions because of the assumptions used in constructing racial and ethnic identifiers using geocoding and surname techniques.

2. Firm Experience with Geocoding and Surname Analysis

At the July 8, 2004 meeting, RAND proposed to work with firms to support the analysis of racial and ethnic disparities by using geocoded and surname analyzed data; RAND then formed a workgroup of interested firms. All firms in the Collaborative participated in the geocoding and surname analysis process except one that already had race/ethnic data for all its members and a second that was actively engaged in capturing such data nationwide.9 Originally intended to provide analysis that could be used in the first Collaborative meeting in September 2004, the work took much longer to complete (as discussed later). The delay reflects an often considerable underestimate of the time required to establish the necessary legal agreements with firms to share data and to have the firms' information systems generate the member data upon which racial/ethnic proxies are based. 

All seven of the firms originally participating in geocoding/surname analysis ultimately received data with geocodes and surname identifiers for at least one time period and had an opportunity to use the data to develop measures of disparities. (An eighth firm recently began talking with RAND about developing such analysis.) In our round two interviews, we discussed the experience with geocoding and surname analysis with staff from each firm involved in the effort. The interviews varied in specificity and did not allow us to describe firms' geocoding experiences in detail with any rigor or consistency.10 They do, however, provide a good indication of the range of firm experiences with the process (Table IV.1). Since that time, some of the firms have continued their geocoding and surname analysis work and several have become more involved in the use of mapping techniques for visual display and analysis of data by neighborhoods and other areas.

Focus of Work. Firms varied markedly in the content and scope of the data they provided to RAND for geocoding and surname analysis. While some firms restricted their scope to diabetes, others went beyond this and included events such as acute myocardial infarction (AMI). One regional firm included all of its adult commercial members in the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) sample frame for its dominant state, along with a subgroup of Medicare enrollees and a targeted group of Medicaid patients. This firm and a few others solicited support for several years of measures; others appear to have limited their focus to a single year. Firms structured their requests to match their needs. For example, one firm excluded members for whom it already had racial/ethnic data, and another used the rules it applies in defining all those categorically eligible for disease management. Many firms included in their request only a subset of their plans or geographic regions so that they could limit burden, address divergent interests among their affiliates, or handle any inconsistencies in IT platforms. Regardless of the variation, the total number of lives that appear to have been included in the exercise is impressive, as is the resulting potential (provided the technique works) to understand disparities by race and ethnicity within firms.

Analytic Sophistication. The geocoding and surname analysis process was structured in such a way that its value depended at least partly on what firms did with the data they received. Firms varied in the analytic skills and resources available to support the analysis and in their preferences for support. At least half the firms had some experience with geocoding, typically for African Americans. A few of these firms preferred their own geocoding techniques for designating race to those used by RAND. Analysts in one firm, for example, relied on RAND only for surname analysis and used the firm's own probabilistic techniques to assign racial codes.11 Another firm favored the same approach. Some firms did extensive analysis with the data. At least two firms examined the relative role of race and socioeconomic status in contributing to disparities and their differential effects on diabetes process measures versus outcomes measures. The firms used the results to develop a better understanding of disparities and the approaches most likely to be effective in designing interventions. Firms that could not access sufficient analytic support did far less analysis. For example, one large firm was limited in the programming resources available for geocoding-related analysis and found its progress substantially delayed. It had to purchase additional help from outside vendors for tasks other firms could easily handle in-house.

Perceptions of RAND Support. Those involved with the geocoding and surname analysis project generally expressed satisfaction with RAND's support. They felt that RAND staff met their expectations and that the help was valuable. They also reported that the exercise was not very burdensome. The main substantive disappointment we heard from a few particularly sophisticated firms focused on the fact that RAND staff did not provide more specific technical guidance, such as how to judge the substantive rather than statistical significance of a disparity. One firm perceived the support to focus more on the rigor required for research than the firm's needs. Otherwise, the main limitation, as noted, related to the delays associated with establishing the necessary administrative agreements with RAND to support the geocoding and surname analysis work. Firms typically attributed delays equally to RAND and their own administration. Delays were likely inevitable as firms sought to satisfy the Health Insurance Portability and Accountability Act's (HIPAA) privacy and other concerns. However, some reports suggest to us that management and administrative staff at the participating firms and at RAND could have been more nimble in moving the process forward.

3. Ultimate Value and Use of Geocoding and Surname Analysis

While firms varied in how valid they considered the results of geocoding and surname analysis for their markets, they generally said that they benefited from their involvement in the process. They perceived a positive benefit/cost ratio or provided examples suggesting as much.

Perceived Value. Most firms involved in geocoding and surname analysis stated that, despite the limitations of the resulting data, the technique was sufficiently robust to support the intended uses of the data. The firms shared their results with firm leaders. In some cases, the results provided new and valuable insights that helped firms better conceptualize the issues behind disparities. In others, the findings confirmed what firms already knew, reinforcing the importance of work in the disparities area, particularly among non-clinical staff who might need more convincing. Most firms reported that the analyses revealed some disparities. A few were pleased that disparities were less extensive than they thought or than in the general population. Firms also found value in analyses showing specific geographic areas that were more or less problematic on different measures. Firms using mapping found it valuable in graphically illustrating disparities for internal discussion.

Two firms and some staff in a third firm found the geocoding results disappointing. In one firm, the estimated proportion of African Americans based on geocoding was substantially below what the firm derived from patients with self-reported data; as a result, firm staff did not use the geocoded data. Another firm, perhaps unrealistically, had not realized that the analysis would be less useful in supporting member-specific rather than geographically targeted interventions. In this firm and another with a geographically diverse service area, staff in certain regions felt that the geocoding technique was not well suited to their market. They explained that the disappointing analyses stemmed from markets with very heterogeneous residence patterns by race/ethnicity. Most commonly, geocoded results were at issue. Some firms had only limited diversity in their membership; therefore, if the strategy for a particular subgroup did not work, the exercise had no other value. Firms with particularly diverse enrollments were also disappointed if the technique did not yield the sensitivity to isolate desired subgroups. (As mentioned before, RAND perceives that recent refinements to the methods address some of these concerns.)

Applications of the Analysis. For most firms—whether or not they found the results compelling—involvement in geocoding and surname analysis proved valuable. By our round two interviews, two firms had already used the data to formulate pilot projects, and several more were in the process of doing so. Others said that they planned to use the information to help them further identify needs and areas to target. One of the firms that found the results invalid used its failure as a vehicle for reinforcing its decision to capture primary data on member race and ethnicity; respondents from two other firms similarly commented that limitations in geocoding and surname analysis solidified firm commitment to primary race and ethnicity data collection. Another firm had not yet found the data useful, but it reported that the process enhanced communication among midlevel staff responsible for such analyses, leading to an ad hoc group that is encouraging further firm investment in analyzing disparities and designing pilot interventions. This firm said that improved communication and the willingness to consider allocating more resources to disparities work were a direct result of participation in the Collaborative.

Future Plans for Geocoding/Surname Analysis. The Collaborative will not support firms in their individual efforts at geocoding and surname analysis during Phase II. However, of the firms that used these techniques in Phase I, over half have plans to continue the analysis, in some form. RAND staff indicated that at least half of the firms decided to use the mapping tool that RAND developed, one firm based on its own earlier experience and the others after another firm that used the tool during Phase I gave a presentation of their results at the June 2006 meeting. The lead contact from another firm indicated that they already had a similar mapping tool, but would be interested in continuing to do geocoding/surname analysis if the financial burden of doing so were minimal. One other firm generally lagged behind the others in this work during Phase I, due to internal reorganization, but has plans to continue geocoding and surname analysis with RAND under a separate contract, unassociated with their commitment to the Collaborative. This firm has hired an analyst to help it gain internal capacity to study disparities and hopes to use the RAND contract for training and other help getting started. Although, as discussed later, all but one of the firms have begun or have plans to begin primary race and ethnicity data collection, putting such systems in place takes time; current and continued work around geocoding and surname analysis holds appeal in that it allows firms to begin to address disparities in their minority populations, while developing longer-term systems to collect and maintain race and ethnicity directly from members. However, one of the firms that used geocoding and surname analysis extensively in the past has not expressed interest in continuing it in the future.

A potential issue for firms involves how to transition from building their geocoding and surname analysis using the support provided through the Collaborative to using their own resources. RAND's tools are not publicly available, though we understand RAND has agreed to make its algorithms for assigning surnames available to firms in the Collaborative and is providing advice on vendors and low-cost ways to purchase geocoding software.12 Because of the way our firm interviews were timed, we did not learn about firm reactions to these options. At least two firms have contracted with RAND independently to support their geocoding/surname analysis efforts. While internalizing the function can help firms institutionalize the process, some do not have the expertise or staff to do so. In addition, converting to other software may result in inconsistencies with prior analysis, thus detracting from firms' ability to leverage past work and trend experience.

Page last reviewed December 2007
Internet Citation: Chapter IV. Data to Support Work on Disparities: Evaluation of a Learning Collaborative's Process and Effectiveness to Reduce Health Care Disparities Among Minority Populations. December 2007. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/research/findings/final-reports/learning/4.html