Selecting Quality and Resource Use Measures: A Decision Guide for Community Quality Collaboratives

Part I. Introduction to Performance Data (continued)

Question 2. What are the strengths and weaknesses of using administrative data, medical record data, and hybrid data?

Four primary sources of data are commonly used to assess health care quality. Three are described here: administrative, medical record, and hybrid data.12 (Survey data are also used to assess quality of care and are addressed in Question 12 of this guide.) In general, measure specifications should guide the selection of the most appropriate data format to ensure the most reliable and valid results.

Administrative Data

Administrative data are derived from a variety of preexisting sources such as insurance enrollment files and provider claims. These data are:

  • Readily available.
  • Relatively inexpensive to acquire in electronic formats.
  • Coded by health information professionals using accepted coding systems.
  • Drawn from large populations and therefore more representative of the populations of interest.13

Because most administrative data are intended for financial management rather than quality assessment, they contain varying degrees of clinical detail and are often limited in content, completeness, timeliness, and accuracy.14 Studies analyzing the validity of administrative data for quality assessment at the hospital and health plan levels have generally found that administrative data are sufficiently sensitive and specific to estimate certain performance measures, such as mammography or prenatal care rates.14
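As a rough illustration of what "sensitive and specific" means here, the sketch below compares administrative flags against chart review, treating the chart as the reference standard. The function name, patient IDs, and data are hypothetical and are not drawn from the cited studies.

```python
from typing import Dict, Tuple


def sensitivity_specificity(admin: Dict[str, bool],
                            chart: Dict[str, bool]) -> Tuple[float, float]:
    """Compare administrative flags with chart review (assumed reference standard).

    Both dicts map a hypothetical patient ID to whether the service
    (e.g., a mammogram) was found in that data source.
    """
    tp = fp = tn = fn = 0
    for patient_id, in_chart in chart.items():
        in_admin = admin.get(patient_id, False)  # no claim counts as "not found"
        if in_chart and in_admin:
            tp += 1
        elif in_chart and not in_admin:
            fn += 1  # service documented in the chart but never billed
        elif in_admin:
            fp += 1  # claim present without chart documentation
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up data for illustration only
chart = {"p1": True, "p2": True, "p3": True, "p4": False, "p5": False}
admin = {"p1": True, "p2": True, "p3": False, "p4": False, "p5": True}
sens, spec = sensitivity_specificity(admin, chart)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Low sensitivity in a comparison of this kind would suggest that services are being delivered without generating a claim.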

The quality of administrative data is likely to be better if the data originate from hospitals than if the data originate from physician offices or other ambulatory settings. Hospitals employ professional coders to assign diagnosis and procedure codes, submit all-payer data in most States to health data agencies, and are subject to auditing and financial penalties for incorrect reporting.15-17 For example, Steinwachs, et al., found that Medicaid administrative data undercounted visits by 25% for low-cost providers and by 41% for patients with low utilization, whereas medical records undercounted billed visits by 10% for patients with high utilization.18 In other words, a significant proportion of ambulatory services may not generate a claim, especially in systems where provider payments are bundled or capped. These limitations of administrative data may be compounded when measuring performance at the individual physician level, as sample sizes are small and patient populations are heterogeneous.19

Medical Records Data

Obtaining data from medical records (paper or electronic) requires expert staff and greater financial and time resources. Medical reviewers, who are typically either nurses or physicians, must interpret each record and input data findings. Medical records provide detailed clinical data with a richer description of care than can be obtained from administrative data. This is useful for some quality measures, especially those that rely on laboratory values (e.g., hemoglobin A1c [HbA1c] or cholesterol levels) and specific treatments (e.g., discharge instructions). The Core Measures of hospital performance, which were defined by The Joint Commission, endorsed by the National Quality Forum, and adopted by the Hospital Quality Alliance and the Centers for Medicare & Medicaid Services (CMS), rely principally on medical record data. However, ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification) coded administrative data are often used to help identify the denominator cases of interest. The cost of collecting and submitting these data is borne by hospitals as part of the accreditation process.

Many hospitals and physician organizations voluntarily participate in clinical registries that also capture abstracted data from medical records and report performance data confidentially to participating providers. Although these registry data are not generally used for public reporting or value-based purchasing, their use is under discussion in many regions. In some cases, for example, health plans or employer coalitions may designate centers of excellence for specific services based on participation in the appropriate registry and voluntary sharing of summarized outcome data. Such use of registries is still very limited, as it is generally discouraged by national registry sponsors. Examples of registries include the American College of Surgeons' National Surgical Quality Improvement Program and the Society of Thoracic Surgeons' Adult Cardiac Surgery and Congenital Heart Surgery Databases. The United Network for Organ Sharing collects and manages detailed clinical data pertaining to waiting lists and outcomes for organ transplantation, but only selected center-specific information is available to the general public.

Hybrid Data

Both administrative and medical record data alone have limitations for measuring quality of care. Hybrid data bring together both administrative data and medical record data to build on the strengths of each and to compensate for some of their respective weaknesses.18,20 Varying definitions exist to describe hybrid data, but the term typically refers to aggregation of electronic claims and information obtained from either electronic or paper medical records to increase the number of relevant data elements or to reduce the number of records that must be reviewed, the time required to review each record, or both.14,21 Because organizations differ in their definitions of hybrid data, it is important to clarify expectations and procedures before pursuing this strategy.

At the physician level, applications of hybrid data generally involve using claims to identify patients with a relevant diagnosis or problem and using medical records to identify specific clinical findings or nonpharmacologic treatments. Relying on administrative data alone to estimate Healthcare Effectiveness Data and Information Set (HEDIS) indicators, Pawlson, et al., found significant underestimation and instability in health plan rankings, compared with results from hybrid data. For example, only 3 of 15 measures evaluated (all of which related to well-child visits) had comparable performance estimates based on administrative and hybrid data. Evaluating diabetes care in the Veterans Affairs system, Kerr, et al., compared administrative, medical record, and hybrid data at the Veterans Integrated Service Network (VISN) level and found high agreement between administrative and medical record data, but administrative data consistently underestimated facility performance.22 Hybrid data yielded estimates similar to those from medical record data alone but required 50% fewer chart reviews, resulting in a significant cost reduction.23
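One way a hybrid calculation might work is sketched below: claims define the denominator and supply any numerator evidence they contain, and charts are pulled only for patients whose claims show no evidence of the service. This is a simplified illustration under stated assumptions, not the method used in the studies cited. The function name, patient IDs, and the rule that a claim-based hit is accepted without chart confirmation are all assumptions.

```python
from typing import Callable, Iterable, Set, Tuple


def hybrid_rate(denominator: Iterable[str],
                admin_hits: Set[str],
                review_chart: Callable[[str], bool]) -> Tuple[float, int]:
    """Return (measure rate, number of charts reviewed).

    denominator  -- patient IDs identified from claims as eligible
    admin_hits   -- eligible patients whose claims already show the service
    review_chart -- hypothetical callback that abstracts one chart
    """
    numerator = 0
    charts_reviewed = 0
    total = 0
    for patient_id in denominator:
        total += 1
        if patient_id in admin_hits:
            numerator += 1  # assumption: claim evidence accepted as-is
        else:
            charts_reviewed += 1  # chart pulled only when claims are silent
            if review_chart(patient_id):
                numerator += 1
    rate = numerator / total if total else float("nan")
    return rate, charts_reviewed


# Made-up data for illustration only
denominator = ["a", "b", "c", "d"]
admin_hits = {"a", "b"}
chart_findings = {"c": True, "d": False}
rate, charts_reviewed = hybrid_rate(
    denominator, admin_hits, lambda pid: chart_findings[pid]
)
print(rate, charts_reviewed)
```

In this toy case only two of four charts are reviewed, which is the kind of saving in chart reviews that hybrid approaches aim for.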

At the hospital level, applications of hybrid data generally involve combining ICD-9-CM coded administrative data with key laboratory or other clinical data to enhance the performance of risk-adjustment models and to reduce bias in estimates of hospital performance, relative to administrative data alone.24 The most helpful and cost-effective variables to collect for this purpose include blood cell counts, electrolytes, arterial blood gas values, clotting parameters, and vital signs.25-27 An AHRQ-funded, multicenter pilot project involving Florida, Minnesota, and Virginia will enhance the utility of hybrid data for hospital-level analyses by: (1) standardizing collection of laboratory data using common nomenclature; (2) merging laboratory data with hospital administrative data; (3) assessing the added value of using clinical data to evaluate the quality of patient care within hospitals; and (4) developing recommendations for other sites. AHRQ will release a summary report of the experiences of the three participating organizations and a related toolkit in 2010.
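Steps (1) and (2) of the pilot, standardizing laboratory names and merging them with administrative records, could be sketched roughly as follows. The name map, field names, and the "keep the first recorded value" rule are illustrative assumptions and do not describe the pilot's actual specifications.

```python
from typing import Dict, List

# Hypothetical local-name -> common-name map; a real project would rely on a
# nomenclature agreed on by all participating sites.
COMMON_NAMES = {
    "Na": "sodium",
    "SODIUM": "sodium",
    "Hgb": "hemoglobin",
    "HGB": "hemoglobin",
}


def merge_labs(discharges: List[dict], labs: List[dict]) -> List[dict]:
    """Attach standardized lab values to each administrative discharge record."""
    by_encounter: Dict[str, Dict[str, float]] = {}
    for lab in labs:
        name = COMMON_NAMES.get(lab["test"])
        if name is None:
            continue  # unmapped tests are skipped rather than guessed
        # Assumption: labs arrive in time order, so the first value is kept.
        by_encounter.setdefault(lab["encounter_id"], {}).setdefault(name, lab["value"])
    merged = []
    for discharge in discharges:
        row = dict(discharge)  # copy so the administrative record is not mutated
        row["labs"] = by_encounter.get(discharge["encounter_id"], {})
        merged.append(row)
    return merged


# Made-up records for illustration only
discharges = [
    {"encounter_id": "e1", "dx": "410.71"},
    {"encounter_id": "e2", "dx": "428.0"},
]
labs = [
    {"encounter_id": "e1", "test": "Na", "value": 131.0},
    {"encounter_id": "e1", "test": "SODIUM", "value": 135.0},
    {"encounter_id": "e1", "test": "HGB", "value": 10.2},
    {"encounter_id": "e2", "test": "XYZ", "value": 1.0},
]
merged = merge_labs(discharges, labs)
print(merged)
```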

The Future: Electronic Health Records

Looking to the future, electronic health record (EHR) systems may reduce the cost of accessing clinical information from the medical record, thereby making medical record data more useful for quality reporting. However, there are hurdles to be overcome related to the interoperability of these systems and the continuing use of paper notes for point-of-care documentation in many hospitals and offices with EHR systems. EHR capabilities are still only partially implemented in most hospitals; for example, only 8% to 17% of hospitals have fully implemented computerized physician order entry, and fewer than 2% of hospitals have a comprehensive system present in all clinical units.28,29 Other strategies to reduce the data collection burden for hospital measures that require medical record review have been proposed21 but still need systematic evaluation.30 In addition, some EHR data do not include information found in administrative data (e.g., charges and cost) that are important to public reporting, so there will remain a role for administrative data for the foreseeable future.

In ambulatory care, adoption of EHR systems in the United States has lagged behind the hospital sector (with only 4% of physicians reporting a fully functional system) and well behind other countries (e.g., the United Kingdom, Netherlands, Australia, and New Zealand have achieved more than 90% adoption among general practitioners).31,32 However, the “meaningful use” provisions of the American Recovery and Reinvestment Act, which are tied to significant Federal incentive payments, promise to accelerate adoption over the next several years. 

Question 3. What are the opportunities and challenges in building a multipayer/multi-data source database or data warehouse?

Multisource databases offer the potential to analyze health outcomes and to estimate complex metrics that address the effectiveness and efficiency of care, rather than just specific steps in the process of care. Until now, most collaboratives have built such databases by aggregating data from multiple sources (i.e., employers or health plans) into what has been described as a “data warehouse.” A newer alternative under exploration is the distributed database, which could allow users to create a virtual database by pulling, in real time, only the required data for a particular query from disparate sources. Table 3 describes the differences between the conventional warehouse database and the distributed database.

In the near future, the infrastructure supporting data retrieval for both static, aggregated databases and distributed databases will be affected by the national effort to establish health information exchanges (HIEs).33 The Federal Government is supporting a Nationwide Health Information Network (NHIN) in limited production. The NHIN will facilitate the exchange of health care information between State and regional HIEs, integrated delivery systems, health plans, personally controlled health records, Federal agencies, and others.34 Once fully operational, the NHIN HIE specifications, testing materials, and trust agreements will be placed in the public domain to stimulate adoption. Some areas of the United States have already seen successful information exchange through Regional Health Information Organizations (RHIOs). RHIOs facilitate information sharing among enrolled members using common, nonproprietary standards for data content and exchange over existing networks and the Internet.30 (Question 6 addresses privacy regulations, which also affect the data aggregation and sharing process.)

Conventional Aggregated Databases

A conventional aggregated database is the most common structure used to warehouse and analyze health care performance data. Obtaining data from multiple sources (e.g., payers, providers, State databases) requires coordination between each source and the community collaborative (and frequently its vendor or consultant) that performs the analysis. This aggregated database approach has been successful in some settings, but it poses the hazard of inadvertently releasing protected health information during data transfers. Also, this approach can be associated with significant delays in obtaining timely data.

Examples of this approach include Indiana's Quality First program, which is supported by its HIE and has been well received by users.35 California's RHIO recently published a white paper describing its success and sustainability model.36 Additionally, many State public health departments that house surveillance databases are adopting this approach and collaborating with State RHIOs.37

Distributed Databases

Issues such as financial responsibility, varied architectures, patient confidentiality, and data management responsibilities pose significant challenges to building and maintaining a multisource, aggregated database.38 A distributed database operates as a “virtual” database where data from various sites (multiple health plans, physician offices, labs, etc.) remain onsite with the data owner. The data user can pull appropriate data from each of those sites, as needed in real time, and perform the necessary analyses from his or her desktop. Shared software, which must be installed by all participants, matches related data from key data sources while keeping protected health identifiers with the data owners, thereby decreasing the risk of disclosing sensitive information.39 
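The pattern described above can be illustrated with a minimal sketch in which each site runs the same query locally and returns only counts, so patient identifiers never leave the data owner. The function names, record fields, and sites are hypothetical. Real distributed networks involve shared software, security, and record matching that this sketch does not attempt to represent.

```python
from typing import Dict, List


def run_local_query(records: List[dict], diagnosis: str) -> Dict[str, int]:
    """Runs inside the data owner's domain and returns aggregate counts only."""
    eligible = [r for r in records if r["diagnosis"] == diagnosis]
    treated = [r for r in eligible if r["treated"]]
    return {"eligible": len(eligible), "treated": len(treated)}


def combine(site_results: List[Dict[str, int]]) -> float:
    """Runs on the data user's side using only the de-identified counts."""
    eligible = sum(r["eligible"] for r in site_results)
    treated = sum(r["treated"] for r in site_results)
    return treated / eligible if eligible else float("nan")


# Made-up site data for illustration only; patient_id stays at each site
site_a = [
    {"patient_id": "a1", "diagnosis": "diabetes", "treated": True},
    {"patient_id": "a2", "diagnosis": "diabetes", "treated": False},
    {"patient_id": "a3", "diagnosis": "asthma", "treated": True},
]
site_b = [
    {"patient_id": "b1", "diagnosis": "diabetes", "treated": True},
    {"patient_id": "b2", "diagnosis": "diabetes", "treated": True},
    {"patient_id": "b3", "diagnosis": "diabetes", "treated": False},
]
site_results = [run_local_query(site, "diabetes") for site in (site_a, site_b)]
overall_rate = combine(site_results)
print(site_results, overall_rate)
```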

State public health departments have used distributed databases to track infectious diseases, but only recently has this method been recognized as an approach for analyzing the performance of health care providers. The Quality Alliance Steering Committee and America's Health Insurance Plans Foundation (and others) are testing a distributed database as part of their goal to develop a “nationally-consistent data aggregation methodology” that integrates data from multiple sources. One goal of this High-Value Health Care Project is to make performance information available as quickly, consistently, and efficiently as possible by determining the most accurate and timely sources of diagnostic and treatment information (e.g., registries, administrative data).

Table 3: Comparison of multipayer database formats

Examples
  Conventional (aggregated) method: Hospital quality data reporting programs, such as:
    • California Hospital Assessment and Reporting Task Force (CHART)
    • Pennsylvania Health Care Cost Containment Council

Data Collection
  Conventional (aggregated) method: Data owner collects and sends data to offsite location.
  Distributed data: Data users extract deidentified data that remain with the data owner. Data queries requiring patient identifiers occur within the data owner's domain; data users extract query results stripped of these identifiers.

Data Location
  Conventional (aggregated) method: Data are physically transferred to offsite location.
  Distributed data: Data remain with each data owner; data for analysis reside with data user.

Advantages
  Conventional (aggregated) method:
    • Less expensive alternative in short run
    • Familiar to users
    • May be available in public use file
    • Can accommodate new questions immediately
    • Possible to repeat analyses at any time without concern that underlying data have changed
  Distributed data:
    • Real-time access to data
    • Reduces potential for HIPAA*/privacy violations

Disadvantages
  Conventional (aggregated) method:
    • Data collection/aggregation delays
  Distributed data:
    • Startup costs may be significant; more complex or customized data use agreements may be needed
    • Software updates to multiple users must be installed simultaneously, requiring coordination
    • New uses or questions are likely to require new algorithms
    • Data owners must agree not to delete or change data files or results will not be replicable

Technology Requirements
  Conventional (aggregated) method: Software or system architecture must meet specification requirements determined by the data aggregator.
  Distributed data: Singular, master software program must be implemented concurrently at all participating data sites.

*HIPAA = Health Insurance Portability and Accountability Act

Question 4. Should a vendor be used for data collection and management? If so, what are the criteria for selecting a vendor?

Many community quality collaboratives contract with vendors to assist with collecting and managing health care quality data. Vendors that have a solid understanding of the functional requirements and standards for capturing quality measurement data are a valuable resource for community quality collaboratives, especially those that plan to undertake their own data analyses and to coordinate regional measurement and public reporting.

Key Considerations for Vendor Selection:11,40

  1. Issue a clear statement of the collaborative's goals and purpose (e.g., pay-for-performance, public reporting, internal reporting, and quality improvement).
  2. Issue a clear statement of needs, expectations, and potential challenges for the collaborative project (“request for proposals”), including:
    • Data collection procedures (e.g., claims data, hybrid data, surveys from multiple sources; frequency; volume; access to low-quality or high-quality claims data);
    • Data management procedures (e.g., data cleaning methods, data protection);
    • Data evaluation and validation procedures (e.g., use of simple or complex measure methodologies); and
    • Data storage (through vendor or in-house; appropriate security).
  3. Issue a clear statement of internal resources available to the collaborative (e.g., level of in-house expertise).
  4. Ensure that the vendor has expertise in collecting and managing quality performance data with established hardware and software systems (e.g., completed licensure or accreditation from quality measurement entities such as the National Committee for Quality Assurance, Quality Improvement Organizations, or the Joint Commission).
  5. Confirm vendor's data validation or auditing experience.
  6. Confirm that no conflicts of interest exist between collaborative members and the vendor, or within the vendor's various lines of business (e.g., vendor owning, owned by, or dependent upon a hospital or health plan that might be evaluated by the chartered value exchange [CVE]).
  7. Compare cost and services offered by competing bidders. For government programs such as Medicaid, State rules may require a particularly complex process for soliciting and reviewing competitive bids.
  8. Request and contact references provided by vendor, asking salient questions about the vendor's responsiveness, expertise, timeliness, quality, and financial management.
  9. In the case of Medicaid, the vendor needs to understand the added complexities and challenges associated with Medicaid databases (e.g., issues of discontinuous eligibility, variable cost-sharing, nonstandard claims).

Question 5. How should a data auditing program be designed to ensure data quality?

A carefully designed data auditing program will help to ensure the validity of the data reported and will proactively address concerns about data validity from both provider and consumer perspectives.

Most national measurement efforts, such as those sponsored by the National Committee for Quality Assurance (NCQA) and the Joint Commission, involve a systematic data auditing program. Two approaches have been developed. In the decentralized approach, used by NCQA, the data collecting organization requires participating providers to contract with audit vendors certified or licensed by the organization, who follow a standard auditing protocol.41,42 In the centralized approach, used by the Centers for Medicare & Medicaid Services (CMS), the data collecting organization contracts with an independent entity to review a random sample of records across all providers. For Medicare, these audits are performed on a quarterly basis by a Clinical Data Abstraction Center (CDAC) and submitted to a data warehouse; a hospital's data are considered “validated” if overall agreement with the reabstraction is at least 80%.
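A reabstraction check of this kind can be approximated as an element-by-element comparison. The sketch below is a simplified illustration: the data layout, element names, and the choice to score agreement per data element are assumptions, and only the 80% threshold comes from the text above.

```python
from typing import Dict, Tuple

VALIDATION_THRESHOLD = 0.80  # threshold stated in the text above

Key = Tuple[str, str]  # (hypothetical record ID, data element name)


def agreement_rate(original: Dict[Key, str], reabstracted: Dict[Key, str]) -> float:
    """Share of reabstracted data elements that match the original submission."""
    if not reabstracted:
        raise ValueError("no reabstracted elements to compare")
    matches = sum(
        1 for key, value in reabstracted.items()
        if key in original and original[key] == value
    )
    return matches / len(reabstracted)


# Made-up audit sample for illustration only
original = {
    ("rec1", "aspirin_at_arrival"): "yes",
    ("rec1", "discharge_instructions"): "yes",
    ("rec2", "aspirin_at_arrival"): "no",
    ("rec2", "discharge_instructions"): "yes",
    ("rec3", "aspirin_at_arrival"): "yes",
    ("rec3", "discharge_instructions"): "no",
}
reabstracted = dict(original)
reabstracted[("rec3", "discharge_instructions")] = "yes"  # one disagreement

audit_rate = agreement_rate(original, reabstracted)
validated = audit_rate >= VALIDATION_THRESHOLD
print(f"agreement={audit_rate:.1%}, validated={validated}")
```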

An auditing program was considered very important to the Better Quality Information (BQI) project (refer to Question 13), which sought to aggregate Medicare, Medicaid, and commercial claims data to assess physician performance. The BQI pilot sites created training and auditing processes to ensure accuracy, and the final measure results were given as feedback to providers who could challenge apparent errors or inconsistencies.44

Key Features for an Optimal Local/Regional Data Auditing Program41,42,45

  1. All key data components of the measure should be audited, including not only whether the numerator treatment or event occurred, but also whether the patient actually qualified for the denominator. Missing data are particularly important, because patients with missing data elements are typically excluded from quality reporting. Missing data rates should be tracked over time and across providers, so that high missing rates can be identified and corrected.
  2. If the measure is based on an adverse event that is potentially susceptible to underreporting, such as a complication of care, then there should be some effort through medical record review or linkage with other data (e.g., laboratory data) to find potential “false negative” cases, or unreported adverse events. For the sake of efficiency, these efforts often focus on particular subsets of patients, such as those who were at very high risk of the adverse event.
  3. Similarly, if a measure is based on a specific process of care, then there should be an effort to find both “false positive” cases that were reported as having the treatment but did not, and “false negative” cases that were reported as not having the treatment but actually did. For the sake of efficiency, these efforts often focus on particular subsets of patients, such as those who were reported as “exceptions” or as having unspecified contraindications to the standard treatment.
  4. If possible, auditing should occur concurrently with data collection to detect errors in time to correct them before the data are used to support quality of care analyses. Record-specific feedback should be provided to submitting organizations to facilitate their review of reporting errors and their appeal of legitimate disagreements.
  5. The auditing program should verify that measure calculation processes conform to technical specifications. This is most commonly done by creating a simulated data set or by manipulating an actual data set in a predetermined manner where the outcome is already known.
  6. The auditing program should assess system capabilities, such as the ability to process information submitted by different provider organizations in different formats for consistent reporting of clinical measures.
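Items 1 and 5 above lend themselves to a small known-answer test: build a simulated data set whose correct result is fixed in advance, run it through the measure calculation, and confirm the output. The sketch below assumes a hypothetical `measure_rate` function and record layout. In practice the calculation under test would be the collaborative's or vendor's own implementation.

```python
from typing import List, Tuple


def measure_rate(records: List[dict]) -> Tuple[float, float]:
    """Return (measure rate, missing-data rate) among denominator-eligible records.

    Records whose numerator value is missing (None) are excluded from the
    rate, mirroring the exclusion described in item 1.
    """
    eligible = [r for r in records if r["in_denominator"]]
    complete = [r for r in eligible if r["treated"] is not None]
    if not eligible:
        return float("nan"), float("nan")
    missing_rate = (len(eligible) - len(complete)) / len(eligible)
    rate = (
        sum(1 for r in complete if r["treated"]) / len(complete)
        if complete else float("nan")
    )
    return rate, missing_rate


def simulated_records(n_treated: int, n_untreated: int,
                      n_missing: int, n_ineligible: int) -> List[dict]:
    """Build a simulated data set whose correct answer is known in advance."""
    return (
        [{"in_denominator": True, "treated": True}] * n_treated
        + [{"in_denominator": True, "treated": False}] * n_untreated
        + [{"in_denominator": True, "treated": None}] * n_missing
        + [{"in_denominator": False, "treated": True}] * n_ineligible
    )


# Known answer: 6 of 8 complete eligible records treated; 2 of 10 eligible missing
test_records = simulated_records(6, 2, 2, 5)
sim_rate, sim_missing = measure_rate(test_records)
print(sim_rate, sim_missing)
```

Computing the missing-data rate alongside the measure rate lets an auditor track it over time and across providers, as item 1 recommends.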

Question 6. How do HIPAA and other privacy regulations affect data collection and public reporting?

Community quality collaboratives may encounter opposition to the release of protected health information, from advocates of both patient privacy and physician privacy. In this environment, meeting State and Federal security standards for data sharing is critical to ensuring continued access to valuable data. This answer addresses the Federal issues around reporting, but community quality collaboratives are encouraged to learn about the specific privacy laws applicable in their State.

Adhering fully to all of the various privacy laws and regulations affecting data collection and public reporting can be quite complex, especially given the variability in State laws and the intricacies of the Health Insurance Portability and Accountability Act of 1996 (HIPAA). These laws and regulations dictate what and how health care information can be shared. Formal data sharing or business associate agreements must be in place prior to sharing or receiving protected health information. As described further below, those agreements must specify and limit how the data are used.


Under the Administrative Simplification provisions of HIPAA, the Department of Health and Human Services (HHS) established national standards for electronic health care transactions and national identifiers for providers, health plans, and employers. HIPAA also addresses the security and privacy of health data.46 To ensure privacy, HHS developed a set of regulations, commonly referred to as the HIPAA Privacy Rule, to address the use and disclosure of individuals' health information (called “protected health information”) by organizations subject to the Privacy Rule, which are called “covered entities.”47 HIPAA defines “covered entities” as health plans, providers, and clearinghouses.

HIPAA further requires that covered entities have formal agreements in place with their business associates (e.g., a third-party pharmacy benefit management organization), which restrict the business associate to certain uses and disclosures. In addition, the American Recovery and Reinvestment Act of 2009 (sec. 13401) extends several privacy, security, and administrative requirements to business associates. Business associates will soon be required to comply with the same HIPAA requirements that apply to covered entities, and business associate agreements will need to be updated to reflect this requirement by early 2010.

Privacy Act of 1974

The Federal Privacy Act of 1974 (Public Law 93-579) codifies the permissible personal information the Federal Government may collect and how it uses or discloses that information. The Privacy Act differs from HIPAA in several respects: it covers overall personal data collection and data use by the Federal Government only. HIPAA specifically targets the health care industry and restricts the sharing of protected health information with others, including the government.48,49

Resources for Creating Compliant Data Use Agreements

Creating data use or business associate agreements between entities is necessary to meet the strict data privacy standards imposed by HIPAA. Several Chartered Value Exchanges (CVEs), such as Wisconsin, Indiana, Washington-Puget Sound Health Alliance, and Minnesota, indicate that their business associate agreements meet all HIPAA and other privacy law requirements. These agreements may serve as prototypes that any community quality collaborative can follow; however, local legal advice may still be advisable to ensure full compliance with State privacy laws, which may be even more stringent than HIPAA. State Quality Improvement Organizations (QIOs), which contract with CMS to improve the quality and safety of health care for Medicare beneficiaries, are also offering assistance in creating data use agreements and business associate agreements that comply with HIPAA standards.50 The Centers for Medicare & Medicaid Services (CMS) itself recently released guidance, in December 2008, as part of a toolkit “designed to establish privacy and security principles for health care stakeholders engaged in the electronic exchange of health information [that] includes tangible tools to facilitate implementation of these principles.”51 Finally, statewide health data organizations offer expertise in State health care privacy regulations, which can vary considerably by State.

Legal Considerations for Physician Tiering

In recent years, several legal cases have addressed the ability of private entities to use enrollee-level claims data for tiering health care providers or public reporting on provider quality. In August 2007, Consumers' Checkbook/Center for the Study of Services won a Freedom of Information Act lawsuit in the U.S. District Court for the District of Columbia, which ordered CMS to release certain data from physician claims paid by Medicare, for the purpose of reporting the number of various types of major procedures performed by each physician and reimbursed by Medicare.52 However, this decision was reversed by the U.S. Court of Appeals, which decided that physicians had a substantial privacy interest in not having claims data publicly disclosed because the data could be used, along with a publicly available Medicare fee schedule, to calculate a physician's total income from Medicare.52

This decision appears to establish, pending further court action, that physician-identified Medicare claims data are exempt from disclosure under the Freedom of Information Act. However, to the extent that community quality collaboratives engage in or facilitate efforts to create “tiered networks” of health care providers using quality and efficiency measures, they may face legal allegations regarding:

  • “Secrecy in both the standards used and the weights used to perform rankings;
  • The absence of a transparent rational basis for the methods chosen; and
  • The absence of a process by which physicians can examine the data on which their rankings rest and challenge errors in data or methodology.”52

A recent legal settlement entered into by Regence Blue Shield and the Washington State Medical Association (WSMA), after a legal challenge by WSMA, exemplifies the importance of:

  • Inviting physician input into the data audit and methods used to compare their performance;
  • Offering advance notice that new scores are forthcoming;
  • Posting scores in an electronic format, along with an explanation of the methodology and data;
  • Providing physicians the opportunity to appeal their scores; and
  • Creating an independent external review process to adjudicate these appeals.

This tension between protecting physician privacy, embodied in the Consumers' Checkbook decision, and enhancing fairness through transparency, embodied in the WSMA settlement, may play out differently in different markets.

Community quality collaboratives should note that they may encounter opposition to the release of health information from those concerned with physician and facility identification. Meeting State and Federal security standards for data sharing is critical, but in this environment, sensitivity to provider concerns is equally important for ensuring continued access to valuable data (refer to Question 24 about provider data reviews).

Page last reviewed October 2014
Page originally created May 2010
Internet Citation: Part I. Introduction to Performance Data (continued). Content last reviewed October 2014. Agency for Healthcare Research and Quality, Rockville, MD.