Future Directions for the National Healthcare Quality and Disparities Reports
H. Additional Assessments of Data Presentation in the NHQR and NHDR
While the NHQR and NHDR monitor a large number of measures, there is no sense from the report findings that the nation is improving or worsening its performance in the areas that matter most or in areas that can make the greatest difference. The significance of the findings is not relayed in a manner that evokes action from its readers. This led to the committee's conclusion regarding the importance of telling a story through the NHQR and NHDR.
With so many measures, population groups, and years of information presented in the national healthcare reports, the task of summarizing the most important findings of the reports is challenging. As discussed in Chapter 6, the committee identified three pieces of information that if reported for each measure (whether individual, composite, or summary) would tell a better story and enhance actionability:
- The Nation's current level of performance on a given measure (expressed using means and standard errors);
- How the Nation has achieved the current level of performance (expressed by the annual rate of change and standard error of the estimated change); and
- How far the Nation has to go to close the performance gap between current practice and the recommended standard of care (goal or benchmark)—the number of years to achieving the desired performance level based on the historical annual rate of change and the corresponding interval estimate.
These pieces of data could be presented in several ways. The committee presents the following examples as one succinct way of conveying such information; these templates could be used to convey progress toward benchmarks or goals if they have been set for individual measures.
Alternately, these data can be transformed into graphic displays that illustrate the nation's current performance on each measure and estimate how long it would take to reach desired goal levels, whether defined as an aspirational goal or one grounded in benchmark levels of attained performance for a given measure. These would likely be line graphs showing projections, but other useful displays could be derived as well. It is anticipated that presenting the data in such a way that quantifies and brings attention to the extent of a quality or disparity gap will focus the attention of the reports on changes the nation can make and aspires to achieve, rather than focusing on past performance. Going forward, these types of messages are what the committee would like to see.
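As an illustration of the three pieces of information described above (current level, annual rate of change, and years to reach a goal), the following is a minimal sketch rather than the committee's or AHRQ's method. It assumes a simple linear trend fitted by ordinary least squares, a measure for which higher values are better, and a fixed target. The function name `trend_summary`, the data values, and the target of 85 are hypothetical. The interval for the years to target is crude: it propagates only the uncertainty in the slope and uses a normal multiplier, where a t-quantile would be more appropriate for so few data points.

```python
import math

def trend_summary(years, values, target, z=1.96):
    """Summarize a measure: current level, annual change, years to target.

    Assumes a linear trend, that higher values are better, and that
    `years` is sorted so the last entry is the most recent year.
    """
    n = len(years)
    # Center on the latest year so the intercept is the current level.
    x = [y - years[-1] for y in years]
    mx = sum(x) / n
    my = sum(values) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (vi - my) for xi, vi in zip(x, values))

    slope = sxy / sxx            # estimated annual rate of change
    level = my - slope * mx      # fitted value in the latest year

    # Residual variance and standard errors of the two estimates.
    rss = sum((vi - (level + slope * xi)) ** 2 for xi, vi in zip(x, values))
    sigma2 = rss / (n - 2)
    se_slope = math.sqrt(sigma2 / sxx)
    se_level = math.sqrt(sigma2 * (1 / n + mx ** 2 / sxx))

    gap = target - level

    def years_needed(rate):
        if gap <= 0:
            return 0.0           # target already reached
        return gap / rate if rate > 0 else float("inf")

    return {
        "current_level": level,
        "se_level": se_level,
        "annual_change": slope,
        "se_annual_change": se_slope,
        "years_to_target": years_needed(slope),
        # Faster plausible slope gives the lower bound, slower the upper.
        "years_interval": (years_needed(slope + z * se_slope),
                           years_needed(slope - z * se_slope)),
    }

# Hypothetical annual estimates (percent) for one measure.
summary = trend_summary(
    years=[2000, 2001, 2002, 2003, 2004, 2005],
    values=[70.0, 71.2, 71.9, 73.1, 73.8, 75.0],
    target=85.0,
)
for name, value in summary.items():
    print(name, value)
```

If the benchmark moves from one reporting year to the next, the same function can be rerun with the new target; the baseline-year result can be kept alongside it to show progress since baseline.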
Because benchmarks could potentially move from one reporting year to another (indeed, benchmarks will ideally improve as quality improves rather than remain stagnant even among the highest performers), the question arises as to how to indicate progress when a benchmark is not a fixed target. One could compare national average performance to the benchmark in the baseline year and then indicate that expectations have risen over time since the best attained performance is now set at a higher level. For example, the average national performance level in the baseline year might be 70 percent, the benchmark of best attained performance 80 percent, and the estimate is that it will take 10 years for the nation to achieve the 80 percent level. In year three, the national performance is now 75 percent, the benchmark of best attained performance has risen to 85 percent, and it may again be estimated that it will take 10 years for the nation to achieve the newly established benchmark. In year three, progress can be reported compared with the baseline year (movement from 70 to 75 percent), the curve of a trend line can be noted to have improved, and higher expectations can be set because some entity has shown that it is possible to achieve 85 percent. Alternately, when and if specific goals are set, they are likely to be more fixed targets; progress could then be assessed against those targets, and the time needed to reach them could be computed.
The Future Directions committee thought it wise to set goal levels of performance for states or other entities that were informed by actual achievement so that they are not dismissed as unrealistic. Benchmarking units could be states, hospitals, health plans, or population groups, among other units (see Chapter 6 for further discussion).
Assessments of Current Presentation of Statistics in Select Sections in the 2008 NHQR and NHDR
The Future Directions committee had a statistical expert review portions of the NHQR and NHDR; the commentary follows. Presentation in the reports will have to balance the needs of a variety of users for simpler exposition and statistical clarity and precision; some of the statistical information would add to the clarity of the exposition, and more detailed statistical information might be presented in online appendixes.
National Healthcare Quality Report, 2008, Chapter 2. Effectiveness (Heart Disease)
- Page 52: In this overview of statistics for this condition, it is unclear whether the number of deaths listed is the number of deaths due to heart disease or the number of people who had heart disease and died.
- Figure 2.15 (Adult current smokers with a checkup in the last months who received advice to quit smoking, 2000-2005): Separate estimates are graphed here for each year and connected with solid lines. It appears that there is an attempt to infer longitudinal patterns without actually going to the trouble of the estimation process. A statistical model that links the annual estimates could supply more information than simply connecting the dots. Additionally, the sample sizes examined should be reported.
- Page 53: In the supporting text of Figure 2.15, it is not clear whether the statement regarding the 18-44 age group is a statistically significant finding or if it is based on observation of the point estimates alone. A statistical test or estimate would be helpful. Although the sample sizes are not known, the trend appears to be non-linear, and this may be additional information worth noting.
- Figure 2.16 (Adults with obesity who were told by a doctor they were overweight, 2002-2006): In this figure, data are aggregated over the period 2003-2006. It is not clear why the data are not reported annually—either a rationale for aggregating would be helpful for the reader, or simply presenting the latest data year of information would be sufficient. Additionally, the number of adults contributing to each stratum (overall, and by age group) should be reported.
- Figure 2.17 (Adults with obesity who ever received advice from a health provider to exercise more, 2002-2005): There are several comparisons made in this graphic: temporal changes and age group differences. There are no measures of uncertainty for the various point estimates and these could easily be incorporated (sufficient statistics are the point estimates and the standard errors; the barplot really only shows the point estimates).
- Figure 2.19 (Hospital patients with heart attack who received recommended hospital care: overall composite and six components, 2002-2004 (Medicare) and 2005-2006 (all payers)): The denominator indicates patients hospitalized with a principal diagnosis of acute myocardial infarction (AMI) but the denominator should change depending on eligibility criteria. For example, the sample sizes for the angiotensin converting enzyme (ACE) inhibitor measure should only include those who are eligible for ACE.
- Figure 2.20 (Deaths per 1000 adult hospital admissions with acute myocardial infarction, , and 2000-2005): As with other figures in this chapter, point estimates should be accompanied by estimates of error. Again, the connecting lines imply a desire to examine trends over time; a statistical model that smooths the estimates across time would be useful. The title should indicate "in-hospital deaths;" length of stay should be reported given it has changed over time, and, because the dependent variable is in-hospital mortality, this change in exposure period may confound any observed differences.
- Figure 2.21: Similar comments to those for Figure 2.19.
- Figure 2.22 (State variation: Hospital patients with heart failure who received recommended hospital care, 2006): In this figure, it is unclear what messages are intended to be conveyed. If the main message is about geographic variation in receipt of recommended hospital care by state, then it is unclear as to what constitutes an "above average" measure of variation by looking at the figure alone. The national average and some range of values for state performance should be noted on the figure itself, not just stated in the supporting text. The supporting text for this figure on page 61 reports observed variation of 74.3% to 94.5% across the states. However, the denominators in the calculations vary across states, and this fact should be addressed in any inferential (comparative) statement. From the figure, because the states have different sizes (areas), the shading may distort the message. Additionally, as discussed in Chapter 6, the color coding is not as intuitive as may be expected (green usually means "go" or "good"; black is often associated with "bad" results; but that is not the meaning here). In terms of estimation, it is not clear how the data were analyzed (e.g., simply aggregated the number of met measures divided by the number of opportunities within a state; or averaged the hospital-specific opportunity scores within a state; or did something different). Finally, the choice of "average" deserves some justification.
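The two estimation approaches mentioned in the comment on Figure 2.22 (pooling all met measures and opportunities within a state, or averaging hospital-specific opportunity scores) can give noticeably different state values when hospitals differ in size. The sketch below illustrates this with made-up numbers; the function names and the hospital counts are hypothetical and do not come from the report.

```python
def pooled_score(hospitals):
    """State score = total measures met / total opportunities."""
    met = sum(h["met"] for h in hospitals)
    opportunities = sum(h["opportunities"] for h in hospitals)
    return met / opportunities

def mean_of_hospital_scores(hospitals):
    """State score = unweighted mean of each hospital's own score."""
    scores = [h["met"] / h["opportunities"] for h in hospitals]
    return sum(scores) / len(scores)

# Hypothetical state with one large and one small hospital.
state = [
    {"met": 900, "opportunities": 1000},  # large hospital, 90 percent
    {"met": 30, "opportunities": 50},     # small hospital, 60 percent
]

pooled = pooled_score(state)
averaged = mean_of_hospital_scores(state)
print(f"pooled: {pooled:.3f}, averaged: {averaged:.3f}")
```

The pooled score is dominated by the large hospital, while the unweighted average gives each hospital equal weight, which is why stating the method matters when states are compared.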
National Healthcare Quality Report, 2008, Chapter 6. Efficiency
- Page 135: The term "expenditure" should be defined for the reader.
- Figure 6.1 (Average annualized percentage changes in national health care expenditures and quality for general population and people with selected conditions, 2001-2005): Text indicates quality and expenditures are "two very different measures" (p. 136) yet they are included on the same graph in the figure. This sends a confusing message. If the two aspects are very different, and the reader is subsequently cautioned in the supporting text not to draw conclusions regarding the relationship between the two, then they should not be presented together in the same graphic.
- Page 137: The term "cost" should be defined for the reader.
- Figure 6.2 (National trends in potentially avoidable hospitalization rates, by type of hospitalization, 1997 and 2000-2005): Because data points for years 1998 and 1999 are not available, the graphic should start at 2000. While the graphic includes several time points, the statistical test on page 139 utilizes only two time points (either the difference between 2000 and 2005 or the difference between 1997 and 2005). It is unclear why the report does not use regression modeling to estimate the actual trends rather than testing the difference between two time points given all the data that are available. The number of hospitals used in the calculations should be reported.
- Figure 6.3 (Total national costs associated with potentially avoidable hospitalizations, 1997 and 2000-2005): Because data points for years 1998 and 1999 are not available, the graphic should start at 2000. The number of hospitals used in the calculations should be reported. The statement in the supporting text on page 139 indicates that costs due to avoidable hospitalization were 35 percent greater than in 1997. If this is a statistically significant finding, it should be noted. And if it is a statistically significant finding, then the type of test used should be noted. Some measure of accuracy in costs per year in the figure should be reported (the number of avoidable hospitalizations changed across the years and this should be reflected in the graphic).
- Table 6.1 (Rehospitalizations for congestive heart failure, per 1,000 initial admissions for CHF, States, 2004 and 2005): The information in this table is somewhat perplexing. For example, the standard errors that are reported are either 0 or 1, which is suspicious for two reasons. First, the standard errors are two orders of magnitude smaller than the rates, which may mean they are not reported on the same scale as the rates (the rates are per 100,000 admissions). Second, it seems that some rounding must have occurred, as a standard error of 0 is unlikely. If none occurred, then some explanation for this value in the results would be informative. Important information is missing from the table, such as the sample sizes (number of initial congestive heart failure [CHF] admissions) per state (and the number of hospitals per state). The text (page 142) indicates an overall rate (210 per 100,000 admissions); inclusion of the overall rate in the table would be helpful. Additionally, it appears that no covariates were included in these calculations. For clarity, define the outcome more explicitly: is it re-hospitalization for CHF within 3 months of discharge of an initial CHF hospitalization?
- Figure 6.4 (Average estimated relative hospital cost efficiency index for a selected sample of urban general community hospitals, 2001-2005): This figure reports estimated relative hospital cost efficiency indices for 1,368 general community hospitals. The numbers in this figure are challenging to interpret mainly due to the lack of a clear explanation of what each number means. Specifically, what is 100.03 (reported in 2002), and is 110.48 a clinically meaningful increase? Because each number is estimated based on data, the standard errors (or confidence intervals) should be displayed so the reader is not misled by measurement error. Finally, it is not clear at all how the index accounts for quality (page 142) as it appears to be based on costs and not on quality.
- Table 6.2 (Correlates of hospital cost efficiency): This table reports correlates of hospital cost efficiency for the 1,368 general hospitals. However, the sample sizes are not reported in the table for either the number of hospitals or the number of discharges; the interpretation of the estimates is unclear (for example, what is an operating margin?); and the table reports standard deviations presumably among the hospitals falling into each quartile, but not the standard error of the actual estimates.
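The distinction drawn in the comment on Table 6.2, between the standard deviation among hospitals and the standard error of an estimate, can be made concrete with a short sketch. The values below are hypothetical operating margins for four hospitals in one quartile, not data from the report, and the function name `sd_and_se` is an illustrative choice.

```python
import math
import statistics

def sd_and_se(values):
    """Return (standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)       # spread among the hospitals
    se = sd / math.sqrt(len(values))    # precision of the estimated mean
    return sd, se

# Hypothetical operating margins (percent) for hospitals in one quartile.
margins = [2.0, 4.0, 6.0, 8.0]

mean_margin = statistics.mean(margins)
sd, se = sd_and_se(margins)
print(f"mean={mean_margin:.2f}, sd={sd:.2f}, se={se:.2f}")
```

The standard deviation describes how much hospitals differ from one another; the standard error shrinks as the number of hospitals grows and is the quantity needed to judge how precisely the quartile mean is estimated, which is why the sample sizes also need to be reported.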
Case Study: National Healthcare Disparities Report, 2008, Chapter 2. Quality of Health Care (Heart Disease)
- General note on this chapter: Because of the importance of the specific subgroups of interest, data completeness and comparability for race, income, and education variables are important to report. For example, some states that contribute to the HCUP data may have large proportions of missing race/ethnicity data. Moreover, some states ask patients to identify their race and ethnicity, and some determine race and ethnicity from observation.
- Page 54: Similar to the observation raised for the overview of statistics for this condition in the NHQR, it is unclear whether the number of deaths listed here is the number of deaths due to heart disease or the number of people who had heart disease and have died. It is also unclear why the format for this overview is different from that in the NHQR. It would make most sense if the statistical overviews for the same conditions were the same in both reports.
- Figure 2.13 (Adults with obesity age 0 and over who were told by a doctor they were overweight, by race/ethnicity, income, and education, 1999-2000 and 2003-2006): The number of observations in each category should be reported or the point estimates need to be accompanied by confidence intervals. There are many comparisons listed on page 58, and it is not clear if each is statistically different. This lack of clarity arises as the word "significantly" is only stated for Blacks. The committee is not pushing for many statistical tests, rather clarity on the findings as stated.
- Figure 2.14 (Adults with obesity who ever received advice from a health provider to exercise more, by race (top left), ethnicity (top right), income (bottom left), and education (bottom right), 2002-2005): As with other figures, the standard errors or the sample sizes should be reported so that readers can judge whether differences might be explained by sampling variability.
- Page 60 (Last Paragraph): It is stated that the goal is to identify the independent effects of the various factors on quality of health care. It is highly unlikely that "independent" effects were estimated for the specific factors considered (race, income, education) as these factors are highly correlated.
- Figure 2.15 (Adults with obesity who ever received advice from a health provider to exercise: Adjusted odds ratios, 2000-2005): Because these odds ratios are estimates, the standard errors of each should be displayed. For example, is the odds ratio for females different from 1.0? There are many comparisons in the single barplot, and some of the main messages get lost.
- Figure 2.16 (Composite measure: Hospital patients with heart failure who received recommended care, Medicare only by race/ethnicity, 2002-2004 (left) and all payer 2005-2006 (right)): As with the other figures displaying point estimates over time, either the sample sizes or standard errors should be included. By connecting the lines, there is an implication that trends over time are important, yet these are not estimated.
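The question raised for the odds ratios in Figure 2.15 (is an odds ratio different from 1.0?) is usually answered by reporting a confidence interval. The sketch below is a simplified illustration only: it computes an unadjusted odds ratio from a 2x2 table with the standard log-scale (Woolf) interval, whereas the report's odds ratios are adjusted and would come from a regression model. The function name and all counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and approximate 95% CI from a 2x2 table.

    a, b: group 1 with / without the outcome
    c, d: group 2 (reference) with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio), Woolf's method.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: received advice vs. did not, females vs. males.
odds_ratio, lower, upper = odds_ratio_ci(a=150, b=50, c=100, d=100)
differs_from_one = lower > 1.0 or upper < 1.0
print(f"OR={odds_ratio:.2f}, CI=({lower:.2f}, {upper:.2f}), "
      f"differs from 1.0: {differs_from_one}")
```

Displaying the interval alongside each bar would let readers see at a glance which odds ratios are distinguishable from 1.0.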