Appendix H. Additional Assessments of Data Presentation in the NHQR and NHDR
While the NHQR and NHDR monitor a large number of measures, there is no sense from the report findings
that the nation is improving or worsening its performance in the areas that matter most or in areas that can
make the greatest difference. The significance of the findings is not relayed in a manner that evokes action from
its readers. This led to the committee's conclusion regarding the importance of telling a story through the NHQR and NHDR.
With so many measures, population groups, and years of information presented in the national healthcare
reports, the task of summarizing the most important findings of the reports is challenging. As discussed in Chapter 6, the committee identified three pieces of information that, if reported for each measure (whether individual, composite, or summary), would tell a better story and enhance actionability:
- The Nation's current level of performance on a given measure (expressed using means and standard
- How the Nation has achieved the current level of performance (expressed by the annual rate of change and
standard error of the estimated change); and
- How far the Nation has to go to close the performance gap between current practice and the recommended
standard of care (goal or benchmark)—the number of years to achieving the desired performance level
based on the historical annual rate of change and the corresponding interval estimate.
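As a minimal sketch of how these three quantities might be computed, the Python below fits a straight line to annual estimates by ordinary least squares. The function name `trend_summary`, the sample values, the goal of 80 percent, and the normal-approximation multiplier of 1.96 are illustrative assumptions, not the committee's method; with only a handful of years, a t-based interval and a model that weights each year by its sampling error would likely be preferable.

```python
import math


def trend_summary(years, values, goal, z=1.96):
    """Summarize level, annual change, and years to goal (illustrative only)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))

    # Ordinary least squares slope = estimated annual rate of change.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the slope from the residual variance (n - 2 df).
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    se_slope = math.sqrt(rss / (n - 2) / sxx)

    def years_needed(rate):
        # Time to close the gap at a constant annual rate; infinite if no progress.
        gap = goal - values[-1]
        if gap <= 0:
            return 0.0
        return gap / rate if rate > 0 else math.inf

    return {
        "current_level": values[-1],
        "annual_change": slope,
        "se_annual_change": se_slope,
        "years_to_goal": years_needed(slope),
        # A faster plausible rate gives the lower bound, a slower rate the upper bound.
        "years_to_goal_interval": (
            years_needed(slope + z * se_slope),
            years_needed(slope - z * se_slope),
        ),
    }


# Hypothetical annual estimates (percent of patients receiving recommended care).
summary = trend_summary(
    [2001, 2002, 2003, 2004, 2005], [70.0, 71.5, 72.0, 73.5, 75.0], goal=80.0
)
print(summary)
```

The interval for years to goal is deliberately crude: it simply divides the remaining gap by the lower and upper plausible rates of change, and reports infinity when the slower rate implies no progress.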
These pieces of data could be presented in several ways. The committee presents the following examples as
one succinct way of conveying such information; these templates could be used to convey progress
toward benchmarks or goals if they have been set for individual measures.
Alternately, these data can be transformed into graphic displays that illustrate the nation's current performance
on each measure and estimate how long it would take to reach desired goal levels, whether defined as an
aspirational goal or one grounded in benchmark levels of attained performance for a given measure. These would
likely be line graphs showing projections, but other useful displays could be derived as well. It is anticipated that
presenting the data in such a way that quantifies and brings attention to the extent of a quality or disparity gap will
focus the attention of the reports on changes the nation can make and aspires to achieve, rather than focusing on
past performance. Going forward, these types of messages are what the committee would like to see.
Because benchmarks could potentially move from one reporting year to another (indeed, benchmarks will
ideally improve as quality improves rather than remain stagnant even among the highest performers), the question
arises as to how to indicate progress when a benchmark is not a fixed target. One could compare national average
performance to the benchmark in the baseline year and then indicate that expectations have risen over time since
the best attained performance is now set at a higher level. For example, the average national performance level in
the baseline year might be 70 percent, the benchmark of best attained performance 80 percent, and the estimate
is that it will take 10 years for the nation to achieve the 80 percent level. In year three, the national performance
is now 75 percent, the benchmark of best attained performance has risen to 85 percent, and it may again be estimated
that it will take 10 years for the nation to achieve the newly established benchmark. In year three, progress
can be reported compared with the baseline year (movement from 70 to 75 percent), the curve of a trend line can
be noted to have improved, and higher expectations can be set because some entity has shown that it is possible to
achieve 85 percent. Alternately, when and if specific goals are set, they are likely to be more fixed targets;
progress could then be assessed against those targets, and estimates of the time needed to reach them could be
provided.
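The arithmetic in this example can be laid out explicitly. The sketch below uses the figures from the text (70, 80, 75, and 85 percent); the function name and the returned field names are hypothetical labels chosen for illustration.

```python
def benchmark_progress(baseline_level, baseline_benchmark, current_level, current_benchmark):
    """Separate progress since baseline from movement of the benchmark itself."""
    return {
        # Improvement in national performance since the baseline year.
        "progress_since_baseline": current_level - baseline_level,
        # Distance to the best attained performance in the baseline year.
        "gap_at_baseline": baseline_benchmark - baseline_level,
        # Distance to the best attained performance now.
        "gap_now": current_benchmark - current_level,
        # How far expectations have risen.
        "benchmark_shift": current_benchmark - baseline_benchmark,
    }


# Figures from the example in the text, in percent.
report = benchmark_progress(70, 80, 75, 85)
print(report)
```

Here performance has risen 5 percentage points and the benchmark has also risen 5 points, so the gap remains 10 points even though progress against the baseline is real. Reporting the components separately keeps a rising benchmark from masking that progress.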
The Future Directions committee thought it wise to set goal levels of performance for states or other entities
that were informed by actual achievement so that they are not dismissed as unrealistic. Benchmarking units could
be states, hospitals, health plans, or population groups, among other units (see Chapter 6 for further discussion).
Assessments of Current Presentation of Statistics in Select Sections in the 2008 NHQR and NHDR
The Future Directions committee had a statistical expert review portions of the NHQR and NHDR; the
commentary follows. Presentation in the reports will have to balance the needs of a variety of users for simpler
exposition and statistical clarity and precision; some of the statistical information would add to the clarity of the
exposition, and more detailed statistical information might be presented in online appendixes.
National Healthcare Quality Report, 2008, Chapter 2. Effectiveness (Heart Disease)
- Page 52: In this overview of statistics for this condition, it is unclear whether the number of deaths listed
is the number of deaths due to heart disease or the number of people who had heart disease and died.
- Figure 2.15 (Adult current smokers with a checkup in the last months who received advice to quit
smoking, 2000-2005): Separate estimates are graphed here for each year and connected with solid lines. It
appears that there is an attempt to infer longitudinal patterns without actually going to the trouble of the
estimation process. A statistical model that links the annual estimates could supply more information than
simply connecting the dots. Additionally, it would be useful to know the sample sizes examined—it should
- Page 53: In the supporting text of Figure 2.15, it is not clear whether the statement regarding the 18-44
age group is a statistically significant finding or if it is based on observation of the point estimates alone.
A statistical test or estimate would be helpful. Without knowing the sample sizes, it seems that the trend
is non-linear and this may be additional information worth noting.
- Figure 2.16 (Adults with obesity who were told by a doctor they were overweight, 2002-2006): In this figure,
data are aggregated over the period 2003-2006. It is not clear why the data are not reported annually—either
a rationale for aggregating would be helpful for the reader, or simply presenting the latest data year of information would be sufficient. Additionally, the number of adults contributing to each stratum (overall, and by age group) should be reported.
- Figure 2.17 (Adults with obesity who ever received advice from a health provider to exercise more, 2002-
2005): There are several comparisons made in this graphic: temporal changes and age group differences.
There are no measures of uncertainty for the various point estimates and these could easily be incorporated
(sufficient statistics are the point estimates and the standard errors; the barplot really only shows the point
- Figure 2.19 (Hospital patients with heart attack who received recommended hospital care: overall composite
and six components, 2002-2004 (Medicare) and 2005-2006 (all payers)): The denominator indicates
patients hospitalized with a principal diagnosis of acute myocardial infarction (AMI) but the denominator
should change depending on eligibility criteria. For example, the sample sizes for the angiotensin converting
enzyme (ACE) inhibitor measure should only include those who are eligible for ACE.
- Figure 2.20 (Deaths per 1000 adult hospital admissions with acute myocardial infarction, , and
2000-2005): As with other figures in this chapter, point estimates should be accompanied by estimates
of error. Again, the connecting lines imply a desire to examine trends over time; a statistical model that
smooths the estimates across time would be useful. The title should indicate "in-hospital deaths;" length of
stay should be reported given it has changed over time, and, because the dependent variable is in-hospital
mortality, this change in exposure period may confound any observed differences.
- Figure 2.21 Similar comments to those for Figure 2.19.
- Figure 2.22 (State variation: Hospital patients with heart failure who received recommended hospital
care, 2006): In this figure, it is unclear what messages are intended to be conveyed. If the main message is
about geographic variation in receipt of recommended hospital care by state, then it is unclear as to what
constitutes an "above average" measure of variation by looking at the figure alone. The national average
and some range of values for state performance should be noted on the figure itself, not just stated in the
supporting text. The supporting text for this figure on page 61 reports observed variation of 74.3% to 94.5%
across the states. However, the denominators in the calculations vary across states, and this fact should be
addressed in any inferential (comparative) statement. From the figure, because the states have different sizes
(areas), the shading may distort the message. Additionally, as discussed in Chapter 6, the color coding is not as intuitive as may be expected (green usually means "go" or "good"; black is often associated with "bad" results; but that is not the meaning here). In terms of estimation, it is not clear how the data were
analyzed (e.g., simply aggregated the number of met measures divided by the number of opportunities
within a state; or averaged the hospital-specific opportunity scores within a state; or did something different).
Finally, the choice of "average" deserves some justification.
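To illustrate why the estimation method behind Figure 2.22 matters, the following sketch contrasts two of the approaches named above: pooling all opportunities within a state versus averaging hospital-specific opportunity scores. The hospital counts are invented for illustration and are not taken from the NHQR; the function names are hypothetical.

```python
def pooled_rate(hospitals):
    """Total measures met divided by total opportunities across all hospitals."""
    met = sum(m for m, _ in hospitals)
    opportunities = sum(o for _, o in hospitals)
    return met / opportunities


def mean_of_hospital_rates(hospitals):
    """Unweighted average of each hospital's own opportunity score."""
    return sum(m / o for m, o in hospitals) / len(hospitals)


# Hypothetical state with one large and one small hospital: (met, opportunities).
hospitals = [(900, 1000), (30, 50)]

print(f"pooled: {pooled_rate(hospitals):.3f}")
print(f"mean of hospital scores: {mean_of_hospital_rates(hospitals):.3f}")
```

With these invented counts the pooled rate is about 0.886 while the unweighted mean of hospital scores is 0.750, so the same state could fall into different categories on the map depending on which method was used.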
National Healthcare Quality Report, 2008, Chapter 6. Efficiency
- Page 135: The term "expenditure" should be defined for the reader.
- Figure 6.1 (Average annualized percentage changes in national health care expenditures and quality for
general population and people with selected conditions, 2001-2005): Text indicates quality and expenditures
are "two very different measures" (p. 136) yet they are included on the same graph in the figure. This
sends a confusing message. If the two aspects are very different, and the reader is subsequently cautioned
in the supporting text not to draw conclusions regarding the relationship between the two, then they should
not be presented together in the same graphic.
- Page 137: The term "cost" should be defined for the reader.
- Figure 6.2 (National trends in potentially avoidable hospitalization rates, by type of hospitalization, 1997
and 2000-2005): Because data points for years 1998 and 1999 are not available, the graphic should start
at 2000. While the graphic includes several time points, the statistical test on page 139 utilizes only two
time points (either the difference between 2000 and 2005 or the difference between 1997 and 2005). It is
unclear why the report does not use regression modeling to estimate the actual trends rather than testing the difference between two time points given all the data that are available. The number of hospitals used in the calculations should be reported.
- Figure 6.3 (Total national costs associated with potentially avoidable hospitalizations, 1997 and 2000-
2005): Because data points for years 1998 and 1999 are not available, the graphic should start at 2000. The
number of hospitals used in the calculations should be reported. The statement in the supporting text on
page 139 indicates that costs due to avoidable hospitalization were 35 percent greater than in 1997. If this
is a statistically significant finding, it should be noted. And if it is a statistically significant finding, then
the type of test used should be noted. Some measure of accuracy in costs per year in the figure should be
reported (the number of avoidable hospitalizations changed across the years and this should be reflected
in the graphic).
- Table 6.1 (Rehospitalizations for congestive heart failure, per 1,000 initial admissions for CHF, States,
2004 and 2005): The information in this table is somewhat perplexing. For example, the standard errors
that are reported are either 0 or 1, which is suspicious for two reasons. First, the errors are two orders of
magnitude smaller than the rates, which may mean they are not reported on the same scale as the rates
(the rates are per 100,000 admissions). Second, it seems that some rounding errors must have occurred,
as a standard error of 0 is unlikely. If none occurred, then some explanation for this value in the results
would be informative. Important information from the table is missing such as the sample sizes (number
of initial congestive heart failure [CHF] admissions) per state (and the number of hospitals per state). The
text (page 142) indicates an overall rate (210 per 100,000 admissions)—inclusion of the overall rate in the
table would be helpful. Additionally, it appears that no covariates were included in these calculations. For
clarity, define the outcome more explicitly: is it re-hospitalization for CHF within 3 months of discharge
of an initial CHF hospitalization?
- Figure 6.4 (Average estimated relative hospital cost efficiency index for a selected sample of urban general
community hospitals, 2001-2005): This figure reports estimated relative hospital cost efficiency indices
for 1,368 general community hospitals. The numbers in this figure are challenging to interpret mainly due
to the lack of a clear explanation of what each number means. Specifically, what is 100.03 (reported in
2002), and is 110.48 a clinically meaningful increase? Because each number is estimated based on data, the
standard errors (or confidence intervals) should be displayed so the reader is not misled by measurement
error. Finally, it is not clear at all how the index accounts for quality (page 142) as it appears to be based
on costs and not on quality.
- Table 6.2 (Correlates of hospital cost efficiency): This table reports correlates of hospital cost efficiency for
the 1,368 general hospitals. However, the sample sizes are not reported in the table for either the number
of hospitals or the number of discharges; the interpretation of the estimates is unclear (for example, what
is an operating margin?); and the table reports standard deviations presumably among the hospitals falling
into each quartile, but not the standard error of the actual estimates.
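The concern about the standard errors in Table 6.1 can be checked with a rough calculation. The sketch below assumes a simple binomial model (ignoring any clustering or survey design effects), uses the scale in the table title (per 1,000 initial admissions), and takes a hypothetical state with 1,000 initial admissions; none of these assumptions comes from the NHQR, and the function names are illustrative.

```python
import math


def rate_standard_error(p, n, per=1000):
    """Binomial standard error of a rate, expressed per `per` admissions."""
    return per * math.sqrt(p * (1 - p) / n)


def admissions_needed(p, target_se, per=1000):
    """Admissions required for the standard error to fall to `target_se`."""
    return p * (1 - p) * (per / target_se) ** 2


# Hypothetical: 210 rehospitalizations per 1,000 admissions, 1,000 admissions in a state.
se = rate_standard_error(0.21, 1000)
needed = admissions_needed(0.21, target_se=1.0)
print(f"SE per 1,000: {se:.1f}; admissions needed for SE of 1: {needed:,.0f}")
```

Under these assumptions the standard error is roughly 13 per 1,000, and a standard error as small as 1 would require on the order of 166,000 initial admissions in a single state, which supports the suspicion that the reported errors are on a different scale or have been rounded.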
Case Study: National Healthcare Disparities Report, 2008, Chapter 2. Quality of Health Care (Heart Disease)
- General note on this chapter: Because of the importance of the specific subgroups of interest, data completeness
and comparability for race, income, and education variables are important to report. For example,
some states that contribute to the HCUP data may have large proportions of missing race/ethnicity data.
Moreover, some states ask patients to identify their race and ethnicity, and some determine race and ethnicity
- Page 54: Similar to the observation raised for the overview of statistics for this condition in the NHQR,
it is unclear whether the number of deaths listed here is the number of deaths due to heart disease or the
number of people who had heart disease and have died. It is also unclear why the format for this overview
is different from that in the NHQR. It would make most sense if the statistical overviews for the same
conditions were the same in both reports.
- Figure 2.13 (Adults with obesity age 0 and over who were told by a doctor they were overweight, by
race/ethnicity, income, and education, 1999-2000 and 2003-2006): The number of observations in each category should be reported or the point estimates need to be accompanied by confidence intervals. There
are many comparisons listed on page 58, and it is not clear if each is statistically different. This lack of
clarity arises because the word "significantly" is used only for Blacks. The committee is not pushing for many
statistical tests, but rather for clarity on the findings as stated.
- Figure 2.14 (Adults with obesity who ever received advice from a health provider to exercise more (top
left), ethnicity (top right), income (bottom left), and education (bottom right), 2002-2005): As with other
figures, the standard errors or the sample sizes should be reported in order for readers to attempt to eliminate
- Page 60 (Last Paragraph): It is stated that the goal is to identify the independent effects of the various
factors on quality of health care. It is highly unlikely that "independent" effects were estimated for the
specific factors considered (race, income, education) as these factors are highly correlated.
- Figure 2.15 (Adults with obesity who ever received advice from a health provider to exercise: Adjusted
odds ratios, 2000-2005): Because these odds ratios are estimates, the standard errors of each should be
displayed. For example, is the odds ratio for females different from 1.0? There are many comparisons in
the single barplot, and some of the main messages get lost.
- Figure 2.16 (Composite measure: Hospital patients with heart failure who received recommended care,
Medicare only by race/ethnicity, 2002-2004 (left) and all payer 2005-2006 (right)): As with the other figures
displaying point estimates over time, either the sample sizes or standard errors should be included. By connecting
the lines, there is an implication that trends over time are important, yet these are not estimated.
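The question raised for Figure 2.15 of this chapter (whether an adjusted odds ratio differs from 1.0) can be answered once the standard error is reported. The following is a minimal sketch using the usual normal approximation on the log-odds scale; the odds ratio and standard error shown are hypothetical values, not figures from the NHDR, and the function names are illustrative.

```python
import math


def odds_ratio_interval(odds_ratio, se_log_or, z=1.96):
    """Approximate confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


def differs_from_one(odds_ratio, se_log_or, z=1.96):
    """True when the interval excludes 1.0, i.e. no-difference is implausible."""
    lower, upper = odds_ratio_interval(odds_ratio, se_log_or, z)
    return lower > 1.0 or upper < 1.0


# Hypothetical adjusted odds ratio and its standard error on the log scale.
lower, upper = odds_ratio_interval(1.15, 0.10)
print(f"95% interval: {lower:.2f} to {upper:.2f}")
print(f"differs from 1.0: {differs_from_one(1.15, 0.10)}")
```

Displaying such intervals as error bars on the barplot would let readers see at a glance which comparisons are distinguishable from no difference.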