
Appendix A - HCUP

National Healthcare Quality Report, 2008

Appendix A: Statistical Methods

This section explains the statistical methods and gives formulas for the calculation of standard errors and hypothesis tests. These statistics are derived from multiple databases: the NIS, the SID, and Claritas (a vendor that compiles and adds value to Census Bureau data). For NIS estimates, the standard errors are calculated as described in the HCUP report titled "Calculating Nationwide Inpatient Sample (NIS) Variances" (Houchens et al., 2005). We refer to this report simply as the NIS Variance Report throughout this section. This method takes into account the cluster and stratification aspects of the NIS sample design when calculating these statistics using the SAS procedure PROC SURVEYMEANS. For population counts based on Claritas data, there is no sampling error.

Even though the NIS contains discharges from a finite sample of hospitals and most of the SID databases contain nearly all discharges from nearly all hospitals in the State, we treat the samples as though they were drawn from an infinite population. We do not employ finite population correction factors in estimating standard errors. We take this approach because we view the outcomes as a result of myriad processes that go into treatment decisions rather than being the result of specific, fixed processes generating outcomes for a specific population and a specific year. We consider the NIS and SID to be samples from a "super-population" for purposes of variance estimation. Further, we assume the counts (of QI events) to be binomial.

Section 1. Area Population QIs Using Claritas Population Data

  a. Standard error estimates for discharge rates per 100,000 population using the 2005 Claritas population data.

The observed rate was calculated as follows:

$$R = 100{,}000 \times \frac{\sum_{i=1}^{n} w_i x_i}{N} = 100{,}000 \times \frac{S}{N} \tag{A.1}$$

Here $w_i$ and $x_i$ are, respectively, the discharge weight and the variable of interest for patient $i$ in the NIS or SID. To obtain the estimate of $S$ and its standard error, $SE_S$, we followed instructions in the NIS Variance Report (modified for the SID, as explained above).

The population count in the denominator is a constant. Consequently, the standard error of the rate R was calculated as:

$$SE_R = 100{,}000 \times \frac{SE_S}{N} \tag{A.2}$$
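As an illustration only, the scaling step in this subsection can be sketched in Python (a language not used in this report). The function name and the input values are hypothetical. The sketch assumes that the weighted sum S and its standard error have already been obtained from a survey procedure such as PROC SURVEYMEANS, and that the population count N is a constant.

```python
def rate_per_100k(weighted_sum, se_weighted_sum, population):
    """Return (R, SE_R) per 100,000 population.

    weighted_sum    -- S, the discharge-weighted sum of the indicator
    se_weighted_sum -- SE_S, the design-based standard error of S
    population      -- N, the population count, treated as a constant
    """
    rate = 100_000 * weighted_sum / population
    se_rate = 100_000 * se_weighted_sum / population
    return rate, se_rate


# Hypothetical inputs, for illustration only.
S, SE_S, N = 12_500.0, 400.0, 5_000_000
R, SE_R = rate_per_100k(S, SE_S, N)
print(R, SE_R)
```

Because N carries no sampling error, the standard error of the rate is the standard error of S rescaled by the same factor as the rate itself.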

  b. Standard error estimates for age/sex adjusted inpatient rates per 100,000 population using the 2005 Claritas data.

We adjusted rates for age and sex using the method of direct standardization (Fleiss, 1973). We estimated the observed rates for each of 36 age/sex categories (described in Appendix C, Age Groupings for Risk Adjustment). We then calculated a weighted average of those 36 rates using weights proportional to the percentage of a standard population in each cell. Therefore, the adjusted rate represents the rate that would be expected for the observed study population if it had the same age and sex distribution as the standard population.

For the standard population, we used the age and sex distribution of the United States as a whole as of the year 2000. In theory, differences among adjusted rates are therefore not attributable to differences in the age and sex distributions among the comparison groups, because all rates were calculated with a common age and sex distribution.

The adjusted rate was calculated as follows (and subsequently multiplied by 100,000):

$$A = \frac{\sum_{g=1}^{36} N_{g,\mathrm{std}} \sum_{i=1}^{n(g)} \dfrac{w_{g,i}\, x_{g,i}}{N_{g,\mathrm{obs}}}}{\sum_{g=1}^{36} N_{g,\mathrm{std}}} = \frac{\sum_{g=1}^{36} \sum_{i=1}^{n(g)} \dfrac{N_{g,\mathrm{std}}}{N_{g,\mathrm{obs}}}\, w_{g,i}\, x_{g,i}}{N_{\mathrm{std}}} = \frac{\sum_{g=1}^{36} \sum_{i=1}^{n(g)} w^{*}_{g,i}\, x_{g,i}}{N_{\mathrm{std}}} \tag{A.3}$$

$N_{\mathrm{std}} = \sum_{g=1}^{36} N_{g,\mathrm{std}}$ = Total standard population.

$g$ = Index for the 36 age/sex cells.

$N_{g,\mathrm{std}}$ = Standard population for cell $g$ (year 2000 total U.S. population in cell $g$).

$N_{g,\mathrm{obs}}$ = Observed population for cell $g$ (year 2005 subpopulation in cell $g$; e.g., females, State of California).

$n(g)$ = Number in the sample for cell $g$.

$x_{g,i}$ = Observed quality indicator for observation $i$ in cell $g$ (e.g., 0 or 1 indicator).

$w_{g,i}$ = NIS or SID discharge weight for observation $i$ in cell $g$.

The estimates for the numerator, $S^{*}$, and its standard error, $SE_{S^{*}}$, were calculated in similar fashion to the unadjusted estimates for the numerator $S$ in formula A.1. The only difference was that the weight for patient $i$ in cell $g$ was redefined to account for both the weighting for direct standardization and the discharge weight:

$$w^{*}_{g,i} = \frac{N_{g,\mathrm{std}}}{N_{g,\mathrm{obs}}} \times w_{g,i} \tag{A.4}$$

Following instructions in the NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain the estimate of $S^{*}$ (A.3), the weighted sum in the numerator using the revised weights (A.4), and the estimate $SE_{S^{*}}$, the standard error of the weighted sum $S^{*}$. The denominator of the rate is a constant. Therefore, the standard error of the adjusted rate, $A$, was calculated as:

$$SE_A = 100{,}000 \times \frac{SE_{S^{*}}}{N_{\mathrm{std}}} \tag{A.5}$$
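The reweighting used for direct standardization can be illustrated with a short Python sketch. This is not the code used for the report: the function names, the two-cell toy data (the report uses 36 cells), and all numeric values are hypothetical. The standard error of the reweighted sum is assumed to come from a design-based survey procedure and is simply passed in.

```python
def standardized_weights(weights, cells, n_std, n_obs):
    """Revised weight for each record: (N_g,std / N_g,obs) * w_g,i."""
    return [w * n_std[g] / n_obs[g] for w, g in zip(weights, cells)]


def adjusted_rate_per_100k(weights, x, cells, n_std, n_obs):
    """Directly standardized rate per 100,000 population."""
    w_star = standardized_weights(weights, cells, n_std, n_obs)
    s_star = sum(ws * xi for ws, xi in zip(w_star, x))  # weighted sum S*
    return 100_000 * s_star / sum(n_std.values())


def se_adjusted_rate_per_100k(se_s_star, n_std):
    """Rescale the design-based SE of S*; the denominator is a constant."""
    return 100_000 * se_s_star / sum(n_std.values())


# Hypothetical two-cell example.
n_std = {"cell_1": 600, "cell_2": 400}   # standard population by cell
n_obs = {"cell_1": 300, "cell_2": 100}   # observed population by cell
weights = [5.0, 5.0, 5.0]                # discharge weights
x = [1, 0, 1]                            # 0/1 quality indicator
cells = ["cell_1", "cell_1", "cell_2"]   # cell membership of each record

A = adjusted_rate_per_100k(weights, x, cells, n_std, n_obs)
print(A)
```

In this toy example, the cell-specific rates are 5/300 and 5/100, and weighting them by the standard population shares 0.6 and 0.4 gives the same value as summing the reweighted records.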

Section 2. Provider-Based QIs Using Weighted Discharge Data (SID and NIS)

  a. Standard error estimates for inpatient rates per 1,000 discharges using discharge counts in both the numerator and the denominator.

We calculated the observed rate as follows:

$$R = 1{,}000 \times \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = 1{,}000 \times \frac{S}{N}$$

Following instructions in the HCUP NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain estimates of the discharge-weighted mean, $S/N$, and the standard error of that weighted mean, $SE_{S/N}$. We multiplied this standard error by 1,000.
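For readers without access to SAS, the following Python sketch shows the kind of calculation involved. It is an assumption on our part that a Taylor series linearization estimator, with hospitals as clusters within strata and no finite population correction, is an adequate stand-in here; the sketch is not a substitute for the procedure in the NIS Variance Report. The function name, record layout, and data are hypothetical.

```python
from collections import defaultdict


def weighted_rate_and_se(records, scale=1_000):
    """Weighted mean of x and a linearized SE, both multiplied by scale.

    records -- iterable of (stratum, hospital, weight, x) tuples
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    ratio = sum(w * x for _, _, w, x in records) / total_w

    # Sum the linearized residuals within each hospital (cluster).
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for stratum, hospital, w, x in records:
        cluster_totals[stratum][hospital] += w * (x - ratio) / total_w

    # Between-cluster variance within each stratum, summed over strata.
    variance = 0.0
    for clusters in cluster_totals.values():
        n_h = len(clusters)
        if n_h < 2:
            continue  # this sketch ignores single-hospital strata
        mean_h = sum(clusters.values()) / n_h
        variance += n_h / (n_h - 1) * sum(
            (e - mean_h) ** 2 for e in clusters.values()
        )
    return scale * ratio, scale * variance ** 0.5


# Hypothetical data: one stratum, two hospitals, unit weights.
sample = [
    ("s1", "hosp_a", 1.0, 1),
    ("s1", "hosp_a", 1.0, 0),
    ("s1", "hosp_b", 1.0, 0),
    ("s1", "hosp_b", 1.0, 0),
]
rate, se = weighted_rate_and_se(sample)
print(rate, se)
```

The standard error is large relative to the rate in this toy example because only two hospitals contribute to the between-cluster variance.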

  b. Standard error estimates for age/sex adjusted inpatient rates per 1,000 discharges using inpatient counts in both the numerator and the denominator.

We used the full NIS estimates for the standard inpatient population age-sex distribution. For each of the 36 age-sex categories, we estimated the number of U.S. inpatient discharges, $\hat{N}_{g,\mathrm{std}}$, in category $g$. We calculated the directly adjusted rate:

$$A = 1{,}000 \times \frac{\sum_{g=1}^{36} \hat{N}_{g,\mathrm{std}} \left( \dfrac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right)}{\sum_{g=1}^{36} \hat{N}_{g,\mathrm{std}}} = 1{,}000 \times \sum_{g=1}^{36} \hat{P}_{g,\mathrm{std}} \left( \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right)$$

$g$ = Index for the 36 age/sex cells.

$\hat{N}_{g,\mathrm{std}}$ = Standard inpatient population for cell $g$ (NIS estimate of the total U.S. inpatient population for cell $g$).

$n(g)$ = Number in the sample for cell $g$.

$x_{g,i}$ = Observed quality indicator for observation $i$ in cell $g$.

$w_{g,i}$ = NIS or SID discharge weight for observation $i$ in cell $g$.

Note that

$$\hat{P}_{g,\mathrm{std}} = \frac{\hat{N}_{g,\mathrm{std}}}{\sum_{g=1}^{36} \hat{N}_{g,\mathrm{std}}}$$

is the proportion of the standard inpatient population in cell $g$. Consequently, the adjusted rate is a weighted average of the cell-specific rates with cell weights equal to $\hat{P}_{g,\mathrm{std}}$. These cell weights are merely a convenient, reasonable standard inpatient population distribution for the direct standardization. Therefore, we treat these cell weights as constants in the variance calculations:

$$SE(A) = \sqrt{\mathrm{Var}(A)} = 1{,}000 \times \sqrt{\mathrm{Var}\left( \sum_{g=1}^{36} \hat{P}_{g,\mathrm{std}} \times \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right)} = 1{,}000 \times \sqrt{\sum_{g=1}^{36} \hat{P}_{g,\mathrm{std}}^{\,2} \times \mathrm{Var}\left( \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}} \right)}$$

The variance of the ratio enclosed in parentheses was estimated separately for each cell g by squaring the SE calculated using the method of Section 2.a:

$$SE(A) = 1{,}000 \times \sqrt{\sum_{g=1}^{36} \hat{P}_{g,\mathrm{std}}^{\,2} \times SE(R_g)^{2}}, \qquad R_g = \frac{\sum_{i=1}^{n(g)} w_{g,i}\, x_{g,i}}{\sum_{i=1}^{n(g)} w_{g,i}}$$

Following instructions in the HCUP NIS Variance Report (modified for the SID, as explained above), we used PROC SURVEYMEANS to obtain estimates of the weighted means, $R_g$, and their standard errors.
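The combination of cell-specific results can be sketched in Python as follows. The function name, the two-cell inputs (the report uses 36 cells), and the numeric values are hypothetical. The cell-specific weighted means and their standard errors are assumed to have been estimated beforehand, and the cell weights are treated as constants, as described above.

```python
import math


def adjusted_rate_and_se(n_std, cell_rates, cell_ses, scale=1_000):
    """Combine cell-specific weighted means into an adjusted rate and SE.

    n_std      -- standard inpatient population estimate for each cell
    cell_rates -- weighted mean R_g for each cell (a proportion)
    cell_ses   -- standard error of R_g for each cell
    """
    total = sum(n_std)
    p = [n / total for n in n_std]  # cell weights, treated as constants
    rate = sum(pg * rg for pg, rg in zip(p, cell_rates))
    variance = sum(pg ** 2 * se ** 2 for pg, se in zip(p, cell_ses))
    return scale * rate, scale * math.sqrt(variance)


# Hypothetical two-cell example.
A, SE_A = adjusted_rate_and_se(
    n_std=[600, 400],
    cell_rates=[0.010, 0.020],
    cell_ses=[0.003, 0.004],
)
print(round(A, 3), round(SE_A, 3))
```

Summing the squared, weighted standard errors in this way carries no covariance terms between cells, which matches the form of the variance expression given above.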

Section 3. Significance Tests

Let $R_1$ and $R_2$ be either observed or adjusted rates calculated for comparison groups 1 and 2, respectively. Let $SE_1$ and $SE_2$ be the corresponding standard errors for the two rates. We calculated the test statistic and (two-sided) p-value:

$$t = \frac{R_1 - R_2}{\sqrt{SE_1^{2} + SE_2^{2}}}, \qquad p = 2 \times \mathrm{Prob}(Z > |t|)$$

where Z is a standard normal variate.

Note: The following expressions calculate p in SAS and Excel:

SAS: p = 2 * (1 - PROBNORM(ABS(t)));

Excel: = 2*(1 - NORMDIST(ABS(t),0,1,TRUE))
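An equivalent calculation can be sketched in Python using only the standard library. The function name and the example rates are hypothetical. The sketch relies on the identity 2 * Prob(Z > |t|) = erfc(|t| / sqrt(2)) for a standard normal variate Z.

```python
import math


def two_sided_test(r1, se1, r2, se2):
    """Return (t, p) for the difference between two rates."""
    t = (r1 - r2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


# Hypothetical rates and standard errors for two comparison groups.
t, p = two_sided_test(r1=14.0, se1=3.0, r2=10.0, se2=4.0)
print(round(t, 3), round(p, 4))
```

Using erfc directly avoids the loss of precision that can occur when a cumulative probability very close to 1 is subtracted from 1.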


Current as of March 2009
Internet Citation: Appendix A - HCUP: National Healthcare Quality Report, 2008. March 2009. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/research/findings/nhqrdr/nhqr08/methods/hcupapa.html