CareScience Mortality Risk Model

Presentations from a November 2008 meeting to discuss issues related to mortality measures.

By Eugene Kroch, PhD, Michael Duan, MS, Emi Terasawa


Contents

Significance
Challenges
CareScience Mortality Risk Model
   Defining the Mortality Population
   Risk Model Specification
   Independent Variables
   Population Exemptions
   Out of Range Predictions
Data Source and Model Calibration
   MedPAR Data
   All-Payer State Data
   Private Client Data
   Model Selection for Private Client Data
   Model Selection for Public Data
Performance Assessment
Comparison to a Logit Model
   Logit Model Functional Form
   Logit Model Considerations
Calibration Data

Significance

Mortality is arguably the most commonly employed outcome measure in quality of care studies. Easily measured by simply counting deaths from discharges, inpatient mortality presents a seemingly unambiguous yardstick for judging quality. As an outcome measure, its clinical significance and relevance are unequivocal. It is the archetypical "sentinel event," signaling ultimate failure in care. For hospital staff and leadership, it forms the basis of Mortality and Morbidity Reviews, and for the public and media, it is a focus of quality assessment. In addition to its clinical relevance, mortality is easily explained and understood, a valuable attribute in performance improvement discussions and public reporting.

Return to Article Contents

Challenges

Despite the aforementioned advantages, mortality presents challenges as an outcome measure. The approach of counting deaths from discharges can inadvertently mask "true" mortality rates, which may be disguised by discharge policies. More specifically, inpatient mortality rates may be reduced by transferring the most severely afflicted patients to other acute care facilities, skilled nursing homes, or hospice facilities. Mortality rates are also prone to wide variation across diseases, rendering them irrelevant for quality analysis in certain populations. In populations where death is very rare (e.g., kidney and ureter calculus) or largely expected (e.g., patients admitted with DNR orders), mortality becomes a less meaningful quality measure.

Return to Article Contents

CareScience Mortality Risk Model

Defining the Mortality Population

Mortality rates can be defined for a range of periods (e.g., inpatient stay, N days post hospital admission, etc.); however, the CareScience Mortality Risk Model restricts its purview to inpatient mortality to isolate in-hospital care effects.

Risk Model Specification

The purpose of the CareScience Mortality Risk Model is to generate the expected or "standard" mortality rate ("risk" rate) under typical care, given the patient's health status and relevant characteristics. Patient-level mortality risk is assessed via a stratified multiple regression model with the following functional form:

y_ijk = x_ijk β_k + ε_ijk,   for all i, j, k

where y_ijk is the mortality risk rate at patient level i, provider j, and principal diagnosis k; x_ijk is a vector of patient characteristics and socioeconomic factors; β_k is the marginal effect of the independent variables on the mortality outcome measure; and ε_ijk is the random error component of the model. The strata (k) are roughly based on 3-digit ICD-9-CM diagnosis codes. Rare and insignificant diagnoses are rolled up into broad diagnosis groups, as defined in the ICD-9-CM book. A total of 142 disease strata are analyzed.
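
To make the stratified specification concrete, the sketch below fits a separate least-squares regression for each principal-diagnosis stratum. The data frame, column names, and strata labels are hypothetical, and only a handful of the regressors listed in the next section appear; it illustrates the functional form rather than the actual CareScience calibration code.

  import numpy as np
  import pandas as pd

  # Hypothetical discharge records: one row per patient; 'stratum' stands in for
  # the 3-digit ICD-9-CM principal-diagnosis stratum (k in the equation above).
  records = pd.DataFrame({
      "stratum": ["038", "038", "038", "038", "410", "410", "410", "410"],
      "died":    [0, 1, 0, 0, 0, 0, 1, 0],          # y_ijk: inpatient mortality
      "age":     [42, 66, 55, 63, 71, 63, 80, 58],
      "female":  [1, 0, 1, 0, 0, 1, 0, 1],
      "income":  [40000, 25000, 55000, 39000, 31000, 39000, 22000, 47000],
  })

  betas = {}
  for dx, grp in records.groupby("stratum"):
      # x_ijk: intercept plus a few patient characteristics (age enters with its square)
      X = np.column_stack([np.ones(len(grp)), grp["age"], grp["age"] ** 2,
                           grp["female"], grp["income"]])
      y = grp["died"].to_numpy(dtype=float)
      betas[dx], *_ = np.linalg.lstsq(X, y, rcond=None)   # beta_k for this stratum

  # Predicted mortality risks for the records in stratum "038"
  grp = records[records["stratum"] == "038"]
  X = np.column_stack([np.ones(len(grp)), grp["age"], grp["age"] ** 2,
                       grp["female"], grp["income"]])
  print(X @ betas["038"])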

Independent Variables

The following patient characteristics and socioeconomic factors comprise the set of regressors (i.e., classes of independent variables) used in the CareScience Mortality Risk Model.

  1. Age (quadratic form)
  2. Birth weight (quadratic form, for neonatal model only)
  3. Sex (female, male, unknown)
  4. Race (white, black, Asian-Pacific Islander, unknown)
  5. Income (median household income within a zip code, as reported by the U.S. Census Bureau)
  6. Distance traveled (the centroid-to-centroid distance between the zip code of the household and the zip code of the hospital or provider, represented as a relative term)
  7. Principal diagnosis (terminal or three-digit ICD-9-CM code, where statistically significant)
  8. CACR1 comorbidity scores (count of comorbidities within each of five severity categories on the CACR Likert scale)
  9. Defining diagnosis (three-digit ICD-9-CM code, for neonatal model only)
  10. Cancer status (benign, malignant, carcinoma in situ, history of cancer; derived from secondary diagnoses)
  11. Chronic disease and disease history (terminal-digit ICD-9-CM diagnosis codes, such as diabetes, renal failure, hypertension, chronic GI, chronic CP, obesity, and history of substance abuse)
  12. Valid procedure (terminal ICD-9-CM procedure codes, where clinically relevant and statistically significant)
  13. Admission source (Physician Referral, Clinic Referral, HMO Referral, Transfer from a Hospital, Skilled Nursing Facility or Another Health Care Facility, Emergency Room, Court/Law Enforcement, Newborn - Normal Delivery, Premature Delivery, Sick Baby, or Extramural Birth, Unknown/Other)
  14. Admission type (Emergency, Urgent, Elective, Newborn, Delivery, Unknown/Other)
  15. Payer class (Self-pay, Medicaid, Medicare, BC/BS, Commercial, HMO, Workman's Compensation, CHAMPUS/FEHP/Other Federal Government, Unknown/Other)
  16. Facility type (Acute, Long-term, Psych.)

Risk factors used in the CareScience risk assessment model are tailored to specific patient subpopulations and outcomes. Use of the following risk factors may vary depending on the specific subpopulation and outcome evaluated:

  • Diagnosis detail.
  • Significant comorbidities.
  • Defining procedures.
  • Birth weight (used instead of age for neonates).

CACR Comorbidity Scores

CACR comorbidity scores are derived from principal and secondary diagnosis codes. Secondary diagnoses are first categorized according to a five-point Likert scale of increasing severity (A-E), where E is most severe.2 Comorbidities are calculated for each severity level as

N_is = Σ (1 − p_ij),   summed over all p_ij in severity level S, with S = A, B, C, D, E

where N_is is the expected number of comorbidities of severity s for a patient with principal diagnosis i, p_ij is the CACI probability of complication for the jth secondary diagnosis given principal diagnosis i, and S is one of the severity levels, A-E.

Common chronic diseases enter the model as dummy variables separate from comorbidities. Both comorbidities and chronic diseases are constrained to be non-negative coefficients in the model calibration.
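
As a rough illustration of the scoring formula, the fragment below sums 1 − p_ij within each severity level. The lookup table pairing secondary diagnoses with severity ratings and complication probabilities is invented for the example; the actual values come from the clinician-assigned CACR tables.

  from collections import defaultdict

  # Hypothetical CACR lookup: (principal dx, secondary dx) -> (severity level, p_ij)
  cacr_lookup = {
      ("038", "5849"): ("E", 0.35),   # acute renal failure
      ("038", "4280"): ("D", 0.55),   # congestive heart failure
      ("038", "2449"): ("A", 0.95),   # hypothyroidism
  }

  def comorbidity_scores(principal_dx, secondary_dxs):
      """N_is: sum of (1 - p_ij) over secondary diagnoses in each severity level."""
      scores = defaultdict(float)
      for dx in secondary_dxs:
          entry = cacr_lookup.get((principal_dx, dx))
          if entry is None:
              continue                      # not rated for this principal diagnosis
          severity, p_ij = entry
          scores[severity] += 1.0 - p_ij
      return {s: scores[s] for s in "ABCDE"}

  print(comorbidity_scores("038", ["5849", "4280", "2449"]))
  # approximately {'A': 0.05, 'B': 0.0, 'C': 0.0, 'D': 0.45, 'E': 0.65}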

Valid Procedures

Strictly speaking, a procedure is not a patient characteristic but rather a provider care choice. For example, two physicians may opt to pursue two different yet equally effective courses of treatment for the same patient. Although procedures represent the discretion of the care provider, they can signal important information about the patient's overall health status. Certain procedures can serve as effective proxies for lab reports and treatment history that are not available in the current database, as well as for other unobservable critical factors. To be included in the model, procedures must be designated as "valid" for the patient's particular disease stratum. Additionally, the timing of certain procedures relative to the patient's hospital admission must be considered. Valid procedures are grouped into one of two categories based on timing criteria.

Each disease stratum has a unique set of valid procedures. If a procedure falls into Category 1, timing of the procedure is not considered, and the analytic program simply searches for the procedure's corresponding coefficient. (Procedures failing to be statistically significant are not included in the model and have no impact on the risk score.3)

If a procedure is mapped to Category 2, inclusion of the procedure in the model depends on the procedure's timing during the inpatient stay. If the procedure occurs within a critical time period from the patient's hospital admission, the procedure is included in the model. If not, the procedure is excluded. The critical time windows for Category 2 procedures are assigned by internal panels of clinicians.
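
A minimal sketch of this inclusion rule is shown below. The procedure codes, category assignments, and window lengths are hypothetical placeholders; the real valid-procedure lists and critical time windows are set per disease stratum by the clinician panels described above.

  from datetime import date

  # Hypothetical valid-procedure table for one disease stratum.
  # Category 1: timing ignored; Category 2: must occur within the critical window.
  valid_procedures = {
      "96.72": {"category": 2, "window_days": 2},     # cont. mech. ventilation >96 hrs
      "38.93": {"category": 1, "window_days": None},  # venous catheterization
  }

  def include_procedure(proc_code, admit_date, proc_date):
      """Decide whether a coded procedure enters the risk model for this stratum."""
      rule = valid_procedures.get(proc_code)
      if rule is None:
          return False                     # not a valid procedure for this stratum
      if rule["category"] == 1:
          return True                      # Category 1: timing not considered
      # Category 2: include only if performed within the critical time window
      return (proc_date - admit_date).days <= rule["window_days"]

  print(include_procedure("96.72", date(2008, 3, 1), date(2008, 3, 2)))   # True
  print(include_procedure("96.72", date(2008, 3, 1), date(2008, 3, 9)))   # False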

For several disease strata, the risk model does not incorporate valid procedures. These groups include DRGs 103, 480, 481, 495, 512, and 513.

Missing Independent Variables

As with most large databases, some records may lack one or more independent variables. Dismissing these records completely from the analysis may eliminate important patient information and in turn shrink the base sample size. This is particularly true for public data sets where missing data elements are more common. Recognizing that independent variables have varying impacts on risk scores, the risk model is designed to tolerate missing values to some extent.

Zero Tolerance

Principal Diagnosis, Age, and Birthweight (for neonates) are mandatory elements in the risk assessment model. Patient records missing any of these required elements are excluded from the model.

Conditioned Tolerance

For most categorical variables, such as Admission Source, there is an 'Unknown' category designated for unrecognizable or missing values. Because missing values arise from random errors, the 'Unknown' category often has the highest count of any category. In risk modeling, the largest and most common category is often used as the reference group, so assigning the 'Unknown' category as the reference group is justifiable; however, a high proportion of 'Unknown' values risks diluting the real characteristics of the reference group.

Due to tight quality control, 'Unknown' values are very rare in private client data. In public data, however, the missing portion ranges from a couple of percent to around ten percent. It is therefore necessary to check the distribution of the data before calibration. In general, the 'Unknown' values should not represent more than one third of the entire sample in order to be used as the reference group.
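
The check below sketches how the 'Unknown' share can be inspected before calibration and how the category serves as the reference group by being dropped from the dummy encoding. The admission-source values and the way the one-third rule is enforced are illustrative assumptions.

  import pandas as pd

  admission_source = pd.Series(
      ["Emergency Room", "Physician Referral", "Unknown", "Emergency Room", "Clinic Referral"]
  )

  # Check how much of the sample is 'Unknown' before using it as the reference group
  unknown_share = (admission_source == "Unknown").mean()
  assert unknown_share <= 1/3, "too many Unknown values to serve as the reference group"

  # One dummy column per category; dropping 'Unknown' makes it the reference group
  dummies = pd.get_dummies(admission_source).drop(columns="Unknown")
  print(dummies)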

Value Proxy

Income and Relative Distance are derived from zip code information. In the case of Income, the patient's residence zip code is used. For Relative Distance, both the patient's residence zip code and the hospital zip code are employed. If the patient's zip code is missing, the average Distance and Income of all patients in that hospital will be applied. In cases where both patient and hospital zip codes are unavailable, the Relative Distance is set to 1, and the national average income is applied.
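
A sketch of these fallback rules follows, with small hypothetical lookup tables standing in for the census income and centroid-distance data.

  def impute_income_distance(patient_zip, hospital_zip, zip_income, zip_distance,
                             hospital_avg, national_avg_income):
      """Return (income, relative distance) using the fallback rules described above."""
      if patient_zip and hospital_zip:
          return zip_income[patient_zip], zip_distance[(patient_zip, hospital_zip)]
      if hospital_zip:
          # Patient zip missing: use that hospital's patient averages
          return hospital_avg[hospital_zip]
      # Both zips missing: national average income, relative distance set to 1
      return national_avg_income, 1.0

  # Hypothetical lookups
  zip_income = {"19104": 38000}
  zip_distance = {("19104", "19107"): 0.8}      # relative distance
  hospital_avg = {"19107": (41000, 1.1)}        # (avg income, avg distance) per hospital

  print(impute_income_distance("19104", "19107", zip_income, zip_distance, hospital_avg, 44000))
  print(impute_income_distance(None, "19107", zip_income, zip_distance, hospital_avg, 44000))
  print(impute_income_distance(None, None, zip_income, zip_distance, hospital_avg, 44000))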

Population Exemptions

Due to hospital discharge policies that can mask "true" mortality rates and measurement considerations, select patients are excluded from the CareScience Mortality Risk Model and do not receive mortality risk scores.

Discharged to Acute Care Facility

At the patient level, mortality is captured by the discharge disposition field in the administrative patient record. Patients expiring in hospital can be identified by discharge disposition codes of '20.'

Patients who are transferred to an acute care facility receive discharge disposition codes of '02.' These patients have an indeterminate mortality value and are consequently excluded from mortality analyses. The mortality risk for these patients is accordingly set to 'null.'

Insufficient Mortality for Measurement

Hospital-level mortality rates hover around 2 to 3 percent; however, wide variation exists across the model's 142 disease strata. Some of the strata have very low mortality rates, indicating that mortality may not be an appropriate performance measure for all disease strata. For example, among intervertebral disc disorder patients (ICD-9 722), mortality rates are less than 0.1%.

Death is so rare that mortality is difficult to model for these types of disease strata. As a result, these disease groups are omitted from mortality analyses rather than forced into a poor model.

Out of Range Predictions

The CareScience mortality model is based on linear regression, and consequently the predicted mortality risks may fall outside the range between zero and one at the patient level. Out-of-range risks are acceptable unless they fall outside the "reasonable range" of -0.5 ≤ risk ≤ 1.5, at which point they are considered invalid. If negative risks occur in aggregate reporting, they are rounded to zero.4
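
One way to encode these rules is sketched below. The text leaves the exact point of rounding open, so this reading discards invalid patient-level risks, averages the remainder, and rounds a negative aggregate up to zero.

  def validate_patient_risk(predicted):
      """Patient-level rule: risks outside the reasonable range -0.5..1.5 are invalid."""
      return None if (predicted < -0.5 or predicted > 1.5) else predicted

  def aggregate_risk_rate(predicted_risks):
      """Average the valid patient-level risks; a negative aggregate is rounded to zero."""
      valid = [r for r in map(validate_patient_risk, predicted_risks) if r is not None]
      return max(sum(valid) / len(valid), 0.0)

  print(aggregate_risk_rate([0.12, -0.03, 0.45, 1.60]))   # 1.60 is discarded as invalid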

Return to Article Contents

Data Source and Model Calibration

CareScience employs three main data sources: MedPAR, All-Payer State data, and private client data. All three datasets are calibrated separately.

MedPAR Data

MedPAR consists of approximately 12 million inpatient visits covered by Medicare each year. These fiscal year data are generally consistent and are updated annually with roughly a one-year lag (e.g., fiscal year 2004 data were available at the end of 2005). MedPAR covers all U.S. states and territories and is publicly available; unsurprisingly, many research projects and publications are based on it. MedPAR covers around one-third of all hospital inpatients, almost all of whom are 65 and older. Consequently, some specialties, such as Pediatrics and Obstetrics, are practically absent.

All-Payer State Data

All-Payer State data include all inpatients regardless of payer type or other restrictions, thus providing an advantage over MedPAR. Additionally, All-Payer State data contain a larger volume: roughly 20 million records from around 2,700 hospitals. Despite these advantages, the data set has limitations. The most noticeable is that the data are less geographically representative: All-Payer State data come from fewer than 20 states, located mostly on the coasts. In addition to this handicap, the data set lacks a continuum of data for each of the states, since changing regulatory laws often affect the availability of states' data from year to year. This lack of continuous data can severely limit the feasibility of longitudinal studies. Additionally, because State data are released by individual states under their own data specifications, the data are often inconsistent across states. As a result, All-Payer State data require significant internal resources to validate and improve their quality. The two-year lag in release prevents All-Payer State data from being chosen as the model's calibration database, because the standards of hospital care are in constant flux (reflected in part by new codes appearing every year for diagnoses, procedures, DRGs, etc.). Despite these limitations, All-Payer State data remain a good choice for hospital ranking because of their volume and completeness of disease segments. They also serve as a reference data set for CareScience's private data.

Private Client Data

In addition to the public data sets, CareScience collects private data from clients. Client data are submitted in compliance with CareScience's Master Data Specifications (MDS), ensuring their consistency and quality. The data are updated frequently, with a three- to six-month lag, and offer much richer content that allows exploration of new model specifications. Annually, the combined Premier-CareScience database consists of about 8 million records from over 600 hospitals dispersed across the United States. Because the client base is continually changing, the number of hospitals and records may fluctuate each year. The quality and richness of the client data make it an ideal calibration database despite being smaller than the two public data sets.

Model Selection for Private Client Data

To avoid overfitting, CareScience's model calibration employs Stepwise Selection for private client data with critical significance set at 0.10. Variables are added to the model one at a time with the computational program selecting the variable whose F statistic is the largest and also meets the specified critical significance. After a variable is added, the stepwise method inspects all variables in the model and deletes any whose F statistic fails to meet the specified significance threshold. Once the check is made and the necessary deletions accomplished, another variable is added to the model. This process effectively reduces the possibility of multicollinearity caused by highly correlated independent variables. The stepwise process ends when the F statistics for every variable outside the model fail to meet the significance threshold while the F statistics for every variable within the model satisfy the significance criterion.

Due to the selection criteria, the number of selected independent variables ranges from several to dozens, depending on the disease. The R-square of the selected model may be smaller than that of an unrestricted full model, but its parameter estimates are far more robust than those of an overfitted full model. For out-of-sample predictions, robust parameter estimates generate more reliable risk scores.
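
The routine below sketches the general forward/backward stepwise logic with partial F-tests at a 0.10 threshold. It is a simplified stand-in for the production selection procedure, and the non-negativity constraints noted in the next paragraph are not imposed.

  import numpy as np
  from scipy import stats

  def rss(X, y):
      """Residual sum of squares from a least-squares fit."""
      beta, *_ = np.linalg.lstsq(X, y, rcond=None)
      resid = y - X @ beta
      return float(resid @ resid)

  def partial_f_pvalue(X_small, X_big, y):
      """p-value of the partial F test for adding one column to a nested model."""
      n, k_big = X_big.shape
      f = (rss(X_small, y) - rss(X_big, y)) / (rss(X_big, y) / (n - k_big))
      return stats.f.sf(f, 1, n - k_big)

  def stepwise_select(X, y, alpha=0.10):
      """Simplified forward/backward stepwise selection over the columns of X."""
      n, p = X.shape
      ones = np.ones((n, 1))
      design = lambda cols: np.hstack([ones, X[:, cols]])
      selected = []
      for _ in range(2 * p):                          # guard against cycling
          # Forward step: add the most significant remaining candidate, if it qualifies
          pvals = {j: partial_f_pvalue(design(selected), design(selected + [j]), y)
                   for j in range(p) if j not in selected}
          if not pvals or min(pvals.values()) > alpha:
              break
          selected.append(min(pvals, key=pvals.get))
          # Backward step: drop any selected variable that no longer meets the threshold
          for j in list(selected):
              rest = [k for k in selected if k != j]
              if partial_f_pvalue(design(rest), design(selected), y) > alpha:
                  selected.remove(j)
      return selected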

Chronic conditions and comorbidities are restricted to positive-only parameter estimates due to their clinical attributes.

Model Selection for Public Data

Public data sets are always calibrated on themselves. Because their parameter estimates are not used for out-of-sample predictions, a full model is preferred as it provides a higher R-Square.

Return to Article Contents

Performance Assessment

Provider performance can be assessed for virtually any patient grouping (e.g., hospital level, physician level, principal diagnosis, DRG, procedure, etc.) through aggregation and comparison of the model's raw and risk mortality rates. Positive deviations, as calculated below, indicate worse than expected (average) performance, while negative deviations indicate better than expected (average) performance.

Mortality Deviation_i = Raw Rate_i − Risk Rate_i = (1/n) Σ (raw mortality − mortality risk),   summed over the n patients in group i

where n is the number of patients in the ith patient group.

Statistical significance tests can be used to determine whether mortality deviations indicate reliable areas of opportunity. CareScience performance reports flag deviations significant at 75% and 95% confidence levels.

Figure 5: Computing Mortality Risk Rates and Deviations Example

Principal Diagnosis: Septicemia (038)
Sample Patient Characteristics

Raw Mortality is the dependent variable (Survived=0, Expired=1); the remaining columns are the independent variables.

Patient | Raw Mortality | Age | Age^2 | Gender (Male=0, Female=1) | Income | Comorbidities Severity D | Comorbidities Severity E | Procedure 96.72: Cont. Mech. Ventilation >96 hrs
1 | 0 | 42 | 1764 | 1 | $40,000 | 2 | 1 | 0
2 | 0 | 55 | 3025 | 1 | $55,000 | 1 | 2 | 0
3 | 0 | 63 | 3969 | 0 | $39,000 | 4 | 3 | 1
4 | 1 | 66 | 4356 | 0 | $25,000 | 3 | 3 | 1

Principal Diagnosis: Septicemia (038)

Independent Variable | Coefficient (Parameter Estimate)
Age | -0.0022
Age^2 | 0.000043
Gender | 0.0123
Income | -0.00000046
Comorbidities Severity D | 0.0694
Comorbidities Severity E | 0.1896
Cont. Mech. Ventilation >96 Hrs | 0.0939

Patient-Level Risk:

Mortality Risk = b0 + b1(age) + b2(age^2) + b3(gender) + b4(income) + …
              = 0.0186 - 0.0022(age) + 0.000043(age^2) + 0.0123(gender) - 0.00000046(income) + …
              = 0.0186 - 0.0022(42) + 0.000043(1764) + 0.0123(1) - 0.00000046(40,000) + … = 0.1882

Patient 1 has an 18.8% chance of expiring during her inpatient stay.

Provider-Level Risk:

Patient | Raw Mortality (0 = Survived, 1 = Expired) | Mortality Risk Rate (%)
1 | 0 | 19
2 | 0 | 12
3 | 0 | 24
4 | 1 | 20
5 | 0 | 17
6 | 1 | 39
SUM | 2 | 131

Raw Rate = 2/6 = 33%
Risk Rate = 131%/6 = 22%

Mortality Deviation = 33% - 22% = 11% (excess mortality)
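
Translated into a few lines of code, the computation in Figure 5 proceeds as follows (values copied from the provider-level table above):

  raw  = [0, 0, 0, 1, 0, 1]                       # observed mortality per patient
  risk = [0.19, 0.12, 0.24, 0.20, 0.17, 0.39]     # patient-level mortality risk rates

  raw_rate  = sum(raw) / len(raw)                 # 2/6, about 33%
  risk_rate = sum(risk) / len(risk)               # 1.31/6, about 22%
  deviation = raw_rate - risk_rate                # about 11% excess mortality after rounding
  print(f"raw {raw_rate:.1%}, risk {risk_rate:.1%}, deviation {deviation:.1%}")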

Return to Article Contents

Comparison to a Logit Model

Mortality is a binary outcome; the patient either lives or expires. In the CareScience Mortality Model, however, risk scores may fall outside of the 0 to 1 range due to the inherently unbounded nature of linear regression models. One approach to correcting this discrepancy is to use a logit model.

Logit Model Functional Form

Logit models are often the preferred choice for modeling binary outcomes such as mortality, since their output values are restricted to a range between 0 and 1. Mathematically, the model is expressed as

log[P_i / (1 − P_i)] = α + β_1 x_i1 + β_2 x_i2 + … + β_k x_ik

where k is the number of explanatory variables, i = 1, …, n indexes individuals, and P_i is the probability that Y_i = 1. The expression on the left-hand side is usually referred to as the logit or log-odds.5 Similar to an ordinary linear regression, the x's may be either continuous or dummy variables. The logit equation can be solved for P_i to obtain

P_i = exp(α + β_1 x_i1 + β_2 x_i2 + … + β_k x_ik) / (1 + exp(α + β_1 x_i1 + β_2 x_i2 + … + β_k x_ik))

This equation can be further simplified by dividing both the numerator and denominator by the numerator itself:

P_i = 1 / (1 + exp(−α − β_1 x_i1 − β_2 x_i2 − … − β_k x_ik))

The resulting equation has the desirable property that regardless of what values are substituted for the β's and x's, P_i will always be a number between 0 and 1.
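
A one-line implementation makes the bounding property easy to verify. The coefficients and patient values below are arbitrary illustrations, not estimates from the mortality model.

  import math

  def logit_probability(alpha, betas, xs):
      """P_i = 1 / (1 + exp(-(alpha + sum of beta_k * x_ik))); always between 0 and 1."""
      linear = alpha + sum(b * x for b, x in zip(betas, xs))
      return 1.0 / (1.0 + math.exp(-linear))

  print(logit_probability(-3.2, [0.03, 0.8], [66, 1]))    # about 0.40
  print(logit_probability(-3.2, [0.03, 0.8], [200, 1]))   # about 0.97, still below 1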

The linear regression model used by CareScience provides a good approximation to the logistic curve in localized regions of the mortality model.

Logit Model Considerations

At the aggregate level, the logit model generates results similar to those of the linear model. At the patient level, however, the logit model offers better face validity. Although the logit model presents certain advantages, considerations exist as well.

Sampling

In-hospital death is rare among many patient populations. At the hospital level, the survival to death split is around 98% to 2%. This split can be more extreme among many disease groups. For a given sample size, the standard errors of the coefficients depend heavily on the split on the dependent variable. As a general rule, the model is better with a 50%-50% split than with a 95%-5% split. The logit model, however, has a unique sampling property that allows disproportionate stratified random sampling on the dependent variable without biasing the coefficient estimates. Under such sampling schemes, the intercept changes, and the data set needs to be specifically tailored to each disease stratum.

Convergence

Convergence failure is a common issue with the logit model. Most independent variables are categorical and enter the model equation as dummy variables. Often some of the dummy variables exhibit the following property: at one level of the dummy variable every case has a 1 on the dependent variable or every case has a 0. This property causes complete separation or quasi-complete separation preventing convergence. Removing problematic dummy variables can achieve convergence. Alternatively, uncommon categories can be collapsed. In each case, the data set must be specifically tailored to each disease stratum, which is a labor-intensive process.


1Comorbidity Adjusted Complication Risk — Brailer DJ, Kroch E, Pauly MV, Huang J. Comorbidity-Adjusted Complication Risk: A New Outcome Quality Measure, Medical Care 1996; 34:490-505.
2Severity ratings are assigned by an internal panel of clinicians.
3See Sections 4.4 and 4.5 on Model Selection.
4Theoretically, it is possible to have mortality risks greater than 1 in aggregate reporting. In reality, however, these events never happen, since mortality is a relatively rare occurrence. (Aggregate mortality risks of ~0.80 are already considered unusually high.)
5Transforming the dependent variable to an odds ratio, Pi / (1- Pi), removes the equation's upper bound of 1. The lower bound of 0 is removed by taking the logarithm of the odds.


Return to Article Contents
Proceed to Next Section

 

Current as of March 2009
Internet Citation: CareScience Mortality Risk Model. March 2009. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/professionals/quality-patient-safety/quality-resources/tools/mortality/KrochMort.html