by C. G. Chute, M.D., Dr.P.H.
Throughout the history of medicine, reliance on observation and
inference has been the major
operational principle of good practice. Throughout the training
of physicians in the late 20th century,
this tradition has been reinforced by extending the notion of
observation beyond the physical
exam and careful history to include diagnostic evaluations. A
consequence of this march of
progress, then, is an explosion of information and knowledge that
can be brought to bear in the
treatment and management of patients. In short, medical practice,
whether we like it or not, has
become an information-intensive enterprise, relying in
unprecedented ways on the synthesis of
detailed and technical observations with complex and interrelated
knowledge that embodies our
best notions of current practice. Were this not sufficient to
capture our attention, the pace of new
knowledge discovery, integration of that knowledge into
guidelines, and expectations of
infallible excellence all conspire to force the management of
patient information and practice
knowledge to become a major priority in medical education and
practice; its role in applied
patient care research also remains preeminent.
The emphasis on characterizing patient information—including
presenting conditions, findings,
symptoms, working diagnoses, interventions, and outcomes—is
manifest in a broad spectrum of
health analyses. Clinical epidemiology, outcomes analysis, health
services research, guideline
development, continuous quality improvement, and health economics
are among the traditions
that rely fundamentally on a consistent representation of
underlying patient data. If this premise
holds, then surely great attention must have been paid to the
basic problems of how to
consistently represent clinical information in a standardized
way. Knowing readers realize that
nothing could be further from the truth.
Classifications and Nomenclatures
Since the London Bills of Mortality were published in 1662 (1),
periodic efforts have been made to categorize human mortality (2).
International coordination of these efforts is manifest in the
current International Classification of Diseases (ICD) (3). During
the middle of this century, efforts to address human morbidity
became a focus of attention, galvanized by the introduction of the
American Clinical Modification of the ICD (ICD-9-CM) (4) in 1977. In
parallel with these large-scale efforts, the multiaxial Standard
Nomenclature of Diseases and Operations (5), begun in 1928,
evolved through the pathology classification of SNOP (6) to become
what we now know as SNOMED (Systematized Nomenclature of Human
and Veterinary Medicine) International (7). Similarly, the Read
Codes (8) have become the basis of patient care
coding in the United Kingdom. With such varied and intensive
activity in data representation, the
problem of robustly representing patient data must surely be
solved? The evidence does not
entirely support this notion.
The vast majority of patient data in the United States are coded
exclusively according to the
ICD-9-CM, primarily for reimbursement purposes. These data form
the basis of national data sets,
compiled by the Health Care Financing Administration and other
major insurers, which are used
to establish health policy and practice standards. Yet, a simple
review of the classification
reveals that it is devoid of any notion regarding disease
severity. Indeed, two patients with
widely differing conditions having profoundly different natural
histories and outcomes will often
be coded to the identical rubric. For example, a man with a
microscopic focus of indolent
prostate cancer found accidentally in the course of a
transurethral resection of the prostate for
benign prostatic hypertrophy will be coded identically to an
unfortunate man with widely
metastatic prostate cancer involving multiple bone sites, liver,
brain, and lungs. Clearly these
two men do not have the same prognosis and should not be
collapsed into the same category, yet
the best available health data in the United States do precisely
this. How well, then, do the major
clinical classifications work?
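The collapse of the two prostate cancer patients into one category can be made concrete with a minimal sketch. The rubric label and the modifier strings below are hypothetical stand-ins chosen for illustration; they are not actual ICD-9-CM codes or the content of any real terminology. The sketch only shows that a grouping key without severity information cannot tell the two men apart, while a key that carries modifiers can.

```python
from dataclasses import dataclass

# Hypothetical rubric label standing in for a single classification category.
RUBRIC = "malignant neoplasm of prostate"


@dataclass(frozen=True)
class CodedPatient:
    description: str
    rubric: str
    # Hypothetical free-form modifiers (extent, behavior, sites).
    modifiers: frozenset = frozenset()


incidental = CodedPatient(
    description="microscopic indolent focus found during transurethral resection",
    rubric=RUBRIC,
    modifiers=frozenset({"extent: microscopic focus", "behavior: indolent"}),
)
metastatic = CodedPatient(
    description="widely metastatic disease",
    rubric=RUBRIC,
    modifiers=frozenset({"extent: widely metastatic",
                         "sites: bone, liver, brain, lung"}),
)


def rubric_only_key(patient: CodedPatient) -> str:
    """Grouping key when only the rubric is recorded."""
    return patient.rubric


def detailed_key(patient: CodedPatient) -> tuple:
    """Grouping key when modifiers are recorded alongside the rubric."""
    return (patient.rubric, patient.modifiers)


same_by_rubric = rubric_only_key(incidental) == rubric_only_key(metastatic)
same_with_modifiers = detailed_key(incidental) == detailed_key(metastatic)

print("Indistinguishable by rubric alone:", same_by_rubric)
print("Indistinguishable with modifiers:", same_with_modifiers)
```

Any analysis that stratifies on the rubric alone would pool these two patients, which is the misclassification the example in the text describes.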
In a study undertaken by the Computer-Based Patient Records
Institute (CPRI), an attempt was
made to quantify how well patient data are captured by major
clinical coding systems (9).
Employing narrative texts drawn from four medical centers and a
variety of chart components
(history and physical, procedure notes, nursing notes, etc.), the
authors extracted 3,061 clinical
concepts to create a consistent set of findings to be coded by
different terminology systems.
After encoding, the quality of each assignment was judged on a
0-2 scale and averaged over all
concepts. Table 1 summarizes the salient points of a recent
publication by Chute and
colleagues (9).
Table 1. Clinical content capture by major terminologies
| Terminology system | Diagnoses | Findings | Modifiers | Others | Treatment and procedures | Overall |
|---|---|---|---|---|---|---|
| ICD-10 | 1.60 | 1.08 | 0.27 | 0.46 | 0.26 | 0.62 |
| ICD-9-CM | 1.61 | 1.23 | 0.36 | 0.51 | 1.00 | 0.77 |
| CPT | 0.00 | 0.13 | 0.07 | 0.58 | 0.52 | 0.17 |
| SNOMED | 1.90 | 1.82 | 1.69 | 1.52 | 1.78 | 1.74 |
| Read V2 | 1.47 | 1.36 | 0.65 | 1.50 | 1.26 | 1.05 |
Note: Average of subjective scores on a 0-2 scale for 3,061
clinical concepts from narrative texts.
Adapted from: Chute CG, Cohn SP, Campbell KE, et al. The content
coverage of clinical classifications. Journal of the American
Medical Informatics Association 1996;3:224-33.
ICD-10 is International Classification of Diseases, 10th Revision.
ICD-9-CM is International Classification of Diseases, 9th Revision,
Clinical Modification. CPT is Current Procedural Terminology.
SNOMED is Systematized Nomenclature of Human and Veterinary
Medicine. Read V2 is Version 2 of Read Codes.
Among the striking observations is that, overall, ICD-9-CM
captures considerably less than half
(0.77/2) of the information considered important within the
texts. ICD-10 does even less well,
suggesting that it alone will not solve our problems with
classifying and capturing patient data.
Of the systems evaluated, SNOMED performed in a clearly superior
way, albeit not without
some measure of information loss (about 13 percent). The overall
conclusion is that major
amounts of information go unrecognized, inevitably resulting in
significant misclassification
problems for analyses based on data encoded with these
terminologies. Hence, the major sources
of clinical data in the United States and throughout the world
may be misleading.
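The arithmetic behind these statements is simple enough to check directly. The short sketch below takes the overall scores from Table 1, divides each by the maximum score of 2 to express it as a fraction of content captured, and reports the remainder as information loss. The function names are illustrative only, and treating the mean score divided by 2 as a "fraction captured" follows the reading used in the paragraph above rather than any formal definition.

```python
MAX_SCORE = 2.0  # top of the 0-2 scoring scale

# Overall column of Table 1.
overall_scores = {
    "ICD-10": 0.62,
    "ICD-9-CM": 0.77,
    "CPT": 0.17,
    "SNOMED": 1.74,
    "Read V2": 1.05,
}


def coverage(score: float, max_score: float = MAX_SCORE) -> float:
    """Fraction of the maximum possible score that was achieved."""
    return score / max_score


def loss_percent(score: float) -> int:
    """Information loss as a whole-number percentage of the maximum."""
    return round((1.0 - coverage(score)) * 100)


# List the systems from best to worst overall coverage.
for name, score in sorted(overall_scores.items(),
                          key=lambda item: item[1], reverse=True):
    print(f"{name:10s} captured {coverage(score):.1%} of the maximum score")

print("SNOMED loss (percent):", loss_percent(overall_scores["SNOMED"]))
```

On this reading, ICD-9-CM falls below one half of the maximum, and SNOMED's loss rounds to the 13 percent quoted in the text.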
Several efforts are presently underway to develop consistent and
robust terminologies intended
to capture the clinical detail and substance of patient findings
and events. Among them is the
Convergent Medical Terminology (CMT) project (10). The CMT project,
being undertaken by
Mayo Foundation and Kaiser Permanente with funding from the
National Library of Medicine
(NLM) and the Agency for Health Care Policy and Research (AHCPR),
intends to expand a
clinically relevant subset of the Large-Scale Vocabulary, using a
knowledge representation
environment (IBM's prototype K-Rep) (11) to better capture the
relationships between observations
and to capture pertinent modifiers of these conditions such as
severity.
The seminal contributions of the International Classification of
Primary Care (ICPC) (12) provide
another dimension of functionality, constituting a comprehensive
classification specifically
organized for primary care. Further, only the ICD can claim a
larger international contribution to
editorial content and widespread implementation. Nevertheless,
the ICPC is not focused on the
clinical detail so important to the valid and unbiased analyses
of clinical information relating to
care practice and outcomes.
Information Exchange
Few would argue that it is sufficient to capture and classify
clinical data; practical care delivery
depends critically on being able to exchange relevant information
when and where it is needed.
The domain of clinical messaging standards, perhaps best
exemplified by HL7 (Health Level Seven),
attempts to accommodate these requirements. It has been widely
noted that the health care
industry is further ahead in establishing standards to exchange
messages than it is in
standardizing a consistent content for them (13). Most users of
message interchange technologies
readily acknowledge that considerable latitude exists in how
these standards are implemented,
detracting considerably from the vision of "plug and play"
levels of comparability.
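The gap between a shared message format and shared content can be sketched as follows. The two segments below are written in the pipe-delimited style of HL7 version 2 messages, but they are hypothetical: the site names, the local code PCA-METS, the coding-system labels, and the helper diagnosis_code are all assumptions made for illustration and are not drawn from any real interface. Both segments parse with the same code, yet the diagnoses they carry cannot be compared without a mapping between the two coding schemes.

```python
def diagnosis_code(segment: str) -> tuple:
    """Split a pipe-delimited diagnosis segment and return the
    (identifier, text, coding system) components of its fourth field."""
    fields = segment.split("|")
    components = fields[3].split("^")
    # Pad so that a short field still yields three components.
    identifier, text, system = (components + ["", "", ""])[:3]
    return identifier, text, system


# Hypothetical segments from two sites describing prostate cancer.
site_a = "DG1|1||185^Malignant neoplasm of prostate^I9C"
site_b = "DG1|1||PCA-METS^Prostate cancer, widely metastatic^L"

code_a = diagnosis_code(site_a)
code_b = diagnosis_code(site_b)

# The message syntax is shared, so both segments parse the same way.
print("Site A:", code_a)
print("Site B:", code_b)

# The content is not shared: the coding systems differ, so the two
# identifiers cannot be compared directly.
directly_comparable = code_a[2] == code_b[2]
print("Directly comparable:", directly_comparable)
```

A receiving system can accept both messages without error and still be unable to tell whether the two sites are describing the same condition.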
CEN (Comité Européen de Normalisation) Technical
Committee (TC) 251 on Health Informatics
coordinates European efforts in this domain. Having first focused
on a body of fundamental
specifications about health information standards, the
constituent working groups propose to
evolve a spectrum of integrated standards that will support
consistent clinical content. TC 251
working groups understand the difficulties of
language-independent representations and
interchange, and they may produce a level of standards
specification that can contribute to an
international solution for patient data.
While the technical limitations and challenges of clinical
information representation and
exchange are formidable, they pale against the political issue of
whether people wish to have
patient data managed with such facility at all. This is at the
heart of the confidentiality question,
which is central to any consideration of primary care information
policies.
Confidentiality and Commitment
Although the societal benefit of deriving new knowledge from
systematically collected
repositories of patient experience seems obvious to me, it must
be balanced against the risk and
concern associated with potential misuse of confidential data.
Ample case reports testify to the
genuine risk to future insurability or employment associated with
inappropriate access to patient
records. Public attitudes are justifiably distrustful of any
efforts to facilitate collection,
interchange, or access to patient information, regardless of how
noble the motivation. This has
become manifest in many pending congressional bills, which range
from reasonable restriction
of unjustified data exchange to a complete ban on collecting
health care data for any reason.
Indeed, many in the confidentiality community regard the use of
patient data for research or
knowledge-generation purposes as a dangerous loophole in much of
current draft legislation.
Sentiment to close these loopholes by prohibiting all research
use of patient information has an
interested audience in the Federal legislature and among many
State lawmakers.
The informatics community may perform no more valuable service
within the next decade than
to help ensure the confidentiality of primary patient
information. This includes standards for
repositories of patient data that require encryption of
information to ensure against its misuse yet
enable the linkage of subsequent outcomes or followup events with
earlier episodes of care. In
this way, longitudinal profiles of patient experience can be used
to improve our understanding of
disease natural histories or to empirically evaluate management
options with respect to patient
outcomes. Indeed, without the assurance of research access to
patient information, the great
promise of this information-intensive age of medicine will pass
unmet, and our opportunity to
efficiently and effectively improve the quality of care we
deliver will go unaddressed.
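One way to reconcile protection of identity with linkage of later events is to replace each patient identifier with a keyed pseudonym. The sketch below uses a keyed hash (HMAC-SHA256 from the Python standard library) rather than reversible encryption; this is a substituted technique chosen for illustration, not a method proposed in this paper. The identifiers, the key, and the record layout are hypothetical, and a real repository would also need key custody, access control, and protection against re-identification from the clinical content itself.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier.

    The same identifier and key always give the same pseudonym, so
    records can be linked, but the identifier cannot be read back
    from the pseudonym without the key.
    """
    return hmac.new(secret_key, patient_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical key, assumed to be held only by a trusted custodian.
key = b"example-key-held-by-a-trusted-custodian"

# Hypothetical records from two points in one patient's history,
# plus one record for a different patient.
episode = {"pid": pseudonym("MRN-0001", key), "event": "initial diagnosis"}
followup = {"pid": pseudonym("MRN-0001", key), "event": "outcome at followup"}
other = {"pid": pseudonym("MRN-0002", key), "event": "initial diagnosis"}

# Linkage works on pseudonyms alone, without exposing the identifier.
linked = episode["pid"] == followup["pid"]
print("Episode and followup linked:", linked)
print("Different patients linked:", episode["pid"] == other["pid"])
```

Because the pseudonym is stable, a research repository could assemble longitudinal profiles of the kind described above while storing no direct identifiers.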
Our needs for standards in primary care range across many
challenges. Perhaps the most
fundamental are those that address the capture and consistent
representation of patient findings,
conditions, or events to enable the generation of new knowledge,
insights, and understanding for
improving the care we deliver. In parallel with these content
standards are standards supporting
the efficient exchange of data and knowledge at the time and
place of need. Finally, the most
strategic requirement concerns our ability to assure patients and
society that the security and
confidentiality of personal histories can be protected while
preserving the legitimate needs for
aggregate analyses to deliver the promise implicit in the
information-intensive age of health care.
References
1. Graunt J. Natural and political observations made upon the
Bills of Mortality, 1662. Baltimore: Johns Hopkins Press; 1939.
2. Greenwood M. Medical statistics from Graunt to Farr.
Biometrika 1941;32:101-27; 1942;32:203-25; 1943;33(I):1-24.
3. World Health Organization. Manual of the International
Statistical Classification of Diseases, Injuries, and Causes of
Death (9th Revision). Geneva; 1977.
4. International Classification of Diseases, Ninth Revision,
Clinical Modification (ICD-9-CM), Vols. 1-3. Ann Arbor, MI:
Commission on Professional and Hospital Activities; 1993.
5. Standard Nomenclature of Diseases and Operations. Chicago:
American Medical Association; 1933.
6. Systematized Nomenclature of Pathology. Chicago: College of
American Pathologists; 1965.
7. Côté RA, Rothwell DJ, Palotay JL, et al. SNOMED International.
Northfield, IL: College of American Pathologists; 1994.
8. NHS Centre for Coding and Classification. Read Codes File
Structure Version 3: Overview and technical description.
Woodgate, Leicestershire, UK; 1993.
9. Chute CG, Cohn SP, Campbell KE, et al. The content coverage of
clinical classifications. Journal of the American Medical
Informatics Association 1996;3:224-33.
10. Medical speak. Wall Street Journal 1995 March 9; p. 1.
11. Mays E, Weida R, Dionne R, et al. Scalable and expressive
medical terminologies. Journal of the American Medical
Informatics Association; in press.
12. Lamberts H, Wood M, Editors. International Classification of
Primary Care. New York: Oxford Press; 1987.
13. General Accounting Office. Automated medical records:
leadership needed to expedite standards development.
Washington; 1993.