Practical Considerations in Interoperability: The Unified Medical Language System
Unified Medical Language System
• Long-Term Project of NLM (started 1986)
• Focus on Interoperability
• Increasingly Standards-Based
UMLS Purpose
• Make it easy for health professionals and researchers to retrieve and integrate relevant information from disparate automated sources, e.g.
  – computer-based patient records
  – factual databanks
  – bibliographic databases and full text
  – expert systems
• Antedated and Anticipated the Web
UMLS Focus: Conceptual Connections
• Build knowledge sources that can be used by intelligent programs to overcome:
  – disparities in language used by different users and in different information sources;
  – difficulties in identifying which of many information sources is relevant
UMLS Knowledge Sources
Multi-purpose Tools or “Intellectual Middleware” for System Developers
• Metathesaurus
• SPECIALIST lexicon and lexical programs
• Semantic Network
UMLS Distribution
• Annual updates since 1990
• Free under license agreement with NLM
  – Need separate license agreements with vocabulary producers for some uses of some vocabularies in the Metathesaurus
• Available to licensed users (~2000) via Internet server and on CDs
  http://www.nlm.nih.gov/research/umls/
UMLS Metathesaurus: Finely Granular Concepts
• Concepts, terms, and attributes from many controlled vocabularies
• New inter-source relationships, definitional information, use information
• Scope determined by combined scope of source vocabularies
• Strict definition of synonymy (= identity)
Organization of Metathesaurus
• Semantic Locality
  – Synonymy
  – Contexts
  – Co-Occurrences
  – Asserted Relationships
• Analogous to Orthography
Concepts (MRCON)
CUI       LAT  TS  LUI       STT  SUI       STR
C0014834  ENG  P   L0014834  PF   S0038884  Escherichia coli
C0014834  ENG  S   L0004656  PF   S0004045  Bacterium coli
C0014834  ENG  S   L0789289  PF   S0864664  E COLI
C0014834  FRE  P   L0166913  PF   S0231394  ESCHERICHIA COLI
C0014834  GER  P   L0411596  PF   S0535878  ESCHERICHIA COLI
C0014834  POR  P   L0325532  PF   S0433837  ESCHERICHIA COLI
C0014834  RUS  P   L0893188  PF   S1097005  ESCHERICHIA COLI
C0014834  SPA  P   L0343199  PF   S0451504  ESCHERICHIA COLI
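The MRCON rows on this slide show how one concept identifier ties together strings from several languages. The following is a minimal Python sketch, assuming the column order shown on the slide (concept ID, language, term status, term ID, string type, string ID, string). The helper names `terms_by_language` and `preferred_term` are hypothetical and are not part of any UMLS tool.

```python
from collections import defaultdict

# Rows transcribed from the MRCON slide:
# (CUI, language, term status, LUI, string type, SUI, string)
MRCON_ROWS = [
    ("C0014834", "ENG", "P", "L0014834", "PF", "S0038884", "Escherichia coli"),
    ("C0014834", "ENG", "S", "L0004656", "PF", "S0004045", "Bacterium coli"),
    ("C0014834", "ENG", "S", "L0789289", "PF", "S0864664", "E COLI"),
    ("C0014834", "FRE", "P", "L0166913", "PF", "S0231394", "ESCHERICHIA COLI"),
    ("C0014834", "GER", "P", "L0411596", "PF", "S0535878", "ESCHERICHIA COLI"),
    ("C0014834", "POR", "P", "L0325532", "PF", "S0433837", "ESCHERICHIA COLI"),
    ("C0014834", "RUS", "P", "L0893188", "PF", "S1097005", "ESCHERICHIA COLI"),
    ("C0014834", "SPA", "P", "L0343199", "PF", "S0451504", "ESCHERICHIA COLI"),
]


def terms_by_language(rows):
    """Group (term status, string) pairs under each (CUI, language) key."""
    grouped = defaultdict(list)
    for cui, lat, ts, _lui, _stt, _sui, string in rows:
        grouped[(cui, lat)].append((ts, string))
    return grouped


def preferred_term(rows, cui, lat="ENG"):
    """Return the string whose term status is 'P' for this concept and language."""
    for ts, string in terms_by_language(rows)[(cui, lat)]:
        if ts == "P":
            return string
    return None


if __name__ == "__main__":
    print(preferred_term(MRCON_ROWS, "C0014834"))
    print(terms_by_language(MRCON_ROWS)[("C0014834", "ENG")])
```

Keying everything on the concept identifier rather than on any one string is what lets "Bacterium coli" and "E COLI" resolve to the same concept.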
Sources (MRSO)
CUI       LUI       SUI       SAB     TTY  Code
C0014834  L0004656  S0004045  SNM2    SY   E-1571
C0014834  L0004656  S0004045  SNMI97  SY   L-15601
C0014834  L0014834  S0038884  CSP94   PT   0336-0160
C0014834  L0014834  S0038884  LCH90   PT   U001655
C0014834  L0014834  S0038884  MSH98   MH   D004926
C0014834  L0014834  S0038884  RCD95   PT   X73K5
C0014834  L0014834  S0038884  SNM2    PT   E-1571
C0014834  L0014834  S0038884  SNMI97  PT   L-15601
C0014834  L0166913  S0231394  INS97   MH   D004926
C0014834  L0325532  S0433837  BRMP98  MH   D004926
C0014834  L0343199  S0451504  BRMS98  MH   D004926
C0014834  L0411596  S0535878  DMD97   MH   D004926
C0014834  L0789289  S0864664  LNC10I  RN   NOCODE
C0014834  L0893188  S1097005  RUS98   MH   D004926
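Because every source row carries the same concept identifier, the MRSO data on this slide can act as a pivot between vocabularies. The sketch below assumes the column order shown on the slide (concept ID, term ID, string ID, source abbreviation, term type, source code). The function `translate` is a hypothetical helper written for illustration, not an NLM API.

```python
# Rows transcribed from the MRSO slide:
# (CUI, LUI, SUI, source abbreviation, term type, code in source)
MRSO_ROWS = [
    ("C0014834", "L0004656", "S0004045", "SNM2", "SY", "E-1571"),
    ("C0014834", "L0004656", "S0004045", "SNMI97", "SY", "L-15601"),
    ("C0014834", "L0014834", "S0038884", "CSP94", "PT", "0336-0160"),
    ("C0014834", "L0014834", "S0038884", "LCH90", "PT", "U001655"),
    ("C0014834", "L0014834", "S0038884", "MSH98", "MH", "D004926"),
    ("C0014834", "L0014834", "S0038884", "RCD95", "PT", "X73K5"),
    ("C0014834", "L0014834", "S0038884", "SNM2", "PT", "E-1571"),
    ("C0014834", "L0014834", "S0038884", "SNMI97", "PT", "L-15601"),
    ("C0014834", "L0166913", "S0231394", "INS97", "MH", "D004926"),
    ("C0014834", "L0325532", "S0433837", "BRMP98", "MH", "D004926"),
    ("C0014834", "L0343199", "S0451504", "BRMS98", "MH", "D004926"),
    ("C0014834", "L0411596", "S0535878", "DMD97", "MH", "D004926"),
    ("C0014834", "L0789289", "S0864664", "LNC10I", "RN", "NOCODE"),
    ("C0014834", "L0893188", "S1097005", "RUS98", "MH", "D004926"),
]


def translate(rows, sab_from, code, sab_to):
    """Map a code in one source to the codes another source uses for the same concept."""
    # Step 1: find every concept that the starting source labels with this code.
    cuis = {r[0] for r in rows if r[3] == sab_from and r[5] == code}
    # Step 2: collect the target source's codes for those concepts.
    return sorted({r[5] for r in rows if r[0] in cuis and r[3] == sab_to})


if __name__ == "__main__":
    # MeSH heading D004926 to the SNOMED International code for the same concept.
    print(translate(MRSO_ROWS, "MSH98", "D004926", "SNMI97"))
```

The lookup returns a list because one concept can carry several codes in a target source, and an empty list when the target source does not cover the concept.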
Semantic Type (MRSTY)
C0014834  T007  Bacterium
Definition (MRDEF)
C0014834  MSH98  A species of gram-negative, facultatively anaerobic, rod-shaped bacteria commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce diarrhea and pyogenic infections.
Relationships and Contexts (MRREL & MRCXT)
CUI1      REL  CUI2      Related concept           SAB
C0014834  CHD  C0376624  Escherichia coli O157-H7  RCD95
C0014834  CHD  C0376625  Escherichia coli O157     MSH98
C0014834  PAR  C0014346  Enterobacteriaceae        SNM2
C0014834  PAR  C0014833  Escherichia               CSP94
C0014834  PAR  C0014833  Escherichia               MSH98
C0014834  PAR  C0014833  Escherichia               RCD95
C0014834  PAR  C0014833  Escherichia               SNMI97
C0014834  RN   C0002054  Alkalescens-Dispar Group  MTH
C0014834  RO   C0148325  Verotoxin 2               MTH
C0014834  RO   C0370139  Verotoxin 1               MTH
C0014834  RO   C0370140  Verotoxin 1 Antibody      MTH
C0014834  RO   C0370141  Verotoxin 2 Antibody      MTH
C0014834  SIB  C0315244  Escherichia hermannii     RCD95
...
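The relationship rows on this slide can be navigated as a small labeled graph around one concept. The sketch below assumes each row reads as (first concept, relationship label, second concept, name of second concept, asserting source), with the label describing the second concept's relation to the first, which matches the slide (Escherichia is listed as a parent of Escherichia coli). The helper `related` is hypothetical.

```python
# Rows transcribed from the MRREL slide:
# (CUI1, relationship, CUI2, name of CUI2, asserting source)
MRREL_ROWS = [
    ("C0014834", "CHD", "C0376624", "Escherichia coli O157-H7", "RCD95"),
    ("C0014834", "CHD", "C0376625", "Escherichia coli O157", "MSH98"),
    ("C0014834", "PAR", "C0014346", "Enterobacteriaceae", "SNM2"),
    ("C0014834", "PAR", "C0014833", "Escherichia", "CSP94"),
    ("C0014834", "PAR", "C0014833", "Escherichia", "MSH98"),
    ("C0014834", "PAR", "C0014833", "Escherichia", "RCD95"),
    ("C0014834", "PAR", "C0014833", "Escherichia", "SNMI97"),
    ("C0014834", "RN", "C0002054", "Alkalescens-Dispar Group", "MTH"),
    ("C0014834", "RO", "C0148325", "Verotoxin 2", "MTH"),
    ("C0014834", "RO", "C0370139", "Verotoxin 1", "MTH"),
    ("C0014834", "RO", "C0370140", "Verotoxin 1 Antibody", "MTH"),
    ("C0014834", "RO", "C0370141", "Verotoxin 2 Antibody", "MTH"),
    ("C0014834", "SIB", "C0315244", "Escherichia hermannii", "RCD95"),
]


def related(rows, cui, rel):
    """Return {related CUI: (name, [sources asserting it])} for one relationship label."""
    result = {}
    for cui1, row_rel, cui2, name, sab in rows:
        if cui1 == cui and row_rel == rel:
            # Merge repeated assertions of the same link, keeping every source.
            result.setdefault(cui2, (name, []))[1].append(sab)
    return result


if __name__ == "__main__":
    for cui2, (name, sources) in related(MRREL_ROWS, "C0014834", "PAR").items():
        print(cui2, name, sources)
```

Keeping the list of asserting sources per link shows where vocabularies agree (four sources place the concept under Escherichia) and where a link comes from a single source.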
Lessons from UMLS
• Context is Key
• Face Validity a Problem in Many Systems
• “Special” Meanings
• Zipf’s Law
• Use Meaningless Identifiers
• Let Concepts Persist
Observations
• Multi-Purpose Tools Difficult to Use
• Semantic Locality Insufficient
• The Escher Phenomenon
• Other Terminological Paradoxes
Paradoxes in Terminology
• Content seemingly defies logic
• The way we use words often defies formal definitions
  – Wave-particle duality
  – The Escher phenomenon
  – Gödel’s Theorem
• Logicians do not understand content
  – Categorical Names of Artifacts
  – Natural Objects (open lattice description)
Choices in Terminology
• Want a Standard
• Understandable by Humans
• Reproducible for Machines
• Appropriate for Intended Use
• Update Model
• Low Cost
Standards in Models
• Doesn’t Vocabulary Alone Solve the Problem?
• What is an Information Model?
• How Does it Relate to Terminology?
A Situation
RECORD 1: Male, 56, IDDM x 15 years, Graves’ Disease, Renal Failure, Admitted with Septic Shock.
RECORD 2: Male, 56, IDDM x 15 years, Graves’ Disease, Fam Hx of Renal Failure, Admitted with Septic Shock.
Decision Support: If Renal Failure, do not give Aminoglycosides
Which Would Be an Appropriate Application of the Rule?
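The two records illustrate why a term alone is not enough: both contain the words "Renal Failure", but only Record 1 asserts it about the patient. The following is a minimal Python sketch of that contrast. The `Finding` class and its `subject` slot are hypothetical stand-ins for whatever an information model would supply, and are not taken from HL7 or any other standard.

```python
from dataclasses import dataclass

RECORD_1_TEXT = ("Male, 56, IDDM x 15 years, Graves' Disease, Renal Failure, "
                 "Admitted with Septic Shock.")
RECORD_2_TEXT = ("Male, 56, IDDM x 15 years, Graves' Disease, Fam Hx of Renal Failure, "
                 "Admitted with Septic Shock.")


def naive_rule(text):
    """Vocabulary-only rule: fire whenever the term appears anywhere in the record."""
    return "Renal Failure" in text


@dataclass(frozen=True)
class Finding:
    term: str
    subject: str = "patient"  # hypothetical context slot: "patient" or "family"


# Hand-coded structured versions of the same two records (an assumption for illustration).
RECORD_1 = [Finding("IDDM"), Finding("Graves' Disease"),
            Finding("Renal Failure"), Finding("Septic Shock")]
RECORD_2 = [Finding("IDDM"), Finding("Graves' Disease"),
            Finding("Renal Failure", subject="family"), Finding("Septic Shock")]


def withhold_aminoglycosides(findings):
    """Model-aware rule: fire only when the patient, not a relative, has renal failure."""
    return any(f.term == "Renal Failure" and f.subject == "patient" for f in findings)


if __name__ == "__main__":
    print(naive_rule(RECORD_1_TEXT), naive_rule(RECORD_2_TEXT))
    print(withhold_aminoglycosides(RECORD_1), withhold_aminoglycosides(RECORD_2))
```

The term-matching rule fires for both records, while the rule that consults the context slot fires only for Record 1. The terminology is identical in both cases; the difference comes from where the term sits in the record.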
What is an Information Model?
• Example: MeSH in MEDLINE
• Recognition of Context
• Formalization of Implicit Design
• Formal Declaration of Intent
How Does it Relate to Terminology?
Pre-Coordination versus Post-Coordination
• Use Requirements
• Standards Efforts
  – HL7 for Clinical Records
  – Common Model Efforts
• Gene Ontology
• International Drug Harmonisation
Recent US Developments
• Clinical Vocabularies
  – LOINC
  – SNOMED CT
  – RxNorm
• Consolidated Health Informatics
SNOMED CT
• ‘Most comprehensive international and multilingual clinical reference terminology available in the world’
• Latest release: Jan 31, 2004
  – Concepts: 357,135 (298,090 current)
  – Descriptions: 957,349 (736,946 current)
  – Relationships: 1,374,955 (1,315,910 involve only current concepts)
SNOMED CT Data Structure
• Concepts, Descriptions and Relationships
  – All Have Unique Identifiers
  – All Have Individual Attributes
  – Some Description Logic
• Mappings to ICD-9-CM
• Original Format of Metathesaurus Incapable of Representing These
Original Release Format (ORF) of Metathesaurus
• Metathesaurus-Concept-Centric View
• Concept-based Connections and Relationships
• Most information at Concept (CUI) level
New Rich Release Format (RRF)
• Source-Centric View (Source transparency)
• Attributes and relationships at the Atom level
  – Atoms denoted by AUI (Atom Unique Identifier)
  – An atom is a string in a source
• New Fields for source-specific identifiers (SCUI, SAUI, SRUI)
Different Theologies in Concept Organization
• What is a concept? ‘A unit of thought’
• In SNOMED CT:
  – Rigid hierarchies, mutually exclusive
  – Define concepts by other concepts (Description Logic)
• In UMLS:
  – The working principle is ‘useful level of distinction for biomedical professionals, e.g. in clinical discourse’
  – May trade slight ambiguity in meaning for improved localization
SNOMED CT Concept View
[Diagram]
Acetaminophen (product) -- Has-active-ingredient --> Acetaminophen (substance)
Acetaminophen 500 mg tablet (product) -- Is-a --> Acetaminophen (product)
Acetaminophen 80 mg suppository (product) -- Is-a --> Acetaminophen (product)
‘This patient is on acetaminophen.’
‘Tylenol contains acetaminophen.’
Useful clinical distinction?
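The product/substance split on this slide can be written as relationship triples and queried. The sketch below assumes one reading of the diagram: the two dose forms are Is-a children of Acetaminophen (product), which in turn has Acetaminophen (substance) as its active ingredient. It also assumes that a defining attribute is inherited down the Is-a hierarchy. The function names are hypothetical, and the concept names stand in for SNOMED CT identifiers, which the slide does not give.

```python
# Triples read off the slide's diagram: (source concept, relationship, target concept).
# The edge directions are an interpretation of the flattened figure.
TRIPLES = [
    ("Acetaminophen 500 mg tablet (product)", "is-a", "Acetaminophen (product)"),
    ("Acetaminophen 80 mg suppository (product)", "is-a", "Acetaminophen (product)"),
    ("Acetaminophen (product)", "has-active-ingredient", "Acetaminophen (substance)"),
]


def ancestors(concept, triples=TRIPLES):
    """Collect every concept reachable from `concept` by following is-a links."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for src, rel, dst in triples:
            if src == current and rel == "is-a" and dst not in found:
                found.add(dst)
                frontier.append(dst)
    return found


def active_ingredients(concept, triples=TRIPLES):
    """Ingredients stated on the concept itself or inherited from its is-a ancestors."""
    lineage = {concept} | ancestors(concept, triples)
    return {dst for src, rel, dst in triples
            if rel == "has-active-ingredient" and src in lineage}


if __name__ == "__main__":
    print(active_ingredients("Acetaminophen 500 mg tablet (product)"))
    print(active_ingredients("Acetaminophen (substance)"))
```

Under this reading the substance and the product are different nodes, so a query about what a tablet contains and a query about what a patient is taking resolve to different concepts. That is the distinction the slide questions.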
SNOMED CT Concept View
[Diagram]
Stab wound (disorder) -- Has-associated-morphology --> Stab wound (morphologic abnormality)
‘The patient was admitted for a stab wound sustained during a fight.’
‘On physical examination, there was a 3 cm stab wound in the right upper quadrant of the abdomen.’
Useful clinical distinction?
Mapping Issues
• Few Able to Exploit Locality
• Direct Navigation Desirable
  – From: Core Clinical Vocabularies
  – To: Other Important Terminologies
• Directionality Important
• Complex Logical Expressions
Can Interoperability Be Achieved?
• Desirable
  – What is Experience with Patients Like Mine?
  – Disease Surveillance
• UMLS a First Step
• Semantic Web
  – No Standard
  – No Update Model
• Ars Longa, Vita Brevis
Thank you!