Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program

Abstract

The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and non-hierarchical relationships among the concepts. This knowledge has proved useful for many applications, including decision support systems, management of patient records, information retrieval (IR), and data mining. Gaining effective access to the knowledge is critical to the success of these applications. This paper describes MetaMap, a program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge-intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. In addition to its use in both IR and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System, which is being applied to both semi-automatic and fully automatic indexing of the biomedical literature at the library.
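
To make the mapping task concrete, the following is a minimal sketch of what "discovering concepts referred to in text" can look like in its simplest form: a greedy longest-match lookup of word sequences against a small thesaurus. This is not MetaMap's algorithm, which the abstract describes only as knowledge intensive and based on symbolic, NLP and computational linguistic techniques. The thesaurus contents, the concept identifiers (`X0001`, `X0002`) and the function names are all hypothetical placeholders introduced for illustration.

```python
import re

# Hypothetical mini-thesaurus: normalized phrase -> (concept ID, semantic type).
# The IDs are illustrative placeholders, not real Metathesaurus identifiers.
TOY_THESAURUS = {
    "heart attack": ("X0001", "Disease or Syndrome"),
    "myocardial infarction": ("X0001", "Disease or Syndrome"),
    "aspirin": ("X0002", "Pharmacologic Substance"),
}


def normalize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_text(text, thesaurus=TOY_THESAURUS, max_len=3):
    """Return (phrase, concept ID, semantic type) for each thesaurus match.

    Scans left to right and, at each position, tries the longest
    candidate phrase first (up to max_len tokens).
    """
    tokens = normalize(text)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in thesaurus:
                concept_id, semantic_type = thesaurus[phrase]
                matches.append((phrase, concept_id, semantic_type))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no phrase starts here; move to the next token
    return matches


if __name__ == "__main__":
    for match in map_text("Aspirin after a heart attack"):
        print(match)
```

In this toy, synonyms such as "heart attack" and "myocardial infarction" resolve to the same placeholder concept, which is the property that makes concept-level mapping useful for retrieval and indexing. Exact string lookup of this kind does not handle inflection, word order or partial matches.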
