MedScan, a natural language processing engine for MEDLINE abstracts.

Svetlana Novichkova, Sergei Egorov, Nikolai Daraselia

Bioinformatics

Ariadne Genomics, Inc., 9100 Great Seneca Hwy, Rockville, MD 20850, USA.

Published: September 2003

Motivation: The importance of extracting biomedical information from scientific publications is well recognized. A number of information extraction systems for the biomedical domain have been reported, but none of them have become widely used in practical applications. Most proposals to date make rather simplistic assumptions about the syntactic aspect of natural language. There is an urgent need for a system that has broad coverage and performs well in real-text applications.

Results: We present a general biomedical domain-oriented NLP engine called MedScan that efficiently processes sentences from MEDLINE abstracts and produces a set of regularized logical structures representing the meaning of each sentence. The engine uses a specially developed context-free grammar and lexicon. Preliminary evaluation of the system's performance, accuracy, and coverage gave encouraging results. Further approaches for increasing the coverage and reducing the parsing ambiguity of the engine, as well as its application to information extraction, are discussed.
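
As a rough illustration of the general idea (a lexicon plus a context-free grammar yielding a regularized predicate/argument structure), the following is a minimal sketch and not MedScan's actual grammar, lexicon, or output format. The lexicon entries, the two grammar rules, the CYK parsing strategy, and the names `LEXICON`, `GRAMMAR`, `parse`, and `to_logical_form` are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy lexicon (not MedScan's): lowercase word -> category.
LEXICON = {
    "p53": "NP",
    "mdm2": "NP",
    "binds": "V",
    "activates": "V",
}

# Hypothetical toy grammar in Chomsky normal form: (left, right) -> parent.
GRAMMAR = {
    ("NP", "VP"): "S",
    ("V", "NP"): "VP",
}


def parse(tokens):
    """CYK parse: return a tree for S spanning all tokens, or None."""
    n = len(tokens)
    # chart[i][j] maps a category to one tree covering tokens[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        cat = LEXICON.get(tok.lower())
        if cat is not None:
            chart[i][i + 1][cat] = (cat, tok)
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                pairs = product(chart[i][k].items(), chart[k][j].items())
                for (lcat, ltree), (rcat, rtree) in pairs:
                    parent = GRAMMAR.get((lcat, rcat))
                    # Keep only the first tree found per category (no ambiguity handling).
                    if parent is not None and parent not in chart[i][j]:
                        chart[i][j][parent] = (parent, ltree, rtree)
    return chart[0][n].get("S")


def to_logical_form(tree):
    """Flatten an S -> NP (VP -> V NP) tree into a predicate/argument record."""
    _, subj, vp = tree
    _, verb, obj = vp
    return {"predicate": verb[1], "agent": subj[1], "patient": obj[1]}


tree = parse("p53 binds MDM2".split())
if tree is not None:
    print(to_logical_form(tree))
```

A real system would need a far larger grammar and lexicon and some way to rank or prune competing parses, which is the ambiguity problem the abstract mentions; this sketch keeps a single tree per category and ignores that issue.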

DOI: http://dx.doi.org/10.1093/bioinformatics/btg207
