Background and Objective: For computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in the text to sources that contain further information about those concepts. Popular concept normalization tools in the biomedical field are dictionary-based. In this study, we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization.

Methods: We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching.
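
To make the three evaluation settings concrete, the following is a minimal sketch of how F-scores for exact boundary, inexact boundary, and concept identifier matching could be computed. It is not the authors' evaluation code. The `Annotation` type, the helper names, the reading of "inexact" as any character overlap, the greedy one-to-one pairing, and the requirement that a concept identifier match also satisfies a boundary criterion are all assumptions made for illustration.

```python
from typing import Callable, List, NamedTuple


class Annotation(NamedTuple):
    """Hypothetical annotation: character span plus a concept identifier."""
    start: int        # inclusive character offset
    end: int          # exclusive character offset
    concept_id: str


Matcher = Callable[[Annotation, Annotation], bool]


def exact_match(system: Annotation, gold: Annotation) -> bool:
    # Exact boundary matching: both offsets must agree.
    return system.start == gold.start and system.end == gold.end


def inexact_match(system: Annotation, gold: Annotation) -> bool:
    # Inexact boundary matching, assumed here to mean any overlap.
    return system.start < gold.end and gold.start < system.end


def with_concept(boundary: Matcher) -> Matcher:
    # Concept identifier matching layered on a boundary criterion (assumption).
    return lambda s, g: boundary(s, g) and s.concept_id == g.concept_id


def f_score(system: List[Annotation], gold: List[Annotation],
            match: Matcher) -> float:
    """F-score with greedy one-to-one pairing of system and gold annotations."""
    unmatched = list(gold)
    true_positives = 0
    for s in system:
        for g in unmatched:
            if match(s, g):
                unmatched.remove(g)   # each gold annotation is used once
                true_positives += 1
                break
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    gold = [Annotation(0, 10, "C1"), Annotation(20, 30, "C2")]
    system = [Annotation(0, 10, "C1"), Annotation(22, 30, "C2"),
              Annotation(40, 45, "C3")]
    print("exact boundary:  ", f_score(system, gold, exact_match))
    print("inexact boundary:", f_score(system, gold, inexact_match))
    print("exact + concept: ", f_score(system, gold, with_concept(exact_match)))
```

In the toy data, the second system annotation overlaps a gold span without matching its boundaries, so it counts as a hit only under the inexact criterion.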

Results: Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for exact boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, the F-scores increased further to 85.5% and 85.4% for boundary matching, and to 73.6% and 73.3% for concept identifier matching.

Conclusions: We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3756254
DOI: http://dx.doi.org/10.1136/amiajnl-2012-001173
