Rule-based human gene normalization in biomedical text with confidence estimation.

Comput Syst Bioinformatics Conf

Center for Information Technology, National Institutes of Health, Bethesda, MD 20892-5624, USA.

Published: December 2007

The ability to identify gene mentions in text and normalize them to the proper unique identifiers is crucial for downstream text mining applications in bioinformatics. We have developed a rule-based algorithm that divides the normalization task into two steps. The first step combines pattern matching for gene symbols with an approximate term-searching technique for gene names. In the second step, the algorithm measures several features based on morphological, statistical, and contextual information to estimate the level of confidence that the correct identifier has been selected for a potential mention. Three of the features we quantified are novel: uniqueness, inverse distance, and coverage. The algorithm was evaluated against the BioCreAtIvE datasets. The feature weights were tuned by the Nelder-Mead simplex method. Using the set of weights optimized on the training data, the algorithm achieved an F-score of 0.7622 and an AUC (area under the recall-precision curve) of 0.7461 on the test data.
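The confidence-estimation step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the linear weighted sum, the feature values, the weights, the threshold, and the `GENE:*` identifiers below are all assumptions made for illustration, not the authors' implementation. The sketch scores each candidate mention, keeps the identifiers whose confidence reaches a threshold, and computes the F-score against a gold set.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    mention: str    # text span found in the document
    gene_id: str    # hypothetical identifier, not a real database ID
    features: dict  # feature name -> value, assumed to lie in [0, 1]


def confidence(candidate, weights):
    """Weighted sum of feature values (an assumed linear form)."""
    return sum(w * candidate.features.get(name, 0.0)
               for name, w in weights.items())


def select(candidates, weights, threshold):
    """Return identifiers whose confidence reaches the threshold."""
    return {c.gene_id for c in candidates
            if confidence(c, weights) >= threshold}


def f_score(predicted, gold):
    """Harmonic mean of precision and recall over identifier sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative weights for the three features named in the abstract.
weights = {"uniqueness": 0.5, "inverse_distance": 0.3, "coverage": 0.2}

# Made-up candidates with made-up feature values.
candidates = [
    Candidate("BRCA1", "GENE:A",
              {"uniqueness": 1.0, "inverse_distance": 0.8, "coverage": 1.0}),
    Candidate("p53", "GENE:B",
              {"uniqueness": 0.6, "inverse_distance": 0.5, "coverage": 0.9}),
    Candidate("CAT", "GENE:C",
              {"uniqueness": 0.1, "inverse_distance": 0.2, "coverage": 0.3}),
]

gold = {"GENE:A", "GENE:B", "GENE:D"}
predicted = select(candidates, weights, threshold=0.5)
score = f_score(predicted, gold)
print(sorted(predicted), round(score, 4))
```

In a setup like this, the weights could be tuned by minimizing `1 - f_score` over the training data with a derivative-free optimizer such as the Nelder-Mead simplex method, which suits an objective that changes in steps as candidates cross the threshold.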


