Full-text chemical identification with improved generalizability and tagging consistency.

Database (Oxford)

Department of Computer Science and Engineering, Korea University, Seoul, South Korea.

Published: September 2022

Chemical identification involves finding chemical entities in text (i.e. named entity recognition) and assigning unique identifiers to those entities (i.e. named entity normalization). Current models are developed and evaluated on article titles and abstracts, and their effectiveness on full text has not been thoroughly verified. In this paper, we identify two limitations of models in tagging full-text articles: (1) low generalizability to unseen mentions and (2) tagging inconsistency. To address these limitations, we use simple training and post-processing methods, such as transfer learning and mention-wise majority voting. We also present a hybrid model for the normalization task that exploits the high recall of a neural model while maintaining the high precision of a dictionary model. In the BioCreative VII NLM-Chem track challenge, our best model achieves F1 scores of 86.72 and 78.31 in named entity recognition and normalization, respectively, significantly outperforming the median (F1 scores of 83.73 and 77.49) and taking first place in named entity recognition. In a post-challenge evaluation, we re-implement our model and obtain an F1 score of 84.70 in the normalization task, outperforming the best score in the challenge by 3.34 points. Database URL: https://github.com/dmis-lab/bc7-chem-id.
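The abstract names two post-processing ideas without giving implementation details: mention-wise majority voting (for tagging consistency) and a hybrid of dictionary and neural normalization. The Python sketch below is one plausible reading, not the authors' code. It assumes voting is done per document over case-insensitive surface forms, and that the hybrid model tries an exact dictionary lookup first and falls back to a neural prediction. The function names, the `neural_lookup` callable, and the `CHEM:` identifiers are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# (surface text, predicted label); "O" marks "not an entity" (assumed convention)
Mention = Tuple[str, str]


def majority_vote(mentions: List[Mention]) -> List[Mention]:
    """Give every occurrence of a surface form its most frequent label.

    Assumption: votes are pooled per document and matched case-insensitively.
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for text, label in mentions:
        votes[text.lower()][label] += 1
    return [(text, votes[text.lower()].most_common(1)[0][0])
            for text, _ in mentions]


def hybrid_normalize(
    mention: str,
    dictionary: Dict[str, str],
    neural_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Prefer an exact dictionary match; otherwise use the neural model.

    Assumption: dictionary hits are trusted (precision), and the neural
    model only fills in mentions the dictionary misses (recall).
    """
    key = mention.lower()
    if key in dictionary:
        return dictionary[key]
    return neural_lookup(mention)


if __name__ == "__main__":
    # Hypothetical predictions for one full-text article.
    doc = [("aspirin", "Chemical"), ("aspirin", "O"), ("Aspirin", "Chemical")]
    print(majority_vote(doc))

    # Hypothetical dictionary and a stand-in for a neural normalizer.
    lexicon = {"aspirin": "CHEM:1"}
    print(hybrid_normalize("Aspirin", lexicon, lambda m: "CHEM:9"))
    print(hybrid_normalize("acetylsalicylic acid", lexicon, lambda m: "CHEM:9"))
```

In this sketch the inconsistent "O" tag on the second occurrence of "aspirin" is overruled by the two "Chemical" votes, and only the mention missing from the dictionary reaches the neural fallback.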

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9518746
DOI: http://dx.doi.org/10.1093/database/baac074
