Recognition and normalization of multilingual symptom entities using in-domain-adapted BERT models and classification layers.

Database (Oxford)

Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Blvr. Louis Pasteur, 35, Puerto de la Torre, Málaga 29071, Spain.

Published: August 2024

Due to the scarcity of available annotations in the biomedical domain, clinical natural language processing poses a substantial challenge, especially when applied to low-resource languages. This paper presents our contributions to the detection and normalization of clinical entities corresponding to symptoms, signs, and findings in multilingual clinical texts. To this end, we address the three subtasks proposed in the SympTEMIST shared task of the BioCreative VIII conference. For Subtask 1 (named entity recognition in a Spanish corpus), we followed an approach focused on BERT-based model assemblies pretrained on a proprietary oncology corpus. Subtasks 2 and 3 of SympTEMIST address named entity linking (NEL) in Spanish and multilingual corpora, respectively. Our approach to these subtasks followed a classification strategy that starts from a bi-encoder trained by contrastive learning, for which several SapBERT-like models are explored. To apply this NEL approach to different languages, we trained these models by leveraging the knowledge base of domain-specific medical concepts in Spanish supplied by the organizers, which we translated into the other languages of interest using machine translation tools. The results obtained in the three subtasks establish a new state of the art. For Subtask 1, we obtain a precision of 0.804, a recall of 0.699, and an F1-score of 0.748. For Subtask 2, we obtain performance gains of up to 5.5% in top-1 accuracy when the trained bi-encoder is followed by a WNT-softmax classification layer that is initialized with the mean of the embeddings of a subset of SNOMED-CT terms. For Subtask 3, the differences are even more pronounced, and our multilingual bi-encoder outperforms the other models analyzed in all languages except Swedish when combined with a WNT-softmax classification layer. The improvements in top-1 accuracy over the best bi-encoder model alone are 13% for Portuguese and 13.26% for Swedish. Database URL: https://doi.org/10.1093/database/baae087.
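
The abstract mentions a classification layer initialized with the mean of the embeddings of a subset of terms. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the abstract does not define the WNT-softmax layer, so the cosine-similarity softmax with a temperature used here is an assumption, and the function names, the temperature value, and the toy 2-dimensional embeddings are all hypothetical. In practice the embeddings would come from the trained bi-encoder.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def init_class_weights(term_embeddings, term_to_concept, n_concepts):
    """Build one weight row per concept as the mean of its term embeddings."""
    dim = term_embeddings.shape[1]
    weights = np.zeros((n_concepts, dim))
    counts = np.zeros(n_concepts)
    for emb, concept in zip(term_embeddings, term_to_concept):
        weights[concept] += emb
        counts[concept] += 1
    # Concepts without any term keep an all-zero row.
    return weights / np.maximum(counts, 1)[:, None]


def predict_proba(mention_embeddings, weights, temperature=0.05):
    """Softmax over temperature-scaled cosine similarities (assumed form)."""
    logits = l2_normalize(mention_embeddings) @ l2_normalize(weights).T
    logits = logits / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Toy data: three term embeddings, the first two belonging to concept 0.
term_embeddings = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])
term_to_concept = [0, 0, 1]
weights = init_class_weights(term_embeddings, term_to_concept, n_concepts=2)

# Two mention embeddings to link; top-1 is the most probable concept.
mentions = np.array([[0.9, 0.1], [0.1, 0.9]])
probs = predict_proba(mentions, weights)
top1 = probs.argmax(axis=1)
print(top1)
```

Starting the layer from mean term embeddings means that, before any fine-tuning, it already behaves like a nearest-concept lookup in the bi-encoder's embedding space; training can then adjust the weights from that starting point.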

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11352596
DOI: http://dx.doi.org/10.1093/database/baae087

Similar Publications

Domain-general cognitive systems are essential for adaptive human behaviour, supporting various cognitive tasks through flexible neural mechanisms. While fMRI studies link frontoparietal network activation to increasing demands across various tasks, the electrophysiological mechanisms underlying this domain-general response to demand remain unclear. Here, we used MEG/EEG, and separated the aperiodic and oscillatory components of the signals to examine their roles in domain-general cognition across three cognitive tasks using multivariate analysis.

Objective: Blood DNA methylation (DNAm) alterations have been widely reported in the onset and progression of mild cognitive impairment (MCI) and Alzheimer's disease (AD); however, DNAm is underutilized as a diagnostic biomarker for these diseases. We aimed to evaluate the diagnostic performance of DNAm for MCI and AD, both individually and in combination with well-established AD biosignatures.

Methods: A total of 1,891 blood samples from Alzheimer's Disease Neuroimaging Initiative (ADNI) studies were used to identify potential candidate DNAm biomarkers.

MVGNet: Prediction of PI3K Inhibitors Using Multitask Learning and Multiview Frameworks.

ACS Omega

November 2024

Zhejiang Province Key Laboratory of Smart Management & Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou 313000, Zhejiang Province, China.

Article Synopsis
  • PI3K is an important intracellular enzyme made up of regulatory (p85) and catalytic (p110) subunits, existing in four different isoforms important for cancer treatment.
  • The study introduces MVGNet, a deep learning framework that improves the prediction of how well molecules can inhibit these PI3K isoforms by using multitask learning techniques.
  • MVGNet outperforms traditional machine learning and deep learning models, achieving impressive accuracy metrics (AUC-ROC and AUC-PR), and helps further understand the relationship between the structure of PI3K inhibitors and their activity.

Three-dimensional (3D) stimuli are always better than two-dimensional (2D) multi-tasking? A high cognitive load in 3D-MATB-II.

Behav Brain Res

February 2025

School of Psychology, Shaanxi Normal University, Xi'an, China; Key Laboratory of Behavior and Cognitive Neuroscience of Shaanxi, Shaanxi Normal University, Xi'an, China.

Objective: This study investigates whether multi-tasking in three-dimensional (3D) environments aids or impedes cognition compared with two-dimensional (2D) environments, as reflected by cognitive load. Specifically, we aim to examine the mechanism of multi-tasking under 3D (virtual reality [VR]) and 2D (PC monitor) conditions using the widely used Multi-Attribute Task Battery (MATB) II paradigm.

Methodology: The MATB-II sub-tasks, namely "Tracking" and "System Monitoring," were conducted with varying task demands in both 3D conditions (Tracking Far - System Monitoring Near [TF-SN], Tracking Near - System Monitoring Far [TN-SF]) and a 2D condition with no depth perception (No Depth [ND]).

Introduction: This study investigated the differences between males and females in autonomic function and cognitive performance during cold-air exposure and cold-water partial immersion compared with a room-temperature air environment. Although several studies have investigated the effects of cold-air or cold-water exposure on autonomic function and cognitive performance, biological sex differences are often under-researched.

Methods: Twenty-two males and nineteen females participated in the current study.
