Computing semantic similarity between biomedical concepts using new information content approach.

J Biomed Inform

Multimedia InfoRmation system and Advanced Computing Laboratory, Sfax University, 3021, Tunisia.

Published: February 2016

The exploitation of heterogeneous clinical sources and healthcare records is fundamental in clinical and translational research. The determination of semantic similarity between word pairs is an important component of text understanding that enables the processing and structuring of textual resources. Some semantic similarity measures have been adapted to the biomedical field by incorporating domain information extracted from clinical data or from medical ontologies such as MeSH. This study focuses on Information Content (IC) based measures that exploit the topological parameters of the taxonomy to express the semantics of a concept. We propose a new intrinsic IC computing method, based on the taxonomical parameters of the ancestors' subgraph, that assigns an IC value to each biomedical concept in the "is a" hierarchy. Moreover, we present a study of the topological parameters across the MeSH taxonomy. It examines the semantic interpretation of, and the different ways of expressing, the depth and descendants' subgraph parameters. Using MeSH as the input ontology, we evaluate the accuracy of our proposal and compare it against other IC-based measures on several widely used benchmarks of biomedical terms. The correlation between the similarity scores obtained with the proposed approach and the ratings of human experts shows that our proposal outperforms previous measures.
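
To make the idea of an intrinsic IC measure concrete, the sketch below computes IC from the taxonomy alone and plugs it into a similarity score. It is not the method proposed in this paper, which relies on the ancestors' subgraph and is not specified in the abstract. Instead it assumes the well-known descendant-based formulation, IC(c) = 1 - log(desc(c) + 1) / log(N), combined with Lin's similarity. The concept names and the `PARENTS` mapping are hypothetical toy data, not real MeSH entries.

```python
import math

# Hypothetical toy "is a" hierarchy (child -> list of parents); not real MeSH data.
PARENTS = {
    "disease": [],
    "infection": ["disease"],
    "neoplasm": ["disease"],
    "bacterial_infection": ["infection"],
    "viral_infection": ["infection"],
    "carcinoma": ["neoplasm"],
}


def ancestors(concept):
    """Return the concept together with all of its ancestors."""
    seen = {concept}
    stack = [concept]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendant_count(concept):
    """Count the concepts that have `concept` as a proper ancestor."""
    return sum(1 for c in PARENTS if c != concept and concept in ancestors(c))


def intrinsic_ic(concept):
    """Descendant-based intrinsic IC (assumed formulation, not the paper's)."""
    n = len(PARENTS)
    return 1.0 - math.log(descendant_count(concept) + 1) / math.log(n)


def lin_similarity(a, b):
    """Lin similarity using the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    mica_ic = max(intrinsic_ic(c) for c in common)
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 2.0 * mica_ic / denom if denom > 0 else 0.0


if __name__ == "__main__":
    print(lin_similarity("bacterial_infection", "viral_infection"))
    print(lin_similarity("bacterial_infection", "carcinoma"))
```

In this toy hierarchy the two infection subtypes share a specific ancestor and so score higher than a pair whose only common ancestor is the root, which carries no information. Evaluating such a measure as the abstract describes would mean correlating these scores with human expert ratings on a benchmark of term pairs.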

Source
http://dx.doi.org/10.1016/j.jbi.2015.12.007
