This paper proposes to model the extraction of acronyms and their meanings from unstructured text as a stochastic process using Hidden Markov Models (HMMs). The underlying, or hidden, chain is derived from the acronym: the states in the chain are the acronym's characters. A transition between two states happens when the origin state emits a signal. The signals recognizable by the HMM are tokens extracted from text, and observations are sequences of such tokens. Given a set of observations, the acronym definition is the observation with the highest probability of being generated by the HMM. Modelling the extraction probabilistically allows us to deal with two difficult aspects of this process: ambiguity and noise. Ambiguity arises when there is no unique alignment between a character in the acronym and a token in the expansion; noise is the absence of any such alignment. Our experiments show that this approach achieves high precision (93.50%) and recall (85.50%) in an environment where acronym coinage is ambiguous and noisy, such as the biomedical domain. Compared with other approaches on the same corpus, ours reached the highest F1 score (89.40%).
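
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the left-to-right chain, the prefix-match emission rule, and all probability constants (P_MATCH, P_NOISE, P_ADVANCE, P_STAY) are assumptions chosen for illustration, as are the function names. Each acronym character is a hidden state, each token is an emitted signal, and a candidate expansion is scored by its most probable state path (Viterbi).

```python
from typing import List, Sequence

# Hypothetical parameters, chosen only for illustration (not from the paper).
P_MATCH = 0.9    # emission probability when the token starts with the state's character
P_NOISE = 0.1    # emission probability for a token that does not align (noise)
P_ADVANCE = 0.8  # probability of moving to the next acronym character
P_STAY = 0.2     # probability of staying, so one character may cover several tokens


def emission(char: str, token: str) -> float:
    """Probability that the state for `char` emits `token` (assumed prefix rule)."""
    return P_MATCH if token.lower().startswith(char.lower()) else P_NOISE


def viterbi_score(acronym: str, tokens: Sequence[str]) -> float:
    """Probability of the best state path that starts at the first acronym
    character and ends at the last one while emitting all tokens in order."""
    n = len(acronym)
    if n == 0 or not tokens:
        return 0.0
    # best[i]: probability of the best path ending in state i after the current token
    best = [0.0] * n
    best[0] = emission(acronym[0], tokens[0])
    for token in tokens[1:]:
        new = [0.0] * n
        for i in range(n):
            stay = best[i] * P_STAY
            advance = best[i - 1] * P_ADVANCE if i > 0 else 0.0
            new[i] = max(stay, advance) * emission(acronym[i], token)
        best = new
    return best[n - 1]


def best_expansion(acronym: str, candidates: Sequence[List[str]]) -> List[str]:
    """Return the candidate observation with the highest Viterbi score."""
    return max(candidates, key=lambda toks: viterbi_score(acronym, toks))


candidates = [
    ["hidden", "markov", "models"],
    ["models", "using", "hidden"],
    ["using", "hidden", "markov", "models"],
]
print(best_expansion("HMM", candidates))
```

In this sketch, the self-transition lets a state absorb extra tokens, which is one simple way to tolerate noise, while the low but non-zero P_NOISE keeps imperfect alignments from being ruled out entirely. Candidates with fewer tokens than the acronym has characters score zero because the chain cannot reach its final state.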

Source
http://dx.doi.org/10.1109/TBME.2010.2051033

Publication Analysis

Top Keywords

hidden markov: 8
markov models: 8
tokens extracted: 8
extracted text: 8
acronym: 5
language independent: 4
independent acronym: 4
acronym extraction: 4
extraction biomedical: 4
biomedical texts: 4

Similar Publications

Paratuberculosis (Johne's disease), caused by Mycobacterium avium subsp. paratuberculosis (MAP), is a common, economically-important and potentially zoonotic contagious disease of cattle, with worldwide distribution. Disease management relies on identification of animals which are at high-risk of being infected or infectious.

Basic Science and Pathogenesis.

Alzheimers Dement

December 2024

Trinity Biomedical Sciences Institute, Trinity College Dublin, University of Dublin, Dublin, Dublin 2, Ireland.

Background: Amyotrophic lateral sclerosis (ALS) shares pathological and genetic underpinnings with frontotemporal dementia (FTD). ALS manifests with diverse symptoms, including progressive neuro-motor degeneration and muscle weakness, as well as cognitive-behavioural changes in up to half of cases. Resting-state EEG measures, particularly spectral power and functional connectivity, have been instrumental in discerning abnormal motor and cognitive network function in ALS [1]-[3].

The Biomedical Applications of Artificial Intelligence: An Overview of Decades of Research.

J Drug Target

January 2025

Sunirmal Bhattacharjee, Bharat Pharmaceutical Technology, Amtali, Agartala, Tripura, India.

Artificial intelligence (AI), a significant area of computer science, is successfully applied to the analysis of intricate biological data and the extraction of substantial associations from datasets for a variety of biomedical uses. AI has attracted significant interest in biomedical research due to its features: (i) better patient care through early diagnosis and detection; (ii) enhanced workflow; (iii) lowering medical errors; (v) lowering medical costs; (vi) reducing morbidity and mortality; (vii) enhancing performance; (viii) enhancing precision; and (ix) time efficiency. Quantitative metrics are crucial for evaluating AI implementations: they provide insights, enable informed decisions, and measure the impact of AI-driven initiatives, thereby enhancing transparency and accountability.

Particle Markov Chain Monte Carlo Approach to Inference in Transient Surface Kinetics.

J Chem Theory Comput

January 2025

Department of Electrical & Computer Engineering, Stony Brook University, Stony Brook, New York 11794, United States.

In this work, we develop a novel Bayesian approach to study the adsorption and desorption of CO onto a Pd(111) surface, a process of great importance in natural sciences. The motivation for this work comes from the recent availability of time-resolved infrared spectroscopy data and the need for model interpretability and uncertainty quantification in chemical processes. The objective is to learn the relevant parameters that characterize the process: coverage with time, rate constants, activation energies, and pre-exponential factors.

Article Synopsis
  • Parkinson's disease (PD) is a movement disorder linked to the degeneration of dopamine-producing neurons, and treatments like Levodopa (L-dopa) and Subthalamic Deep Brain Stimulation (STN-DBS) have distinct effects on brain activity that need further study.
  • In a study involving 21 PD patients on L-dopa and 11 patients with STN-DBS, researchers used Magnetoencephalogram (MEG) data to analyze how these treatments impacted brain state dynamics through a statistical method called the Time-delay embedded Hidden Markov Model (TDE-HMM).
  • Results showed that L-dopa enhanced motor state and beta wave activity in the brain, correlating