Dictionary-based protein name recognition is often a first step in extracting information from biomedical documents because it can provide ID information for the recognized terms. However, dictionary-based approaches present two fundamental difficulties: (1) false recognition, mainly caused by short names; and (2) low recall due to spelling variations. In this paper, we tackle the former problem by using machine learning to filter out false positives, and we present two alternative methods for alleviating the latter problem of spelling variations. The first uses approximate string searching, and the second expands the dictionary with a probabilistic variant generator, which we propose in this paper. Experimental results using the GENIA corpus revealed that filtering with a naive Bayes classifier greatly improved precision with only a slight loss of recall, resulting in a 10.8% improvement in F-measure, and that dictionary expansion with the variant generator gave a further 1.6% improvement, achieving an F-measure of 66.6%.
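To make the idea of approximate string searching against a dictionary concrete, the following is a minimal sketch, not the method used in the paper. It assumes whitespace tokenization, a plain Levenshtein distance, and a tiny made-up dictionary; the function names `edit_distance` and `find_candidates`, the IDs `P001` and `P002`, and the thresholds are all hypothetical choices for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_candidates(text, dictionary, max_dist=1, max_tokens=3):
    """Return (span, dictionary name, ID, distance) for every token span
    within max_dist edits of a dictionary name (case-insensitive)."""
    tokens = text.split()
    hits = []
    for start in range(len(tokens)):
        last = min(start + max_tokens, len(tokens))
        for end in range(start + 1, last + 1):
            span = " ".join(tokens[start:end])
            for name, protein_id in dictionary.items():
                dist = edit_distance(span.lower(), name.lower())
                if dist <= max_dist:
                    hits.append((span, name, protein_id, dist))
    return hits


# Hypothetical dictionary entries and IDs, for illustration only.
dictionary = {"NF-kappa B": "P001", "interleukin 2": "P002"}
text = "activation of NF-kappaB and interleukin-2 expression"
for hit in find_candidates(text, dictionary):
    print(hit)
```

An exact lookup would miss both spelling variants in the example text, while a tolerance of one edit recovers them. Loosening the match in this way also admits more spurious hits, especially for short names, which is the false-positive problem the abstract addresses with a classifier-based filter; that filtering step is not shown here.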
Source (DOI listing): http://dx.doi.org/10.1016/j.jbi.2004.08.003
Behav Res Methods
December 2024
Department of Business Administration and Economics, FernUniversität in Hagen, Hagen, Germany.
This article develops a comprehensive database comprising 5956 German affective norms specifically tailored for the study of organizational atmospheres through computational verbal language analysis. This dictionary adopts both dimensional and categorical approaches. The theoretical foundation of this study is the circumplex model of affective atmospheres.
Bioinformatics
November 2024
Britton Chance Center for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China.
Motivation: Neuroscientists have long endeavored to map brain connectivity, yet the intricate nature of brain networks often leads them to concentrate on specific regions, hindering efforts to unveil a comprehensive connectivity map. Recent advancements in imaging and text mining techniques have enabled the accumulation of a vast body of literature containing valuable insights into brain connectivity, facilitating the extraction of whole-brain connectivity relations from this corpus. However, the diverse representations of brain region names and connectivity relations pose a challenge for conventional machine learning methods and dictionary-based approaches in identifying all instances accurately.
PLoS One
November 2024
Institute for Communication Psychology and Media Education, Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau, Landau, Germany.
To understand and measure political information consumption in the high-choice media environment, we need new methods to trace individual interactions with online content and novel techniques to analyse and detect politics-related information. In this paper, we report the results of a comparative analysis of the performance of automated content analysis techniques for detecting political content in the German language across different platforms. Using three validation datasets, we compare the performance of three groups of detection techniques relying on dictionaries, classic supervised machine learning, and deep learning.
Radiography (Lond)
November 2024
Centro Hospitalar Universitário de Santo António, Unidade Local de Saúde de Santo António, Orthopedic Department, Porto, Portugal; ICBAS, School of Medicine and Biomedical Sciences, University of Porto, Portugal.
J Health Commun
October 2024
School of Journalism, Fudan University, Shanghai, China.
Media coverage of depression on social media with specific framings could shape people's perception and attitude, which is significant in reducing the stigma and promoting support for depression sufferers. Adopting the lens of moral foundation theory (MFT), this study aims to explore the effect of inherent moral framings within depression coverage on social media on the stigma and approval attitudes toward depression in audiences' responses. A large language model and a dictionary-based approach were respectively adopted to score depression-related media coverages (n = 919) and corresponding comments (n = 92,505) collected from the Weibo platform against MFT's five dimensions and (de)stigma attitudes.