Summary: Named entity recognition (NER) is an important step in biomedical information extraction pipelines. Tools for NER should be easy to use, cover multiple entity types, be highly accurate and be robust to variations in text genre and style. We present HunFlair, a NER tagger that fulfills these requirements. HunFlair is integrated into the widely used NLP framework Flair, recognizes five biomedical entity types, matches or exceeds state-of-the-art performance on a wide set of evaluation corpora, and is trained in a cross-corpus setting to avoid corpus-specific bias. Technically, it uses a character-level language model pretrained on roughly 24 million biomedical abstracts and three million full texts. It outperforms other off-the-shelf biomedical NER tools, with an average gain of 7.26 pp over the next best tool in a cross-corpus setting, and achieves results on par with state-of-the-art research prototypes in in-corpus experiments. HunFlair can be installed with a single command and applied with only four lines of code. Furthermore, it is accompanied by harmonized versions of 23 biomedical NER corpora.
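To make the "average gain in pp" figure concrete, the following is a minimal sketch of how such a comparison could be computed. It assumes exact-match, span-level F1 (the abstract does not state the matching criterion), and all names (`span_f1`, `average_gain_pp`), corpus keys and spans are hypothetical toy values, not results or code from the paper.

```python
from typing import Dict, Set, Tuple

# A mention is identified by its character offsets and entity type.
Span = Tuple[int, int, str]  # (start_offset, end_offset, entity_type)


def span_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match F1: a prediction counts only if offsets and type both match."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def average_gain_pp(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Mean F1 difference (tool A minus tool B) over shared corpora, in percentage points."""
    shared = sorted(scores_a.keys() & scores_b.keys())
    if not shared:
        raise ValueError("the two tools share no evaluation corpus")
    return 100 * sum(scores_a[c] - scores_b[c] for c in shared) / len(shared)


# Toy annotations (hypothetical): three gold mentions, two competing predictions.
gold = {(0, 4, "Gene"), (10, 18, "Disease"), (25, 30, "Chemical")}
pred_a = {(0, 4, "Gene"), (10, 18, "Disease"), (40, 45, "Species")}
pred_b = {(0, 4, "Gene")}

f1_a = span_f1(gold, pred_a)  # 2 of 3 predictions correct, 2 of 3 gold found
f1_b = span_f1(gold, pred_b)  # 1 of 1 predictions correct, 1 of 3 gold found
```

In a cross-corpus setting, the per-corpus scores passed to `average_gain_pp` would come from corpora that were not used to train the tools being compared.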

Availability and Implementation: HunFlair is freely available through the Flair NLP framework (https://github.com/flairNLP/flair) under an MIT license and is compatible with all major operating systems.
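As a rough usage sketch, assuming Flair has been installed with `pip install flair`: the loader class has changed between Flair releases, so the helper below tries the `MultiTagger` loader from the Flair versions contemporary with the paper and falls back to `Classifier`, which is assumed to serve the same `"hunflair"` model in newer releases. The function names are hypothetical, and the first call downloads the pretrained weights.

```python
def load_hunflair_tagger():
    """Load the pretrained HunFlair tagger (requires Flair; downloads weights on first use)."""
    try:
        # Loader used in Flair releases from around the time of the paper.
        from flair.models import MultiTagger
        return MultiTagger.load("hunflair")
    except ImportError:
        # Assumption: newer Flair releases expose the model through Classifier.
        from flair.nn import Classifier
        return Classifier.load("hunflair")


def tag_biomedical_entities(text):
    """Tag one piece of text and return the annotated Flair Sentence."""
    from flair.data import Sentence  # imported lazily so the module loads without Flair

    sentence = Sentence(text)        # Flair's default tokenizer is used here
    tagger = load_hunflair_tagger()
    tagger.predict(sentence)         # annotations are attached to the sentence in place
    return sentence


# Example call (needs Flair installed and network access for the model download):
# tagged = tag_biomedical_entities("Fragile X Syndrome is caused by mutations in FMR1.")
# print(tagged.to_tagged_string())
```

For processing many texts, the tagger would normally be loaded once and reused rather than reloaded on every call as in this sketch.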

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428609
DOI: http://dx.doi.org/10.1093/bioinformatics/btab042


