An extensive review of tools for manual annotation of documents.

Brief Bioinform

German Centre for the Protection of Laboratory Animals (Bf3R), German Federal Institute for Risk Assessment (BfR), Berlin, Germany.

Published: January 2021

Motivation: Annotation tools are applied to build training and test corpora, which are essential for the development and evaluation of new natural language processing algorithms. They are also used to extract new information for a particular use case. However, owing to the high number of existing annotation tools, finding the one that best fits particular needs is a demanding task that requires searching the scientific literature and then installing and trying various tools.

Methods: We searched for annotation tools and selected a subset of them according to five requirements with which they should comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. For each criterion we defined three levels of match, and we computed a score for the final evaluation of each tool.

Results: We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. The number of criteria with which a tool fully complied ranged from 9 to 20 of the 26, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. WebAnno obtained the highest score, 0.81 of a maximum value of 1.0.
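
The abstract mentions three levels of match per criterion and a final score with a maximum of 1.0, but it does not state the formula. The following is a minimal sketch of one plausible scheme, not the authors' method: it assumes that full, partial and no match are worth 1.0, 0.5 and 0.0 respectively, and that the score is the (optionally weighted) mean over all criteria. The names `Match` and `tool_score`, the level values and the weights are all hypothetical.

```python
from enum import Enum


class Match(Enum):
    """Three levels of match for one criterion (values are assumed, not from the paper)."""
    FULL = 1.0
    PARTIAL = 0.5
    NONE = 0.0


def tool_score(matches, weights=None):
    """Return a score in [0, 1]: the weighted mean of the per-criterion match values.

    matches -- one Match per criterion
    weights -- optional non-negative weight per criterion (defaults to equal weights)
    """
    if not matches:
        raise ValueError("at least one criterion is required")
    if weights is None:
        weights = [1.0] * len(matches)
    if len(weights) != len(matches):
        raise ValueError("weights and matches must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m.value for w, m in zip(weights, matches)) / total_weight


# Hypothetical tool: 13 criteria fully met, 6 partially met, 7 not met (26 in total).
example = [Match.FULL] * 13 + [Match.PARTIAL] * 6 + [Match.NONE] * 7
example_score = tool_score(example)  # (13 + 6 * 0.5) / 26 = 16 / 26
```

Under these assumptions every score lies between 0 and 1, so tools evaluated against the same 26 criteria can be ranked directly. Different level values or weights would change the ranking, which is why they are passed in rather than fixed.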

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820865
DOI: http://dx.doi.org/10.1093/bib/bbz130
