Data preparation and interannotator agreement: BioCreAtIvE task 1B.

BMC Bioinformatics

The MITRE Corporation, 202 Burlington Road, Bedford, MA 01730, USA.

Published: February 2006

Background: We prepared and evaluated training and test materials for an assessment of text mining methods in molecular biology. The goal of the assessment was to evaluate the ability of automated systems to generate a list of unique gene identifiers from PubMed abstracts for the three model organisms Fly, Mouse, and Yeast. This paper describes the preparation and evaluation of answer keys for training and testing. These consisted of lists of normalized gene names found in the abstracts, generated by adapting the gene list for the full journal articles found in the model organism databases. For the training dataset, the gene list was pruned automatically to remove gene names not found in the abstract; for the testing dataset, it was further refined by manual annotation by annotators provided with guidelines. A critical step in interpreting the results of an assessment is to evaluate the quality of the data preparation. We did this by careful assessment of interannotator agreement and the use of answer pooling of participant results to improve the quality of the final testing dataset.
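The automatic pruning step for the training data can be illustrated with a minimal sketch. The abstract does not say how names were matched, so the case-insensitive whole-token match below is an assumption, and the function name `prune_gene_list`, the synonym table, and the `GENE:` identifiers are hypothetical placeholders rather than real database entries.

```python
import re
from typing import Dict, List


def prune_gene_list(abstract: str, gene_synonyms: Dict[str, List[str]]) -> List[str]:
    """Keep only the gene identifiers with at least one name found in the abstract.

    gene_synonyms maps a gene identifier (from the full-article gene list)
    to the names and synonyms recorded for that gene.
    Assumption: a name counts as "found" if it appears as a whole token,
    ignoring case. The original task may have used a different matcher.
    """
    kept = []
    for gene_id, names in gene_synonyms.items():
        for name in names:
            # Lookarounds stop "wg" from matching inside a longer word.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, abstract, flags=re.IGNORECASE):
                kept.append(gene_id)
                break  # one matching synonym is enough to keep the gene
    return kept


# Hypothetical example: the full-article list names three genes,
# but only two of them are mentioned in the abstract text.
abstract_text = "We show that wingless interacts with Notch in the wing disc."
full_article_genes = {
    "GENE:0001": ["wg", "wingless"],
    "GENE:0002": ["Notch"],
    "GENE:0003": ["dpp", "decapentaplegic"],
}
pruned = prune_gene_list(abstract_text, full_article_genes)
```

A string matcher of this kind will miss genes that are mentioned only by an unlisted variant and may keep genes whose symbols coincide with ordinary words, which is one reason a manually refined key is useful for the test set.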

Results: Interannotator analysis on a small dataset showed that our gene lists for Fly and Yeast were good (87% and 91% three-way agreement) but the Mouse gene list had many conflicts (mostly omissions), which resulted in errors (69% interannotator agreement). By comparing and pooling answers from the participant systems, we were able to add a further check on the test data; this allowed us to find additional errors, especially in Mouse. This led to a 1% change in the Yeast and Fly "gold standard" answer keys, but to an 8% change in the Mouse answer key.
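One plausible way to compute a three-way agreement figure is sketched below. The abstract does not define the measure, so the definition used here (genes listed by all three annotators divided by genes listed by at least one annotator, summed over abstracts) is an assumption, as are the function name and the toy identifiers.

```python
from typing import Dict, List, Set


def three_way_agreement(annotations: List[Dict[str, Set[str]]]) -> float:
    """Fraction of annotated genes on which all annotators agree.

    annotations holds one dict per annotator, mapping an abstract id
    to the set of gene identifiers that annotator listed for it.
    Assumed definition: |intersection| / |union|, pooled over abstracts.
    """
    agreed = 0
    total = 0
    doc_ids = set().union(*(a.keys() for a in annotations))
    for doc in doc_ids:
        gene_sets = [a.get(doc, set()) for a in annotations]
        total += len(set().union(*gene_sets))          # listed by anyone
        agreed += len(set.intersection(*gene_sets))    # listed by everyone
    return agreed / total if total else 1.0


# Hypothetical toy data for two abstracts and three annotators.
annotator_1 = {"abs1": {"g1", "g2"}, "abs2": {"g3"}}
annotator_2 = {"abs1": {"g1", "g2"}, "abs2": {"g3", "g4"}}
annotator_3 = {"abs1": {"g1"}, "abs2": {"g3"}}
score = three_way_agreement([annotator_1, annotator_2, annotator_3])
```

Under this definition an omission by a single annotator lowers the score as much as an outright disagreement, which fits the observation that the Mouse conflicts were mostly omissions.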

Conclusion: We found that clear annotation guidelines are important, along with careful interannotator experiments, to validate the generated gene lists. Also, abstracts alone are a poor resource for identifying the genes in a paper, containing only a fraction of the genes mentioned in the full text (25% for Fly, 36% for Mouse). We found that there are intrinsic differences between the model organism databases related to the number of synonymous terms and also to curation criteria. Finally, we found that answer pooling was much faster and allowed us to identify more conflicting genes than interannotator analysis.
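Answer pooling can be sketched as a vote count over participant submissions that flags entries for manual review. This is an illustration under stated assumptions, not the procedure used in the assessment: the function name `pool_answers`, the `min_votes` threshold, and the rule for flagging gold-standard entries that no system returned are all hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (abstract id, gene identifier)


def pool_answers(
    gold: Set[Pair],
    submissions: List[Set[Pair]],
    min_votes: int = 2,
) -> Tuple[List[Pair], List[Pair]]:
    """Flag answer-key entries worth a second look by a human annotator.

    Returns two sorted lists:
      - possible omissions: pairs missing from the gold standard that at
        least min_votes systems proposed;
      - possible spurious entries: gold pairs that no system proposed.
    Both rules are assumptions made for this sketch.
    """
    votes = Counter(pair for sub in submissions for pair in sub)
    possible_omissions = sorted(
        pair for pair, n in votes.items() if n >= min_votes and pair not in gold
    )
    possible_spurious = sorted(pair for pair in gold if votes[pair] == 0)
    return possible_omissions, possible_spurious


# Hypothetical example with three systems and one abstract.
gold_key = {("abs1", "G1"), ("abs1", "G2")}
system_outputs = [
    {("abs1", "G1"), ("abs1", "G3")},
    {("abs1", "G3")},
    {("abs1", "G1")},
]
omissions, spurious = pool_answers(gold_key, system_outputs)
```

The flagged pairs are only candidates; a human still decides whether the answer key or the systems are wrong, but reviewing a short list is much quicker than a full second annotation pass.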

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869005
http://dx.doi.org/10.1186/1471-2105-6-S1-S12


