Opinion: Strategy of Semi-Automatically Annotating Full Text Corpus of Genomics & Informatics.

Genomics Inform

Bioinformatics Laboratory, ELTEC College of Engineering, Ewha Womans University, Seoul 03760, Korea.

Published: December 2018

The community needs an annotated corpus consisting of the full texts of biomedical journal articles. In response, a prototype full-text corpus of Genomics & Informatics, called GNI version 1.0, was recently published, with 499 annotated full-text articles available as a corpus resource. However, GNI needs to be updated, because the texts were shallow-parsed and annotated with several existing parsers. I list the issues associated with upgrading the annotations and give my opinion on a methodology for developing the next version of the GNI corpus, based on a semi-automatic strategy for more linguistically rich corpus annotation.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440653
DOI: http://dx.doi.org/10.5808/GI.2018.16.4.e40
