Inter-species normalization of gene mentions with GNAT.

Bioinformatics

Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA.

Published: August 2008

Motivation: Text mining in the biomedical domain aims to help researchers access information contained in scientific publications in a faster, easier and more complete way. One step towards this aim is the recognition of named entities and their subsequent normalization to database identifiers. Normalization helps to link objects of potential interest, such as genes, to detailed information not contained in a publication; it is also key for integrating different knowledge sources. From an information retrieval perspective, normalization facilitates indexing and querying. Gene mention normalization (GN) is particularly challenging given the high ambiguity of gene names: the same name may refer to orthologous or entirely different genes, may be derived from phenotypes and other biomedical terms, or may resemble common English words.

Results: We present the first publicly available system, GNAT, reported to handle inter-species GN. Our method uses extensive background knowledge on genes to resolve ambiguous names to EntrezGene identifiers. It performs comparably to single-species approaches proposed by us and others. On a benchmark set derived from BioCreative 1 and 2 data that contains genes from 13 species, GNAT achieves an F-measure of 81.4% (90.8% precision at 73.8% recall). For the single-species task, we report an F-measure of 85.4% on human genes.
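The reported scores can be checked against each other with a short calculation. The sketch below assumes that "F-measure" means the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state the weighting, and the function name `f_measure` is chosen here for illustration only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall (beta = 1).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the 13-species benchmark.
f1 = f_measure(0.908, 0.738)
print(f"{f1:.1%}")  # prints 81.4%
```

Under that assumption, the precision and recall given for the inter-species benchmark reproduce the stated F-measure of 81.4%.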

Availability: A web-frontend is available at http://cbioc.eas.asu.edu/gnat/. GNAT will also be available within the BioCreativeMetaService project, see http://bcms.bioinfo.cnio.es.

Supplementary Information: The test data set, lexica, and links to external data are available at http://cbioc.eas.asu.edu/gnat/

Source
http://dx.doi.org/10.1093/bioinformatics/btn299
