The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Ross Overbeek, Robert Olson, Gordon D Pusch, Gary J Olsen, James J Davis, Terry Disz, Robert A Edwards, Svetlana Gerdes, Bruce Parrello, Maulik Shukla, Veronika Vonstein, Alice R Wattam, Fangfang Xia, Rick Stevens

Nucleic Acids Res

Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA, Computation Institute, University of Chicago, Chicago, IL 60637, USA, Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA, Department of Computer Science, San Diego State University, San Diego, CA 92182, USA, Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24060, USA, Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA and Department of Computer Science, University of Chicago, Chicago, IL 60637, USA.

Published: January 2014

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965101
DOI: http://dx.doi.org/10.1093/nar/gkt1226

