Expansion of the BioCyc collection of pathway/genome databases to 160 genomes.

Nucleic Acids Res

Bioinformatics Research Group, SRI International EK207, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.

Published: October 2005

The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range in the number of pathways present in these organisms and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1266070
DOI: http://dx.doi.org/10.1093/nar/gki892


Similar Publications

Introduction: Changes in the human gut microbiome have been linked to various chronic diseases, including chronic obstructive pulmonary disease (COPD). While substantial knowledge is available on the genomic features of fecal communities, little is known about the microbiome's transcriptional activity. Here, we analyzed the metatranscriptomic (MTR) abundance of MetaCyc pathways, SuperPathways, and protein domain families (PFAM) represented by the gut microbiome in a cohort of non-small cell lung cancer (NSCLC) patients with or without COPD comorbidity.


Leveraging Curation Among Pathway/Genome Databases Using Ortholog-Based Annotation Propagation.

Front Microbiol

March 2021

Bioinformatics Research Group, SRI International, Menlo Park, CA, United States.

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors.


Pathway size matters: the influence of pathway granularity on over-representation (enrichment analysis) statistics.

BMC Genomics

March 2021

Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 140 Gortner Lab, 1479 Gortner Ave, Saint Paul, 55198, MN, USA.

Background: Enrichment or over-representation analysis is a common method used in bioinformatics studies of transcriptomics, metabolomics, and microbiome datasets. The key idea behind enrichment analysis is: given a set of significantly expressed genes (or metabolites), use that set to infer a smaller set of perturbed biological pathways or processes, in which those genes (or metabolites) play a role. Enrichment computations rely on collections of defined biological pathways and/or processes, which are usually drawn from pathway databases.
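The key idea described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, which is one common way to score over-representation but is not necessarily the statistic used in the paper. The function names (`hypergeom_tail`, `enrich`) and the toy gene and pathway identifiers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: size of the background gene set
    successes:  background genes that belong to the pathway
    draws:      size of the significant-gene list
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def enrich(hits, background, pathways):
    """Return {pathway name: p-value} for each pathway definition.

    hits:       iterable of significant gene IDs
    background: iterable of all gene IDs that could have been detected
    pathways:   dict mapping pathway name -> iterable of member gene IDs
    """
    background = set(background)
    hits = set(hits) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(hits & members)
        results[name] = hypergeom_tail(
            overlap, len(background), len(members), len(hits)
        )
    return results


if __name__ == "__main__":
    # Toy data (hypothetical): 20 background genes, two pathways.
    background = [f"g{i}" for i in range(20)]
    pathways = {
        "small_pathway": [f"g{i}" for i in range(5)],
        "large_pathway": [f"g{i}" for i in range(10, 20)],
    }
    hits = ["g0", "g1", "g2", "g10"]
    for name, p in enrich(hits, background, pathways).items():
        print(f"{name}: p = {p:.4f}")
```

Because the p-value depends on the pathway size passed as `successes`, splitting or merging pathway definitions changes the result even when the hit list stays the same.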


The MetaCyc database of metabolic pathways and enzymes - a 2019 update.

Nucleic Acids Res

January 2020

SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA.

MetaCyc (MetaCyc.org) is a comprehensive reference database of metabolic pathways and enzymes from all domains of life. It contains 2749 pathways derived from more than 60 000 publications, making it the largest curated collection of metabolic pathways.

