Publications by Peter Midford | LitMetric

Publications by authors named "Peter Midford"

Improved BioCyc Operon Prediction: Revisiting the Operon Prediction Problem.

Peter E Midford John Cadigan Peter D Karp

bioRxiv

June 2024

Introduction: Operon prediction is a valuable component of microbial-genome annotation because operon organization can yield inferences about gene function, and because knowledge of operon structure can aid the interpretation of gene expression data.

Methods: We present a number of improvements to the existing Pathway Tools operon predictor based mostly on 7 new features that we hypothesized would increase its performance. The new features include shared Gene Ontology biological process terms, similarity of codon usage and GC content, correlated gene expression, and shared protein complex.

The EcoCyc Database (2023).

Peter D Karp Suzanne Paley Ron Caspi Anamika Kothari Markus Krummenacker Peter E Midford

EcoSal Plus

December 2023

EcoCyc is a bioinformatics database available online at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli.

The EcoCyc Database in 2021.

Ingrid M Keseler Socorro Gama-Castro Amanda Mackie Richard Billington César Bonavides-Martínez Peter E Midford

Front Microbiol

July 2021

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality.

Pathway size matters: the influence of pathway granularity on over-representation (enrichment analysis) statistics.

Peter D Karp Peter E Midford Ron Caspi Arkady Khodursky

BMC Genomics

March 2021

Background: Enrichment or over-representation analysis is a common method used in bioinformatics studies of transcriptomics, metabolomics, and microbiome datasets. The key idea behind enrichment analysis is: given a set of significantly expressed genes (or metabolites), use that set to infer a smaller set of perturbed biological pathways or processes, in which those genes (or metabolites) play a role. Enrichment computations rely on collections of defined biological pathways and/or processes, which are usually drawn from pathway databases.
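
As an illustration of the over-representation idea described above, here is a minimal sketch of a one-sided hypergeometric test, a common choice for this kind of analysis. The abstract does not say which statistic the paper uses, so the test itself, the function name, and the toy numbers below are all assumptions made for illustration.

```python
from math import comb


def enrichment_p_value(universe_size, pathway_size, hits_total, hits_in_pathway):
    """One-sided hypergeometric p-value P(X >= hits_in_pathway).

    universe_size   -- number of genes (or metabolites) considered overall
    pathway_size    -- how many of those belong to the pathway
    hits_total      -- size of the significant set
    hits_in_pathway -- how many significant items fall in the pathway
    (All names are hypothetical; they do not come from the paper.)
    """
    denom = comb(universe_size, hits_total)
    upper = min(pathway_size, hits_total)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, i) * comb(universe_size - pathway_size, hits_total - i)
        for i in range(hits_in_pathway, upper + 1)
    )
    return tail / denom


# Toy example: 4000 genes, a 40-gene pathway, 200 significant genes, 8 in the pathway.
p = enrichment_p_value(4000, 40, 200, 8)
print(f"p-value: {p:.3g}")
```

Because pathway_size enters the formula directly, the same overlap can score very differently for a small pathway than for a large one, which is the kind of granularity effect the title refers to.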

Pathway Tools version 23.0 update: software for pathway/genome informatics and systems biology.

Peter D Karp Peter E Midford Richard Billington Anamika Kothari Markus Krummenacker

Brief Bioinform

January 2021

Motivation: Biological systems function through dynamic interactions among genes and their products, regulatory circuits and metabolic networks. Our development of the Pathway Tools software was motivated by the need to construct biological knowledge resources that combine these many types of data, and that enable users to find and comprehend data of interest as quickly as possible through query and visualization tools. Further, we sought to support the development of metabolic flux models from pathway databases, and to use pathway information to leverage the interpretation of high-throughput data sets.

Taxonomic weighting improves the accuracy of a gap-filling algorithm for metabolic models.

Wai Kit Ong Peter E Midford Peter D Karp

Bioinformatics

March 2020

Motivation: The increasing availability of annotated genome sequences enables construction of genome-scale metabolic networks, which are useful tools for studying organisms of interest. However, due to incomplete genome annotations, draft metabolic models contain gaps that must be filled in a time-consuming process before they are usable. Optimization-based algorithms that fill these gaps have been developed; however, gap-filling algorithms show significant error rates and often introduce incorrect reactions.

The MetaCyc database of metabolic pathways and enzymes - a 2019 update.

Ron Caspi Richard Billington Ingrid M Keseler Anamika Kothari Markus Krummenacker Peter E Midford

Nucleic Acids Res

January 2020

MetaCyc (MetaCyc.org) is a comprehensive reference database of metabolic pathways and enzymes from all domains of life. It contains 2749 pathways derived from more than 60 000 publications, making it the largest curated collection of metabolic pathways.

Using Pathway Covering to Explore Connections among Metabolites.

Peter E Midford Mario Latendresse Paul E O'Maille Peter D Karp

Metabolites

May 2019

Interpreting changes in metabolite abundance in response to experimental treatments or disease states remains a major challenge in metabolomics. Pathway Covering is a new algorithm that takes a list of metabolites (compounds) and determines a minimum-cost set of metabolic pathways in an organism that includes (covers) all the metabolites in the list. We used five functions for assigning costs to pathways, including assigning a constant for all pathways, which yields a solution with the smallest pathway count; two methods that penalize large pathways; one that prefers pathways based on the pathway's assigned function, and one that loosely corresponds to metabolic flux.
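
The covering problem described above is a weighted set cover. The sketch below uses the standard greedy heuristic for weighted set cover, which only approximates a minimum-cost solution; it is not the authors' algorithm, and the pathway names, costs, and function name are hypothetical.

```python
def greedy_pathway_cover(metabolites, pathways, cost):
    """Greedy weighted set cover (an approximation, not an exact optimum).

    metabolites -- iterable of metabolite identifiers to cover
    pathways    -- dict: pathway name -> set of metabolites in that pathway
    cost        -- dict: pathway name -> positive cost
    Returns (chosen pathway names, metabolites that no pathway contains).
    """
    uncovered = set(metabolites)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        # Sorted iteration keeps tie-breaking deterministic.
        for name in sorted(pathways):
            gain = len(pathways[name] & uncovered)
            if gain == 0:
                continue
            ratio = cost[name] / gain  # cost per newly covered metabolite
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            break  # the remaining metabolites appear in no pathway
        chosen.append(best)
        uncovered -= pathways[best]
    return chosen, uncovered


# Toy data (hypothetical): a constant cost favors the fewest pathways.
toy_pathways = {"P1": {"a", "b"}, "P2": {"b", "c"}, "P3": {"a", "b", "c"}}
constant_cost = {name: 1.0 for name in toy_pathways}
print(greedy_pathway_cover(["a", "b", "c"], toy_pathways, constant_cost))
```

Swapping in a different cost dictionary, for example one that grows with pathway size, changes which pathways are preferred without touching the covering loop.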

A Comparison of Microbial Genome Web Portals.

Peter D Karp Natalia Ivanova Markus Krummenacker Nikos Kyrpides Mario Latendresse Peter Midford

Front Microbiol

February 2019

Microbial genome web portals have a broad range of capabilities that address a number of information-finding and analysis needs for scientists. This article compares the capabilities of the major microbial genome web portals to aid researchers in determining which portal(s) are best suited to their needs. We assessed both the bioinformatics tools and the data content of BioCyc, KEGG, Ensembl Bacteria, KBase, IMG, and PATRIC.

The EcoCyc Database.

Peter D Karp Wai Kit Ong Suzanne Paley Richard Billington Ron Caspi Peter E Midford

EcoSal Plus

November 2018

EcoCyc is a bioinformatics database available at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli.

The BioCyc collection of microbial genomes and metabolic pathways.

Peter D Karp Richard Billington Ron Caspi Carol A Fulcher Mario Latendresse Peter E Midford

Brief Bioinform

July 2019

BioCyc.org is a microbial genome Web portal that combines thousands of genomes with additional information inferred by computer programs, imported from other databases and curated from the biomedical literature by biologist curators. BioCyc also provides an extensive range of query tools, visualization services and analysis software.

The MetaCyc database of metabolic pathways and enzymes.

Ron Caspi Richard Billington Carol A Fulcher Ingrid M Keseler Anamika Kothari Peter E Midford

Nucleic Acids Res

January 2018

MetaCyc (https://MetaCyc.org) is a comprehensive reference database of metabolic pathways and enzymes from all domains of life. It contains more than 2570 pathways derived from >54 000 publications, making it the largest curated collection of metabolic pathways.

Emerging semantics to link phenotype and environment.

Anne E Thessen Daniel E Bunker Pier Luigi Buttigieg Laurel D Cooper Wasila M Dahdul Peter E Midford

PeerJ

December 2015

Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments.

Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

Cody E Hinchliff Stephen A Smith James F Allman J Gordon Burleigh Ruchi Chaudhary Peter E Midford

Proc Natl Acad Sci U S A

October 2015

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny.

Finding our way through phenotypes.

Andrew R Deans Suzanna E Lewis Eva Huala Salvatore S Anzaldo Michael Ashburner Peter E Midford

PLoS Biol

January 2015

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes.

Patterns in root traits of woody species hosting arbuscular and ectomycorrhizas: implications for the evolution of belowground strategies.

Louise H Comas Hilary S Callahan Peter E Midford

Ecol Evol

August 2014

Root traits vary enormously among plant species but we have little understanding of how this variation affects their functioning. Of central interest is how root traits are related to plant resource acquisition strategies from soil. We examined root traits of 33 woody species from northeastern US forests that form two of the most common types of mutualisms with fungi, arbuscular mycorrhizas (AM) and ectomycorrhizas (EM).

Semantics in support of biodiversity knowledge discovery: an introduction to the biological collections ontology and related ontologies.

Ramona L Walls John Deck Robert Guralnick Steve Baskauf Reed Beaman Peter Midford

PLoS One

January 2015

The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions.

The vertebrate taxonomy ontology: a framework for reasoning across model organism and species phenotypes.

Peter E Midford Thomas Alex Dececchi James P Balhoff Wasila M Dahdul Nizar Ibrahim

J Biomed Semantics

November 2013

Background: A hierarchical taxonomy of organisms is a prerequisite for semantic integration of biodiversity data. Ideally, there would be a single, expansive, authoritative taxonomy that includes extinct and extant taxa, information on synonyms and common names, and monophyletic supraspecific taxa that reflect our current understanding of phylogenetic relationships.

Description: As a step towards development of such a resource, and to enable large-scale integration of phenotypic data across vertebrates, we created the Vertebrate Taxonomy Ontology (VTO), a semantically defined taxonomic resource derived from the integration of existing taxonomic compilations, and freely distributed under a Creative Commons Zero (CC0) public domain waiver.

Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient.

Arlin Stoltzfus Hilmar Lapp Naim Matasci Helena Deus Brian Sidlauskas Peter E Midford

BMC Bioinformatics

May 2013

Background: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs.

Exploring power and parameter estimation of the BiSSE method for analyzing species diversification.

Matthew P Davis Peter E Midford Wayne Maddison

BMC Evol Biol

February 2013

Background: There has been a considerable increase in studies investigating rates of diversification and character evolution, with one of the promising techniques being the BiSSE method (binary state speciation and extinction). This study uses simulations under a variety of different sample sizes (number of tips) and asymmetries of rate (speciation, extinction, character change) to determine BiSSE's ability to test hypotheses, and investigate whether the method is susceptible to confounding effects.

Results: We found that the power of the BiSSE method is severely affected by both sample size and high tip ratio bias (one character state dominates among observed tips).

500,000 fish phenotypes: The new informatics landscape for evolutionary and developmental biology of the vertebrate skeleton.

Paula Mabee James P Balhoff Wasila M Dahdul Hilmar Lapp Peter E Midford

J Appl Ichthyol

June 2012

The rich phenotypic diversity that characterizes the vertebrate skeleton results from evolutionary changes in regulation of genes that drive development. Although relatively little is known about the genes that underlie the skeletal variation among fish species, significant knowledge of genetics and development is available for zebrafish. Because developmental processes are highly conserved, this knowledge can be leveraged for understanding the evolution of skeletal diversity.

NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

Rutger A Vos James P Balhoff Jason A Caravas Mark T Holder Hilmar Lapp Peter E Midford

Syst Biol

July 2012

In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines.

The teleost anatomy ontology: anatomical representation for the genomics age.

Wasila M Dahdul John G Lundberg Peter E Midford James P Balhoff Hilmar Lapp

Syst Biol

July 2010

The rich knowledge of morphological variation among organisms reported in the systematic literature has remained in free-text format, impractical for use in large-scale synthetic phylogenetic work. This noncomputable format has also precluded linkage to the large knowledgebase of genomic, genetic, developmental, and phenotype data in model organism databases. We have undertaken an effort to prototype a curated, ontology-based evolutionary morphology database that maps to these genetic databases (http://kb.

Evolutionary characters, phenotypes and ontologies: curating data from the systematic biology literature.

Wasila M Dahdul James P Balhoff Jeffrey Engeman Terry Grande Eric J Hilton Peter E Midford

PLoS One

May 2010

Background: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies.

Phenex: ontological annotation of phenotypic diversity.

James P Balhoff Wasila M Dahdul Cartik R Kothari Hilmar Lapp John G Lundberg Peter E Midford

PLoS One

May 2010

Background: Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge.
