Genomes and transcriptomes are now typically sequenced by individual laboratories but analyzing them often remains challenging. One essential step in many analyses lies in identifying orthologs-corresponding genes across multiple species-but this is far from trivial. The Orthologous MAtrix (OMA) database is a leading resource for identifying orthologs among publicly available, complete genomes. Here, we describe the OMA pipeline available as a standalone program for Linux and Mac. When run on a cluster, it has native support for the LSF, SGE, PBS Pro, and Slurm job schedulers and can scale up to thousands of parallel processes. Another key feature of OMA standalone is that users can combine their own data with existing public data by exporting genomes and precomputed alignments from the OMA database, which currently contains over 2100 complete genomes. We compare OMA standalone to other methods in the context of phylogenetic tree inference, by inferring a phylogeny of Lophotrochozoa, a challenging clade within the protostomes. We also discuss other potential applications of OMA standalone, including identifying gene families having undergone duplications/losses in specific clades, and identifying potential drug targets in nonmodel organisms. OMA standalone is available under the permissive open source Mozilla Public License Version 2.0.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6633268 | PMC |
http://dx.doi.org/10.1101/gr.243212.118 | DOI Listing |
F1000Res
June 2020
Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland.
Knowledge of species phylogeny is critical to many fields of biology. In an era of genome data availability, the most common way to make a phylogenetic species tree is by using multiple protein-coding genes, conserved in multiple species. This methodology is composed of several steps: orthology inference, multiple sequence alignment and inference of the phylogeny with dedicated tools.
View Article and Find Full Text PDFGenome Res
July 2019
Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
Bioinformatics
July 2017
Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.
Motivation: Accurate orthology inference is a fundamental step in many phylogenetics and comparative analysis. Many methods have been proposed, including OMA (Orthologous MAtrix). Yet substantial challenges remain, in particular in coping with fragmented genes or genes evolving at different rates after duplication, and in scaling to large datasets.
View Article and Find Full Text PDFOrthology inference and other sequence analyses across multiple genomes typically start by performing exhaustive pairwise sequence comparisons, a process referred to as "all-against-all". As this process scales quadratically in terms of the number of sequences analysed, this step can become a bottleneck, thus limiting the number of genomes that can be simultaneously analysed. Here, we explored ways of speeding-up the all-against-all step while maintaining its sensitivity.
View Article and Find Full Text PDFPLoS One
September 2013
Eldgenössische Technische Hochschule Zurich, Department of Computer Science, Zürich, Switzerland.
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!