Orthology is a widely used concept in comparative and evolutionary genomics. In addition to prokaryotic orthology, delineating eukaryotic orthology has provided insight into the evolution of higher organisms, and many eukaryotic ortholog databases have been established for this purpose. However, unlike in prokaryotes, alternative splicing (AS) complicates orthology assignment in eukaryotes. Existing databases therefore likely contain ambiguous eukaryotic ortholog relationships and may misclassify alternatively spliced protein isoforms as in-paralogs, which are duplicated genes that arise following speciation. Here, we propose a new approach for designating eukaryotic orthology using processed transcription units, and we present an orthology database prototype using the human and mouse genomes. Existing programs cover less than 69% of the human reference sequences when assigning human/mouse orthologs. In contrast, our method covers up to 80% of the human reference sequences. Moreover, the ortholog database presented herein is more than 92% consistent with the existing databases. In addition to handling AS, this approach can identify orthologs of embedded genes and fusion genes using syntenic evidence. In summary, this new approach is sensitive and specific, and it can generate a more comprehensive and accurate compilation of eukaryotic orthologs.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2425467 | PMC |
| http://dx.doi.org/10.1093/nar/gkn227 | DOI Listing |
BMC Bioinformatics
January 2025
Mathematics and Computer Science Department, University of the Balearic Islands, Ctra Valldemossa, Km 7.5, Palma, 07122, Balearic Islands, Spain.
Background: MetaDAG is a web-based tool developed to address challenges posed by big data from omics technologies, particularly in metabolic network reconstruction and analysis. The tool is capable of constructing metabolic networks for specific organisms, sets of organisms, reactions, enzymes, or KEGG Orthology (KO) identifiers. By retrieving data from the KEGG database, MetaDAG helps users visualize and analyze complex metabolic interactions efficiently.
Genome Biol
January 2025
Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, UK.
Background: East African cichlid fishes have diversified in an explosive fashion, but the (epi)genetic basis of the phenotypic diversity of these fishes remains largely unknown. Although transposable elements (TEs) have been associated with phenotypic variation in cichlids, little is known about their transcriptional activity and epigenetic silencing. We set out to bridge this gap and to understand the interactions between TEs and their cichlid hosts.
BMC Bioinformatics
January 2025
The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark.
Background: The increasing amount of genomic data calls for tools that can create genome-scale phylogenies quickly and efficiently. Existing tools rely on large reference databases or require lengthy de novo calculations to identify orthologues, meaning that they have long run times and are limited in their taxonomic scope. To address this, we created getphylo, a python tool for the rapid generation of phylogenetic trees de novo from annotated sequences.
Int J Mol Sci
December 2024
Department of Pharmacology, JSS College of Pharmacy, JSS Academy of Higher Education & Research, Mysuru 570015, Karnataka, India.
Sleep is a universally conserved behavior whose origin and evolutionary purpose are uncertain. Using phylogenomics, this article investigates the evolutionary foundations of sleep from a never before used perspective. More specifically, it identifies orthologs of human sleep-related genes in the Lokiarchaeota of the Asgard superphylum and examines their functional role.
Nat Methods
January 2025
Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
The surge in genome data, with ongoing efforts aiming to sequence 1.5 M eukaryotes in a decade, could revolutionize genomics, revealing the origins, evolution and genetic innovations of biological processes. Yet, traditional genomics methods scale poorly with such large datasets.