Designating eukaryotic orthology via processed transcription units.

Nucleic Acids Res

Institute of Biomedical Informatics, National Yang-Ming University, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.

Published: June 2008

Orthology is a widely used concept in comparative and evolutionary genomics. Beyond prokaryotic orthology, delineating eukaryotic orthology has provided insight into the evolution of higher organisms, and many eukaryotic ortholog databases have been established for this purpose. Unlike in prokaryotes, however, alternative splicing (AS) hampers orthology assignment in eukaryotes. Existing databases therefore likely contain ambiguous eukaryotic ortholog relationships and may misclassify alternatively spliced protein isoforms as in-paralogs, which are duplicated genes that arise following speciation. Here, we propose a new approach for designating eukaryotic orthology using processed transcription units, and we present a prototype orthology database built from the human and mouse genomes. Existing programs cover less than 69% of the human reference sequences when assigning human/mouse orthologs, whereas our method covers up to 80%. Moreover, the ortholog database presented here is more than 92% consistent with existing databases. In addition to handling AS, the approach can identify orthologs of embedded genes and fusion genes using syntenic evidence. In summary, this new approach is sensitive and specific, and it can generate a more comprehensive and accurate compilation of eukaryotic orthologs.
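
To illustrate why grouping isoforms matters, the sketch below collapses isoform-level similarity scores to the level of transcription units before pairing units across species. It is only an assumption-laden illustration: the abstract does not describe the authors' algorithm, so the pairing step here uses the generic reciprocal-best-hit heuristic, the syntenic evidence mentioned above is omitted, and the function name, identifiers and scores are all hypothetical.

```python
def unit_level_orthologs(hits, unit_of):
    """Pair transcription units across two species by reciprocal best hit.

    hits    -- iterable of (human_isoform, mouse_isoform, score) tuples
               (hypothetical similarity scores, higher is better)
    unit_of -- dict mapping each isoform ID to its transcription unit ID
    NOTE: reciprocal best hit is a stand-in heuristic, not the paper's method.
    """
    # Collapse isoform-level scores: keep the best score per pair of units,
    # so several isoforms of one gene never compete as separate "genes".
    best = {}
    for human_iso, mouse_iso, score in hits:
        pair = (unit_of[human_iso], unit_of[mouse_iso])
        best[pair] = max(score, best.get(pair, float("-inf")))

    # Record the top-scoring partner for every unit, in both directions.
    top_for_human = {}
    top_for_mouse = {}
    for (h, m), score in best.items():
        if h not in top_for_human or score > best[(h, top_for_human[h])]:
            top_for_human[h] = m
        if m not in top_for_mouse or score > best[(top_for_mouse[m], m)]:
            top_for_mouse[m] = h

    # Keep only pairs that are each other's best partner.
    return sorted((h, m) for h, m in top_for_human.items()
                  if top_for_mouse[m] == h)


# Hypothetical toy data: human unit hA has two spliced isoforms.
unit_of = {"hA.1": "hA", "hA.2": "hA", "hB.1": "hB",
           "mA.1": "mA", "mB.1": "mB"}
hits = [("hA.1", "mA.1", 950), ("hA.2", "mA.1", 700),
        ("hA.1", "mB.1", 200), ("hB.1", "mB.1", 880),
        ("hB.1", "mA.1", 150)]
pairs = unit_level_orthologs(hits, unit_of)
print(pairs)
```

In this toy input the two isoforms of hA are treated as one unit, so hA is paired once with mA rather than being reported as two related human genes.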

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2425467
DOI: http://dx.doi.org/10.1093/nar/gkn227

Similar Publications

MetaDAG: a web tool to generate and analyse metabolic networks.

BMC Bioinformatics

January 2025

Mathematics and Computer Science Department, University of the Balearic Islands, Ctra Valldemossa, Km 7.5, Palma, 07122, Balearic Islands, Spain.

Background: MetaDAG is a web-based tool developed to address challenges posed by big data from omics technologies, particularly in metabolic network reconstruction and analysis. The tool is capable of constructing metabolic networks for specific organisms, sets of organisms, reactions, enzymes, or KEGG Orthology (KO) identifiers. By retrieving data from the KEGG database, MetaDAG helps users visualize and analyze complex metabolic interactions efficiently.

Background: East African cichlid fishes have diversified in an explosive fashion, but the (epi)genetic basis of the phenotypic diversity of these fishes remains largely unknown. Although transposable elements (TEs) have been associated with phenotypic variation in cichlids, little is known about their transcriptional activity and epigenetic silencing. We set out to bridge this gap and to understand the interactions between TEs and their cichlid hosts.

getphylo: rapid and automatic generation of multi-locus phylogenetic trees.

BMC Bioinformatics

January 2025

The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark.

Background: The increasing amount of genomic data calls for tools that can create genome-scale phylogenies quickly and efficiently. Existing tools rely on large reference databases or require lengthy de novo calculations to identify orthologues, meaning that they have long run times and are limited in their taxonomic scope. To address this, we created getphylo, a Python tool for the rapid generation of phylogenetic trees de novo from annotated sequences.

Sleep is a universally conserved behavior whose origin and evolutionary purpose are uncertain. Using phylogenomics, this article investigates the evolutionary foundations of sleep from a previously unused perspective. More specifically, it identifies orthologs of human sleep-related genes in the Lokiarchaeota of the Asgard superphylum and examines their functional role.

The surge in genome data, with ongoing efforts aiming to sequence 1.5 M eukaryotes in a decade, could revolutionize genomics, revealing the origins, evolution and genetic innovations of biological processes. Yet, traditional genomics methods scale poorly with such large datasets.
