Background: Orthology is a central tenet of comparative genomics, and ortholog identification is instrumental to protein function prediction. Major advances have been made in determining orthology relations among a set of homologous proteins. However, existing methods depend on the comparison of individual sequences and do not take divergent orthologs into account.

Results: We have developed an iterative orthology prediction method, Ortho-Profile, that uses reciprocal best hits at the level of sequence profiles to infer orthology. It increases ortholog detection by 20% compared to sequence-to-sequence comparisons. Ortho-Profile predicts 598 human orthologs of mitochondrial proteins from Saccharomyces cerevisiae and Schizosaccharomyces pombe with 94% accuracy. Of these, 181 were not known to localize to mitochondria in mammals. Among the predictions of the Ortho-Profile method are 11 human cytochrome c oxidase (COX) assembly proteins that are implicated in mitochondrial function and disease. Their co-expression patterns, experimentally verified subcellular localization, and co-purification with human COX-associated proteins support these predictions. For the human gene C12orf62, the ortholog of S. cerevisiae COX14, we specifically confirm its role in negative regulation of the translation of cytochrome c oxidase.

Conclusions: Divergent homologs can often only be detected by comparing sequence profiles and profile-based hidden Markov models. The Ortho-Profile method takes advantage of these techniques in the quest for orthologs.
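The reciprocal-best-hit criterion named in the Results can be illustrated with a minimal Python sketch. This is not the Ortho-Profile implementation: it assumes that similarity scores between two proteomes have already been computed (for example from profile comparisons) and are supplied as plain dictionaries. The function names, protein identifiers, and scores below are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (query, target) -> similarity score, higher is better.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Return, for each query, the target with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Made-up scores for illustration only.
yeast_vs_human: Scores = {
    ("yeastA", "humanX"): 50.0,
    ("yeastA", "humanY"): 30.0,
    ("yeastB", "humanY"): 40.0,
}
human_vs_yeast: Scores = {
    ("humanX", "yeastA"): 48.0,
    ("humanY", "yeastA"): 35.0,
    ("humanY", "yeastB"): 20.0,
}

pairs = reciprocal_best_hits(yeast_vs_human, human_vs_yeast)
print(pairs)
```

In this toy example, yeastB's best hit is humanY, but humanY's best hit is yeastA, so that pair is not reciprocal and is dropped. Only the pair (yeastA, humanX) satisfies the criterion in both directions.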


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3334569
DOI: http://dx.doi.org/10.1186/gb-2012-13-2-r12


Similar Publications

Clinical variants of are associated with frontotemporal dementia (FTD), amyotrophic lateral sclerosis (ALS) and other degenerative diseases. The predicted ortholog of is encoded by , but functional orthology has not been demonstrated. We undertook CRISPR/Cas9-based genome editing of the locus to create a complete loss-of-function allele; all exons and introns were deleted, creating , which resulted in neurodegeneration after oxidative stress. Next, we undertook CRISPR-based genome editing to replace exons with human TARDBP coding sequences, creating humanized ( ) expressing TDP-43. Based on the efficiency of this genome editing, we suggest that iterative genome editing of the target locus using linked coCRISPR markers, like , would be a more efficient strategy for sequential assembly of the large engineered transgenes.


The KEGG Orthology (KO) database is a widely used molecular function reference database which can be used to conduct functional annotation of most microorganisms. At present, there are many KEGG tools based on the KO entries for annotating functional orthologs. However, determining how to efficiently extract and sort the annotation results of KEGG still hinders the subsequent genome analysis.


GDSL-type esterase/lipase (GELP) enzymes have key functions in plants, such as developmental processes, anther and pollen development, and responses to biotic and abiotic stresses. Genes that encode GELP belong to a complex and large gene family, ranging from tens to more than hundreds of members per plant species. To facilitate functional transfer between them, we conducted a genome-wide classification of GELP in 46 plant species.


Malaria remains a major healthcare risk to growing economies like India, and a chromosome-level reference genome of is critical for successful vector management and understanding of vector evolution using comparative genomics. We report chromosome-level assemblies of an Indian strain, STE2, and a Pakistani strain SDA-500 by combining draft genomes of the two strains using a homology-based iterative approach. The resulting assembly IndV3/PakV3 with L50 of 9/12 and N50 6.


Genome-scale metabolic models are a powerful tool to study the inner workings of biological systems and to guide applications. The advent of cheap sequencing has brought the opportunity to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process where extensive manual curation is required.

