Comparison of eukaryotic phylogenetic profiling approaches using species tree aware methods.

BMC Bioinformatics

Laboratoire de Biologie et Génomique Intégrative, Département de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/UDS, Illkirch, France.

Published: November 2009

Background: Phylogenetic profiling encompasses an important set of methodologies for in silico, high-throughput inference of functional relationships between genes. The simplest profiles represent the distribution of gene presence-absence in a set of species as a sequence of 0's and 1's, and it is assumed that functionally related genes will have more similar profiles. The methodology has been used successfully in numerous studies of prokaryotic genomes, although its application in eukaryotes appears problematic, with reported low accuracy due to the complex genomic organization within this domain of life. Recently, some groups have proposed an alternative approach based on the correlation of homologous gene group sizes, taking into account all potentially informative genetic events leading to a change in group size, regardless of whether they result in a de novo group gain or total gene group loss.
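The two profile types described in the Background can be illustrated with a minimal sketch. The abstract does not state which similarity or correlation measures the study used, so the Jaccard index and Pearson correlation below are assumptions chosen for illustration, and the group sizes are hypothetical. Note also that these naive scores ignore the species phylogeny, which the study explicitly takes into account.

```python
from math import sqrt


def to_presence_absence(sizes):
    """Collapse per-species group sizes into a 0/1 presence-absence profile."""
    return [1 if n > 0 else 0 for n in sizes]


def jaccard(a, b):
    """Jaccard similarity between two 0/1 profiles (an assumed measure)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def pearson(x, y):
    """Pearson correlation between two group-size profiles (an assumed measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


# Hypothetical group sizes (gene counts) across six species.
group_a = [1, 2, 0, 4, 1, 3]
group_b = [1, 3, 0, 5, 1, 4]  # expands and contracts together with group_a
group_c = [2, 1, 1, 1, 3, 1]  # present everywhere, sizes unrelated to group_a

pa_a = to_presence_absence(group_a)
pa_b = to_presence_absence(group_b)
pa_c = to_presence_absence(group_c)

print("presence-absence a vs b:", jaccard(pa_a, pa_b))
print("presence-absence a vs c:", jaccard(pa_a, pa_c))
print("group size a vs b:", pearson(group_a, group_b))
print("group size a vs c:", pearson(group_a, group_c))
```

In this toy data, group_a and group_c differ in only one species at the presence-absence level, while their group sizes are negatively correlated, which is the kind of signal a group size based approach can see and a presence-absence profile cannot.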

Results: We have compared the performance of classical presence-absence and group size based approaches using a large, diverse set of eukaryotic species. In contrast to most previous comparisons in Eukarya, we take into account the species phylogeny. We also compare the approaches using two different group categories, based on orthology and on domain-sharing. Our results confirm a limited overall performance of phylogenetic profiling in eukaryotes. Although group size based approaches initially showed an increase in performance for the domain-sharing based groups, this seems to be an overestimation due to a simplistic negative control dataset and the choice of null hypothesis rejection criteria.

Conclusion: Presence-absence profiling is a more accurate classifier of related versus non-related profile pairs when the profiles under consideration have enough information content. Group size based approaches provide a complementary means of detecting domain or family level co-evolution between groups that may be elusive to presence-absence profiling. Moreover, the positive correlation between co-evolution scores and functional links implies that these methods could be used to estimate functional distances between gene groups and to cluster them based on their functional relatedness. This study should have important implications for the future development and application of phylogenetic profiling methods, not only in eukaryotic but also in prokaryotic datasets.
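The Conclusion refers to the information content of a profile without defining it in the abstract. One common way to quantify it, assumed here purely for illustration, is the Shannon entropy of the presence-absence pattern: a gene group present in all species, or in none, has zero entropy and cannot discriminate related from unrelated pairs. The function name and example profiles are hypothetical.

```python
from math import log2


def profile_entropy(profile):
    """Shannon entropy (bits per species) of a 0/1 presence-absence profile.

    This is an assumed proxy for "information content"; the abstract
    does not say how the study measured it.
    """
    n = len(profile)
    if n == 0:
        return 0.0
    p = sum(profile) / n  # fraction of species in which the group is present
    if p == 0.0 or p == 1.0:
        return 0.0  # uniform profile: no information
    return -(p * log2(p) + (1 - p) * log2(1 - p))


print(profile_entropy([1, 1, 1, 1]))  # group present in every species
print(profile_entropy([1, 0, 1, 0]))  # group present in half of the species
print(profile_entropy([1, 1, 1, 0]))  # group absent from a single species
```

A filter of this kind could be applied before scoring profile pairs, so that near-uniform profiles are set aside rather than compared.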


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2787529
DOI: http://dx.doi.org/10.1186/1471-2105-10-383

Publication Analysis

Top Keywords

phylogenetic profiling: 16
group size: 16
size based: 12
based approaches: 12
group: 8
gene group: 8
presence-absence profiling: 8
based: 7
profiling: 6
approaches: 5

Similar Publications

The Low Density Lipoprotein receptors (LDLRs) gene family includes 15 receptors: very low-density lipoprotein receptor (VLDLR), LDLR, Sorting-related receptor with A-type repeats (SORLA), and 12 LDL receptor-related proteins (LRPs): LRP1, LRP1B, LRP2, LRP3, LRP4, LRP5, LRP6, LRP8, LRP10, LRP11, LRP12, LRP13. Most of these are involved in the transduction of key signals during embryonic development and in the regulation of cholesterol homeostasis. In oviparous animals, the VLDL receptor is also known as VTGR since it facilitates the uptake of vitellogenin in the ovary.


sp. nov., isolated from tree bark ( Chev.) and its antioxidant activity.

Int J Syst Evol Microbiol

January 2025

Department of Biochemistry and Microbiology, Faculty of Pharmaceutical Sciences, Chulalongkorn University, Bangkok 10330, Thailand.

A Gram-stain-positive, facultatively anaerobic, rod-shaped strain, designated SPB1-3, was isolated from tree bark. This strain exhibited heterofermentative production of dl-lactic acid from glucose. Optimal growth was observed at 25-40 °C, pH 4.


Background: Nosocomial outbreaks with multidrug-resistant bacteria with a probable reservoir in hospital toilets and drainage systems have been increasingly reported.

Aim: To investigate an increase in bacteraemia with extended-spectrum β-lactamase (ESBL)-producing at our hospital in 2021; the epidemiology of the outbreak suggested an environmental source.

Methods: Available clinical isolates from patients with infection or rectal carriage from 2019 to 2022 were collected.


The Flaviviridae are a family of non-segmented positive-sense enveloped RNA viruses containing significant pathogens including hepatitis C virus and yellow fever virus. Recent large-scale metagenomic surveys have identified many diverse RNA viruses related to classical orthoflaviviruses and pestiviruses but with quite different genome lengths and configurations, and with a hugely expanded host range that spans multiple animal phyla, including molluscs, cnidarians and stramenopiles, and plants. Grouping of RNA-directed RNA polymerase (RdRP) hallmark gene sequences of flavivirus and 'flavi-like' viruses into four divergent clades and multiple lineages within them was congruent with helicase gene phylogeny, PPHMM profile comparisons, and comparison of RdRP protein structure predicted by AlphaFold2.


Molecular Epidemiology of Clostridium perfringens Type F Among Diarrheal Patients and Virulence-Resistance Dynamics - 11 Provinces, China, 2024.

China CDC Wkly

January 2025

Department of Clinical Laboratory, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou City, Zhejiang Province, China.

Introduction: Clostridium perfringens Type F represents a significant pathogen in human gastrointestinal diseases, primarily through its cpe gene encoding enterotoxin (CPE). This investigation examined the prevalence, antimicrobial resistance patterns, and genetic characteristics of C. perfringens Type F within the Chinese population.

Methods: The study analyzed 2,068 stool samples collected from 11 provincial hospitals in 2024.

