A new fast method for inferring multiple consensus trees using k-medoids.

BMC Evol Biol

Département d'Informatique, Université du Québec à Montréal, Case postale 8888, Succursale Centre-ville, Montréal, H3C 3P8, Canada.

Published: April 2018

Background: Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant.

Results: We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms.
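The partitioning step described above can be illustrated with a minimal sketch. It assumes that a pairwise tree-to-tree distance matrix has already been computed (for instance Robinson-Foulds distances; the abstract does not name the metric), and it uses the standard Silhouette index rather than the adapted versions the authors propose. The function names `k_medoids` and `silhouette` and the toy matrix are hypothetical and are not taken from the authors' software.

```python
import random

def _assign(dist, medoids):
    # Give each tree the index of its nearest medoid (ties go to the first).
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix (illustrative)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(dist)), k)
    for _ in range(n_iter):
        labels = _assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster happens to be empty.
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, _assign(dist, medoids)

def silhouette(dist, labels):
    """Mean standard Silhouette width; requires at least two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a width of 0
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in other) / len(other)
                for c, other in clusters.items() if c != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Toy data (made up): six trees in two groups, with distance 2 inside a
# group and distance 10 between groups.
dist = [[0 if i == j else (2 if (i < 3) == (j < 3) else 10)
         for j in range(6)] for i in range(6)]

# Choose the number of clusters that maximizes the Silhouette index.
scores = {}
for k in (2, 3):
    _, labels = k_medoids(dist, k)
    scores[k] = silhouette(dist, labels)
best_k = max(scores, key=scores.get)
```

On this toy matrix the sketch selects two clusters. Under this scheme, one consensus tree would then be inferred per cluster instead of a single consensus tree for the whole collection.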

Conclusions: The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees whose evolutionary histories are similar within a group and different between groups. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while providing similar or better clustering results in most cases. This makes it particularly well suited for the analysis of large genomic and phylogenetic datasets.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5887197
DOI: http://dx.doi.org/10.1186/s12862-018-1163-8
