Estimation of relative effectiveness of phylogenetic programs by machine learning.

J Bioinform Comput Biol

Belozersky Institute of Moscow State University, Moscow 119991, Russia; Gamaleya Institute of Epidemiology and Microbiology, Moscow 123098, Russia.

Published: April 2014

Reconstructing the phylogeny of a protein family from a sequence alignment can produce results of varying quality. Our goal is to predict the quality of phylogeny reconstruction from features that can be extracted from the input alignment. We used the Fitch-Margoliash (FM) method of phylogeny reconstruction and a random forest as the predictor. To train and test the predictor, we used alignments of orthologous series (OS), for which the result of phylogeny reconstruction can be evaluated by comparison with the trees of the corresponding organisms. Our results show that the quality of phylogeny reconstruction can be predicted with more than 80% precision. We also tried to predict which phylogeny reconstruction method, FM or UPGMA, is better for a particular alignment. With the feature set used, among alignments for which the predictor predicts better performance of UPGMA, 56% actually give a better result with UPGMA. Given that UPGMA performs better for only 34% of the alignments in our testing set, this result shows that it is possible in principle to predict the better phylogeny reconstruction method from features of a sequence alignment.
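The comparison between the 56% figure and the 34% figure in the abstract is a comparison of precision against a base rate. The sketch below shows one way to compute both quantities and their ratio (often called lift). The function names and the toy labels are hypothetical and are not taken from the paper; only the 0.56 and 0.34 values come from the abstract.

```python
from typing import Sequence, Tuple


def precision_and_base_rate(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (precision, base_rate) for a binary prediction task.

    actual[i]    -- True if UPGMA really gave the better tree for alignment i
    predicted[i] -- True if the predictor said UPGMA would be better
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one alignment is required")

    # Outcomes for the alignments the predictor flagged as "UPGMA better".
    flagged = [a for a, p in zip(actual, predicted) if p]
    if not flagged:
        raise ValueError("the predictor flagged no alignments")

    precision = sum(flagged) / len(flagged)
    base_rate = sum(actual) / len(actual)
    return precision, base_rate


def lift(precision: float, base_rate: float) -> float:
    """Ratio of precision to base rate; values above 1 beat random flagging."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return precision / base_rate


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    actual = [True, False, True, False, False, True]
    predicted = [True, True, False, False, False, True]
    p, b = precision_and_base_rate(actual, predicted)
    print(f"toy precision={p:.2f}, base rate={b:.2f}, lift={lift(p, b):.2f}")

    # Values reported in the abstract: 56% precision, 34% base rate.
    print(f"reported lift={lift(0.56, 0.34):.2f}")
```

With the reported values the ratio is about 1.65, meaning that alignments flagged by the predictor are roughly 1.65 times as likely to favor UPGMA as an alignment drawn at random from the testing set.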

Source
http://dx.doi.org/10.1142/S0219720014410042

Publication Analysis

Top Keywords

phylogeny reconstruction: 24
sequence alignment: 8
quality phylogeny: 8
basing features: 8
reconstruction method: 8
better result: 8
reconstruction: 7
phylogeny: 7
better: 5
estimation relative: 4

Similar Publications

Evolutionary events leading to organismal preference for a specific growth temperature, as well as genes whose products are needed for a proper function at that temperature, are poorly understood. Using 64 bacteria from phylum Thermotogota as a model system, we examined how optimal growth temperature changed throughout Thermotogota history. We inferred that Thermotogota's last common ancestor was a thermophile and that some Thermotogota evolved the mesophilic and hyperthermophilic lifestyles secondarily.

During the study of algal diversity in pyroclastic deposits of the Kamchatka Peninsula, Chlorella-like green algae strains VCA-72 and VCA-93 were isolated from samples collected from along the Baydarnaya river bed on the Shiveluch volcano in 2018 and at the outlet of thermal vapors along the edge of the caldera on the southern slope of the Gorely volcano in 2020. Identification of the strains was carried out within the framework of an integrative approach using microscopic and molecular genetic methods, including preliminary taxon identification, obtaining nucleotide sequences of the small subunit and the internal transcribed spacer rRNA, reconstruction of phylogenetic trees and secondary structures of the ITS1 and ITS2 rRNA regions. On the phylogenetic tree, strain VCA-93 was clustered in the Micractinium thermotolerans species clade.

The Malvaceae family, comprising 9 subfamilies and 4,225 species, includes economically significant taxa, such as Ceiba pentandra, Gossypium ekmanianum, Gossypium stephensii, Kokia drynarioides, Talipariti hamabo, and Durio zibethinus. Chloroplast (cp) genome research is crucial for elucidating the evolutionary divergence and species identification within this family. In this study, we assembled and annotated cp genomes of six Malvaceae species, conducting comprehensive comparative genomic and phylogenomic analyses.

On the parameterized complexity of the median and closest problems under some permutation metrics.

Algorithms Mol Biol

December 2024

Instituto de Computação, Universidade Federal Fluminense, Niterói, Brazil.

Genome rearrangements are events where large blocks of DNA exchange places during evolution. The analysis of these events is a promising tool for understanding evolutionary genomics, providing data for phylogenetic reconstruction based on genome rearrangement measures. Many pairwise rearrangement distances have been proposed, based on finding the minimum number of rearrangement events to transform one genome into the other, using some predefined operation.

Phylogeny, evolution and a re-classification of the .

Stud Mycol

December 2024

Herbarium Hamburgense, Institute for Plant Science and Microbiology, University of Hamburg, Ohnhorststr. 18, D-22609, Hamburg, Germany.

The is an independent lichenized lineage within the comprising . 390 species and 50 genera. Very few studies have dealt with family and genus classification using molecular data and many groups are in need of thorough revision.
