Robust expansion of phylogeny for fast-growing genome sequence data.

PLoS Comput Biol

State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China.

Published: February 2024

Massive sequencing of SARS-CoV-2 genomes has created an urgent need for methods that add new samples to existing phylogenies efficiently instead of inferring trees de novo. TIPars was developed to meet this challenge by integrating parsimony analysis with pre-computed ancestral sequences. It took about 21 seconds to insert 100 SARS-CoV-2 genomes into a 100k-taxa reference tree using 1.4 gigabytes of memory. In benchmarks on four datasets, TIPars achieved the highest accuracy for phylogenies of moderately similar sequences. For highly similar and for divergent sequences, fully parsimony-based and likelihood-based phylogenetic placement methods performed best, respectively, with TIPars second best in both cases. TIPars thus achieves efficient and accurate expansion of phylogenies of both similar and divergent sequences, which should have broad biological applications beyond SARS-CoV-2. TIPars is accessible at https://tipars.hku.hk/ and the source code is available at https://github.com/id-bioinfo/TIPars.
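The idea of placing a query on a reference tree by parsimony, using sequences already known at internal nodes, can be illustrated with a small sketch. This is a simplified toy under stated assumptions, not the TIPars implementation: the function names (`hamming`, `median_of_three`, `place`), the toy tree, and the scoring rule (number of substitutions on the new branch after attaching the query at a site-wise median of the two branch endpoints and the query) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Branch = Tuple[str, str]  # (parent node name, child node name)


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def median_of_three(parent: str, child: str, query: str) -> str:
    """Site-wise median: take the state shared by child and query if they
    agree, otherwise keep the parent state (assumed tie-breaking rule)."""
    return "".join(c if c == q else p for p, c, q in zip(parent, child, query))


def place(query: str, branches: List[Branch],
          sequences: Dict[str, str]) -> Tuple[Branch, int]:
    """Return the branch whose attachment point needs the fewest
    substitutions to reach the query, together with that count.
    Ties go to the first branch in the list."""
    best_branch, best_score = None, None
    for parent, child in branches:
        attach = median_of_three(sequences[parent], sequences[child], query)
        score = hamming(attach, query)  # substitutions on the new branch
        if best_score is None or score < best_score:
            best_branch, best_score = (parent, child), score
    if best_branch is None:
        raise ValueError("the reference tree has no branches")
    return best_branch, best_score


# Toy reference tree; "root" and "n1" stand in for pre-computed
# ancestral sequences (made-up data, assumed for this example).
sequences = {
    "root": "ACGTACGT",
    "n1":   "ACGTACGA",
    "A":    "ACGTTCGA",
    "B":    "ACGAACGA",
    "C":    "GCGTACGT",
}
branches = [("root", "n1"), ("n1", "A"), ("n1", "B"), ("root", "C")]

best_branch, best_score = place("ACGTTCCA", branches, sequences)
print(best_branch, best_score)
```

Because each branch is scored using only the sequences at its two ends and the query, the search is a single pass over the branches of the reference tree, with no re-inference of the existing topology.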

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10898724
DOI: http://dx.doi.org/10.1371/journal.pcbi.1011871
