Background: Identifying polymorphism clades on phylogenetic trees can help detect point mutations that are associated with viral functions. Visualization tools that color the tree make it easy to find, by eye, clades in which most sequences share the same polymorphism state. However, with the rapid accumulation of viral sequences, a computational tool that automates this process is urgently needed.

Results: Here, by implementing a branch-and-bound-like search method, we developed an R package named sitePath to identify polymorphism clades automatically. Fixed and parallel mutations can then be inferred from the identified polymorphism clades. sitePath also integrates visualization tools to generate figures of the calculated results. In an example with the influenza A virus H3N2 dataset, the detected fixed mutations coincide with antigenic shift mutations. High specificity and sensitivity in finding fixed mutations were achieved across a range of parameters and with different phylogenetic tree inference software.

Conclusions: The results suggest that sitePath can identify polymorphism clades for each site. The clustering of sequences on a phylogenetic tree can be used to infer fixed and parallel mutations. sitePath can also generate high-quality figures of the calculated results.
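
As an illustration of what "a clade where most sequences have the same polymorphism state" means for a single alignment site, the following is a minimal Python sketch. It is not sitePath's implementation or API: the toy tree, the function names `leaves` and `polymorphism_clades`, and the `min_size` and `min_purity` thresholds are all assumptions made for this example. It uses a plain top-down search that stops descending once a clade is sufficiently dominated by one state, which only loosely echoes the pruning idea behind a branch-and-bound-like search.

```python
from collections import Counter

# Toy tree (hypothetical data): an internal node is a list of children,
# a tip is a (name, state) tuple holding the residue at one alignment site.


def leaves(node):
    """Return all (name, state) tips under a node."""
    if isinstance(node, tuple):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def polymorphism_clades(node, min_size=2, min_purity=0.8):
    """Top-down search for the largest clades dominated by one state.

    A clade is reported when it has at least `min_size` tips and its most
    common state covers at least `min_purity` of them. Once a clade is
    reported, its subclades are not explored further.
    """
    tips = leaves(node)
    if len(tips) < min_size:
        return []
    state, count = Counter(s for _, s in tips).most_common(1)[0]
    if count / len(tips) >= min_purity:
        return [(state, sorted(name for name, _ in tips))]
    if isinstance(node, tuple):
        return []  # a single tip cannot be split further
    found = []
    for child in node:
        found.extend(polymorphism_clades(child, min_size, min_purity))
    return found


tree = [
    [("s1", "K"), ("s2", "K"), [("s3", "K"), ("s4", "K")]],
    [[("s5", "N"), ("s6", "N")], [("s7", "N"), ("s8", "K")]],
]
clades = polymorphism_clades(tree)
for state, names in clades:
    print(state, names)
```

In this toy tree the root is too mixed to qualify, so the search descends and reports one clade of state K (s1 to s4) and one of state N (s5, s6); the pair s7/s8 is split evenly and is not reported.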

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9701067
DOI: http://dx.doi.org/10.1186/s12859-022-05064-4
