Genomic Fishing and Data Processing for Molecular Evolution Research.

Methods Protoc

Department of Biodiversity, Ecology and Evolution, Complutense University of Madrid, 28040 Madrid, Spain.

Published: March 2022

Molecular evolution analyses, such as detection of adaptive or purifying selection and ancestral protein reconstruction, typically require three inputs for a target gene (or gene family) in a particular group of organisms: a sequence alignment, a model of evolution, and a phylogenetic tree. Although advances in high-throughput sequencing have led to the rapid accumulation of genome-scale data in public repositories and databases, mining such a vast amount of information often remains challenging. Here, we describe a comprehensive, versatile workflow for preparing genome-extracted datasets that are ready for use in molecular evolution research. The workflow involves: (1) fishing (searching for and capturing) specific gene sequences of interest from taxonomically diverse genomic data available in databases at variable levels of annotation, (2) processing and curation of the retrieved sequences, (3) production of a multiple sequence alignment, (4) selection of the best-fit model of evolution, and (5) robust reconstruction of a phylogenetic tree.
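As an illustration of step (2), the following is a minimal Python sketch of how retrieved sequences might be filtered before alignment. The abstract does not specify the filtering criteria or tools used, so everything here is an assumption made for illustration: the function names (`parse_fasta`, `clean_sequences`), the thresholds (`min_length`, `max_ambiguous_fraction`), and the choice to drop exact duplicates are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: nucleotide data; any other character counts as ambiguous.
VALID_NUCLEOTIDES = set("ACGT")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def clean_sequences(
    records: Iterable[Tuple[str, str]],
    min_length: int = 9,                   # hypothetical threshold
    max_ambiguous_fraction: float = 0.05,  # hypothetical threshold
) -> Dict[str, str]:
    """Drop sequences that are too short, too ambiguous, or exact duplicates."""
    kept: Dict[str, str] = {}
    seen = set()
    for header, seq in records:
        if not seq or len(seq) < min_length:
            continue
        ambiguous = sum(1 for base in seq if base not in VALID_NUCLEOTIDES)
        if ambiguous / len(seq) > max_ambiguous_fraction:
            continue
        if seq in seen:
            continue
        seen.add(seq)
        kept[header] = seq
    return kept


# Made-up example input, not data from the publication.
example = """>sp1 gene
ATGGCCAAGT
>sp2 gene
ATGGCCAAGT
>sp3 gene
ATGNNNNNGT
>sp4 gene
ATG
>sp5 gene
atggcgaagtaa
""".splitlines()

cleaned = clean_sequences(parse_fasta(example))
for name, sequence in cleaned.items():
    print(name, sequence)
```

In this toy input, only the first and last records pass: the second is an exact duplicate, the third has too many ambiguous bases, and the fourth is too short. A real pipeline would likely apply additional checks (for example, reading-frame or homology checks) that this sketch does not attempt.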


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8938851
DOI: http://dx.doi.org/10.3390/mps5020026

