PolyX2: Fast Detection of Homorepeats in Large Protein Datasets.

Genes (Basel)

Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, 55128 Mainz, Germany.

Published: April 2022

Homorepeat sequences, consecutive runs of identical amino acids, are prevalent in eukaryotic proteins. It has become necessary to annotate and evaluate this feature in entire proteomes. The definition of what constitutes a homorepeat is not fixed, and different research approaches may require different definitions; therefore, flexible approaches to analyze homorepeats in complete proteomes are needed. Here, we present polyX2, a fast, simple but tunable script to scan protein datasets for all possible homorepeats. The user can modify the length of the window to scan, the minimum number of identical residues that must be found in the window, and the types of homorepeats to be found.
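The three tunable parameters named above (window length, minimum number of identical residues in the window, and residue types of interest) can be illustrated with a minimal sliding-window sketch. This is not the polyX2 script itself: the function name `scan_homorepeats`, its default values, the merging of overlapping windows into regions, and the 0-based, end-exclusive coordinates are all assumptions made for illustration.

```python
from collections import Counter


def scan_homorepeats(seq, window=10, min_count=8, residues=None):
    """Illustrative sliding-window homorepeat scan (not the polyX2 code).

    Reports (residue, start, end) regions, 0-based with end exclusive,
    built by merging overlapping windows in which one residue occurs at
    least `min_count` times. Defaults are assumed values for illustration.
    """
    if min_count > window:
        raise ValueError("min_count cannot exceed window")
    seq = seq.upper()
    wanted = set(residues) if residues is not None else None
    hits = []
    if len(seq) < window:
        return hits

    open_regions = {}  # residue -> [start, end] of the region being extended
    counts = Counter(seq[:window])
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide the window by one: drop the left residue, add the right one.
            counts[seq[start - 1]] -= 1
            counts[seq[start + window - 1]] += 1
        end = start + window
        for aa, n in counts.items():
            if n < min_count or (wanted is not None and aa not in wanted):
                continue
            region = open_regions.get(aa)
            if region is not None and start <= region[1]:
                region[1] = end  # overlaps the open region: extend it
            else:
                if region is not None:
                    hits.append((aa, region[0], region[1]))
                open_regions[aa] = [start, end]

    # Flush regions still open at the end of the sequence.
    for aa, region in open_regions.items():
        hits.append((aa, region[0], region[1]))
    return sorted(hits, key=lambda h: (h[1], h[0]))


if __name__ == "__main__":
    example = "MKT" + "Q" * 12 + "LLSA"  # made-up sequence with a polyQ run
    print(scan_homorepeats(example))                            # [('Q', 1, 17)]
    print(scan_homorepeats(example, window=10, min_count=10))   # [('Q', 3, 15)]
```

Setting `min_count` equal to `window` restricts the scan to perfect runs, while a lower threshold tolerates interruptions, which is one way to see why a flexible definition matters. Passing `residues={"Q"}` limits the output to a single homorepeat type.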

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9141109
DOI: http://dx.doi.org/10.3390/genes13050758

