Sequence alignment has become one of the essential bioinformatics tools in biomedical research. Existing sequence alignment methods can produce reliable alignments for homologous proteins that share a high percentage of sequence identity, but their performance deteriorates sharply for sequence pairs sharing less than 25% sequence identity. We report here a new method, NdPASA, for pairwise sequence alignment. This method employs neighbor-dependent propensities of amino acids as a unique parameter for alignment. The neighbor-dependent propensity values measure the preference of an amino acid pair for adopting a particular secondary structure conformation. NdPASA optimizes alignment by evaluating the likelihood that a residue pair in the query sequence matches a corresponding residue pair adopting a particular secondary structure in the template sequence. Using superpositions of homologous proteins derived from PSI-BLAST analysis and the Structural Classification of Proteins (SCOP) classification of a nonredundant Protein Data Bank (PDB) database as a gold standard, we show that NdPASA improves pairwise alignment. Statistical analyses of the performance of NdPASA indicate that introducing sequence patterns of secondary structure derived from neighbor-dependent sequence analysis clearly improves alignment performance for sequence pairs sharing less than 20% sequence identity. For sequence pairs sharing 13-21% sequence identity, NdPASA improves the accuracy of alignment over the conventional global alignment (GA) algorithm using the BLOSUM62 substitution matrix by an average of 8.6%. NdPASA is most effective for aligning query sequences with template sequences whose structure is known. NdPASA can be accessed online at http://astro.temple.edu/feng/Servers/BioinformaticServers.htm.
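The general idea described in the abstract, a global alignment whose score includes a term for how well a query residue pair fits the secondary structure of the template, can be illustrated with a minimal sketch. This is not the NdPASA implementation: the scoring formula, the `weight` parameter, the `pair_propensity` table and its values, and the function name `global_align` are all assumptions made for illustration, and a simple match/mismatch score stands in for BLOSUM62 to keep the example self-contained.

```python
from typing import Dict, List, Tuple


def global_align(
    query: str,
    template: str,
    template_ss: str,
    pair_propensity: Dict[Tuple[str, str], float],
    match: float = 1.0,
    mismatch: float = -1.0,
    gap: float = -2.0,
    weight: float = 1.0,
) -> Tuple[float, str, str]:
    """Needleman-Wunsch global alignment with a hypothetical
    neighbor-dependent secondary-structure bonus.

    template_ss holds one secondary-structure letter per template residue.
    pair_propensity maps (query residue pair, structure letter) to a bonus;
    this table is an illustrative assumption, not published NdPASA data.
    """
    n, m = len(query), len(template)
    score: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    # back[i][j] records the move that produced score[i][j]:
    # "D" = align residues, "U" = gap in template, "L" = gap in query.
    back: List[List[str]] = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], back[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        score[0][j], back[0][j] = j * gap, "L"

    def pair_score(i: int, j: int) -> float:
        s = match if query[i - 1] == template[j - 1] else mismatch
        if i >= 2:
            # Bonus for the query pair (previous residue, current residue)
            # landing on the template's structure state at position j.
            key = (query[i - 2:i], template_ss[j - 1])
            s += weight * pair_propensity.get(key, 0.0)
        return s

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + pair_score(i, j), "D"),
                (score[i - 1][j] + gap, "U"),
                (score[i][j - 1] + gap, "L"),
            ]
            # max keeps the first candidate on ties, so "D" is preferred.
            score[i][j], back[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment.
    aligned_q: List[str] = []
    aligned_t: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "D":
            aligned_q.append(query[i - 1])
            aligned_t.append(template[j - 1])
            i, j = i - 1, j - 1
        elif move == "U":
            aligned_q.append(query[i - 1])
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_t.append(template[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_q)), "".join(reversed(aligned_t))
```

For example, `global_align("AD", "ACD", "HHH", {})` aligns `A-D` against `ACD`. Supplying a non-empty propensity table shifts the score toward placements where query residue pairs coincide with structure states they favor, which is the kind of signal the abstract reports as helpful at low sequence identity.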


Source: http://dx.doi.org/10.1002/prot.20359
