Protein sequences are treated as stochastic processes over a reduced amino acid alphabet of 10 amino acid types. Each realization of such a process is described by its associated transition probability matrix, which corresponds uniquely to the process. New distances between transition probability matrices are then defined for sequence similarity analysis. The method is tested on two separate datasets to assess its validity. The results demonstrate that the new method is powerful and efficient.
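
The idea of mapping a sequence to a transition probability matrix and then comparing matrices can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not state which 10-group reduction is used, so `GROUPS` is a hypothetical grouping, the process is taken to be a first-order chain over adjacent residues, and `frobenius_distance` is a generic stand-in for the new distances, which the abstract does not define.

```python
# Hypothetical 10-group reduction of the 20 standard amino acids.
# ASSUMPTION: the abstract does not give the grouping actually used.
GROUPS = ["C", "G", "P", "AST", "DE", "NQ", "KR", "H", "ILMV", "FWY"]
RESIDUE_TO_STATE = {aa: i for i, group in enumerate(GROUPS) for aa in group}
N_STATES = len(GROUPS)


def transition_matrix(seq):
    """Estimate a row-stochastic 10x10 matrix from adjacent-residue counts."""
    # Residues outside the 20 standard letters are skipped, so their
    # neighbours are treated as adjacent (a simplification of this sketch).
    states = [RESIDUE_TO_STATE[aa] for aa in seq.upper() if aa in RESIDUE_TO_STATE]
    counts = [[0] * N_STATES for _ in range(N_STATES)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that never occur as a source keep an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def frobenius_distance(p, q):
    """Frobenius norm of the difference between two matrices.

    ASSUMPTION: a placeholder distance, not the one defined in the paper.
    """
    return sum(
        (a - b) ** 2 for row_p, row_q in zip(p, q) for a, b in zip(row_p, row_q)
    ) ** 0.5


if __name__ == "__main__":
    m1 = transition_matrix("AAKK")
    m2 = transition_matrix("AKAK")
    print(frobenius_distance(m1, m2))
```

Because the comparison operates on fixed-size matrices rather than on aligned residues, sequences of different lengths can be compared directly, which is what makes this style of analysis alignment-free.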

Source
http://dx.doi.org/10.1080/07391102.2009.10507288
