Protein sequences are treated as stochastic processes over a reduced alphabet of 10 amino acid types. Each realization of such a process is described by its associated transition probability matrix, which corresponds uniquely to the process. New distances between transition probability matrices are then defined for sequence similarity analysis. Two separate datasets are prepared and tested to assess the validity of the method. The results demonstrate that the new method is powerful and efficient.
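The pipeline described above (reduce the alphabet, estimate a transition probability matrix, compare matrices) can be illustrated with a minimal sketch. The abstract specifies neither the 10-type grouping nor the new distances, so both are assumptions here: `GROUPS` is a hypothetical reduction chosen only for illustration, first-order transitions between adjacent residues are assumed, and `frobenius_distance` is a generic stand-in rather than the distance defined in the paper.

```python
from math import sqrt

# Hypothetical 10-group reduction of the 20 amino acids (an assumption;
# the abstract does not say which grouping the authors use).
GROUPS = ["AG", "C", "DE", "FWY", "H", "ILMV", "KR", "NQ", "P", "ST"]
AA_TO_STATE = {aa: i for i, group in enumerate(GROUPS) for aa in group}
N = len(GROUPS)


def transition_matrix(seq):
    """Estimate a first-order transition probability matrix from one sequence."""
    # Map residues to reduced states, skipping unknown symbols such as 'X'.
    states = [AA_TO_STATE[aa] for aa in seq.upper() if aa in AA_TO_STATE]
    counts = [[0] * N for _ in range(N)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that never occur as a source are left as all zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def frobenius_distance(p, q):
    """Generic matrix distance, used as a stand-in for the paper's distances."""
    return sqrt(sum((p[i][j] - q[i][j]) ** 2 for i in range(N) for j in range(N)))


if __name__ == "__main__":
    p = transition_matrix("AGAG")
    q = transition_matrix("ACAC")
    print(frobenius_distance(p, q))
```

A smaller distance indicates that two sequences have more similar transition behaviour under the chosen reduction. Any pairwise distance matrix built this way could then feed a standard clustering or tree-building step.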
DOI: http://dx.doi.org/10.1080/07391102.2009.10507288