DNA sequence analysis linguistic tools: contrast vocabularies, compositional spectra and linguistic complexity.

Appl Bioinformatics

Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa, Israel.

Published: June 2004

This review covers methods based on counting oligomers in nucleotide and amino acid sequences. Such methods are analogous to the formal linguistic analysis of human texts. The review includes methods based on calculating the observed occurrences (frequencies) of oligomers and their distribution, as well as methods based on deviations between observed and expected occurrences (contrast words, genome signatures) in biological sequences. Both types of methods have a wide range of sensitivity and can identify homologous sequences as well as functionally and taxonomically related ones.
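To make the two families of methods concrete, the following is a minimal sketch, not the implementation of any method covered by the review. It counts overlapping oligomers (k-mers) and then computes an observed/expected ratio for each one. The function names `count_oligomers` and `contrast_ratios` are hypothetical, and the expected count assumes a simple independent-base (zero-order) background model estimated from the sequence itself; published contrast-word and genome-signature methods may use other background models.

```python
from collections import Counter
from math import prod


def count_oligomers(seq, k):
    """Count overlapping k-mers in seq (case-insensitive, no ambiguity-code handling)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def contrast_ratios(seq, k):
    """Return observed/expected ratios for every k-mer present in seq.

    Assumption: expected counts come from an independent-base model,
    i.e. (number of windows) * product of single-letter frequencies.
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    observed = count_oligomers(seq, k)
    windows = n - k + 1  # number of overlapping positions for a k-mer
    ratios = {}
    for word, obs in observed.items():
        expected = windows * prod(base_freq[base] for base in word)
        ratios[word] = obs / expected
    return ratios


if __name__ == "__main__":
    sequence = "ACGTACGTTTGACA"  # toy sequence for illustration only
    print(count_oligomers(sequence, 2).most_common(3))
    for word, ratio in sorted(contrast_ratios(sequence, 2).items()):
        print(word, round(ratio, 2))
```

Ratios well above 1 mark over-represented words and ratios well below 1 mark under-represented ones under this assumed background; the same counting routine works for amino acid sequences, since it makes no assumption about the alphabet.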


