A mapping of an ensemble of mitochondrial sequences for various organisms into 3D space based on the word composition.

Mol Phylogenet Evol

Graduate School of Science and Engineering, Saitama University, 255 Shimo-okubo, Saitama 338-8570, Japan.

Published: November 2012

AI Article Synopsis

  • A novel method was developed to represent mitochondrial genome sequences from various species in a 3D vector space, enabling visualization of biological diversity.
  • Each sequence was represented by a word-composition vector built from K-tuple words, then mapped to 3D so that the vector length (the sequence length) is preserved and the angles between vectors correlate as closely as possible with the original angles.
  • In the resulting map, Animalia lie along one arc belt, and Fungi and Green plants lie along a similar arc belt; the two belts intersect near their middles to form a cross structure that researchers can interpret visually.

Article Abstract

To provide a bird's-eye view of an ensemble of mitochondrial genome sequences for various species, we recently developed a novel method of mapping a biological sequence ensemble into three-dimensional (3D) vector space. First, we represented a biological sequence of a species s by a word-composition vector x(s), where its length |x(s)| represents the sequence length and its unit vector x(s)/|x(s)| represents the relative composition of the K-tuple words through the sequence. The size of the dimension, N = 4^K, is the number of all possible words of length K. Second, we mapped the vector x(s) to the 3D position vector y(s), based on the following two simple principles: (1) |y(s)| = |x(s)| and (2) the angle between y(s) and y(t) maximally correlates with the angle between x(s) and x(t). The mitochondrial genome sequences for 311 species, including 177 Animalia, 85 Fungi and 49 Green plants, were mapped into 3D space by using K = 7. The mapping was successful because the angles between vectors before and after the mapping highly correlated with each other (correlation coefficients were 0.92-0.97). Interestingly, the Animalia kingdom is distributed along a single arc belt (just like the Milky Way on a celestial globe), and the Fungi and Green plant kingdoms are distributed in a similar arc belt. These two arc belts intersect at their respective middle regions and form a cross structure, just like a jet aircraft fuselage and its wings. This new mapping method will allow researchers to intuitively interpret the visual information presented in the maps.
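The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the scaling of x(s) to the sequence length is an assumed reading of "its length |x(s)| represents the sequence length", and the 3D step substitutes a plain truncated SVD projection for the paper's angle-correlation maximization, which the abstract does not specify.

```python
import itertools
import numpy as np


def word_composition_vector(seq, k):
    """Build x(s): direction from K-tuple word counts, norm set to len(seq).

    Assumption: the count vector is rescaled so that |x(s)| equals the
    sequence length; the abstract does not give the exact normalization.
    """
    words = itertools.product("ACGT", repeat=k)
    index = {"".join(w): i for i, w in enumerate(words)}
    counts = np.zeros(4 ** k)  # N = 4^K possible words
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip words containing ambiguous bases such as N
            counts[j] += 1
    norm = np.linalg.norm(counts)
    if norm == 0:
        raise ValueError("sequence contains no valid K-tuple words")
    return len(seq) * counts / norm


def map_to_3d(X):
    """Map rows of X to 3D while keeping each row's norm (principle 1).

    Stand-in technique: project the unit vectors onto the top three right
    singular vectors. This only approximates principle 2 and is not the
    optimization used in the paper.
    """
    lengths = np.linalg.norm(X, axis=1)
    units = X / lengths[:, None]
    _, _, vt = np.linalg.svd(units, full_matrices=False)
    projected = units @ vt[:3].T
    projected /= np.linalg.norm(projected, axis=1, keepdims=True)
    return projected * lengths[:, None]


def pairwise_angles(V):
    """Return the angles (radians) between all distinct pairs of rows."""
    units = V / np.linalg.norm(V, axis=1, keepdims=True)
    cosines = np.clip(units @ units.T, -1.0, 1.0)
    upper = np.triu_indices(len(V), k=1)
    return np.arccos(cosines[upper])
```

Given a matrix X whose rows are the vectors x(s) and its image Y = map_to_3d(X), the quality check mentioned in the abstract corresponds to np.corrcoef(pairwise_angles(X), pairwise_angles(Y))[0, 1]. At least three sequences are needed for the projection to have three components.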

Source
http://dx.doi.org/10.1016/j.ympev.2012.06.023
