Spectral-dynamic representation of DNA sequences.

J Biomed Inform

Department of Nuclear Medicine, Medical University of Gdańsk, Tuwima 15, 80-210 Gdańsk, Poland. Electronic address:

Published: August 2017

A graphical representation of DNA sequences in which the distribution of a particular base B=A,C,G,T is represented by a set of discrete lines has been formulated. The methodology of this approach has been borrowed from two areas of physics: spectroscopy and dynamics. Consequently, the set of discrete lines is referred to as the B-spectrum. Next, the B-spectrum is transformed to a rigid body composed of material points. In this way a dynamic representation of the DNA sequence has been obtained. The centers of mass of these rigid bodies, divided by their moments of inertia, have been taken as the descriptors of the spectra and, thus, of the DNA sequences. The performance of this method on a standard set of data commonly applied by authors introducing new approaches to bioinformatics (the first exons of β-globin genes of different species) proved to be very good.

Download full-text PDF

Source
http://dx.doi.org/10.1016/j.jbi.2017.06.001DOI Listing

Publication Analysis

Top Keywords

representation dna
12
dna sequences
12
set discrete
8
discrete lines
8
spectral-dynamic representation
4
dna
4
sequences graphical
4
graphical representation
4
sequences distribution
4
distribution base
4

Similar Publications

Background: African buffalo (Syncerus caffer) is a significant reservoir host for many zoonotic and parasitic infections in Africa. These include a range of viruses and pathogenic bacteria, such as tick-borne rickettsial organisms. Despite the considerations of mammalian blood as a sterile environment, blood microbiome sequencing could become crucial for agnostic biosurveillance.

View Article and Find Full Text PDF

The role of chromatin state in intron retention: A case study in leveraging large scale deep learning models.

PLoS Comput Biol

January 2025

Department of Computer Science, Colorado State University, Fort Collins, Colorado, United States of America.

Complex deep learning models trained on very large datasets have become key enabling tools for current research in natural language processing and computer vision. By providing pre-trained models that can be fine-tuned for specific applications, they enable researchers to create accurate models with minimal effort and computational resources. Large scale genomics deep learning models come in two flavors: the first are large language models of DNA sequences trained in a self-supervised fashion, similar to the corresponding natural language models; the second are supervised learning models that leverage large scale genomics datasets from ENCODE and other sources.

View Article and Find Full Text PDF

Repetitive DNA contributes significantly to plant genome size, adaptation, and evolution. However, little is understood about the transcription of repeats. This is addressed here in the plant green foxtail millet (Setaria viridis).

View Article and Find Full Text PDF

With the advent of commercial DNA databases, investigative genetic genealogy (IGG) has emerged as a powerful forensic tool, rivalling the impact of STR analyses, introduced four decades ago. IGG has been frequently applied in the US and tested in other countries, but never in Norway. Here, we apply IGG to three cold criminal cases and successfully identify the donor of the DNA in two of these cases.

View Article and Find Full Text PDF

Purpose: To evaluate evidence on germline and somatic genomic testing for patients with metastatic prostate cancer and provide recommendations.

Methods: A systematic review by a multidisciplinary panel with patient representation was conducted. The PubMed database was searched from January 2018 to May 2024.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!