The bendability of genomic DNA, which measures the DNA looping rate, is crucial for numerous biological processes. A recent high-throughput technique, 'loop-seq', has made it possible to measure the intrinsic cyclizability of DNA fragments. However, quantifying bendability at large scale remains costly, laborious, and time-consuming. To close the gap between rapidly evolving large language models and expanding genomic sequence information, and to elucidate the impact of DNA bendability on critical regulatory sequence motifs such as super-enhancers in the human genome, we introduce a computational model, MIXBend, that predicts DNA bendability from both nucleotide sequences and physicochemical properties. In MIXBend, the pre-trained language model DNABERT and a convolutional neural network with an attention mechanism are used to build sequence-based and physicochemical-based extractors that refine DNA sequence representations. These bimodal representations are then fed to a k-mer sequence-physicochemistry matching module to minimize the semantic gap between the two modalities. Finally, a self-attention fusion layer predicts DNA bendability. Experimental results show that MIXBend outperforms other state-of-the-art methods. MIXBend also reveals both novel and known motifs in yeast, and it identifies significant bendability fluctuations within super-enhancer regions and transcription factor binding sites in the human genome.
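The k-mer view of a sequence mentioned in the abstract can be illustrated with a minimal sketch. It splits a DNA string into overlapping k-mers, the token format that DNABERT-style models consume. The function name `kmer_tokens`, the default `k = 6`, and the stride of 1 are assumptions made for illustration; the abstract does not state which k MIXBend uses, and this is not the authors' implementation.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 6) -> List[str]:
    """Split a DNA sequence into overlapping k-mers with stride 1.

    Hypothetical helper: k=6 is an illustrative default, not a value
    taken from the MIXBend abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    # A sequence of length n yields n - k + 1 tokens (none if n < k).
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    print(kmer_tokens("ACGTACGTAC", k=6))
```

A 10-nucleotide input with `k = 6` yields five overlapping tokens. A sequence shorter than `k` yields an empty list instead of raising an error.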

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11014357
DOI: http://dx.doi.org/10.1093/nar/gkae099

Similar Publications

DNA has shown great biocompatibility, programmable mechanical properties, and precise structural addressability at the nanometer scale, rendering it a material for constructing versatile nanorobots for biomedical applications. Here, we present the design principle, synthesis, and characterization of a DNA nanorobotic hand, called DNA NanoGripper, that contains a palm and four bendable fingers as inspired by naturally evolved human hands, bird claws, and bacteriophages. Each NanoGripper finger consists of three phalanges connected by three rotatable joints that are bendable in response to the binding of other entities.

Ribosomal RNA (18S, 5.8S, 28S) gene clusters in genomes form regions that consist of multiple tandem repeats. They are located on a single or several pairs of chromosomes and play an important role in the formation of the nucleolus responsible for the assembly of ribosome subunits.

Dinucleotide biases have been widely investigated in the genomes of eukaryotes and viruses, but not in bacteria. We assembled a dataset of bacterial genomes (>15 000), which are representative of the genetic diversity in the kingdom Eubacteria, and we analyzed dinucleotide biases in relation to different traits. We found that TpA dinucleotides are the most depleted and that CpG dinucleotides show the widest dispersion.

The organization of nucleosomes into chromatin and their accessibility are shaped by local DNA mechanics. Conversely, nucleosome positions shape genetic variations, which may originate from mismatches during replication and chemical modification of DNA. To investigate how DNA mismatches affect the mechanical stability and the exposure of nucleosomal DNA, we used an optical trap combined with single-molecule FRET and a single-molecule FRET cyclization assay.
