RiboDiffusion: tertiary structure-based RNA inverse folding with generative diffusion models.

Bioinformatics

Department of Computer Science and Engineering, CUHK, Hong Kong SAR, 999077, China.

Published: June 2024

Motivation: RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the nonunique structure-sequence mapping, and the flexibility of RNA conformation.

Results: In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that learns the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, and it iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows for a trade-off between sequence recovery and diversity to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of 11% for sequence similarity splits and 16% for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints.
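As an illustration of the evaluation terms used above, the following is a minimal sketch of how sequence recovery and a relative improvement over a baseline are commonly computed. It is not taken from the RiboDiffusion code base: the function names `sequence_recovery` and `relative_improvement`, the example sequences, and the example scores are all hypothetical and chosen only for illustration.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed nucleotide equals the native one.

    Assumes both sequences are aligned position by position and have equal length.
    """
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed.upper(), native.upper()))
    return matches / len(native)


def relative_improvement(model_score: float, baseline_score: float) -> float:
    """Improvement of the model over the baseline, as a fraction of the baseline."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return (model_score - baseline_score) / baseline_score


# Hypothetical 8-nucleotide example: 6 of 8 positions match.
native = "GGCAUCGA"
designed = "GGCUUCGU"
recovery = sequence_recovery(designed, native)
print(f"recovery = {recovery:.2f}")

# Hypothetical scores, not results from the paper.
gain = relative_improvement(model_score=0.55, baseline_score=0.50)
print(f"relative improvement = {gain:.0%}")
```

Under this definition, a relative improvement is expressed against the baseline score rather than as a difference in percentage points, which is how the figures of 11% and 16% in the abstract are presumably meant to be read.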

Availability and Implementation: The source code is available at https://github.com/ml4bio/RiboDiffusion.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11211841
DOI: http://dx.doi.org/10.1093/bioinformatics/btae259
