Robust RNA secondary structure prediction with a mixture of deep learning and physics-based experts.

Biol Methods Protoc

Department of Physics, George Washington University, Washington, DC 20052, United States.

Published: January 2025

A mixture-of-experts (MoE) approach has been developed to mitigate the poor out-of-distribution (OOD) generalization of deep learning (DL) models for single-sequence-based prediction of RNA secondary structure. The main idea is to use DL models for in-distribution (ID) test sequences, leveraging their superior ID performance, while relying on physics-based models for OOD sequences to ensure robust predictions. One key ingredient of the pipeline, named MoEFold2D, is automated ID/OOD detection via consensus analysis of an ensemble of DL model predictions, which does not require access to training data during inference. Specifically, motivated by the clustered distribution of known RNA structures, a collection of distinct DL models is trained by iteratively leaving one cluster out. Each DL model hence serves as an expert on all but one cluster in the training data. Consequently, for an ID sequence, all but one of the DL models make accurate predictions that are consistent with one another, while an OOD sequence yields highly inconsistent predictions among all DL models. Through consensus analysis of DL predictions, test sequences are categorized as ID or OOD. ID sequences are subsequently predicted by averaging the DL models in consensus, and OOD sequences are predicted using physics-based models. Instead of remediating generalization gaps with alternative approaches such as transfer learning and sequence alignment, MoEFold2D circumvents unpredictable ID-OOD gaps and combines the strengths of DL and physics-based models to achieve accurate ID and robust OOD predictions.
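
The consensus-based routing described in the abstract can be illustrated with a small Python sketch. This is not the MoEFold2D implementation: the abstract does not specify the consensus metric, the threshold, or how predictions are averaged, so everything below is an assumption made for illustration. Predictions are represented as sets of base pairs, agreement is measured by pairwise F1 overlap, a sequence is called ID when some group of all-but-one models agrees above a hypothetical threshold of 0.8, "averaging" is approximated by a majority vote over base pairs, and `physics_fallback` is a hypothetical callable standing in for a physics-based predictor.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, List, Optional, Set, Tuple

Pair = Tuple[int, int]  # one base pair as (i, j) sequence positions


def pair_f1(a: Set[Pair], b: Set[Pair]) -> float:
    """F1 overlap between two base-pair sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_f1(preds: List[Set[Pair]]) -> float:
    """Average F1 over all unordered pairs of predictions."""
    scores = [pair_f1(a, b) for a, b in combinations(preds, 2)]
    return sum(scores) / len(scores)


def find_consensus(preds: List[Set[Pair]], threshold: float = 0.8) -> Optional[List[int]]:
    """Return indices of the best all-but-one group, or None if no group agrees.

    Assumption: with leave-one-cluster-out training, an ID sequence should be
    predicted consistently by every model except the one that never saw its cluster.
    """
    n = len(preds)
    if n < 3:
        raise ValueError("need at least 3 models to form an all-but-one consensus")
    best_idx, best_score = None, -1.0
    for left_out in range(n):
        idx = [i for i in range(n) if i != left_out]
        score = mean_pairwise_f1([preds[i] for i in idx])
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx if best_score >= threshold else None


def predict(
    preds: List[Set[Pair]],
    physics_fallback: Callable[[], Set[Pair]],
    threshold: float = 0.8,
) -> Tuple[str, Set[Pair]]:
    """Label a sequence ID or OOD and return the routed prediction."""
    group = find_consensus(preds, threshold)
    if group is None:
        # OOD: the DL models disagree, so defer to the physics-based predictor.
        return "OOD", physics_fallback()
    # ID: keep base pairs predicted by a strict majority of the consensus group.
    votes = Counter(p for i in group for p in preds[i])
    return "ID", {p for p, c in votes.items() if 2 * c > len(group)}
```

For example, if three of four models predict the same helix and the fourth predicts something unrelated, `predict` labels the sequence ID and returns the shared helix; if all four models predict disjoint structures, it labels the sequence OOD and returns whatever the fallback produces. A real pipeline would likely work on base-pairing probability matrices rather than hard pair sets, which this sketch does not attempt.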

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11729747
DOI: http://dx.doi.org/10.1093/biomethods/bpae097

Author: Xiangyun Qiu

Similar Publications

Physics-Based Synthetic Data Model for Automated Segmentation in Catalysis Microscopy.

Microsc Microanal

January 2025

Fritz-Haber-Institut der Max-Planck-Gesellschaft, Berlin 14195, Germany.

Maurits Vuijk, Gianmarco Ducci, Luis Sandoval, Markus Pietsch, Karsten Reuter

In catalysis research, the amount of microscopy data acquired when imaging dynamic processes is often too much for nonautomated quantitative analysis. Developing machine learned segmentation models is challenged by the requirement of high-quality annotated training data. We thus substitute expert-annotated data with a physics-based sequential synthetic data model.

Hybrid control of hydraulic directional valves: Integrating physics-based and data-driven models for enhanced accuracy and efficiency.

ISA Trans

December 2024

Technische Universität Wien, Automation and Control Institute, Gusshausstrasse 27-29, Vienna, 1040, Austria.

Tobias Glück, Amadeus Lobe, Adrian Trachte, Matthias Bitzer, Wolfgang Kemmetmüller

In this paper, we tackle the challenge of accurately controlling the position of the valve spool in hydraulic 4/3 two-stage directional control valves utilized in mobile applications. The pilot valve's overlapping design often leads to a significant dead zone, negatively impacting positioning accuracy and necessitating a sophisticated controller design. To overcome these challenges, we introduce a control strategy founded on a control-oriented model.

Systematic benchmarking of deep-learning methods for tertiary RNA structure prediction.

PLoS Comput Biol

December 2024

School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore.

Akash Bahai, Chee Keong Kwoh, Yuguang Mu, Yinghui Li

The 3D structure of RNA critically influences its functionality, and understanding this structure is vital for deciphering RNA biology. Experimental methods for determining RNA structures are labour-intensive, expensive, and time-consuming. Computational approaches have emerged as valuable tools, leveraging physics-based principles and machine learning to predict RNA structures rapidly.

Seismic anisotropy prediction using ML methods: A case study on an offshore carbonate oilfield.

PLoS One

January 2025

Geosciences Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, KSA.

Guibin Zhao, Fateh Bouchaala, Mohamed S Jouini, Umair Bin Waheed

Estimating seismic anisotropy parameters, such as Thomsen's parameters, is crucial for investigating fractured and finely layered geological media. However, many inversion methods rely on complex physical models with initial assumptions, leading to non-reproducible estimates and subjective fracture interpretation. To address these limitations, this study utilizes machine learning methods: support vector regression, extreme gradient boost, multi-layer perceptron, and a convolutional neural network.
