Predicting inter-residue distances from sequence is critical for many biological applications, such as protein structure reconstruction, protein-protein interaction prediction, and protein design. However, predicting fine-grained distances between residues with long sequence separations remains challenging. In this study, we propose DuetDis, a method for protein inter-residue distance prediction based on duet feature sets and a deep residual network with squeeze-and-excitation (SE). DuetDis learns and fuses features extracted directly or indirectly from whole-genome/metagenomic databases and minimizes information loss by ensembling models trained on different feature sets. We evaluate DuetDis and 11 widely used peer methods on a large-scale test set (610 protein chains). The experimental results suggest that 1) prediction results from different feature sets differ markedly; 2) ensembling different feature sets can improve prediction performance; 3) using high-quality multiple sequence alignments (MSAs) for both training and testing can greatly improve prediction performance; and 4) DuetDis is more accurate than peer methods in overall prediction, more reliable in terms of model prediction score, and more robust against shallow MSAs.
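The ensembling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how DuetDis combines its models, so the code below assumes the simplest possible scheme: an unweighted average of the per-residue-pair distance-bin probabilities predicted by each model. The function names, the three-bin layout, and the toy values are hypothetical and serve only as an illustration.

```python
from typing import List

# Hypothetical layout: prediction[i][j] is a probability distribution
# over distance bins for residue pair (i, j).
DistanceMap = List[List[List[float]]]


def ensemble_average(predictions: List[DistanceMap]) -> DistanceMap:
    """Average distance-bin probabilities across models (assumed scheme)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    n_models = len(predictions)
    length = len(predictions[0])
    n_bins = len(predictions[0][0][0])
    averaged: DistanceMap = []
    for i in range(length):
        row = []
        for j in range(length):
            # Mean probability of each bin over all models.
            row.append([
                sum(p[i][j][b] for p in predictions) / n_models
                for b in range(n_bins)
            ])
        averaged.append(row)
    return averaged


def contact_probability(bins: List[float], n_contact_bins: int) -> float:
    """Sum the probability mass of the first n_contact_bins (closest) bins."""
    return sum(bins[:n_contact_bins])


# Toy example: two models, a 2-residue chain, 3 distance bins per pair.
model_a = [[[1.0, 0.0, 0.0], [0.6, 0.3, 0.1]],
           [[0.6, 0.3, 0.1], [1.0, 0.0, 0.0]]]
model_b = [[[1.0, 0.0, 0.0], [0.2, 0.5, 0.3]],
           [[0.2, 0.5, 0.3], [1.0, 0.0, 0.0]]]

ensemble = ensemble_average([model_a, model_b])
```

Averaging keeps each pair's distribution normalized as long as every input distribution sums to one. A weighted average or a learned fusion layer would be equally plausible alternatives; the source does not confirm which approach the authors use.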

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9148999
DOI: http://dx.doi.org/10.3389/fgene.2022.887491

Publication Analysis

Top Keywords (keyword: count)

feature sets: 16
distance prediction: 12
prediction: 10
inter-residue distance: 8
peer methods: 8
improve prediction: 8
prediction performance: 8
multiple sequence: 8
sequence alignment: 8
alignment msa: 8

Similar Publications

Diffraction imaging of cells allows rapid phenotyping by the response of intracellular molecules to coherent illumination. However, its ability to distinguish numerous types of human leukocytes remains to be investigated. Here, we show that accurate classification of three lymphocyte subtypes can be achieved with features extracted from cross-polarized diffraction image (p-DI) pairs.

Analysis of the internal motions of thermoresponsive polymers and single chain nanoparticles.

Soft Matter

January 2025

Department of Macromolecular Science and Engineering, Case Western Reserve University, 10900 Euclid Ave., Cleveland, OH 44122, USA.

Data-driven techniques, such as proper orthogonal decomposition (POD) and uniform manifold approximation & projection (UMAP), are powerful methods for understanding polymer behavior in complex systems that extend beyond ideal conditions. They are based on the principle that low-dimensional behaviors are often embedded within the structure and dynamics of complex systems. Here, the internal motions of a thermoresponsive, LCST polymer are investigated for two cases: (1) the coil-to-globule transition that occurs as the system is heated above its critical temperature and (2) intramolecularly crosslinked, single chain nanoparticles (SCNPs) both above and below the critical temperature.

Multivariate patterns among multimodal neuroimaging and clinical, cognitive, and daily functioning characteristics in bipolar disorder.

Neuropsychopharmacology

January 2025

Neurocognition and Emotion in Affective Disorders (NEAD) Centre, Psychiatric Centre Copenhagen, Mental Health Services, Capital Region of Denmark, Frederiksberg, Denmark.

Individuals with bipolar disorder (BD) show heterogeneity in clinical, cognitive, and daily functioning characteristics, which challenges accurate diagnostics and optimal treatment. A key goal is to identify brain-based biomarkers that inform patient stratification and serve as treatment targets. The objective of the present study was to apply a data-driven, multivariate approach to quantify the relationship between multimodal imaging features and behavioral phenotypes in BD.

Interpretable CT Radiomics-based Machine Learning Model for Preoperative Prediction of Ki-67 Expression in Clear Cell Renal Cell Carcinoma.

Acad Radiol

January 2025

Department of Urology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China (Y.X., B.X., Z.W., C.P., M.X.).

Rationale And Objectives: To develop and externally validate interpretable CT radiomics-based machine learning (ML) models for preoperative Ki-67 expression prediction in clear cell renal cell carcinoma (ccRCC).

Methods: A total of 506 patients were retrospectively enrolled from three independent institutes and divided into training (n=357) and external test (n=149) sets. Ki-67 expression was determined by immunohistochemistry (IHC) and categorized into low (<15%) and high (≥15%) expression groups.

The accurate identification of protein-nucleotide binding residues is crucial for protein function annotation and drug discovery. Numerous computational methods have been proposed to predict these binding residues, achieving remarkable performance. However, due to the limited availability and high variability of nucleotides, predicting binding residues for diverse nucleotides remains a significant challenge.
