Motivation: Protein structure modeling can be improved by the use of distance constraints between amino acid residues, provided such data reflects, at least partially, the native tertiary structure of the target system. In fact, only a small subset of the native contact map is necessary to successfully drive the model conformational search, so one important goal is to obtain the set of constraints with the highest true-positive rate, lowest redundancy and greatest amount of information. In this work, we introduce a constraint evaluation and selection method based on the point-biserial correlation coefficient, which utilizes structural information from an ensemble of models to indirectly measure the power of each constraint in biasing the conformational search toward consensus structures.
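To make the statistic concrete, the following is a minimal sketch of a point-biserial correlation between a binary variable and a continuous one, used to rank constraints. It is not the authors' implementation. The abstract does not specify the variables, so the sketch assumes that the binary variable records whether a constraint is satisfied in each model of the ensemble and that the continuous variable is some per-model consensus score. The function names, the constraint labels and the numbers are all hypothetical.

```python
import math
from statistics import fmean, pstdev


def point_biserial(satisfied, scores):
    """Point-biserial correlation between a binary and a continuous variable.

    satisfied: one 0/1 flag per model (assumed: constraint satisfied or not).
    scores:    one continuous value per model (assumed: a consensus score).
    """
    if len(satisfied) != len(scores):
        raise ValueError("inputs must have the same length")
    group1 = [s for flag, s in zip(satisfied, scores) if flag]
    group0 = [s for flag, s in zip(satisfied, scores) if not flag]
    n1, n0, n = len(group1), len(group0), len(scores)
    spread = pstdev(scores)  # population standard deviation over all models
    if n1 == 0 or n0 == 0 or spread == 0.0:
        # The coefficient is undefined here; returning 0.0 is a choice made
        # for this sketch so that such constraints rank as uninformative.
        return 0.0
    return (fmean(group1) - fmean(group0)) / spread * math.sqrt(n1 * n0 / n**2)


def rank_constraints(satisfaction, scores):
    """Return (constraint, coefficient) pairs sorted by decreasing coefficient."""
    coeffs = {name: point_biserial(flags, scores)
              for name, flags in satisfaction.items()}
    return sorted(coeffs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ensemble of six models with made-up consensus scores.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]

# Hypothetical satisfaction flags for three constraints across the six models.
satisfaction = {
    "A": [1, 1, 1, 0, 0, 0],  # satisfied only in the high-scoring models
    "B": [1, 0, 1, 0, 1, 0],  # satisfied in a mix of models
    "C": [0, 0, 0, 1, 1, 1],  # satisfied only in the low-scoring models
}

ranking = rank_constraints(satisfaction, scores)
for name, r in ranking:
    print(f"{name}: {r:+.3f}")
```

With this toy data, constraint A ranks first, B second and C last. Under the stated assumptions, a high positive coefficient marks a constraint whose satisfaction goes together with high consensus scores, while a negative one marks a constraint satisfied mostly in low-scoring models.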

Results: Residue contact maps obtained by direct coupling analysis are systematically improved by means of discriminant analysis, in some cases reaching accuracies typically seen only in modern deep-learning-based approaches. When combined with an iterative modeling workflow, the proposed constraint classification optimizes the selection of the constraint set and maximizes the probability of obtaining successful models. Using discriminant analysis to extract more value from the information in constraint datasets is a general concept with possible applications to other constraint types and modeling problems.

Availability and Implementation: MSA for the targets in this work is available at https://github.com/m3g/2021_Bottino_Biserial. Modeling data supporting the findings of this study was generated at the Center for Computing in Engineering and Sciences, and is available from the corresponding author LM on request.

Supplementary Information: Supplementary data are available at Bioinformatics online.

DOI: http://dx.doi.org/10.1093/bioinformatics/btab425
