We report the residue-residue contact prediction results obtained in the CASP13 experiment by a new pipeline built purely on the learning of coevolutionary features. For a query sequence, the pipeline starts by collecting multiple sequence alignments (MSAs) from multiple genome and metagenome sequence databases using two complementary Hidden Markov Model (HMM)-based search tools. Three profile matrices, built on covariance, precision, and pseudolikelihood maximization, respectively, are then derived from the MSAs and used as the input features of a deep residual convolutional neural network for contact-map training and prediction. Two ensembling strategies were used to integrate the matrix features, end-to-end training and stacking, resulting in two complementary programs called TripletRes and ResTriplet, respectively. For the 31 free-modeling domains that do not have homologous templates in the PDB, TripletRes and ResTriplet generated comparable results, with average accuracies of 0.640 and 0.646, respectively, for the top L/5 long-range predictions; 71% and 74% of the cases, respectively, had an accuracy above 0.5. Detailed data analyses showed that the strength of the pipeline comes from the sensitive MSA construction and the advanced strategies for ensembling coevolutionary features. Domain splitting was also found to improve contact prediction performance. Nevertheless, contact models remain suboptimal for tail regions, which often contain many alignment gaps, and for targets with few homologous sequences. New approaches in which the model is trained specifically on these regions and targets might help address these problems.
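The "top L/5 long-range" accuracy quoted above can be illustrated with a minimal sketch. It is not the authors' evaluation code. The abstract does not define "long-range", so the sketch assumes the usual CASP convention of a sequence separation of at least 24 residues, and it assumes the native contact map has already been derived elsewhere (commonly from a Cβ-Cβ distance below 8 Å). The function name `top_l5_long_range_precision` and its parameters are hypothetical.

```python
import numpy as np


def top_l5_long_range_precision(pred, native, min_sep=24):
    """Fraction of the top L/5 long-range predicted pairs that are true contacts.

    pred    : (L, L) symmetric array of predicted contact scores.
    native  : (L, L) symmetric boolean array of true contacts.
    min_sep : minimum sequence separation |i - j| for a pair to count as
              long-range (assumed CASP convention, not stated in the abstract).
    """
    pred = np.asarray(pred, dtype=float)
    native = np.asarray(native, dtype=bool)
    if pred.shape != native.shape or pred.ndim != 2 or pred.shape[0] != pred.shape[1]:
        raise ValueError("pred and native must be square arrays of the same shape")

    length = pred.shape[0]
    # Upper-triangle pairs (i, j) with j - i >= min_sep, each pair counted once.
    i, j = np.triu_indices(length, k=min_sep)
    if i.size == 0:
        raise ValueError("sequence too short to contain any long-range pair")

    # Keep the L/5 highest-scoring long-range pairs (at least one pair).
    k = max(1, length // 5)
    top = np.argsort(-pred[i, j], kind="stable")[:k]

    # Precision: share of the selected pairs that are contacts in the native map.
    return float(native[i[top], j[top]].mean())
```

Under these assumptions, a value of 0.640 would mean that, averaged over targets, 64% of the L/5 highest-scoring long-range pairs are true contacts. When fewer than L/5 long-range pairs exist, the sketch scores all of them, which is one of several possible conventions.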

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6851483
DOI: http://dx.doi.org/10.1002/prot.25798
