The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas the default ColabFold-search MSAs alone reached that accuracy for only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs contributed most to improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies also contributed. We analyze these in the context of existing strategies to improve prediction.
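The abstract does not say how the SRA homologs were merged with the default MSAs. As a minimal sketch only, assuming both alignments are A3M/FASTA-style text built against the same query, one plausible approach is to concatenate them and drop exact duplicate sequences. The function names `parse_a3m` and `merge_a3m` and the toy sequences are hypothetical and are not taken from ColabFold.

```python
from typing import Iterator


def parse_a3m(text: str) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from A3M/FASTA-style text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_a3m(default_msa: str, extra_msa: str) -> str:
    """Merge two alignments of the same query, dropping duplicate sequences.

    Assumption: the query is the first record of `default_msa`, so it stays
    first in the merged output.
    """
    seen: set[str] = set()
    merged: list[str] = []
    records = list(parse_a3m(default_msa)) + list(parse_a3m(extra_msa))
    for header, seq in records:
        if seq in seen:  # exact duplicate of an earlier row
            continue
        seen.add(seq)
        merged.append(f">{header}\n{seq}")
    return "\n".join(merged) + "\n"


# Toy data (hypothetical): a default MSA and an SRA-derived MSA.
default_msa = ">query\nMKTAYIAK\n>hit1\nMKTAYLAK\n"
sra_msa = ">query\nMKTAYIAK\n>sra_read_1\nMKSAY-AK\n>sra_read_2\nMKTAYLAK\n"
merged = merge_a3m(default_msa, sra_msa)
print(merged)
```

Exact-match deduplication is the simplest choice. A real pipeline would more likely filter by sequence identity or coverage, which this sketch does not attempt.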
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10369885 | PMC |
http://dx.doi.org/10.1101/2023.07.10.548308 | DOI Listing |
Nucleic Acids Res
January 2025
School of Biological Sciences, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea.
Tailor-made enzymes empower a wide range of versatile applications, although searching for the desirable enzymes often requires high throughput screening and thus poses significant challenges. In this study, we employed homology searches and protein language models to discover and prioritize enzymes by their kinetic parameters. We aimed to discover kynureninases as a potentially versatile therapeutic enzyme, which hydrolyses L-kynurenine, a potent immunosuppressive metabolite, to overcome the immunosuppressive tumor microenvironment in anticancer therapy.
Pathogens
December 2024
School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou 310018, China.
is a parasite transmitted by mosquitoes and can cause a neglected tropical disease called Lymphatic filariasis. However, the genome of was not well studied, making novel drug development difficult. This study aims to identify microRNA, annotate protein function, and explore the pathogenic mechanism of by genome-wide analysis.
Genes (Basel)
December 2024
Project Group Biochemistry, Leibniz Institute on Aging-Fritz Lipmann Institute, D-07745 Jena, Germany.
DNA replication represents a series of precisely regulated events performed by a complex protein machinery that guarantees accurate duplication of the genetic information. Since DNA replication is permanently faced by a variety of exogenous and endogenous stressors, DNA damage response, repair and replication must be closely coordinated to maintain genomic integrity. HROB has been identified recently as a binding partner and activator of the Mcm8/9 helicase involved in DNA interstrand crosslink (ICL) repair.
CAZymes (Carbohydrate-Active enZymes) degrade, synthesize, and modify all complex carbohydrates on Earth. CAZymes are extremely important to research in human health, nutrition, gut microbiome, bioenergy, plant disease, and global carbon recycling. Current CAZyme annotation tools are all based on sequence similarity.
Cell Rep
January 2025
Laboratory of Biochemistry, Wageningen University, 6708 WE Wageningen, the Netherlands.