Motivation: In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous datasets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism- or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance.
Results: This article shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics, compared both to models trained with heterogeneous and hybrid data and to a variety of widely used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens.
Availability and Implementation: The data underlying this article, as well as the full reproducibility scripts, are available at https://github.com/fcampelo/OrgSpec-paper. The R package that implements the organism-specific pipeline functions is available at https://github.com/fcampelo/epitopes.
Supplementary Information: Supplementary materials are available at Bioinformatics online.
Full text | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8665745 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btab536 | DOI |
J Bioinform Comput Biol
April 2024
Bioinformatics Centre, Savitribai Phule Pune University, Pune 411007, India.
Enzymes catalyze diverse biochemical reactions and are building blocks of cellular and metabolic pathways. Data and metadata of enzymes are distributed across databases and are archived in various formats. The enzyme databases provide utilities for efficient searches and downloading enzyme records in batch mode but do not support organism-specific extraction of subsets of data.
Microbiome
April 2024
Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-Imaging, Center for Artificial Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China.
Background: Ruminants are important livestock animals that have a unique digestive system comprising multiple stomach compartments. Despite significant progress in the study of microbiome in the gastrointestinal tract (GIT) sites of ruminants, we still lack an understanding of the viral community of ruminants. Here, we surveyed its viral ecology using 2333 samples from 10 sites along the GIT of 8 ruminant species.
bioRxiv
November 2023
Department of Bioengineering, University of California San Diego, San Diego, CA, USA.
Background: Invasive mold infections (IMIs) such as aspergillosis, mucormycosis, fusariosis, and lomentosporiosis are associated with high morbidity and mortality, particularly in immunocompromised patients, with mortality rates as high as 40% to 80%. Outcomes could be substantially improved with early initiation of appropriate antifungal therapy, yet early diagnosis remains difficult to establish and often requires multidisciplinary teams evaluating clinical and radiological findings plus supportive mycological findings. Universal digital high resolution melting analysis (U-dHRM) may enable rapid and robust diagnosis of IMI.
J Sci Food Agric
February 2024
School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China.
Background: Streptococcus thermophilus is an important strain widely used in dairy fermentation, with distinct urea metabolism characteristics compared to other lactic acid bacteria. The conversion of urea by S. thermophilus has been shown to affect the flavor and acidification characteristics of milk.
bioRxiv
September 2023
Department of Biomedical Engineering, University of Delaware.
Recent advancements in Protein Language Models (pLMs) have enabled high-throughput analysis of proteins through primary sequence alone. At the same time, newfound evidence illustrates that codon usage bias is remarkably predictive and can even change the final structure of a protein. Here, we explore these findings by extending the traditional vocabulary of pLMs from amino acids to codons to encapsulate more information inside CoDing Sequences (CDS).