Identification of B-cell epitopes (BCEs) is a fundamental step in epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Given the avalanche of protein sequence data generated in the postgenomic age, automated computational methods are needed to identify novel BCEs quickly and accurately within the vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. A reliable model with significantly improved prediction performance is therefore highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed iBCE-EL, a novel ensemble learning framework for improved linear BCE prediction. iBCE-EL fuses two independent predictors: an extremely randomized tree (ERT) classifier, which uses a combination of physicochemical properties (PCP) and amino acid composition as input features, and a gradient boosting (GB) classifier, which uses a combination of dipeptide and PCP features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated iBCE-EL on an independent data set, where it significantly outperformed the state-of-the-art method with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL is implemented as a web-based platform, available at http://thegleelab.org/iBCE-EL. It offers two prediction modes: the first classifies peptide sequences as BCEs or non-BCEs, while the second lets users mine potential BCEs from protein sequences.
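
The sequence-derived features and the evaluation metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 20-letter residue alphabet, and the equal-weight averaging used to fuse the two classifier probabilities are assumptions made here for illustration, and the physicochemical property features are omitted.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return 20 fractions, one per standard residue (hypothetical helper)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    # Non-standard residues are counted in the length but have no feature slot.
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def dipeptide_composition(peptide):
    """Return 400 fractions, one per ordered pair of adjacent residues."""
    peptide = peptide.upper()
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    if not pairs:
        raise ValueError("peptide must have at least two residues")
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def fuse_probabilities(p_ert, p_gb, threshold=0.5):
    """Average two classifier probabilities and threshold the result.

    Assumption: equal weights and a 0.5 cut-off; the abstract does not
    state how the two predictors are fused.
    """
    return (p_ert + p_gb) / 2 >= threshold


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

In a full pipeline, feature vectors of this kind would be passed to tree-based classifiers, and the MCC would be computed from the confusion matrix of the fused predictions.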


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6072840
DOI: http://dx.doi.org/10.3389/fimmu.2018.01695


