Identifying optimal drug candidates is a central task in drug discovery. Researchers in biology and the computational sciences have sought to use machine learning (ML) to predict drug-target interactions (DTIs) efficiently. In recent years, following the growing usefulness of pretrained models in natural language processing (NLP), pretrained models have been developed for chemical compounds and target proteins. This study sought to improve DTI predictive models using ChemBERTa, a pretrained model based on Bidirectional Encoder Representations from Transformers (BERT), for chemical compounds; its pretraining uses simplified molecular-input line-entry system (SMILES) strings. We also employed the pretrained ProtBert for target proteins; its pretraining uses amino acid sequences. The BIOSNAP, DAVIS, and BindingDB databases (DBs) were used, alone or together, for training. The final model, which combines ChemBERTa and ProtBert and was trained on the integrated DBs, afforded the best DTI predictive performance to date compared with previous models, based on the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC). The performance of the final model was verified in a case study of 13 pairs of substrates and the metabolic enzyme cytochrome P450 (CYP). The final model afforded excellent DTI prediction. As real-world interactions between drugs and target proteins are expected to exhibit specific patterns, pretraining with ChemBERTa and ProtBert could teach the model such patterns. Learning these interaction patterns would enhance DTI prediction accuracy, provided that training employs large, well-balanced datasets that cover all relationships between drugs and target proteins.
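
The abstract ranks models by the receiver operating characteristic AUC and the precision-recall AUC. As a minimal sketch of how these two metrics can be computed for a binary DTI classifier, the following uses only the Python standard library. The function names, the toy labels, and the toy scores are illustrative assumptions and are not taken from the study; the precision-recall AUC is approximated here by average precision, which is one common estimator and not necessarily the one the authors used.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: the mean of the precision values measured at the
    rank of each positive example. Assumes no tied scores."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Sort examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


if __name__ == "__main__":
    # Hypothetical drug-target pairs: 1 = interacting, 0 = non-interacting.
    toy_labels = [1, 0, 1, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
    print("ROC-AUC:", roc_auc(toy_labels, toy_scores))
    print("Average precision:", average_precision(toy_labels, toy_scores))
```

The pairwise ROC-AUC loop is quadratic in the number of examples, which is fine for illustration; for datasets the size of BindingDB, a rank-based implementation or an established library routine would be the more practical choice.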


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414546
DOI: http://dx.doi.org/10.3390/pharmaceutics14081710

Publication Analysis

Top Keywords

target proteins: 16
final model: 12
predict drug-target: 8
drug-target interactions: 8
pretrained models: 8
chemical compounds: 8
dti predictive: 8
chemberta protbert: 8
drugs target: 8
model: 5

