Publications by authors named "Junru Jin"

N7-methylguanosine (m7G) modification plays a crucial role in various biological processes and is closely associated with the development and progression of many cancers. Accurate identification of m7G modification sites is essential for understanding their regulatory mechanisms and advancing cancer therapy. Previous studies often suffered from insufficient research data, underutilization of motif information, and lack of interpretability.

Motivation: Diabetes is a chronic metabolic disorder that has been a major cause of blindness, kidney failure, heart attacks, stroke, and lower limb amputation across the world. To alleviate the impact of diabetes, researchers have developed the next generation of anti-diabetic drugs, known as dipeptidyl peptidase IV inhibitory peptides (DPP-IV-IPs). However, the discovery of these promising drugs has been restricted due to the lack of effective peptide-mining tools.

Molecular representation learning (MRL) is a fundamental task for drug discovery. However, previous deep-learning (DL) methods focus excessively on learning robust inner-molecular representations through mask-dominated pretraining frameworks, neglecting the abundant chemical-reactivity relationships among molecules that have been shown to be a determining factor in various molecular property prediction tasks. Here, we present MolCAP, a graph-pretraining Transformer based on chemical reactivity (IMR) knowledge with prompted finetuning, to promote MRL.

ncRNA-encoded small peptides (ncPEPs) have recently emerged as promising targets and biomarkers for cancer immunotherapy. Therefore, identifying cancer-associated ncPEPs is crucial for cancer research. In this work, we propose CoraL, a novel supervised contrastive meta-learning framework for predicting cancer-associated ncPEPs.

Article Synopsis
  • Automating retrosynthesis with AI, specifically through a model called RetroExplainer, streamlines organic chemistry research, providing insights that previous "black box" deep-learning methods lacked.
  • RetroExplainer incorporates a molecular assembly process supported by advanced techniques like Graph Transformers and contrastive learning to enhance its performance across various tasks.
  • Tested on large datasets, RetroExplainer not only outperforms existing methods in single-step retrosynthesis but also effectively identifies known pathways for multi-step planning, promising to improve efficiency and reliability in drug development.

The promoter region, positioned proximal to the transcription start sites, exerts control over the initiation of gene transcription by modulating the interaction with RNA polymerase. Consequently, the accurate recognition of promoter regions represents a critical focus within the bioinformatics domain. Although some methods leveraging pre-trained language models (PLMs) for promoter prediction have been proposed, the full potential of such PLMs remains largely untapped.

Recent research has highlighted the pivotal role of RNA post-transcriptional modifications in the regulation of RNA expression and function. Accurate identification of RNA modification sites is important for understanding RNA function. In this study, we propose a novel RNA modification prediction method, namely Rm-LR, which leverages a long-range-based deep learning approach to accurately predict multiple types of RNA modifications using RNA sequences only.

Anticancer peptides (ACPs) have recently been receiving increasing attention in cancer therapy due to their low consumption, few adverse side effects, and easy accessibility. However, identifying anticancer peptides via experimental approaches remains a great challenge, as it requires expensive and time-consuming experimental studies. In addition, traditional machine-learning-based methods proposed for ACP prediction depend mainly on hand-crafted feature engineering, which typically yields low prediction performance.

Article Synopsis
  • Drug-target interaction (DTI) prediction is vital in drug discovery, but existing methods often lack effective feature representation which hampers accuracy.
  • The authors propose a new neural network model called DrugormerDTI that combines Graph Transformer for molecular graph analysis and Residual2vec for understanding protein residue relationships.
  • Experimental results indicate DrugormerDTI outperforms previous methods on four benchmarks, highlighting the effectiveness of the Graph Transformer and residue design in improving drug-target prediction.

Motivation: Plant Small Secreted Peptides (SSPs) play an important role in plant growth, development, and plant-microbe interactions. Therefore, the identification of SSPs is essential for revealing their functional mechanisms. Over the last few decades, machine learning-based methods have been developed, accelerating the discovery of SSPs to some extent.

Here, we present DeepBIO, the first-of-its-kind automated and interpretable deep-learning platform for high-throughput biological sequence functional analysis. DeepBIO is a one-stop-shop web service that enables researchers to develop new deep-learning architectures to answer any biological question. Specifically, given any biological sequence data, DeepBIO supports a total of 42 state-of-the-art deep-learning algorithms for model training, comparison, optimization and evaluation in a fully automated pipeline.

Accurately predicting peptide secondary structures remains a challenging task due to the lack of discriminative information in short peptides. In this study, we propose PHAT, a deep hypergraph learning framework for the prediction of peptide secondary structures and the exploration of downstream tasks. The framework includes a novel interpretable deep hypergraph multi-head attention network that uses residue-based reasoning for structure prediction.

Background: Cell-penetrating peptides (CPPs) have received considerable attention as a means of transporting pharmacologically active molecules into living cells without damaging the cell membrane, and thus hold great promise as future therapeutics. Recently, several machine learning-based algorithms have been proposed for predicting CPPs. However, most existing predictive methods do not consider the agreement (disagreement) between similar (dissimilar) CPPs and depend heavily on expert knowledge-based handcrafted features.

Article Synopsis
  • The study introduces iDNA-ABF, a deep learning model designed to predict DNA methylations using only genomic sequences, which enhances interpretability in predictions.
  • iDNA-ABF outperforms existing methods in various methylation prediction tasks, showcasing its advanced capabilities.
  • The model not only captures important sequential and functional information from genomes but also includes a mechanism for interpreting its findings, linking crucial DNA sequences to their biological functions.

Summary: Identifying protein-peptide binding residues is fundamentally important for understanding the mechanisms of protein function and for drug discovery. Although several computational methods have been developed, most of them rely heavily on third-party tools or complex data preprocessing for feature design, which easily results in low computational efficiency and low predictive performance. To address these limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representations from Transformers)-based contrastive learning framework to predict protein-peptide binding residues based on protein sequences only.

DNA N4-methylcytosine (4mC) is an important DNA modification that plays a crucial role in a variety of biological processes. Accurate 4mC site identification is fundamental to improving the understanding of 4mC biological functions and mechanisms. However, many identification approaches are limited to traditional machine learning, which leads to weak learning ability and a complex feature extraction process.

Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, these methods cannot provide a sufficiently trained model, which easily results in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides.

Motivation: DNA methylation plays an important role in epigenetic modification and in the occurrence and development of diseases. Therefore, identification of DNA methylation sites is critical for better understanding and revealing their functional mechanisms. To date, several machine learning and deep learning methods have been developed for the prediction of different DNA methylation types.
