Publications by authors named "Ruofan Jin"

Protein structure is key to understanding protein function and is essential for progress in bioengineering, drug discovery, and molecular biology. Recently, with the incorporation of generative AI, the power and accuracy of computational protein structure prediction and design have improved significantly. However, ethical concerns such as copyright protection and harmful content generation (biosecurity) pose challenges to the wide adoption of protein generative models.

Protein loop modeling is a challenging and highly nontrivial task in protein structure prediction. Despite recent progress, existing methods, including knowledge-based, ab initio, hybrid, and deep learning (DL) methods, fall substantially short in either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling.

The optimization of therapeutic antibodies through traditional techniques, such as candidate screening via hybridoma or phage display, is resource-intensive and time-consuming. In recent years, computational and artificial intelligence-based methods have been actively developed to accelerate and improve the development of therapeutic antibodies. In this study, we developed an end-to-end sequence-based deep learning model, termed AttABseq, to predict the changes in antigen-antibody binding affinity associated with antibody mutations.

Article Synopsis
  • Protein loops are vital for protein dynamics and various biological processes, yet comprehensive evaluations of loop modeling methods are lacking.
  • Researchers created two datasets to assess the accuracy and efficiency of 13 loop modeling approaches based on factors such as loop length and protein type.
  • The knowledge-based method FREAD generally performed best, while Rosetta NGK excelled with short loops; AlphaFold2 and RoseTTAFold showed promise for longer loops but require more resources.

In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require a large amount of labeled data on pharmacological and physicochemical properties. However, experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze the dependence of various multi-property molecule generation models on biological activity labels and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework based on a conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL).

Clarifying the process of formation of diversity hotspots and the biogeographic connection between regions is critical in understanding the impact of environmental changes on organismal evolution. Polygonatum (Asparagaceae) is distributed across the Northern Hemisphere. It displays an uneven distribution, with more than 50% of its species occurring in the Himalaya-Hengduan Mountains (HHM).
