Publications by authors named "Yundian Zeng"

3D structure-based molecular generation is a successful application of generative AI in drug discovery. Most earlier models follow an atom-wise paradigm, generating molecules with good docking scores but poor molecular properties (such as synthesizability and druggability). In contrast, fragment-wise generation offers a promising alternative by assembling chemically viable fragments.

Ribonucleic acid (RNA)-ligand interactions play a pivotal role in a wide spectrum of biological processes, ranging from protein biosynthesis to cellular reproduction. This recognition has prompted the broader acceptance of RNA as a viable drug target. An atomic-scale understanding of RNA-ligand interactions is essential for unraveling intricate molecular mechanisms and for advancing RNA-based drug discovery.

Nucleic acid (NA)-ligand interactions are of paramount importance in a variety of biological processes, including cellular reproduction and protein biosynthesis, and therefore, NAs have been broadly recognized as potential drug targets. Understanding NA-ligand interactions at the atomic scale is essential for investigating the molecular mechanism and further assisting in NA-targeted drug discovery. Molecular docking is one of the predominant computational approaches for predicting the interactions between NAs and small molecules.

In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require a large amount of labeled data on pharmacological and physicochemical properties. However, experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze the dependence of various multi-property molecule generation models on biological activity label data and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework based on a conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL).

Metalloproteins play indispensable roles in various biological processes ranging from reaction catalysis to free radical scavenging, and they are also pertinent to numerous pathologies including cancer, HIV infection, neurodegeneration, and inflammation. The discovery of high-affinity ligands for metalloproteins can advance the treatment of these pathologies. Extensive efforts have been made to develop approaches, such as molecular docking and machine learning (ML)-based models, for fast identification of ligands binding to heterogeneous proteins, but few of them have concentrated exclusively on metalloproteins.

Many deep learning (DL)-based molecular generative models have been proposed to design novel molecules. These models may perform well on benchmarks, but they usually do not take real-world constraints in drug discovery into account, such as the available training data set, synthetic accessibility, and scaffold diversity. In this study, a new algorithm, ChemistGA, was proposed that combines the traditional heuristic algorithm with DL: the crossover of the traditional genetic algorithm (GA) was redefined by DL, and an innovative backcrossing operation was implemented to generate desired molecules.

Covalent ligands have attracted increasing attention due to their unique advantages, such as long residence time, high selectivity, and strong binding affinity. They also show promise for targets where previous efforts to identify noncovalent small molecule inhibitors have failed. However, our limited knowledge of covalent binding sites has hindered the discovery of novel ligands.

Cross-contamination of cell lines is a highly relevant and pervasive problem. The analysis of short tandem repeats (STRs) has been a simple and commercially available technique for authenticating cell lines for more than two decades. At present, multiplex STR amplification kits covering up to 21 loci have been developed, while current STR databases provide only 9-locus STR profiles.
