Publications by authors named "Yingfan Ma"

Proteins can be represented in different data forms, including sequence, structure, and surface, each of which has unique advantages and certain limitations, so fusing their complementary information is a promising direction. In this work, we propose a framework called ProteinF3S for enzyme function prediction that fuses the complementary information across protein sequence, structure, and surface.


As more and more protein structures are discovered, blind protein-ligand docking will play an important role in drug discovery because it can predict protein-ligand complex conformation without pocket information on the target proteins. Recently, deep learning-based methods have made significant advancements in blind protein-ligand docking, but their protein features are suboptimal because they do not fully consider the difference between potential pocket regions and non-pocket regions in protein feature extraction. In this work, we propose a pocket-guided strategy for guiding the ligand to dock to potential docking regions on a protein.


Self-supervised learning plays an important role in molecular representation learning because labeled molecular data are usually limited in many tasks, such as chemical property prediction and virtual screening. However, most existing molecular pre-training methods focus on one modality of molecular data, and the complementary information of two important modalities, SMILES and graph, is not fully explored. In this study, we propose an effective multi-modality self-supervised learning framework for molecular SMILES and graph.


Summary: The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. With the rapid progress of deep learning, a series of learning-based surface descriptors have been proposed and have achieved strong performance in tasks such as protein design and protein-protein interaction prediction. However, they remain limited by label scarcity, since labels are typically obtained through wet-lab experiments.
