Improving quantitative prediction of protein subcellular locations in fluorescence images through deep generative models.

Comput Biol Med

School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China.

Published: September 2024

AI Article Synopsis

  • Machine learning helps identify where proteins are located in cells, which is crucial for understanding their functions, especially for those that exist in multiple places.
  • However, most current research focuses on just classifying these locations without considering how much of a protein is in each place.
  • To tackle this issue, a new model called PLocGAN was developed, which generates cell images with detailed quantitative information, enabling better predictions about protein localization.

Article Abstract

Machine learning has been employed to recognize protein localization at the subcellular level, which greatly facilitates protein function studies, especially for multi-label proteins that localize to more than one organelle. However, existing work mostly addresses the qualitative classification of protein subcellular locations and ignores the fraction of a multi-label protein present in each location. About 50% of proteins are multi-label proteins, and neglecting this quantitative information severely restricts our understanding of their spatial distribution and functional mechanisms. One reason for the lack of quantitative studies is the shortage of quantitative annotations. To address this data shortage, we propose a generative model, PLocGAN, that generates cell images conditioned on a quantitative annotation of the fluorescence distribution. The model is a conditional generative adversarial network in which condition learning uses partial label learning to overcome the lack of training labels, allowing training with only qualitative labels. It also uses contrastive learning to enhance the diversity of the generated images. We assessed PLocGAN on four pixel-fused synthetic datasets and one real dataset, and demonstrated that the model generates images with good fidelity and diversity, outperforming existing state-of-the-art generative methods. To verify the utility of PLocGAN for the quantitative prediction of protein subcellular locations, we replaced the training images with generated quantitative images and built prediction models, and found that the generated images improved quantitative estimation. This work demonstrates the effectiveness of deep generative models in bioimage analysis and provides a new solution for quantitative subcellular proteomics.
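The abstract does not describe how the pixel-fused synthetic datasets were built. As a rough illustration only, the sketch below assumes that "pixel-fused" means blending two single-location images pixel by pixel with a known mixing fraction, so that the fraction itself serves as the quantitative annotation. The function names `normalize` and `fuse`, the toy images, and the normalization step are all hypothetical and are not taken from the paper.

```python
def normalize(image):
    """Scale a 2-D image (list of rows) so its pixel intensities sum to 1."""
    total = sum(sum(row) for row in image)
    if total <= 0:
        raise ValueError("image has no signal")
    return [[pixel / total for pixel in row] for row in image]


def fuse(image_a, image_b, fraction_a):
    """Blend two same-sized single-location images pixel by pixel.

    Assumption (not from the paper): fraction_a is the share of total
    fluorescence assigned to location A; the remainder goes to location B.
    Returns the fused image and its quantitative label.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same shape")
    norm_a, norm_b = normalize(image_a), normalize(image_b)
    fused = [
        [fraction_a * a + (1.0 - fraction_a) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(norm_a, norm_b)
    ]
    label = (fraction_a, 1.0 - fraction_a)
    return fused, label


# Toy 2x2 "images": signal only in the top row vs. only in the bottom row.
nucleus = [[4, 4], [0, 0]]
cytosol = [[0, 0], [1, 3]]
fused, label = fuse(nucleus, cytosol, 0.7)
```

Because each source image is normalized to unit total intensity before blending, the share of fused intensity coming from each source matches the label, which is what makes such a synthetic image usable as quantitative ground truth under these assumptions.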


Source
http://dx.doi.org/10.1016/j.compbiomed.2024.108913

Publication Analysis

Top Keywords

protein subcellular: 12
subcellular locations: 12
quantitative prediction: 8
prediction protein: 8
deep generative: 8
generative models: 8
multi-label proteins: 8
quantitative: 8
protein: 6
images: 6
