Publications by authors named "Le Ouyang"

The self-expressive coefficient plays a crucial role in the self-expressiveness-based subspace clustering method. To enhance the precision of the self-expressive coefficient, we propose a novel deep subspace clustering method, named grouping belief-based deep contrastive subspace clustering (GRESS), which integrates the clustering information and higher-order relationship into the coefficient matrix. Specifically, we develop a deep contrastive subspace clustering module to enhance the learning of both self-expressive coefficients and cluster representations simultaneously.
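As background, self-expressiveness-based subspace clustering is commonly written as the optimization problem below. This is the generic textbook form, not necessarily the GRESS objective; the symbols $X$, $C$, $\lambda$, $p$, and $W$ are notation assumed here for illustration.

```latex
% X \in \mathbb{R}^{d \times n}: data matrix whose columns are the n samples
% C \in \mathbb{R}^{n \times n}: self-expressive coefficient matrix
\min_{C} \; \|C\|_{p} + \frac{\lambda}{2}\,\|X - XC\|_{F}^{2}
\quad \text{s.t.} \quad \operatorname{diag}(C) = 0

% A symmetric affinity matrix is then typically built from C
% and passed to spectral clustering:
W = \tfrac{1}{2}\left(|C| + |C|^{\top}\right)
```

The constraint $\operatorname{diag}(C) = 0$ rules out the trivial solution in which each sample represents itself, so the quality of $C$ directly determines the quality of the affinity matrix.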

Inferring the interactions between genes is essential for understanding the mechanisms underlying biological processes. Gene networks change as the environment and cellular state change. The accumulation of gene expression data from multiple states makes it possible to estimate the gene networks in various states using computational methods.

Binding affinity prediction of three-dimensional (3D) protein-ligand complexes is critical for drug repositioning and virtual drug screening. Existing approaches usually transform a 3D protein-ligand complex into a two-dimensional (2D) graph, and then use graph neural networks (GNNs) to predict its binding affinity. However, the node and edge features of the 2D graph are extracted based on invariant local coordinate systems of the 3D complex.

Motivation: Single-cell clustering plays a crucial role in distinguishing between cell types, facilitating the analysis of cell heterogeneity mechanisms. While many existing clustering methods rely solely on gene expression data obtained from single-cell RNA sequencing techniques to identify cell clusters, the information contained in mono-omic data is often limited, leading to suboptimal clustering performance. The emergence of single-cell multi-omics sequencing technologies enables the integration of multiple omics data for identifying cell clusters, but how to integrate different omics data effectively remains challenging.

Motivation: Retrosynthesis is a critical task in drug discovery, aimed at finding a viable pathway for synthesizing a given target molecule. Many existing approaches frame this task as a graph-generation problem. Specifically, these methods first identify the reaction center and then break the target molecule accordingly to generate the synthons.

Integrating single-cell datasets produced by multiple omics technologies is essential for defining cellular heterogeneity. Mosaic integration, in which different datasets share only some of the measured modalities, poses major challenges, particularly regarding modality alignment and batch effect removal. Here, we present a deep probabilistic framework for the mosaic integration and knowledge transfer (MIDAS) of single-cell multimodal data.

Cell type identification is a crucial step towards the study of cellular heterogeneity and biological processes. Advances in single-cell sequencing technology have enabled the development of a variety of clustering methods for cell type identification. However, most existing methods are designed for clustering single-omics data such as single-cell RNA-sequencing (scRNA-seq) data.

Detecting protein complexes is critical for studying cellular organizations and functions. The accumulation of protein-protein interaction (PPI) data enables the identification of protein complexes computationally. Although a great number of computational methods have been proposed to identify protein complexes from PPI networks, most of them ignore the signs of PPIs that reflect the ways proteins interact (activation or inhibition).

Cancer is a complex disease caused primarily by genetic variants. Reconstructing gene networks within tumors is essential for understanding the functional regulatory mechanisms of carcinogenesis. Advances in high-throughput sequencing technologies have provided tremendous opportunities for inferring gene networks via computational approaches.

Article Synopsis
  • The importance of inferring gene regulatory networks (GRNs) lies in understanding complex cellular regulatory mechanisms, with single-cell RNA-sequencing (scRNA-seq) enabling detailed measurement of gene expression at the individual cell level.
  • Existing methods for network inference typically rely on data from a single source, overlooking the potential insights from multiple related datasets.
  • The proposed DeepMCL model utilizes multi-view contrastive learning and an attention mechanism to effectively integrate various data sources and improve GRN inference through a deep Siamese convolutional neural network, showing promising results in experiments.

Advances in single-cell RNA sequencing (scRNA-seq) technologies have provided an unprecedented opportunity for cell-type identification. As clustering is an effective strategy for cell-type identification, various computational approaches have been proposed for clustering scRNA-seq data. Recently, with the emergence of cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), the cell surface expression of specific proteins and the RNA expression of the same cell can be captured together, which provides more comprehensive information for cell analysis.

Identification of cancer-related genes is helpful for understanding the pathogenesis of cancer, developing targeted drugs, and creating new diagnostic and therapeutic methods. Given the complexity of laboratory methods and the increasing availability of high-throughput data, many network-based methods have been proposed to identify cancer-related genes from a global perspective. Some studies have focused on tissue-specific cancer networks.

Article Synopsis
  • The scRNA-seq technique allows researchers to examine gene expression at the single-cell level, unveiling the complexities within tissues, but it faces challenges from dropout events that complicate data analysis.
  • A new method called scTSSR2 is introduced, which effectively combines matrix decomposition with a two-side sparse self-representation approach, resulting in efficient imputation of missing data in scRNA-seq studies.
  • scTSSR2 demonstrates superior computational speed and memory efficiency compared to other imputation methods and is packaged into a user-friendly R tool to enhance the quality of scRNA-seq data.

Advances in high-throughput experimental technologies have promoted the accumulation of vast amounts of biomedical data. Biomedical link prediction and single-cell RNA-sequencing (scRNA-seq) data imputation are two essential tasks in biomedical data analysis, which can facilitate various downstream studies and yield insights into the mechanisms of complex diseases. Both tasks can be transformed into matrix completion problems.

Article Synopsis
  • The identification of differentially expressed genes in single-cell RNA-sequencing (scRNA-seq) is vital, but existing methods yield inconsistent results due to varying assumptions and data structures.
  • The authors introduce scDEA, an ensemble learning approach that combines P-values from 12 distinct analysis methods, aiming for more reliable results in gene expression detection.
  • Experimental findings indicate that scDEA significantly outperforms existing methods, making it a useful tool for biologists, bioinformaticians, and data scientists working with scRNA-seq data.
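The synopsis above does not say how scDEA combines the P-values, so the following is only a minimal sketch of the general idea using Fisher's method as a stand-in. The function name `fisher_combine` is hypothetical, and the method assumes independent P-values, which results from several tools run on the same data would not strictly satisfy.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method (illustrative stand-in, not scDEA's rule).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis when the k p-values are
    independent. For even degrees of freedom the survival function has the
    closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Accumulate sum_{i=0}^{k-1} half_stat**i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical p-values for one gene, as reported by three different tools.
    print(fisher_combine([0.04, 0.20, 0.01]))
```

A single P-value is returned unchanged, and several moderately small P-values combine into a smaller one, which is the behaviour an ensemble of differential expression tests relies on.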
Article Synopsis
  • Single-cell RNA sequencing (scRNA-seq) allows for detailed analysis of cellular states and complex diseases, but it suffers from inaccuracies due to dropout events that create false zero counts.
  • A new method called EnTSSR uses weighted ensemble learning to improve imputation of these dropout events by considering similarities among genes and cells, leveraging multiple imputation methods.
  • Tests including down-sampling, clustering, differential expression, and cell trajectory analysis show that EnTSSR effectively restores the true expression patterns in scRNA-seq data.
Article Synopsis
  • Inferring gene co-expression networks is crucial in bioinformatics, particularly since many networks have modular structures that reflect biological functions.
  • Despite advances in Gaussian graphical models to estimate these networks, existing methods fail to incorporate prior networks, like protein interaction networks, into their analysis.
  • The newly proposed method, pGNI, combines gene expression data and prior protein interaction data to better capture modular structures, showing its effectiveness in simulations and real datasets through biologically meaningful results.

Motivation: Differential network analysis is an important tool to investigate the rewiring of gene interactions under different conditions. Several computational methods have been developed to estimate differential networks from gene expression data, but most of them do not consider that gene network rewiring may be driven by the differential expression of individual genes. New differential network analysis methods that simultaneously take account of the changes in gene interactions and changes in expression levels are needed.

Article Synopsis
  • The study focuses on understanding biological processes by analyzing gene dependencies during events like disease development and cell differentiation using gene expression data.
  • New methods, specifically the proposed Weighted Differential Network Estimation (WDNE) model, are introduced to effectively handle missing data from techniques such as single-cell RNA sequencing.
  • Simulation results show that WDNE outperforms existing methods, and its application to ovarian tumors and breast cancer reveals crucial hub genes that provide insights into drug resistance and tumor behavior; a MATLAB toolbox is provided for practical analysis.

This study aims to investigate healthcare workers' (HCWs) willingness to receive a SARS-CoV-2 vaccine in Zhejiang and to identify the factors that influence it. The survey was conducted in six regions of Zhejiang Province, China, and covered 13 hospitals and 12 Centers for Disease Control and Prevention (CDC). Participants were healthcare workers and a total of 3726 questionnaires were collected online, of which 3634 (97.

The identification of cancer subtypes is of great importance for understanding the heterogeneity of tumors and providing patients with more accurate diagnoses and treatments. However, it is still a challenge to effectively integrate multiple omics data to establish cancer subtypes. In this paper, we propose an unsupervised integration method, named weighted multi-view low rank representation (WMLRR), to identify cancer subtypes from multiple types of omics data.

Article Synopsis
  • Advances in technology have generated a wealth of single-cell RNA sequencing (scRNA-seq) data, leading to the development of various clustering approaches to identify cellular phenotypes.
  • Clustering methods are generally categorized into individual (normal) methods, which focus on single data aspects, and integrated (ensemble) methods, which combine multiple individual methods for improved accuracy but can be sensitive to their base results.
  • The proposed EC-PGMGR algorithm aims to address these challenges by automatically determining cluster numbers and incorporating regularization to enhance the effectiveness of active clustering while mitigating the influence of weaker results.

Establishing disease-gene associations through genome-wide association studies (GWAS) is an arduous task for researchers. Identifying single nucleotide polymorphisms that correlate with specific diseases requires statistical analysis of associations. Given the huge number of possible mutations, GWAS analysis suffers not only from high cost but also from a large number of false positives.

Inferring gene networks from gene expression data is important for understanding functional organization within cells. With the accumulation of single-cell RNA sequencing (scRNA-seq) data, it is possible to infer gene networks at the single-cell level. However, due to the characteristics of scRNA-seq data, such as cellular heterogeneity and high sparsity caused by dropout events, traditional network inference methods may not be suitable for scRNA-seq data.
