Publications by authors named "Le OuYang"

The self-expressive coefficient plays a crucial role in the self-expressiveness-based subspace clustering method. To enhance the precision of the self-expressive coefficient, we propose a novel deep subspace clustering method, named grouping belief-based deep contrastive subspace clustering (GRESS), which integrates the clustering information and higher-order relationship into the coefficient matrix. Specifically, we develop a deep contrastive subspace clustering module to enhance the learning of both self-expressive coefficients and cluster representations simultaneously.

Inferring the interactions between genes is essential for understanding the mechanisms underlying biological processes. Gene networks change as the environment and state change. The accumulation of gene expression data from multiple states makes it possible to estimate the gene networks in various states using computational methods.

Binding affinity prediction of three-dimensional (3D) protein-ligand complexes is critical for drug repositioning and virtual drug screening. Existing approaches usually transform a 3D protein-ligand complex to a two-dimensional (2D) graph, and then use graph neural networks (GNNs) to predict its binding affinity. However, the node and edge features of the 2D graph are extracted based on invariant local coordinate systems of the 3D complex.

Motivation: Single-cell clustering plays a crucial role in distinguishing between cell types, facilitating the analysis of cell heterogeneity mechanisms. While many existing clustering methods rely solely on gene expression data obtained from single-cell RNA sequencing techniques to identify cell clusters, the information contained in mono-omic data is often limited, leading to suboptimal clustering performance. The emergence of single-cell multi-omics sequencing technologies enables the integration of multiple omics data for identifying cell clusters, but how to integrate different omics data effectively remains challenging.

Motivation: Retrosynthesis is a critical task in drug discovery, aimed at finding a viable pathway for synthesizing a given target molecule. Many existing approaches frame this task as a graph-generation problem. Specifically, these methods first identify the reaction center, and then break the target molecule accordingly to generate the synthons.

Integrating single-cell datasets produced by multiple omics technologies is essential for defining cellular heterogeneity. Mosaic integration, in which different datasets share only some of the measured modalities, poses major challenges, particularly regarding modality alignment and batch effect removal. Here, we present a deep probabilistic framework for the mosaic integration and knowledge transfer (MIDAS) of single-cell multimodal data.

Cell type identification is a crucial step towards the study of cellular heterogeneity and biological processes. Advances in single-cell sequencing technology have enabled the development of a variety of clustering methods for cell type identification. However, most existing methods are designed for clustering single-omics data such as single-cell RNA-sequencing (scRNA-seq) data.

Detecting protein complexes is critical for studying cellular organizations and functions. The accumulation of protein-protein interaction (PPI) data enables the identification of protein complexes computationally. Although a great number of computational methods have been proposed to identify protein complexes from PPI networks, most of them ignore the signs of PPIs that reflect the ways proteins interact (activation or inhibition).

Cancer is a complex disease caused primarily by genetic variants. Reconstructing gene networks within tumors is essential for understanding the functional regulatory mechanisms of carcinogenesis. Advances in high-throughput sequencing technologies have provided tremendous opportunities for inferring gene networks via computational approaches.

The inference of gene regulatory networks (GRNs) is of great importance for understanding the complex regulatory mechanisms within cells. The emergence of single-cell RNA-sequencing (scRNA-seq) technologies enables the measurement of gene expression levels in individual cells, which promotes the reconstruction of GRNs at single-cell resolution. However, existing network inference methods are mainly designed for data collected from a single data source and therefore ignore the information provided by multiple related data sources.

Advances in single-cell RNA sequencing (scRNA-seq) technologies have provided an unprecedented opportunity for cell-type identification. As clustering is an effective strategy for cell-type identification, various computational approaches have been proposed for clustering scRNA-seq data. Recently, with the emergence of cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), the cell surface expression of specific proteins and the RNA expression of the same cell can be captured, which provides more comprehensive information for cell analysis.

Identification of cancer-related genes is helpful for understanding the pathogenesis of cancer, developing targeted drugs and creating new diagnostic and therapeutic methods. Given the complexity of biological laboratory methods and the increasing availability of high-throughput data, many network-based methods have been proposed to identify cancer-related genes from a global perspective. Some studies have focused on tissue-specific cancer networks.

The single-cell RNA sequencing (scRNA-seq) technique has opened a new era by revealing gene expression patterns at single-cell resolution, enabling studies of the heterogeneity and transcriptome dynamics of complex tissues. However, the large proportion of dropout events in such data may hinder downstream analyses. Thus, imputation of dropout events is an important step in analyzing scRNA-seq data.

Advances in high-throughput experimental technologies have promoted the accumulation of vast amounts of biomedical data. Biomedical link prediction and single-cell RNA-sequencing (scRNA-seq) data imputation are two essential tasks in biomedical data analysis, which can facilitate various downstream studies and yield insights into the mechanisms of complex diseases. Both tasks can be transformed into matrix completion problems.
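To make the matrix completion framing concrete, the following is a minimal sketch, assuming the unknown matrix is well approximated by a rank-1 product. It fills missing entries (marked `None`) by alternating least squares on the observed entries only. The function name `complete_rank1`, the toy matrix and the rank-1 assumption are illustrative choices, not the method proposed in the article.

```python
def complete_rank1(matrix, n_iter=50):
    """Fill None entries with a rank-1 fit u[i] * v[j] (alternating least squares).

    Hypothetical helper for illustration; real data would need a higher rank
    and regularization.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        # Update each row factor using only the observed entries in that row.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            denom = sum(v[j] ** 2 for j in obs)
            if denom > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in obs) / denom
        # Update each column factor using only the observed entries in that column.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            denom = sum(u[i] ** 2 for i in obs)
            if denom > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in obs) / denom
    # Keep observed values as they are and predict only the missing ones.
    return [
        [matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy example: the outer product of [1, 2, 3] and [2, 1, 4] with one entry hidden.
observed = [
    [None, 1.0, 4.0],
    [4.0, 2.0, 8.0],
    [6.0, 3.0, 12.0],
]
completed = complete_rank1(observed)
```

In link prediction the missing entries would be unobserved associations, and in scRNA-seq imputation they would be suspected dropout zeros; deciding which zeros count as missing is a modeling choice this sketch leaves open.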

The identification of differentially expressed genes between different cell groups is a crucial step in analyzing single-cell RNA-sequencing (scRNA-seq) data. Although various differential expression analysis methods for scRNA-seq data have recently been proposed based on different model assumptions and strategies, the differentially expressed genes they identify differ considerably from one method to another, and their performance depends on the underlying data structures. In this paper, we propose a new ensemble learning-based differential expression analysis method, scDEA, to produce a more stable and accurate result.
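One generic way to pool evidence from several differential expression methods is to combine the p-values they report for the same gene. The sketch below uses Fisher's method purely as an illustration; it is an assumption here and is not claimed to be the combination rule used by scDEA. The function names are hypothetical, and Fisher's method assumes the p-values are independent, which rarely holds exactly when methods are run on the same data.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of a chi-square variable with even degrees of freedom.

    For df = 2k there is a closed form: exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values, floor=1e-300):
    """Combine p-values for one gene with Fisher's method (illustrative only)."""
    # Clamp at a tiny floor so that log(0) is never evaluated.
    stat = -2.0 * sum(math.log(max(p, floor)) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))


# Two methods that each report p = 0.05 for the same gene.
combined = fisher_combine([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form.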

Advances in single-cell RNA sequencing (scRNA-seq) technologies have provided unprecedented opportunities to characterize cellular states and investigate the mechanisms of complex diseases. Due to technical issues such as dropout events, scRNA-seq data contain an excess of false zero counts, which has a substantial impact on downstream analyses. Although several computational approaches have been proposed to impute dropout events in scRNA-seq data, there is no strong consensus on which approach is best.

Inferring gene co-expression networks from high-throughput gene expression data is an important task in bioinformatics. Many gene networks often exhibit modular structures. Although several Gaussian graphical model-based methods have been developed to estimate gene co-expression networks by incorporating the modular structural prior, none of them takes into account the modular structures captured by the prior networks (e.

Motivation: Differential network analysis is an important tool to investigate the rewiring of gene interactions under different conditions. Several computational methods have been developed to estimate differential networks from gene expression data, but most of them do not consider that gene network rewiring may be driven by the differential expression of individual genes. New differential network analysis methods that simultaneously take account of the changes in gene interactions and changes in expression levels are needed.

The mechanisms controlling biological processes, such as the development of disease or cell differentiation, can be investigated by examining changes in the networks of gene dependencies between states in the process. High-throughput experimental methods, such as microarrays and RNA sequencing, have been widely used to gather gene expression data, which paves the way for inferring gene dependencies with computational methods. However, most differential network analysis methods are designed to deal with fully observed data, whereas missing values, such as the dropout events in single-cell RNA-sequencing data, are frequent.

This study aims to investigate healthcare workers' (HCWs) willingness to receive a SARS-CoV-2 vaccine in Zhejiang and to identify the factors that influence it. The survey was conducted in six regions of Zhejiang Province, China, and covered 13 hospitals and 12 Centers for Disease Control and Prevention (CDC). Participants were healthcare workers, and a total of 3726 questionnaires were collected online, of which 3634 (97.

The identification of cancer subtypes is of great importance for understanding the heterogeneity of tumors and providing patients with more accurate diagnoses and treatments. However, it is still a challenge to effectively integrate multiple omics data to establish cancer subtypes. In this paper, we propose an unsupervised integration method, named weighted multi-view low rank representation (WMLRR), to identify cancer subtypes from multiple types of omics data.

Advances in technology have made it convenient to obtain large amounts of single-cell RNA sequencing (scRNA-seq) data. Since clustering is a very important step in identifying or defining cellular phenotypes, many clustering approaches have recently been developed for these applications. These methods can be roughly divided into normal clustering methods and integrated (ensemble) clustering methods, which combine more than two normal clustering methods with the aim of achieving better performance.
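A common building block of ensemble clustering is the co-association (consensus) matrix, which records how often each pair of cells is placed in the same cluster across the base clusterings. The sketch below is a generic illustration under that assumption and is not taken from the article; the function name `co_association` and the toy labels are hypothetical.

```python
def co_association(labelings):
    """Fraction of base clusterings that put each pair of items in the same cluster.

    `labelings` is a list of label lists, one per base clustering, all of equal
    length. Cluster label values need not match across clusterings.
    """
    n_items = len(labelings[0])
    n_methods = len(labelings)
    return [
        [sum(labels[i] == labels[j] for labels in labelings) / n_methods
         for j in range(n_items)]
        for i in range(n_items)
    ]


# Three base clusterings of three cells.
base_clusterings = [
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
consensus = co_association(base_clusterings)
```

A final partition can then be obtained by running any similarity-based clustering, for example hierarchical clustering, on this matrix.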

Disease-gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases needs statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives.

Inferring gene networks from gene expression data is important for understanding functional organization within cells. With the accumulation of single-cell RNA sequencing (scRNA-seq) data, it has become possible to infer gene networks at the single-cell level. However, due to the characteristics of scRNA-seq data, such as cellular heterogeneity and high sparsity caused by dropout events, traditional network inference methods may not be suitable for scRNA-seq data.
