Publications by Guohui Chuai

Publications by authors named "Guohui Chuai"

Author Correction: Discovering CRISPR-Cas system with self-processing pre-crRNA capability by foundation models.

Wenhui Li Xianyue Jiang Wuke Wang Liya Hou Runze Cai Guohui Chuai

Nat Commun

January 2025

Discovering CRISPR-Cas system with self-processing pre-crRNA capability by foundation models.

Wenhui Li Xianyue Jiang Wuke Wang Liya Hou Runze Cai Guohui Chuai

Nat Commun

November 2024

Article Synopsis

The emergence of CRISPR-Cas systems has led to the development of efficient gene editing tools, but traditional methods for discovering these systems often overlook important variants due to their reliance on sequence similarity.
A new AI framework called CHOOSER has been introduced to discover CRISPR-Cas systems without needing extensive training data, significantly enhancing the discovery process by utilizing protein large language models.
Using CHOOSER, researchers identified 11 new Casλ homologs, doubling the known catalog, and experimentally validated one homolog, EphcCasλ, for its capability in self-processing pre-crRNA and potential in CRISPR-based pathogen detection.

Toward subtask-decomposition-based learning and benchmarking for predicting genetic perturbation outcomes and beyond.

Yicheng Gao Zhiting Wei Kejing Dong Ke Chen Jingya Yang Guohui Chuai

Nat Comput Sci

October 2024

Article Synopsis

Understanding how cells respond to changes in genes is crucial for many medical fields, but there are challenges in predicting outcomes from single or multiple genetic changes across different cell types.
The research introduces an AI method called STAMP, which breaks down the prediction of genetic outcomes into three manageable tasks: finding genes that change after perturbation, determining how they change, and measuring the extent of those changes.
STAMP shows significant improvements over previous methods, including the ability to identify important regulatory genes and pathways, even with small sample sizes, and to uncover detailed interactions among genes.
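
The three subtasks named in this synopsis can be illustrated with a minimal sketch. This is not STAMP's implementation, which is a learned model; it only shows how targets for the three subtasks might be derived from a control and a perturbed expression profile. The names `SubtaskTargets` and `decompose`, the `threshold` parameter, and the toy expression values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SubtaskTargets:
    """Per-gene targets for the three subtasks (hypothetical container)."""
    changed: List[bool]      # subtask 1: does the gene change after perturbation?
    direction: List[int]     # subtask 2: +1 up, -1 down, 0 if unchanged
    magnitude: List[float]   # subtask 3: size of the change, 0.0 if unchanged


def decompose(control: Sequence[float],
              perturbed: Sequence[float],
              threshold: float = 0.5) -> SubtaskTargets:
    """Split one perturbation outcome into three per-gene targets.

    `threshold` is an assumed cutoff on the absolute expression change;
    a real analysis would more likely use a statistical test.
    """
    if len(control) != len(perturbed):
        raise ValueError("control and perturbed must cover the same genes")

    changed, direction, magnitude = [], [], []
    for c, p in zip(control, perturbed):
        delta = p - c
        is_changed = abs(delta) >= threshold
        changed.append(is_changed)
        direction.append((1 if delta > 0 else -1) if is_changed else 0)
        magnitude.append(abs(delta) if is_changed else 0.0)
    return SubtaskTargets(changed, direction, magnitude)


# Toy profile over three genes (made-up values, for illustration only).
targets = decompose(control=[1.0, 2.0, 3.0], perturbed=[1.1, 3.5, 1.0])
print(targets.changed)    # [False, True, True]
print(targets.direction)  # [0, 1, -1]
print(targets.magnitude)  # [0.0, 1.5, 2.0]
```

Separating the targets this way means the direction and magnitude subtasks only have to be solved for genes that the first subtask flags as changed.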

Foundation models in molecular biology.

Yunda Si Jiawei Zou Yicheng Gao Guohui Chuai Qi Liu

Biophys Rep

June 2024

Determining correlations between molecules at various levels is an important topic in molecular biology. Large language models have demonstrated a remarkable ability to capture correlations from large amounts of data in natural language processing as well as image generation. Because the correlations they capture can be applied to a wide range of specific tasks, large language models are also referred to as foundation models. The massive amount of data that exists in molecular biology provides an excellent basis for developing foundation models, and their recent emergence in molecular biology has pushed the entire field forward.

Personalized tumor combination therapy optimization using the single-cell transcriptome.

Chen Tang Shaliu Fu Xuan Jin Wannian Li Feiyang Xing Guohui Chuai

Genome Med

December 2023

Background: The precise characterization of individual tumors and immune microenvironments using transcriptome sequencing has provided a great opportunity for successful personalized cancer treatment. However, the cancer treatment response is often characterized by in vitro assays or bulk transcriptomes that neglect the heterogeneity of malignant tumors in vivo and the immune microenvironment, motivating the need to use single-cell transcriptomes for personalized cancer treatment.

Methods: Here, we present comboSC, a computational proof-of-concept study to explore the feasibility of personalized cancer combination therapy optimization using single-cell transcriptomes.

Genome-wide CRISPR off-target prediction and optimization using RNA-DNA interaction fingerprints.

Qinchang Chen Guohui Chuai Haihang Zhang Jin Tang Liwen Duan

Nat Commun

November 2023

The powerful CRISPR genome editing system is hindered by its off-target effects, and existing computational tools have achieved limited performance in genome-wide off-target prediction due to the lack of a deep understanding of the CRISPR molecular mechanism. In this study, we propose to incorporate molecular dynamics (MD) simulations in the computational analysis of the CRISPR system, and present CRISOT, an integrated tool suite containing four related modules, i.e.

Toward a molecular mechanism-based prediction of CRISPR-Cas9 targeting effects.

Qinchang Chen Guohui Chuai Chao Zhang Qing Zhang Qi Liu

Sci Bull (Beijing)

June 2022

X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis.

Dongyu Xue Han Zhang Xiaohan Chen Dongling Xiao Yukang Gong Guohui Chuai

Sci Bull (Beijing)

May 2022

Systematic Exploration of Optimized Base Editing gRNA Design and Pleiotropic Effects with BExplorer.

Gongchen Zhang Chenyu Zhu Xiaohan Chen Jifang Yan Dongyu Xue Guohui Chuai

Genomics Proteomics Bioinformatics

December 2023

Base editing technology is being increasingly applied in genome engineering, but the current strategy for designing guide RNAs (gRNAs) relies substantially on empirical experience rather than a dependable and efficient in silico design. Furthermore, the pleiotropic effect of base editing on disease treatment remains unexplored, which prevents its further clinical usage. Here, we presented BExplorer, an integrated and comprehensive computational pipeline to optimize the design of gRNAs for 26 existing types of base editors in silico.

A deep generative model for multi-view profiling of single-cell RNA-seq and ATAC-seq data.

Gaoyang Li Shaliu Fu Shuguang Wang Chenyu Zhu Bin Duan Guohui Chuai

Genome Biol

January 2022

Article Synopsis

The single-cell Multi-View Profiler (scMVP) is a deep generative model designed to analyze sequencing data that captures both gene expression and chromatin accessibility in individual cells.
It generates unified latent representations for tasks like dimensionality reduction, cell clustering, and tracing developmental pathways while also providing separate imputations for differential analysis and cis-regulatory element identification.
scMVP addresses data sparsity challenges and improves the identification of cell groups in diverse joint profiling methods, showcasing its effectiveness on various realistic datasets.

DeepReac+: deep active learning for quantitative modeling of organic chemical reactions.

Yukang Gong Dongyu Xue Guohui Chuai Jing Yu Qi Liu

Chem Sci

November 2021

Various computational methods have been developed for quantitative modeling of organic chemical reactions; however, the lack of universality as well as the requirement of large amounts of experimental data limit their broad applications. Here, we present DeepReac+, an efficient and universal computational framework for prediction of chemical reaction outcomes and identification of optimal reaction conditions based on deep active learning. Under this framework, DeepReac is designed as a graph-neural-network-based model, which directly takes 2D molecular structures as inputs and automatically adapts to different prediction tasks.

FL-QSAR: a federated learning-based QSAR prototype for collaborative drug discovery.

Shaoqi Chen Dongyu Xue Guohui Chuai Qiang Yang Qi Liu

Bioinformatics

April 2021

Motivation: Quantitative structure-activity relationship (QSAR) analysis is commonly used in drug discovery. Collaborations among pharmaceutical institutions can lead to better performance in QSAR prediction; however, intellectual property and related financial interests still substantially hinder inter-institutional collaboration in QSAR modeling for drug discovery.

Results: For the first time, we verified the feasibility of applying horizontal federated learning (HFL), a recently developed collaborative and privacy-preserving learning framework, to QSAR analysis.

Benchmarking and integrating genome-wide CRISPR off-target detection and prediction.

Jifang Yan Dongyu Xue Guohui Chuai Yuli Gao Gongchen Zhang

Nucleic Acids Res

November 2020

Systematic evaluation of genome-wide Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) off-target profiles is a fundamental step for the successful application of the CRISPR system to clinical therapies. Many experimental techniques and in silico tools have been proposed for detecting and predicting genome-wide CRISPR off-target profiles. These techniques and tools, however, have not been systematically benchmarked.

Learning for single-cell assignment.

Bin Duan Chenyu Zhu Guohui Chuai Chen Tang Xiaohan Chen

Sci Adv

October 2020

Efficient single-cell assignment without prior marker gene annotations is essential for single-cell sequencing data analysis. Current methods, however, have limited effectiveness for distinct single-cell assignment. They failed to achieve a well-generalized performance in different tasks because of the inherent heterogeneity of different single-cell sequencing datasets and different single-cell types.

pTuneos: prioritizing tumor neoantigens from next-generation sequencing data.

Chi Zhou Zhiting Wei Zhanbing Zhang Biyu Zhang Chenyu Zhu Guohui Chuai

Genome Med

October 2019

Background: Cancer neoantigens are expressed only in cancer cells and presented on the tumor cell surface in complex with major histocompatibility complex (MHC) class I proteins for recognition by cytotoxic T cells. Accurate and rapid identification of neoantigens plays a pivotal role in cancer immunotherapy. Although several in silico tools for neoantigen prediction have been presented, these tools have limitations.

Data imbalance in CRISPR off-target prediction.

Yuli Gao Guohui Chuai Weichuan Yu Shen Qu Qi Liu

Brief Bioinform

July 2020

For genome-wide CRISPR off-target cleavage sites (OTS) prediction, an important issue is data imbalance: the number of true OTS recognized by whole-genome off-target detection techniques is much smaller than that of all possible nucleotide mismatch loci, making the training of machine learning models very challenging. Therefore, computational models proposed for OTS prediction and scoring should be carefully designed and properly evaluated in order to avoid bias. In our study, two tools are taken as examples to further emphasize the data imbalance issue in CRISPR off-target prediction to achieve better sensitivity and specificity for optimized CRISPR gene editing.
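
Why imbalance calls for careful evaluation can be shown with a small worked example. The sketch below uses synthetic labels (10 true sites among 1,000 candidate loci, numbers chosen purely for illustration and not taken from the study) and a trivial predictor that calls every locus negative. The function names are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summarize(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity; a ratio with a zero denominator is 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Synthetic, heavily imbalanced labels: 10 positives, 990 negatives.
y_true = [1] * 10 + [0] * 990
# A useless predictor that never reports an off-target site.
y_pred = [0] * 1000

print(summarize(y_true, y_pred))
# {'accuracy': 0.99, 'sensitivity': 0.0, 'specificity': 1.0}
```

The predictor reaches 99% accuracy while finding none of the true sites, which is why sensitivity and specificity need to be reported alongside, or instead of, accuracy on imbalanced data.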

DeepCRISPR: optimized CRISPR guide RNA design by deep learning.

Guohui Chuai Hanhui Ma Jifang Yan Ming Chen Nanfang Hong

Genome Biol

June 2018

A major challenge for effective application of CRISPR systems is to accurately predict the single guide RNA (sgRNA) on-target knockout efficacy and off-target profile, which would facilitate the optimized design of sgRNAs with high sensitivity and specificity. Here we present DeepCRISPR, a comprehensive computational platform to unify sgRNA on-target and off-target site prediction into one framework with deep learning, surpassing available state-of-the-art in silico tools. In addition, DeepCRISPR fully automates the identification of sequence and epigenetic features that may affect sgRNA knockout efficacy in a data-driven manner.

[In silico CRISPR-based sgRNA design].

Yuanli Wang Guohui Chuai Jifang Yan Lei Shi Qi Liu

Sheng Wu Gong Cheng Xue Bao

October 2017

CRISPR-based genome editing has been widely implemented in various cell types. In silico single guide RNA (sgRNA) design is a key step for successful gene editing using the CRISPR system. Continuing efforts are being made to refine in silico sgRNA design with high on-target efficacy and reduced off-target effects.

Benchmarking CRISPR on-target sgRNA design.

Jifang Yan Guohui Chuai Chi Zhou Chenyu Zhu Jing Yang

Brief Bioinform

July 2018

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based gene editing has been widely implemented in various cell types and organisms. A major challenge in the effective application of the CRISPR system is the need to design highly efficient single-guide RNA (sgRNA) with minimal off-target cleavage. Several tools are available for sgRNA design, but only a limited number of them have been compared.

MetaTopics: an integration tool to analyze microbial community profile by topic model.

Jifang Yan Guohui Chuai Tao Qi Fangyang Shao Chi Zhou

BMC Genomics

January 2017

Background: Deciphering taxonomical structures based on high-dimensional sequencing data is still challenging in metagenomic studies. Moreover, the common workflow used in this field fails to identify microbial communities and their effect on a specific disease status. Even the relationships and interactions between different bacteria in a microbial community remain unknown.

In Silico Meets In Vivo: Towards Computational CRISPR-Based sgRNA Design.

Guo-Hui Chuai Qi-Long Wang Qi Liu

Trends Biotechnol

January 2017

CRISPR-based genome editing has been widely implemented in various cell types. In silico single guide RNA (sgRNA) design is a key step for successful gene editing using the CRISPR system, and continuing efforts are aimed at refining in silico sgRNA design with high on-target efficacy and reduced off-target effects. Many sgRNA design tools are available, but careful assessments of their application scenarios and performance benchmarks across different types of genome-editing data are needed.

Deciphering relationship between microhomology and in-frame mutation occurrence in human CRISPR-based gene knockout.

Guohui Chuai Fayu Yang Jifang Yan Yanan Chen Qin Ma

Mol Ther Nucleic Acids

June 2016
