Publications by authors named "Xiaochen Bo"

The human body contains trillions of cells, classified into specific cell types, with diverse morphologies and functions. In addition, cells of the same type can assume different states within an individual's body during their lifetime. Understanding the complexities of the proteome in the context of a human organism and its many potential states is a necessary requirement to understanding human biology, but these complexities can neither be predicted from the genome, nor have they been systematically measurable with available technologies.

Background: Continuing medical education in liver transplantation is pivotal in enhancing the proficiency of liver surgeons. The goal of this study is to obtain information on all aspects of the training, pinpoint its strengths, and address any shortcomings or challenges.

Method: We conducted an online questionnaire survey comprising 33 questions, with response options in the form of "yes/no", single choice, or multiple choice.

In mammals, during the post-fertilization pre-implantation phase, the expression of cell type-specific genes, which is regulated by cis-regulatory elements (CREs), is crucial for normal embryonic development. Transcription factors (TFs) control gene expression by interacting with CREs. Research shows that transcription factor binding sites (TFBSs) reflect the general characteristics of the regulatory genome.

Article Synopsis
  • The study focuses on improving the prediction of synthetic lethality (SL) interactions for targeted cancer therapies by developing a new model called SLGRN, which considers genetic context.
  • SLGRN uses a Graph Recurrent Network to create low-dimensional feature representations for nodes, enhancing its ability to identify novel SL interactions more accurately than existing models.
  • The model's effectiveness is validated through experiments and clinical analyses, highlighting its potential impact on personalized cancer treatment strategies.

Predicting protein-ligand binding sites is an integral part of structural biology and drug design. A comprehensive understanding of these binding sites is essential for advancing drug innovation, elucidating mechanisms of biological function, and exploring the nature of disease. However, accurately identifying protein-ligand binding sites remains a challenging task.

Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models for classifying target and nontarget DNA provide large speed advantages. However, the relatively low accuracy of these DL-based tools hinders their application in nanopore selective sequencing.

The assessment of mutagenicity is essential in drug discovery, as mutagenicity may lead to cancer and germ cell damage. Although in silico methods have been proposed for mutagenicity prediction, their performance is hindered by the scarcity of labeled molecules. However, experimental mutagenicity testing can be time-consuming and costly.

Cancer is rarely the straightforward consequence of an abnormality in a single gene, but rather reflects a complex interplay of many genes, represented as gene modules. Here, we leverage recent advances in model-agnostic interpretation approaches and develop CGMega, an explainable, graph attention-based deep learning framework for cancer gene module dissection. CGMega outperforms current approaches in cancer gene prediction, and it provides a promising approach to integrate multi-omics information.

Advances in chromatin mapping have exposed the complex chromatin hierarchical organization in mammals, including topologically associating domains (TADs) and their substructures, yet the functional implications of this hierarchy in gene regulation and disease progression are not fully elucidated. Our study delves into the phenomenon of shared TAD boundaries, which are pivotal in maintaining the hierarchical chromatin structure and regulating gene activity. By integrating high-resolution Hi-C data, chromatin accessibility, and DNA double-strand breaks (DSBs) data from various cell lines, we systematically explore the complex regulatory landscape at high-level TAD boundaries.

Gene expression is temporally and spatially regulated by the interaction of transcription factors (TFs) and cis-regulatory elements (CREs). The uneven distribution of TF binding sites across the genome poses challenges in understanding how this distribution evolves to regulate spatio-temporal gene expression and consequent heritable phenotypic variation. In this study, chromatin accessibility profiles and gene expression profiles were collected from several species including mammals (human, mouse, bovine), fish (zebrafish and medaka), and chicken.

Topologically associating domains (TADs), megabase-scale features of chromatin spatial architecture, are organized in a domain-within-domain TAD hierarchy. Within TADs, the inner and smaller subTADs not only manifest cell-to-cell variability, but also precisely regulate transcription and differentiation. Although over 20 TAD callers are able to detect TADs, their usability in biomedicine is constrained by disagreement among their outputs and a limited understanding of TAD hierarchy.

The prediction of binding affinity changes caused by missense mutations can elucidate antigen-antibody interactions. A few accessible structure-based online computational tools have been proposed. However, selecting suitable software for particular research is challenging, especially research on the SARS-CoV-2 spike protein with antibodies.

Motivation: Clustering analysis for single-cell RNA sequencing (scRNA-seq) data is an important step in revealing cellular heterogeneity. Many clustering methods have been proposed to discover heterogeneous cell types from scRNA-seq data. However, adaptive clustering of large-scale scRNA-seq data, with an accurate cluster number that reflects the intrinsic biology, remains quite challenging.

Detecting changes in the dynamics of secreted proteins in serum has been a challenge for proteomics. Enter secreted protein database (SEPDB), an integrated secretory proteomics database offering human, mouse and rat secretory proteomics datasets collected from serum, exosomes and cell culture media. SEPDB compiles secreted protein information from secreted protein database, UniProt and Human Protein Atlas databases to annotate secreted proteomics data based on protein subcellular localization and disease markers.

Integrating single-cell datasets produced by multiple omics technologies is essential for defining cellular heterogeneity. Mosaic integration, in which different datasets share only some of the measured modalities, poses major challenges, particularly regarding modality alignment and batch effect removal. Here, we present a deep probabilistic framework for the mosaic integration and knowledge transfer (MIDAS) of single-cell multimodal data.

Chromatin accessibility plays important roles in revealing the regulatory networks of gene expression, while its application in bladder cancer is yet to be fully elucidated. Chloride intracellular channel 3 (CLIC3) protein has been reported to be associated with the progression of some tumors, whereas the specific mechanism of CLIC3 in tumors remains unclear. Here, we screened for key genes in bladder cancer through the identification of transcription factor binding site clustered regions (TFCRs) on the basis of chromatin accessibility and TF motifs.

Recurrence and extraocular metastasis in advanced intraocular retinoblastoma (RB) are still major obstacles for successful treatment of Chinese children. Tuberous sclerosis complex (TSC) is a very rare, multisystemic genetic disorder characterized by hamartomatous growth. In this study, we aimed to compare genomic and epigenomic profiles with human RB or TSC using recently developed nanopore sequencing, and to identify disease-associated variations or genes.

Article Synopsis
  • The research involved sequencing samples from 320 Tibetan and Han individuals, highlighting the significant role of SVs in regulating biological functions for high-altitude survival.
  • Findings include a unique Tibetan deletion that disrupts a super-enhancer affecting the EPAS1 gene, showcasing how SVs contribute to adaptation and regulation in humans.

Introduction: Synthetic lethality (SL) provides an opportunity to leverage different genetic interactions when designing synergistic combination therapies. To further explore SL-based combination therapies for cancer treatment, it is important to identify and mechanistically characterize more SL interactions. Artificial intelligence (AI) methods have recently been proposed for SL prediction, but the results of these models are often not interpretable, so deriving the underlying mechanism can be challenging.

Summary: A variety of computational methods have been developed to identify functionally related gene modules from genome-wide gene expression profiles. Integrating the results of these methods to identify consensus modules is a promising approach to produce more accurate and robust results. In this application note, we introduce COMMO, the first web server to identify and analyze consensus functionally related gene modules from different module detection methods.

Cellular transitions hold great promise in translational medicine research. However, therapeutic applications are limited by the low efficiency and safety concerns of using transcription factors. Small molecules provide a temporal and highly tunable approach to overcome these issues.

Early embryonic development is a dynamic process that relies on proper cell-cell communication to form a correctly patterned embryo. Early embryo development-related ligand-receptor pairs (eLRs) have been shown to guide cell fate decisions and morphogenesis. However, the scope of eLRs and their influence on early embryo development remain elusive.

Introduction: There are countless possibilities for drug combinations, which makes it expensive and time-consuming to rely solely on clinical trials to determine the effects of each possible drug combination. To screen out the most effective drug combinations more quickly, researchers have begun to apply machine learning to drug combination prediction. However, most of these methods have low interpretability.

Cancer is an extremely complex disease and each type of cancer usually has several different subtypes. Multi-omics data can provide more comprehensive biological information for identifying and discovering cancer subtypes. However, existing unsupervised cancer subtyping methods cannot effectively learn comprehensive shared and specific information of multi-omics data.
