Clustering cells and depicting the lineage relationship among cell subpopulations are fundamental tasks in single-cell omics studies. However, existing analytical methods face challenges in stratifying cells, tracking cellular trajectories, and identifying critical points of cell transitions. To overcome these, we proposed a novel Markov hierarchical clustering algorithm (MarkovHC), a topological clustering method that leverages the metastability of exponentially perturbed Markov chains for systematically reconstructing the cellular landscape.
Dynamic models of gene expression are urgently required. In this paper, we describe the time evolution of gene expression by learning a jump diffusion process to model the biological process directly. Our algorithm needs aggregate gene expression data as input and outputs the parameters of the jump diffusion process.
Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to the genome-wide detection of copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute.
In human cells, DNA is hierarchically organized and assembled with histones and DNA-binding proteins in three dimensions. Chromatin interactions play important roles in genome architecture and gene regulation, including robustness in the developmental stages and flexibility during the cell cycle. Here we propose an in situ Hi-C method named Bridge Linker-Hi-C (BL-Hi-C) for capturing structural and regulatory chromatin interactions by restriction enzyme targeting and two-step proximity ligation.
Adaptation is a crucial biological function possessed by many sensory systems. Early work has shown that some influential equilibrium models can achieve accurate adaptation. However, recent studies indicate that there are close relationships between adaptation and nonequilibrium.
Background: RNA-Seq is a powerful new technology to comprehensively analyze the transcriptome of any given cell. An important task in RNA-Seq data analysis is quantifying the expression levels of all transcripts. Although many methods have been introduced and much progress has been made, a satisfactory solution remains elusive.
The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates.
The inositol trisphosphate receptor (IPR) is a crucial ion channel that regulates the Ca²⁺ flux from the endoplasmic reticulum (ER) to the cytoplasm. A thorough study of the IPR channel contributes to a better understanding of calcium oscillations and waves. It has long been observed that the IPR channel is a typical biological system that exhibits adaptation.
A number of biological systems can be modelled by Markov chains. Recently, there has been growing interest in when biological systems modelled by Markov chains exhibit a dynamic phenomenon called overshoot. In this study, the authors found that the steady-state behaviour of the system strongly affects the occurrence of overshoot.
Discovering the regulation of cancer-related genes is of great importance in cancer biology. Transcription factors and microRNAs are two kinds of crucial regulators in gene expression, and they compose a combinatorial regulatory network with their target genes. Revealing the structure of this network could improve the authors' understanding of gene regulation, and help explore the molecular pathways in cancer.
DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation.
Cancer stem cell (CSC) theory suggests a cell-lineage structure in tumor cells in which CSCs are capable of giving rise to the other non-stem cancer cells (NSCCs) but not vice versa. However, an alternative scenario of bidirectional interconversions between CSCs and NSCCs was proposed very recently. Here we present a general population model of cancer cells by integrating conventional cell divisions with direct conversions between different cell states: not only can CSCs differentiate into NSCCs by asymmetric cell division, but NSCCs can also dedifferentiate into CSCs by cell state conversion.
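As a rough illustration of the kind of population model this abstract describes, the Python sketch below tracks CSC and NSCC numbers in a linear two-compartment model. The rate names (`a`, `b`, `c`, `d`, `k`), their values, and the function `simulate_csc_fraction` are hypothetical choices made for illustration; they are not taken from the paper, whose actual model may differ.

```python
def simulate_csc_fraction(s0, n0, a, b, c, d, k, dt=0.01, steps=5000):
    """Return the CSC fraction after forward-Euler integration.

    Assumed (hypothetical) linear dynamics, with S = CSCs and N = NSCCs:
        dS/dt = a*S + k*N            # symmetric CSC division, dedifferentiation
        dN/dt = b*S + (c - d - k)*N  # asymmetric CSC division, NSCC division/death/conversion
    """
    s, n = float(s0), float(n0)
    for _ in range(steps):
        ds = (a * s + k * n) * dt
        dn = (b * s + (c - d - k) * n) * dt
        s, n = s + ds, n + dn
        # The model is linear, so rescaling both compartments leaves the
        # CSC fraction unchanged while keeping the numbers bounded.
        total = s + n
        s, n = s / total, n / total
    return s


if __name__ == "__main__":
    rates = dict(a=0.5, b=0.5, c=1.0, d=0.8)
    # Start from a pure NSCC population, with and without dedifferentiation.
    print("k = 0.0:", simulate_csc_fraction(0, 1, k=0.0, **rates))
    print("k = 0.1:", simulate_csc_fraction(0, 1, k=0.1, **rates))
    # Start from a pure CSC population for comparison.
    print("k = 0.1, pure CSC start:", simulate_csc_fraction(1, 0, k=0.1, **rates))
```

In this sketch, a pure NSCC population can regain a CSC compartment only when the conversion rate `k` is positive, and with `k > 0` the long-run CSC fraction no longer depends on the starting composition.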
Gene regulation is a key factor in gaining a full understanding of molecular biology. microRNA (miRNA), a novel class of non-coding RNA, has recently been found to be one crucial class of post-transcriptional regulators and to play important roles in cancer. One essential step to understand the regulatory effect of miRNAs is the reliable prediction of their target mRNAs.
Background: Microarray technology is widely utilized for monitoring the expression changes of thousands of genes simultaneously. However, the requirement of a relatively large amount of RNA for labeling and hybridization makes it difficult to perform microarray experiments with limited biological materials, which has led to the development of many methods for preparing and amplifying mRNA. It has been noted that amplification methods usually introduce bias, which may strongly hamper the subsequent interpretation of the results.
Background: Somatic cells can be reprogrammed to induced-pluripotent stem cells (iPSCs) by introducing a few reprogramming factors, which challenges the long-held view that cell differentiation is irreversible. However, the mechanism of induced pluripotency is still unknown.
Methods: Inspired by the phenomenological reprogramming model of Artyomov et al. (2010), we proposed a novel Markov model, the stepwise reprogramming Markov (SRM) model, with simpler gene regulation rules and explored various properties of the model with Monte Carlo simulation.
Etiologic diagnoses of lower respiratory tract infections (LRTI) have relied primarily on bacterial cultures that often fail to return useful results in time. Although DNA-based assays are more sensitive than bacterial cultures in detecting pathogens, the molecular results are often inconsistent and challenged by doubts about false positives, such as those due to system- and environment-derived contaminations. Here we report a nationwide cohort study on 2986 suspected LRTI patients across P.
Background: Copy number variation (CNV) is essential to understand the pathology of many complex diseases at the DNA level. Affymetrix SNP arrays, which are widely used for CNV studies, significantly depend on accurate copy number (CN) estimation. Nevertheless, CN estimation may be biased by several factors, including cross-hybridization and training sample batch, as well as genomic waves of intensities induced by sequence-dependent hybridization rate and amplification efficiency.
Phosphorylation and transcriptional regulation events are critical for cells to transmit and respond to signals. In spite of their importance, systems-level strategies that couple these two networks have yet to be presented. Here we introduce a novel approach that integrates the physical and functional aspects of the phosphorylation network together with the transcription network in S.
Front Biosci (Elite Ed)
January 2012
The goal of network clustering algorithms is to detect dense clusters in a network and thereby provide a first step towards the understanding of large-scale biological networks. With numerous recent advances in biotechnologies, large-scale genetic interactions are widely available, but there is a limited understanding of which clustering algorithms may be most effective. In order to address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms in analyzing genetic interaction networks, and investigated influencing factors in choosing algorithms.
In this paper, we perform a complete analysis of the kinetic behavior of the general modifier mechanism of Botts and Morales in both equilibrium steady states and non-equilibrium steady states (NESS). Inspired by the non-equilibrium theory of Markov chains, we introduce the net flux into the discussion and obtain an expression for the rate of product formation in NESS, which has clear biophysical significance. Until now, it has been generally believed that being an activator or an inhibitor is an intrinsic property of the modifier.
Background: Cellular functions depend on genetic, physical and other types of interactions. As such, derived interaction networks can be utilized to discover novel genes involved in specific biological processes. Epistatic Miniarray Profile, or E-MAP, which is an experimental platform that measures genetic interactions on a genome-wide scale, has successfully recovered known pathways and revealed novel protein complexes in Saccharomyces cerevisiae (budding yeast).
Motivation: Epistatic Miniarray Profiles (EMAP) have enabled the mapping of large-scale genetic interaction networks; however, the quantitative information gained from EMAP cannot be fully exploited since the data are usually interpreted as a discrete network based on an arbitrary hard threshold. To address such limitations, we adopted a mixture modeling procedure to construct a probabilistic genetic interaction network and then implemented a Bayesian approach to identify densely interacting modules in the probabilistic network.
Results: Mixture modeling has been demonstrated to be an effective soft-threshold technique for EMAP measures.
Background: Gene regulation is a key factor in gaining a full understanding of molecular biology. Cis-regulatory modules (CRMs), consisting of multiple transcription factor binding sites, have been confirmed as the main regulators in gene expression. In recent years, a novel regulator known as microRNA (miRNA) has been found to play an important role in gene regulation.
Nucleic Acids Res
September 2009
Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to the modeling of hybridization of probes to off-target allele sequences, which can affect the accuracy greatly. In this study, we address this issue and demonstrate that hybridization with mismatch nucleotides (HWMMN) occurs in all SNP probe-sets and has a critical effect on the estimation of allelic concentrations (ACs).