Chromosomal rearrangements can initiate and drive cancer progression, yet it has been challenging to evaluate their impact, especially in genetically heterogeneous solid cancers. To address this problem, we developed HiDENSEC, a new computational framework for analyzing chromatin conformation capture in heterogeneous samples that can infer somatic copy number alterations, characterize large-scale chromosomal rearrangements, and estimate cancer cell fractions. After validating HiDENSEC with in silico and in vitro controls, we used it to characterize chromosome-scale evolution during melanoma progression in formalin-fixed tumor samples from three patients.
There is significant interest in developing machine learning methods to model protein-ligand interactions but a scarcity of experimentally resolved protein-ligand structures to learn from. Protein self-contacts are a much larger source of structural data that could be leveraged, but currently it is not well understood how this data source differs from the target domain. Here, we characterize the 3D geometric patterns of protein self-contacts as probability distributions.
Direct comparison of bulk gene expression profiles is complicated by distinct cell type mixtures in each sample that obscure whether observed differences are actually caused by changes in the expression levels themselves or are simply a result of differing cell type compositions. Single-cell technology has made it possible to measure gene expression in individual cells, achieving higher resolution at the expense of increased noise. If carefully incorporated, such single-cell data can be used to deconvolve bulk samples to yield accurate estimates of the true cell type proportions, thus enabling one to disentangle the effects of differential expression and cell type mixtures.
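The deconvolution idea can be illustrated with a minimal sketch. It assumes the simplest linear mixing model, in which a bulk profile is a non-negative combination of per-cell-type reference profiles, and it solves the resulting non-negative least squares problem with multiplicative updates. This is a generic textbook approach, not necessarily the method of the article; the function name `deconvolve`, the toy signature matrix, and the proportions are all hypothetical.

```python
def deconvolve(signatures, bulk, n_iter=500):
    """Estimate cell type proportions p >= 0 with signatures * p ~ bulk.

    signatures: list of rows, one per gene; each row holds the mean
                expression of that gene in every cell type (a hypothetical
                reference, e.g. averaged from single-cell data).
    bulk:       list of bulk expression values, one per gene.
    Assumes all entries are strictly positive so no division by zero occurs.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])

    # Precompute S^T b and S^T S, the only quantities the update needs.
    stb = [sum(signatures[g][k] * bulk[g] for g in range(n_genes))
           for k in range(n_types)]
    sts = [[sum(signatures[g][j] * signatures[g][k] for g in range(n_genes))
            for k in range(n_types)]
           for j in range(n_types)]

    # Start from a uniform mixture; multiplicative updates keep p non-negative.
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        denom = [sum(sts[k][j] * p[j] for j in range(n_types))
                 for k in range(n_types)]
        p = [p[k] * stb[k] / denom[k] for k in range(n_types)]

    # Rescale so the estimates can be read as proportions.
    total = sum(p)
    return [x / total for x in p]


# Toy example: 4 genes, 3 cell types, noise-free bulk sample.
toy_signatures = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 2.0],
    [2.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_proportions = [0.5, 0.3, 0.2]
toy_bulk = [sum(s * p for s, p in zip(row, true_proportions))
            for row in toy_signatures]

estimated = deconvolve(toy_signatures, toy_bulk)
print([round(x, 4) for x in estimated])
```

In this noise-free toy case the estimate should approach the proportions used to build the bulk sample. Real single-cell references are noisy, which is the difficulty the abstract highlights, so this sketch only shows the structure of the problem.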
The totally asymmetric simple exclusion process (TASEP), which describes the stochastic dynamics of interacting particles on a lattice, has been actively studied over the past several decades and applied to model important biological transport processes. Here, we present a software package, called EGGTART (Extensive GUI gives TASEP-realization in Real Time), which quantifies and visualizes the dynamics associated with a generalized version of the TASEP with an extended particle size and heterogeneous jump rates. This computational tool is based on analytic formulas obtained from deriving and solving the hydrodynamic limit of the process.
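For orientation, the sketch below evaluates the standard current-density relation for a TASEP with extended particles in the special case of a homogeneous jump rate, J(ρ) = rate · ρ(1 − ℓρ) / (1 − (ℓ − 1)ρ), where ℓ is the particle size in lattice sites and ρ is the number of particles per site. This is only the textbook homogeneous case, not the heterogeneous-rate formulas implemented in EGGTART, and the function names are hypothetical.

```python
import math


def tasep_current(density, size=1, rate=1.0):
    """Stationary current of a homogeneous TASEP with particles of `size` sites.

    `density` is the number of particles per lattice site, so it cannot
    exceed 1 / size (a fully packed lattice).
    """
    if not 0.0 <= density <= 1.0 / size:
        raise ValueError("density must lie in [0, 1/size]")
    return rate * density * (1.0 - size * density) / (1.0 - (size - 1) * density)


def optimal_density(size=1):
    """Density that maximizes the current for the given particle size."""
    root = math.sqrt(size)
    return 1.0 / (root * (1.0 + root))


def max_current(size=1, rate=1.0):
    """Maximal current, attained at optimal_density(size)."""
    return rate / (1.0 + math.sqrt(size)) ** 2


# Size 1 recovers the classical TASEP: maximal current 1/4 at density 1/2.
print(tasep_current(0.5), max_current(1))

# A particle covering 10 sites (an illustrative, assumed size).
print(optimal_density(10), max_current(10))
```

Setting `size=1` reduces the relation to the familiar J = ρ(1 − ρ), which is a quick sanity check on the formula.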
Translation of mRNA into protein is a fundamental yet complex biological process with multiple factors that can potentially affect its efficiency. Here, we study a stochastic model describing the traffic flow of ribosomes along the mRNA and identify the key parameters that govern the overall rate of protein synthesis, sensitivity to initiation rate changes, and efficiency of ribosome usage. By analyzing a continuum limit of the model, we obtain closed-form expressions for stationary currents and ribosomal densities, which agree well with Monte Carlo simulations.
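The comparison between closed-form currents and Monte Carlo simulation can be sketched for the simplest related model: an open-boundary TASEP with particles occupying a single site, a unit hopping rate, entry rate `alpha`, and exit rate `beta`. This is a deliberate simplification of the ribosome model in the abstract, which involves extended particles and further parameters. The function name `simulate_tasep` and all parameter values are hypothetical.

```python
import random


def simulate_tasep(n_sites, alpha, beta, n_sweeps, burn_in, seed=0):
    """Estimate the stationary current of an open-boundary TASEP.

    Uses random-sequential updates: each sweep makes n_sites + 1 attempts,
    each picking one of the possible moves (entry, a bulk hop, or exit)
    uniformly at random. Returns the number of exits per sweep measured
    after the burn-in period.
    """
    rng = random.Random(seed)
    lattice = [0] * n_sites
    exits = 0
    for sweep in range(burn_in + n_sweeps):
        for _ in range(n_sites + 1):
            i = rng.randrange(n_sites + 1)
            if i == 0:
                # Entry: a particle enters the first site at rate alpha.
                if lattice[0] == 0 and rng.random() < alpha:
                    lattice[0] = 1
            elif i == n_sites:
                # Exit: a particle leaves the last site at rate beta.
                if lattice[-1] == 1 and rng.random() < beta:
                    lattice[-1] = 0
                    if sweep >= burn_in:
                        exits += 1
            elif lattice[i - 1] == 1 and lattice[i] == 0:
                # Bulk hop from site i - 1 to site i at rate 1.
                lattice[i - 1] = 0
                lattice[i] = 1
    return exits / n_sweeps


# Entry-limited regime (alpha < 1/2 and alpha < beta): for long lattices the
# known stationary current of this simple model is alpha * (1 - alpha).
alpha, beta = 0.3, 0.8
simulated = simulate_tasep(n_sites=40, alpha=alpha, beta=beta,
                           n_sweeps=10000, burn_in=1000)
print(simulated, alpha * (1 - alpha))
```

In the entry-limited regime the simulated value should lie close to alpha * (1 - alpha), up to statistical noise and small finite-size effects. Longer runs or averaging over several seeds would tighten the comparison.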