Background: The encoding of cell-intrinsic drug resistance states in breast cancer reflects the contributions of genomic and non-genomic variations and requires accurate estimation of clonal fitness from co-measurement of transcriptomic and genomic data. Somatic copy number (CN) variation is the dominant mutational mechanism leading to transcriptional variation and notably contributes to platinum chemotherapy resistance cell states. Here, we deploy time-series measurements of triple-negative breast cancer (TNBC) single-cell transcriptomes, along with co-measured single-cell CN fitness, identifying genomic and transcriptomic mechanisms in drug-associated transcriptional cell states.
How cell-to-cell copy number alterations that underpin genomic instability in human cancers drive genomic and phenotypic variation, and consequently the evolution of cancer, remains understudied. Here, by applying scaled single-cell whole-genome sequencing to wild-type, TP53-deficient and TP53-deficient;BRCA1-deficient or TP53-deficient;BRCA2-deficient mammary epithelial cells (13,818 genomes), and to primary triple-negative breast cancer (TNBC) and high-grade serous ovarian cancer (HGSC) cells (22,057 genomes), we identify three distinct 'foreground' mutational patterns that are defined by cell-to-cell structural variation. Cell- and clone-specific high-level amplifications, parallel haplotype-specific copy number alterations and copy number segment length variation (serrate structural variations) had measurable phenotypic and evolutionary consequences.
Assessing tumour gene fitness in physiologically-relevant model systems is challenging due to biological features of in vivo tumour regeneration, including extreme variations in single cell lineage progeny. Here we develop a reproducible, quantitative approach to pooled genetic perturbation in patient-derived xenografts (PDXs), by encoding single cell output from transplanted CRISPR-transduced cells in combination with a Bayesian hierarchical model. We apply this to 181 PDX transplants from 21 breast cancer patients.
Progress in defining genomic fitness landscapes in cancer, especially those defined by copy number alterations (CNAs), has been impeded by lack of time-series single-cell sampling of polyclonal populations and temporal statistical models. Here we generated 42,000 genomes from multi-year time-series single-cell whole-genome sequencing of breast epithelium and primary triple-negative breast cancer (TNBC) patient-derived xenografts (PDXs), revealing the nature of CNA-defined clonal fitness dynamics induced by TP53 mutation and cisplatin chemotherapy. Using a new Wright-Fisher population genetics model to infer clonal fitness, we found that TP53 mutation alters the fitness landscape, reproducibly distributing fitness over a larger number of clones associated with distinct CNAs.
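To make the Wright-Fisher idea concrete, the following is a minimal forward-simulation sketch, not the inference model described in the abstract. It assumes a fixed population size, non-overlapping generations, and a hypothetical per-clone selection coefficient `s` (relative fitness `1 + s`); the function names and parameter values are illustrative only.

```python
import random


def wright_fisher_step(counts, selection, rng):
    """Resample one generation: each of the N cells picks a parent clone
    with probability proportional to clone count * (1 + s)."""
    population_size = sum(counts)
    weights = [c * (1.0 + s) for c, s in zip(counts, selection)]
    parents = rng.choices(range(len(counts)), weights=weights, k=population_size)
    new_counts = [0] * len(counts)
    for clone in parents:
        new_counts[clone] += 1
    return new_counts


def simulate(counts, selection, generations, seed=0):
    """Return the list of clone counts at every generation, including the start."""
    rng = random.Random(seed)
    trajectory = [list(counts)]
    for _ in range(generations):
        counts = wright_fisher_step(counts, selection, rng)
        trajectory.append(counts)
    return trajectory


if __name__ == "__main__":
    # Two clones starting at equal abundance; clone 1 has a hypothetical 20% advantage.
    for generation, state in enumerate(simulate([500, 500], [0.0, 0.2], 30)):
        if generation % 10 == 0:
            print(generation, state)
```

Inferring fitness from observed time-series clone abundances amounts to running this logic in reverse: finding the selection coefficients under which the observed trajectories are most probable.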
CX-5461 is a G-quadruplex (G4) ligand currently in trials with initial indications of clinical activity in cancers with defects in homologous recombination repair. To identify more genetic defects that could sensitize tumors to CX-5461, we tested synthetic lethality for 480 DNA repair and genome maintenance genes to CX-5461, pyridostatin (PDS), a structurally unrelated G4-specific stabilizer, and BMH-21, which binds GC-rich DNA but not G4 structures. We identified multiple members of HRD, Fanconi Anemia pathways, and POLQ, a polymerase with a helicase domain important for G4 structure resolution.
We present Epiclomal, a probabilistic clustering method arising from a hierarchical mixture model to simultaneously cluster sparse single-cell DNA methylation data and impute missing values. Using synthetic and published single-cell CpG datasets, we show that Epiclomal outperforms non-probabilistic methods and can handle the inherent missing data characteristic that dominates single-cell CpG genome sequences. Using newly generated single-cell 5mCpG sequencing data, we show that Epiclomal discovers sub-clonal methylation patterns in aneuploid tumour genomes, thus defining epiclones that can match or transcend copy number-determined clonal lineages and opening up an important form of clonal analysis in cancer.
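The idea of clustering sparse binary methylation calls while imputing missing values can be illustrated with a deliberately simplified sketch. This is not Epiclomal's implementation: it assumes the cluster methylation profiles are already known, uses a single hypothetical error rate `error`, and makes a hard assignment rather than fitting a hierarchical mixture model. Missing CpG calls are represented as `None` and simply contribute nothing to the likelihood.

```python
import math


def log_likelihood(cell, profile, error=0.05):
    """Log-probability of a cell's observed CpG calls under one cluster profile.
    Missing calls (None) are skipped, so sparse cells are scored on what they have."""
    total = 0.0
    for observed, expected in zip(cell, profile):
        if observed is None:
            continue
        total += math.log(1.0 - error if observed == expected else error)
    return total


def assign_cluster(cell, profiles, error=0.05):
    """Index of the cluster profile with the highest log-likelihood for this cell."""
    scores = [log_likelihood(cell, profile, error) for profile in profiles]
    return max(range(len(profiles)), key=scores.__getitem__)


def impute(cell, profile):
    """Fill missing calls with the methylation state of the assigned cluster."""
    return [expected if observed is None else observed
            for observed, expected in zip(cell, profile)]


if __name__ == "__main__":
    profiles = [[1, 1, 0, 0], [0, 0, 1, 1]]   # hypothetical cluster profiles
    cell = [1, None, 0, None]                 # sparse single-cell calls
    k = assign_cluster(cell, profiles)
    print(k, impute(cell, profiles[k]))
```

A full mixture model would additionally learn the profiles and cluster weights from the data, typically by alternating between soft assignments and profile updates.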
Accurate measurement of clonal genotypes, mutational processes, and replication states from individual tumor-cell genomes will facilitate improved understanding of tumor evolution. We have developed DLP+, a scalable single-cell whole-genome sequencing platform implemented using commodity instruments, image-based object recognition, and open source computational methods. Using DLP+, we have generated a resource of 51,926 single-cell genomes and matched cell images from diverse cell types including cell lines, xenografts, and diagnostic samples with limited material.
Background: Single-cell RNA sequencing (scRNA-seq) is a powerful tool for studying complex biological systems, such as tumor heterogeneity and tissue microenvironments. However, the sources of technical and biological variation in primary solid tumor tissues and patient-derived mouse xenografts for scRNA-seq are not well understood.
Results: We use low temperature (6 °C) protease and collagenase (37 °C) to identify the transcriptional signatures associated with tissue dissociation across a diverse scRNA-seq dataset comprising 155,165 cells from patient cancer tissues, patient-derived breast cancer xenografts, and cancer cell lines.
Measuring gene expression of tumor clones at single-cell resolution links functional consequences to somatic alterations. Without scalable methods to simultaneously assay DNA and RNA from the same single cell, parallel single-cell DNA and RNA measurements from independent cell populations must be mapped for genome-transcriptome association. We present clonealign, which assigns gene expression states to cancer clones using single-cell RNA and DNA sequencing independently sampled from a heterogeneous population.
The genomes of large numbers of single cells must be sequenced to further understanding of the biological significance of genomic heterogeneity in complex systems. Whole genome amplification (WGA) of single cells is generally the first step in such studies, but is prone to nonuniformity that can compromise genomic measurement accuracy. Despite recent advances, robust performance in high-throughput single-cell WGA remains elusive.
Single-cell DNA sequencing has great potential to reveal the clonal genotypes and population structure of human cancers. However, single-cell data suffer from missing values and biased allelic counts as well as false genotype measurements owing to the sequencing of multiple cells. We describe the Single Cell Genotyper (https://bitbucket.
We performed phylogenetic analysis of high-grade serous ovarian cancers (68 samples from seven patients), identifying constituent clones and quantifying their relative abundances at multiple intraperitoneal sites. Through whole-genome and single-nucleus sequencing, we identified evolutionary features including mutation loss, convergence of the structural genome and temporal activation of mutational processes that patterned clonal progression. We then determined the precise clonal mixtures comprising each tumor sample.
Human cancers, including breast cancers, comprise clones differing in mutation content. Clones evolve dynamically in space and time following principles of Darwinian evolution, underpinning important emergent features such as drug resistance and metastasis. Human breast cancer xenoengraftment is used as a means of capturing and studying tumour biology, and breast tumour xenografts are generally assumed to be reasonable models of the originating tumours.
The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event.
We introduce PyClone, a statistical model for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy-number changes and normal-cell contamination. Single-cell sequencing validation demonstrates PyClone's accuracy.
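The adjustment for copy number and normal-cell contamination can be illustrated with a simple point estimate. The sketch below is not PyClone's Bayesian model: it assumes a diploid normal genome, that all tumour cells share the same total copy number at the locus, and that the number of mutated copies per mutated cell (`multiplicity`) is known. All names are hypothetical.

```python
def expected_vaf(prevalence, purity, total_cn, multiplicity, normal_cn=2):
    """Expected variant allele fraction for a mutation carried by a fraction
    `prevalence` of tumour cells, in a sample with tumour content `purity`."""
    mutated_copies = purity * prevalence * multiplicity
    all_copies = (1.0 - purity) * normal_cn + purity * total_cn
    return mutated_copies / all_copies


def cellular_prevalence(vaf, purity, total_cn, multiplicity, normal_cn=2):
    """Invert expected_vaf: point estimate of the fraction of tumour cells
    carrying the mutation, capped at 1.0 to absorb sampling noise."""
    all_copies = (1.0 - purity) * normal_cn + purity * total_cn
    return min(1.0, vaf * all_copies / (purity * multiplicity))


if __name__ == "__main__":
    # Heterozygous mutation in a pure, diploid tumour observed at VAF 0.25.
    print(cellular_prevalence(vaf=0.25, purity=1.0, total_cn=2, multiplicity=1))
```

A clustering method built on this relation would treat the prevalence as a latent variable shared by mutations in the same cluster, rather than computing it one mutation at a time.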