A well-defined risk factor and precursor for cutaneous melanoma is the dysplastic nevus. These benign tumors represent clonal hyperproliferation of melanocytes that are in a senescent-like state, but with occasional malignant transformation events. To portray the mutational repertoire of dysplastic nevi in patients with the dysplastic nevus syndrome and to determine the discriminatory profiles of melanocytic nevi (including dysplastic nevi) from melanoma, we sequenced exomes of melanocytic nevi including dysplastic nevi (n = 19), followed by a targeted gene panel (785 genes) characterization of melanocytic nevi (n = 46) and primary melanomas (n = 42).
Meiotic recombination is a fundamental evolutionary process driving diversity in eukaryotes. In mammals, recombination is known to occur preferentially at specific genomic regions. Using topological data analysis (TDA), a branch of applied topology that extracts global features from large data sets, we developed an efficient method for mapping recombination at fine scales.
Despite large-scale cancer genomics studies, key somatic mutations driving cancer, and their functional roles, remain elusive. Here, we propose that analysis of comorbidities of Mendelian diseases with cancers provides a novel, systematic way to discover new cancer genes. If germline genetic variation in Mendelian loci predisposes bearers to common cancers, the same loci may harbour cancer-associated somatic variation.
Nanopore sequencing promises long read-lengths and single-molecule resolution, but the stochastic motion of the DNA molecule inside the pore is, as of this writing, a barrier to high accuracy reads. We develop a method of statistical inference that explicitly accounts for this error, and demonstrate that high accuracy (>99%) sequence inference is feasible even under highly diffusive motion by using a hidden Markov model to jointly analyze multiple stochastic reads. Using this model, we place bounds on achievable inference accuracy under a range of experimental parameters.
Background: Viral outbreaks, such as the 2014 ebolavirus, can spread rapidly and have complex evolutionary dynamics, including coinfection and bulk transmission of multiple viral populations. Genomic surveillance can be hindered when the spread of the outbreak exceeds the evolutionary rate, in which case consensus approaches will have limited resolution. Deep sequencing of infected patients can identify genomic variants present in intrahost populations at subclonal frequencies (i.