Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.
View Article and Find Full Text PDFThe characterization of mutational processes that generate sequence diversity in the human genome is of paramount importance both to medical genetics and to evolutionary studies. To understand how the age and sex of transmitting parents affect de novo mutations, here we sequence 1,548 Icelanders, their parents, and, for a subset of 225, at least one child, to 35× genome-wide coverage. We find 108,778 de novo mutations, both single nucleotide polymorphisms and indels, and determine the parent of origin of 42,961.
View Article and Find Full Text PDFUnderstanding of sequence diversity is the cornerstone of analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders who we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels.
View Article and Find Full Text PDFMycobacterium tuberculosis infections cause 9 million new tuberculosis cases and 1.5 million deaths annually. To identify variants conferring risk of tuberculosis, we tested 28.
View Article and Find Full Text PDFIn this paper the effect of SNPs on expression levels in Nimblegen RNA expression microarrays is investigated. A vast number of replicates of probe pairs representing both alleles of SNPs on 14 loci allows accurate estimation of the difference in signal intensities both within and between probe pairs. The majority of probe-pairs with sufficiently high expression have significant differences in expression levels within the pair and the difference shows concordance with the genotype of the samples.
View Article and Find Full Text PDFLong non-coding ribonucleic acids (lncRNAs) have been proposed as biomarkers in prostate cancer. This paper proposes a selection method which uses data from tiled microarrays to identify relatively long regions of moderate expression independent of the microarray platform and probe design. The method is used to search for candidate long non-coding ribonucleic acids (lncRNAs) at locus 8q24 and is run on three independent experiments which all use samples from prostate cancer patients.
View Article and Find Full Text PDF