Publications by authors named "Andreas Sundquist"

Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results.

View Article and Find Full Text PDF

Motivation: Accurate inference of genealogical relationships between pairs of individuals is paramount in association studies, forensics and evolutionary analyses of wildlife populations. Current methods for relationship inference consider only a small set of close relationships and have limited to no power to distinguish between relationships with the same number of meioses separating the individuals under consideration (e.g.

View Article and Find Full Text PDF

Molecular interactions between protein complexes and DNA mediate essential gene-regulatory functions. Uncovering such interactions by chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) has recently become the focus of intense interest. We here introduce quantitative enrichment of sequence tags (QuEST), a powerful statistical framework based on the kernel density estimation approach, which uses ChIP-Seq data to determine positions where protein complexes contact DNA.

View Article and Find Full Text PDF

The genome of an admixed individual with ancestors from isolated populations is a mosaic of chromosomal blocks, each following the statistical properties of variation seen in those populations. By analyzing polymorphisms in the admixed individual against those seen in representatives from the populations, we can infer the ancestral source of the individual's haploblocks. In this paper we describe a novel approach for ancestry inference, HAPAA (HMM-based analysis of polymorphisms in admixed ancestries), that models the allelic and haplotypic variation in the populations and captures the signal of correlation due to linkage disequilibrium, resulting in greatly improved accuracy.

View Article and Find Full Text PDF

Background: The metagenomic analysis of microbial communities holds the potential to improve our understanding of the role of microbes in clinical conditions. Recent, dramatic improvements in DNA sequencing throughput and cost will enable such analyses on individuals. However, such advances in throughput generally come at the cost of shorter read-lengths, limiting the discriminatory power of each read.

View Article and Find Full Text PDF

While recently developed short-read sequencing technologies may dramatically reduce the sequencing cost and eventually achieve the $1000 goal for re-sequencing, their limitations prevent the de novo sequencing of eukaryotic genomes with the standard shotgun sequencing protocol. We present SHRAP (SHort Read Assembly Protocol), a sequencing protocol and assembly methodology that utilizes high-throughput short-read technologies. We describe a variation on hierarchical sequencing with two crucial differences: (1) we select a clone library from the genome randomly rather than as a tiling path and (2) we sample clones from the genome at high coverage and reads from the clones at low coverage.

View Article and Find Full Text PDF