There is growing recognition that mammalian cells produce many thousands of large intergenic transcripts. However, the functional significance of these transcripts has been particularly controversial. Although there are some well-characterized examples, most (>95%) show little evidence of evolutionary conservation and have been suggested to represent transcriptional noise.
Proc Natl Acad Sci U S A
December 2008
More complete knowledge of the molecular mechanisms underlying cancer will improve prevention, diagnosis and treatment. Efforts such as The Cancer Genome Atlas are systematically characterizing the structural basis of cancer, by identifying the genomic mutations associated with each cancer type. A powerful complementary approach is to systematically characterize the functional basis of cancer, by identifying the genes essential for growth and related phenotypes in different cancer cells.
Our variant ascertainment algorithm, VAAL, uses massively parallel DNA sequence data to identify differences between bacterial genomes with high sensitivity and specificity. VAAL detected approximately 98% of differences (including large insertion-deletions) between pairs of strains from three species while calling no false positives. VAAL also pinpointed a single mutation between Vibrio cholerae genomes, identifying an antibiotic's site of action by identifying sequence differences between drug-sensitive strains and drug-resistant derivatives.
Background: The malaria parasite Plasmodium falciparum exhibits abundant genetic diversity, and this diversity is key to its success as a pathogen. Previous efforts to study genetic diversity in P. falciparum have begun to elucidate the demographic history of the species, as well as patterns of population structure and patterns of linkage disequilibrium within its genome.
Proc Natl Acad Sci U S A
December 2008
The genetic architecture of complex traits underlying physiology and disease in most organisms remains elusive. We still know little about the number of genes that underlie these traits, the magnitude of their effects, or the extent to which they interact. Chromosome substitution strains (CSSs) enable statistically powerful studies based on testing engineered inbred strains that have single, unique, and nonoverlapping genetic differences, thereby providing measures of phenotypic effects that are attributable to individual chromosomes.
Cancer results from somatic alterations in key genes, including point mutations, copy-number alterations and structural rearrangements. A powerful way to discover cancer-causing genes is to identify genomic regions that show recurrent copy-number alterations (gains and losses) in tumor genomes. Recent advances in sequencing technologies suggest that massively parallel sequencing may provide a feasible alternative to DNA microarrays for detecting copy-number alterations.
In 1994, two independent groups extracted DNA from several Pleistocene epoch mammoths and noted differences among individual specimens. Subsequently, DNA sequences have been published for a number of extinct species. However, such ancient DNA is often fragmented and damaged, and studies to date have typically focused on short mitochondrial sequences, never yielding more than a fraction of a per cent of any nuclear genome.
Genetic mapping provides a powerful approach to identify genes and biological processes underlying any trait influenced by inheritance, including human diseases. We discuss the intellectual foundations of genetic mapping of Mendelian and complex traits in humans, examine lessons emerging from linkage analysis of Mendelian diseases and genome-wide association studies of common diseases, and discuss questions and challenges that lie ahead.
In embryonic stem (ES) cells, bivalent chromatin domains with overlapping repressive (H3 lysine 27 tri-methylation) and activating (H3 lysine 4 tri-methylation) histone modifications mark the promoters of more than 2,000 genes. To gain insight into the structure and function of bivalent domains, we mapped key histone modifications and subunits of Polycomb-repressive complexes 1 and 2 (PRC1 and PRC2) genomewide in human and mouse ES cells by chromatin immunoprecipitation, followed by ultra high-throughput sequencing. We find that bivalent domains can be segregated into two classes -- the first occupied by both PRC2 and PRC1 (PRC1-positive) and the second specifically bound by PRC2 (PRC2-only).
Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well-classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples.
DNA methylation is essential for normal development and has been implicated in many pathologies including cancer. Our knowledge about the genome-wide distribution of DNA methylation, how it changes during cellular differentiation and how it relates to histone methylation and other chromatin modifications in mammals remains limited. Here we report the generation and analysis of genome-scale DNA methylation profiles at nucleotide resolution in mammalian cells.
We describe a collection of expressed sequence tags (ESTs) for Saccoglossus kowalevskii, a direct-developing hemichordate valuable for evolutionary comparisons with chordates. The 202,175 ESTs represent 163,633 arrayed clones carrying cDNAs prepared from embryonic libraries, and they assemble into 13,677 continuous sequences (contigs), leaving 10,896 singletons (excluding mitochondrial sequences). Of the contigs, 53% had significant matches when BLAST was used to query the NCBI databases (≤ 10^-10), as did 51% of the singletons.
Somatic cells can be reprogrammed to a pluripotent state through the ectopic expression of defined transcription factors. Understanding the mechanism and kinetics of this transformation may shed light on the nature of developmental potency and suggest strategies with improved efficiency or safety. Here we report an integrative genomic analysis of reprogramming of mouse fibroblasts and B lymphocytes.
Loss of the epithelial adhesion molecule E-cadherin is thought to enable metastasis by disrupting intercellular contacts, an early step in metastatic dissemination. To further investigate the molecular basis of this notion, we use two methods to inhibit E-cadherin function that distinguish between E-cadherin's cell-cell adhesion and intracellular signaling functions. Whereas the disruption of cell-cell contacts alone does not enable metastasis, the loss of E-cadherin protein does, through induction of an epithelial-to-mesenchymal transition, invasiveness, and anoikis resistance.
New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun "microreads." For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80x coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads.
Promising new sequencing technologies, based on sequencing-by-synthesis (SBS), are starting to deliver large amounts of DNA sequence at very low cost. Polymorphism detection is a key application. We describe general methods for improved quality scores and accurate automated polymorphism detection, and apply them to data from the Roche (454) Genome Sequencer 20.
Comprehensive knowledge of the genomic alterations that underlie cancer is a critical foundation for diagnostics, prognostics, and targeted therapeutics. Systematic efforts to analyze cancer genomes are underway, but the analysis is hampered by the lack of a statistical framework to distinguish meaningful events from random background aberrations. Here we describe a systematic method, called Genomic Identification of Significant Targets in Cancer (GISTIC), designed for analyzing chromosomal aberrations in cancer.
Infection with the malaria parasite Plasmodium falciparum leads to widely different clinical conditions in children, ranging from mild flu-like symptoms to coma and death. Despite the immense medical implications, the genetic and molecular basis of this diversity remains largely unknown. Studies of in vitro gene expression have found few transcriptional differences between different parasite strains.
Proc Natl Acad Sci U S A
December 2007
Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of approximately 24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog.
This article describes a set of alignments of 28 vertebrate genome sequences that is provided by the UCSC Genome Browser. The alignments can be viewed on the Human Genome Browser (March 2006 assembly) at http://genome.ucsc.
Somatic alterations in cellular DNA underlie almost all human cancers. The prospect of targeted therapies and the development of high-resolution, genome-wide approaches are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas.
With several hundred genetic diseases and an advantageous genome structure, dogs are ideal for mapping genes that cause disease. Here we report the development of a genotyping array with approximately 27,000 SNPs and show that genome-wide association mapping of mendelian traits in dog breeds can be achieved with only approximately 20 dogs. Specifically, we map two traits with mendelian inheritance: the major white spotting (S) locus and the hair ridge in Rhodesian ridgebacks.
Sjöblom et al. (Research Article, 13 October 2006, p. 268) reported nearly 200 novel cancer genes said to have a 90% probability of being involved in colon or breast cancer.
Background: Multiple sclerosis has a clinically significant heritable component. We conducted a genomewide association study to identify alleles associated with the risk of multiple sclerosis.
Methods: We used DNA microarray technology to identify common DNA sequence variants in 931 family trios (consisting of an affected child and both parents) and tested them for association.