Reducing premature mortality associated with age-related chronic diseases, such as cancer and cardiovascular disease, is an urgent priority. We report early results using genomics in combination with advanced imaging and other clinical testing to proactively screen for age-related chronic disease risk among adults. We enrolled active, symptom-free adults in a study of screening for age-related chronic diseases associated with premature mortality.
Proc Natl Acad Sci U S A
September 2017
Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited.
Proc Natl Acad Sci U S A
July 2017
The HLA gene complex on human chromosome 6 is one of the most polymorphic regions in the human genome and contributes in large part to the diversity of the immune system. Accurate typing of HLA genes with short-read sequencing data has historically been difficult due to the sequence similarity between the polymorphic alleles. Here, we introduce an algorithm, xHLA, that iteratively refines the mapping results at the amino acid level to achieve 99-100% four-digit typing accuracy for both class I and II HLA genes, taking only ~3 min to process a 30× whole-genome BAM file on a desktop computer.
The characterization of the blood virome is important for the safety of blood-derived transfusion products, and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome.
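The read-selection step described above can be sketched as follows. This is a minimal illustration, assuming the alignments are available as SAM text; the function name `unmapped_reads` and the sample records are hypothetical, and the study's actual pipeline is not described in this abstract. The sketch relies only on SAM flag bit 0x4, which marks a read as unmapped.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM flag bit meaning "segment unmapped"


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for reads that did not map to the reference."""
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & UNMAPPED_FLAG:
            yield fields[0], fields[9]  # column 1 is QNAME, column 10 is SEQ


# Hypothetical records, for illustration only.
example = [
    "@SQ\tSN:chr1\tLN:248956422",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
candidates = list(unmapped_reads(example))
```

Reads selected this way would only be candidates for non-human sequence; classifying them as viral requires a separate comparison against viral references, which this sketch does not attempt.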
Genetic factors modifying the blood metabolome have been investigated through genome-wide association studies (GWAS) of common genetic variants and through exome sequencing. We conducted a whole-genome sequencing study of common, low-frequency and rare variants to associate genetic variations with blood metabolite levels using comprehensive metabolite profiling in 1,960 adults. We focused the analysis on 644 metabolites with consistent levels across three longitudinal data collections.
We report on the sequencing of 10,545 human genomes at 30×-40× coverage with an emphasis on quality metrics and novel variant and sequence discovery. We find that 84% of an individual human genome can be sequenced confidently. This high-confidence region includes 91.
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery.
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods.
In chronic-phase chronic myeloid leukemia (CML) patients, the lack of a major cytogenetic response (< 36% Ph(+) metaphases) to imatinib within 12 months indicates failure and mandates a change of therapy. To identify biomarkers predictive of imatinib failure, we performed gene expression array profiling of CD34(+) cells from 2 independent cohorts of imatinib-naive chronic-phase CML patients. The learning set consisted of retrospectively selected patients with a complete cytogenetic response or more than 65% Ph(+) metaphases within 12 months of imatinib therapy.
Background: Genomic hybridization platforms, including BAC-CGH and genotyping arrays, have been used to estimate chromosome copy number (CN) in tumor samples by detecting the relative strength of genomic signal. The methods rely on the assumption that the predominant chromosomal background of the samples is diploid, an assumption that is frequently incorrect for tumor samples. In addition to generally greater resolution, an advantage of genotyping arrays over CGH arrays is the ability to detect signals from individual alleles, allowing estimation of loss-of-heterozygosity (LOH) and allelic ratios to enhance the interpretation of copy number alterations.
Background: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature.
Warfarin is an effective, commonly prescribed anticoagulant used to treat and prevent thrombotic events. Because of historically high rates of drug-associated adverse events, warfarin remains underprescribed. Further, interindividual variability in therapeutic dose mandates frequent monitoring until target anticoagulation is achieved.
Identification of microbial pathogens in clinical specimens is still performed by phenotypic methods that are often slow and cumbersome, despite the availability of more comprehensive genotyping technologies. We present an approach based on whole-genome amplification and resequencing microarrays for unbiased pathogen detection. This 10 h process identifies a broad spectrum of bacterial and viral species and predicts antibiotic resistance and pathogenicity and virulence profiles.
We report a method, Expression-Microarray Copy Number Analysis (ECNA), for the detection of copy number changes using Affymetrix Human Genome U133 Plus 2.0 arrays, starting with as little as 5 ng input genomic DNA. An analytical approach was developed using DNA isolated from cell lines containing various X-chromosome numbers, and validated with DNA from cell lines with defined deletions and amplifications in other chromosomal locations.
We have undertaken an extensive high-resolution analysis of loss of heterozygosity (LOH) in 30 high grade gliomas using the Affymetrix 100K SNP mapping array. Only 70% of LOH events were accompanied by a copy number loss (CNA(loss)), and of the other 30%, the distal region of 17p preferentially showed copy number neutral (CNN)-associated LOH. Combined analysis of CNA(loss) and LOH using MergeLevels analysis software predicts whether the observed losses occurred on a diploid or tetraploid background.
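The distinction drawn above between LOH with copy number loss and copy-number-neutral LOH can be illustrated with a short sketch. It is not the MergeLevels software and does not reproduce the study's method: the function names, the tolerance `tol`, and the example copy numbers are all hypothetical, and the `baseline` argument stands in for the assumed diploid (2) or tetraploid (4) background.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_loh(copy_number: float, baseline: float = 2.0, tol: float = 0.3) -> str:
    """Label one LOH segment by its copy number relative to the baseline ploidy."""
    if copy_number < baseline - tol:
        return "CNA(loss)"  # LOH accompanied by a copy number loss
    if copy_number > baseline + tol:
        return "gain"       # LOH accompanied by a copy number gain
    return "CNN"            # copy-number-neutral LOH


def loh_fractions(copy_numbers: Iterable[float], baseline: float = 2.0) -> Dict[str, float]:
    """Return the fraction of LOH segments falling in each class."""
    labels = [classify_loh(cn, baseline) for cn in copy_numbers]
    if not labels:
        return {}
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Hypothetical copy number estimates for five LOH segments.
fractions = loh_fractions([1.0, 1.1, 2.0, 1.2, 2.1])
```

Changing `baseline` shows why the background matters: a segment with copy number 2 is neutral against a diploid background but counts as a loss against a tetraploid one.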
Background: The interpretability of microarray data can be affected by sample quality. To systematically explore how RNA quality affects microarray assay performance, a set of rat liver RNA samples with a progressive change in RNA integrity was generated by thawing frozen tissue or by ex vivo incubation of fresh tissue over a time course.
Results: Incubation of tissue at 37 °C for several hours had little effect on RNA integrity, but did induce changes in the transcript levels of stress response genes and immune cell markers.
Exposure to arsenic causes cancer by inducing a variety of responses that affect the expression of genes associated with numerous biological pathways leading to altered cell growth and proliferation, signaling, apoptosis and oxidative stress response. Affymetrix GeneChip arrays were used to detect gene expression changes following dimethylarsinic acid (DMA) exposure to human bladder cells (UROtsa) or rat bladder cells (MYP3) and rat bladder epithelium in vivo at comparable doses. Using different experimental models coupled with transcriptional profiling allowed investigation of the correlation of mechanisms of DMA-induced toxicity between in vitro and in vivo treatment and across species.
Copy number abnormalities (CNAs) in tumor cells are presumed to affect expression levels of genes located in the region of abnormality. To investigate this relationship, we surveyed the losses, gains and amplifications in 30 glioblastomas using array comparative genome hybridization and compared these data with gene expression changes in the same tumors using the Affymetrix U133Plus2.0 oligonucleotide arrays.
The Overlay Tool has been developed to combine high throughput data derived from various microarray platforms. This tool analyzes high-resolution correlations between gene expression changes and either copy number abnormalities (CNAs) or loss of heterozygosity events detected using array comparative genomic hybridization (aCGH). Using an overlay analysis which is designed to be performed using data from multiple microarray platforms on a single biological sample, the Overlay Tool identifies potentially important genes whose expression profiles are changed as a result of losses, gains and amplifications in the cancer genome.
Background: Alternative splicing is a mechanism for increasing protein diversity by excluding or including exons during post-transcriptional processing. Alternatively spliced proteins are particularly relevant in oncology since they may contribute to the etiology of cancer, provide selective drug targets, or serve as a marker set for cancer diagnosis. While conventional identification of splice variants generally targets individual genes, we present here a new exon-centric array (GeneChip Human Exon 1.
Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues.
We have assessed the utility of RNA titration samples for evaluating microarray platform performance and the impact of different normalization methods on the results obtained. As part of the MicroArray Quality Control project, we investigated the performance of five commercial microarray platforms using two independent RNA samples and two titration mixtures of these samples. Focusing on 12,091 genes common across all platforms, we determined the ability of each platform to detect the correct titration response across the samples.
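One way to read "correct titration response" is as a monotonic ordering of signal across the two pure samples and the two mixtures. The sketch below illustrates that reading only. The sample labels A, B, C and D, the assumption that mixture C contains a larger share of sample A than mixture D, the function names, and the example values are all hypothetical and are not taken from the abstract.

```python
from typing import Dict, Tuple

# Signals ordered by decreasing share of sample A: (A, C, D, B).
Signals = Tuple[float, float, float, float]


def has_titration_response(signals: Signals) -> bool:
    """True if the signal changes strictly monotonically across A, C, D, B."""
    a, c, d, b = signals
    return a > c > d > b or a < c < d < b


def fraction_correct(genes: Dict[str, Signals]) -> float:
    """Fraction of genes whose signals follow the expected titration order."""
    if not genes:
        return 0.0
    hits = sum(has_titration_response(s) for s in genes.values())
    return hits / len(genes)


# Hypothetical log-scale signals for four genes on one platform.
example = {
    "gene1": (8.0, 7.0, 5.0, 4.0),  # higher in A: decreasing across the series
    "gene2": (3.0, 4.0, 6.0, 7.0),  # higher in B: increasing across the series
    "gene3": (5.0, 6.5, 5.5, 7.0),  # mixtures out of order
    "gene4": (5.0, 5.0, 5.0, 5.0),  # no difference between samples
}
score = fraction_correct(example)
```

A strict ordering treats flat genes such as `gene4` as failures; an analysis restricted to genes that differ between the two pure samples would exclude them before scoring.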
Cavities on a protein's surface as well as specific amino acid positioning within it create the physicochemical properties needed for a protein to perform its function. CASTp (http://cast.engr.
The comparability and reliability of data generated using microarray technology would be enhanced by use of a common set of standards that allow accuracy, reproducibility and dynamic range assessments on multiple formats. We designed and tested a complex biological reagent for performance measurements on three commercial oligonucleotide array formats that differ in probe design and signal measurement methodology. The reagent is a set of two mixtures with different proportions of RNA for each of four rat tissues (brain, liver, kidney and testes).
Replication and transcription activator (RTA), an immediate-early gene product of gamma-2 herpesviruses including Kaposi's sarcoma-associated herpesvirus (KSHV) and murine gamma herpesvirus 68 (MHV-68), plays a critical role in controlling the viral life cycle. RTA acts as a strong transcription activator for several downstream genes of KSHV and MHV-68 through direct DNA binding, as well as via indirect mechanisms. HMGB1 (also called HMG-1) protein is a highly conserved nonhistone chromatin protein with the ability to bind and bend DNA.