Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.

Bioinformatics

Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, 90024, Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, 90024, Department of Medicine, Lung Biology Center, University of California San Francisco, San Francisco, 94143, Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, 02115, Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA, 02115, Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142, Department of Genetics Harvard Medical School, Boston, MA, 02115 and Division of Population Health Sciences and Education, St George's, University of London, UK

Published: October 2014

Motivation: Imputation using external reference panels (e.g. 1000 Genomes) is a widely used approach for increasing power in genome-wide association studies and meta-analysis. Existing hidden Markov models (HMM)-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available.
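The abstract does not spell out the estimator, so the following is only a minimal sketch of what Gaussian imputation of summary statistics can look like. It assumes z-scores at typed SNPs follow a multivariate normal distribution whose correlation matrix is the linkage disequilibrium (LD) estimated from a reference panel, and it imputes untyped SNPs by their conditional mean. The function name `impute_z`, its arguments, and the ridge term `lam` (added to the typed-SNP LD matrix to stabilise the solve when LD comes from a finite panel) are illustrative assumptions, not the authors' software interface.

```python
import numpy as np


def impute_z(z_typed, ld_tt, ld_it, lam=0.1):
    """Conditional-mean imputation of z-scores (illustrative sketch).

    z_typed : (t,) z-scores observed at typed SNPs
    ld_tt   : (t, t) LD correlations among typed SNPs (reference panel)
    ld_it   : (m, t) LD correlations between untyped and typed SNPs
    lam     : hypothetical ridge term added to the diagonal of ld_tt
    """
    z_typed = np.asarray(z_typed, dtype=float)
    ld_tt = np.asarray(ld_tt, dtype=float)
    ld_it = np.asarray(ld_it, dtype=float)

    # Regularised typed-SNP LD matrix (symmetric).
    a = ld_tt + lam * np.eye(ld_tt.shape[0])

    # Weights W = ld_it @ inv(a), computed with a linear solve
    # instead of an explicit matrix inverse.
    weights = np.linalg.solve(a, ld_it.T).T

    # Imputed z-scores are linear combinations of the typed z-scores.
    return weights @ z_typed


if __name__ == "__main__":
    ld_tt = [[1.0, 0.5], [0.5, 1.0]]
    ld_it = [[1.0, 0.5]]  # untyped SNP in perfect LD with typed SNP 0
    print(impute_z([2.0, -1.0], ld_tt, ld_it, lam=0.0))
    print(impute_z([2.0, -1.0], ld_tt, ld_it, lam=0.1))
```

With `lam=0` an untyped SNP in perfect LD with a typed SNP simply inherits that SNP's z-score; a positive `lam` shrinks the imputed value toward zero, which is the conservative direction when the LD estimates are noisy.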

Results: In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants [increasing to 87% (60%) when summary linkage disequilibrium information is available from target samples] versus the gold standard of 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and it is computationally very fast. As an empirical demonstration, we apply our method to seven case-control phenotypes from the Wellcome Trust Case Control Consortium (WTCCC) data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of χ² association statistics) compared with HMM-based imputation from individual-level genotypes at the 227 (176) published single nucleotide polymorphisms (SNPs) in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of four lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic versus non-genic loci for these traits, as compared with an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses.
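The effective-sample-size comparison above is described as a ratio of association statistics. One simple reading, sketched here purely as an assumption, is the ratio of average 1-degree-of-freedom chi-square statistics (squared z-scores) over the same set of SNPs under two imputation methods. The function name `chi2_ratio` and the choice to average before dividing are hypothetical; the abstract does not specify the exact computation.

```python
import numpy as np


def chi2_ratio(z_method, z_baseline):
    """Ratio of mean 1-df chi-square statistics (z squared) between two
    sets of z-scores for the same SNPs (assumed definition)."""
    z_method = np.asarray(z_method, dtype=float)
    z_baseline = np.asarray(z_baseline, dtype=float)
    if z_method.shape != z_baseline.shape:
        raise ValueError("both inputs must cover the same SNPs")
    # For a 1-df test, the chi-square statistic is the squared z-score.
    return float(np.mean(z_method ** 2) / np.mean(z_baseline ** 2))


if __name__ == "__main__":
    # Toy z-scores, not data from the study.
    print(chi2_ratio([1.0, 1.0], [2.0, 2.0]))
```

A value near 1 would indicate that the method under test retains about as much association signal as the baseline at those SNPs; values can exceed 1 by chance, which is consistent with a figure such as 105%.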

Availability and Implementation: A software package is publicly available at http://bogdan.bioinformatics.ucla.edu/software/.

Contact: bpasaniuc@mednet.ucla.edu or aprice@hsph.harvard.edu

Supplementary Information: Supplementary materials are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4184260
DOI: http://dx.doi.org/10.1093/bioinformatics/btu416


