Publications by authors named "Gleb Kichaev"

Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and a slight upward bias with out-of-sample LD.

Regulatory variation plays a major role in complex disease, and cell type-specific binding of transcription factors (TFs) is critical to gene regulation. However, assessing the contribution of genetic variation in TF-binding sites to disease heritability is challenging, as binding is often cell type-specific and annotations from directly measured TF binding are not currently available for most cell type-TF pairs. We investigate approaches to annotate TF binding, including directly measured chromatin data and sequence-based predictions.

Although genome-wide association studies (GWAS) have identified hundreds of risk loci for breast and prostate cancer, only a few studies have characterized the GWAS association signals across functional genomic annotations with a particular focus on single nucleotide polymorphisms (SNPs) located in DNA regulatory elements. In this study, we investigated the enrichment pattern of GWAS signals for breast and prostate cancer in genomic functional regions located in normal tissue and cancer cell lines. We quantified the overall enrichment of SNPs with breast and prostate cancer association p values < 1 × 10 across regulatory categories.

Transcriptome-wide association studies using predicted expression have identified thousands of genes whose locally regulated expression is associated with complex traits and diseases. In this work, we show that linkage disequilibrium induces significant gene-trait associations at non-causal genes as a function of the expression quantitative trait loci weights used in expression prediction. We introduce a probabilistic framework that models correlation among transcriptome-wide association study signals to assign a probability for every gene in the risk region to explain the observed association signal.

Functional genomics data has the potential to increase GWAS power by identifying SNPs that have a higher prior probability of association. Here, we introduce a method that leverages polygenic functional enrichment to incorporate coding, conserved, regulatory, and LD-related genomic annotations into association analyses. We show via simulations with real genotypes that the method, functionally informed novel discovery of risk loci (FINDOR), correctly controls the false-positive rate at null loci and attains a 9%-38% increase in the number of independent associations detected at causal loci, depending on trait polygenicity and sample size.

Although recent studies provide evidence for a common genetic basis between complex traits and Mendelian disorders, a thorough quantification of their overlap in a phenotype-specific manner remains elusive. Here, we have quantified the overlap of genes identified through large-scale genome-wide association studies (GWASs) for 62 complex traits and diseases with genes containing mutations known to cause 20 broad categories of Mendelian disorders. We identified a significant enrichment of genes linked to phenotypically matched Mendelian disorders in GWAS gene sets; of the total 1,240 comparisons, a higher proportion of phenotypically matched or related pairs (n = 50 of 92 [54%]) than phenotypically unmatched pairs (n = 27 of 1,148 [2%]) demonstrated significant overlap, confirming a phenotype-specific enrichment pattern.
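
The counts quoted in this abstract can be checked with a short calculation. The sketch below recomputes the two proportions, an odds ratio, and a one-sided Fisher's exact test p-value from the 2 × 2 table implied by the abstract. The abstract does not say which test the authors used, so the choice of Fisher's exact test and all variable and function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

# Counts taken from the abstract: significant pairs / total pairs.
matched_sig, matched_total = 50, 92
unmatched_sig, unmatched_total = 27, 1148

matched_rate = matched_sig / matched_total
unmatched_rate = unmatched_sig / unmatched_total

# Odds ratio of the 2 x 2 table (matched vs unmatched, significant vs not).
odds_ratio = (matched_sig * (unmatched_total - unmatched_sig)) / (
    (matched_total - matched_sig) * unmatched_sig
)


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n of N pairs are significant and K of N are matched.

    This is the one-sided Fisher's exact test p-value (an assumed test,
    not necessarily the one used in the paper). Exact integer arithmetic
    avoids overflow in the binomial coefficients.
    """
    denom = comb(N, n)
    num = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1))
    return float(Fraction(num, denom))


total_pairs = matched_total + unmatched_total
total_sig = matched_sig + unmatched_sig
p_value = hypergeom_upper_tail(matched_sig, matched_total, total_sig, total_pairs)

print(f"matched: {matched_rate:.1%}, unmatched: {unmatched_rate:.1%}")
print(f"odds ratio: {odds_ratio:.1f}, one-sided p-value: {p_value:.3g}")
```

The two group totals sum to the 1,240 comparisons reported, and the rounded proportions agree with the 54% and 2% given in the abstract.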

Nearly 100 loci have been identified for pulmonary function, almost exclusively in studies of European ancestry populations. We extend previous research by meta-analyzing genome-wide association studies of 1000 Genomes imputed variants in relation to pulmonary function in a multiethnic population of 90,715 individuals of European (N = 60,552), African (N = 8429), Asian (N = 9959), and Hispanic/Latino (N = 11,775) ethnicities. We identify over 50 additional loci at genome-wide significance in ancestry-specific or multiethnic meta-analyses.

Biobank-based genome-wide association studies are enabling exciting insights in complex trait genetics, but much uncertainty remains over best practices for optimizing statistical power and computational efficiency in GWAS while controlling confounders. Here, we introduce a much faster version of our BOLT-LMM Bayesian mixed model association method—capable of running analyses of the full UK Biobank cohort in a few days on a single compute node—and show that it produces highly powered, robust test statistics when run on all 459K European samples (retaining related individuals). When used to conduct a GWAS for height in UK Biobank, BOLT-LMM achieved power equivalent to linear regression on 650K samples—a 93% increase in effective sample size versus the common practice of analyzing unrelated British samples using linear regression (UK Biobank documentation; Bycroft et al.

Recent studies have identified thousands of regions in the genome associated with chromatin modifications, which may in turn be affecting gene expression. Existing works have used heuristic methods to investigate the relationships between genome, epigenome, and gene expression, but, to our knowledge, none have explicitly modeled the chain of causality whereby genetic variants impact chromatin, which impacts gene expression. In this work we introduce a new hierarchical fine-mapping framework that integrates information across all three levels of data to better identify the causal variant and chromatin mark that are concordantly influencing gene expression.

Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWASs and four expression quantitative trait loci (eQTL) datasets.

Genome-wide association studies have identified over 150 loci associated with lipid traits; however, no large-scale studies exist for Hispanics and other minority populations. Additionally, the genetic architecture of lipid-influencing loci remains largely unknown. We performed one of the most racially/ethnically diverse fine-mapping genetic studies of HDL-C, LDL-C, and triglycerides to date using SNPs on the MetaboChip array on 54,119 individuals: 21,304 African Americans, 19,829 Hispanic Americans, 12,456 Asians, and 530 American Indians.

Motivation: Expression quantitative trait loci (eQTLs), genetic variants associated with gene expression levels, are identified in eQTL mapping studies. Such studies typically test for an association between single nucleotide polymorphisms (SNPs) and expression under an additive model, which ignores interaction and haplotypic effects. Mismatches between the model tested and the underlying genetic architecture can lead to a loss of association power.

Although genome-wide association studies (GWASs) have identified thousands of risk loci for many complex traits and diseases, the causal variants and genes at these loci remain largely unknown. Here, we introduce a method for estimating the local genetic correlation between gene expression and a complex trait and utilize it to estimate the genetic correlation due to predicted expression between pairs of traits. We integrated gene expression measurements from 45 expression panels with summary GWAS data to perform 30 multi-tissue transcriptome-wide association studies (TWASs).

Motivation: Genome-wide association studies (GWAS) have identified thousands of regions in the genome that contain genetic variants that increase risk for complex traits and diseases. However, the variants uncovered in GWAS are typically not biologically causal, but rather, correlated to the true causal variant through linkage disequilibrium (LD). To discern the true causal variant(s), a variety of statistical fine-mapping methods have been proposed to prioritize variants for functional validation.

Variance-component methods that estimate the aggregate contribution of large sets of variants to the heritability of complex traits have yielded important insights into the genetic architecture of common diseases. Here, we introduce methods that estimate the total trait variance explained by the typed variants at a single locus in the genome (local SNP heritability) from genome-wide association study (GWAS) summary data while accounting for linkage disequilibrium among variants. We applied our estimator to ultra-large-scale GWAS summary data of 30 common traits and diseases to gain insights into their local genetic architecture.

Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines.

Localization of causal variants underlying known risk loci is one of the main research challenges following genome-wide association studies. Risk loci are typically dissected through fine-mapping experiments in trans-ethnic cohorts for leveraging the variability in the local genetic structure across populations. More recent works have shown that genomic functional annotations (i.

Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD.

Recent breakthroughs in exome-sequencing technology have made possible the identification of many causal variants of monogenic disorders. Although extremely powerful when closely related individuals (eg, child and parents) are simultaneously sequenced, sequencing of a single case is often unsuccessful due to the large number of variants that need to be followed up for functional validation. Many approaches filter out common variants above a given frequency threshold (eg, 1%), and then prioritize the remaining variants according to their functional, structural and conservation properties.

Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus.
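
To make the idea of combining association strength with functional annotation concrete, here is a deliberately simplified sketch. It is not the framework described in this abstract: it assumes exactly one causal variant per locus, ignores LD, and fixes the annotation weight by hand, whereas the abstract describes estimating annotation contributions from summary statistics and allowing multiple causal variants. The function names, the `gamma` weight, and the `prior_var` parameter are all hypothetical.

```python
import math


def log_bayes_factor(z, prior_var):
    """Log Bayes factor for one z-score: N(0, 1 + prior_var) vs N(0, 1).

    prior_var is an assumed prior variance of the effect on the z-score scale.
    """
    shrink = prior_var / (1.0 + prior_var)
    return -0.5 * math.log(1.0 + prior_var) + 0.5 * z * z * shrink


def posterior_causal(z_scores, annotations, gamma=1.0, prior_var=25.0):
    """Posterior probability that each variant is the single causal one.

    The prior for variant i is proportional to exp(gamma * annotations[i]),
    so annotated variants (value 1) are favoured when gamma > 0.
    """
    log_weights = [
        gamma * a + log_bayes_factor(z, prior_var)
        for z, a in zip(z_scores, annotations)
    ]
    # Normalise in log space for numerical stability.
    peak = max(log_weights)
    weights = [math.exp(lw - peak) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Toy locus: two equally associated variants, only the first is annotated.
z_scores = [4.0, 4.0, 1.0]
annotations = [1, 0, 0]
posteriors = posterior_causal(z_scores, annotations)
print([round(p, 3) for p in posteriors])
```

In this toy locus the two strongly associated variants have identical z-scores, so the annotation alone separates them: the annotated variant receives exp(gamma) times the posterior weight of the unannotated one.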

In vivo electroporation (EP) has been shown to be a highly efficient non-viral method for enhancing DNA vaccine delivery and immunogenicity, when the site of immunization is the skin or muscle of animals and humans. However, the route of entry for many microbial pathogens is via the mucosal surfaces of the human body. We have previously reported on minimally invasive, surface and contactless EP devices for enhanced DNA delivery to dermal tissue.

Electroporation (EP) of mammalian tissue is a technique that has been used successfully in the clinic for the delivery of genetic-based vaccines in the form of DNA plasmids. There is great interest in platforms which efficiently deliver RNA molecules such as messenger RNA and small interfering RNA (siRNA) to mammalian tissue. However, the in vivo delivery of RNA enhanced by EP has not been extensively characterized.

The skin is an attractive tissue for vaccination in a clinical setting due to the accessibility of the target, the ease of monitoring and, most importantly, the immune-competent nature of the dermal tissue. While skin electroporation offers an exciting and novel future methodology for the delivery of DNA vaccines in the clinic, little is known about the actual mechanism of the approach or the resulting immune responses. To further understand the mechanism of this platform, the expression kinetics and localization of a reporter plasmid delivered via a surface dermal electroporation (SEP) device, as well as the effect of this treatment on the resident immune cells in that tissue, were investigated.

Nucleic acid-based vaccines such as viral vectors, plasmid DNA (pDNA), and mRNA are being developed as a means to address limitations of both live-attenuated and subunit vaccines. DNA vaccines have been shown to be potent in a wide variety of animal species and several products are now licensed for commercial veterinary but not human use. Electroporation delivery technologies have been shown to improve the generation of T and B cell responses from synthetic DNA vaccines in many animal species and now in humans.

In vivo electroporation (EP) is an efficient nonviral method for enhancing DNA vaccine delivery and immunogenicity in animals and humans. Intradermal delivery of DNA vaccines is an attractive strategy because of the immunocompetence of skin tissue. We have previously reported a minimally invasive surface intradermal EP (SEP) device for delivery of prophylactic DNA vaccines.
