Identifying factors that affect treatment response is a central objective of clinical research, yet the role of common genetic variation remains largely unknown. Here, we develop a framework to study the genetic architecture of response to commonly prescribed drugs in large biobanks. We quantify treatment response heritability for statins, metformin, warfarin, and methotrexate in the UK Biobank.
Human subcutaneous adipose tissue (SAT) contains a diverse array of cell-types; however, the epigenomic landscape among the SAT cell-types has remained elusive. Our integrative analysis of single-cell resolution DNA methylation and chromatin conformation profiles (snm3C-seq), coupled with matching RNA expression (snRNA-seq), systematically cataloged the epigenomic, 3D topology, and transcriptomic dynamics across the SAT cell-types. We discovered that the SAT CG methylation (mCG) landscape is characterized by pronounced hyper-methylation in myeloid cells and hypo-methylation in adipocytes and adipose stem and progenitor cells (ASPCs), driving nearly half of the 705,063 detected differentially methylated regions (DMRs).
Identifying and refining clinically significant patient stratification is a critical step toward realizing the promise of precision medicine in asthma. Several peripheral blood hallmarks, including total peripheral blood eosinophil count (BEC) and immunoglobulin E (IgE) levels, are routinely used in asthma clinical practice for endotype classification and predicting response to state-of-the-art targeted biologic drugs. However, these biomarkers appear ineffective in predicting treatment outcomes in some patients, and they differ in distribution between racially and ethnically diverse populations, potentially compromising medical care and hindering health equity due to biases in drug eligibility.
Structural variation (SV) refers to insertions, deletions, inversions, and duplications in human genomes. SVs are present in approximately 1.5% of the human genome.
Our knowledge of the contribution of genetic interactions (epistasis) to variation in human complex traits remains limited, partly due to the lack of efficient, powerful, and interpretable algorithms to detect interactions. Recently proposed approaches for set-based association tests show promise in improving the power to detect epistasis by examining the aggregated effects of multiple variants. Nevertheless, these methods either do not scale to large Biobank data sets or lack interpretability.
SNP heritability, the proportion of phenotypic variation explained by genotyped SNPs, is an important parameter in understanding the genetic architecture underlying various diseases and traits. Methods that aim to estimate SNP heritability from individual genotype and phenotype data are limited by their ability to scale to Biobank-scale data sets and by the restrictions in access to individual-level data. These limitations have motivated the development of methods that only require summary statistics.
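The definition of SNP heritability given above can be illustrated with a toy simulation. The sketch below is not the summary-statistic method the abstract refers to; it only shows the quantity being estimated, under assumptions that are ours: an additive model, a hypothetical allele frequency of 0.3, and a function name (`simulate_snp_heritability`) invented for illustration.

```python
import random
import statistics


def simulate_snp_heritability(n_individuals=2000, n_snps=50, h2_target=0.5, seed=0):
    """Simulate an additive trait and return var(genetic) / var(phenotype).

    All parameter values are illustrative assumptions, not taken from the source.
    """
    rng = random.Random(seed)
    allele_freq = 0.3  # hypothetical minor-allele frequency shared by all SNPs
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_snps)]

    # Genetic value of each individual: sum over SNPs of dosage (0, 1, or 2) times effect.
    genetic = []
    for _ in range(n_individuals):
        value = 0.0
        for beta in effects:
            dosage = (rng.random() < allele_freq) + (rng.random() < allele_freq)
            value += dosage * beta
        genetic.append(value)

    # Rescale the genetic component so its sample variance equals h2_target.
    mean_g = statistics.fmean(genetic)
    sd_g = statistics.pstdev(genetic)
    genetic = [(g - mean_g) / sd_g * h2_target ** 0.5 for g in genetic]

    # Environmental noise with variance 1 - h2_target, independent of genotype.
    noise_sd = (1.0 - h2_target) ** 0.5
    phenotype = [g + rng.gauss(0.0, noise_sd) for g in genetic]

    # SNP heritability: share of phenotypic variance explained by the SNPs.
    return statistics.pvariance(genetic) / statistics.pvariance(phenotype)


if __name__ == "__main__":
    print(round(simulate_snp_heritability(), 3))
```

In real data the genetic component is not observed directly, which is why dedicated estimators (from individual-level data or from summary statistics) are needed; the ratio computed here is only the target those estimators aim at.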
Understanding the contribution of gene-environment interactions (GxE) to complex trait variation can provide insights into disease mechanisms, explain sources of heritability, and improve genetic risk prediction. While large biobanks with genetic and deep phenotypic data hold promise for obtaining novel insights into GxE, our understanding of GxE architecture in complex traits remains limited. We introduce a method to estimate the proportion of trait variance explained by GxE (GxE heritability) and additive genetic effects (additive heritability) across the genome and within specific genomic annotations.
Multivariate analysis is becoming central in studies investigating high-throughput molecular data, yet some important features of these data are seldom explored. Here, we present MANOCCA (Multivariate Analysis of Conditional CovAriance), a powerful method to test for the effect of a predictor on the covariance matrix of a multivariate outcome. The proposed test is by construction orthogonal to tests based on the mean and variance and is able to capture effects that are missed by both approaches.
Understanding the contribution of gene-environment interactions (GxE) to complex trait variation can provide insights into mechanisms underlying disease risk, explain sources of heritability, and improve the accuracy of genetic risk prediction. While biobanks that collect genetic and deep phenotypic data over large numbers of individuals offer the promise of obtaining novel insights into GxE, our understanding of the architecture of GxE in complex traits remains limited. We introduce a method that can estimate the proportion of trait variance explained by GxE (GxE heritability) and additive genetic effects (additive heritability) across the genome and within specific genomic annotations.
We present SLIViT, a deep-learning framework that accurately measures disease-related risk factors in volumetric biomedical imaging, such as magnetic resonance imaging (MRI) scans, optical coherence tomography (OCT) scans, and ultrasound videos. To evaluate SLIViT, we applied it to five different datasets of these three different data modalities tackling seven learning tasks (including both classification and regression) and found that it consistently and significantly outperforms domain-specific state-of-the-art models, typically improving performance (ROC AUC or correlation) by 0.1-0.
Biobanks that collect deep phenotypic and genomic data across many individuals have emerged as a key resource in human genetics. However, phenotypes in biobanks are often missing across many individuals, limiting their utility. We propose AutoComplete, a deep learning-based imputation method to impute or 'fill-in' missing phenotypes in population-scale biobank datasets.
Biobanks often contain several phenotypes relevant to diseases such as major depressive disorder (MDD), with partly distinct genetic architectures. Researchers face complex tradeoffs between shallow (large sample size, low specificity/sensitivity) and deep (small sample size, high specificity/sensitivity) phenotypes, and the optimal choices are often unclear. Here we propose to integrate these phenotypes to combine the benefits of each.
Over three percent of people carry a dominant pathogenic variant, yet only a fraction of carriers develop disease. Disease phenotypes from carriers of variants in the same gene range from mild to severe. Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (marginal epistasis).
The contribution of epistasis (interactions among genes or genetic variants) to human complex trait variation remains poorly understood. Methods that aim to explicitly identify pairs of genetic variants, usually single nucleotide polymorphisms (SNPs), associated with a trait suffer from low power due to the large number of hypotheses tested while also having to deal with the computational problem of searching over a potentially large number of candidate pairs. An alternate approach involves testing whether a single SNP modulates variation in a trait against a polygenic background.
Our knowledge of non-linear genetic effects on complex traits remains limited, in part due to the modest power to detect such effects. While kernel-based tests offer a versatile approach to test for non-linear relationships between sets of genetic variants and traits, current approaches cannot be applied to Biobank-scale datasets containing hundreds of thousands of individuals. We propose FastKAST, a kernel-based approach that can test for non-linear effects of a set of variants on a quantitative trait.
Population genetic studies often rely on artificial genomes (AGs) simulated by generative models of genetic data. In recent years, unsupervised learning models, based on hidden Markov models, deep generative adversarial networks, restricted Boltzmann machines, and variational autoencoders, have gained popularity due to their ability to generate AGs closely resembling empirical data. These models, however, present a tradeoff between expressivity and tractability.
Mendelian randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases owing to weak instruments, as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects.
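As background on how a genetic instrument yields a causal estimate, the sketch below shows the textbook single-instrument Wald ratio. It is not the family-based test described in the abstract; the function name `wald_ratio` and the input values are hypothetical, and the standard error uses the common first-order approximation that ignores uncertainty in the instrument-exposure association.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization estimate (Wald ratio).

    beta_exposure: estimated effect of the variant on the exposure
    beta_outcome:  estimated effect of the variant on the outcome
    se_outcome:    standard error of beta_outcome
    Returns (causal_estimate, approximate_standard_error).
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    std_error = abs(se_outcome / beta_exposure)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical summary statistics for one variant.
    print(wald_ratio(beta_exposure=0.5, beta_outcome=0.25, se_outcome=0.125))
```

A small `beta_exposure` inflates both the estimate and its standard error, which is one way to see why weak instruments are a concern; the ratio is also biased whenever the variant-outcome association is confounded, for example by population stratification.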
Individuals of admixed ancestries (for example, African Americans) inherit a mosaic of ancestry segments (local ancestry) originating from multiple continental ancestral populations. This offers the unique opportunity of investigating the similarity of genetic effects on traits across ancestries within the same population. Here we introduce an approach to estimate correlation of causal genetic effects (r) across local ancestries and analyze 38 complex traits in African-European admixed individuals (N = 53,001) to observe very high correlations (meta-analysis r = 0.
The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute to complex human traits to an unexpected extent. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank.
Mendelian Randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases due to weak instruments as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects.
Defining and accounting for subphenotypic structure has the potential to increase statistical power and provide a deeper understanding of the heterogeneity in the molecular basis of complex disease. Existing phenotype subtyping methods primarily rely on clinically observed heterogeneity or metadata clustering. However, they generally tend to capture the dominant sources of variation in the data, which often originate from variation that is not descriptive of the mechanistic heterogeneity of the phenotype of interest; in fact, such dominant sources of variation, such as population structure or technical variation, are, in general, expected to be independent of subphenotypic structure.
Adaptive introgression (AI) facilitates local adaptation in a wide range of species. Many state-of-the-art methods detect AI with ad-hoc approaches that identify summary statistic outliers or intersect scans for positive selection with scans for introgressed genomic regions. Although widely used, approaches intersecting outliers are vulnerable to a high false-negative rate as the power of different methods varies, especially for complex introgression events.
The observation of genetic correlations between disparate human traits has been interpreted as evidence of widespread pleiotropy. Here, we introduce cross-trait assortative mating (xAM) as an alternative explanation. We observe that xAM affects many phenotypes and that phenotypic cross-mate correlation estimates are strongly associated with genetic correlation estimates (=74%).