AI Article Synopsis

  • Genetic variation linked to complex traits is highly pleiotropic, meaning it affects multiple traits; multi-phenotype analyses exploit this to identify trait-shared and trait-specific genetic factors.
  • Existing matrix factorization (MF) methods are susceptible to spurious factors caused by confounding from sample sharing between GWAS, and they often yield dense factors that are hard to map onto biological pathways.
  • The newly introduced method, GLEANR, addresses these challenges by detecting sparse genetic factors from GWAS summary statistics while accounting for sample sharing. It improves the replication of genetic factors across biobanks and yields factors that can be interpreted in terms of disease, cell-type, and pathway enrichment, as demonstrated on 137 GWAS from the UK Biobank.

Article Abstract

Complex trait-associated genetic variation is highly pleiotropic. This extensive pleiotropy implies that multi-phenotype analyses are informative for characterizing genetic associations, as they facilitate the discovery of trait-shared and trait-specific variants and pathways ("genetic factors"). Previous efforts have estimated genetic factors using matrix factorization (MF) applied to numerous GWAS. However, existing methods are susceptible to spurious factors arising from residual confounding due to sample sharing in biobank GWAS. Furthermore, MF approaches have historically estimated dense factors, loaded on most traits and variants, that are challenging to map onto interpretable biological pathways. To address these shortcomings, we introduce "GWAS latent embeddings accounting for noise and regularization" (GLEANR), an MF method for detection of sparse genetic factors from summary statistics. GLEANR accounts for sample sharing between studies and uses regularization to estimate a data-driven number of interpretable factors. GLEANR is robust to confounding induced by shared samples and improves the replication of genetic factors derived from distinct biobanks. We used GLEANR to evaluate 137 diverse GWAS from the UK Biobank, identifying 58 factors that decompose the genetic architecture of input traits and have distinct signatures of negative selection and degrees of polygenicity. These sparse factors can be interpreted with respect to disease, cell-type, and pathway enrichment. We highlight three such factors capturing platelet measure phenotypes and enriched for disease-relevant markers corresponding to distinct stages of platelet differentiation. Overall, GLEANR is a powerful tool for discovering both trait-specific and trait-shared pathways underlying complex traits from GWAS summary statistics.
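
To make the idea of sparse matrix factorization of summary statistics concrete, the following is a minimal illustrative sketch, not the GLEANR implementation. It assumes a variants-by-traits matrix X of GWAS effect estimates (for example z-scores) and approximates it as U V^T with an L1 penalty on both matrices, using alternating proximal gradient steps. The function names (soft_threshold, objective, sparse_mf), the penalty weight lam, and the fixed number of factors k are all assumptions made for illustration. The sketch deliberately omits the sample-sharing correction and the data-driven choice of the number of factors described in the abstract.

```python
import numpy as np


def soft_threshold(A, t):
    """Proximal operator of the L1 norm: shrink every entry toward zero by t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def objective(X, U, V, lam):
    """Penalized loss: 0.5 * ||X - U V^T||_F^2 + lam * (||U||_1 + ||V||_1)."""
    resid = X - U @ V.T
    return 0.5 * np.sum(resid ** 2) + lam * (np.abs(U).sum() + np.abs(V).sum())


def sparse_mf(X, k, lam=0.1, n_iter=200, seed=0):
    """Approximate X (variants x traits) as U @ V.T with sparse U and V.

    U holds hypothetical variant loadings (n_variants x k) and V holds
    hypothetical trait loadings (n_traits x k). Each iteration takes one
    proximal gradient step on U, then one on V.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.normal(scale=0.1, size=(n, k))
    V = rng.normal(scale=0.1, size=(m, k))
    for _ in range(n_iter):
        # Step size 1/L, where L is the largest eigenvalue of V^T V.
        step_u = 1.0 / (np.linalg.norm(V.T @ V, 2) + 1e-12)
        U = soft_threshold(U - step_u * (U @ V.T - X) @ V, step_u * lam)
        # Same update for V with U held fixed.
        step_v = 1.0 / (np.linalg.norm(U.T @ U, 2) + 1e-12)
        V = soft_threshold(V - step_v * (U @ V.T - X).T @ U, step_v * lam)
    return U, V


if __name__ == "__main__":
    # Synthetic stand-in for a summary-statistics matrix: 200 variants, 8 traits.
    rng = np.random.default_rng(42)
    true_U = rng.normal(size=(200, 3)) * (rng.random((200, 3)) < 0.2)
    true_V = rng.normal(size=(8, 3)) * (rng.random((8, 3)) < 0.5)
    X = true_U @ true_V.T + 0.05 * rng.normal(size=(200, 8))

    U, V = sparse_mf(X, k=3, lam=0.05)
    print("fraction of zero variant loadings:", np.mean(U == 0))
    print("fraction of zero trait loadings:", np.mean(V == 0))
    print("penalized objective:", objective(X, U, V, 0.05))
```

In this sketch a larger lam drives more loadings to exactly zero, which is the sense in which regularization produces sparse factors; a sufficiently large lam zeroes out every factor.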

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623536
DOI: http://dx.doi.org/10.1101/2024.11.12.623313

Publication Analysis

Top Keywords

genetic factors: 12
factors: 9
matrix factorization: 8
sample sharing: 8
summary statistics: 8
genetic: 7
gwas: 5
gleanr: 5
sparse matrix: 4
factorization robust: 4
