Inflation of polygenic risk scores caused by sample overlap and relatedness: Examples of a major risk of bias.

Am J Hum Genet

Population Health and Immunity Division, the Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; Department of Medical Biology, University of Melbourne, Melbourne, VIC 3052, Australia.

Published: September 2024

Polygenic risk scores (PRSs) are an important tool for understanding the role of common genetic variants in human disease. Standard best practices recommend that PRSs be analyzed in cohorts that are independent of the genome-wide association study (GWAS) used to derive the scores, with no sample overlap or relatedness between the two cohorts. However, identifying sample overlap and relatedness can be challenging in an era of GWASs performed by large biobanks and international research consortia. Although most genomics researchers are aware of best practices and theoretical concerns about sample overlap and relatedness between GWAS and PRS cohorts, the prevailing assumption is that the risk of bias is small for very large GWASs. Here, we present two real-world examples demonstrating that sample overlap and relatedness are not minor or theoretical concerns but an important potential source of bias in PRS studies. Using a recently developed statistical adjustment tool, we found that excluding overlapping and related samples was at least as powerful as adjusting for overlap bias. Our goal is to make genomics researchers aware of the magnitude of the risk of bias from sample overlap and relatedness and to highlight the need for mitigation tools, including independent validation cohorts in PRS studies, continued development of statistical adjustment methods, and tools that let researchers test their cohorts for overlap and relatedness with GWAS cohorts without sharing individual-level data.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393675
DOI: http://dx.doi.org/10.1016/j.ajhg.2024.07.014
