Motivation: Polygenic scoring is an approach for estimating an individual's likelihood of a given outcome. Polygenic scores are typically calculated from genome-wide association study (GWAS) summary statistics and individual-level genotype data for the target sample. Going from genotype to interpretable polygenic scores involves many steps and there are many methods available, limiting the accessibility of polygenic scores for research and clinical application. Additional challenges exist for studies in ancestrally diverse populations. We have implemented the leading polygenic scoring methodologies within an easy-to-use pipeline called GenoPred.
Results: Here, we present the GenoPred pipeline, an easy-to-use, high-performance, reference-standardized, and reproducible workflow for polygenic scoring. It requires minimal inputs and offers various configuration options to cater to a range of use cases. GenoPred implements a comprehensive set of analyses, including genotype and GWAS quality control, target sample ancestry inference, polygenic score file generation using a range of leading methods, and target sample scoring. GenoPred standardizes the polygenic scoring process using reference genetic data, providing interpretable polygenic scores. The pipeline is applicable to GWAS and target data from any population within the reference, facilitating studies of diverse ancestry. GenoPred is a Snakemake pipeline with associated Conda software environments, ensuring reproducibility. We apply the pipeline to UK Biobank data, demonstrating the pipeline's simplicity, efficiency, and performance. The GenoPred pipeline provides a novel resource for polygenic scoring, integrating a range of complex processes within an easy-to-use framework. GenoPred widens access to leading polygenic scoring methodologies and their application to studies of diverse ancestry.
Availability And Implementation: Freely available on the web at https://github.com/opain/GenoPred.
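As a rough illustration of the two ideas the abstract relies on, the sketch below computes a polygenic score as a weighted sum of effect-allele dosages and then expresses it relative to a reference sample. This is a minimal sketch and not GenoPred's implementation: the variant IDs, effect sizes, genotypes, and function names are all hypothetical, and a real workflow would also handle allele alignment, missing genotypes, and ancestry matching.

```python
from statistics import mean, stdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2) over shared variants."""
    # Only variants present in both the score file and the genotypes contribute.
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


def standardize(score, reference_scores):
    """Express a raw score as a z-score relative to a reference sample."""
    return (score - mean(reference_scores)) / stdev(reference_scores)


# Hypothetical per-variant effect sizes, as might be derived from GWAS summary statistics.
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20}

# Hypothetical reference individuals (effect-allele dosages per variant).
reference = [
    {"rs1": 0, "rs2": 1, "rs3": 0},
    {"rs1": 1, "rs2": 0, "rs3": 1},
    {"rs1": 2, "rs2": 2, "rs3": 1},
    {"rs1": 1, "rs2": 1, "rs3": 2},
]

# Hypothetical target individual.
target = {"rs1": 2, "rs2": 0, "rs3": 2}

reference_scores = [polygenic_score(d, weights) for d in reference]
target_raw = polygenic_score(target, weights)
target_z = standardize(target_raw, reference_scores)

print(f"raw score: {target_raw:.3f}, reference-standardized score: {target_z:.3f}")
```

Scaling against a fixed reference rather than against the target sample itself means the resulting value does not depend on who else happens to be in the target dataset, which is one plausible reading of why reference standardization makes scores easier to interpret.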
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462442 | PMC
http://dx.doi.org/10.1093/bioinformatics/btae551 | DOI Listing
Breast Cancer Res
December 2024
Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus.
Background: The 313-variant polygenic risk score (PRS) provides a promising tool for clinical breast cancer risk prediction. However, the PRS has not been evaluated across different European populations, where differences could influence risk estimation.
Methods: We explored the distribution of the PRS across European populations using genotype data from 94,072 females of European ancestry without a breast cancer diagnosis from 21 countries participating in the Breast Cancer Association Consortium (BCAC) and 223,316 females without a breast cancer diagnosis from the UK Biobank.
Brain Behav Immun
December 2024
Beijing Hui-Long-Guan Hospital, Peking University, Beijing 100096, China.
Essential hypertension (EH) with secondary insomnia is associated with increased risks of neuroinflammation, neuronal damage, and Alzheimer's disease (AD). However, its relationship with specific cerebrospinal fluid (CSF) biomarkers of neuronal damage and neuroinflammation remains unclear. This case-control study compared CSF biomarker levels across three groups: healthy controls (HC, n = 64), hypertension-controlled (HTN-C, n = 54), and hypertension-uncontrolled (HTN-U, n = 107), with all EH participants experiencing secondary insomnia.
Clin Transl Gastroenterol
December 2024
Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA.
Introduction: United States Multi-Society Task Force colonoscopy surveillance intervals are based solely on adenoma characteristics, without accounting for other risk factors. We investigated whether a risk model including demographic, environmental, and genetic risk factors could individualize surveillance intervals under an "equal management of equal risks" framework.
Methods: Using 14,069 individuals from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial who had a diagnostic colonoscopy following an abnormal flexible sigmoidoscopy, we modeled the risk of colorectal cancer, considering the diagnostic colonoscopy finding, baseline risk factors (e.
Genet Med
December 2024
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN; Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN.
Purpose: Studies of the value of genetic information for improving the performance of clinical risk prediction models have reached variable conclusions. Many methodological decisions have the potential to contribute to differential results. We performed multiple modeling experiments integrating clinical and demographic data from electronic health records (EHR) with genetic data to understand which decisions may affect performance.
Biochem Genet
December 2024
College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China.
This study aims to establish a genetic risk assessment model based on a score of short tandem repeats (STRs) of polygenic inheritance. A total of 396 children and their biological parents were recruited for STR genotyping. The numbers of tandem repeats of the two alleles at one STR locus were assumed to be a quantitative genetic strength for disease incidence.