Background: Genetic risk scores (GRS) summarize genetic features such as single nucleotide polymorphisms (SNPs) in a single statistic with respect to a given trait. So far, GRS are typically built using generalized linear models or regularized extensions. However, these linear methods are usually not able to incorporate gene-gene interactions or non-linear SNP-response relationships. Tree-based statistical learning methods such as random forests and logic regression may be an alternative to such regularized-regression-based methods and are investigated in this article. Moreover, we consider modifications of random forests and logic regression for the construction of GRS.

Results: In an extensive simulation study and an application to a real data set from a German cohort study, we show that both tree-based approaches can outperform elastic net when constructing GRS for binary traits. In particular, a modification of logic regression called logic bagging yielded comparatively high predictive power, as measured by the area under the curve, and comparatively high statistical power. Even when no epistatic interaction effects but only marginal genetic effects were considered, the regularized regression method led in most cases to inferior results.

Conclusions: When constructing GRS, we recommend taking random forests and logic bagging into account, particularly if possibly unknown epistasis between SNPs can be assumed to be present. To develop the best possible prediction models, extensive joint hyperparameter optimizations should be conducted.
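
The Background's point that additive scores can miss gene-gene interactions can be illustrated with a toy sketch. The code below is not the authors' method or data: the four genotypes, the unit weights, and the helper names `additive_grs`, `logic_score`, and `auc` are all hypothetical, chosen only to show a purely epistatic (XOR-type) trait. It compares a weighted-sum GRS with a single Boolean SNP combination of the kind logic regression searches over, scoring both by the area under the curve.

```python
from typing import Sequence


def additive_grs(genotypes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of minor-allele counts (0, 1, or 2 per SNP)."""
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return sum(w * g for w, g in zip(weights, genotypes))


def logic_score(genotypes: Sequence[int]) -> float:
    """Hypothetical Boolean feature: exactly one of the first two SNPs
    carries at least one minor allele (an XOR-type interaction)."""
    return float((genotypes[0] > 0) != (genotypes[1] > 0))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random
    control, counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data (assumed): disease status depends only on the interaction.
genotypes = [(0, 0), (1, 0), (0, 1), (1, 1)]
labels = [0, 1, 1, 0]
weights = (1.0, 1.0)  # assumed per-SNP weights, for illustration only

additive_scores = [additive_grs(g, weights) for g in genotypes]
logic_scores = [logic_score(g) for g in genotypes]

print("additive GRS AUC:", auc(additive_scores, labels))
print("logic score AUC:", auc(logic_scores, labels))
```

On this toy data the additive score reaches an AUC of 0.5, and no choice of the two weights ranks both cases above both controls, whereas the Boolean feature separates them with an AUC of 1.0. Real GRS construction involves many SNPs, noise, and fitted weights, so this only shows the mechanism, not the magnitude of any effect reported in the article.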

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8935722
DOI: http://dx.doi.org/10.1186/s12859-022-04634-w

Similar Publications

Objective: This study aimed to develop a simple machine-learning model incorporating lymph node metastasis status with F-18 Fluorodeoxyglucose positron emission tomography/computed tomography (FDG PET/CT) and clinical information for predicting regional lymph node metastasis in patients with colon cancer.

Methods: This retrospective study included 193 patients diagnosed with colon cancer between January 2014 and December 2017. All patients underwent F-18 FDG PET/CT and blood test before surgery.

Background: This study aimed to develop a dynamic survival prediction model utilizing conditional survival (CS) analysis and machine learning techniques for gastric neuroendocrine carcinomas (GNECs).

Patients And Methods: Data from the Surveillance, Epidemiology, and End Results (SEER) database (2004-2015) were analyzed and split into training and validation groups (7:3 ratio). CS profiles for patients with GNEC were examined in the full cohort.

Objectives: Rheumatoid arthritis (RA) is a chronic, destructive autoimmune disorder that predominantly targets the joints, and gut microbiota dysbiosis is closely associated with its progression. The aim of the present study was to develop effective diagnostic methods for early RA based on gut microbiota.

Methods: A cohort comprising 262 RA patients and 475 healthy controls (HCs) was recruited.

An intelligent decision-making system for embryo transfer in reproductive technology: a machine learning-based approach.

Syst Biol Reprod Med

December 2025

Department of Mathematics and Computer Science, Laboratory of Analysis, Modeling and Simulation, Faculty of Sciences Ben M'sik, Hassan II University of Casablanca, Casablanca, Morocco.

Infertility has emerged as a significant public health concern, with assisted reproductive technology (ART) serving as a last-resort treatment option. However, ART's efficacy is limited by significant financial cost and physical discomfort. The aim of this study is to build machine learning (ML) decision-support models that predict the optimal range of embryo numbers to transfer, using data from infertile couples identified through literature reviews.

Objective: We aimed to explore the application value of unsupervised machine learning in identifying acute gastrointestinal injury (AGI) after extracorporeal circulation for acute type A aortic dissection (ATAAD).

Methods: Patients who underwent extracorporeal circulation for ATAAD at the First Hospital of Lanzhou University from January 2016 to January 2021 were included. An unsupervised machine learning algorithm was used to stratify patients into different phenogroups according to the similarity of their clinical features and laboratory test results.
