Penalized regression with multiple sources of prior effects.

Bioinformatics

Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4362 Esch-sur-Alzette, Luxembourg.

Published: December 2023

Motivation: In many high-dimensional prediction or classification tasks, complementary data on the features are available, e.g. prior biological knowledge on (epi)genetic markers. Here we consider tasks with numerical prior information that provides insight into the importance (weight) and the direction (sign) of the feature effects, e.g. regression coefficients from previous studies.

Results: We propose an approach for integrating multiple sources of such prior information into penalized regression. If suitable co-data are available, this improves the predictive performance, as shown by simulation and application.
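To make the idea of integrating prior effects more concrete, the following is a minimal Python sketch under stated assumptions. It is not the algorithm implemented in transreg; it illustrates one generic way to use several vectors of prior coefficients in ridge regression. Stage 1 estimates one weight per source by regressing the response on the prior-based linear predictors; stage 2 fits a ridge model that shrinks towards the weighted combination of priors instead of towards zero. All function names (`solve`, `matvec`, `ridge`, `fit_with_priors`) and the parameters `lam` and `eps` are hypothetical.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def matvec(X, v):
    """Multiply the matrix X (list of rows) by the vector v."""
    return [sum(xij * vj for xij, vj in zip(row, v)) for row in X]


def ridge(X, y, lam):
    """Return argmin_b ||y - X b||^2 + lam * ||b||^2 (normal equations)."""
    p = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    rhs = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(A, rhs)


def fit_with_priors(X, y, priors, lam, eps=1e-8):
    """Ridge regression shrunk towards a weighted combination of priors.

    Assumption of this sketch (not taken from the paper): source weights
    come from an almost unpenalized regression of y on X @ prior_k.
    """
    p = len(X[0])
    # Stage 1: one weight per source of prior effects.
    Z = [list(row) for row in zip(*(matvec(X, prior) for prior in priors))]
    w = ridge(Z, y, eps)
    b0 = [sum(wk * prior[j] for wk, prior in zip(w, priors)) for j in range(p)]
    # Stage 2: minimise ||y - X b||^2 + lam * ||b - b0||^2 by fitting a
    # ridge model to the residuals and adding the result to b0.
    resid = [yi - fi for yi, fi in zip(y, matvec(X, b0))]
    d = ridge(X, resid, lam)
    return [b0j + dj for b0j, dj in zip(b0, d)], w


# Simulated example: more features than samples, two prior sources.
random.seed(0)
n, p = 15, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
beta = [random.gauss(0, 1) for _ in range(p)]
y = [fi + random.gauss(0, 0.5) for fi in matvec(X, beta)]
informative = [2 * bj + random.gauss(0, 0.3) for bj in beta]  # rescaled, noisy
uninformative = [random.gauss(0, 1) for _ in range(p)]        # unrelated
coef, weights = fit_with_priors(X, y, [informative, uninformative], lam=1.0)
print("source weights:", weights)
```

Because the source weights are estimated from the data, a prior that is on the wrong scale can be rescaled, and an unrelated prior can in principle receive a weight near zero. The published method may handle scale, sign, and weighting differently.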

Availability And Implementation: The proposed method is implemented in the R package transreg (https://github.com/lcsb-bds/transreg, https://cran.r-project.org/package=transreg).

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10699841
DOI: http://dx.doi.org/10.1093/bioinformatics/btad680

Similar Publications

HighDimMixedModels.jl: Robust high-dimensional mixed-effects models across omics data.

PLoS Comput Biol

January 2025

Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data.

Unraveling the potential mechanism and prognostic value of pentose phosphate pathway in hepatocellular carcinoma: a comprehensive analysis integrating bulk transcriptomics and single-cell sequencing data.

Funct Integr Genomics

January 2025

Institute of Infectious Diseases, Guangdong Province, Guangzhou Eighth People's Hospital, Guangzhou Medical University, 8 Huaying Road, Baiyun District, Guangzhou, 510440, China.

Hepatocellular carcinoma (HCC) remains a malignant and life-threatening tumor with an extremely poor prognosis, posing a significant global health challenge. Despite the continuous emergence of novel therapeutic agents, patients exhibit substantial heterogeneity in their responses to anti-tumor drugs and overall prognosis. The pentose phosphate pathway (PPP) is highly activated in various tumor cells and plays a pivotal role in tumor metabolic reprogramming.

Transfer learning aims to integrate useful information from multi-source datasets to improve the learning performance of target data. This can be effectively applied in genomics when we learn the gene associations in a target tissue, and data from other tissues can be integrated. However, heavy-tail distribution and outliers are common in genomics data, which poses challenges to the effectiveness of current transfer learning approaches.

Formulas to estimate dietary sodium intake from spot urine lead to misleading associations with cardiovascular disease risk and mortality.

J Hypertens

January 2025

Centre for Public Health & Policy, Wolfson Institute of Population Health, Barts and The London School of Medicine & Dentistry, Queen Mary University of London, London, United Kingdom.

Objectives: To test the hypothesis that the association of formula-estimated sodium intake from spot urine with cardiovascular disease is independent of spot urinary sodium concentration.

Methods: We included 435 336 participants in the UK Biobank whose sodium intake was estimated from spot urine using the INTERSALT, Kawasaki, and Tanaka formulas. Hazard ratios for cardiovascular disease (CVD) events and deaths were estimated using a Cox proportional-hazards model adjusted for multiple covariates.

The predictive value of anti-IFI16 antibodies for the development or persistence of digital ulcers in systemic sclerosis.

Clin Rheumatol

January 2025

Department of Rheumatology, Huashan Hospital, Fudan University, No.12 Wulumuqi Zhong Road, Shanghai, 200040, China.

To evaluate the association of anti-IFI16 antibodies with peripheral vasculopathy and the predictive value of anti-IFI16 antibodies for the development or persistence of digital ulcers (DPDU) in SSc. A total of 42 SSc patients and 42 age- and sex-matched healthy controls were enrolled. Anti-IFI16 antibodies were examined by ELISA.
