Purpose: Supplementing investigator-specified variables with large numbers of empirically identified features that collectively serve as 'proxies' for unspecified or unmeasured factors can often improve confounding control in studies utilizing administrative healthcare databases. Consequently, there has been a recent focus on the development of data-driven methods for high-dimensional proxy confounder adjustment in pharmacoepidemiologic research. In this paper, we survey current approaches and recent advancements for high-dimensional proxy confounder adjustment in healthcare database studies.

Methods: We discuss considerations underpinning three areas for high-dimensional proxy confounder adjustment: (1) feature generation, that is, transforming raw data into covariates (or features) to be used for proxy adjustment; (2) covariate prioritization, selection, and adjustment; and (3) diagnostic assessment. We discuss challenges and avenues of future development within each area.
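
As a rough illustration of how the three areas fit together, the following minimal Python sketch generates binary features from raw codes, ranks them, and reports a balance diagnostic. It is not taken from the paper: the toy records, the function names, and the choice to rank features by the standardized difference in prevalence between exposure groups are all assumptions made for brevity. Methods used in practice typically also weigh each feature's association with the outcome, and balance would usually be re-checked after adjustment.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy claims records: (patient_id, recorded_code).
claims = [
    (1, "D1"), (1, "D2"), (2, "D1"), (3, "D2"), (3, "D3"), (4, "D1"),
    (4, "D3"), (5, "D2"), (5, "D3"), (6, "D2"), (6, "D3"),
]
# Hypothetical exposure indicator per patient (1 = exposed, 0 = comparator).
treated = {1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0}


def generate_features(claims, patient_ids):
    """Area 1: turn raw codes into binary 'code ever recorded' covariates."""
    seen = defaultdict(set)
    for pid, code in claims:
        seen[pid].add(code)
    codes = sorted({code for _, code in claims})
    return {c: {p: int(c in seen[p]) for p in patient_ids} for c in codes}


def standardized_mean_difference(feature, treated):
    """Area 3: absolute standardized difference in prevalence between groups."""
    g1 = [v for p, v in feature.items() if treated[p] == 1]
    g0 = [v for p, v in feature.items() if treated[p] == 0]
    p1, p0 = sum(g1) / len(g1), sum(g0) / len(g0)
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    # Guard against zero variance in both groups (simplifying assumption).
    return abs(p1 - p0) / pooled_sd if pooled_sd > 0 else 0.0


def prioritize(features, treated, k):
    """Area 2 (simplified): keep the k features most imbalanced across groups."""
    ranked = sorted(
        features,
        key=lambda c: standardized_mean_difference(features[c], treated),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    features = generate_features(claims, sorted(treated))
    for code in prioritize(features, treated, k=2):
        smd = standardized_mean_difference(features[code], treated)
        print(code, round(smd, 3))
```

The selected features would then enter an adjustment model (for example, a propensity score model) alongside the investigator-specified variables.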

Results: There is a large literature on methods for high-dimensional confounder prioritization/selection, but relatively little has been written on best practices for feature generation and diagnostic assessment. Consequently, these areas have particular limitations and challenges.

Conclusions: A growing body of evidence shows that machine-learning algorithms for high-dimensional proxy confounder adjustment can supplement investigator-specified variables and improve confounding control relative to adjustment based on investigator-specified variables alone. However, more research is needed on best practices for feature generation and diagnostic assessment when applying these methods in pharmacoepidemiologic studies.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9541861
DOI: http://dx.doi.org/10.1002/pds.5500
