Purpose: Supplementing investigator-specified variables with large numbers of empirically identified features that collectively serve as 'proxies' for unspecified or unmeasured factors can often improve confounding control in studies utilizing administrative healthcare databases. Consequently, there has been a recent focus on the development of data-driven methods for high-dimensional proxy confounder adjustment in pharmacoepidemiologic research. In this paper, we survey current approaches and recent advancements for high-dimensional proxy confounder adjustment in healthcare database studies.
Methods: We discuss considerations underpinning three areas for high-dimensional proxy confounder adjustment: (1) feature generation, that is, transforming raw data into covariates (or features) to be used for proxy adjustment; (2) covariate prioritization, selection, and adjustment; and (3) diagnostic assessment. We discuss challenges and avenues of future development within each area.
Results: There is a large literature on methods for high-dimensional confounder prioritization/selection, but relatively little has been written on best practices for feature generation and diagnostic assessment. Consequently, these areas have particular limitations and challenges.
Conclusions: There is a growing body of evidence showing that machine-learning algorithms for high-dimensional proxy confounder adjustment can supplement investigator-specified variables to improve confounding control compared to adjustment based on investigator-specified variables alone. However, more research is needed on best practices for feature generation and diagnostic assessment when applying methods for high-dimensional proxy confounder adjustment in pharmacoepidemiologic studies.
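To make the three areas named in the Methods paragraph more concrete, the following is a minimal Python sketch under stated assumptions, not the procedure of any specific method surveyed in the paper. It assumes claims arrive as (patient, code) pairs, generates one binary "ever recorded" feature per code, and ranks features by the absolute standardized difference in prevalence between treatment groups. That ranking is a simple stand-in for a real prioritization algorithm, and the same statistic doubles as a basic balance diagnostic. The function names, the toy codes, and the data are all hypothetical.

```python
from collections import defaultdict
from math import copysign, inf, sqrt


def generate_features(claims):
    """Feature generation: map each code to the set of patients with >= 1 record.

    Each code becomes a binary 'ever recorded' covariate (an assumed, very
    simple encoding; real pipelines may also use frequency or timing).
    """
    features = defaultdict(set)
    for patient_id, code in claims:
        features[code].add(patient_id)
    return dict(features)


def standardized_difference(p_treated, p_untreated):
    """Standardized difference in prevalence for a binary covariate."""
    pooled = sqrt((p_treated * (1 - p_treated) + p_untreated * (1 - p_untreated)) / 2)
    if pooled == 0:
        # No variance in either group: identical prevalence or perfect separation.
        diff = p_treated - p_untreated
        return 0.0 if diff == 0 else copysign(inf, diff)
    return (p_treated - p_untreated) / pooled


def rank_features(features, treated, untreated):
    """Prioritization stand-in: sort features by absolute standardized difference."""
    scores = {}
    for code, patients in features.items():
        p1 = len(patients & treated) / len(treated)
        p0 = len(patients & untreated) / len(untreated)
        scores[code] = standardized_difference(p1, p0)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: (patient_id, diagnosis code); the duplicate row is deliberate.
claims = [
    (1, "E11"), (1, "E11"), (2, "E11"), (3, "E11"), (5, "E11"),
    (1, "I10"), (2, "I10"), (5, "I10"), (6, "I10"),
    (4, "N18"),
]
treated = {1, 2, 3, 4}
untreated = {5, 6, 7, 8}

features = generate_features(claims)
ranking = rank_features(features, treated, untreated)

for code, smd in ranking:
    print(f"{code}: standardized difference = {smd:.3f}")
```

In this sketch a feature that is equally common in both groups scores zero and falls to the bottom of the ranking. Note that the ranking uses only the association with treatment; a prioritization method intended for confounding control would typically also consider the association with the outcome.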
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9541861 | PMC |
http://dx.doi.org/10.1002/pds.5500 | DOI Listing |
HardwareX
December 2024
University of Plymouth, School of Biological and Marine Sciences, Drake Circus, Plymouth, PL4 8AA, England.
J Clin Transl Sci
September 2024
iREACH, ChristianaCare Health Services, Inc., Newark, DE, USA.
Objective: The progression of long-term diabetes complications has led to a decreased quality of life. Our objective was to evaluate the adverse outcomes associated with diabetes based on a patient's clinical profile by utilizing a multistate modeling approach.
Methods: This was a retrospective study of diabetes patients seen in primary care practices from 2013 to 2017.
JAMA Neurol
August 2024
Department of Neurology, University of Pennsylvania, Philadelphia.
Importance: Direct-acting oral anticoagulants (DOACs) are commonly prescribed with antiseizure medications (ASMs) due to concurrency of and the association between atrial fibrillation (AF) and epilepsy. However, enzyme-inducing (EI) ASMs may reduce absorption and accelerate metabolism of DOACs, potentially lowering DOAC levels and elevating thromboembolism risk.
Objective: To assess the rates of thromboembolic and major bleeding events in adults with AF and epilepsy dispensed DOACs and EI ASMs vs DOACs with non-EI ASMs.
An expansive area of research focuses on discerning patterns of alterations in functional brain networks from the early stages of Alzheimer's disease, even at the subjective cognitive decline (SCD) stage. Here, we developed a novel hyperbolic MEG brain network embedding framework for transforming high-dimensional complex MEG brain networks into lower-dimensional hyperbolic representations. Using this model, we computed hyperbolic embeddings of the MEG brain networks of two distinct participant groups: individuals with SCD and healthy controls.
PLoS Comput Biol
May 2024
Department of Applied Mathematics, University of Colorado, Boulder, Colorado, United States of America.
Random forests have emerged as a promising tool in comparative metagenomics because they can predict environmental characteristics based on microbial composition in datasets where β-diversity metrics fall short of revealing meaningful relationships between samples. Nevertheless, despite this efficacy, they lack biological insight in tandem with their predictions, potentially hindering scientific advancement. To overcome this limitation, we leverage a geometric characterization of random forests to introduce a data-driven phylogenetic β-diversity metric, the adaptive Haar-like distance.