Propensity score matching enables batch-effect-corrected imputation in single-cell RNA-seq analysis.

Brief Bioinform

School of Statistics and Data Science, Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin, Nankai University, Tianjin 300071, China.

Published: July 2022

AI Article Synopsis

  • Recent advances in single-cell RNA sequencing (scRNA-seq) technologies have enabled significant biological discoveries, but the data carry technical artifacts such as batch effects and dropouts that complicate analysis.
  • Current methods typically address only one of these issues at a time and struggle to handle batch effects and dropouts together.
  • A new method called scPSM uses propensity score matching to correct batch effects, impute dropouts, and denoise scRNA-seq datasets, and it outperforms other methods at maintaining cell type separation and preserving biological structure across the tested datasets.

Article Abstract

Advances in single-cell RNA sequencing (scRNA-seq) technologies have enabled high-throughput biological discovery at single-cell resolution. However, large scRNA-seq datasets typically suffer from substantial technical noise, including batch effects and dropouts, and dropout is often shown to be batch-dependent. Most existing methods address only one of these problems, and we show that widely used methods fail to balance batch effect correction against dropout imputation. Here, inspired by the idea of causal inference, we propose a novel propensity score matching method for scRNA-seq data (scPSM). scPSM borrows information from similar cells in the deeply sequenced batch and takes their weighted average, which simultaneously removes the batch effect, imputes dropouts and denoises the data in the entire gene expression space. The proposed method is tested on two simulated datasets and a variety of real scRNA-seq datasets, and the results show that scPSM is superior to other state-of-the-art methods. First, scPSM improves clustering accuracy and mixes cells of the same type across batches, suggesting that it preserves cell type separation while correcting for batch. Second, using scPSM-integrated data as input to differential expression analysis yields results free of batch effects and dropouts. Third, scPSM not only denoises effectively but also preserves real biological structure for downstream gene-based analyses. Finally, scPSM is robust to hyperparameter choices and to small datasets with few cells but many genes. Comprehensive evaluations demonstrate that scPSM jointly provides desirable batch effect correction, imputation and denoising for recovering biologically meaningful expression in scRNA-seq data.
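
The core idea in the abstract (match cells by propensity score, then take a weighted average over similar cells in the deeply sequenced batch) can be illustrated with a small sketch. This is not the authors' scPSM implementation. It is a generic propensity-score-matching toy under stated assumptions: the propensity score is a plain logistic regression fitted by gradient descent, similarity is the absolute difference in propensity score, and the weights are inverse distances. The function names `fit_propensity` and `psm_smooth`, the parameter `k`, and the choice of input features are all hypothetical.

```python
import numpy as np

def fit_propensity(features, is_reference, n_iter=500, lr=0.1):
    """Estimate P(cell belongs to the reference batch | features).

    Assumption: a simple logistic regression trained by gradient descent.
    """
    X = np.column_stack([np.ones(len(features)), features])  # add intercept
    y = np.asarray(is_reference, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(X)  # gradient of the mean log-loss
    return 1.0 / (1.0 + np.exp(-X @ w))

def psm_smooth(expr, features, is_reference, k=5):
    """Replace each cell's profile with a weighted average of the k
    reference-batch cells closest to it in propensity score.

    Assumption: inverse-distance weights; the paper may weight differently.
    """
    is_reference = np.asarray(is_reference, dtype=bool)
    scores = fit_propensity(features, is_reference)
    ref_idx = np.flatnonzero(is_reference)
    ref_scores = scores[ref_idx]
    out = np.empty(expr.shape, dtype=float)
    for i, s in enumerate(scores):
        dist = np.abs(ref_scores - s)
        nearest = np.argsort(dist)[:k]           # k best matches in the reference batch
        weights = 1.0 / (dist[nearest] + 1e-8)   # closer matches count more
        weights /= weights.sum()
        out[i] = weights @ expr[ref_idx[nearest]]
    return out

# Toy usage: 40 cells, 6 genes, the first 20 cells form the reference batch.
rng = np.random.default_rng(0)
expr = rng.poisson(2.0, size=(40, 6)).astype(float)
features = rng.normal(size=(40, 3))  # e.g. a low-dimensional embedding (assumed)
is_reference = np.arange(40) < 20
corrected = psm_smooth(expr, features, is_reference, k=5)
```

Because every output profile is a convex combination of reference-batch profiles, the corrected values stay within the range observed in the reference batch for each gene, which is one way to see how a single averaging step can address batch shift, dropout zeros and noise together.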

Source
http://dx.doi.org/10.1093/bib/bbac275
