A key challenge in analyzing single-cell RNA sequencing (scRNA-seq) data is the large number of false zeros, known as "dropout zeros", which are caused by technical limitations such as shallow sequencing depth or inefficient mRNA capture. To address this challenge, we propose a novel imputation model called CPARI, which combines cell partitioning with our absolute and relative imputation methods. CPARI first selects highly variable genes with a new approach and constructs an average consensus matrix using fuzzy C-means clustering-based blockchain technology to obtain results at different resolutions. Hierarchical clustering is then applied to refine these blocks, yielding well-defined cellular partitions. Next, CPARI identifies dropout events and determines the positions of the identified zeros that should be imputed. An autoencoder is trained within each cellular block to learn gene features and reconstruct the data. Our absolute imputation technique is applied first to the identified positions, followed by our relative imputation technique for the remaining dropout zeros, so that both global consistency and local variation are maintained. In comprehensive analyses on simulated and real scRNA-seq datasets, covering quantitative assessment, differential expression analysis, cell clustering, cell trajectory inference, robustness evaluation, and large-scale data imputation, CPARI outperforms 12 other state-of-the-art imputation models. Ablation experiments further confirm the significance and necessity of both the cell partitioning and relative imputation components of CPARI. Notably, as a new denoising approach, CPARI can distinguish real biological zeros from dropout zeros, minimizing false positives and maximizing imputation accuracy.
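The cell-partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the consensus matrix is built, so this example assumes that fuzzy C-means memberships are hardened into labels at several cluster counts, that co-clustering indicators are averaged into a consensus matrix, and that average-linkage hierarchical clustering is run on one minus the consensus. The names `fuzzy_c_means`, `average_consensus`, `partition_cells`, and `resolutions` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy C-means; returns an (n_cells, c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])  # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U


def average_consensus(X, resolutions=(2, 3, 4)):
    """Average co-clustering indicator over several cluster counts (assumed scheme)."""
    n = X.shape[0]
    consensus = np.zeros((n, n))
    for c in resolutions:
        labels = fuzzy_c_means(X, c).argmax(axis=1)  # harden memberships
        consensus += labels[:, None] == labels[None, :]
    return consensus / len(resolutions)


def partition_cells(X, n_blocks, resolutions=(2, 3, 4)):
    """Cut an average-linkage tree built on 1 - consensus into n_blocks blocks."""
    dissimilarity = 1.0 - average_consensus(X, resolutions)
    np.fill_diagonal(dissimilarity, 0.0)
    tree = linkage(squareform(dissimilarity, checks=False), method="average")
    return fcluster(tree, t=n_blocks, criterion="maxclust")


# Toy usage: 40 "cells" x 5 "genes" drawn from two well-separated groups.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 5)), rng.normal(10, 0.5, (20, 5))])
blocks = partition_cells(X, n_blocks=2)
```

In this sketch, each resulting block would then receive its own autoencoder and imputation pass; those later stages are not shown because the abstract does not describe them in enough detail to reproduce.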
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11666288 | PMC |
| http://dx.doi.org/10.1093/bib/bbae668 | DOI Listing |
J Clin Nurs
January 2025
Department of Nursing, Faculty of Health Sciences, University of Jaén, Jaén, Spain.
Background And Objectives: Although a substantial amount of research has focused on negative aspects of caregiving, less research has been conducted investigating positive aspects of providing informal care. The aim of this study was to investigate the longitudinal association between caregiving satisfaction and psychological distress in informal carers of dependent older people, and whether this relationship is mediated by caregiver burden.
Research Design And Methods: Prospective longitudinal study with a probabilistic sample of 332 caregivers of older relatives, with data collected at baseline and at 1-year follow-up.
Plant Genome
March 2025
Plant Breeding Graduate Program, Horticultural Sciences Department, University of Florida, IFAS Gulf Coast Research and Education Center, Wimauma, Florida, USA.
Genomic selection is a widely used quantitative method of determining the genetic value of an individual from genomic information and phenotypic data. In this study, we used a large, multi-year training population of 3248 individuals from the University of Florida strawberry (Fragaria × ananassa Duchesne) breeding program. We coupled this training population with a test population of 1460 individuals derived from 20 biparental families.
Int J Mol Sci
December 2024
Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Zwirki i Wigury 101, 02-089 Warsaw, Poland.
Mass-spectrometry-based proteomics frequently utilizes label-free quantification strategies due to their cost-effectiveness, methodological simplicity, and capability to identify large numbers of proteins within a single analytical run. Despite these advantages, the prevalence of missing values (MV), which can impact up to 50% of the data matrix, poses a significant challenge by reducing the accuracy, reproducibility, and interpretability of the results. Consequently, effective handling of missing values is crucial for reliable quantitative analysis in proteomic studies.
Behav Sci (Basel)
December 2024
College of Public Health, University of Georgia, Athens, GA 30602, USA.
Poor Self-Rated Health (SRHp) is part of a four-item scale for self-assessment. SRH from the 2019 Behavioral Risk Factor Surveillance Survey (BRFSS) is used to test hypotheses linking population-level well-being influenced by bereavement due to the death of a close friend or relative. By linking the prevalence rates of population-level well-being with exposure to bereavement, we extend our knowledge of this exposure beyond single-person studies.
Mach Learn
October 2024
Division of Biostatistics and Health Data Science, School of Public Health, University of Minnesota, Minneapolis, 55455, MN, USA.
Data for several applications in diverse fields can be represented as multiple matrices that are linked across rows or columns. This is particularly common in molecular biomedical research, in which multiple molecular "omics" technologies may capture different feature sets (e.g.