The proliferation of single-cell RNA-seq data has greatly enhanced our ability to study the complexity of diverse tissues. However, accurately annotating cell types in such data remains a significant challenge, especially when multiple reference datasets must be combined and novel cell types must be identified. To address these issues, we introduce Single Cell annotation based on Distance metric learning and Optimal Transport (scDOT), a cell-type annotation method that integrates multiple reference datasets and uncovers previously unseen cell types. scDOT introduces two key innovations. First, it combines distance metric learning and optimal transport in a novel optimization framework. This framework learns the predictive power of each reference dataset for new query data and simultaneously establishes a probabilistic mapping between cells in the query data and reference-defined cell types. Second, scDOT derives an interpretable scoring system from the learned probabilistic mapping, enabling the identification of previously unseen cell types within the query data. To assess scDOT's capabilities, we systematically evaluate its performance on two collections of benchmark datasets covering various tissues, sequencing technologies and cell types. Our experimental results consistently show superior performance of scDOT in cell-type annotation and in the identification of previously unseen cell types. These advances provide researchers with a tool for precise cell-type annotation, ultimately enriching our understanding of complex biological tissues.
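To make the idea of a probabilistic mapping between query cells and reference-defined cell types more concrete, the following is a minimal sketch of entropy-regularised optimal transport (Sinkhorn iterations) in NumPy. It is not the scDOT algorithm: the abstract does not specify the objective, so the plain squared Euclidean cost (no learned metric), the uniform marginals, the single set of cell-type centroids, and the expected-cost "poor fit" score are all assumptions made for illustration, and the function names `sinkhorn_plan` and `annotate` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, eps=0.1, n_iter=500):
    """Entropy-regularised OT plan between histograms a (rows) and b (columns)."""
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]


def annotate(query, centroids, eps=0.1):
    """Map query cells to cell-type centroids (illustrative, not scDOT).

    query:     (n_cells, n_features) array of query-cell embeddings
    centroids: (n_types, n_features) array, one row per reference cell type
    """
    # Assumption: squared Euclidean cost, scaled to [0, 1] for numerical stability.
    cost = ((query[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    cost = cost / cost.max()
    n, k = cost.shape
    # Assumption: uniform mass on cells and on cell types (balanced OT).
    a = np.full(n, 1.0 / n)
    b = np.full(k, 1.0 / k)
    plan = sinkhorn_plan(cost, a, b, eps)
    # Row-normalising the plan gives a per-cell distribution over cell types.
    probs = plan / plan.sum(axis=1, keepdims=True)
    labels = probs.argmax(axis=1)
    # Hypothetical score: expected transport cost per cell; larger values
    # indicate a cell that fits none of the reference types well.
    score = (probs * cost).sum(axis=1)
    return probs, labels, score
```

Calling `annotate(query, centroids)` returns the per-cell type probabilities, the most probable type index, and the illustrative fit score. Because balanced transport forces every cell type to receive equal mass, this sketch is only sensible when the query is roughly balanced across types; a method aimed at unseen cell types would need to relax that constraint.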

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10939303
DOI: http://dx.doi.org/10.1093/bib/bbae072
