Publications by authors named "N Leroy"

The amount of genomic region data continues to increase. Integrating diverse genomic region sets requires consensus regions, which make regions comparable across experiments but necessarily lose precision in the region definitions. We need methods to assess this loss of precision and to build optimal consensus region sets.

Article Synopsis
  • Representation learning models are essential in genomics, creating vector representations (or embeddings) of biological entities like cells and genes.
  • Unsupervised methods can uncover relationships among genomic regions and derive meaningful insights without relying on curated metadata.
  • To assess the quality of these region embeddings, four evaluation metrics are proposed: CTS, RCS, GDSS, and NPS, which measure clustering ability and how well genomic relationships are captured in the embeddings.

Background: As biological data increase, we need additional infrastructure to share them and promote interoperability. While major effort has gone into sharing data, less emphasis has been placed on sharing metadata. Yet sharing metadata is also important and in some ways has a wider scope than sharing the data themselves.

Data from the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) are now widely available. One major computational challenge is dealing with high dimensionality and inherent sparsity, which is typically addressed by producing lower dimensional representations of single cells for downstream clustering tasks. Current approaches produce such individual cell embeddings directly through a one-step learning process.

Motivation: Gene set enrichment (GSE) analysis interprets gene expression through pre-defined gene set databases and is a critical step in understanding different phenotypes. With the rapid development of single-cell RNA sequencing (scRNA-seq) technology, GSE analysis can be performed on fine-grained gene expression data to gain a nuanced understanding of phenotypes of interest. However, because of the cellular heterogeneity in single-cell gene profiles, current statistical GSE analysis methods sometimes fail to identify enriched gene sets.
