Publications by authors named "Ofir Lindenbaum"

The brain is an intricate system that controls a variety of functions. It consists of a vast number of cells that exhibit diverse characteristics. To understand brain function in health and disease, it is crucial to classify neurons accurately.


People with HIV (PWH) on antiretroviral therapy (ART) experience elevated rates of neurological impairment, even after controlling for demographic factors and comorbidities, suggesting viral or neuroimmune etiologies for these deficits. Here, we apply multimodal and cross-compartmental single-cell analyses of paired cerebrospinal fluid (CSF) and peripheral blood in PWH and uninfected controls. We demonstrate that a subset of central memory CD4+ T cells in the CSF produced HIV-1 RNA, despite apparent systemic viral suppression, and that HIV-1-infected cells were more frequently found in the CSF than in the blood.


Modern datasets often contain large subsets of correlated features and nuisance features that are unrelated, or only loosely related, to the data's main underlying structures. Nuisance features can be identified with the Laplacian score criterion, which evaluates a feature's importance via its consistency with the graph Laplacian's leading eigenvectors. We demonstrate that, in the presence of many nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set.
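The classical Laplacian score (He et al., 2005) can be sketched in a few lines of NumPy. Note that this baseline builds the graph from all features, which is exactly the failure mode the abstract describes when many nuisance features are present; the data and bandwidth below are illustrative.

```python
import numpy as np

def laplacian_score(X, sigma=1.0):
    # Laplacian score (He et al., 2005): lower means the feature varies
    # smoothly over the sample graph, i.e. it is more "important".
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))      # RBF affinity, fully connected graph
    D = W.sum(axis=1)
    L = np.diag(D) - W                      # unnormalized graph Laplacian
    scores = []
    for j in range(X.shape[1]):
        f = X[:, j]
        f = f - (f @ D) / D.sum()           # remove the trivial (constant) component
        scores.append((f @ L @ f) / (f @ (D * f)))
    return np.array(scores)

rng = np.random.default_rng(0)
labels = np.repeat([0.0, 1.0], 50)
# two informative cluster features plus one pure-noise nuisance feature
informative = 3.0 * labels[:, None] + 0.1 * rng.normal(size=(100, 2))
nuisance = rng.normal(size=(100, 1))
X = np.hstack([informative, nuisance])
scores = laplacian_score(X)                 # informative features score lower
```

Here the graph is still dominated by the cluster structure, so the nuisance feature is correctly flagged; with many more nuisance features than informative ones, the all-feature graph degrades, motivating the abstract's point about computing the Laplacian on selected features only.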


Word2vec, introduced by Mikolov et al., is a word embedding method that is widely used in natural language processing. Despite its success and frequent use, a strong theoretical justification for it is still lacking.
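For context, the objective word2vec optimizes can be sketched as a toy skip-gram-with-negative-sampling trainer in NumPy; the corpus, hyperparameters, and update rule below are a minimal illustration, not the reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8
W_in = rng.normal(scale=0.1, size=(V, D))    # target-word vectors
W_out = rng.normal(scale=0.1, size=(V, D))   # context-word vectors

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr, window, negatives = 0.05, 2, 3
for _ in range(200):                         # a few passes over the toy corpus
    for pos, word in enumerate(corpus):
        t = idx[word]
        for off in range(-window, window + 1):
            c = pos + off
            if off == 0 or c < 0 or c >= len(corpus):
                continue
            ctx = idx[corpus[c]]
            # positive (word, context) pair: push the dot product up
            g = sigmoid(W_in[t] @ W_out[ctx]) - 1.0
            grad_t = g * W_out[ctx]
            W_out[ctx] -= lr * g * W_in[t]
            # negative samples: push random words' dot products down
            # (may occasionally hit the true context; harmless in a toy)
            for n in rng.integers(0, V, size=negatives):
                s = sigmoid(W_in[t] @ W_out[n])
                grad_t += s * W_out[n]
                W_out[n] -= lr * s * W_in[t]
            W_in[t] -= lr * grad_t
```

After training, rows of `W_in` serve as word embeddings; words appearing in similar contexts drift toward similar vectors.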


A low-dimensional dynamical system is observed in an experiment as a high-dimensional signal, for example, a video of a chaotic pendulum system. Assuming that we know the dynamical model up to some unknown parameters, can we estimate the underlying system's parameters by measuring its time evolution only once? The key information for performing this estimation lies in the temporal inter-dependencies between the signal and the model. We propose a kernel-based score to compare these dependencies.
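The paper's score is not reproduced here; as a generic stand-in, centered kernel alignment between Gram matrices of an observed trajectory and model simulations illustrates the idea, under the simplifying assumption that the simulations share the observation's initial condition (the logistic map and all values below are illustrative).

```python
import numpy as np

def rbf_gram(x, sigma=0.1):
    # RBF Gram matrix of a scalar time series
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def alignment(K1, K2):
    # Centered kernel alignment: equals 1 iff centered Grams are proportional.
    n = K1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    A, B = H @ K1 @ H, H @ K2 @ H
    return (A * B).sum() / (np.linalg.norm(A) * np.linalg.norm(B))

def logistic(r, x0=0.2, n=100):
    # Logistic map x_{t+1} = r * x_t * (1 - x_t) as a toy dynamical model
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return np.array(xs)

observed = logistic(3.7)                        # "measured" trajectory
K_obs = rbf_gram(observed)
candidates = [3.5, 3.7, 3.9]
scores = [alignment(K_obs, rbf_gram(logistic(r))) for r in candidates]
best = candidates[int(np.argmax(scores))]       # the true parameter maximizes the score
```

The kernel comparison sidesteps pointwise matching of trajectories: only the pattern of pairwise similarities over time must agree between signal and model.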


Context.— Large cell transformation (LCT) of indolent B-cell lymphomas, such as follicular lymphoma (FL) and chronic lymphocytic leukemia (CLL), signals a worse prognosis, at which point aggressive chemotherapy is initiated. Although LCT is relatively straightforward to diagnose in lymph nodes, a marrow biopsy is often obtained first given its ease of procedure, low cost, and low morbidity.


Comprehensive and accurate comparisons of transcriptomic distributions of cells from samples taken from two different biological states, such as healthy versus diseased individuals, are an emerging challenge in single-cell RNA sequencing (scRNA-seq) analysis. Current methods for detecting differentially abundant (DA) subpopulations between samples rely heavily on initial clustering of all cells in both samples. Often, this clustering step is inadequate since the DA subpopulations may not align with a clear cluster structure, and important differences between the two biological states can be missed.
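A cluster-free alternative in the spirit the abstract argues for can be sketched with a per-cell k-nearest-neighbor composition score; this generic score (not the authors' method) flags regions where one condition is over-represented without any prior clustering. The blob data and parameters are illustrative.

```python
import numpy as np

def da_score(X, condition, k=20):
    # Per-cell score: fraction of the cell's k nearest neighbors coming
    # from condition 1, minus the global fraction (0 = no abundance shift).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]
    return condition[nbrs].mean(axis=1) - condition.mean()

rng = np.random.default_rng(1)
# Blob A: balanced between conditions. Blob B: mostly condition 1 (a DA region).
A = rng.normal(size=(100, 2))
B = rng.normal(size=(50, 2)) + np.array([10.0, 0.0])
X = np.vstack([A, B])
condition = np.concatenate(
    [np.tile([0, 1], 50), np.repeat(1, 45), np.repeat(0, 5)]
).astype(float)
s = da_score(X, condition)
# cells in blob B stand out, even though no clustering step was run
```

Because the score is computed per cell, DA subpopulations that cut across cluster boundaries, or that never form a clean cluster at all, can still be detected.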


Following antigenic challenge, activated B cells rapidly expand and undergo somatic hypermutation, yielding groups of clonally related B cells with diversified immunoglobulin receptors. Inference of clonal relationships based on the receptor sequence is an essential step in many adaptive immune receptor repertoire sequencing studies. These relationships are typically identified by a multi-step process that involves (i) grouping sequences based on shared V and J gene assignments and junction lengths, and (ii) clustering these sequences using a junction-based distance.
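The two-step process above can be sketched as follows; the gene names, junctions, and distance threshold are illustrative, and production pipelines differ in distance models and threshold selection.

```python
from collections import defaultdict

def norm_hamming(a, b):
    # normalized Hamming distance between equal-length junctions
    return sum(x != y for x, y in zip(a, b)) / len(a)

def infer_clones(seqs, threshold=0.2):
    # Step (i): partition by V gene, J gene, and junction length.
    groups = defaultdict(list)
    for i, s in enumerate(seqs):
        groups[(s["v"], s["j"], len(s["junction"]))].append(i)
    # Step (ii): single-linkage clustering on junction distance per partition.
    clones = []
    for members in groups.values():
        pool = set(members)
        while pool:
            clone = {pool.pop()}
            grew = True
            while grew:
                grew = False
                for i in list(pool):
                    if any(norm_hamming(seqs[i]["junction"], seqs[j]["junction"])
                           <= threshold for j in clone):
                        clone.add(i)
                        pool.discard(i)
                        grew = True
            clones.append(sorted(clone))
    return clones

seqs = [
    {"v": "IGHV1", "j": "IGHJ4", "junction": "CARDYW"},
    {"v": "IGHV1", "j": "IGHJ4", "junction": "CARDFW"},  # one mismatch: same clone as 0
    {"v": "IGHV3", "j": "IGHJ4", "junction": "CARDYW"},  # different V gene: separate
    {"v": "IGHV1", "j": "IGHJ4", "junction": "CTTTTW"},  # too distant: separate
]
clones = infer_clones(seqs)
```

Grouping by V/J assignment and junction length first keeps the pairwise clustering step tractable on repertoires with millions of sequences.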


We propose LOCA (local conformal autoencoder), a deep learning-based method for obtaining standardized data coordinates from scientific measurements. Data observations are modeled as samples from an unknown, nonlinear deformation of an underlying Riemannian manifold, which is parametrized by a few normalized latent variables.


Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering, and other data analysis tasks. Setting the kernel's scale parameter, also referred to as the kernel's bandwidth, strongly affects performance on the task at hand.
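For reference, one widely used rule of thumb (not the paper's method) is the median heuristic: set the RBF bandwidth to the median pairwise distance between samples.

```python
import numpy as np

def median_heuristic(X):
    # RBF bandwidth = median pairwise Euclidean distance between samples.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    return float(np.median(d[np.triu_indices_from(d, k=1)]))

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
sigma = median_heuristic(X)   # pairwise distances are 5, 10, 5 -> median 5
```

With this `sigma`, the kernel `exp(-||x - y||^2 / (2 * sigma^2))` gives moderate affinity to "typical" pairs; data with multiple scales or heavy nuisance structure usually calls for more careful, task-aware tuning.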
