Spatially mapping the transcriptome and proteome in the same tissue section can significantly advance our understanding of heterogeneous cellular processes and connect cell type to function. Here, we present Deterministic Barcoding in Tissue sequencing plus (DBiTplus), an integrative multi-modality spatial omics approach that combines sequencing-based spatial transcriptomics and image-based spatial protein profiling on the same tissue section to enable both single-cell resolution cell typing and genome-scale interrogation of biological pathways. DBiTplus begins with reverse transcription for cDNA synthesis, followed by microfluidic delivery of DNA oligos for spatial barcoding and retrieval of barcoded cDNA using RNaseH, an enzyme that selectively degrades RNA in an RNA-DNA hybrid, which preserves the intact tissue section for high-plex protein imaging with CODEX.
Data integration to align cells across batches has become a cornerstone of single-cell data analysis, critically affecting downstream results. Currently, there are no guidelines for when the biological differences between samples are separable from batch effects. Here we show that current paradigms for single-cell data integration remove biologically meaningful variation and introduce distortion.
Cell population delineation and identification is an essential step in single-cell and spatial-omics studies. Spatial-omics technologies can simultaneously measure information from three complementary domains related to this task: expression levels of a panel of molecular biomarkers at single-cell resolution, relative positions of cells, and images of tissue sections, but existing computational methods for performing this task on single-cell spatial-omics datasets often relinquish information from one or more domains. The additional reliance on the availability of "atlas" training or reference datasets limits cell type discovery to well-defined but limited cell population labels, thus posing major challenges for using these methods in practice.
Although single-cell and spatial sequencing methods enable simultaneous measurement of more than one biological modality, no technology can capture all modalities within the same cell. For current data integration methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori 'linked' features. We describe matching X-modality via fuzzy smoothed embedding (MaxFuse), a cross-modal data integration method that, through iterative coembedding, data smoothing and cell matching, uses all information in each modality to obtain high-quality integration even when features are weakly linked.
The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health. The intestine has a length of over nine metres, along which there are differences in structure and function. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function.
Data integration to align cells across batches has become a cornerstone of single cell data analysis, critically affecting downstream results. Yet, how much biological signal is erased during integration? Currently, there are no guidelines for when the biological differences between samples are separable from batch effects, and thus, data integration usually involves a lot of guesswork: Cells across batches should be aligned to be "appropriately" mixed, while preserving "main cell type clusters". We show evidence that current paradigms for single cell data integration are unnecessarily aggressive, removing biologically meaningful variation.
Single-cell sequencing methods have enabled the profiling of multiple types of molecular readouts at cellular resolution, and recent developments in spatial barcoding, in situ hybridization, and in situ sequencing allow such molecular readouts to retain their spatial context. Since no technology can provide complete characterization across all layers of biological modalities within the same cell, there is a pervasive need for computational cross-modal integration (also called diagonal integration) of single-cell and spatial omics data. For current methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori "linked" features.
The ability to align individual cellular information from multiple experimental sources is fundamental for a systems-level understanding of biological processes. However, currently available tools are mainly designed for single-cell transcriptomics matching and integration, and generally rely on a large number of shared features across datasets for cell matching. This approach underperforms when applied to single-cell proteomic datasets due to the limited number of parameters simultaneously accessed and lack of shared markers across these experiments.
Brain networks are increasingly characterized at different scales, including summary statistics, community connectivity, and individual edges. While research relating brain networks to behavioral measurements has yielded many insights into brain-phenotype relationships, common analytical approaches only consider network information at a single scale. Here, we designed, implemented, and deployed Multi-Scale Network Regression (MSNR), a penalized multivariate approach for modeling brain networks that explicitly respects both edge- and community-level information by assuming a low rank and sparse structure, both encouraging less complex and more interpretable modeling.
Neurobiological abnormalities associated with psychiatric disorders do not map well to existing diagnostic categories. High co-morbidity suggests dimensional circuit-level abnormalities that cross diagnoses. Here we seek to identify brain-based dimensions of psychopathology using sparse canonical correlation analysis in a sample of 663 youths.
J R Stat Soc Series B Stat Methodol
September 2017
Continuous treatments (e.g., doses) arise often in practice, but many available causal effect estimators are limited by either requiring parametric models for the effect curve, or by not allowing doubly robust covariate adjustment.
Probab Theory Relat Fields
April 2015
This paper considers a sparse spiked covariance matrix model in the high-dimensional setting and studies the minimax estimation of the covariance matrix and the principal subspace as well as the minimax rank detection. The optimal rate of convergence for estimating the spiked covariance matrix under the spectral norm is established, which requires significantly different techniques from those for estimating other structured covariance matrices such as bandable or sparse covariance matrices. We also establish the minimax rate under the spectral norm for estimating the principal subspace, the primary object of interest in principal component analysis.
A new formulation for the construction of adaptive confidence bands in non-parametric function estimation problems is proposed. Confidence bands are constructed which have size that adapts to the smoothness of the function while guaranteeing that both the relative excess mass of the function lying outside the band and the measure of the set of points where the function lies outside the band are small. It is shown that the bands adapt over a maximum range of Lipschitz classes.
We study the rate of convergence for the largest eigenvalue distributions in the Gaussian unitary and orthogonal ensembles to their Tracy-Widom limits. We show that one can achieve an () rate with particular choices of the centering and scaling constants. The arguments here also shed light on more complicated cases of Laguerre and Jacobi ensembles, in both unitary and orthogonal versions.
Chain graphs present a broad class of graphical models for description of conditional independence structures, including both Markov networks and Bayesian networks as special cases. In this paper, we propose a computationally feasible method for the structural learning of chain graphs based on the idea of decomposing the learning problem into a set of smaller scale problems on its decomposed subgraphs. The decomposition requires conditional independencies but does not require the separators to be complete subgraphs.
Clear cell renal cell carcinoma (ccRCC) is the most common malignancy of the adult kidney and displays heterogeneity in clinical outcomes. Through comprehensive gene expression profiling, we have previously identified a set of transcripts that predict survival following nephrectomy independent of tumor stage, grade, and performance status. These transcripts, designated as the SPC (supervised principal components) gene set, show no apparent biological or genetic features that provide insight into renal carcinogenesis or tumor progression.