With the increased use of gene expression profiling for personalized oncology, optimized RNA sequencing (RNA-seq) protocols and algorithms are necessary to provide comparable expression measurements between exome capture (EC)-based and poly-A RNA-seq. Here, we developed and optimized an EC-based protocol for processing formalin-fixed, paraffin-embedded samples and a machine-learning algorithm, Procrustes, to overcome batch effects across RNA-seq data obtained using different sample preparation protocols like EC-based or poly-A RNA-seq protocols. Applying Procrustes to samples processed using EC and poly-A RNA-seq protocols showed the expression of 61% of genes (N = 20,062) to correlate across both protocols (concordance correlation coefficient > 0.
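The agreement metric named above, the concordance correlation coefficient, can be illustrated with a minimal sketch. It assumes Lin's standard definition with population (biased) variances; the function name `concordance_correlation` and the per-gene usage are hypothetical and not taken from the Procrustes implementation.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two 1-D vectors.

    Illustrative helper (hypothetical name): x and y could be the expression
    of one gene measured across the same samples by two protocols.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")

    mean_x, mean_y = x.mean(), y.mean()
    # Population (ddof=0) variances and covariance, as in Lin's definition.
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()

    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        # Both vectors are constant and equal; the coefficient is undefined.
        return float("nan")
    return float(2.0 * cov_xy / denom)
```

Unlike Pearson correlation, this coefficient penalizes a systematic shift between the two measurements, which is why it suits cross-protocol comparisons: two vectors that differ by a constant offset are perfectly correlated but not perfectly concordant.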
Intratumor heterogeneity (ITH) represents a major challenge for anticancer therapies. An integrated, multidimensional, multiregional approach dissecting ITH of the clear cell renal cell carcinoma (ccRCC) tumor microenvironment (TME) is employed at the single-cell level with mass cytometry (CyTOF), multiplex immunofluorescence (MxIF), and single-nucleus RNA sequencing (snRNA-seq) and at the bulk level with whole-exome sequencing (WES), RNA-seq, and methylation profiling. Multiregional analyses reveal unexpected conservation of immune composition within each individual patient, with profound differences among patients, presenting patient-specific tumor immune microenvironment signatures despite underlying genetic heterogeneity from clonal evolution.
Cellular deconvolution algorithms virtually reconstruct tissue composition by analyzing the gene expression of complex tissues. We present the decision tree machine learning algorithm, Kassandra, trained on a broad collection of >9,400 tissue and blood sorted cell RNA profiles incorporated into millions of artificial transcriptomes to accurately reconstruct the tumor microenvironment (TME). Bioinformatics correction for technical and biological variability, aberrant cancer cell expression inclusion, and accurate quantification and normalization of transcript expression increased Kassandra stability and robustness.
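The idea of combining sorted-cell profiles into artificial transcriptomes can be sketched as follows. This is only an illustration under stated assumptions: fractions are drawn from a flat Dirichlet distribution and profiles are mixed linearly on a linear expression scale. The function name `make_artificial_transcriptomes` is hypothetical, and the sketch omits the noise modeling and cancer-cell expression handling that the abstract mentions.

```python
import numpy as np


def make_artificial_transcriptomes(profiles, n_mixtures, seed=None):
    """Mix cell-type expression profiles with random known fractions.

    profiles   : array of shape (n_cell_types, n_genes), linear-scale
                 expression (assumption: one reference profile per type).
    n_mixtures : number of artificial transcriptomes to generate.
    Returns (mixtures, fractions) with shapes
    (n_mixtures, n_genes) and (n_mixtures, n_cell_types).
    """
    rng = np.random.default_rng(seed)
    profiles = np.asarray(profiles, dtype=float)
    n_types = profiles.shape[0]

    # Each row of `fractions` is non-negative and sums to 1.
    fractions = rng.dirichlet(np.ones(n_types), size=n_mixtures)

    # Linear mixing: a bulk sample is a fraction-weighted sum of profiles.
    mixtures = fractions @ profiles
    return mixtures, fractions
```

Because the true fractions are known for every generated mixture, pairs of `(mixtures, fractions)` can serve as labeled training data for a regression model that predicts cell-type proportions from bulk expression.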
Single-cell RNA sequencing is often applied in study designs that include multiple individuals, conditions or tissues. To identify recurrent cell subpopulations in such heterogeneous collections, we developed Conos, an approach that relies on multiple plausible inter-sample mappings to construct a global graph connecting all measured cells. The graph enables identification of recurrent cell clusters and propagation of information between datasets in multi-sample or atlas-scale collections.
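One plausible kind of inter-sample mapping is a mutual nearest-neighbor match between cells of two samples in a shared embedding. The sketch below is an assumption made for illustration, not the Conos implementation: the function name `mutual_nearest_neighbor_edges`, the Euclidean metric, and the parameter `k` are all hypothetical choices.

```python
import numpy as np


def mutual_nearest_neighbor_edges(a, b, k=1):
    """Return (i, j) pairs where cell i of sample `a` and cell j of sample `b`
    are each among the other's k nearest neighbors.

    a, b : arrays of shape (n_cells, n_dims) in a common embedding
           (assumption: both samples are already projected into one space).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

    k_ab = min(k, b.shape[0])
    k_ba = min(k, a.shape[0])
    nn_of_a = np.argsort(dist, axis=1)[:, :k_ab]    # nearest b cells per a cell
    nn_of_b = np.argsort(dist, axis=0)[:k_ba, :].T  # nearest a cells per b cell

    edges = []
    for i in range(a.shape[0]):
        for j in nn_of_a[i]:
            if i in nn_of_b[j]:
                edges.append((i, int(j)))
    return edges
```

Collecting such edges over every pair of samples, together with within-sample neighbor edges, would yield a single graph spanning all cells, on which community detection could then look for recurrent clusters.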
Genome rearrangements have played an important role in the evolution of from its progenitor . Traditional phylogenetic trees for based on sequence comparison have short internal branches and low bootstrap supports as only a small number of nucleotide substitutions have occurred. On the other hand, even a small number of genome rearrangements may resolve topological ambiguities in a phylogenetic tree.