Modeling of genomic profiles from the Cancer Genome Atlas (TCGA) by using recently developed mathematical frameworks has associated a genome-wide pattern of DNA copy-number alterations with a shorter, roughly one-year, median survival time in glioblastoma (GBM) patients. Here, to experimentally test this relationship, we whole-genome sequenced DNA from tumor samples of patients. We show that the patients represent the U.
More than a quarter of lung, uterine, and ovarian adenocarcinoma (LUAD, USEC, and OV) tumors are resistant to platinum drugs. Only recently and only in OV, patterns of copy-number alterations that predict survival in response to platinum were discovered, and only by using the tensor GSVD to compare Agilent microarray platform-matched profiles of patient-matched normal and primary tumor DNA. Here, we use the GSVD to compare whole-genome sequencing (WGS) and Affymetrix microarray profiles of patient-matched normal and primary LUAD, USEC, and OV tumor DNA.
DNA alterations have been observed in astrocytoma for decades. A copy-number genotype predictive of a survival phenotype was only discovered by using the generalized singular value decomposition (GSVD) formulated as a comparative spectral decomposition. Here, we use the GSVD to compare whole-genome sequencing (WGS) profiles of patient-matched astrocytoma and normal DNA.
We use the generalized singular value decomposition (GSVD), formulated as a comparative spectral decomposition, to model patient-matched grades III and II, i.e., lower-grade astrocytoma (LGA) brain tumor and normal DNA copy-number profiles.
The number of large-scale high-dimensional datasets recording different aspects of a single disease is growing, accompanied by a need for frameworks that can create one coherent model from multiple tensors of matched columns, e.g., patients and platforms, but independent rows, e.
To search for evolutionary forces that might act upon transcript length, we use the singular value decomposition (SVD) to identify the length distribution functions of sets and subsets of human and yeast transcripts from profiles of mRNA abundance levels across gel electrophoresis migration distances that were previously measured by DNA microarrays. We show that the SVD identifies the transcript length distribution functions as "asymmetric generalized coherent states" from the DNA microarray data and with no a priori assumptions. Comparing subsets of human and yeast transcripts of the same gene ontology annotations, we find that in both disparate eukaryotes, transcripts involved in protein synthesis or mitochondrial metabolism are significantly shorter than typical, and in particular, significantly shorter than those involved in glucose metabolism.
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, and possibly coordinated, with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA).
The number of high-dimensional datasets recording multiple aspects of a single phenomenon is increasing in many areas of science, accompanied by a need for mathematical frameworks that can compare multiple large-scale matrices with different row dimensions. The only such framework to date, the generalized singular value decomposition (GSVD), is limited to two matrices. We mathematically define a higher-order GSVD (HO GSVD) for $N \geq 2$ matrices $D_i \in \mathbb{R}^{m_i \times n}$, each with full column rank.
Evolutionary relationships among organisms are commonly described by using a hierarchy derived from comparisons of ribosomal RNA (rRNA) sequences. We propose that even on the level of a single rRNA molecule, an organism's evolution is composed of multiple pathways due to concurrent forces that act independently upon different rRNA degrees of freedom. Relationships among organisms are then compositions of coexisting pathway-dependent similarities and dissimilarities, which cannot be described by a single hierarchy.
This report provides a global view of how gene expression is affected by DNA replication. We analyzed synchronized cultures of Saccharomyces cerevisiae under conditions that prevent DNA replication initiation without delaying cell cycle progression. We use a higher-order singular value decomposition to integrate the global mRNA expression measured in the multiple time courses, detect and remove experimental artifacts, and identify significant combinations of patterns of expression variation across the genes, time points, and conditions.
Proc Natl Acad Sci U S A
November 2007
We describe the use of a higher-order singular value decomposition (HOSVD) in transforming a data tensor of genes × "x-settings" (that is, different settings of the experimental variable x) × "y-settings," which tabulates DNA microarray data from different studies, to a "core tensor" of "eigenarrays" × "x-eigengenes" × "y-eigengenes." Reformulating this multilinear HOSVD such that it decomposes the data tensor into a linear superposition of all outer products of an eigenarray, an x- and a y-eigengene, that is, rank-1 "subtensors," we define the significance of each subtensor in terms of the fraction of the overall information in the data tensor that it captures. We illustrate this HOSVD with an integration of genome-scale mRNA expression data from three yeast cell cycle time courses, two of which are under exposure to either hydrogen peroxide or menadione.
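As a rough illustration of the decomposition this abstract describes, the following sketch computes an HOSVD of a small random third-order tensor with NumPy. It is not the authors' implementation. The tensor shape, the helper names `unfold` and `hosvd`, and the choice to measure each subtensor's significance as its squared core entry divided by the sum of all squared core entries are assumptions made for the example.

```python
import numpy as np

def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode (3-way input)."""
    # Left singular vectors of each unfolding give the basis for that mode.
    factors = [
        np.linalg.svd(unfold(tensor, mode), full_matrices=False)[0]
        for mode in range(tensor.ndim)
    ]
    # Project the data onto the three bases to obtain the core tensor.
    core = np.einsum("ijk,ia,jb,kc->abc", tensor, *factors)
    return core, factors

# Hypothetical toy data: 8 genes x 3 x-settings x 2 y-settings.
rng = np.random.default_rng(0)
tensor = rng.normal(size=(8, 3, 2))

core, factors = hosvd(tensor)

# Superposition of all rank-1 subtensors rebuilds the data tensor.
reconstruction = np.einsum("abc,ia,jb,kc->ijk", core, *factors)

# Assumed significance measure: share of the total squared core weight.
fractions = core**2 / np.sum(core**2)
```

Each entry of `fractions` corresponds to one combination of a mode-1, mode-2, and mode-3 basis vector, so sorting it ranks the rank-1 subtensors by how much of the data they account for under this assumed measure.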
DNA microarrays make it possible, for the first time, to record the complete genomic signals that guide the progression of cellular processes. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment, and drug development. This chapter reviews the first data-driven models that were created from these genome-scale data, through adaptations and generalizations of mathematical frameworks from matrix algebra that have proven successful in describing the physical world, in such diverse areas as mechanics and perception: the singular value decomposition model, the generalized singular value decomposition comparative model, and the pseudoinverse projection integrative model.
Proc Natl Acad Sci U S A
October 2006
We describe the singular value decomposition (SVD) of yeast genome-scale mRNA lengths distribution data measured by DNA microarrays. SVD uncovers in the mRNA abundance levels data matrix of genes x arrays, i.e.
Proc Natl Acad Sci U S A
December 2005
We describe the use of the matrix eigenvalue decomposition (EVD) and pseudoinverse projection and a tensor higher-order EVD (HOEVD) in reconstructing the pathways that compose a cellular system from genome-scale nondirectional networks of correlations among the genes of the system. The EVD formulates a genes x genes network as a linear superposition of genes x genes decorrelated and decoupled rank-1 subnetworks, which can be associated with functionally independent pathways. The integrative pseudoinverse projection of a network computed from a "data" signal onto a designated "basis" signal approximates the network as a linear superposition of only the subnetworks that are common to both signals and simulates observation of only the pathways that are manifest in both experiments.
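A minimal sketch of the EVD step described above, assuming a symmetric genes × genes network. Building the network as the product of a toy signal matrix with its transpose, and the function name `rank1_subnetworks`, are assumptions for illustration only. Each eigenvector contributes one rank-1 subnetwork, and the subnetworks sum back to the original network.

```python
import numpy as np

def rank1_subnetworks(network):
    """Split a symmetric genes x genes network into rank-1 subnetworks via EVD."""
    eigvals, eigvecs = np.linalg.eigh(network)
    # Order subnetworks from largest to smallest eigenvalue magnitude.
    order = np.argsort(np.abs(eigvals))[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each subnetwork is an eigenvalue times the outer product of its eigenvector.
    subnetworks = [lam * np.outer(vec, vec) for lam, vec in zip(eigvals, eigvecs.T)]
    return eigvals, subnetworks

# Hypothetical toy signal: 6 genes measured across 4 arrays.
rng = np.random.default_rng(0)
signal = rng.normal(size=(6, 4))
network = signal @ signal.T  # assumed way of forming the correlation network

eigvals, subnetworks = rank1_subnetworks(network)
rebuilt = sum(subnetworks)
```

Because the eigenvectors of a symmetric matrix are mutually orthogonal, the subnetworks do not overlap in the sense of the decomposition, which is what makes it reasonable to read them as decoupled components.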
We describe an integrative data-driven mathematical framework that formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. By using pseudoinverse projection, the molecular biological profiles of the data samples are least-squares-approximated as superpositions of the basis profiles. Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis.
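The least-squares approximation described above can be sketched with NumPy's Moore-Penrose pseudoinverse. This is an illustrative example under stated assumptions, not the published code: the matrix shapes, the random toy data, and the function name `pseudoinverse_project` are all hypothetical.

```python
import numpy as np

def pseudoinverse_project(data, basis):
    """Least-squares approximate each data column as a superposition of basis columns."""
    # Coefficients have shape (basis profiles, data samples).
    coeffs = np.linalg.pinv(basis) @ data
    # Reconstruction keeps only the part of the data that lies in the basis span.
    reconstruction = basis @ coeffs
    return coeffs, reconstruction

# Hypothetical toy data: 50 genes, 3 basis profiles, 4 data samples.
rng = np.random.default_rng(0)
basis = rng.normal(size=(50, 3))
true_coeffs = rng.normal(size=(3, 4))
data = basis @ true_coeffs  # samples built to lie exactly in the basis span

coeffs, reconstruction = pseudoinverse_project(data, basis)
```

In this toy case the data lie exactly in the span of the basis, so the projection recovers the coefficients used to build them. With real measurements, the difference between `data` and `reconstruction` would hold whatever the basis profiles cannot express.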
We describe a comparative mathematical framework for two genome-scale expression data sets. This framework formulates expression as a superposition of the effects of regulatory programs, biological processes, and experimental artifacts common to both data sets, as well as those that are exclusive to one data set or the other, by using generalized singular value decomposition. This framework enables comparative reconstruction and classification of the genes and arrays of both data sets.
Proc Natl Acad Sci U S A
February 2003
Analysis of the patterns of gene expression in follicular lymphomas from 24 patients suggested that two groups of tumors might be distinguished. All patients, whose biopsies were obtained before any treatment, were treated with rituximab, a monoclonal antibody directed against the B cell antigen, CD20. Gene expression patterns in the tumors that subsequently failed to respond to rituximab appeared more similar to those of normal lymphoid tissues than to gene expression patterns of tumors from rituximab responders.
Background: Soft-tissue tumours are derived from mesenchymal cells such as fibroblasts, muscle cells, or adipocytes, but for many such tumours the histogenesis is controversial. We aimed to start molecular characterisation of these rare neoplasms and to do a genome-wide search for new diagnostic markers.
Methods: We analysed gene-expression patterns of 41 soft-tissue tumours with spotted cDNA microarrays.