Background: Filter feature selection methods compute molecular signatures by selecting subsets of genes according to their ranking under a valuation function. The motivation for the choice of valuation function is almost always clearly stated, but the motivation for selecting genes according to their ranking is hardly ever made explicit.
Method: We addressed the computation of molecular signatures by searching for the optima of a bi-objective function whose solution space was the set of all possible molecular signatures, i.e., the set of subsets of genes.
This work proposes a sequential methodology for selecting variables in classification problems in which the number of predictors is much larger than the sample size. The methodology includes a Monte Carlo permutation procedure that conditionally tests the null hypothesis of no association between the outcomes and the available predictors. To improve the computational aspects, we propose a new parametric distribution, the Truncated and Zero Inflated Gumbel Distribution.
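As an illustration of the general idea behind a Monte Carlo permutation test, here is a minimal sketch. It is not the conditional procedure of the abstract and does not use the Truncated and Zero Inflated Gumbel Distribution. It assumes a single predictor, binary labels, and the absolute difference of class means as the test statistic; the function names `mean_diff` and `permutation_p_value` are hypothetical.

```python
import random


def mean_diff(values, labels):
    """Absolute difference between the class means of one predictor."""
    g1 = [v for v, y in zip(values, labels) if y == 1]
    g0 = [v for v, y in zip(values, labels) if y == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Estimate a p-value for 'no association' by shuffling the labels.

    Assumption: both classes are present, so each shuffled labelling
    still contains at least one sample per class.
    """
    rng = random.Random(seed)
    observed = mean_diff(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed statistic.
        if mean_diff(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    values = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    print(permutation_p_value(values, labels))
```

Shuffling the labels rather than the values keeps the class sizes fixed, which is the usual choice when the null hypothesis is that labels are exchangeable.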
Background: Molecular heterogeneity of tumors suggests the presence of multiple different subclones that may limit response to targeted therapies and contribute to acquisition of drug resistance, but its quantification has remained challenging.
Results: We performed simulations to evaluate statistical measures that best capture the molecular diversity within a group of tumors for either continuous (gene expression) or discrete (mutations, copy number alterations) molecular data. Dispersion-based metrics in the principal component space best captured the underlying heterogeneity.
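One possible dispersion-based metric in principal component space is sketched below. The abstract does not specify the exact metric, so this is an assumption: principal components are fitted on the whole cohort, and the heterogeneity of a group is taken to be the mean squared distance of its tumors to the group centroid in the top components. The function name `pc_dispersion` and the parameter `n_components` are hypothetical.

```python
import numpy as np


def pc_dispersion(X, group, n_components=2):
    """Mean squared distance to the group centroid in the top PCs.

    X     : (n_samples, n_features) array for the whole cohort
    group : boolean mask selecting the tumors of interest
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    g = scores[np.asarray(group, dtype=bool)]
    return float(np.mean(np.sum((g - g.mean(axis=0)) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tight = rng.normal(0.0, 0.1, size=(20, 5))   # homogeneous group
    spread = rng.normal(0.0, 2.0, size=(20, 5))  # heterogeneous group
    X = np.vstack([tight, spread])
    mask = np.array([True] * 20 + [False] * 20)
    print(pc_dispersion(X, mask), pc_dispersion(X, ~mask))
```

A larger value indicates that the tumors in the group are more spread out along the main axes of variation of the cohort.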
We examined whether baseline Ki67 expression in estrogen receptor-positive (ER+) primary breast cancer correlates with clinical benefit and time to progression on first-line endocrine therapy and with survival in metastatic disease. Ki67 values and outcome information were retrieved from a prospectively maintained clinical database and validated against the medical records; 241 patients with metastatic breast cancer were included, all of whom had ER+ primary cancer with a known Ki67 expression level and received first-line endocrine therapy for metastatic disease. Patients were assigned to low (<10 %), intermediate (10-25 %), or high (>25 %) Ki67 expression groups.
Annu Int Conf IEEE Eng Med Biol Soc
April 2011
In this paper we propose an application of local statistical models to the problem of identifying patients with pathologic complete response (PCR) to neoadjuvant chemotherapy. The idea of using local models is to split the input space (with data from PCR and NoPCR patients) and build a model for each partition. After constructing the models, we used Bayesian classifiers and logistic regression to classify patients into the two classes.
Although the benefit of adjuvant chemotherapy can be determined at the population level, determining the true chemosensitivity of a tumor at the individual level is impossible. Neoadjuvant chemotherapy in patients with localized breast cancer is of interest because it reveals the chemosensitivity of a tumor "in vivo". It is possible to use a single criterion to predict the effectiveness of targeted therapies.
The purpose was to compare a logistic regression model (LRM) and recursive partitioning (RP) for predicting pathologic complete response to preoperative chemotherapy in patients with breast cancer. The two models were built on the same training set of 496 patients and validated on the same validation set of 337 patients. Model performance was quantified with respect to discrimination (evaluated by the area under the receiver operating characteristic curve (AUC)) and calibration.
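The discrimination measure mentioned above can be illustrated with a minimal sketch of the AUC, computed as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder (ties count one half). This is a generic definition, not the authors' code; the function name `auc` and the toy scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores : predicted probabilities (or any ranking score)
    labels : 1 for the positive class, 0 for the negative class
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties contribute one half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four positive/negative pairs are ordered correctly.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; a rank-based computation gives the same value more efficiently on larger sets.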
Background: DNA microarray technology has emerged as a major tool for exploring cancer biology and solving clinical issues. Predicting a patient's response to chemotherapy is one such issue; successful prediction would make it possible to give patients the most appropriate chemotherapy regimen. Patient response can be classified as either a pathologic complete response (PCR) or residual disease (NoPCR), and these strongly correlate with patient outcome.
New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of types of molecules are brought together to perform a function. We review the evidence for the existence of hyperstructures responsible for the initiation of DNA replication, the sequestration of newly replicated origins of replication, cell division and for metabolism.
Two multilayer neural networks were designed to discriminate vigilance states (waking, paradoxical sleep, and non-REM sleep) in the rat using a single parieto-occipital EEG derivation. After filtering (bandwidth 3.18-25 Hz) and digitization at 512 Hz, the EEG signal was segmented into eight-second epochs.
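The segmentation step can be sketched as follows. At 512 Hz, an eight-second epoch holds 512 × 8 = 4096 samples. The function name `segment_epochs` is hypothetical, and dropping a trailing partial epoch is an assumption; the abstract does not say how leftover samples were handled.

```python
def segment_epochs(signal, sampling_rate_hz=512, epoch_seconds=8):
    """Split a digitized signal into consecutive fixed-length epochs.

    Assumption: samples that do not fill a complete epoch are dropped.
    """
    samples_per_epoch = sampling_rate_hz * epoch_seconds  # 4096 by default
    n_epochs = len(signal) // samples_per_epoch
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


if __name__ == "__main__":
    fake_signal = list(range(10000))  # placeholder samples, not real EEG
    epochs = segment_epochs(fake_signal)
    print(len(epochs), len(epochs[0]))
```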