K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e.
Single clustering methods have often been used to elucidate clusters in high dimensional medical data, even though reliance on a single algorithm is known to be problematic. In this paper, we present a methodology to determine a set of 'core classes' by using a range of techniques to reach consensus across several different clustering algorithms, and to ascertain the key characteristics of these classes. We apply the methodology to immunohistochemical data from breast cancer patients.
Time-to-event analysis is important in a wide range of applications from clinical prognosis to risk modeling for credit scoring and insurance. In risk modeling, it is sometimes required to make a simultaneous assessment of the hazard arising from two or more mutually exclusive factors. This paper applies Bayesian regularization, with the standard approximation of the evidence, to an existing neural network model for competing risks (PLANNCR) in order to implement automatic relevance determination (PLANNCR-ARD).
Background: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate.
This paper presents an analysis of censored survival data for breast cancer-specific mortality and disease-free survival. There are three stages to the process, namely time-to-event modelling, risk stratification by predicted outcome and model interpretation using rule extraction. Model selection was carried out using the benchmark linear model, Cox regression, but risk staging was derived both with Cox regression and with Partial Logistic Regression Artificial Neural Networks regularised with Automatic Relevance Determination (PLANN-ARD).
Objective: An integrated decision support framework is proposed for clinical oncologists making prognostic assessments of patients with operable breast cancer. The framework may be delivered over a web interface. It comprises a triangulation of prognostic modelling, visualisation of historical patient data and an explanatory facility to interpret risk group assignments using empirically derived Boolean rules expressed directly in clinical terms.
Annu Int Conf IEEE Eng Med Biol Soc
March 2008
A three-stage development process for the production of a hierarchical rule-based prognosis tool is described. The application for this tool is specific to breast cancer patients who have a positive expression of the HER 2 gene. The first stage is the development of a Bayesian classification neural network to classify for cancer-specific mortality.
There is much interest in rule extraction from neural networks and a plethora of different methods have been proposed for this purpose. We discuss the merits of pedagogical and decompositional approaches to rule extraction from trained neural networks, and show that some currently used methods for binary data comply with a theoretical formalism for extraction of Boolean rules from continuously valued logic. This formalism is extended into a generic methodology for rule extraction from smooth decision surfaces fitted to discrete or quantized continuous variables independently of the analytical structure of the underlying model, and in a manner that is efficient even for high input dimensions.