IEEE/ACM Trans Comput Biol Bioinform
October 2024
Machine learning algorithms have been extensively used for accurate classification of cancer subtypes driven by gene expression-based biomarkers. However, biomarker models combining multiple gene expression signatures are often not reproducible in external validation datasets, and their feature set size is often not optimized, jeopardizing their translatability into cost-effective clinical tools. We investigated how to solve the multi-objective problem of finding the best trade-offs between classification performance and feature set size by applying seven algorithms for machine learning-driven feature subset selection, and analysed how they perform in a benchmark of eight large-scale cancer transcriptome datasets, covering both training and external validation sets.
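To illustrate what "best trade-offs" means in such a two-objective setting, the following is a minimal sketch of Pareto-front filtering over candidate feature subsets. It is not one of the seven algorithms from the study; the candidate names, accuracy values, and subset sizes are made up for illustration, and `dominates` and `pareto_front` are hypothetical helper names.

```python
from typing import List, Tuple

# Each candidate is (name, classification_performance, n_features).
# All values below are invented for illustration only.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    (higher performance, fewer features) and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates,
    sorted by increasing feature set size."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(front, key=lambda c: c[2])


candidates = [
    ("A", 0.90, 50),
    ("B", 0.88, 10),
    ("C", 0.85, 12),   # dominated by B: lower performance, more features
    ("D", 0.92, 200),
    ("E", 0.80, 3),
]

front = pareto_front(candidates)
for name, perf, size in front:
    print(f"{name}: performance={perf:.2f}, features={size}")
```

Every subset left on the front is a defensible choice: moving along it trades a smaller (cheaper) signature against classification performance, and none of the remaining options is worse than another on both counts.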
Motivation: Cancer is a very heterogeneous disease that can be difficult to treat without addressing the specific mechanisms driving tumour progression in a given patient. High-throughput screening and sequencing data from cancer cell-lines have driven many advances in drug development; however, important aspects crucial to precision medicine are often overlooked, namely the inherent differences between tumours in patients and the cell-lines used to model them in vitro. Recent developments in transfer learning methods for patient and cell-line data have shown progress in translating results from cell-lines to individual patients in silico.
Recent research on multi-view clustering algorithms for complex disease subtyping often overlooks aspects like clustering stability and critical assessment of prognostic relevance. Furthermore, current frameworks do not allow for a comparison between data-driven and pathway-driven clustering, highlighting a significant gap in the methodology. We present the COPS R-package, tailored for robust evaluation of single and multi-omics clustering results.
Hyaluronan (HA) accumulation in clear cell renal cell carcinoma (ccRCC) is associated with poor prognosis; however, its biology and role in tumorigenesis are unknown. RNA sequencing of 48 HA-positive and 48 HA-negative formalin-fixed paraffin-embedded (FFPE) samples was performed to identify differentially expressed genes (DEGs). The DEGs were subjected to pathway and gene enrichment analyses.