IEEE/ACM Trans Comput Biol Bioinform
October 2024
Machine learning algorithms have been extensively used for accurate classification of cancer subtypes driven by gene expression-based biomarkers. However, biomarker models combining multiple gene expression signatures are often not reproducible in external validation datasets and their feature set size is often not optimized, jeopardizing their translatability into cost-effective clinical tools. We investigated how to solve the multi-objective problem of finding the best trade-offs between classification performance and set size by applying seven algorithms for machine learning-driven feature subset selection, and analysed how they perform in a benchmark with eight large-scale transcriptome datasets of cancer, covering both training and external validation sets.
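The trade-off described above can be illustrated with a small Pareto-front filter. This is only a minimal sketch, not the method used in the article: the candidate labels, accuracy values and set sizes below are made-up illustrative numbers, and `pareto_front` is a hypothetical helper name. A candidate is kept when no other candidate is at least as accurate with at most as many genes and strictly better on one of the two.

```python
from typing import List, Tuple

# (label, classification performance, number of selected genes)
Candidate = Tuple[str, float, int]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, ordered by increasing set size."""
    front = []
    for name, acc, size in candidates:
        # Dominated: another candidate is no worse on both objectives
        # and strictly better on at least one of them.
        dominated = any(
            (a >= acc and s <= size) and (a > acc or s < size)
            for _, a, s in candidates
        )
        if not dominated:
            front.append((name, acc, size))
    return sorted(front, key=lambda c: c[2])


# Illustrative, invented values (not results from the article).
candidates = [
    ("A", 0.90, 50),
    ("B", 0.88, 10),
    ("C", 0.85, 20),   # dominated by B: lower performance, more genes
    ("D", 0.92, 200),
    ("E", 0.80, 5),
]
front = pareto_front(candidates)
```

Each candidate remaining in `front` represents a different compromise between performance and panel size; choosing among them is a separate, application-specific decision.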
Motivation: Cancer is a very heterogeneous disease that can be difficult to treat without addressing the specific mechanisms driving tumour progression in a given patient. High-throughput screening and sequencing data from cancer cell-lines have driven many advances in drug development. However, important aspects of precision medicine are often overlooked, namely the inherent differences between tumours in patients and the cell-lines used to model them in vitro. Recent developments in transfer learning methods for patient and cell-line data have shown progress in translating results from cell-lines to individual patients in silico.
Recent research on multi-view clustering algorithms for complex disease subtyping often overlooks aspects like clustering stability and critical assessment of prognostic relevance. Furthermore, current frameworks do not allow for a comparison between data-driven and pathway-driven clustering, highlighting a significant gap in the methodology. We present the COPS R-package, tailored for robust evaluation of single and multi-omics clustering results.
Hyaluronan (HA) accumulation in clear cell renal cell carcinoma (ccRCC) is associated with poor prognosis; however, its biology and role in tumorigenesis are unknown. RNA sequencing of 48 HA-positive and 48 HA-negative formalin-fixed paraffin-embedded (FFPE) samples was performed to identify differentially expressed genes (DEG). The DEGs were subjected to pathway and gene enrichment analyses.
To associate specimens identified by molecular characters with other biological knowledge, we need reference sequences annotated by Linnaean taxonomy. In this study, we (1) report the creation of a comprehensive reference library of DNA barcodes for the arthropods of an entire country (Finland), (2) publish this library, and (3) deliver a new identification tool for insects and spiders based on this resource. The reference library contains mtDNA COI barcodes for 11,275 (43%) of 26,437 arthropod species known from Finland, including 10,811 (45%) of 23,956 insect species.
Typical clustering analysis for large-scale genomics data combines two unsupervised learning techniques: dimensionality reduction and clustering (DR-CL) methods. It has been demonstrated that transforming gene expression to pathway-level information can improve the robustness and interpretability of disease grouping results. This approach, referred to as biological knowledge-driven clustering (BK-CL), is often neglected due to a lack of tools enabling systematic comparisons with more established DR-based methods.
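As a rough illustration of what transforming gene expression to pathway-level information can mean, the sketch below averages the expression of the genes in each pathway for a single sample. It is an assumption-laden toy, not the procedure from the article: the mean is just one simple aggregation choice, `pathway_scores` is a hypothetical helper, and the pathway names, gene sets and expression values are invented for the example.

```python
from statistics import mean
from typing import Dict, List


def pathway_scores(
    expression: Dict[str, float],
    pathways: Dict[str, List[str]],
) -> Dict[str, float]:
    """Summarise one sample's gene expression as per-pathway mean values.

    Genes missing from the sample are ignored; pathways with no measured
    genes are left out of the result.
    """
    scores = {}
    for pathway, genes in pathways.items():
        present = [expression[g] for g in genes if g in expression]
        if present:
            scores[pathway] = mean(present)
    return scores


# Invented expression values and gene sets, for illustration only.
sample = {"TP53": 2.0, "MDM2": 4.0, "EGFR": 1.0, "KRAS": 3.0}
pathways = {
    "p53_signalling": ["TP53", "MDM2"],
    "rtk_ras": ["EGFR", "KRAS", "BRAF"],   # BRAF is not measured in this sample
    "unmeasured": ["XYZ"],                 # no measured genes, so it is skipped
}
scores = pathway_scores(sample, pathways)
```

The resulting pathway-level vector, rather than the gene-level one, would then be passed to a clustering method of choice.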
Health Res Policy Syst
April 2020
Background: Evidence-informed decision-making and better use of scientific information in societal decisions have been areas of development for decades but are still topical. Decision support work can be viewed from the perspective of information collection, synthesis and flow between decision-makers, experts and stakeholders. Open policy practice is a coherent set of methods for such work.