Many human diseases result from a complex interplay of behavioral, clinical, and molecular factors. Integrating low-dimensional behavioral and clinical features with high-dimensional molecular profiles can significantly improve disease outcome prediction and diagnosis. However, while some biomarkers are crucial, many lack informative value. To enhance prediction accuracy and understand disease mechanisms, it is essential to integrate relevant features and identify key biomarkers, separating meaningful data from noise and modeling complex associations. To address these challenges, we introduce the High-dimensional Feature Importance Test (HdFIT) framework for machine learning models. HdFIT includes a feature screening step for dimension reduction and leverages machine learning to model complex associations between biomarkers and disease outcomes. It robustly evaluates each feature's impact. Extensive Monte Carlo experiments and a real microbiome study demonstrate HdFIT's efficacy, especially when integrated with advanced models like deep neural networks. Our framework shows significant improvements in identifying crucial features and enhancing prediction accuracy, even in high-dimensional settings.
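The screen-then-test idea described in the abstract can be illustrated with a small, generic sketch. This is not the authors' HdFIT procedure: the marginal-correlation screen, the k-nearest-neighbour learner, the permutation-based z-score, the simulated data, and all function names (`screen`, `knn_predict`, `importance_z`, `run_demo`) are assumptions chosen only to show the general shape of such a pipeline using the Python standard library.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def screen(X, y, keep):
    """Screening step (assumed): keep the `keep` columns most correlated with y."""
    p = len(X[0])
    score = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    return sorted(range(p), key=lambda j: score[j], reverse=True)[:keep]


def knn_predict(X_train, y_train, X_query, k=10):
    """Stand-in nonlinear learner: k-nearest-neighbour regression."""
    preds = []
    for q in X_query:
        dist = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, q)), target)
            for row, target in zip(X_train, y_train)
        )
        preds.append(sum(t for _, t in dist[:k]) / k)
    return preds


def importance_z(X_train, y_train, X_test, y_test, col, rng):
    """z-score for the rise in held-out squared error after permuting `col`."""
    base = knn_predict(X_train, y_train, X_test)
    shuffled = [row[col] for row in X_test]
    rng.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X_test, shuffled)]
    perm = knn_predict(X_train, y_train, X_perm)
    # Per-sample increase in squared error caused by breaking the feature.
    diff = [(p - t) ** 2 - (b - t) ** 2 for p, b, t in zip(perm, base, y_test)]
    mean = sum(diff) / len(diff)
    var = sum((d - mean) ** 2 for d in diff) / (len(diff) - 1)
    return mean / math.sqrt(var / len(diff)) if var > 0 else 0.0


def run_demo(seed=0, n=400, p=30, keep=5, split=250):
    """Simulated data: only columns 0 and 1 carry signal (an assumption)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * row[0] + 2 * row[1] + rng.gauss(0, 0.5) for row in X]
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]
    # Screen on the training part only, then test on held-out samples.
    selected = screen(X_tr, y_tr, keep)
    S_tr = [[row[j] for j in selected] for row in X_tr]
    S_te = [[row[j] for j in selected] for row in X_te]
    z = {j: importance_z(S_tr, y_tr, S_te, y_te, c, rng)
         for c, j in enumerate(selected)}
    return selected, z


if __name__ == "__main__":
    selected, z = run_demo()
    for j in selected:
        print(f"feature {j}: z = {z[j]:.2f}")
```

Screening uses only the training split so that the held-out importance scores are not biased by the selection step. A large positive z-score suggests that the model's held-out error depends on that feature, while scores near zero are what uninformative features would be expected to produce.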
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735761 | PMC |
| http://dx.doi.org/10.1093/bib/bbae709 | DOI Listing |
BMC Med Inform Decis Mak
January 2025
Great Ormond Street Institute of Child Health, University College London, London, UK.
Introduction: Unsupervised feature learning methods inspired by natural language processing (NLP) models are capable of constructing patient-specific features from longitudinal Electronic Health Records (EHR).
Design: We applied document embedding algorithms to real-world paediatric intensive care (PICU) EHR data to extract patient-specific features from 1853 patients' PICU journeys using 647 unique lab tests and medication events. We evaluated the clinical utility of the patient features via a K-means clustering analysis.
BMC Med
January 2025
Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, Chengdu City, Sichuan, 610041, China.
Background: This study aimed to construct a radiomics-based imaging biomarker for the non-invasive identification of transformed follicular lymphoma (t-FL) using PET/CT images.
Methods: A total of 784 follicular lymphoma (FL), diffuse large B-cell lymphoma, and t-FL patients from 5 independent medical centers were included. The unsupervised EMFusion method was applied to fuse PET and CT images.
BMC Med Res Methodol
January 2025
Leeds Institute of Clinical Trials Research, University of Leeds, Clarendon Way, Leeds, LS2 9NL, UK.
Background: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review aims to summarise methods currently utilised for prediction of cancer from longitudinal data and provides recommendations on how such models should be developed.
Sci Rep
January 2025
Shanghai Key Laboratory of Magnetic Resonance, School of Physics and Electronic Science, East China Normal University, Shanghai, China.
Predicting isocitrate dehydrogenase (IDH) mutation status and epilepsy occurrence is important for glioma patients. Although machine learning models have been constructed for both tasks, the correlation between them has not been explored. Our study aimed to exploit this correlation to improve the performance of both the IDH mutation status identification and epilepsy diagnosis models in patients with grade II-IV glioma.
Sci Rep
January 2025
Rashpetco Company, Cairo, Egypt.
This study presents a comprehensive workflow to detect low seismic amplitude gas fields in hydrocarbon exploration projects, focusing on the West Delta Deep Marine (WDDM) concession, offshore Egypt. The workflow integrates seismic spectral decomposition and machine learning algorithms to identify subtle anomalies, including low seismic amplitude gas sand and background amplitude water sand. Spectral decomposition helps delineate the fairway boundaries and structural features, while Amplitude Versus Offset (AVO) analysis is used to validate gas sand anomalies.