Background: Prediction in high-dimensional settings is difficult due to the large number of variables relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting.
Results: Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities that are used to draw candidate variables with co-data moderated sampling probabilities. Co-data are defined here as any type of information that is available on the variables of the primary data but that does not use its response labels. Inspired by empirical Bayes, these moderated sampling probabilities are learned from the data at hand. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis with gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve the predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer with methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study.
Conclusion: The proposed method is able to utilize auxiliary co-data to improve the performance of a Random Forest.
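The core idea in the Results, drawing candidate variables with non-uniform probabilities instead of uniform ones, can be illustrated with a minimal sketch. This is not the CoRF estimation procedure: the mapping from external p-values to weights (`codata_sampling_probs`), the `floor` parameter, and the simulated p-values are all hypothetical choices made for illustration only. The sketch only shows how moderated probabilities change which variables are offered as split candidates.

```python
import numpy as np

def codata_sampling_probs(external_pvalues, floor=1e-3):
    """Hypothetical mapping from co-data (external p-values) to sampling
    probabilities. Smaller p-values get larger weights; `floor` keeps every
    variable selectable. This is an assumption, not the CoRF estimator."""
    p = np.asarray(external_pvalues, dtype=float)
    weights = -np.log10(np.clip(p, 1e-300, 1.0)) + floor
    return weights / weights.sum()

def draw_candidates(probs, mtry, rng):
    """Draw `mtry` distinct candidate variables for one split."""
    return rng.choice(len(probs), size=mtry, replace=False, p=probs)

rng = np.random.default_rng(0)
n_vars = 1000
n_promising = 20

# Simulated co-data: the first 20 variables have small external p-values.
external_pvalues = rng.uniform(size=n_vars)
external_pvalues[:n_promising] = 1e-6

uniform = np.full(n_vars, 1.0 / n_vars)              # standard Random Forest
moderated = codata_sampling_probs(external_pvalues)  # co-data moderated

mtry = int(np.sqrt(n_vars))
n_splits = 2000

def count_promising(probs):
    """Count how often promising variables appear among split candidates."""
    return sum(
        int(np.sum(draw_candidates(probs, mtry, rng) < n_promising))
        for _ in range(n_splits)
    )

hits_uniform = count_promising(uniform)
hits_moderated = count_promising(moderated)
print("promising candidates drawn (uniform):  ", hits_uniform)
print("promising candidates drawn (moderated):", hits_moderated)
```

With moderated probabilities, variables supported by the co-data are offered as split candidates more often, which is the mechanism through which co-data can influence the forest. How useful this is depends on how informative the co-data are for the response.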
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5745983 | PMC |
| http://dx.doi.org/10.1186/s12859-017-1993-1 | DOI Listing |
BMC Health Serv Res
January 2025
Amref Health Africa in Ethiopia, EPI Technical Assistant at West Gondar Zonal Health Department, SLL Project, COVID-19 Vaccine, Gondar, Ethiopia.
Background: Ethiopian healthcare relies heavily on Health Extension Workers (HEWs), who deliver essential services to communities nationwide. By analyzing existing research, the authors explore how prevalent job satisfaction is and what factors affect it. This comprehensive analysis aims to improve HEW satisfaction through targeted interventions, ultimately leading to a more effective healthcare workforce and better health outcomes in Ethiopia.
Sci Rep
January 2025
Research Unit of Health Sciences and Technology, University of Oulu, Oulu, Finland.
Optical techniques, such as functional near-infrared spectroscopy (fNIRS), hold great potential for the development of non-invasive wearable systems for evaluating cerebral vascular condition in aging, due to their portability and ability to monitor real-time changes in cerebral hemodynamics. In this study, thirty-six healthy adults were measured by single-channel fNIRS to explore differences between two age groups using machine learning (ML). The subjects, measured during functional magnetic resonance imaging (fMRI) at Oulu University Hospital, were divided into young (age ≤ 32) and elderly (age ≥ 57) groups.
Sci Rep
January 2025
Faculty of Engineering, Université de Moncton, Moncton, NB, E1A3E9, Canada.
Diabetes is a growing health concern in developing countries, causing considerable mortality. While machine learning (ML) approaches have been widely used to improve early detection and treatment, several studies have shown low classification accuracies due to overfitting, underfitting, and data noise. This research employs parallel and sequential ensemble ML approaches paired with feature selection techniques to boost classification accuracy.
Bone
January 2025
ARTORG Centre for Biomedical Engineering Research, University of Bern, Bern, Switzerland.
Osteoporosis is the most common metabolic bone imbalance, leading to fragility fractures, which are known to be associated with structural changes in the bone. Cortical bone accounts for 80% of the skeletal mass and undergoes remodeling throughout life, leading to changes in its thickness and microstructure. Although many studies have quantified the different cortical bone structures using 3D CT techniques, they are often performed on a small number of samples.
J Am Med Inform Assoc
January 2025
Department of Cardiology, Royal North Shore Hospital, Sydney, NSW, Australia.
Objective: We aimed to develop a highly interpretable and effective machine learning-based risk prediction algorithm to predict in-hospital mortality, intubation, and adverse cardiovascular events in patients hospitalised with COVID-19 in Australia (AUS-COVID Score).
Materials and Methods: This prospective study across 21 hospitals included 1714 consecutive patients aged ≥ 18 in their index hospitalization with COVID-19. The dataset was separated into training (80%) and test (20%) sets.