Patients with inflammatory bowel disease (IBD) wait months and undergo numerous invasive procedures between the initial appearance of symptoms and receiving a diagnosis. To reduce the time to diagnosis and improve patient wellbeing, machine learning algorithms capable of diagnosing IBD from the composition of the gut microbiome are currently being explored. To date, these models have had limited clinical application because their performance decreases when they are applied to a new cohort of patient samples. Various methods have been developed to analyze microbiome data, and these may improve the generalizability of machine learning IBD diagnostic tests. With an abundance of methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (from data processing to training a machine learning model) for microbiome-based IBD diagnostic tools. We collected fifteen 16S rRNA microbiome datasets (7,707 samples) from North America to benchmark combinations of gut microbiome features, data normalization and transformation methods, batch effect correction methods, and machine learning models. Pipeline generalizability to new cohorts of patients was evaluated with two binary classification metrics following leave-one-dataset-out (LODO) cross-validation, in which all samples from one study were left out of the training set and used as the test set. We demonstrate that taxonomic features processed with a compositional transformation method and batch effect correction with the naive zero-centering method attain the best classification performance. In addition, machine learning models that identify non-linear decision boundaries between labels are more generalizable than those that are linearly constrained. Lastly, we illustrate the importance of generating a curated training dataset to ensure similar performance across patient demographics.
These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
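The pipeline shape described in the abstract (a compositional transform, per-dataset zero-centering, then leave-one-dataset-out evaluation) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The centered log-ratio transform stands in for "a compositional transformation method", since the abstract does not name one. The pseudocount, the toy counts, the study names, the nearest-centroid classifier, and the use of plain accuracy instead of the two metrics the abstract mentions are all assumptions made for brevity.

```python
import math
from collections import defaultdict

PSEUDOCOUNT = 0.5  # assumed value, added so zero counts have a defined logarithm


def clr(counts):
    """Centered log-ratio transform of one sample's taxon counts."""
    logs = [math.log(c + PSEUDOCOUNT) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def zero_center_by_dataset(features, datasets):
    """Subtract each dataset's per-feature mean from that dataset's samples."""
    rows_by_dataset = defaultdict(list)
    for i, name in enumerate(datasets):
        rows_by_dataset[name].append(i)
    centered = [None] * len(features)
    for rows in rows_by_dataset.values():
        n_features = len(features[rows[0]])
        means = [sum(features[i][j] for i in rows) / len(rows)
                 for j in range(n_features)]
        for i in rows:
            centered[i] = [features[i][j] - means[j] for j in range(n_features)]
    return centered


def lodo_splits(datasets):
    """Yield (held_out, train_idx, test_idx), holding out one whole dataset per fold."""
    for held_out in sorted(set(datasets)):
        train_idx = [i for i, name in enumerate(datasets) if name != held_out]
        test_idx = [i for i, name in enumerate(datasets) if name == held_out]
        yield held_out, train_idx, test_idx


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    grouped = defaultdict(list)
    for x, y in zip(train_x, train_y):
        grouped[y].append(x)
    centroids = {y: [sum(col) / len(col) for col in zip(*rows)]
                 for y, rows in grouped.items()}
    return [min(centroids, key=lambda y: math.dist(x, centroids[y]))
            for x in test_x]


# Hypothetical toy data: three studies, four samples each, three taxa.
counts = [
    [120, 30, 5], [110, 40, 8], [20, 90, 60], [25, 80, 70],            # study_A
    [900, 200, 40], [850, 260, 30], [150, 700, 500], [100, 650, 480],  # study_B
    [60, 10, 0], [55, 15, 2], [8, 40, 35], [12, 50, 30],               # study_C
]
labels = [0, 0, 1, 1] * 3  # 0 = control, 1 = IBD (toy labels)
datasets = ["study_A"] * 4 + ["study_B"] * 4 + ["study_C"] * 4

# Centering uses only dataset membership, never the diagnosis labels.
features = zero_center_by_dataset([clr(row) for row in counts], datasets)

results = {}
for held_out, train_idx, test_idx in lodo_splits(datasets):
    predictions = nearest_centroid_predict(
        [features[i] for i in train_idx],
        [labels[i] for i in train_idx],
        [features[i] for i in test_idx],
    )
    n_correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
    results[held_out] = n_correct / len(test_idx)
    print(held_out, results[held_out])
```

Each fold trains on every study except one and tests on the study that was left out, so the score reflects performance on a cohort the model has never seen. In practice the nearest-centroid stand-in would be replaced by the models being benchmarked.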
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8895431 | PMC |
| http://dx.doi.org/10.3389/fgene.2022.784397 | DOI Listing |
Medicine (Baltimore)
January 2025
Department of Otolaryngology, Hangzhou Red Cross Hospital (Zhejiang Hospital of Integrated Traditional Chinese and Western Medicine), Hangzhou, Zhejiang, China.
T-helper 17 (Th17) cells significantly influence the onset and advancement of malignancies. This study focused on delineating molecular classifications and developing a prognostic signature grounded in Th17 cell differentiation-related genes (TCDRGs) using machine learning algorithms in head and neck squamous cell carcinoma (HNSCC). A consensus clustering approach was applied to The Cancer Genome Atlas-HNSCC cohort based on TCDRGs, followed by an examination of differential gene expression using the limma package.
Anal Chem
January 2025
Key Laboratory of OptoElectronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, Fujian Normal University, Fuzhou, Fujian 350117, China.
Multiple myeloma is a hematologic malignancy characterized by the proliferation of abnormal plasma cells in the bone marrow. Despite therapeutic advancements, there remains a critical need for reliable, noninvasive methods to monitor multiple myeloma. Circulating plasma cells (CPCs) in peripheral blood are robust and independent prognostic markers, but their detection is challenging due to their low abundance.
J Med Internet Res
January 2025
Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan.
Background: Sepsis, a critical global health challenge, accounted for approximately 20% of worldwide deaths in 2017. Although the Sequential Organ Failure Assessment (SOFA) score standardizes the diagnosis of organ dysfunction, early sepsis detection remains challenging due to its insidious symptoms. Current diagnostic methods, including clinical assessments and laboratory tests, frequently lack the speed and specificity needed for timely intervention, particularly in vulnerable populations such as older adults, intensive care unit (ICU) patients, and those with compromised immune systems.
JCO Clin Cancer Inform
January 2025
Machine Learning Department, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL.
Purpose: Adaptive radiotherapy accounts for interfractional anatomic changes. We hypothesize that changes in the gross tumor volumes identified during daily scans could be analyzed using delta-radiomics to predict disease progression events. We evaluated whether an auxiliary data set could improve prediction performance.
J Clin Oncol
January 2025
INSERM, IMRBU955, Univ Paris Est Créteil, Créteil, France.
Purpose: Establishing an accurate prognosis remains challenging in older patients with cancer because of the population's heterogeneity and the current predictive models' reduced ability to capture the complex interactions between oncologic and geriatric predictors. We aim to develop and externally validate a new predictive score (the Geriatric Cancer Scoring System [GCSS]) to refine individualized prognosis for older patients with cancer during the first year after a geriatric assessment (GA).
Materials And Methods: Data were collected from two French prospective multicenter cohorts of patients with cancer 70 years and older, referred for GA: ELCAPA (training set January 2007-March 2016) and ONCODAGE (validation set August 2008-March 2010).