Motivation: Metabolomics generates complex data necessitating advanced computational methods for generating biological insight. While machine learning (ML) is promising, the challenges of selecting the best algorithms and tuning hyperparameters, particularly for non-experts, remain. Automated machine learning (AutoML) can streamline this process; however, the issue of interpretability could persist. This research introduces a unified pipeline that combines AutoML with explainable AI (XAI) techniques to optimize metabolomics analysis.
Results: We tested our approach on two datasets: renal cell carcinoma (RCC) urine metabolomics and ovarian cancer (OC) serum metabolomics. AutoML, using auto-sklearn, surpassed standalone ML algorithms such as SVM and random forest in differentiating between RCC and healthy controls, as well as OC patients and those with other gynecological cancers (Non-OC). Auto-sklearn employed a mix of algorithms and ensemble techniques, yielding a superior performance (AUC of 0.97 for RCC and 0.85 for OC). Shapley Additive Explanations (SHAP) provided a global ranking of feature importance, identifying dibutylamine and ganglioside GM(d34:1) as the top discriminative metabolites for RCC and OC, respectively. Waterfall plots offered local explanations by illustrating the influence of each metabolite on individual predictions. Dependence plots spotlighted metabolite interactions, such as the connection between hippuric acid and one of its derivatives in RCC, and between GM3(d34:1) and GM3(18:1_16:0) in OC, hinting at potential mechanistic relationships. Through decision plots, a detailed error analysis was conducted, contrasting feature importance for correctly versus incorrectly classified samples. In essence, our pipeline emphasizes the importance of harmonizing AutoML and XAI, facilitating both simplified ML application and improved interpretability in metabolomics data science.
Availability: https://github.com/obifarin/automl-xai-metabolomics.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634896 | PMC |
http://dx.doi.org/10.1101/2023.10.26.564244 | DOI Listing |
Am J Emerg Med
January 2025
Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT, USA; Center for Outcomes Research and Evaluation, Yale University, New Haven, CT, USA.
Background: This study aimed to examine how physician performance metrics are affected by the speed of other attendings (co-attendings) concurrently staffing the ED.
Methods: A retrospective study was conducted using patient data from two EDs between January-2018 and February-2020. Machine learning was used to predict patient length of stay (LOS) conditional on being assigned a physician of average speed, using patient- and departmental-level variables.
JMIR Med Inform
January 2025
Department of Science and Education, Shenzhen Baoan Women's and Children's Hospital, Shenzhen, China.
Background: Large language models (LLMs) have been proposed as valuable tools in medical education and practice. The Chinese National Nursing Licensing Examination (CNNLE) presents unique challenges for LLMs due to its requirement for both deep domain-specific nursing knowledge and the ability to make complex clinical decisions, which differentiates it from more general medical examinations. However, their potential application in the CNNLE remains unexplored.
View Article and Find Full Text PDFJMIR AI
January 2025
Department of Radiology, Children's National Hospital, Washington, DC, United States.
J Chem Theory Comput
January 2025
Guizhou Provincial Engineering Technology Research Center for Chemical Drug R&D, School of Pharmacy, Guizhou Medical University, Guiyang, Guizhou 550025, P. R. China.
Traditional machine learning methods face significant challenges in predicting the properties of highly symmetric molecules. In this study, we developed a machine learning model based on graph neural networks (GNNs) to accurately and swiftly predict the thermodynamic and photochemical properties of fullerenols, such as C(OH) ( = 1 to 30). First, we established a global method for generating fullerenol isomers through isomer fingerprinting, which can generate all possible isomers or produce diverse structural types on demand.
View Article and Find Full Text PDFPLoS One
January 2025
Dirección General de Minería, República Dominicana.
This study investigates the geochemical characteristics of rare earth elements (REEs) in highland karstic bauxite deposits located in the Sierra de Bahoruco, Pedernales Province, Dominican Republic. These deposits, formed through intense weathering of volcanic material, represent a potentially valuable REE resource for the nation. Surface and subsurface soil samples were analyzed using portable X-ray fluorescence (pXRF) and a NixPro 2 color sensor validated with inductively coupled plasma optical emission spectrometry (ICP-OES).
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!