This manuscript presents a proof-of-concept for a generalizable strategy, the full algorithm, designed to estimate disease risk from real-world clinical tabular data systems such as electronic health records (EHR) or claims databases. By integrating classic statistical methods with modern artificial intelligence techniques, the strategy automates the production of a disease prediction model that comprehensively reflects the dynamics contained within the underlying data system. Specifically, the full algorithm parses every facet of the data (e.g., encounters, diagnoses, procedures, medications, labs, chief complaints, flowsheets, vital signs, and demographics), selects which factors to retain as predictor variables by evaluating the data empirically against statistical criteria, structures the retained data into time series, trains a neural network-based prediction model, and then applies this model to current patients to generate risk estimates. A distinguishing feature of the proposed strategy is that it produces a self-adaptive prediction system: as newly collected data expand or modify the dataset, the prediction mechanism automatically evolves to reflect these changes. Moreover, the full algorithm operates without a priori data curation and aims to harness all informative risk and protective factors within the real-world data. This stands in contrast to traditional approaches, which often rely on highly curated datasets and domain expertise to build static prediction models based solely on well-known risk factors. As a proof-of-concept, we codified the full algorithm and tasked it with estimating 12-month risk of initial stroke or myocardial infarction using our hospital's real-world EHR.
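The selection and structuring steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which statistical criteria are used, so the chi-square test, the 3.84 threshold, the `(patient_id, month_index, code)` event layout, and all function names here are assumptions chosen only for illustration.

```python
from collections import defaultdict


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_codes(events, outcomes, threshold=3.84):
    """Keep codes whose presence is associated with the outcome.

    events:    list of (patient_id, month_index, code) tuples (assumed layout)
    outcomes:  dict mapping patient_id -> 0/1 outcome label
    threshold: hypothetical cutoff (3.84 is roughly p < 0.05 at 1 d.o.f.)
    """
    patients_with = defaultdict(set)
    for pid, _, code in events:
        patients_with[code].add(pid)
    n_pos = sum(outcomes.values())
    n_neg = len(outcomes) - n_pos
    kept = []
    for code, pids in patients_with.items():
        a = sum(outcomes[p] for p in pids)  # code present, outcome
        b = len(pids) - a                   # code present, no outcome
        c = n_pos - a                       # code absent, outcome
        d = n_neg - b                       # code absent, no outcome
        if chi_square_2x2(a, b, c, d) >= threshold:
            kept.append(code)
    return sorted(kept)


def to_time_series(events, codes, n_months):
    """Build one [n_months x len(codes)] binary matrix per patient.

    Patients with no retained events are omitted from the result.
    """
    col = {code: j for j, code in enumerate(codes)}
    series = {}
    for pid, month, code in events:
        if code in col and 0 <= month < n_months:
            mat = series.setdefault(
                pid, [[0] * len(codes) for _ in range(n_months)]
            )
            mat[month][col[code]] = 1
    return series


# Toy data: "I10" occurs only in patients with the outcome, "Z00" is uninformative.
events = [
    ("p1", 1, "I10"), ("p2", 0, "I10"), ("p3", 2, "I10"), ("p4", 0, "I10"),
    ("p1", 0, "Z00"), ("p5", 1, "Z00"),
]
outcomes = {"p1": 1, "p2": 1, "p3": 1, "p4": 1,
            "p5": 0, "p6": 0, "p7": 0, "p8": 0}

retained = select_codes(events, outcomes)
series = to_time_series(events, retained, n_months=3)
```

In this toy setup only `I10` is retained, and each patient's matrix could then be fed to a sequence model. Rerunning `select_codes` as new data arrive is one simple way to picture the self-adaptive behaviour the abstract describes.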
A 66-month pseudo-prospective validation was conducted using records from 558,105 patients spanning April 2015 to September 2023, totalling 3,424,060 patient-months. Area under the receiver operating characteristic curve (AUROC) values ranged from 0.830 to 0.909, with an improving trend over time. Odds ratios describing model precision for the patients ranked 1-100 and 101-200 by estimated risk ranged from 15.3 to 48.1 and from 7.2 to 45.0, respectively, with both groups showing improving trends over time. These findings suggest that it is feasible to develop high-performing disease risk calculators in the proposed manner.
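The two kinds of metric reported above can be computed as in the following sketch. The AUROC uses the standard rank-based (Mann-Whitney) formulation. The odds ratio is an assumption: the abstract does not define the comparison group, so this version compares a ranked slice of patients against all remaining patients and applies a 0.5 continuity correction when a cell is zero. Function names and the toy data are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def ranked_group_odds_ratio(scores, labels, start, stop):
    """Odds of the outcome for patients ranked start..stop-1 (0-based,
    descending score) versus all other patients (assumed comparison group)."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    group = set(ranked[start:stop])
    a = sum(labels[i] for i in group)                                 # group, outcome
    b = len(group) - a                                                # group, no outcome
    c = sum(labels[i] for i in range(len(labels)) if i not in group)  # rest, outcome
    d = (len(labels) - len(group)) - c                                # rest, no outcome
    if 0 in (a, b, c, d):
        # Continuity correction (an assumption) to avoid division by zero.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


# Toy example: eight patients, four of whom experience the outcome.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

example_auroc = auroc(scores, labels)                       # 0.75
example_or = ranked_group_odds_ratio(scores, labels, 0, 4)  # 9.0
```

For the study's groups, the slices would be `start=0, stop=100` and `start=100, stop=200` over the patients ranked in a given validation month.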
| Full-text link | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11371204 | PMC |
| http://dx.doi.org/10.1371/journal.pdig.0000589 | DOI |