Background: Machine learning approaches for predicting disease risk from high-dimensional whole genome sequence (WGS) data often yield unstable models that are difficult to interpret, which limits the identification of putative sets of biomarkers. Here, we design and validate a graph-based methodology built on maximum flow, which leverages linkage disequilibrium (LD) to identify stable sets of variants associated with complex multigenic disorders.

Results: We apply our method to a previously published logistic regression model trained to identify variants in simple repeat sequences associated with autism spectrum disorder (ASD). This L1-regularized model exhibits high predictive accuracy, yet the features it selects from over 230,000 possible variants vary greatly between folds. To improve model stability, we extract the variants assigned non-zero weights in each of five cross-validation folds and then assemble the five sets of features into a flow network subject to LD constraints. The maximum flow formulation allows us to identify 55 variants, which we show to be more stable than the features identified by the original classifier.
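The abstract does not spell out how the flow network is constructed, so the following is only a minimal sketch of one plausible layout, not the authors' implementation. It assumes one layer of nodes per fold, an edge between variants in adjacent folds when they are identical or in LD, and unit node capacities (enforced by splitting each node into an "in" and an "out" half). The variant IDs, the `in_ld` predicate, and the three-fold toy data are all hypothetical.

```python
from collections import defaultdict, deque


def build_network(folds, in_ld):
    """Layered unit-capacity network (assumed layout, one layer per fold)."""
    cap = defaultdict(dict)
    last = len(folds) - 1
    for k, variants in enumerate(folds):
        for v in variants:
            # Node splitting: each variant can carry at most one unit of flow.
            cap[("in", k, v)][("out", k, v)] = 1
            if k == 0:
                cap["source"][("in", k, v)] = 1
            if k == last:
                cap[("out", k, v)]["sink"] = 1
            else:
                for w in folds[k + 1]:
                    # LD constraint: link only identical or LD-linked variants.
                    if v == w or in_ld(v, w):
                        cap[("out", k, v)][("in", k + 1, w)] = 1
    return cap


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest paths in the residual graph."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(residual[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, residual
        # Find the bottleneck on the augmenting path, then push flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical toy data: three folds instead of five, made-up variant IDs.
folds = [{"rs1", "rs2", "rs9"}, {"rs1", "rs3"}, {"rs1", "rs2", "rs7"}]
ld_pairs = {frozenset(("rs2", "rs3"))}  # assumed LD relationship


def in_ld(a, b):
    return frozenset((a, b)) in ld_pairs


flow, residual = max_flow(build_network(folds, in_ld), "source", "sink")
# A variant is selected when its in->out edge is saturated by the flow.
selected = {v for k, variants in enumerate(folds) for v in variants
            if residual[("in", k, v)][("out", k, v)] == 0}
print(flow, sorted(selected))
```

In this toy case the maximum flow is 2: one path follows `rs1` through every fold, and a second path links `rs2` and `rs3` through LD. The variants `rs9` and `rs7`, each seen in a single fold with no LD partner in the neighbouring fold, carry no flow and are dropped. Under this assumed layout, each unit of flow corresponds to a chain of LD-linked variants supported by every fold.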

Conclusion: Our method enables machine learning models to identify stable sets of predictive variants. Our results help pave the way towards biomarker-based diagnosis methods for complex genetic disorders.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8091705
DOI: http://dx.doi.org/10.1186/s13040-021-00262-x


