Plant specialized metabolism (SM) enzymes produce lineage-specific metabolites with important ecological, evolutionary, and biotechnological implications. Using Arabidopsis thaliana as a model, we identified distinguishing characteristics of SM and GM (general metabolism, traditionally referred to as primary metabolism) genes through a detailed study of features including duplication pattern, sequence conservation, transcription, protein domain content, and gene network properties. Analysis of multiple sets of benchmark genes revealed that SM genes tend to be tandemly duplicated, coexpressed with their paralogs, narrowly expressed at lower levels, less conserved, and less well connected in gene networks relative to GM genes. Although the values of each of these features differed significantly between SM and GM genes, no single feature was effective at distinguishing SM from GM genes. Using machine learning methods to integrate all features, a prediction model was established with a true positive rate of 87% and a true negative rate of 71%. In addition, 86% of known SM genes not used to create the machine learning model were correctly predicted. We also demonstrated that the model could be further improved when we distinguished between SM, GM, and junction genes responsible for reactions shared by SM and GM pathways, indicating that topological considerations may further improve the SM prediction model. Application of the prediction model led to the identification of 1,220 A. thaliana genes with previously unknown functions, each assigned a confidence measure called an SM score, providing a global estimate of SM gene content in a plant genome.
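
The two performance figures quoted above, true positive rate and true negative rate, can be illustrated with a minimal sketch. It assumes binary labels where 1 stands for an SM gene and 0 for a GM gene; the function name `tpr_tnr` and the toy labels are hypothetical, are not taken from the study, and do not reproduce its data or model.

```python
from typing import Sequence, Tuple


def tpr_tnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate) for binary labels.

    Assumption for illustration: 1 = SM gene, 0 = GM gene.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    # TPR = TP / (TP + FN); TNR = TN / (TN + FP).
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```

Reporting both rates separately, rather than a single accuracy value, shows how well the classifier recovers each class on its own.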

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6369796
DOI: http://dx.doi.org/10.1073/pnas.1817074116
