Plant specialized metabolites mediate interactions between plants and the environment and have significant agronomic and pharmaceutical value. Most genes involved in specialized metabolism (SM) are unknown because of the large number of metabolites and the difficulty of differentiating SM genes from general metabolism (GM) genes. Plant models like Arabidopsis thaliana have extensive, experimentally derived annotations, whereas many non-model species do not. Here we employed a machine learning strategy, transfer learning, in which knowledge from A. thaliana is transferred to predict gene functions in cultivated tomato, which has fewer experimentally annotated genes. The first tomato SM/GM prediction model, built using only tomato data, performs well (F-measure = 0.74, compared with 0.5 for random and 1.0 for perfect predictions), but after manually curating 88 SM/GM genes, we found that many mis-predicted entries were likely mis-annotated. When SM/GM prediction models built with A. thaliana data were used to filter out genes where the A. thaliana-based model predictions disagreed with tomato annotations, the new tomato model trained with the filtered data improved significantly (F-measure = 0.92). Our study demonstrates that SM/GM genes can be better predicted by leveraging cross-species information. Additionally, our findings provide an example of transfer learning in genomics, where knowledge can be transferred from an information-rich species to an information-poor one.
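The filtering step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid classifier is a stand-in chosen for brevity rather than the model used in the study, the two-dimensional feature vectors and labels are made up, and the function names (`fit_centroids`, `predict`, `filter_by_agreement`, `f_measure`) are hypothetical.

```python
def fit_centroids(X, y):
    """Train a toy nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, X):
    """Assign each feature vector the label of its closest centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(centroids[lab], x)) for x in X]


def filter_by_agreement(source_model, X_target, y_target):
    """Keep only target genes whose annotation matches the source-model prediction."""
    preds = predict(source_model, X_target)
    kept = [(x, y) for x, y, p in zip(X_target, y_target, preds) if p == y]
    return [x for x, _ in kept], [y for _, y in kept]


def f_measure(y_true, y_pred, positive="SM"):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up data: the information-rich "source" species has clean labels.
source_X = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [-1.0, -0.9], [-1.2, -1.1], [-0.9, -1.0]]
source_y = ["SM", "SM", "SM", "GM", "GM", "GM"]

# Made-up "target" species data; the last two annotations are deliberately wrong.
target_X = [[0.9, 1.1], [1.2, 0.8], [-1.1, -1.0], [-0.8, -1.2], [1.0, 1.0], [-1.0, -1.0]]
target_y = ["SM", "SM", "GM", "GM", "GM", "SM"]

source_model = fit_centroids(source_X, source_y)
kept_X, kept_y = filter_by_agreement(source_model, target_X, target_y)

# Retrain the target model on the filtered annotations only.
target_model = fit_centroids(kept_X, kept_y)
```

In this toy setup the two deliberately mislabeled target entries are dropped before retraining, which mirrors the idea of using a source-species model to flag annotations that are likely wrong. A real analysis would need held-out evaluation and a far richer feature set.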

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7731531
DOI: http://dx.doi.org/10.1093/insilicoplants/diaa005
