Integrated knowledge mining, genome-scale modeling, and machine learning for predicting Yarrowia lipolytica bioproduction.

Metab Eng

Department of Energy, Environmental and Chemical Engineering, Washington University, St. Louis, MO, 63130, USA.

Published: September 2021

Predicting bioproduction titers from microbial hosts has been challenging because of complex interactions among microbial regulatory networks, stress responses, and suboptimal cultivation conditions. This study integrated knowledge mining, feature extraction, genome-scale modeling (GSM), and machine learning (ML) to develop a model for predicting Yarrowia lipolytica chemical titers (e.g., organic acids and terpenoids). First, Y. lipolytica production data, including cultivation conditions, genetic engineering strategies, and product information, were manually collected from the literature (~100 papers) and stored as either numerical (e.g., substrate concentrations) or categorical (e.g., bioreactor modes) variables. For each recorded case, central pathway fluxes were estimated using GSMs and flux balance analysis (FBA) to provide metabolic features. Second, an ML ensemble learner was trained to predict strain production titers. Predictions on the test data were accurate for instances with production titers >1 g/L (R = 0.87). However, the model was less predictive for low-performance strains (0.01-1 g/L, R = 0.29), potentially because of biosynthesis bottlenecks not captured in the features. Feature ranking indicated that the FBA fluxes, the number of enzyme steps, the substrate inputs, and thermodynamic barriers (i.e., Gibbs free energy of reaction) were the most influential factors. Third, the model was evaluated on other oleaginous yeasts, and the results indicated that some hosts share conserved features that could potentially be exploited by transfer learning. The platform was also designed to assist computational strain design tools (such as OptKnock) in screening genetic targets for improved microbial production in light of experimental conditions.
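
The abstract reports correlation separately for two titer ranges (>1 g/L and 0.01-1 g/L). The following is a minimal sketch of how such a stratified evaluation could be computed, not the authors' code. It assumes R denotes the Pearson correlation between measured and predicted titers on the raw scale (the abstract does not say whether titers were transformed). The function names and all titer values are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def r_by_titer_range(measured, predicted, low_min=0.01, split=1.0):
    """Return Pearson R for the low (low_min..split g/L) and high (>split g/L) groups.

    Instances are assigned to a group by their measured titer.
    The 0.01 and 1 g/L boundaries follow the ranges quoted in the abstract.
    """
    low_m, low_p, high_m, high_p = [], [], [], []
    for m, p in zip(measured, predicted):
        if m > split:
            high_m.append(m)
            high_p.append(p)
        elif m >= low_min:
            low_m.append(m)
            low_p.append(p)
    return {"low": pearson_r(low_m, low_p), "high": pearson_r(high_m, high_p)}


# Hypothetical measured and predicted titers in g/L (illustration only,
# not data from the study).
measured = [0.02, 0.1, 0.5, 0.9, 2.0, 5.0, 12.0, 30.0]
predicted = [0.4, 0.05, 0.3, 0.6, 2.5, 4.0, 14.0, 27.0]

scores = r_by_titer_range(measured, predicted)
print(scores)
```

Reporting the two groups separately, as the abstract does, keeps a few high-titer instances from dominating a single pooled correlation and hiding weak performance at the low end.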

Source
http://dx.doi.org/10.1016/j.ymben.2021.07.003
