Identification of Genome Sequences of Polyphosphate-Accumulating Organisms by Machine Learning.

Front Cell Dev Biol

State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, China.

Published: January 2021

In sewage treatment, the identification of polyphosphate-accumulating organisms (PAOs) usually relies on biological experiments, which are complicated, time-consuming, and costly. Machine learning has been widely adopted in many fields in recent years, but it is still seldom used in water treatment. This work presents a high-accuracy support vector machine (SVM) model for the rapid identification and prediction of PAOs. We obtained 6,318 microbial genome sequences from the publicly available Microbial Genome Database for Comparative Analysis (MBGD). Minimap2 was used to compare the genomes in pairs and to read the overlaps between them. The SVM model was then built on the resulting genome sequence similarities. With 10-fold cross-validation, the model reaches an average accuracy of 0.9628 ± 0.019. Applying the model to 2,652 microorganisms yielded 22 potential PAOs. For most of these predicted PAOs, phosphorus-removal characteristics could be indirectly verified from previous reports. The SVM model shows high prediction accuracy and good stability.
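As an illustration of how pairwise minimap2 overlaps might be turned into similarity features, the following is a minimal sketch, not the authors' pipeline. It reads lines in PAF, minimap2's default output format, where column 10 is the number of matching residues. The similarity definition (matching bases divided by query length), the function names, and the assumption that each genome is a single sequence named by its genome ID are all hypothetical; the abstract does not specify them.

```python
from collections import defaultdict


def parse_paf_line(line):
    """Extract the PAF columns used below from one minimap2 output line."""
    fields = line.rstrip("\n").split("\t")
    return {
        "query": fields[0],           # column 1: query sequence name
        "query_len": int(fields[1]),  # column 2: query sequence length
        "target": fields[5],          # column 6: target sequence name
        "matches": int(fields[9]),    # column 10: number of matching residues
    }


def similarity_matrix(paf_lines, genomes):
    """Build a genome-by-genome similarity matrix from PAF overlap records.

    Assumed (hypothetical) definition: for each ordered pair (query, target),
    sum the matching bases over all overlaps and divide by the query length.
    """
    matched = defaultdict(int)
    query_len = {}
    for line in paf_lines:
        if not line.strip():
            continue  # skip blank lines
        rec = parse_paf_line(line)
        matched[(rec["query"], rec["target"])] += rec["matches"]
        query_len[rec["query"]] = rec["query_len"]

    matrix = []
    for q in genomes:
        row = []
        for t in genomes:
            if q == t:
                row.append(1.0)  # a genome is identical to itself
            elif q in query_len:
                # cap at 1.0 because overlapping alignments can double-count bases
                row.append(min(1.0, matched[(q, t)] / query_len[q]))
            else:
                row.append(0.0)  # no overlap reported for this query
        matrix.append(row)
    return matrix


# Made-up PAF records for three toy genomes (not real data).
demo = [
    "gA\t1000\t0\t500\t+\tgB\t1200\t100\t600\t450\t500\t60",
    "gA\t1000\t600\t800\t-\tgB\t1200\t700\t900\t150\t200\t60",
    "gB\t1200\t0\t300\t+\tgC\t900\t0\t300\t240\t300\t60",
]
genomes = ["gA", "gB", "gC"]
sim = similarity_matrix(demo, genomes)
for name, row in zip(genomes, sim):
    print(name, row)
```

Each row of the matrix could then serve as the feature vector for one genome in an SVM classifier evaluated with 10-fold cross-validation. The kernel and other model settings are not stated in the abstract, so they are left open here.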


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7848102
DOI: http://dx.doi.org/10.3389/fcell.2020.626221

