Improving the performance of machine learning models for early warning of harmful algal blooms using an adaptive synthetic sampling method.

Water Res

Department of Civil, Environmental and Plant Engineering, Konkuk University, Seoul 05029, Republic of Korea; Department of Civil and Environmental Engineering, Konkuk University, Seoul 05029, Republic of Korea.

Published: December 2021

AI Article Synopsis

  • The study aimed to create an early warning system for harmful algal blooms, using machine learning models (ANN and SVM) trained on eight years of data to support decision-making and management practices.
  • A significant challenge was class imbalance in the alert level data, which degraded the models' performance; this was addressed by generating synthetic data with the ADASYN method.
  • Models trained on both original and synthetic data showed improved recall and precision for the critical alert levels, especially the caution level that marks the transition from normal conditions to bloom formation.
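The oversampling step mentioned in the second point can be sketched as follows. This is a minimal illustration of the ADASYN idea, not the study's implementation: it assumes a one-vs-rest treatment of a single minority alert level, uses invented toy feature values, and the names `adasyn_sketch`, `k`, `beta`, and `seed` are hypothetical. Minority samples surrounded by more samples from other classes receive a larger share of the synthetic points, and each synthetic point is an interpolation between a minority sample and one of its minority neighbours.

```python
import math
import random


def _neighbours(i, X, pool, k):
    """Return up to k indices from pool (excluding i) nearest to X[i]."""
    others = [j for j in pool if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def adasyn_sketch(X, y, minority_label, k=5, beta=1.0, seed=0):
    """Generate synthetic minority samples (one-vs-rest simplification)."""
    rng = random.Random(seed)
    all_idx = list(range(len(X)))
    min_idx = [i for i in all_idx if y[i] == minority_label]
    # Number of samples needed to close the gap; beta=1.0 aims at full balance.
    n_needed = int(beta * (len(X) - 2 * len(min_idx)))
    if n_needed <= 0 or len(min_idx) < 2:
        return []

    # Difficulty ratio: share of other-class points among the k nearest neighbours.
    ratios = []
    for i in min_idx:
        nbrs = _neighbours(i, X, all_idx, k)
        ratios.append(sum(y[j] != minority_label for j in nbrs) / len(nbrs))

    total = sum(ratios)
    if total == 0:
        # Assumption of this sketch: fall back to uniform weights
        # when no minority point has other-class neighbours.
        weights = [1.0 / len(min_idx)] * len(min_idx)
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        n_i = round(w * n_needed)  # rounding may shift the total slightly
        min_nbrs = _neighbours(i, X, min_idx, k)
        for _ in range(n_i):
            j = rng.choice(min_nbrs)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (invented values): eight "L0" samples and three "L1" samples.
X_demo = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
          [0.2, 0.4], [0.4, 0.2], [0.1, 0.4], [0.4, 0.4],
          [0.5, 0.5], [1.0, 0.8], [0.8, 1.0]]
y_demo = ["L0"] * 8 + ["L1"] * 3
new_points = adasyn_sketch(X_demo, y_demo, "L1", k=3)
print(len(new_points), "synthetic L1 samples generated")
```

In practice a maintained implementation (for example the ADASYN class in the imbalanced-learn package) would normally be preferred; the sketch only shows why samples near the class boundary are emphasised.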

Article Abstract

Many countries have attempted to monitor and predict harmful algal blooms to mitigate related problems and establish management practices. The current alert system, based on sampling of cell density, is used to indicate bloom status and to prompt a rapid and adequate response from water-associated organizations. The objective of this study was to develop an early warning system for cyanobacterial blooms that allows efficient decision making before blooms occur and guides preemptive management actions. Two machine learning models, an artificial neural network (ANN) and a support vector machine (SVM), were constructed for the timely prediction of algal bloom alert levels using eight years' worth of meteorological, hydrodynamic, and water quality data from a reservoir where harmful cyanobacterial blooms frequently occur during summer. However, the imbalance among alert levels in the output variable leads to biased training of data-driven models and degrades their prediction performance. Therefore, synthetic data generated by an adaptive synthetic (ADASYN) sampling method were used to resolve the shortage of minority class data in the original data and to improve the prediction performance of the models. The results showed that overall prediction performance for the caution level (L1) and warning level (L2) was higher in the models constructed using a combination of original and synthetic data than in the models constructed using original data only. In particular, the optimal ANN and SVM constructed using the combined data yielded distinctly improved recall and precision for L1 during both training (including validation) and testing; L1 is a critical alert level because it indicates the transition from normal conditions to bloom formation.
In addition, both optimal models constructed using synthetic-added data improved recall and precision by more than 33.7% when predicting L1 and L2 during the test. Therefore, the application of synthetic data can improve the detection performance of machine learning models by resolving the imbalance of observed data. Reliable predictions from the improved models can aid the design of management practices to mitigate algal blooms within a reservoir.
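The abstract reports results as per-level recall and precision rather than overall accuracy. The following sketch shows how those two metrics are computed for each alert level and why they expose minority-class failures that accuracy hides. The label strings, the function name `per_class_precision_recall`, and the example predictions are illustrative assumptions, not data from the study.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred, labels):
    """Return {label: (precision, recall)} for each requested label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this level, but it was wrong
            fn[truth] += 1  # missed the true level
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Invented example: an imbalanced test set dominated by the normal level "L0".
truth = ["L0"] * 8 + ["L1", "L1"]
always_normal = ["L0"] * 10  # a biased model that never raises an alert

accuracy = sum(t == p for t, p in zip(truth, always_normal)) / len(truth)
scores = per_class_precision_recall(truth, always_normal, ["L0", "L1"])
print("accuracy:", accuracy)
print("L1 (precision, recall):", scores["L1"])
```

Here the biased model looks acceptable by accuracy alone, while its L1 recall shows that every caution-level event was missed, which is the failure mode the synthetic data is meant to reduce.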

Source
http://dx.doi.org/10.1016/j.watres.2021.117821
