A mixed spatial prediction model in estimating spatiotemporal variations in benzene concentrations in Taiwan.

Chemosphere

Department of Geomatics, National Cheng Kung University, Tainan, Taiwan; National Institute of Environmental Health Sciences, National Health Research Institutes, Miaoli, Taiwan.

Published: August 2022

AI Article Synopsis

  • This study is the first to predict changes in benzene concentrations across Taiwan using a mix of machine learning algorithms and Land-use Regression (LUR) techniques.
  • Monthly benzene data from 2003 to 2019 were used to develop the model; monthly data from 2020 and mobile monitoring vehicle data from 2009 to 2019 served as external data to validate its accuracy.
  • The ensemble model combining GBoost, CatBoost, and XGBoost improved prediction accuracy, capturing 92% of benzene concentration variability compared to the conventional LUR's 79%.

Article Abstract

It is well known that benzene negatively impacts human health. This study is the first to predict spatiotemporal variations in benzene concentrations for the entirety of Taiwan, using a mixed spatial prediction model that integrates multiple machine learning algorithms with predictor variables selected by Land-use Regression (LUR). Monthly benzene concentrations from 2003 to 2019 were used for model development; monthly benzene concentration data from 2020, as well as mobile monitoring vehicle data from 2009 to 2019, served as external data for verifying model reliability. Benzene concentrations were estimated by running six LUR-based machine learning algorithms: random forest (RF), deep neural network (DNN), gradient boosting (GBoost), light gradient boosting (LightGBM), CatBoost, and extreme gradient boosting (XGBoost), together with an ensemble algorithm combining the three best-performing models. These algorithms can capture nonlinear relationships between observations and predictions. The results indicated that conventional LUR captured 79% of the variability in benzene concentrations. Notably, LUR with the ensemble algorithm (GBoost, CatBoost, and XGBoost) surpassed all other integrated methods, increasing the explanatory power to 92%. This study establishes the value of the proposed ensemble-based model for estimating spatiotemporal variation in benzene exposure.
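
The two ideas the abstract relies on, combining several models' predictions and reporting the share of variability captured, can be illustrated with a minimal sketch. The abstract does not say how the three best-performing models were combined, so the unweighted average below is an assumption, as are the function names `r_squared` and `average_ensemble`. The numbers are made up for illustration and are not data from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: the share of variance in `observed`
    that is explained by `predicted` (1.0 means a perfect fit)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def average_ensemble(*model_predictions):
    """Combine per-model predictions by an unweighted mean.
    Assumption: the study's actual combination rule is not given in the abstract."""
    return [mean(values) for values in zip(*model_predictions)]


# Hypothetical values standing in for observed concentrations and for the
# predictions of three fitted models; none of these come from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.2, 1.8, 3.3, 3.9]
pred_b = [0.9, 2.3, 2.8, 4.2]
pred_c = [1.1, 2.1, 2.9, 4.1]

ensemble = average_ensemble(pred_a, pred_b, pred_c)

for name, pred in [("A", pred_a), ("B", pred_b), ("C", pred_c), ("ensemble", ensemble)]:
    print(name, round(r_squared(observed, pred), 4))
```

Read this way, a figure such as "92% of the variability" corresponds to an R² of 0.92 between observed and predicted concentrations. Averaging helps in this toy case because the three models' errors partly cancel, which is not guaranteed in general.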

Source
http://dx.doi.org/10.1016/j.chemosphere.2022.134758

Publication Analysis

Top Keywords

benzene concentrations: 20
gradient boosting: 12
mixed spatial: 8
spatial prediction: 8
prediction model: 8
model estimating: 8
estimating spatiotemporal: 8
benzene: 8
variations benzene: 8
machine learning: 8
