Optimized stacking, a new method for constructing ensemble surrogate models applied to DNAPL-contaminated aquifer remediation.

J Contam Hydrol

Civil, Water and Environmental Engineering Faculty, Shahid Beheshti Univ., P.O. Box 16765-1719, Bahar Blvd., Hakimieh, Tehran 1658953571, Iran.

Published: December 2021

Surfactant-enhanced aquifer remediation (SEAR) is an appropriate method for DNAPL-contaminated aquifer remediation; however, due to the high cost of the SEAR method, finding the optimal remediation scenario is usually essential. Embedding numerical simulation models of DNAPL remediation within optimization routines is computationally expensive, and in this situation, using surrogate models instead of numerical models is a suitable alternative. Ensemble methods are also used to enhance the accuracy of surrogate models, and in this study, the Stacking ensemble method was applied and compared with conventional methods. First, six machine learning methods were used as surrogate models, and the impact of various feature scaling techniques on the models' performance was evaluated. The Bagging and Boosting homogeneous ensemble methods were also used to improve the base models' accuracy. A total of six stand-alone surrogate models and 12 homogeneous ensemble models were used as the base input models of the Stacking ensemble model. Due to the large size of the Stacking model, a Bayesian hyper-parameter optimization method was used to find its optimal hyper-parameters. The results showed that the Bayesian hyper-parameter optimization method performed better than common methods such as random search and grid search. Among the stand-alone surrogate models, the artificial neural network model, whose input data was scaled by the power transformer method, had the best performance, with a cross-validation RMSE of 0.065. The Boosting method increased the base models' accuracy more than other homogeneous methods, and the best Boosting model had a test RMSE of 0.039. The Stacking ensemble method significantly increased the base models' accuracy and performed better than other ensemble methods. The best ensemble surrogate model constructed with Stacking had a cross-validation RMSE of 0.016. Finally, a differential evolution optimization model was run with the Stacking ensemble model substituted for the numerical model, and the optimal remediation strategy was obtained at a total cost of $72,706.
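The final step described above, running differential evolution against a cheap surrogate instead of the numerical simulator, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `surrogate_cost` is a hypothetical toy function standing in for a trained Stacking model, the three decision variables and their bounds are invented, and the DE/rand/1/bin variant and its parameters (`f`, `cr`, `pop_size`) are common defaults, not the settings reported in the paper.

```python
import random


def surrogate_cost(x):
    """Hypothetical stand-in for a trained surrogate model.

    Maps scaled decision variables (for example, well rates) to a
    predicted remediation cost. A simple quadratic bowl is used here
    so the true minimum is known.
    """
    targets = [0.3, 0.7, 0.5]
    return sum((xi - t) ** 2 for xi, t in zip(x, targets))


def differential_evolution(cost, bounds, pop_size=20, generations=200,
                           f=0.8, cr=0.9, seed=0):
    """Minimise `cost` inside the box `bounds` using DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [cost(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct donors, none of them the current individual.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip back into the bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = cost(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


bounds = [(0.0, 1.0)] * 3
best_x, best_cost = differential_evolution(surrogate_cost, bounds)
print(best_x, best_cost)
```

Because each call to the surrogate is cheap, the optimizer can afford thousands of evaluations (here 20 individuals over 200 generations), which is the practical motivation for replacing the simulator in the loop.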

Source
http://dx.doi.org/10.1016/j.jconhyd.2021.103914

