Optimizing modeling windows to better capture the long-term variation of PM concentrations in China during 2005-2019.

Sci Total Environ

School of Public Health, Key Laboratory of Public Health Safety of the Ministry of Education, Key Laboratory of Health Technology Assessment of the Ministry of Health, Fudan University, Shanghai 200032, China.

Published: January 2023

The time interval of the data used in model development influences the accuracy of PM predictions, but this has not been widely discussed. We therefore trained models on data from multiple time windows to identify the optimal modeling windows for capturing the long-term variation of PM in China during 2005-2019. We incorporated PM measurements, aerosol optical depth (AOD), meteorological parameters, land use data, and other predictors to train random forest models. The study period was divided into two phases (2013-2019 and 2005-2012) according to the availability of PM measurements. First, we trained models under two time-window strategies to compare model performance in predicting PM from 2013 to 2019, when measurements were available. Strategy 1a (ST1a) trained one model with all available data, and Strategy 1b (ST1b) trained multiple models, each with one year of data. Second, we trained models under ten additional strategies (ST2a-ST2j), based on data from different time windows within 2013-2019, to compare accuracy in predicting PM before 2013, when measurements were unavailable. Internal and external cross-validation (CV) indicated that ST1b performed better than ST1a. Predictions based on ST1a tended to underestimate PM levels in 2013 and 2014, when PM concentrations were high, and to overestimate them after 2017, when PM dropped dramatically. The external CV for predicting historical PM was most robust for ST2i (the average of predictions from two models developed with 2013 and 2014 data, respectively). Models trained on data closer to the historical years, both in time and in PM level, performed better in predicting historical PM concentrations. Our results suggest that, in China, models trained on current-year data (one model per year) performed better during 2013-2019, and models trained on 2013 and 2014 data performed better in predicting historical PM before 2013. The comparison provides evidence for choosing optimized time windows when predicting long-term PM concentrations in China.
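
The three window strategies named in the abstract (one pooled model, one model per year, and an average of two single-year models) can be sketched roughly as follows. This is only an illustration under loud assumptions: the data are invented, a single made-up predictor stands in for AOD, meteorology, and land use, and a one-variable least-squares fit stands in for the random forest models the study actually used. The function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Each record is (year, predictor, pm). All values are synthetic and were
# chosen only to make the behaviour of the strategies easy to see.
ROWS = [
    (2013, 0.2, 50.0), (2013, 0.6, 90.0), (2013, 1.0, 130.0),
    (2014, 0.2, 44.0), (2014, 0.6, 80.0), (2014, 1.0, 116.0),
    (2018, 0.2, 26.0), (2018, 0.6, 46.0), (2018, 1.0, 66.0),
]

def fit_linear(rows):
    """Least-squares fit of pm = a + b * x (a stand-in, NOT a random forest)."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def train_pooled(rows):
    """ST1a-style window: one model trained on every available year."""
    return fit_linear(rows)

def train_per_year(rows):
    """ST1b-style window: a separate model for each calendar year."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row[0]].append(row)
    return {year: fit_linear(group) for year, group in by_year.items()}

def predict_averaged(models, x):
    """ST2i-style prediction: average the outputs of several models."""
    return mean([model(x) for model in models])

pooled = train_pooled(ROWS)
yearly = train_per_year(ROWS)

# Hindcast for a year without measurements, using the 2013 and 2014 models.
historical = predict_averaged([yearly[2013], yearly[2014]], 0.6)

for year in sorted(yearly):
    print(year, "per-year:", round(yearly[year](0.6), 2),
          "pooled:", round(pooled(0.6), 2))
print("averaged 2013/2014 hindcast:", round(historical, 2))
```

In this toy data the pooled model lands between the high-PM and low-PM years, so it falls below the 2013 per-year prediction and above the 2018 one. That mirrors the direction of the bias reported for ST1a, but only because the synthetic values were built that way; it is not evidence for the paper's result.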

Source
http://dx.doi.org/10.1016/j.scitotenv.2022.158624
