Background: Although PM2.5 (fine particulate matter with an aerodynamic diameter less than 2.5 µm) is an air pollutant of great concern in Texas, the limited number of regulatory monitors poses a significant challenge for decision-making and environmental studies.

Objective: This study aimed to predict PM2.5 concentrations at a fine spatial scale on a daily basis by using novel machine learning approaches and incorporating satellite-derived Aerosol Optical Depth (AOD) and a variety of weather and land use variables.

Methods: We compiled a comprehensive dataset for Texas from 2013 to 2017, including ground-level PM2.5 concentrations from regulatory monitors; AOD values at 1-km resolution based on images retrieved from the MODIS satellite; and weather, land-use, and population density variables, among others. We built predictive models for each year separately to estimate PM2.5 concentrations using two machine learning approaches: gradient boosted trees and random forest. We evaluated model prediction performance using in-sample and out-of-sample validation.
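
As a rough illustration of the in-sample versus out-of-sample comparison described above, the following minimal sketch fits a small gradient-boosted ensemble of one-split regression trees (stumps) to synthetic data and reports the coefficient of determination on the training data and on held-out data. Everything in it is an assumption made for illustration: the two predictors (stand-ins for AOD and temperature), the synthetic response, the hyperparameters, and the simple train/held-out split are hypothetical and are not the study's data, software, or validation design. Only gradient boosting is sketched; random forest is not shown.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(X, residuals, n_thresholds=9):
    """Pick the (feature, threshold) split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        lo = min(row[j] for row in X)
        hi = max(row[j] for row in X)
        if hi == lo:
            continue  # a constant feature cannot be split
        for k in range(1, n_thresholds + 1):
            thr = lo + (hi - lo) * k / (n_thresholds + 1)
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            mean_l = sum(left) / len(left)
            mean_r = sum(right) / len(right)
            sse = (sum((r - mean_l) ** 2 for r in left)
                   + sum((r - mean_r) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, mean_l, mean_r)
    return best[1:]  # (feature index, threshold, left mean, right mean)


def predict_stump(stump, row):
    j, thr, mean_l, mean_r = stump
    return mean_l if row[j] <= thr else mean_r


def fit_boosting(X, y, n_rounds=40, learning_rate=0.2):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row)
                for p, row in zip(pred, X)]
    return base, stumps, learning_rate


def predict(model, X):
    base, stumps, lr = model
    return [base + lr * sum(predict_stump(s, row) for s in stumps)
            for row in X]


def make_data(n):
    """Synthetic stand-in data: [aod, temperature] -> PM2.5-like response."""
    X = [[random.uniform(0, 1), random.uniform(0, 35)] for _ in range(n)]
    y = [5 + 15 * aod + 0.1 * temp + random.gauss(0, 2) for aod, temp in X]
    return X, y


random.seed(0)
X_train, y_train = make_data(200)  # used to fit the model (in-sample)
X_test, y_test = make_data(100)    # held out (out-of-sample)

model = fit_boosting(X_train, y_train)
r2_in = r_squared(y_train, predict(model, X_train))
r2_out = r_squared(y_test, predict(model, X_test))
print(f"in-sample R2 = {r2_in:.2f}, out-of-sample R2 = {r2_out:.2f}")
```

The in-sample score is computed on the same observations the model was fitted to, so it tends to be optimistic; the held-out score is the more realistic guide to how predictions would behave at unmonitored locations or days.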

Results: Our predictive models demonstrate excellent in-sample model performance, as indicated by high R² values generated from the gradient boosting models (0.94-0.97) and random forest models (0.81-0.90). However, the out-of-sample R² values fall within a range of 0.52-0.75 for gradient boosting models and 0.44-0.69 for random forest models. Model performance varies slightly across years. A generally decreasing trend in predicted PM2.5 concentrations over time is observed in Eastern Texas.

Impact Statement: We utilized machine learning approaches to predict PM2.5 levels in Texas. Both gradient boosting and random forest models perform well. Gradient boosting models perform slightly better than random forest models. Our models showed excellent in-sample prediction performance (R² > 0.9).

Source: http://dx.doi.org/10.1038/s41370-024-00659-w
