From data to harvest: Leveraging ensemble machine learning for enhanced crop yield predictions across Canada amidst climate change.

Sci Total Environ

Laboratoire de Géosimulation Environnementale (LEDGE), Département de Géographie, Université de Montréal, 1375 Avenue Thérèse-Lavoie-Roux, Montréal, QC H2V 0B3, Canada.

Published: November 2024

Accurate crop yield predictions are crucial for farmers and policymakers. Despite the widespread use of ensemble machine learning (ML) models in computer science, their application in crop yield prediction remains relatively underexplored. This study, conducted in Canada, assesses the potential of five ensemble ML models for predicting crop yields: Adaptive Boosting (AdaBoost), Gradient Boosting Machine (GBM), XGBoost, LightGBM, and Random Forest (RF). These models were chosen for their ability to manage complex datasets and their strong performance potential. The study integrated climate variables, satellite-derived vegetation indices, soil characteristics, and honeybee census data. Data preparation comprised two main steps. First, climate variables were interpolated and averaged over croplands in ArcGIS Pro, vegetation indices and soil characteristics were likewise averaged, and honeybee census data were incorporated. Second, the data were organized in Python into a structured format for model input. Model accuracy was assessed using Root Mean Squared Error (RMSE), R-squared, and Mean Absolute Error (MAE). XGBoost emerged as the most accurate model on the test dataset, with the lowest MAE (68.70 for canola and 39.47 for soybeans), the lowest RMSE (119.48 for canola and 102.39 for soybeans), and the highest R-squared values (0.95 for canola and 0.96 for soybeans). The study also assessed crop yields under various climate change scenarios, finding minimal variation between scenarios but significant negative impacts on canola and soybean yields across Canada. Honeybee colonies were identified as the most influential factor, contributing 52.34% to canola and 57.18% to soybean yields. This research provides detailed crop yield maps of canola and soybeans at the Census Consolidated Subdivision (CCS) level across Canada's agricultural landscape, offering valuable forecasts for localized decision-making. Additionally, it offers a proactive strategy for climate change preparedness, helping farmers and stakeholders optimise resource allocation and manage risks effectively.
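
For readers unfamiliar with the three accuracy measures named in the abstract, the following is a minimal sketch of how RMSE, MAE, and R-squared are conventionally computed from paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` and the sample yield values are hypothetical and chosen only for illustration, and the abstract does not state the units of the reported errors.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Mean Absolute Error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Root Mean Squared Error: penalizes large errors more heavily than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R-squared: share of the variance in the observations explained by the model.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return rmse, mae, r2


# Hypothetical observed and predicted yields (illustrative values, not study data).
observed = [2000.0, 2200.0, 1800.0, 2500.0]
predicted = [2100.0, 2150.0, 1750.0, 2400.0]

rmse, mae, r2 = regression_metrics(observed, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.3f}")
```

Lower RMSE and MAE and higher R-squared indicate a closer fit, which is the sense in which the abstract ranks XGBoost first on the test dataset.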


Source
http://dx.doi.org/10.1016/j.scitotenv.2024.175764

