Geographically Weighted Statistical Machine Learning Methods for Predicting Net Primary Productivity in the Eastern Sahel Region

Publisher

University of the Witwatersrand, Johannesburg

Abstract

Sustainable land management and ecosystem resilience are essential for climate adaptation and resource conservation, particularly in regions susceptible to environmental degradation. This study applies Geographically Weighted Statistical Machine Learning (GWSML) methods to predict Net Primary Productivity (NPP) in the eastern Sahel, a semi-arid region characterised by high climate variability, land degradation, and socio-economic vulnerability. By integrating Geographically Weighted Regression (GWR), Geographically Weighted Random Forests (GWRF), and Geographically Weighted Neural Networks (GWNN), the research addresses spatial heterogeneity and nonlinearity in environmental data, overcoming the limitations of traditional global models. Using data from Niger, Chad, and Sudan spanning 2019-2021, the models use spatially explicit climatic and environmental variables (rainfall, temperature, soil moisture, and elevation) to estimate NPP with high accuracy. Missing values were filled using Ordinary Kriging (OK) before model calibration. Spatial autocorrelation in the residuals was examined using Moran’s I, and the analysis compared spatial regression with geographically weighted machine learning techniques. Model performance was evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R2). The evaluation compared the geographically weighted models with global models, particularly Ordinary Least Squares (OLS), and with traditional machine learning methods, including Random Forests (RF) and Neural Networks (NN). The results demonstrate that machine learning methods enhanced with geographical weighting outperform traditional global approaches by capturing localised variations and nonlinear dependencies. The best results in this study were an R2 of 0.9360, an RMSE of 0.0333, and an MSE of 0.0012, all achieved by the GWNN model.
The GWRF model yielded the best MAE of 0.0191, albeit with a lower R2 of 0.9308 compared to GWNN. GWR also outperformed the global models, with an R2 of 0.9207. Overall, GWR, GWRF, and GWNN outperform global regression models in their ability to capture spatial variability. In addition, GWRF and GWNN significantly improve prediction accuracy by capturing nonlinear relationships and spatial heterogeneity between NPP and its drivers. The findings highlight the importance of spatially adaptive models for predicting ecological productivity and informing climate adaptation strategies. These models can help mitigate land degradation and promote sustainable agriculture in spatially heterogeneous regions. Integrating these methods into ecological modelling promises improved outcomes for socio-economic stability, environmental sustainability, and food security in developing, climate-vulnerable regions such as the eastern Sahel.
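
For readers unfamiliar with the workflow the abstract summarises, the following is a minimal illustrative sketch, not the dissertation's implementation. It fits a basic GWR (one weighted least-squares fit per location), compares it with global OLS using the four metrics named above, and computes Moran's I on the residuals. Everything specific here is an assumption made for illustration: the synthetic data, the Gaussian kernel, the fixed bandwidth of 0.15, and the helper names `gaussian_weights`, `gwr_fit`, `morans_i`, and `scores`.

```python
import numpy as np

def gaussian_weights(coords, focal, bandwidth):
    """Gaussian kernel weights from Euclidean distance to a focal point (assumed kernel)."""
    d = np.linalg.norm(coords - focal, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def gwr_fit(coords, X, y, bandwidth):
    """Basic GWR: solve a weighted least-squares problem at every observation."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])      # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        w = gaussian_weights(coords, coords[i], bandwidth)
        XtW = Xd.T * w                         # X^T W, with W = diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    fitted = np.sum(Xd * betas, axis=1)        # row-wise x_i . beta_i
    return fitted, betas

def morans_i(values, coords, bandwidth):
    """Moran's I using Gaussian distance weights with a zero diagonal."""
    z = values - values.mean()
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = np.exp(-0.5 * (d / bandwidth) ** 2)
    np.fill_diagonal(W, 0.0)
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def scores(y, yhat):
    """MSE, RMSE, MAE and R2 for a vector of predictions."""
    err = y - yhat
    mse = np.mean(err ** 2)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)
    return {"MSE": mse, "RMSE": np.sqrt(mse), "MAE": np.mean(np.abs(err)), "R2": r2}

# Synthetic example (hypothetical data): the slope on x drifts from west to east,
# which a single global coefficient cannot represent.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
y = (1.0 + 2.0 * coords[:, 0]) * x + 0.05 * rng.normal(size=n)

# Global OLS baseline
Xd = np.column_stack([np.ones(n), x])
ols_beta = np.linalg.lstsq(Xd, y, rcond=None)[0]
ols_scores = scores(y, Xd @ ols_beta)

# GWR with an assumed fixed bandwidth
gwr_fitted, betas = gwr_fit(coords, x, y, bandwidth=0.15)
gwr_scores = scores(y, gwr_fitted)
residual_moran = morans_i(y - gwr_fitted, coords, bandwidth=0.15)

print("OLS:", ols_scores)
print("GWR:", gwr_scores)
print("Moran's I of GWR residuals:", residual_moran)
```

The scores here are in-sample and serve only to show the mechanics. In practice the bandwidth is usually selected by cross-validation or an information criterion, and the significance of Moran's I is assessed with a permutation test.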

Description

A dissertation submitted in fulfilment of the requirements for the degree of Master of Science, to the Faculty of Science, School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg, 2025

Citation

Letsela, Kopano Lazarus. (2025). Geographically Weighted Statistical Machine Learning Methods for Predicting Net Primary Productivity in the Eastern Sahel Region. [Master's dissertation, University of the Witwatersrand, Johannesburg]. WIReDSpace. https://hdl.handle.net/10539/47680
