Computers and Electronics in Agriculture 218 (2024) 108730

Available online 13 February 2024
0168-1699/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Optical remote sensing of crop biophysical and biochemical parameters: An 
overview of advances in sensor technologies and machine learning 
algorithms for precision agriculture 

Mahlatse Kganyago a,b,*, Clement Adjorlolo a,c, Paidamwoyo Mhangara a, Lesiba Tsoeleng d 

a School of Geography, Archaeology and Environmental Studies, University of the Witwatersrand, Johannesburg 2050, South Africa 
b Department of Geography, Environmental Management and Energy Studies, University of Johannesburg, Johannesburg, South Africa 
c African Union Development Agency (AUDA-NEPAD), 230 15th Rd, Midrand, 1685, Johannesburg, South Africa 
d Earth Observation, South African National Space Agency, The Enterprise Building, Mark Shuttleworth Street, Pretoria 0001, South Africa   

A R T I C L E  I N F O   

Keywords: 
Remote sensing 
Machine learning 
Precision agriculture 
Leaf area index 
Chlorophyll content 

A B S T R A C T   

This paper provides an overview of the recent developments in remote sensing technology and machine learning 
algorithms for estimating important biophysical and biochemical parameters for precision farming. The objec-
tives are (i) to provide an overview of recent advances in remotely sensed retrieval of biophysical and 
biochemical parameters brought by the developments in sensor technologies and robust machine learning al-
gorithms and (ii) to identify the sources of uncertainty in retrieving biophysical and biochemical parameters and 
implications for precision agriculture. The review revealed that developments in crop biophysical and 
biochemical parameters retrieval techniques were mainly driven by announcements and the availability of new 
sensors. Two ground-breaking events can be identified, i.e., the availability of Sentinel-2 and the SuperDove 
constellation. The two provide high temporal-high spatial resolution data relevant for site-specific management 
and super-spectral configuration, enabling retrieval of crop growth and health parameters. The free availability 
of Sentinel-2 triggered the testing of its spectral configurations and upscaling of retrieval approaches using 
simulated data from field spectrometers and airborne hyperspectral sensors. SuperDoves will likely reduce the 
cost of very high-resolution data while providing unprecedented capabilities for detailed, accurate and frequent 
characterisation of field variability. Studies showed that the red-edge bands and hybrid models coupling Radi-
ative Transfer Model (RTM) and machine learning regression algorithms (MLRA) are promising for operational 
and accurate monitoring of stress-related crop parameters to aid time-sensitive agronomic decisions. However, 
such models were tested in Mediterranean climates and performed poorly in African semi-arid areas and China’s 
temperate continental semi-humid monsoon climates. Therefore, locally-calibrated RTM models incorporating 
crop-type maps and other spatio-temporal constraints may reduce uncertainties when adapted to data-scarce 
regions. Generally, a lack of permanent experimental sites and of systematic calibration data on various crops 
are some of the factors limiting the use of remote sensing technologies for PA in Sub-Saharan Africa. Other complexities 
arise from farm configurations, such as small field sizes and mixed cropping practices. Therefore, future studies 
should develop generic, scalable and transferable models, especially within under-studied areas.   

1. Introduction 

In an era of climate variability and change, dwindling agricultural 
resources brought on by ineffective land management and competition 
from other land uses, and the need to address food insecurity, precision 
agriculture is one of the few viable solutions. It promises the optimisation 
of farm inputs (i.e., fertilisers, seeds, water and chemicals), improved 
efficiency and profitability of the agricultural system by averting 
potential losses over stressed areas, and reduced environmental impact by 
avoiding the excessive application of inputs (Mulla, 2013). Consequently, 
many studies report the benefits of integrating specific precision 
agriculture technologies in farm operations. For example, Bellvert 
et al. (2020) found cost savings of €7 090 and €9 960 over two 
consecutive seasons, i.e., 2016 and 2017, respectively, when using 

* Corresponding author. 
E-mail address: mahlatsek@uj.ac.za (M. Kganyago).  

https://doi.org/10.1016/j.compag.2024.108730 
Received 29 November 2022; Received in revised form 3 January 2024; Accepted 7 February 2024   

optimised precision irrigation based on an integrated vine water 
consumption model and remote sensing data in a commercial vineyard of 
100 ha. However, precision agriculture faces several limitations, such as 
limited information accuracy, high data volumes, information that is too 
complex for farmers to interpret, and initial implementation costs. In previous 
decades, the high data volumes generated by PA technologies (in-situ sensors, 
weather stations, software, aerial and satellite data) were difficult to 
store, access and rapidly analyse to provide timely field variability 
information. Recently, this has been mitigated by adopting big data analytics on 
cloud computing services and platforms, but the interpretation of the 
complex information from PA services and their cost vis-à-vis benefits to 
the farmers may be another critical limitation. New studies on the agricultural 
digitisation footprint are concerned with growing data volumes in 
precision agriculture, indicating exponential increases over time (Kayad 
et al., 2022; Marinello et al., 2019). Meanwhile, the issue of information 
accuracy poses a significant risk to the success of precision agriculture and 
shapes farmers’ perception of its benefits and, thus, of its related 
technologies. 

In-situ sensors such as weather stations and soil moisture sensors 
provide information about the variability of weather and soil parame-
ters. In contrast, crop parameters can be measured with proximal and 
remote sensors. Remotely sensed leaf and canopy parameters related to 
crop growth, health, and yield are particularly appealing. These can be 
acquired non-destructively and repeatedly over vast areas and in great 
detail. Moreover, remote sensing systems are versatile and flexible: 
advanced imaging and non-imaging spectroradiometers can be deployed on 
airborne (or aerial) and space-borne (or satellite) platforms. 
Specifically, the spectroradiometers deployed on unmanned 
spacecraft have evolved rapidly from low spatial-high temporal 
resolution sensors in the 1970s (e.g., Advanced Very High-Resolution 
Radiometer, AVHRR) to medium spatial-low temporal resolution sensors 
in the 1980s (e.g., Landsat Thematic Mapper, TM) to more sophisticated 
sensor constellations providing high spatial-high temporal 
resolution in the 2020s (e.g., Sentinel-2 and Planet SuperDove). 

Critically, the availability of remotely sensed data alone is insuffi-
cient, as they are complex and do not readily provide desired informa-
tion for precision agriculture. Therefore, techniques of varying 
sophistication are required to extract relevant crop health and growth 
information for farm-level management and decisions. These techniques 
have evolved in tandem with the developments in sensor technology and 
access to data from such new sensors. For example, there has been a shift 
from the reliance on the visible and near-infrared (VNIR) vegetation 
indices (VIs) to advanced red-edge indices with the advent of RapidEye 
and Worldview sensors (Qian et al., 2022; Xie et al., 2018) and more 
sophisticated physically-based techniques which use Radiative Transfer 
Models (RTMs). Meanwhile, advances in machine learning regression 
algorithms (MLRAs) have brought prospects for integrating a variety of 
remotely sensed, climatic and environmental variables in the retrieval of 
important biochemical and biophysical parameters of crops such as Leaf 
Chlorophyll a + b Content (LCab), Canopy Chlorophyll Content (CCC) 
and Leaf Area Index (LAI). Coupled with RTMs, studies have shown that 
MLRAs can be used to develop generic and crop-specific tools for 
retrieving crop biochemical and biophysical parameters over various 
crop types, growth stages and climatic conditions (Fernandes et al., 
2014; Shah et al., 2019; Yan et al., 2019). 

Despite the advances mentioned above in retrieving crop biochem-
ical and biophysical parameters from remotely sensed data, several 
technical challenges still need to be addressed. These include field 
measurement errors (such as instrument and sampling errors), poor 
radiometric quality due to residual errors from atmospheric correction, 
and parameterisation of RTMs. Unfortunately, the latest developments 
in sensor technology and MLRAs for biophysical and biochemical 
parameter retrieval in the context of precision agriculture have not been 
comprehensively reviewed in the last decade. To this end, we assimilate 
and review the literature from 2011 to 2023 to showcase recent 
developments in sensor technology and machine learning algorithm 
applications for biophysical and biochemical parameter retrieval in 
agricultural landscapes, as well as the technical challenges and emerging 
trends with the potential to support PA. The objectives of this paper 
were two-fold. First, to provide an overview of recent advances in 
remotely sensed retrieval of crop biochemical and biophysical parame-
ters brought by the development of sensor technologies and robust 
machine learning algorithms. Second, to identify the sources of uncer-
tainty in retrieving biophysical and biochemical parameters and impli-
cations for precision agriculture. The scope of this review is limited to 
retrieval of LAI and Chlorophyll Content over herbaceous croplands and 
grasslands. The literature search was performed using relevant key-
words on the Scopus Abstract and Citation Database (www.scopus.com) 
to select “Published” and “In Press” journal articles. 

The remainder of this manuscript is organised as follows. Section 2 
provides an overview of the theoretical basis for the remote sensing of 
biophysical and biochemical parameters, focusing mainly on leaf and 
canopy parameters related to crop growth and health, i.e., LAI and 
chlorophyll content. Section 3 provides an overview of the developments 
in sensor technology (i.e., multi- and hyper-spectral sensors), as well as 
advances in remote sensing platforms. Section 4 reviews the advances in 
the applications of MLRAs for retrieving leaf and canopy parameters of crops, 
contrasting them with traditional techniques such as inversion of RTMs and VIs. 
Section 5 identifies the sources of uncertainty in the retrieval of crop 
parameters. Sections 6 and 7 discuss the main findings of the literature 
review, summarise the lessons learned and make recommendations. 

2. Remote sensing of essential biophysical and biochemical 
parameters for precision agriculture 

Remote sensing of biophysical and biochemical parameters has been 
topical since the dawn of satellite-based remote sensing. Biophysical 
parameters include Fractional Vegetation Cover (FVC), Fraction of 
absorbed Photosynthetically Active Radiation (FaPAR), LAI, and 
biomass, while biochemical parameters include canopy water 
content, chlorophyll content, Nitrogen (N), Phosphorus (P), Potassium 
(K), Carotenoids (Car), and proteins. The LAI is arguably one of the 
most studied parameters due to its importance in characterising 
vegetation, its role in representing the energy exchange between plant 
canopies and the atmosphere, and its relation to the productivity of plants. 
LAI, expressed in m² m⁻² (i.e., dimensionless), is the ratio of 
photosynthetically active (PV) and non-PV (NPV) leaf area to ground area. 
The PV component is referred to as the green LAI (LAIG), while the NPV 
component is referred to as the brown LAI (LAIB). Of the two, LAIG tends 
to receive significant attention from researchers of various disciplines and 
satellite data providers due to its importance for ecological, climate, and crop 
yield modelling (Zaroug et al., 2013). Consequently, several operational 
LAIG products based on low-resolution sensors exist and have been extensively 
evaluated and validated (Fang et al., 2019, 2012; Sun et al., 2013) and 
utilised for crop monitoring (Campos-Taberner et al., 2018), 
drought assessments (Zhang et al., 2020), and other applications (Rasul 
et al., 2020). However, only a handful of studies (Amin et al., 2021; 
Delegido et al., 2015) have attempted to characterise LAIB. 

In precision agriculture, characterising the spatio-temporal variability 
of LAIG at the field level is critical for monitoring the crop 
physiological development and phenological status of both erect (i.e., 
erectophile) and non-erect (i.e., planophile) canopies during the 
vegetative stages of the crop. Moreover, it can be used to model crop biomass 
and yield, thus providing essential insights into the productivity of the 
fields. Despite receiving limited attention, LAIB is critical for detecting and 
evaluating the impact of heatwaves and droughts, determining the 
extent of conservation agriculture (or soil management) practices and 
modelling fire risks of senescent crops (Amin et al., 2021; Delegido et al., 
2015). Moreover, it is an essential decision-support tool for optimising 
harvest schedules, deciding on planting dates of cover crops, 
planning for transportation and storage, and modelling fire risk (Hank 
et al., 2019). Due to the coverage and cost limitations of destructive 
lab-based methods, LAI measurements can be acquired with optical 
field-based instruments such as DEMON, ceptometers (AccuPAR, Decagon 
Devices, Pullman, WA, USA), smartphones (Pocket-LAI), digital 
hemispherical photography (DHP), PASTIS-57, the multi-band vegetation imager 
(MVI), TRAC (Tracing Radiation and Architecture of Canopies), and the 
LI-COR Plant Canopy Analyzer (LI-COR, Inc., Lincoln, NE, USA). These 
instruments provide an effective Plant Area Index (PAIe) as they cannot 
distinguish leaves from other plant tissue (e.g., branches and stems). 
Moreover, errors due to the non-random distribution of canopy foliage, 
radiation interception by other plant components, and gap fraction 
saturation as the LAI approaches values of 5–6 m² m⁻² (Gower et al., 
1999; Weiss et al., 2004) are carried forward to the retrieval methods 
and the subsequent remotely sensed LAI retrievals. Nevertheless, these 
instruments are scientifically accepted due to their ease of use, portability, 
affordability, and consistency. 
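
The gap fraction saturation mentioned above can be illustrated with a minimal sketch. It assumes the simple Poisson (Beer-Lambert) gap-fraction model, a view zenith angle of 57.5° and a projection function G of 0.5; these values, and the function names, are illustrative assumptions and are not taken from the instruments or studies cited in this review.

```python
import math


def gap_fraction(pai, view_zenith_deg=57.5, g=0.5):
    """Canopy gap fraction predicted by the Poisson model (assumed here).

    pai: effective plant area index (m2 m-2)
    g:   projection function, assumed to be 0.5
    """
    cos_theta = math.cos(math.radians(view_zenith_deg))
    return math.exp(-g * pai / cos_theta)


def effective_pai(p_gap, view_zenith_deg=57.5, g=0.5):
    """Invert the same model: effective PAI from a measured gap fraction."""
    if not 0.0 < p_gap <= 1.0:
        raise ValueError("gap fraction must lie in (0, 1]")
    cos_theta = math.cos(math.radians(view_zenith_deg))
    return -cos_theta * math.log(p_gap) / g


if __name__ == "__main__":
    # The gap fraction shrinks quickly as the canopy becomes denser.
    for pai in (1.0, 3.0, 5.0, 6.0):
        print(f"PAIe = {pai:.0f}  gap fraction = {gap_fraction(pai):.4f}")
```

Under these assumptions, the gap fraction at a PAIe of 5 is already below 1 %, so the difference between a PAIe of 5 and 6 corresponds to a very small change in the measured signal, which is one way to picture why such instruments saturate in dense canopies.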

Leaf chlorophyll a and b content (LCab) is a critical crop biochemical 
parameter for monitoring crop health status, stress and gross primary 
productivity (Gitelson et al., 2014). Leaf chlorophyll content is significantly 
correlated to N, i.e., one of the most limiting nutrients in plants, 
which is critical for crop growth, health, and yield (Gitelson et al., 
2003). Chlorophyll, inert structural elements in cell tissue, and the 
carbon-fixing enzyme, ribulose bisphosphate carboxylase (RuBisCO), are 
all products of N. Direct estimation of N content through lab-based 
methods is destructive, laborious and costly; thus, remotely sensed 
proxies such as LCab are crucial for optimising N fertilisation and rapid 
assessments of crop N status in precision agriculture (Jia et al., 2013; 
Tian et al., 2013; Vincini et al., 2016). In contrast, Canopy Chlorophyll 
Content (CCC), obtained as the product of LAIG and LCab, is closely 
related to LAI (Boegh et al., 2013; Gitelson et al., 2005), for example, 
with R² exceeding 0.85 for maize and barley (Ciganda et al., 2008). 
Generally, there is a lack of consensus in the literature regarding the 
relationship between CCC and N content. For example, Baret et al. 
(2007) and Delloye et al. (2018) showed that CCC is applicable to can-
opy N content assessment. Contrarily, Vincini et al. (2016) argued that 
the factors affecting LAI, such as stand density and water stress, make 
CCC inadequate for distinguishing N deficiency from other crop 
stressors. Meanwhile, the Cab-sensitive vegetation indices (VIs) such as 
the green Chlorophyll Index (CIgreen), red-edge Chlorophyll Index 
(CIred-edge), and MERIS Terrestrial Chlorophyll Index (MTCI) tend to be the 
best estimators of LAI, while the LAI-sensitive ones such as NDI45 and NDVI also 
tend to be the best linear estimators for Cab (Frampton et al., 2013; Viña 
et al., 2011). This non-exclusive sensitivity to various biophysical and 
biochemical parameters is mainly due to the interaction of various plant 
traits, which influences canopy spectral properties, thus making the 
decoupling of such interacting parameters difficult (Delloye et al., 2018; 
Verrelst et al., 2016). Therefore, there is a need for further studies to 
clarify these inconsistencies. 
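
For readers less familiar with the indices named in this section, the following sketch shows their commonly used band-ratio forms, together with CCC as the product of LAI and LCab. The mapping to bands (green, red, a red-edge band near 705 nm, a red-edge band near 740 nm, and NIR), the function names and the reflectance values are assumptions made for illustration only; individual studies may use different band combinations.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def ndi45(red_edge_705, red):
    """Normalised difference of a ~705 nm red-edge band and the red band."""
    return (red_edge_705 - red) / (red_edge_705 + red)


def ci_green(nir, green):
    """Green Chlorophyll Index: NIR / green - 1."""
    return nir / green - 1.0


def ci_red_edge(nir, red_edge_705):
    """Red-edge Chlorophyll Index: NIR / red-edge - 1."""
    return nir / red_edge_705 - 1.0


def mtci(red_edge_740, red_edge_705, red):
    """MERIS Terrestrial Chlorophyll Index using red and red-edge bands."""
    return (red_edge_740 - red_edge_705) / (red_edge_705 - red)


def canopy_chlorophyll(lai, lcab_ug_cm2):
    """CCC in g m-2 from LAI (m2 m-2) and LCab (ug cm-2).

    1 ug cm-2 equals 0.01 g m-2, hence the conversion factor.
    """
    return lai * lcab_ug_cm2 * 0.01


if __name__ == "__main__":
    # Hypothetical surface reflectances of a dense green canopy.
    green, red, re705, re740, nir = 0.08, 0.04, 0.12, 0.35, 0.45
    print("NDVI       ", round(ndvi(nir, red), 3))
    print("NDI45      ", round(ndi45(re705, red), 3))
    print("CIgreen    ", round(ci_green(nir, green), 3))
    print("CIred-edge ", round(ci_red_edge(nir, re705), 3))
    print("MTCI       ", round(mtci(re740, re705, red), 3))
    print("CCC (g m-2)", round(canopy_chlorophyll(3.0, 40.0), 3))
```

Because every index here is built from the same few bands, a change in LAI and a change in chlorophyll content can shift several indices at once, which is consistent with the non-exclusive sensitivity described above.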

3. Developments in remote sensing sensor technologies for 
biophysical and biochemical parameters retrieval 

3.1. Advances in remote sensing platforms 

The developments in sensor technology are occurring rapidly, thus 
enabling the extraction of accurate and reliable actionable information 
promptly. In the 1970s, remote sensing systems consisted of sensors 
mounted on aerial platforms such as fixed-wing aeroplanes. Today, the 
low-Earth orbiting satellites and remotely piloted aircraft systems (i.e., 
RPAS, also commonly referred to as unmanned aerial vehicles, UAVs) 
are increasingly adopted for characterising field conditions. Arguably, 
UAVs are the most powerful remote sensing platforms, with the capability 
to fly closest to the targets (i.e., at low altitudes), thus offering 
very high spatial resolutions (<1 cm), customisable spectral 
coverage and flexible revisit times (López-Granados et al., 2016). 
UAVs have recently become more accessible and affordable and 
provide essential tools to acquire data at field scales and at a relatively 
low cost compared to other platforms. Hence, the availability of UAVs 
has driven the miniaturisation of advanced sensor technology. For 
example, recent studies demonstrate the capability of UAVs for carrying 
a range of miniaturised sensor payloads and accurately characterising 
crop water stress variability and status (Zhou et al., 2018), NPK fertiliser 
deficiency (Corti et al., 2019), weed patch detection (Zisi et al., 2018), 
crop diseases mapping (Abdulridha et al., 2019), above-ground-biomass 
of crops (Zheng et al., 2019), and yield estimation (Wan et al., 2020; 
Zhou et al., 2017). Compared with manned aerial (i.e., aircraft) and 
space (i.e., satellite) platforms, UAVs are substantially cheaper to 
operate and maintain, thus holding more possibilities for precision 
agriculture. Additionally, due to their very low altitude, images 
collected by UAVs do not require atmospheric correction, a processing 
step that limits the adoption of satellite images in precision agriculture 
because it delays the delivery of information needed to support timely farm 
management decisions (Atzberger, 2013). 

Regrettably, the adoption of UAVs in developing regions such as 
sub-Saharan Africa is still limited. This is attributable to a myriad of complex 
factors, including affordability, small field sizes characteristic of 
extensive small-holder farming (i.e., usually < 0.5 ha), such that the 
economic benefits of UAVs cannot be realised, a lack of awareness and of 
technical and institutional capacities, and poor rural connectivity. Besides, 
the legislative restrictions on flying UAVs (such as the requirement for a 
piloting license, air service license and RPAS Operator Certificate in South 
Africa) effectively reserve the technology for a few individuals and 
companies. Moreover, although the cost of UAVs may be low, the cost of 
the remote sensing payload may be prohibitive, and it may be compounded 
by the repeated imaging required for crop monitoring during the season. 
Intrinsically, UAVs are spatially limited due to limitations posed by 
battery capacity; thus, they may only address a few user imaging needs 
at a time. 

Alternatively, satellite platforms, which can remotely and 
repeatedly acquire data over large areas, in one pass, at medium (<30 
m) to very high (<1 m) spatial resolution, are a convenient compromise 
and are still cheaper than manned aerial platforms. Since 
the launch of the first Earth Resources Satellite (Landsat 1 MSS) in the 
1970s, the utilisation of space for collecting Earth observation (EO) data 
has increased rapidly, with numerous low-Earth orbiting satellites 
launched. In the last decade, various low (e.g., Sentinel-3), medium-to- 
high (e.g., Sentinel-2), and very-high-resolution satellites (e.g., 
Worldview-3) were launched into space. At the same time, ground-based 
systems are required to acquire calibration and validation data under 
various field conditions and cover types. Using proximal and hand-held 
sensors, information on crops’ biophysical and biochemical properties, 
such as LAI, chlorophyll content, and spectral data, can be collected non- 
invasively and non-destructively. Today, myriad data collected with 
ground-based, aerial (manned aircraft), low-altitude UAVs, and space- 
based systems (i.e., satellites) are essential for cross-calibration, 
training and validation of biophysical and biochemical retrieval 
models, and product development. 

3.2. Advances in sensors 

Optical remote sensors can be classified according to their 
spatial resolution (i.e., low- to very-high-spatial-resolution sensors), 
temporal resolution (i.e., low- to high-temporal-resolution sensors), 
and spectral or bandwidth configuration (i.e., broadband or multispectral 
sensors, and narrow-band or hyperspectral sensors). 

3.2.1. Multispectral sensors 
Precision agriculture applications require high spatial (<20 m) and 
high temporal resolution (<5 days) remotely sensed data to characterise 
field conditions adequately and accurately throughout the phenological 
stages of the crops (Atzberger, 2013). On their own, the remotely sensed 
data from low- and medium-resolution multispectral sensors are either 
too coarse to provide relevant within-field crop condition information 
or too infrequent (a problem exacerbated by cloud cover) 
to support site-specific farm management decisions meaningfully. This 
inherent compromise between spatial and temporal resolutions is well- 
known and evident in many heritage satellite missions, such as the 
MODerate Resolution Imaging Spectroradiometer (MODIS) and the 
Landsat programme. A summary of the significant milestones in multi-
spectral sensor technology is provided in Fig. 1. 

These spatial and temporal compromises presented new research 
problems and, consequently, led to the development of data fusion algorithms 
that seek to leverage the high temporal resolution of low-resolution sensors 
and the spatial detail of medium- to high-resolution (<20 m) data to 
generate daily or near-daily medium- or high-resolution synthetic 
reflectance data that are similar to the measured reflectance 
observations, with R² > 0.85 (Wu et al., 2012). The data fusion algorithms, 
consisting of the spatial and temporal adaptive reflectance fusion 
models and unmixing-based methods, have also been applied to increase 
the frequency of the VIs and, thus, of biophysical parameters such as 
LAI (Wu et al., 2015). Using the Enhanced Spatial and Temporal 
Adaptive Reflectance Fusion Model (ESTARFM), Zhou et al. (2020) 
fused the reflectance imagery of Sentinel-2 and Sentinel-3 to generate 
high temporal-high spatial resolution reflectance data with R² between 
0.60 and >0.90, depending on the date. Moreover, using the 
Spatial-Temporal Data Fusion Approach (STDFA), Wu et al. (2015) found that 
the daily LAI values generated from daily synthetic vegetation indices 
were better than the well-established MODIS LAI product when validated 
against the daily LAI measurements from a winter wheat field, 
with R² ≈ 0.98 and RMSE ≈ 0.15 m² m⁻². In another study, Houborg 
and McCabe (2018) developed the Cubesat Enabled Spatio-Temporal 
Enhancement Method (CESTEM) to generate surface reflectance consistent 
with Landsat-8 but at the 3 m resolution offered by PlanetScope Dove. 
Therefore, one can argue that the development of data fusion algorithms 
reduces the shortcomings of both the low- and medium-resolution sensors 
while also improving their utility for precision agriculture by 
exploiting the good qualities of each, i.e., higher repeat cycles (up to < 1 
day) and detailed coverage (<20 m). Since they were comprehensively 

Fig. 1. Significant milestones in the Earth observation satellites with multispectral sensor payloads.  

validated, provide long-term time series, and are based on well- 
established methods, products from low-resolution sensors are invalu-
able for the quality assessment of new estimation approaches using 
medium to high-resolution sensors (Kganyago et al., 2020; Zhou et al., 
2020). Moreover, data fusion algorithms are also used for harmonising 
the spatial resolution of sensors offering multiple resolutions, such as 
MultiSpectral Instrument (MSI) (Zhang et al., 2019) or between different 
sensors, i.e., Operational Land Imager (OLI), HJ-1 CCD (Charge Coupled 
Device) or MSI and MODIS or Sentinel-3 Ocean and Land Colour In-
strument (OLCI) (Chen et al., 2020; Kimm et al., 2020; Liao et al., 2019; 
Zhou et al., 2020). 
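
The core idea behind spatio-temporal fusion can be sketched in a few lines. The example below is a deliberately simplified illustration, not ESTARFM, STDFA or CESTEM: it assumes that the coarse images have already been resampled to the fine grid and that the reflectance change seen by the coarse sensor between two dates applies uniformly to the fine pixels. The published algorithms additionally weight spectrally similar neighbouring pixels or unmix coarse pixels. The function name and the reflectance values are hypothetical.

```python
def fuse_temporal_change(fine_t1, coarse_t1, coarse_t2):
    """Predict fine-resolution reflectance at date t2.

    fine_t1:   fine-resolution image at date t1 (list of rows)
    coarse_t1: coarse image at t1, resampled to the fine grid (assumed)
    coarse_t2: coarse image at t2, resampled to the fine grid (assumed)

    Simplified rule: fine_t2 = fine_t1 + (coarse_t2 - coarse_t1),
    clipped to the physical reflectance range [0, 1].
    """
    fused = []
    for f_row, c1_row, c2_row in zip(fine_t1, coarse_t1, coarse_t2):
        row = []
        for f, c1, c2 in zip(f_row, c1_row, c2_row):
            value = f + (c2 - c1)
            row.append(min(1.0, max(0.0, value)))
        fused.append(row)
    return fused


if __name__ == "__main__":
    # Hypothetical NIR reflectances for a 2 x 2 block of fine pixels that
    # all fall inside a single coarse pixel.
    fine_t1 = [[0.30, 0.40], [0.20, 0.10]]
    coarse_t1 = [[0.25, 0.25], [0.25, 0.25]]
    coarse_t2 = [[0.30, 0.30], [0.30, 0.30]]
    for row in fuse_temporal_change(fine_t1, coarse_t1, coarse_t2):
        print([round(v, 2) for v in row])
```

In this sketch the within-field pattern comes entirely from the fine image, while the temporal change comes entirely from the coarse image, which is the division of labour that the fusion algorithms discussed above exploit.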

In recent years, the advances in sensor technology have brought 
about the capacity to obtain high spatial and high temporal resolution 
multispectral data through off-nadir viewing and tasking capabilities 
and satellite constellations. However, most of these very-high-resolution 
(VHR, <5 m) satellite sensors, e.g., IKONOS, Quickbird, GeoEye-1, and 
Pleiades (High-resolution optical imagers, HiRI), provided only VNIR 
multispectral data which, in the context of biophysical and biochemical 
parameter retrieval, are spectrally constrained. Nevertheless, they capture the 
necessary bands, i.e., red and NIR, for vegetation analysis using parametric 
techniques such as vegetation indices. Unfortunately, the 
retrieval of crop biophysical and biochemical parameters using VNIR 
broadbands has been shown to suffer from saturation due to the chlo-
rophyll absorption in the red band (see detailed discussion in Section 4). 
Interestingly, some of the new commercial high-resolution and VHR 
sensor designs, such as SPOT-6 and -7 New AstroSat Optical Modular 
Instrument (NAOMI) and PlanetScope’s Dove constellation, retain the VNIR 
spectral configuration, whilst the Copernicus Sentinel-2 constellation 
prioritised higher spatial resolution (i.e., 10 m) for this spectral region. This 
can be attributed to the popularity and wide adoption of VNIR indices in 
various vegetation analyses, including precision agriculture. Therefore, 
high spatio-temporal characteristics seem to be a priority in providing 
time-critical site-specific information relevant to precision agriculture. 
Also, because of their simplicity, vegetation indices such as NDVI are 
relatable to farmers, and their long-established use makes them a benchmark 
for new indices (He et al., 2017; Li and Wang, 2013; Wang et al., 2018; 
Zhengyang et al., 2011). 

Two recent ground-breaking events in Earth observation satellite 
sensor technology can be identified, which offer unparalleled capabil-
ities and prospects for precision agriculture. First, the availability of 
PlanetScope’s large constellation of nano-satellite sensors in 2017, 
consisting of hundreds of CubeSats with 10 cm by 10 cm by 30 cm di-
mensions and providing imagery at 3 m spatial resolution daily. The 
imagery from Planet CubeSats has been demonstrated to provide better 
estimations of biophysical and biochemical parameters using Radiative 
Transfer Models (RTM) and empirical approaches (Kimm et al., 2020). 
For example, Kimm et al. (2020) found R² > 0.75 and RMSE of ~1 m² m⁻² 
using the Green Wide Dynamic Range Vegetation Index (GrWDRVI) 
and LUT-based RTM inversion in Illinois, USA. They attributed their 
results to the CubeSat’s finer resolution reflectance data, consistent with 
the sampling area of in-situ LAI measurements. Although VHR imagery 
is critical to achieving greater precision in site-specific management 
applications, it poses several challenges due to increased spatial 
variability. 
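
Since retrieval accuracy is reported throughout this review as R² and RMSE, a short sketch of how the two statistics can be computed from paired in-situ and retrieved LAI values may be helpful. It assumes that R² denotes the squared Pearson correlation coefficient; some studies instead report the coefficient of determination of a fitted model, which can differ. The LAI values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the variable (e.g. m2 m-2)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observations and predictions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


if __name__ == "__main__":
    # Hypothetical in-situ LAI and retrievals with a constant bias of 0.5.
    lai_in_situ = [1.0, 2.0, 3.0, 4.0]
    lai_retrieved = [1.5, 2.5, 3.5, 4.5]
    print("R2   =", round(r_squared(lai_in_situ, lai_retrieved), 3))
    print("RMSE =", round(rmse(lai_in_situ, lai_retrieved), 3))
```

The example shows why both statistics are usually reported together: a systematically biased retrieval can be perfectly correlated with the field measurements while still carrying a non-zero RMSE.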

Second, the Sentinel-2 constellation, i.e., Sentinel-2A (in 2015) and
-2B (in 2017), was announced and subsequently launched, carrying
identical Multispectral Imager (MSI) cameras. Sentinel-2 MSI not only
guarantees data continuity and interoperability with previous missions
such as Landsat, SPOT, and MERIS (Wu et al., 2019) but also provides
better spatial resolutions (i.e., up to 10 m), temporal revisits (i.e.,
5 days), and red-edge bands. The red-edge bands, known to enhance the
biophysical and biochemical parameter retrieval accuracy using various
techniques (Dong et al., 2019; Ramoelo et al., 2012), were previously
only provided by programmable commercial missions such as the RapidEye
constellation (i.e., since 2008), Gaofen-6 and the Worldview series
(i.e., since 2009). Although RapidEye and Worldview provided higher
spatial resolutions, i.e., 6.5 m and < 2 m, respectively, their
prohibitive costs have impeded systematic and operational monitoring of
crops using VHR sensors (Houborg et al., 2015). Therefore, extensive
archives are only available for big cities worldwide, while agricultural
areas have limited coverage, and data availability depends on historical
customer orders over specific small areas. Gaofen-6 is limited mainly
because its data are region-specific and inaccessible to regions outside
China. Conversely, Sentinel-2 MSI is provided at no cost to users and
provides three red-edge bands, centred at 705 nm, 740 nm and 783 nm,
thus presenting many research and operational advancement prospects.
From the precision agriculture perspective, the increased accuracy
brought by incorporating red-edge bands is significant for increasing
the reliability of retrieved stress-related parameters of crops to aid
time-sensitive farm management decisions.
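
As a minimal illustration of how a red-edge band enters an index, the
sketch below computes two widely used red-edge formulations, the
normalised difference red-edge index and the red-edge chlorophyll index,
from per-pixel reflectances. The function names and the reflectance
values are hypothetical and chosen only for illustration; which
Sentinel-2 band serves as the red-edge or NIR input is left to the user.

```python
def ndre(nir, red_edge):
    """Normalised difference red-edge index: (NIR - RE) / (NIR + RE)."""
    return (nir - red_edge) / (nir + red_edge)


def ci_red_edge(nir, red_edge):
    """Red-edge chlorophyll index: NIR / RE - 1."""
    return nir / red_edge - 1.0


# Hypothetical canopy reflectances (unitless, 0-1) for two pixels.
pixels = {
    "low_chlorophyll": {"red_edge": 0.20, "nir": 0.40},
    "high_chlorophyll": {"red_edge": 0.10, "nir": 0.45},
}

for name, p in pixels.items():
    print(name,
          round(ndre(p["nir"], p["red_edge"]), 3),
          round(ci_red_edge(p["nir"], p["red_edge"]), 3))
```

Both indices increase as red-edge reflectance drops relative to NIR,
which is the behaviour exploited in chlorophyll-related retrievals.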

Consequently, several studies have exploited Sentinel-2 red-edge 
data for LAI, LCab, and N retrieval over various crops and environments. 
Before its launch, studies mainly demonstrated its potential using 
simulated data from leaf and canopy RTMs and hyperspectral sensors. 
Frampton et al. (2013) used experimental data from SEN3Exp (Barrax,
Spain, 2009) and SicilyS2EVAL (Sicily, Italy, 2010) combined with
synthetic Sentinel-2 MSI data generated from two airborne imaging
spectrometers, i.e., Compact Airborne Spectrographic Imager (CASI)
and AISA Eagle, and the PROSAIL model, and concluded that red-edge
bands improved the correlation of the VIs with LAI, LCab and CCC and
averted saturation effects. Using SPARC field campaign (Barrax, Spain,
2003) and Compact High-Resolution Imaging Spectrometry (CHRIS)
data, Verrelst et al. (2012b) evaluated the performance of the three MSI
spectral configurations, i.e., S2-10m (4 bands), S2-20m (8 bands), and
S2-60m (10 bands), with MLRAs and concluded that LAI and FVC could
be accurately estimated with the S2-10m configuration due to its high
spatial resolution, while the highest accuracies for LCab were achieved
with red-edge bands. Post-launch studies such as Delloye et al. (2018)
mostly confirm the findings of the previous studies, showing the
contribution of red-edge bands to the accurate retrieval of LAI, LCab and
CCC, particularly reduced uncertainties at low and high values and better
characterisation across various growth stages and farming practices.
Meanwhile, Herrmann et al. (2011) found that Sentinel-2 MSI performs
equivalently to a field hyperspectral sensor, i.e., the FieldSpec Pro FR
spectrometer (Analytical Spectral Devices, USA), in estimating LAI.

The significance of the Sentinel-2 and Worldview sensor configurations
is also reflected in the missions launched in the 2020s, such as the third
generation of the PlanetScope constellation, SuperDove, which represents
another revolution in Earth observation satellite sensor technology
for precision agriculture. It offers five to eight spectral bands, which
include VNIR bands with two green bands (513 – 549 nm and 547 – 583
nm), a yellow band (600 – 620 nm) and a red-edge band (697 – 713 nm),
thus providing the capability for detailed, accurate, and frequent
characterisation of field variability. A summary of sensor
characteristics is provided in Table 1.

3.2.2. Hyperspectral sensors 
Hyperspectral data, characterised by hundreds of narrow (<10 nm)
and contiguous spectral bands, are critical for accurately and reliably
extracting crop biophysical and biochemical parameters. In contrast to
multispectral broadband sensors, hyperspectral sensors (i.e., imaging
and non-imaging spectrometers) can detect minute variations in plant
biochemical and biophysical traits that affect the reflected radiance.
Because of this capability, hyperspectral sensors are critical for
identifying significant spectral bands (or regions) where the various
plant traits absorb electromagnetic energy (also called absorption
features), simulating new multispectral sensor data (Delegido et al.,
2011; Estévez et al., 2020; Verrelst et al., 2013b, 2012b), and detecting
fine plant leaf traits and other stress factors (De Castro et al., 2012).
These absorption features are then used to design VIs and as input
variables in machine-learning regression models (Delegido et al., 2014,
2011). Despite their recognised significance for agricultural and natural
resource management applications, only a few space-based hyperspectral
sensors or imaging spectrometers have existed, most of them science
demonstrator missions. These include Hyperion onboard EO-1, CHRIS
onboard PROBA-1, the DLR Earth Sensing Imaging Spectrometer (DESIS),
EnMAP and PRISMA. The capability of the technology has mainly been
demonstrated with airborne hyperspectral sensors such as the Airborne
Visible and Infrared Imaging Spectrometer (AVIRIS), HyMap, and the
Compact Airborne Spectrographic Imager (CASI), and with non-imaging
field spectrometers, which are seemingly cheaper to operate than
satellite systems, which also require sophisticated processing
procedures and extensive storage.

3.2.3. Leaf and canopy radiative transfer modelling 
Studies have also used physically-based leaf and canopy RTMs to
demonstrate the capability of upcoming optical sensors (Féret et al.,
2017), design new spectral indices (Yi et al., 2014), and determine
optimal spectral bands for the retrieval of biophysical and biochemical
parameters (Richter et al., 2012). RTMs simulate the interactions between
solar radiation and plant biophysical and biochemical properties based on
physically sound cause-effect relationships. Given a set of leaf and
canopy vegetation traits (e.g., LAI, Chlorophyll a and b, leaf water
content, dead matter, and leaf angle distribution), acquisition conditions
(e.g., view and illumination angles), soil background and environmental
parameters, RTMs can model the full-range (i.e., 400 nm to 2500 nm)
spectral reflectance of vegetation at leaf or canopy scale. In this
regard, several RTMs are available, varying in complexity, number of
parameters, and physical principles. Examples include the
Invertible Forest Reflectance Model (INFORM) (Atzberger, 2000), Leaf
Incorporating Biochemistry Exhibiting Reflectance and Transmittance
Yields (LIBERTY) (Dawson et al., 1998), FLUSPECT (Vilfan et al.,
2018, 2016), and SCOPE (Soil Canopy Observation of Photosynthesis
and Energy) (Van Der Tol et al., 2009). Blackburn (2007) and Ustin et al.
(2009) previously reviewed some of these RTMs.

Inarguably, PROSPECT + SAIL (commonly called PROSAIL), a
one-dimensional coupled leaf-canopy RTM consisting of the PROSPECT
(Jacquemoud and Baret, 1990) and SAIL (Scattering by Arbitrary Inclined
Leaves) (Verhoef, 1984) models, is one of the most utilised RTMs in
vegetation studies (Jacquemoud et al., 2009, 1995). It has been
popularised by its wide availability, simplicity, few input parameters,
predictive power comparable to that of complex RTMs, and relatively fast
computation time. The role of PROSPECT in the PROSAIL RTM is to
first simulate leaf directional-hemispherical spectral reflectance and
transmittance, which are then passed to SAIL to simulate the
Top-of-Canopy reflectance, considering various structural and optical
properties of the leaves, canopy, and soil background for a given
acquisition and illumination configuration. Indeed, the two
RTMs have evolved independently, predominantly in terms of the
complexity of the input parameters, parameterisation (Verrelst et al.,
2013a; Wang et al., 2018), coupling with other RTMs (Estévez et al.,
2020; Laurent et al., 2014; Verhoef and Bach, 2003), and inversion
techniques. Consequently, different versions of PROSPECT exist, such as
PROSPECT-4, -5, -5B, -PRO, and -Dynamic (-D) (Féret et al., 2017; Feret
et al., 2008), while the prominent SAIL versions are 4SAIL and 4SAIL2
(Verhoef et al., 2007; Verhoef and Bach, 2007). Generally, PROSPECT
requires LCab content (Cab, [μg cm⁻²]), leaf dry matter content (Cm,
[g cm⁻²]), leaf water thickness or water content (Cw, [g cm⁻²]),
carotenoid content (Ccx), leaf size to crop height (Sl), and a leaf
mesophyll structural parameter (N, [unitless]). On the other hand, SAIL
requires LAI [m² m⁻²], average leaf inclination angle (ALIA, [°]), the
fraction of diffuse incoming solar radiation (skyl, usually fixed at 0.1)
and the view and illumination geometry, i.e., sun zenith angle (θs, [°]),
sensor view zenith angle (θv, [°]), and relative azimuth angle
(ϕsv, [°]), as well as a hot spot parameter (Sl, [m/m]). A summary of
input parameters used in literature for specific vegetation or crop types
is provided in Table 2. We refer interested readers to an exhaustive
account of this RTM’s applications in vegetation studies by Jacquemoud
et al. (2009).
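
The way such input parameters are typically turned into a Look-Up Table
can be sketched as follows. This is a minimal sketch under stated
assumptions: the parameter ranges, the fixed geometry, and the names
`PARAM_RANGES`, `FIXED` and `sample_lut_inputs` are hypothetical, and the
call to an actual PROSAIL implementation, which would convert each entry
into a simulated spectrum, is deliberately left out.

```python
import random

# Hypothetical (min, max) ranges; real studies tailor these to the crop.
PARAM_RANGES = {
    "N": (1.0, 3.0),       # leaf structure parameter [unitless]
    "Cab": (20.0, 80.0),   # leaf chlorophyll a+b content [ug cm-2]
    "Cm": (0.002, 0.02),   # leaf dry matter content [g cm-2]
    "Cw": (0.01, 0.06),    # leaf water thickness [g cm-2]
    "LAI": (0.1, 7.0),     # leaf area index [m2 m-2]
    "ALIA": (30.0, 80.0),  # average leaf inclination angle [deg]
    "hot": (0.05, 0.5),    # hot spot parameter
}
# Hypothetical fixed values: diffuse fraction and acquisition geometry.
FIXED = {"skyl": 0.1, "theta_s": 30.0, "theta_v": 0.0, "phi_sv": 0.0}


def sample_lut_inputs(n_entries, ranges=PARAM_RANGES, fixed=FIXED, seed=0):
    """Draw n_entries parameter sets uniformly within the given ranges."""
    rng = random.Random(seed)
    lut = []
    for _ in range(n_entries):
        entry = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        entry.update(fixed)
        lut.append(entry)
    return lut


lut_inputs = sample_lut_inputs(1000)
print(len(lut_inputs), sorted(lut_inputs[0]))
```

Uniform sampling is only one option; other sampling distributions can be
substituted inside `sample_lut_inputs` without changing the structure.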

Recent studies (Estévez et al., 2020; Laurent et al., 2014) demonstrate
the benefit of estimating crop parameters from Top-of-Atmosphere
(TOA) reflectance data by coupling canopy RTMs with atmospheric
RTMs. By inverting these coupled models, the crop parameters
can be estimated directly from the at-sensor spectral signature.
Moreover, the spectral response functions of various operational and
upcoming sensors can be used to resample the spectra simulated by RTMs,
providing insights into the prospects of these sensors. For example,
before the launch of Sentinel-2 and -3, several studies demonstrated
their potential value for biophysical and biochemical parameter mapping
using RTM-simulated data (Delegido et al., 2011; Estévez et al., 2020;
Verrelst et al., 2013b, 2012b). This included exploring different
inversion techniques such as MLRAs, vegetation indices, and LUT-based
approaches (Atzberger and Richter, 2012; Clevers and Gitelson, 2013;
Richter et al., 2009; Verrelst et al., 2015), which informed post-launch
operational techniques.
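
The resampling of simulated spectra to a sensor's bands mentioned above
can be sketched as a weighted average of a 1 nm spectrum. This sketch
assumes Gaussian spectral response functions; the band widths (FWHM), the
toy red-edge spectrum, and the function names are hypothetical, and
operational work would use the response functions published for the
sensor instead.

```python
import math


def gaussian_srf(wavelengths, centre, fwhm):
    """Gaussian spectral response sampled on the wavelength grid."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((w - centre) / sigma) ** 2) for w in wavelengths]


def resample_to_band(wavelengths, reflectance, centre, fwhm):
    """Response-weighted mean reflectance for one sensor band."""
    weights = gaussian_srf(wavelengths, centre, fwhm)
    return sum(w * r for w, r in zip(weights, reflectance)) / sum(weights)


wavelengths = list(range(400, 2501))  # 1 nm grid, 400-2500 nm
# Toy spectrum: a smooth rise around 720 nm standing in for RTM output.
spectrum = [0.05 + 0.45 / (1.0 + math.exp(-(w - 720) / 15.0)) for w in wavelengths]

# Band centres follow the text; the FWHM values are assumptions.
bands = {"RE1": (705, 15), "RE2": (740, 15), "RE3": (783, 20)}
simulated = {name: resample_to_band(wavelengths, spectrum, c, f)
             for name, (c, f) in bands.items()}
print({name: round(value, 3) for name, value in simulated.items()})
```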

Table 1
Characteristics of multispectral sensors relevant for biophysical and biochemical parameter retrieval, classified by spatial resolution. Sensors with pixel sizes of ~250 m – 1 km, 20 m to ~250 m, <20 m to ≥5 m, and ≤5 m are classified as low, medium, high and very-high spatial resolution, respectively.

Class | Satellite | Sensor | Spectral coverage | Spatial resolution | Revisit period | Availability
Low | SPOT-4 & -5 | SPOT-VGT | VNIR | 1 km | <1 day | 1998 – 2014
Low | Terra / Aqua | MODIS | VNIR/SWIR | 250 m – 1 km | <1 day | 2001 – to date
Low | Envisat | MERIS | VNIR | 300 m | <1 day | 2002 – 2012
Low | PROBA | PROBA-V | VNIR | 300 m | <1 day | 2013 – 2020
Low | Sentinel-3 | OLCI | VNIR | 300 m | <1 day | 2016 – Present
Medium | Landsat-5 | TM | VNIR/SWIR | 30 m | 16 days | 1984 – 2012
Medium | SPOT 1–4 | HRV | VNIR | 20 m | 26 days | 1986 – 2013
Medium | Landsat-7 | ETM+ | VNIR/SWIR | 30 m | 16 days | 1999 – 2022
Medium | Landsat-8 & -9 | OLI | VNIR/SWIR | 30 m | 16 days | 2014 – Present
Medium | Sentinel-2 | MSI | RE/SWIR | 20 m | 5 days | 2016 – Present
High | RapidEye | REIS | VNIR/RE | 6.5 m | 1 day | 2008 – 2020
High | SPOT 5 | HRG | VNIR/SWIR | 10 m | 26 days | 2002 – 2015
High | SPOT 6/7 | NAOMI | VNIR | 6 m | ~3 days | 2013 – Present
High | Sentinel-2 | MSI | VNIR | 10 m | 5 days | 2016 – Present
Very-High | IKONOS | – | VNIR | 3.2 m | <3 days | 1999 – 2015
Very-High | GeoEye | – | VNIR | 1.84 m | <3 days | 2008 – Present
Very-High | QuickBird | BGIS 2000 | VNIR | 2.62 m | <3 days | 2001 – 2015
Very-High | Pleiades | HiRI | VNIR | 2 m | 1 day | 2011 – Present
Very-High | Worldview-2 | WV-110 | VNIR/RE/Y | 1.84 m | 1 day | 2009 – 2022
Very-High | Worldview-3 | – | VNIR/RE/Y/SWIR | 1.24 m – 3.7 m | 1 day | 2014 – Present
Very-High | PlanetScope Dove | Dove classic and Dove-R | VNIR | 3 m | 1 day | 2017 – Present
Very-High | PlanetScope Dove | SuperDove | VNIR/RE/Y | 3 m | 1 day | 2021 – Present

4. Machine learning regression algorithms for biophysical and 
biochemical variables retrieval 

Machine learning regression algorithms (MLRAs, also called
nonlinear, nonparametric regression algorithms) are robust to the
non-linear functional dependence between the crop biophysical and
biochemical parameters and the spectral reflectance data, to small
training samples, and to noise, and, unlike linear algorithms such as
Ordinary Least Squares Regression (OLS) and Partial Least Squares
Regression (PLSR), do not assume normality. MLRAs establish relationships
with each predictor variable using all the available variations without
making assumptions (Verrelst et al., 2012a). MLRAs are
typically classified into three categories, i.e., tree-based (or tree
ensembles), kernel-based, and deep learning (or Neural Networks), based
on their architectural designs (Rivera-Caicedo et al., 2017). MLRAs
have been widely exploited for crop biophysical and biochemical parameter
retrieval in recent studies (see Table 3) as they are more robust than
traditional approaches such as vegetation indices and RTM inversion
using numerical optimisation procedures. For example, the NDVI has
several limitations, such as saturation with increasing biomass and LAI
and sensitivity to soil background and atmospheric contamination. Despite
much progress in addressing these problems, such as incorporating
background adjustment and atmospheric noise compensation using the
blue band (450 – 520 nm) by Liu and Huete (1995), a soil-adjustment
factor by Huete (1988), and red-edge bands by Gitelson and Merzlyak
(1994), recent studies such as Verrelst et al. (2015) show that vegetation
indices are sensitive to specific ranges of crop parameters below or
beyond which they perform poorly, regardless of their mathematical
formulation and spectral bands. Feng et al. (2019) found that vegetation
indices were sensitive to various cultivars, irrigation levels, plant
densities, N rates and years.
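
To make the index formulations discussed above concrete, the sketch below
implements the NDVI and the soil-adjusted vegetation index (SAVI) in its
commonly used form, with a soil-adjustment factor L (0.5 is a frequently
used default). The reflectance values are hypothetical and serve only to
show how the adjustment factor changes the index value.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with adjustment factor L."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical reflectances for a partially vegetated pixel.
nir, red = 0.40, 0.10
print(round(ndvi(nir, red), 3), round(savi(nir, red), 3))
```

Setting `soil_factor` to zero reduces SAVI to the NDVI, which makes the
role of the adjustment term explicit.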

Tree-based MLRAs are less complicated, more intuitive, and more popular
than kernel-based and deep learning algorithms. This family of algorithms
originates from Classification and Regression Trees (CART), which have
rarely been used in recent years due to the advent of more robust
variants such as Random Forest (RF) and Gradient Boosted Regression
Trees (GBRT), also known as Gradient Boosting Machines (GBM)
(Friedman, 2001). More recently, Chen and Guestrin (2016) proposed a more
advanced, scalable and sparsity-aware improvement on GBM, namely
eXtreme Gradient Boosting (XGBoost). The XGBoost algorithm aims to
improve the implementation of GBM by (1) handling missing values
more efficiently, (2) constructing trees quickly and building large
models using parallel computing, and (3) using a more regularised
model formalisation, thus reducing over-fitting.
Therefore, XGBoost often outperforms other algorithms (Beltran et al.,
2019).
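
The boosting principle underlying GBM can be sketched in a few lines.
This is plain gradient boosting for squared loss with one-split
regression stumps on a single predictor, not XGBoost: it has no
regularisation, sparsity handling, or parallelism. The data (a
hypothetical vegetation index `x` and a saturating LAI-like response `y`)
and all function names are assumptions made for illustration.

```python
import math


def fit_stump(x, residuals):
    """Least-squares regression stump (one split) on a 1-D predictor."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(x, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the residuals (negative gradient)."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if xi <= threshold else right_value)
            for xi, p in zip(x, predictions)
        ]
    return base, learning_rate, stumps


def predict_gbm(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(l if xi <= t else r for t, l, r in stumps)


# Hypothetical data: index values and a saturating LAI-like response.
x = [0.05 * i for i in range(1, 19)]
y = [-2.0 * math.log(1.0 - xi) for xi in x]
model = fit_gbm(x, y)
fitted = [predict_gbm(model, xi) for xi in x]
```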

However, RF is still prominently used in literature due to its good
performance and relatively few required hyperparameters, with studies
showing its robustness and high accuracy in predicting various crop
biophysical and biochemical parameters. For example, Li et al. (2017a)
reported an R² of 0.88 and an RMSE of 0.195 m² m⁻² in retrieving
grassland LAI from Landsat ETM+ and OLI data. Another study (Tavakoli and
Gebbers, 2019) found R² values of 0.56 to 0.69 for the N content of wheat
in Germany using images acquired with a digital camera. Overall,
tree-based algorithms are appealing due to their interpretability and
comprehensibility. For example, these algorithms allow interrogation of
the tree structure and variables used at each split and provide the
importance (or relative influence) values for each explanatory variable.
Therefore, they can be used to obtain novel insights, such as new
functional relationships between the response and explanatory
variables. Others (Shelestov et al., 2017) have used such variable
importance measures for feature selection. The capability to interrogate
the model structure is critical for troubleshooting the models to obtain
robust crop parameter retrievals from satellite images (Azodi et al.,
2020).
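
One way to obtain variable importance values of the kind described above
is permutation importance: the increase in error when one predictor is
shuffled. The sketch below is model-agnostic and simplified; RF
implementations usually report impurity-based or out-of-bag variants
instead. The stand-in model `toy_model`, the data, and the function names
are hypothetical.

```python
import random


def rmse(y_true, y_pred):
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def permutation_importance(predict, rows, y, n_repeats=10, seed=0):
    """Mean RMSE increase after shuffling each predictor column."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(rmse(y, [predict(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Stand-in for a fitted model that only uses the first predictor."""
    return 4.0 * row[0]


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(30)]
y = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, y)
print([round(v, 3) for v in importances])
```

The second predictor receives zero importance because the stand-in model
ignores it, which is the pattern one would use to discard uninformative
bands.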

Table 2
PROSAIL input parameters commonly used in literature to generate Look-Up-Tables (LUTs). Columns N to Ver. refer to PROSPECT and columns LAI to φsv refer to 4SAIL.

Source | N | Cab | Cm | Cw | Ccx | Sl | Ver. | LAI | ALIA | Hot | θs | θv | φsv | Vegetation / Crop type
(Frampton et al., 2013) | 1.5 | 5–70 | 0.009 |  |  | – | 5 | 0–8 | 35 | 0.01 | 30 | 10 |  |
(Yi et al., 2014) | 1.2–2.6 | 60 | 0.001–0.018 | 0.01–0.06 | 15 | – | 5 | 0.2–8 | – | 0.5 | 35 | 0 |  | Cotton
(Nigam et al., 2014) | 1–3 | 30–70 | 0.008–0.025 | 0.01–0.06 |  | 0.1–0.5 | – | 1–7 | – | 0.5 | −20–80 | 0–55 | ±120 | Wheat
(Sehgal et al., 2016) | 1.0 | 20–80 | 0.0046 | 0.01–0.04 | 1.0 | – | 5B | 0.1–6 | 70, 57, 45 | 0.78, 0.40, 0.32 | 51, 45, 33 | 0 | 0 | Wheat
(Kattenborn et al., 2017) | 1.9 | 10–60 | Cw/3.2–4 | 0.01–0.03 | 3–15 | – | 5 | 0.2–6 |  | 0.05 | 35.5 | 6.5 | 98.6 |
(Fenghua et al., 2017) | 1–4 | 20–80 | 0.002–2.0 | 0.0005–0.04 | 4–17 | – | – | 1–5 | 20–50 | 0.01–1 | ±50 | ±50 |  | Rice
(Atzberger and Richter, 2012) | 2.0 | 20–70 | 0.004–0.007 | 0.6–1.4 | – | – | – | 0.001–8.0 | 20–70 | 0.01–1.0 | 21 | 8.4 | 138 | Various
(Si et al., 2012) | 1.5–1.9 | 15–55 | 0.0025–0.005 | 0.01–0.02 | – | – | – | 0.1–4 | 20–70 | 0.05–0.1 | 0 | 0 | 30 | Grassland
(Duan et al., 2014) | 1–2 | 20–70 | 0.004–0.007 | 0.005–0.03 | – | – | – | 0.001–6 | 30–70 | 0.05–1 |  |  |  | Maize, potato, and sunflower
(Jay et al., 2017) | 1–2 | 20–65 | 0.002–0.015 | 0.03–0.09 | 5–20 | – | – | 0.1–3.5 | 10–90 | 0.33 |  |  |  | Sugarbeet
(Liang et al., 2015) | 1 | 10–90 | 0.002–0.02 | 0.003–0.05 | – | – | 5B | 0.1–10 | 30–80 | 0.05–0.1 | 30 | 13.31 | 138.08 | Various
(Li et al., 2017) | 1.2–1.8 | 35–75 | 0.003–0.011 | 0.8–0.9 | – | – | – | 0–7 | 45–65 | 0.1–0.5 | 30–80 | 0–55 | ±120 | Wheat
(Verrelst et al., 2016) | 1.2–2.6 | 0–80 | 0.001–0.05 | 0.001–0.05 | – | – | 4 | 0–7 | 30–60 | – | 30 | – | – | Maize and soybean rotation

Notes for Table 2. N – Leaf structure parameter; Cab – Leaf chlorophyll concentration; Cm – Dry matter content or leaf mass per area; Cw – Equivalent water content; LAI – Leaf Area Index; ALIA – Average Leaf Inclination Angle; hot – Hot spot size parameter. The skyl parameter, i.e., the fraction of diffuse incoming solar radiation, is often fixed at 0.1 for all wavelengths (Richter et al., 2012).

Table 3
Commonly used MLRAs for biophysical and biochemical retrieval with associated accuracy measures, i.e., Coefficient of determination, R2, and Root Mean Squared Error, RMSE, where possible. RF, XGBoost, SVM, GPR, KRR, and ANN denote Random Forest, eXtreme Gradient Boosting, Support Vector Machine, Gaussian Process Regression, Kernel Ridge Regression, and Artificial Neural Networks, respectively.

Category | Author(s) | MLRA | Accuracy (R2; RMSE) | Target parameter(s) | Crop type(s) | Sensor | Explanatory variables | Experimental site(s)
Tree-based | (Ramoelo et al., 2015) | RF | R2: 0.71 – 0.89; RMSE: 0.04 – 0.08 | N | Grassland | Worldview-2 | Spectral bands and Vegetation indices | Northern South Africa
Tree-based | (LI et al., 2017a) | RF | R2: 0.88; RMSE: 0.195 m2 m−2 | LAI | Grassland | Landsat7 ETM+; Landsat8 OLI | Spectral bands and Vegetation indices | Hulunber, Inner Mongolia, China (49°20′24″N, 119°59′44″E)
Tree-based | (Tavakoli and Gebbers, 2019) | RF | R2: 0.56 – 0.69; RMSE: 0.27 – 0.19 % ** | N | Wheat | Digital camera | Spectral bands and Vegetation indices | Marquardt experimental station, Potsdam, Germany (52°27′N, 12°57′E)
Kernel-based | (Malenovský et al., 2017) | SVM | R2: 0.52 – 0.54; RMSE: 242.6 – 238.3 nmol g−1 dry weight | LCab | Antarctic mosses | Hyperspectral UAS; WorldView-2 | Spectral bands | Antarctic Specially Protected Area 135 (66.282°S, 110.539°E) and Robinson Ridge (66.368°S, 110.586°E)
Kernel-based | (Verrelst et al., 2012b) | GPR | R2: 0.94 – 0.96; RMSE: 0.47 – 0.55 m2 m−2 | LAI | Multi-crops# | Sentinel-2 MSI*; Sentinel-3 OLCI* | Spectral bands | Barrax, La Mancha region, Spain (30°3′ N, 2°6′ W)
Kernel-based | (Verrelst et al., 2012b) | GPR | R2: 0.94 – 0.99; RMSE: 1.81 – 5.36 μg cm−2 ¥ | LCab | Multi-crops# | Sentinel-2*; Sentinel-3* | Spectral bands | Barrax, La Mancha region, Spain (30°3′ N, 2°6′ W)
Kernel-based | (Campos-Taberner et al., 2016) | GPR | R2: 0.88 – 0.89; RMSE: 0.78 m2 m−2 φ | LAI | Rice | Landsat-8 OLIϕ; SPOT-5ϕ | Spectral bands | Valencia, Spain; and Lomellina, rice district, Lombardy, Italy
Kernel-based | (Campos-Taberner et al., 2016) | KRR | R2: 0.82 – 0.83; RMSE: 0.94 – 0.97 m2 m−2 φ | LAI | Rice | Landsat-8 OLIϕ; SPOT-5ϕ | Spectral bands | Valencia, Spain; and Lomellina, rice district, Lombardy, Italy
Kernel-based | (Elarab et al., 2015) | RVM | RMSE: 5.31 μg cm−2 | LCab | Oats | VNIR & Thermal UAS | Spectral bands and Vegetation indices | Utah, USA (39°14′ N, 112°6′ W)
Kernel-based | (Verrelst et al., 2016) | GPR | R2: 0.79; RMSE: 72.36 mg m−2 | LCab | Multi-crops# | Field spectrometer† | Spectral bands | Barrax, La Mancha region, Spain (30°3′ N, 2°6′ W)
Kernel-based | (Verrelst et al., 2016) | GPR | R2: 0.94; RMSE: 0.40 m2 m−2 | LAI | Multi-crops# | Field spectrometer† | Spectral bands | Barrax, La Mancha region, Spain (30°3′ N, 2°6′ W)
Kernel-based | (Verrelst et al., 2016) | GPR | R2: 0.95; RMSE: 0.37 m2 m−2 | LAI | Multi-crops# | HyMap† | Spectral bands | Barrax, La Mancha region, Spain (30°3′ N, 2°6′ W)
Kernel-based | (Wen et al., 2018) | GPR | R2: 0.85; RMSE: 0.95 μg ml−1 | N | Rice | Hyperspectral UAS | Spectral bands | Shenyang Agricultural University, Liaoning Province, China (41°81′63″N, 123°55′85″E)
Deep learning | (Verger et al., 2011) | ANN | R2: -; RMSE: 0.37 m2 m−2 | LAI | Multi-crops# | CHRIS/PROBA | Spectral bands | Barrax, Castilla-La Mancha region, Spain (30°3′ N, 2°6′ W)
Deep learning | (Campos-Taberner et al., 2016) | ANN | R2: 0.83 – 0.84; RMSE: 0.91 – 0.93 m2 m−2 φ | LAI | Rice | Landsat-8 OLIϕ; SPOT-5ϕ | Spectral bands | Valencia, Spain; and Lomellina, rice district, Lombardy, Italy
Deep learning | (Delloye et al., 2018) | ANN | R2: 0.55 – 0.85; RMSE: 1.00 – 0.70 m2 m−2 ‡ | LAI | Wheat | Sentinel-2 MSI; SPOT-5 | Spectral bands | Belgium
Deep learning | (Delloye et al., 2018) | ANN | R2: 0.08 – 0.31; RMSE: 13.94 – 11.03 μg cm−2 ‡ | LCab | Wheat | Sentinel-2 MSI; SPOT-5 | Spectral bands | Belgium
Deep learning | (Delloye et al., 2018) | ANN | R2: 0.46 – 0.62; RMSE: 0.51 – 0.35 g m−2 ‡ | CCC | Wheat | Sentinel-2 MSI; SPOT-5 | Spectral bands | Belgium
Deep learning | (Dhakar et al., 2019) | ANN | R2: 0.56 – 0.75; RMSE: 1.34 – 0.94 m2 m−2 ₤ | LAI | Wheat | Sentinel-2 MSI | Spectral bands | Pataudi block of Gurugram district, Haryana, India

Notes for Table 3.
¥ Accuracy ranges are for various configurations tested in the study, i.e., S2-10 m (4 bands with 10 m spatial resolution), S2-20 m (8 bands, i.e., B2 to B8a, at 20 m resolution), S2-60 m (10 bands, i.e., B1 to B9, at 60 m resolution), and S3-300 m (19 bands, i.e., BO1 to BO20, at 300 m resolution).
* Simulated from Compact High-Resolution Imaging Spectrometry (CHRIS) data with a spectral range of 400 to 1050 nm.
# Sunflower, Maize, Alfalfa, Wheat, Sugar beet, Onion, Garlic, Potato, and Vineyard.
ϕ Simulated from PROSAIL Radiative Transfer Model.
φ Accuracies are for simulated Landsat OLI and SPOT-5 data, respectively.
† Original data were reduced to fewer bands using Gaussian Process Regression-based band analysis tool (GPR-BAT).
‡ Accuracies were achieved with simulated Sentinel-2 data divided into different band subsets, i.e., SPOT-5 (4), 10-bands (3), Red-edge (7) and All bands (9).
₤ Accuracies were achieved with inversion of PROSAIL LUT with ANN using Sentinel-2 MSI corrected with MODTRAN and libRadtran atmospheric correction approaches.
** Accuracies are for estimated N over various phenologies, i.e., 219 and 234 days after sowing (DAS) and combined dates.

On the other hand, kernel-based and deep-learning MLRAs are
complex, computationally expensive, and opaque (or ‘black box’). One
also has to tune several hyperparameters, and there is no capability to
directly compute variable importance measures or interrogate the models’
inner workings. Nonetheless, these MLRAs
have attracted significant attention in recent studies using simulated and
actual data from various hyper- and multi-spectral sensors (see Table 3).
Support Vector Machine (SVM) is perhaps the most commonly used
kernel-based algorithm in remote sensing applications and supports
several kernels, with popular ones being the Gaussian Radial Basis
Function (RBF), sigmoid, polynomial and linear kernels (Mountrakis et al.,
2011; Yuan et al., 2017). However, Gaussian Process Regression (GPR) and
its variants, such as Variational Heteroscedastic Gaussian Processes, are
popular in crop biophysical and biochemical parameter retrieval
studies, offering better prospects due to their high accuracy and their
special capability to generate estimates of the response variable’s
uncertainty, which enable evaluation of the reliability of the retrieved
crop parameters for use in operational applications (Verrelst et al.,
2013b). Like tree-based algorithms, GPR provides insights into the
relevant bands, i.e., those most strongly related to the crop biophysical
and biochemical parameters. GPR also supports multiple kernels, the most
common being the anisotropic squared exponential and scaled Gaussian
kernels. The algorithm requires few hyperparameters, i.e.,
θ = {v, σb, σn}, where (v, σb) describes the signal, composed of the
scaling factor v and the length-scale σb of each explanatory variable b,
and σn is the standard deviation of the estimated noise.
Verrelst et al. (2012b) compared the
capabilities of deep learning, i.e., ANN, and three kernel-based MLRAs,
i.e., SVM, Kernel Ridge Regression (KRR) and GPR, in the retrieval of LAI,
LCab, and Fractional Vegetation Cover (FVC) using Sentinel-2
and Sentinel-3 data simulated from CHRIS/PROBA data. Their study found
that GPR was relatively fast (i.e., processing time < 2 s) and produced
the most accurate results for all the biophysical and biochemical
parameters considered, i.e., R² of 0.89 – 0.99, with LCab being the most
accurate and within the Global Monitoring for Environment and Security
(GMES) accuracy threshold of 10 %. Others (Wen et al., 2018) reported an
R² of 0.85 and an RMSE of 0.95 μg ml⁻¹ in estimating N in rice using
hyperspectral data collected from an Unmanned Aerial System (UAS).
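
The role of the hyperparameters θ = {v, σb, σn} and the origin of the
uncertainty estimate can be sketched with a small GPR predictor. This is
a minimal sketch assuming a zero-mean prior on centred targets and an
anisotropic squared exponential kernel; the hyperparameters are fixed by
hand rather than optimised, and the two-band training data and all
function names are hypothetical.

```python
import math


def kernel(a, b, v, length_scales):
    """Anisotropic squared exponential: v * exp(-0.5 * sum(((a-b)/s)^2))."""
    sq = sum(((ai - bi) / s) ** 2 for ai, bi, s in zip(a, b, length_scales))
    return v * math.exp(-0.5 * sq)


def cholesky(matrix):
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_cholesky(lower, rhs):
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = rhs
        z[i] = (rhs[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = z
        x[i] = (z[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def gpr_predict(train_x, train_y, test_point, v, length_scales, sigma_n):
    """Posterior mean and variance at test_point (zero-mean prior)."""
    n = len(train_x)
    gram = [[kernel(train_x[i], train_x[j], v, length_scales)
             + (sigma_n ** 2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    lower = cholesky(gram)
    alpha = solve_cholesky(lower, train_y)
    k_star = [kernel(x, test_point, v, length_scales) for x in train_x]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    weights = solve_cholesky(lower, k_star)
    variance = v - sum(k * w for k, w in zip(k_star, weights))
    return mean, max(variance, 0.0)


# Hypothetical (red, NIR) reflectances and LAI values.
train_x = [[0.10, 0.30], [0.08, 0.38], [0.06, 0.45], [0.04, 0.50]]
lai = [0.5, 1.5, 2.5, 4.0]
lai_mean = sum(lai) / len(lai)
centred = [value - lai_mean for value in lai]

theta = {"v": 2.0, "length_scales": [0.05, 0.1], "sigma_n": 0.1}
near = gpr_predict(train_x, centred, train_x[1], **theta)
far = gpr_predict(train_x, centred, [0.5, 0.9], **theta)
print(round(near[0] + lai_mean, 2), round(near[1], 3), round(far[1], 3))
```

The predictive variance is small near the training spectra and returns
to the prior variance v far from them, which is the property that allows
unreliable retrievals to be flagged.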

In contrast, the Artificial Neural Network (ANN) is the most commonly
used deep learning algorithm. As its name suggests, the algorithm is
inspired by the neurological sciences. It consists of layers of
artificial neurons interconnected by weights or links, and has
various hyperparameters, such as the number of hidden layers,
the number of nodes per layer, the learning rate, the shape of the
nonlinearity, and regularisation parameters. The ANN weights are often
optimised using a learning algorithm, such as the computationally heavy
Levenberg-Marquardt algorithm or a quicker optimisation algorithm such as
gradient back-propagation. Recent developments such as Recurrent Neural
Networks, Convolutional Neural Networks, and Long Short-Term Memory
networks present new prospects for improving crop parameter retrieval
accuracy. For example, Albughdadi et al. (2021) found that a 2-D
convolutional network, i.e., UNet, performed better than the SNAP
Biophysical processor and the multilayer perceptron regressor and was
computationally efficient. Such architectures have been reported to be
less prone to error than other machine learning techniques and to perform
well in various environments for various applications, e.g., yield
prediction (Barbosa et al., 2020), LAI (Apolo-Apolo et al., 2020),
chlorophyll content (Xiaoyan et al., 2020), leaf water content (Nasir
et al., 2019), crop type mapping and plant disease detection (Golhani
et al., 2018).
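
The layered structure and gradient back-propagation described above can
be sketched with a single-hidden-layer network trained by full-batch
gradient descent. This is a deliberately shallow illustration, not a deep
architecture; the network size, learning rate, data, and function names
are assumptions.

```python
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mse(net, xs, ys):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(net, xs, ys, epochs=500, learning_rate=0.05):
    n, n_in = len(xs), len(xs[0])
    for _ in range(epochs):
        g_w1 = [[0.0] * n_in for _ in net["w1"]]
        g_b1 = [0.0] * len(net["b1"])
        g_w2 = [0.0] * len(net["w2"])
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            hidden, output = forward(net, x)
            d_out = 2.0 * (output - y) / n  # d(MSE)/d(output)
            g_b2 += d_out
            for j, h in enumerate(hidden):
                g_w2[j] += d_out * h
                d_hidden = d_out * net["w2"][j] * (1.0 - h * h)  # tanh'
                g_b1[j] += d_hidden
                for i, xi in enumerate(x):
                    g_w1[j][i] += d_hidden * xi
        net["b2"] -= learning_rate * g_b2
        for j in range(len(net["w2"])):
            net["w2"][j] -= learning_rate * g_w2[j]
            net["b1"][j] -= learning_rate * g_b1[j]
            for i in range(n_in):
                net["w1"][j][i] -= learning_rate * g_w1[j][i]
    return net


# Hypothetical data: one predictor and a nonlinear response.
xs = [[0.05 * i] for i in range(1, 19)]
ys = [3.0 * x[0] ** 2 for x in xs]
net = init_network(n_inputs=1, n_hidden=4)
loss_before = mse(net, xs, ys)
train(net, xs, ys)
loss_after = mse(net, xs, ys)
print(round(loss_before, 4), round(loss_after, 4))
```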

Besides regression problems, MLRAs can also be applied to feature
selection or dimensionality reduction, which is critical for reducing
the dimensionality (e.g., where the number of samples n is smaller than
the number of predictors p, n < p) and collinearity of predictors.
Although some MLRAs, such as SVMs, have been dubbed robust to high
dimensionality, recent studies increasingly show that dimensionality
reduction techniques help reduce uncertainties in crop parameter
retrieval. For example, Verger et al. (2011) found that feature selection
benefited the ANN performance, with additional bands introducing noise.
The high dimensionality characteristic of hyperspectral sensors and
RTM-generated LUTs (often with 1 nm spaced spectral bands) causes
collinearity and can thus result in over-fitting and a biased outcome
(Wen et al., 2018). In such cases, although the retrieval model performs
well in one area, the retrieval accuracy plummets when it is transferred
to other environmental scenarios. For instance, the optimal number of
spectral bands for LAI retrieval has been reported as 7 out of the 62
bands from CHRIS/PROBA, while Shelestov et al. (2017) found that the same
number of features was significant for LAI, FaPAR, and FCover retrieval
using Landsat and SPOT-5 images collected as part of the SPOT-5 Take Five
initiative. Others (Delegido et al., 2011; Verrelst et al., 2016) have
shown that four to nine bands, selected through variable selection
techniques, are sufficient to achieve robust retrievals of biophysical
and biochemical parameters. The variable selection approaches are
generally divided into filter-based (e.g., analysis of variance),
wrapper-based (e.g., Recursive Feature Elimination-Support Vector
Machine), and embedded (e.g., sparse Partial Least Squares) algorithms.
In addition to feature selection, multivariate dimensionality reduction
techniques such as Principal Component Analysis (PCA) and Partial Least
Squares (PLS) have been used effectively. Rivera-Caicedo et al. (2017)
showed that such techniques improved the LAI retrievals when coupled with
NN, KRR, and GPR, achieving a cross-validated R² (R²cv) of 93 % without
using all HyMap bands. Considering that the multispectral data from new
generation sensors such as Sentinel-2 have an increased number of bands,
i.e., more than 7, it remains questionable whether feature selection with
these datasets can improve the accuracy and reduce the uncertainties of
retrieving crop biophysical and biochemical parameters. Generally, the
optimisation of multispectral features for specific crop parameters has
received limited attention.
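
A filter-based selection of the kind listed above can be sketched as
follows: bands are ranked by their absolute correlation with the target,
and a band is kept only if it is not nearly collinear with a band already
kept. The collinearity threshold, band names, and values are
hypothetical, and this simple greedy rule is only one of many possible
filters.

```python
def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def select_bands(bands, target, max_collinearity=0.95):
    """Greedy filter: rank by |r| with target, skip collinear bands."""
    ranked = sorted(bands, key=lambda name: abs(pearson(bands[name], target)),
                    reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(bands[name], bands[kept])) < max_collinearity
               for kept in selected):
            selected.append(name)
    return selected


# Hypothetical LAI values and band reflectances for six plots.
lai = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b705 = [0.30, 0.27, 0.25, 0.21, 0.19, 0.16]
bands = {
    "b705": b705,
    "b706": [value + 0.001 for value in b705],  # near-duplicate of b705
    "b783": [0.40, 0.46, 0.41, 0.47, 0.43, 0.48],
}
selected = select_bands(bands, lai)
print(selected)
```

Only one of the two near-duplicate bands survives, which illustrates how
such filters reduce collinearity before a regression model is trained.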

Various MLRAs have different limitations, as identified in Table 4.
These relate to their ability to handle different datasets, computational
complexity and efficiency, transferability (i.e., portability to new areas
and periods), and transparency and explainability, i.e., whether the
algorithm’s inner mechanism can be interrogated to understand its
functioning in different scenarios.

5. Sources of uncertainty in the retrieval of crop biophysical and 
biochemical parameters and implications for precision 
agriculture 

Uncertainties in remotely sensed biophysical and biochemical pa-
rameters are a significant concern for agronomic applications. These 
uncertainties emanate from several factors, such as in-situ measurement 
errors (including instrument and sampling errors), acquisition condi-
tions (including atmospheric conditions and sensor and sun geometries), 
and parameterisation of RTMs and retrieval techniques such as machine 
learning algorithms. In-situ measurements are a critical part of any 
empirical analysis, including crop parameter retrieval using RTMs, VIs 
and MLRAs; therefore, they must be reasonably accurate and collected 
using appropriate sampling strategies. Otherwise, the quality of these 
measurements may adversely impact the accuracy and reliability of the 
retrieved crop parameters. The existence of various field instruments 
with different estimation methods is one of the factors causing 
measurements to be incomparable and irreproducible, particularly 
when such instruments are not cross-calibrated to reduce systematic 
errors. For example, in-situ measurements from the Minolta 
SPAD-502 chlorophyll meter (commonly used for LCab measurements) 
and the MC-100 Chlorophyll Concentration Meter differ. While the MC-100 
measures the absolute chlorophyll values, the SPAD-502 chlorophyll 
meter measures an index related to LCab. The SPAD-502 therefore requires 
site-specific data, i.e., lab-based chlorophyll content, for calibration over 
many crop types and leaf structures. Such a requirement undermines the 
advantages of using remotely sensed satellite data, as the instrument 
provides unstable and inconsistent in-situ measurements. Uddling et al. 
(2007) showed that SPAD values saturate at LCab > 40 µg cm⁻² and 
differ according to growth stages, species, and distribution. Therefore, 
determining empirical calibration equations for every situation may be 
difficult, labour-intensive, and expensive. Other studies (Elarab et al., 
2015; Kganyago et al., 2022) utilised the published calibration equa-
tions to determine in-situ LCab values, which may introduce un-
certainties due to variations in crop types, growth stages, and prevailing 
environmental conditions in various study areas. As a result, Kganyago 
et al. (2022) suggest instrument cross-calibration and adjustment of 
systematic errors in the field measurements taken by various in-
struments. Moreover, uncertainties linked to optical instruments such as 
LAI-2000 (i.e., commonly used for LAI measurements) are inevitable 
and well-known. For example, Rautiainen et al. (2012) reported that 
LAI-2000 has 10 % – 20 % uncertainties varying by species and envi-
ronment. For precision agriculture applications, uncertainties related to 
in-situ instruments will likely be transferred to the estimated crop bio-
physical and biochemical parameters. It is, therefore, critical that they 
are reported adequately as part of the metadata to ensure that the user of 
the information generated from such data knows its limitations. 
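
As an illustration of the site-specific calibration discussed above, the following minimal sketch fits an empirical equation that converts SPAD readings to chlorophyll content and reports its RMSE so that it can accompany the data as metadata. All values are synthetic, and the second-order polynomial form is an assumption made for illustration only; published calibration equations use various functional forms.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical paired samples: SPAD readings (unitless index) and
# lab-extracted chlorophyll content (µg cm^-2). Values are synthetic.
spad = rng.uniform(20.0, 60.0, size=40)
assumed_curve = 0.01 * spad**2 + 0.3 * spad + 2.0            # assumed "true" relation
lab_lcab = assumed_curve + rng.normal(0.0, 2.0, size=40)      # lab measurement noise

# Fit a site-specific second-order polynomial calibration (assumed form).
coeffs = np.polyfit(spad, lab_lcab, deg=2)
calibrate = np.poly1d(coeffs)

# Calibration uncertainty, to be reported alongside the in-situ data.
residuals = lab_lcab - calibrate(spad)
rmse = float(np.sqrt(np.mean(residuals**2)))
print(f"Calibration RMSE: {rmse:.2f} µg cm^-2")

# Convert new SPAD readings with the fitted equation.
new_readings = np.array([25.0, 40.0, 55.0])
converted = calibrate(new_readings)
print(converted)
```

The RMSE printed here is the quantity that would propagate into any retrieval model trained on the converted values, which is why reporting it with the field data matters.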

Satellite and aerial images are prone to contamination by varying 
atmospheric constituents between the acquisition dates and residual 
errors after atmospheric correction. Atmospheric contamination and the 
inability of atmospheric correction (AC) tools to completely remove it 
also cause increased uncertainties in the crop parameter 
estimates. Changing atmospheric conditions during image 
acquisition have been a challenge since the dawn of satellite-based remote 
sensing. Aerosols (e.g., smoke and dust) and atmospheric gases such as CO2 
increase the atmospheric backscattering signal while attenuating the 
surface directional reflectance signal (Hilker et al., 2009), undermining 
the radiometric integrity and consistency of the measured reflectance. 
Consistent radiometric measurements are critical for phenological 
analysis and reliable estimates of crop parameters. Wilson et al. (2014) 
showed that aerosol optical thickness (AOT) may cause errors of about 
1.7 % in the measured reflectance and a 5 % change in the Normalised 
Difference Vegetation Index (NDVI). Ohde (2013) determined the sen-
sitivities of chlorophyll-a concentration estimation methods to atmo-
spheric dust and clouds. Their results showed that while dusty skies 
caused an overestimation of > 8 % in chlorophyll-a concentration, at-
mospheric conditions consisting of mixtures of clouds and dust resulted 
in overestimations of 7 % – 14 %. Alarmingly, as an integral input to 
precision agriculture, such errors manifest in remotely sensed crop 
parameters and may lead to false detection of stress and other crop 
conditions. Consequently, it is critical to utilise well-validated AC tools (de 
Keukelaere et al., 2018; Doxani et al., 2018; Sola et al., 2018) and 
standardised analysis-ready data processed according to the minimum 
requirements of the CEOS (Committee on Earth Observation Satellites) Analysis 
Ready Data for Land (CARD4L). A limitation here could be the delayed 
implementation of relevant standards by satellite data providers, which 
may limit the radiometric consistency between sensors and, thus, the 
realisation of the full information potential of satellite data for precision 
agriculture applications (Giuliani et al., 2017). Moreover, the delay 
introduced by AC renders subsequent information extraction steps 
(such as biophysical and biochemical parameter retrieval) obsolete, as the 
resulting information arrives later than required to support timely farm 
management decisions (Atzberger, 2013). In such cases, retrieval models 
that utilise the Top-of-Atmosphere (TOA) data directly (Estévez et al., 
2022, 2020) are convenient. Besides the signal attenuation and AC 
residual errors, satellite radiometric measurements are subject to non-negligible 
sensor and sun geometry effects, which explain > 30 % of the reflectance 
variability (Kganyago et al., 2023). 
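
The way a small additive reflectance error propagates into a vegetation index can be shown with a short worked example. The reflectance values and the size of the bias below are hypothetical and are not taken from Wilson et al. (2014); the sketch only demonstrates the arithmetic for a dense green canopy, where the red reflectance is low and therefore sensitive to residual path radiance.

```python
def ndvi(red, nir):
    """Normalised Difference Vegetation Index from red and NIR reflectance."""
    return (nir - red) / (nir + red)


# Hypothetical surface reflectance of a dense green canopy.
red_clear, nir_clear = 0.05, 0.40

# Assumed residual path-radiance bias left in the red band after AC.
path_radiance_bias = 0.01

ndvi_clear = ndvi(red_clear, nir_clear)
ndvi_hazy = ndvi(red_clear + path_radiance_bias, nir_clear)
relative_change_pct = 100.0 * (ndvi_hazy - ndvi_clear) / ndvi_clear

print(f"NDVI without bias: {ndvi_clear:.3f}")
print(f"NDVI with bias:    {ndvi_hazy:.3f}")
print(f"Relative change:   {relative_change_pct:.1f} %")
```

Under these assumed values, an absolute error of 0.01 in red reflectance lowers NDVI by roughly 5 %, which is large enough to be mistaken for a change in crop condition.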

Another important consideration is the limitations of simulations 
generated by Radiative Transfer Models (RTMs). Despite successfully 
implemented operational workflows using MLRAs to invert crop 
parameters (Baret and Weiss, 2018; Weiss and Baret, 2016), the inversion of RTMs is 
ill-posed, i.e., different combinations of canopy parameters can 
simulate similar canopy reflectance due to mutually compensating effects 
(Houborg et al., 2015). Moreover, they cannot represent complex can-
opies of various crops due to their simplified modelling assumptions 
(Darvishzadeh et al., 2008; Dorigo et al., 2012). Hence, contrary to the 
widely accepted wisdom that coupled RTM-MLRA models are universal, 
recent studies found poor performances in environments other than 
where they were trained (Fernandes et al., 2014; Kganyago et al., 2020; 
Xie et al., 2019). In validating the SNAP Biophysical processor, based on 
PROSAIL-generated LUTs and Neural Networks, Kganyago et al. (2020) 
found LAI uncertainties of > 2 m² m⁻² using Sentinel-2 data and 
concluded that the method was not suitable for precision agriculture. In 
another study, Xie et al. (2019) found a similar result in China over 
winter wheat, where the SNAP Biophysical processor obtained LAI 
uncertainties of 1.53 m² m⁻² (R² > 0.5), whilst the uncertainty for 
CCC was 148.58 μg cm⁻². To deal with ill-posedness, several regularisation 
strategies are used in the literature, including incorporating measured 
leaf and canopy parameter value ranges (Xie et al., 2019), imposing 
spatial constraints on the RTM model (Atzberger and Richter, 2012), 
incorporating land cover information (Verrelst et al., 2012d) and using 
multiple best solutions instead of one in LUT-based inversions (Verrelst 
et al., 2013a). Duan et al. (2014) found low uncertainties of 0.62 m² m⁻² 
in estimating LAI using the PROSAIL model and UAV hyperspectral 
data in Inner Mongolia, China, by approximating LAI and Average Leaf 
Angle (ALA) parameter ranges from the local in-situ measurements. 
Therefore, locally calibrated and regularised RTMs combined with other 
MLRAs may reduce uncertainties and improve the accuracy of bio-
physical and biochemical parameters, thus making them fit-for-purpose 
for precision agriculture. 
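
Two of the regularisation strategies listed above, i.e., restricting parameter ranges to locally plausible values and averaging multiple best solutions in a LUT-based inversion, can be sketched as follows. The forward model here is a hypothetical two-band toy function and not PROSAIL; the parameter ranges, the noise level and the number of retained solutions are likewise assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)


def toy_forward_model(lai, cab):
    """Hypothetical two-band canopy model (NOT PROSAIL): red and NIR reflectance."""
    red = 0.03 + 0.25 * np.exp(-0.6 * lai) * np.exp(-0.02 * cab)
    nir = 0.50 - 0.35 * np.exp(-0.5 * lai)
    return np.stack([red, nir], axis=-1)


# Build the look-up table (LUT). The parameter ranges are assumed to have been
# narrowed using local in-situ measurements, which is one way to regularise.
n_entries = 5000
lut_lai = rng.uniform(0.5, 6.0, n_entries)
lut_cab = rng.uniform(20.0, 70.0, n_entries)
lut_refl = toy_forward_model(lut_lai, lut_cab)


def invert(observed, n_best=50):
    """Mean and spread of LAI over the n_best LUT entries with the lowest RMSE."""
    rmse = np.sqrt(np.mean((lut_refl - observed) ** 2, axis=1))
    best = np.argsort(rmse)[:n_best]
    return float(lut_lai[best].mean()), float(lut_lai[best].std())


# Simulate one noisy observation for known parameters and invert it.
true_lai, true_cab = 3.0, 45.0
observed = toy_forward_model(true_lai, true_cab) + rng.normal(0.0, 0.005, size=2)
lai_hat, lai_spread = invert(observed)
print(f"Retrieved LAI: {lai_hat:.2f} (spread {lai_spread:.2f}), true LAI: {true_lai}")
```

Averaging over several near-optimal entries instead of taking the single best match dampens the effect of parameter combinations that compensate for one another, and the spread among those entries gives a rough indication of how ambiguous the inversion is for a given pixel.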

Table 4 
Pros and cons of different Machine Learning Regression Algorithms (MLRAs), reported in the literature. 

Tree-based 
  Pros: 
    • Intuitive 
    • RF requires few parameters for parameterisation 
    • XGBoost and RF are computationally inexpensive 
    • RF does not overfit⁴ 
    • RF is computationally efficient⁴ 
  Cons: 
    • RF requires a large training set for better performance¹ 
    • RF suffers from high dimensionality and collinearity 

Kernel-based 
  Pros: 
    • GPR provides uncertainty/confidence interval estimates³ 
    • GPR has good interpolation capabilities 
    • KRR and SVM can handle small training data¹ 
    • GPR and KRR are computationally efficient³ 
    • GPR is transparent, provides relevant samples and bands to model accuracy³ 
  Cons: 
    • KRR and SVM are computationally expensive when training data is large¹ 
    • SVM is computationally expensive³ 

Deep Learning 
  Pros: 
    • ANN is transferable 
    • ANN is computationally efficient in application mode²,³ 
  Cons: 
    • Computationally expensive during training⁵ 
    • Opaque “Blackbox” models³ 
    • ANN is unpredictable when presented with unseen spectra³ 
    • ANN is complex 
    • ANN is sensitive to dimensionality² 
    • ANN requires sufficiently large training sets¹,² 

Note: ¹ Carter and Liang (2019), ² Verger et al. (2011), ³ Verrelst et al. (2012c), ⁴ Li et al. (2015), ⁵ Rivera-Caicedo et al. (2017). 

Other uncertainties may be introduced by several confounding 
hyperparameters required in calibrating various MLRAs. The sensitivity 
of MLRAs to these hyperparameters in the context of crop biophysical 
and biochemical parameters retrieval has not yet been thoroughly 
studied. However, studies applying MLRAs often employ hyper-
parameter tuning strategies which evaluate all combinations of the 
required hyperparameters or those selected randomly. Then, hyper-
parameters that result in the lowest uncertainties (e.g., RMSE) are 
chosen to retrieve biophysical and biochemical parameters. It should be 
noted that these hyperparameters are site-, crop condition- (i.e., 
phenology), and species-specific and a slight deviation in site factors (i. 
e., crop, climatic and environmental conditions) may result in higher 
uncertainties in the retrieved parameters. This is especially important to 
consider when MLRA models are transferred to new areas (or sites) 
where the crop types and conditions, climatic (i.e., temperatures, rela-
tive humidity, and rainfall), and environmental (i.e., soil types, soil 
moisture, and topography) factors may be different from the original site 
where the model was trained. Therefore, a sound transferability 
framework, which considers the variability of crop types, physiological 
conditions, climate, and environmental factors, is needed to improve the 
transferability of MLRAs and reduce the need for extensive field data 
collection. 
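
The two tuning strategies mentioned above, i.e., evaluating all hyperparameter combinations or a random subset of them and keeping the combination with the lowest RMSE, can be sketched with scikit-learn as follows. The dataset is synthetic and the candidate values are arbitrary assumptions; in practice, the grid would be chosen per algorithm and the search repeated whenever the model is moved to a new site or crop.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

rng = np.random.default_rng(7)

# Synthetic stand-in for a calibration dataset: 150 samples, 10 "bands".
X = rng.uniform(0.0, 1.0, size=(150, 10))
y = 4.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(0.0, 0.2, size=150)

# Assumed candidate values for three Random Forest hyperparameters.
param_grid = {
    "n_estimators": [50, 100],
    "max_features": [0.3, 1.0],
    "min_samples_leaf": [1, 5],
}

# Exhaustive search: every combination is scored by cross-validated RMSE.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=3,
)
grid.fit(X, y)

# Random search: only n_iter randomly drawn combinations are scored.
random_search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_grid,
    n_iter=4,
    scoring="neg_root_mean_squared_error",
    cv=3,
    random_state=0,
)
random_search.fit(X, y)

grid_rmse = -grid.best_score_
random_rmse = -random_search.best_score_
print("Grid search:  ", grid.best_params_, f"RMSE = {grid_rmse:.3f}")
print("Random search:", random_search.best_params_, f"RMSE = {random_rmse:.3f}")
```

The selected values are optimal only for the data used in the search, which is the practical reason why hyperparameters tuned at one site may yield higher uncertainties at another.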

Finally, although the availability of sub-meter spatial resolution UAV 
imagery offers many prospects for precision agriculture, it may intro-
duce some uncertainty to crop parameter estimation using various 
retrieval techniques, including MLRA. This uncertainty emanates from 
high spatial variability caused by detecting inter-row spaces and back-
ground features such as soils, weeds, litter, and shadows. These might 
confuse the retrieval algorithms, especially if their spectral signatures 
were not considered during model calibration, causing false alarms. 
Therefore, masking such background features is an important 
pre-processing step. Moreover, object-based or parcel-based approaches 
may be more appropriate than pixel-based approaches to deal with the 
‘too much’ detail (i.e., <10 cm) characteristic of UAV data (Yang et al., 
2022). Acquiring data at ultra-high spatial resolution also presents 
image registration problems, which may cause resource wastage. For 
example, image misregistrations will result in mislocated prescription 
maps (Gómez-Candón et al., 2014), causing the Variable Rate Application 
(VRA) systems to spray fertilisers, herbicides, and pesticides in 
areas where they are not needed. This is because the positional accuracy 
available from most GPS (Global Positioning System) receivers is coarser than the 
pixel size provided by ultra-high-resolution systems. To deal with this 
problem, Hunt et al. (2018) suggested that each image should be ana-
lysed as a separate plot for monitoring. Besides, these images also pre-
sent opportunities for discrimination of crop species in inter-cropping 
cultural systems, hence the capability to estimate crop parameters per 
species in such systems by accounting for different canopy 
configurations and leaf spectral properties. Overall, uncertainties from various 
sources imply that the retrieval of crop biophysical parameters may lack 
the accuracy required for effective site-specific crop monitoring 
and management. Interested readers can find other important but 
broader technical considerations in the literature (Ali and Imran, 2021; 
Malenovský et al., 2009). 
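
The background masking and parcel-based aggregation described above can be sketched as follows. The reflectance values form a hypothetical 2 × 2 tile, and the NDVI threshold used to separate vegetation from soil and shadow is an assumption that would need to be set per crop, sensor and growth stage.

```python
import numpy as np

# Hypothetical 2 x 2 tile of UAV reflectance; left column is canopy,
# right column is inter-row soil. Values are illustrative only.
red = np.array([[0.05, 0.20], [0.04, 0.25]])
nir = np.array([[0.45, 0.25], [0.50, 0.28]])

ndvi = (nir - red) / (nir + red)

# Assumed threshold separating vegetation from background features.
NDVI_THRESHOLD = 0.4
vegetation_mask = ndvi > NDVI_THRESHOLD

# Parcel-based summary: average over vegetation pixels only.
plot_mean_ndvi = float(ndvi[vegetation_mask].mean())
vegetation_fraction = float(vegetation_mask.mean())

print(f"Vegetation fraction: {vegetation_fraction:.2f}")
print(f"Plot-mean NDVI over vegetation pixels: {plot_mean_ndvi:.3f}")
print(f"Plot-mean NDVI over all pixels:        {ndvi.mean():.3f}")
```

Comparing the two plot means shows how strongly unmasked inter-row pixels can pull a parcel-level value away from the canopy signal.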

6. Lessons and recommendations for future studies 

Several lessons could be derived from the literature review, as well as 
the opportunities for future studies. Concerning the relevant crop 
growth and health parameters, this paper established that most studies 
rarely focused on the estimation of LAIB, but rather on LAIG. However, 
the current retrieval methods for LAIG cannot be directly used to retrieve 
LAIB. Hence, there is an opportunity for future studies to explore 
techniques for its retrieval and operationalisation. LAIB is an essential 
decision-support tool for assessing the impact of heat waves and, at the end of the 
season, for optimising harvest schedules, covering crop planting dates, 
and planning transportation and storage logistics. Moreover, there is a 
lack of consensus in the literature about the relationship between CCC 
and N content. Some authors, such as Baret et al. (2007), showed that 
CCC could be used to estimate canopy N content, whereas others, such as Vincini et al. 
(2016), argue that it is inadequate for distinguishing N deficiency from 
other crop stressors. To resolve these inconsistencies, further studies 
are needed in different environments. 

Inarguably, UAVs are the most powerful remote sensing systems, 
offering many advantages such as ultra-high spatial resolutions (<1 cm), 
customisable spectral coverage, and flexible revisit times (López-Granados 
et al., 2016). However, their limited adoption in developing 
regions such as sub-Saharan Africa is related to a myriad of complex 
factors, such as smaller field sizes (i.e., usually < 0.5 ha), such that the 
economic benefits of UAVs cannot be realised, lack of awareness and of 
technical and institutional capacities, and poor rural connectivity. 
Therefore, UAV data can be used with satellite data to enhance operational 
models for crop parameter retrieval at many resolutions, making such 
models portable to data-scarce regions. In the past, the compromise 
between the spatial and temporal resolutions in remote sensing sensor 
designs reduced the utility of the datasets for precision agriculture. 
Recent literature demonstrates the prospects of data fusion algorithms 
for reducing the shortcomings of both the low- and medium/high- 
resolution sensors and improving their utility for precision agriculture 
by exploiting good qualities of each, i.e., higher repeat cycles (up to < 1 
day) and detailed coverage (<20 m). Moreover, this paper found that 
new high- and very-high-resolution sensors (e.g., PlanetScope’s Dove 
constellation) maintain the broadband VNIR sensor configurations 
despite the well-established saturation of crop parameters retrieved 
within this region of the electromagnetic spectrum. This is attributed to 
their lower cost and the fact that VNIR indices are still popular in various 
vegetation analyses and easily understood by users. Moreover, high 
spatial and temporal resolutions seem to be more attractive for 
operational site-specific management than the spectral bands they 
provide. 

Two ground-breaking events that revolutionised Earth observation 
satellite sensor technology for precision agriculture were identified. 
The first is the free availability of the Sentinel-2 constellation, i.e., Sentinel-2A 
(in 2016) and -2B (in 2017), carrying identical MSI cameras. 
The Sentinel-2 constellation guarantees data continuity of, and interoperability 
with, the previous missions and provides higher spatial resolutions, 
5-day revisits, and red-edge bands previously offered by 
commercial missions only. The new red-edge bands, centred at 705 nm, 
740 nm, and 783 nm, triggered new research and presented prospects 
for operational and accurate monitoring of stress-related agricultural 
crop parameters to aid time-sensitive agricultural decisions and 
improve yields. The second is the emergence of CubeSat constellations, such as 
PlanetScope’s SuperDove, which will likely reduce the VHR data cost, thus increasing 
accessibility to researchers, product developers, governments, and 
farmers. In addition to standard VNIR bands, SuperDove provides two green 
bands (513 – 549 nm and 547 – 583 nm), a yellow band (600 – 620 nm) and a 
red-edge band (697 – 713 nm) daily at a 3 m spatial resolution. This 
provides the capability and opportunities for detailed, accurate, and 
frequent characterisation of field variability, thus improving operational 
outcomes. 

Machine Learning Regression Algorithms (MLRAs) have evolved 
significantly. However, this review found that RF is still prominently 
used in literature, with studies citing good performance and fewer 
required hyperparameters as the main attractive aspects. In general, 
tree-based algorithms are also appealing because they are interpretable 
and explainable, i.e., they allow interrogation of the tree structure and 
variables used and indicate influential variables. In contrast, 
kernel-based and deep learning algorithms are considered complex, 
computationally expensive, and opaque (or ‘black box’). Their limitations include the 
requirement to tune many confounding hyperparameters and the lack of a 
capability to directly compute variable importance and interrogate 
the models’ inner workings. Moreover, the effect of each 
hyperparameter on the accuracy of these MLRAs has not been studied yet. 
Nonetheless, these MLRAs have attracted significant attention in recent 
studies using simulated and various data from hyper- and multi-spectral 
sensors and higher accuracies were reported. The most considerable 
interest was in GPR and its variants, where most studies 
report relatively high accuracies and a unique capability to provide 
response variable uncertainty estimates that enable assessment of the 
reliability of the crop parameter retrievals for agronomic applications. 
Like tree-based algorithms, GPR provides insights into the relevant 
bands, which indicate the relationship with the response variables. It has 
been found to perform better than Artificial Neural Networks (ANNs), 
which are somewhat computationally expensive and complex. Further 
studies are needed to stabilise MLRA performance across many crop 
types and conditions, as well as climatic and environmental conditions, by 
incorporating coupled RTMs parameterised with crop- and site-specific 
in-situ data. In the future, studies should conduct a bibliometric analysis to 
identify which retrieval techniques and hyperparameter values are 
frequently used. 
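
The per-prediction uncertainty that makes GPR attractive for agronomic applications can be sketched with scikit-learn as follows. The single predictor, the sinusoidal target and the kernel composition are assumptions chosen for a compact illustration; they do not reproduce any GPR variant evaluated in the reviewed studies.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(3)

# Synthetic training data: one hypothetical spectral predictor in [0, 5]
# and a noisy "crop parameter" response.
x_train = rng.uniform(0.0, 5.0, size=(40, 1))
y_train = 3.0 + 2.0 * np.sin(x_train[:, 0]) + rng.normal(0.0, 0.1, size=40)

# Assumed kernel: scaled RBF for the signal plus a white-noise term.
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(x_train, y_train)

# Predict inside (2.5) and far outside (12.0) the range seen in training.
x_new = np.array([[2.5], [12.0]])
mean, std = gpr.predict(x_new, return_std=True)

for x_value, m, s in zip(x_new[:, 0], mean, std):
    print(f"x = {x_value:5.1f}: prediction = {m:.2f} ± {s:.2f}")
```

The predictive standard deviation grows for inputs unlike the training data, so it can be used to flag retrievals that should not be trusted, for example when a model is applied to spectra from an unfamiliar crop or site.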

Additionally, the review found that coupling MLRAs with multivar-
iate dimensionality reduction enhances the accuracy of crop parameter 
retrievals even when using powerful algorithms such as ANN, KRR, and 
GPR (Yang et al., 2012). Considering that the data from quasi- 
hyperspectral sensors such as Sentinel-2, Worldview-3 and SuperDove 
have many bands, it is not yet known whether feature selection with 
these datasets can improve the accuracy and reduce uncertainties of 
retrieving crop growth and health parameters. Generally, the 
optimisation of multispectral features for specific crop parameters has received 
limited study. Therefore, future studies should address this gap, especially 
considering the numerous variables, such as spectral bands, vegetation 
indices, and textural features, that are essential for accurately predicting 
crop parameters. 

7. Conclusions 

This paper sought to provide a comprehensive review of recent ad-
vances in remotely sensed retrieval of biochemical and biophysical pa-
rameters of crops brought by the developments in sensor technologies 
and novel machine learning retrieval techniques. Moreover, sources of 
uncertainty in retrieving crop parameters were identified, and practical 
implications for precision farming were discussed. Overall, the review 
revealed that developments in MLRA crop parameter retrieval 
techniques were mainly driven by announcements and the availability of 
new sensors, with the availability of the Sentinel-2 and SuperDove 
constellations being ground-breaking events. Many studies were 
conducted with simulated data and airborne hyperspectral sensors at 
specific study areas with time series of field data covering many crops. 
Unfortunately, such permanent experimental sites are missing in 
sub-Saharan Africa, and coordinated and systematic efforts 
targeting calibration and validation data collection are needed. By comparison, 
well-coordinated campaigns in other regions are exemplary, such 
as in Europe, where several campaigns exist, such as the SPectra bARrax 
Campaign (SPARC, Spain, 2003 – 2004), SENtinel-2 and FLuorescence 
EXperiment (SEN2FLEX, Spain, 2005) and SEN3EXP (Spain, 2009). 

Moreover, other prominent field campaigns included AgriSAR 
(Germany, 2006), and CarboEurope/FLEx/Sentinel-2 (CEFLES2, France, 
2007). Although some models were tested across many sites, in most 
cases, such places were in Mediterranean climates. Hence, such models 
may not be portable to different climates, such as semi-arid African 
agricultural areas (Kganyago et al., 2020) and temperate continental 
semi-humid monsoon climates in China (Xie et al., 2019). Another 
consideration is that farm sizes in developed countries such as the USA, 
Spain, France and Italy are significantly larger than in sub-Saharan Af-
rica. Therefore, crop-specific models may not be relevant in regions 
where farm configurations are characterised by small field sizes (i.e., 
<0.5 ha) and mixed cropping systems. In addition to addressing gaps 
identified here, future research should focus on the development of 
generic (multi-crop), scalable (multi-sensor) and transferable models 
(across many sites and growing stages), especially within under-studied 
sub-Saharan African areas. 

Funding details 

This work was supported by the University of Witwatersrand and 
University of Johannesburg URC Grant [2023URC00563]. 

CRediT authorship contribution statement 

Mahlatse Kganyago: Conceptualization, Data curation, Formal 
analysis, Investigation, Methodology, Writing – original draft, Writing – 
review & editing. Clement Adjorlolo: Supervision, Writing – review & 
editing. Paidamwoyo Mhangara: Supervision, Writing – review & 
editing. Lesiba Tsoeleng: Data curation, Investigation. 

Declaration of competing interest 

The authors declare that they have no known competing financial 
interests or personal relationships that could have appeared to influence 
the work reported in this paper. 

Data availability 

No data was used for the research described in the article. 

Acknowledgements 

We appreciate the participation of anonymous reviewers in the peer- 
review process and the editorial team. Mahlatse Kganyago received 
funding from University of Witwatersrand and University of Johannes-
burg (UJ) University Research Council Grant (URC, [2023URC00563]). 

References 

Abdulridha, J., Ampatzidis, Y., Kakarla, S.C., Roberts, P., 2019. Detection of target spot 
and bacterial spot diseases in tomato using UAV-based and benchtop-based 
hyperspectral imaging techniques. Precis. Agric. https://doi.org/10.1007/s11119- 
019-09703-4. 

Albughdadi, M., Rieu, G., Duthoit, S., Alswaitti, M., 2021. Towards a massive sentinel-2 
LAI time-series production using 2-D convolutional networks. Comput. Electron. 
Agric. 180 https://doi.org/10.1016/j.compag.2020.105899. 

Ali, A., Imran, M., 2021. Remotely sensed real-time quantification of biophysical and 
biochemical traits of Citrus (Citrus sinensis L.) fruit orchards – A review. Sci. Hortic. 
https://doi.org/10.1016/j.scienta.2021.110024. 

Amin, E., Verrelst, J., Rivera-Caicedo, J.P., Pipia, L., Ruiz-Verdú, A., Moreno, J., 2021. 
Prototyping Sentinel-2 green LAI and brown LAI products for cropland monitoring. 
Remote Sens. Environ. 255 https://doi.org/10.1016/j.rse.2020.112168. 

Apolo-Apolo, O.E., Pérez-Ruiz, M., Martínez-Guanter, J., Egea, G., 2020. A mixed data- 
based deep neural network to estimate leaf area index in wheat breeding trials. 
Agronomy 10. https://doi.org/10.3390/agronomy10020175. 

Atzberger, C., 2000. Development of an invertible forest reflectance model The INFORM- 
Model, in: A Decade of Trans-European Remote Sensing Cooperation. Proceedings of 
the 20th EARSeL Symposium. 

Atzberger, C., 2013. Advances in remote sensing of agriculture: Context description, 
existing operational monitoring systems and major information needs. Remote Sens. 
(Basel) 5, 949–981. https://doi.org/10.3390/rs5020949. 

Atzberger, C., Richter, K., 2012. Spatially constrained inversion of radiative transfer 
models for improved LAI mapping from future Sentinel-2 imagery. Remote Sens. 
Environ. 120, 208–218. https://doi.org/10.1016/j.rse.2011.10.035. 

Azodi, C.B., Tang, J., Shiu, S.H., 2020. Opening the Black Box: Interpretable Machine 
Learning for Geneticists. Trends Genet. 36, 442–455. https://doi.org/10.1016/j. 
tig.2020.03.005. 

Barbosa, A., Trevisan, R., Hovakimyan, N., Martin, N.F., 2020. Modeling yield response 
to crop management using convolutional neural networks. Comput. Electron. Agric. 
170, 105197 https://doi.org/10.1016/j.compag.2019.105197. 

Baret, F., Weiss, M., 2018. Gio Global Land Component - Lot I “Operation of the Global 
Land Component” Algorithm Theoretical Basis Document. 

Baret, F., Houlès, V., Guérif, M., 2007. Quantification of plant stress using remote sensing 
observations and crop models: The case of nitrogen management. J. Exp. Bot. 58 
https://doi.org/10.1093/jxb/erl231. 

Bellvert, J., Mata, M., Vallverdú, X., Paris, C., Marsal, J., 2020. Optimizing precision 
irrigation of a vineyard to improve water use efficiency and profitability by using a 
decision-oriented vine water consumption model. Precis. Agric. https://doi.org/ 
10.1007/s11119-020-09718-2. 

Beltran, J.C., Valdez, P., Naval, P., 2019. Predicting Protein-Protein Interactions based 
on Biological Information using Extreme Gradient Boosting. 2019 IEEE Conference 
on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 
2019. https://doi.org/10.1109/CIBCB.2019.8791241. 

Boegh, E., Houborg, R., Bienkowski, J., Braban, C.F., Dalgaard, T., van Dijk, N., 
Dragosits, U., Holmes, E., Magliulo, V., Schelde, K., di Tommasi, P., Vitale, L., 
Theobald, M.R., Cellier, P., Sutton, M.A., 2013. Remote sensing of LAI, chlorophyll 
and leaf nitrogen pools of crop- and grasslands in five European landscapes. 
Biogeosciences 10, 6279–6307. https://doi.org/10.5194/bg-10-6279-2013. 

Campos-Taberner, M., García-Haro, F.J., Busetto, L., Ranghetti, L., Martínez, B., 
Gilabert, M.A., Camps-Valls, G., Camacho, F., Boschetti, M., 2018. A critical 
comparison of remote sensing Leaf Area Index estimates over rice-cultivated areas: 
From Sentinel-2 and Landsat-7/8 to MODIS, GEOV1 and EUMETSAT polar system. 
Remote Sens. (Basel) 10. https://doi.org/10.3390/rs10050763. 

Carter, C., Liang, S., 2019. Evaluation of ten machine learning methods for estimating 
terrestrial evapotranspiration from remote sensing. Int. J. Appl. Earth Obs. Geoinf. 
78, 86–92. https://doi.org/10.1016/j.jag.2019.01.020. 

Chen, T., Guestrin, C., 2016. XGBoost: A scalable tree boosting system. Proceedings of 
the ACM SIGKDD International Conference on Knowledge Discovery and Data 
Mining 13-17-Augu, 785–794. https://doi.org/10.1145/2939672.2939785. 

Chen, Y., Feng, L., Mo, J., Mo, W., Ding, M., Liu, Z., 2020. Identification of Sugarcane 
with NDVI Time Series Based on HJ-1 CCD and MODIS Fusion. J. Indian Soc. Remote 
Sens. 48 https://doi.org/10.1007/s12524-019-01042-1. 

Ciganda, V., Gitelson, A., Schepers, J., 2008. Vertical profile and temporal variation of 
chlorophyll in maize canopy: Quantitative “crop vigor” indicator by means of 
reflectance-based techniques. Agron. J. 100 https://doi.org/10.2134/ 
agronj2007.0322. 

Clevers, J.G.P.W., Gitelson, A.A., 2013. Remote estimation of crop and grass chlorophyll 
and nitrogen content using red-edge bands on sentinel-2 and-3. Int. J. Appl. Earth 
Obs. Geoinf. 23, 344–351. https://doi.org/10.1016/j.jag.2012.10.008. 

Corti, M., Cavalli, D., Cabassi, G., Vigoni, A., Degano, L., Marino Gallina, P., 2019. 
Application of a low-cost camera on a UAV to estimate maize nitrogen-related 
variables. Precis. Agric. 20, 675–696. https://doi.org/10.1007/s11119-018-9609-y. 

Darvishzadeh, R., Skidmore, A., Schlerf, M., Atzberger, C., 2008. Inversion of a radiative 
transfer model for estimating vegetation LAI and chlorophyll in a heterogeneous 
grassland. Remote Sens. Environ. 112 https://doi.org/10.1016/j.rse.2007.12.003. 

Dawson, T.P., Curran, P.J., Plummer, S.E., 1998. LIBERTY - Modeling the effects of Leaf 
Biochemical Concentration on Reflectance Spectra. Remote Sens. Environ. 65 
https://doi.org/10.1016/S0034-4257(98)00007-8. 

De Castro, A.-I., Jurado-Exposito, M., Gómez-Casero, M.-T., Lopez-Granados, F., 2012. 
Applying neural networks to hyperspectral and multispectral field data for 
discrimination of cruciferous weeds in winter crops. The Scientific World Journal 
2012. 

de Keukelaere, L., Sterckx, S., Adriaensen, S., Knaeps, E., Reusen, I., Giardino, C., 
Bresciani, M., Hunter, P., Neil, C., van der Zande, D., Vaiciute, D., 2018. Atmospheric 
correction of Landsat-8/OLI and Sentinel-2/MSI data using iCOR algorithm: 
validation for coastal and inland waters. Eur. J. Remote Sens. 51, 525–542.
https://doi.org/10.1080/22797254.2018.1457937.

Delegido, J., Verrelst, J., Alonso, L., Moreno, J., 2011. Evaluation of Sentinel-2 red-edge 
bands for empirical estimation of green LAI and chlorophyll content. Sensors 11, 
7063–7081. https://doi.org/10.3390/s110707063. 

Delegido, J., Van Wittenberghe, S., Verrelst, J., Ortiz, V., Veroustraete, F., Valcke, R., 
Samson, R., Rivera, J.P., Tenjo, C., Moreno, J., 2014. Chlorophyll content mapping 
of urban vegetation in the city of Valencia based on the hyperspectral NAOC index. 
Ecol. Ind. 40, 34–42. https://doi.org/10.1016/j.ecolind.2014.01.002. 

Delegido, J., Verrelst, J., Rivera, J.P., Ruiz-Verdú, A., Moreno, J., 2015. Brown and green 
LAI mapping through spectral indices. Int. J. Appl. Earth Obs. Geoinf. 35, 350–358. 
https://doi.org/10.1016/j.jag.2014.10.001. 

Delloye, C., Weiss, M., Defourny, P., 2018. Retrieval of the canopy chlorophyll content 
from Sentinel-2 spectral bands to estimate nitrogen uptake in intensive winter wheat 
cropping systems. Remote Sens. Environ. 216, 245–261.
https://doi.org/10.1016/j.rse.2018.06.037.

Dhakar, R., Sehgal, V.K., Chakraborty, D., Sahoo, R.N., Mukherjee, J., 2019. Field scale 
wheat LAI retrieval from multispectral Sentinel 2A-MSI and LandSat 8-OLI imagery: 
effect of atmospheric correction, image resolutions and inversion techniques. 
Geocarto Int. 1–21. https://doi.org/10.1080/10106049.2019.1687591. 

Dong, T., Liu, J., Shang, J., Qian, B., Ma, B., Kovacs, J.M., Walters, D., Jiao, X., Geng, X., 
Shi, Y., 2019. Assessment of red-edge vegetation indices for crop leaf area index 
estimation. Remote Sens. Environ. 222, 133–143.
https://doi.org/10.1016/j.rse.2018.12.032.

Dorigo, W., Lucieer, A., Podobnikar, T., Carni, A., 2012. Mapping invasive Fallopia 
japonica by combined spectral, spatial, and temporal analysis of digital orthophotos. 
Int. J. Appl. Earth Obs. Geoinf. 19, 185–195.
https://doi.org/10.1016/j.jag.2012.05.004.

Doxani, G., Vermote, E., Roger, J.C., Gascon, F., Adriaensen, S., Frantz, D., Hagolle, O., 
Hollstein, A., Kirches, G., Li, F., Louis, J., Mangin, A., Pahlevan, N., Pflug, B., 
Vanhellemont, Q., 2018. Atmospheric Correction Inter-Comparison Exercise. 
Remote Sens. (Basel) 10, 1–18. https://doi.org/10.3390/rs10020352.

Duan, S.B., Li, Z.L., Wu, H., Tang, B.H., Ma, L., Zhao, E., Li, C., 2014. Inversion of the 
PROSAIL model to estimate leaf area index of maize, potato, and sunflower fields 
from unmanned aerial vehicle hyperspectral data. Int. J. Appl. Earth Obs. Geoinf. 26, 
12–20. https://doi.org/10.1016/j.jag.2013.05.007. 

Elarab, M., Ticlavilca, A.M., Torres-Rua, A.F., Maslova, I., McKee, M., 2015. Estimating 
chlorophyll with thermal and broadband multispectral high resolution imagery from 
an unmanned aerial system using relevance vector machines for precision 
agriculture. Int. J. Appl. Earth Obs. Geoinf. 43, 32–42.
https://doi.org/10.1016/j.jag.2015.03.017.

Fang, H., Wei, S., Liang, S., 2012. Validation of MODIS and CYCLOPES LAI products 
using global field measurement data. Remote Sens. Environ. 119, 43–54.
https://doi.org/10.1016/j.rse.2011.12.006.

Fang, H., Zhang, Y., Wei, S., Li, W., Ye, Y., Sun, T., Liu, W., 2019. Validation of global 
moderate resolution leaf area index (LAI) products over croplands in northeastern 
China. Remote Sens. Environ. 233, 111377 https://doi.org/10.1016/j.rse.2019.111377.

Féret, J.B., Gitelson, A.A., Noble, S.D., Jacquemoud, S., 2017. PROSPECT-D: Towards 
modeling leaf optical properties through a complete lifecycle. Remote Sens. Environ. 
193, 204–215. https://doi.org/10.1016/j.rse.2017.03.004. 

Fernandes, R., Weiss, M., Camacho, F., Berthelot, B., Baret, F., Duca, R., 2014.