A GEOSPATIAL APPROACH TO MAPPING JACARANDA TREE DISTRIBUTION IN JOHANNESBURG, SOUTH AFRICA

Rohini Chelsea Reddy (1836015)
November 2023
Supervisor: Prof. Jennifer Fitchett

Declaration

I declare that this dissertation is my own, original work, except where otherwise acknowledged. It is being submitted for the Degree of Master of Science at the University of the Witwatersrand, Johannesburg. It has not been submitted previously for any degree or examination at this or any other university.

Rohini Chelsea Reddy
Date: 24 November 2023

Abstract

Accurate mapping of the spatial distribution of invasive species is vital for the implementation of effective monitoring and management strategies. In countries where resources are scarce and costly, citizen science provides a cost-effective and accurate alternative for large-scale data collection. Citizens’ familiarity with their environment contributes to the accurate identification of features on the landscape. Advances in geographic information systems (GIS), together with freely accessible photography from Google Street View, provide accurate methods for in-field and remote validation of citizen science data for invasive species mapping, and assist with the creation and compilation of maps that visualize the spatial distribution of invasive plants across the landscape. In this study, the first spatial distribution maps of the invasive tree species Jacaranda mimosifolia (common name: Jacaranda) are created for the City of Johannesburg (CoJ). Jacaranda trees are well known to citizens of the CoJ for their distinct purple flowers, which blanket the landscape during springtime. A combined approach using citizen science, GIS, and Google Street View is used for data collection, analysis, and the creation of the first spatial distribution map of the exact location and prevalence of Jacaranda trees within certain suburbs of the CoJ.
A total of 8,931 ground-truthing geopoints, together with extensive Google Street View validation of Jacaranda tree presence, formed the basis of accurate spatial distribution maps. The first research question of this study concerned the spatial distribution of Jacaranda trees in the CoJ; in answer, a total of 54 suburbs were confirmed as having a large presence of Jacaranda trees. The citizen science component collected a total of 488 geotags of possible Jacaranda tree presence in the CoJ over a 75-day online survey collection period. Although citizen science data provided a lower spatial resolution than the fieldwork and Google Street View approaches, it provided very high accuracy for the identification and geolocation of Jacaranda tree presence in the CoJ. This answers the second research question, which concerned the effectiveness of the geospatial approach combining citizen science, ground-truthing and Google Street View as data collection methods. Citizen science placed 66% of collected geotags within the ‘very high’, ‘high’ and ‘moderate’ accuracy categories, ranging from <7 m to 24 m from a confirmed Jacaranda tree. Taken together with the accuracy of the 8,931 in-field geolocations of Jacaranda trees and the accuracy and coverage of Google Street View imagery, it is concluded that the combined approach of ground-truthing, citizen science and Google Street View contributes not only to effective data collection, but also to the successful mapping of Jacaranda tree presence in the CoJ.

Acknowledgements

First and foremost, I would like to thank my supervisor, Prof. Jennifer Fitchett. Prof, thank you for seeing the potential in me from my Honours year and for providing the best teaching support I have received throughout my entire academic career.
You have always been patient, understanding and kind from the time I changed my entire focus four months into the year, right the way through to my final days of submission whilst I juggled moving to a new city. Your passion is admirable.

I would like to thank the South African National Space Agency for awarding me a bursary that provided the funding for my Master’s studies.

I would also like to acknowledge the support and assistance from the fantastic team at Esri South Africa. To Lizette Rust for offering me an abundance of resources and support. To Cameron Muller for being the best mentor and now one of my role models in the GIS field. To Deon Lengton for helping with Esri devices and providing support during fieldwork. To the Esri Interns who kept me smiling (and dancing). Specifically, I would also like to acknowledge Lindokuhle Nhlabathi, Shelton Ndlovu, Basil Rabophala, Tears Rankapole, and Linda Mabeso for accompanying me on data collection during my fieldwork campaigns; you each made the fieldwork component one of the best experiences of my academic career. To everyone from the Esri Durban office for sharing this academic journey with me.

I would like to acknowledge all anonymous respondents involved in the survey aspect of my research project. I would also like to thank Dean Peters and Dale Carter from the City of Cape Town for the ongoing support and motivation towards the completion of my studies.

Last, but certainly not least, I would like to thank my entire family for their immeasurable support. To my grandparents, Ma and Pa, thank you both for ensuring I get away from my desk with weekly brunch and coffee outings which were always filled with laughter and wise words of encouragement. To my little brother, Kieron, for always offering warm hugs, funny faces, and constant reminders to make time for fun. To my parents, Sangita and Noel, this research project is dedicated to both of you.
I would never have made it this far if it wasn’t for the fighting spirit you have instilled in me. Thank you for the endless amount of love, guidance, understanding, support, and patience you both continue to offer.

Table of Contents

Declaration
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Acronyms and Abbreviations
Glossary
CHAPTER 1: INTRODUCTION
1.1 Background
1.2 Rationale
1.3 Aims and Objectives
1.4 Research Questions
1.5 Structure of the Dissertation
CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.1.1 Background and Previous Research
2.2 Tools for Recording Vegetation
2.2.1 Citizen Science as a Tool
2.2.2 Google Earth and Google Street View as a Tool
2.2.3 Social Media as a Tool
2.3 Combined Citizen Science and Geospatial Approaches
2.3.1 Spatial Distribution Mapping
2.4 Synthesis
CHAPTER 3: STUDY SITE
3.1 Introduction
3.2 History
3.3 General Geography
3.4 Climate
3.5 Population
CHAPTER 4: METHODOLOGY
4.1 Introduction
4.2 Data Collection
4.2.1 Ground-truth Data
4.2.2 Virtual Campaign Data
4.2.3 Citizen Science Data
4.3 Data Analysis
4.3.1 Geoprocessing
4.4 Synthesis
CHAPTER 5: RESULTS
5.1 Introduction
5.2 Ground-truth Validation of Jacaranda Tree Presence
5.3 Google Street View Validation of Jacaranda Tree Presence
5.4 Citizen Science Survey Responses
5.4.1 Social Media Posting Dates for Survey Responses
5.5 Citizen Demographic Information
5.5.1 Citizen Familiarity with the CoJ
5.5.2 Citizen Suburb of Residence
5.6 Citizen Science Certainty Ratings
5.6.1 Certainty Ratings Linked To Citizen Length of Residency in the CoJ
5.7 Citizen Science Geotagging
5.8 Citizen Science Geotag Accuracy
5.8.1 Inaccurate, Unconfirmed, or Invalid Geotags from Citizen Science
5.8.2 Accuracy of Geotagging Certainty from Citizen Science
5.9 Spatial Distribution of Jacaranda Trees
5.10 Synthesis
CHAPTER 6: DISCUSSION
6.1 Introduction
6.2 Mapping Jacaranda Tree Distribution
6.2.1 Invasive Species Monitoring and Management
6.2.2 Citizen Science Engagement
6.2.3 Tourism Opportunities
6.3 Future Applications
6.4 Limitations
6.5 Synthesis
CHAPTER 7: CONCLUSIONS
7.1 Introduction
7.2 Accomplishment of Study Aim, Objectives, and Research Questions
7.3 Implications of Jacaranda Tree Distribution in the CoJ
7.4 Future Research Avenues
References
Appendix A: Ethics Clearance Certificate
Appendix B: Survey Questionnaire
Appendix C: Examples of Social Media Posts
Appendix D: Example of Poster with QR Code to Survey

List of Figures

Figure 3.1: Study site map of the CoJ showing the main regions of Diepsloot, Midrand, Sandton, Alexandra, Inner City, Johannesburg South, Ennerdale, Soweto, Diepkloof, Rosebank, Roodepoort and adjacent municipalities.
Figure 4.1: Methodology diagram.
Figure 4.2: Own photograph during fieldwork showing a) winter-time green tree, b) winter-time green leaf, c) spring-time purple tree, and d) spring-time purple flower of Jacaranda trees.
Figure 4.3: Own photograph of CT8 Tablet and Garmin device provided by Esri South Africa.
Figure 4.4: Own photograph during fieldwork in Soweto, showing three-button layout of QuickCapture application.
Figure 4.5: Model builder workflow showing custom built SQL queries for data filtering and analysis.
Figure 5.1: Map showing ground-based validation of a total of 8,931 geopoints where a single geopoint represents Jacaranda tree presence in the CoJ.
Figure 5.2: Map showing Google Street View captures of Jacaranda tree presence in the CoJ, where a single white geopoint represents between 1 and 6 Jacaranda trees.
Figure 5.3: Google Street View capture showing a) two green and b) four flowering Jacaranda trees in a single capture.
Figure 5.4: Count of online survey responses collected per month from citizens of the CoJ, over the 75-day survey collection period.
Figure 5.5: Response count per day as influenced by posting days on five social media platforms of Instagram, Facebook, WhatsApp, LinkedIn, and Twitter.
Figure 5.6: Response count of each citizen’s timeframe for which they have lived in the CoJ.
Figure 5.7: Citizen science online survey responses per Suburb in the CoJ, ordered alphabetically.
Figure 5.8: Certainty rating of total citizen science responses per suburb, ordered from ‘Very certain’, ‘Certain’, ‘Neither certain nor uncertain’, ‘Uncertain’, and ‘Very uncertain’.
Figure 5.9: Certainty ratings of Jacaranda tree presence in citizen’s suburbs, as influenced by citizen’s residence in the CoJ.
Figure 5.10: Map showing all captured citizen science geotags for Jacaranda tree presence in the CoJ.
Figure 5.11: Map showing total citizen science geotags categorized by ‘my street’, ‘other streets’ and ‘exact location’ of Jacaranda tree presence in CoJ.
Figure 5.12: Screen capture from Google Street View of misidentification of Jacaranda tree to the common London Plane tree in the CoJ.
Figure 5.13: Screen capture from Google Street View of misidentification of Jacaranda tree to similar pigmented flowering trees of differing tree species.
Figure 5.14: Screen capture from Google Street View of streets with available Google Street View imagery shown in blue. Streets without a blue highlight do not have Google Street View imagery available.
Figure 5.15: Screen capture from Google Street View of blurred Google Street View imagery due to strong sunlight limiting visibility of tree species in this capture.
Figure 5.16: Accuracy of citizen science geotagging compared to certainty ratings for Jacaranda tree presence on citizen’s streets.
Figure 5.17: Accuracy of citizen science geotagging compared to certainty ratings for Jacaranda tree presence on other streets.
Figure 5.18: Map showing combined geopoints of ground-based validation, Google Street View validation and citizen science geotags for Jacaranda tree presence in the CoJ.
Figure 5.19: Zoomed-in map showing select suburbs with large numbers of Jacaranda trees where ground-based geopoints, Google Street View geotagging, and citizen science geotagging data is abundant.
Figure 5.20: Density map showing regions with large numbers of confirmed Jacaranda trees, as supported by ground-based and Google Street View validation.
Figure 6.1: Google Street View of a blurred capture due to camera malfunction or privacy concerns.
Figure 6.2: Google Street View capture of Kruger Street intersection in the CoJ where Jacaranda trees are captured in a) flowering and b) bare morphology.

List of Tables

Table 5.1: Number of validated Jacaranda tree geopoints collected from each of the two field campaigns carried out in winter and spring seasons.
Table 5.2: List of hashtags, tags and mentions which were used in Jacaranda posts on social media.
Table 5.3: Certainty ratings of Jacaranda tree presence from citizen’s street and other streets.
Table 5.4: Six categories for level of accuracy of citizen science geotags from confirmed Jacaranda tree presence.
Table 5.5: Citizen science geotags categorized by six levels of accuracy. Accuracy levels were analyzed for the total geotag count, geotags on ‘my street’, geotags on ‘other streets’ and ‘exact location’.
Table 5.6: Description of 29 citizen science collected geotags which lie within the ‘inaccurate’, ‘unconfirmed’ or ‘invalid’ accuracy range of >200 m from confirmed Jacaranda tree presence.
Table 5.7: Suburbs with high concentrations of Jacaranda tree presence, as confirmed from field campaigns (highlighted in pink), Google Street View (highlighted in green), and citizen science (indicated with an *). Suburbs where both field campaigns and Google Street View were used are highlighted in purple.
List of Acronyms and Abbreviations

CoJ: City of Johannesburg
EVI: Enhanced Vegetation Index
GIS: Geographic information system
GLONASS: Global Navigation Satellite System
GPS: Global Positioning System
GUIDs: Globally Unique Identifiers
KML: Keyhole Markup Language
LAI: Leaf Area Index
LHWP: Lesotho Highlands Water Project
NDVI: Normalized Difference Vegetation Index
PPGIS: Public Participation GIS
PSHB: Polyphagous Shot Hole Borer
QR: Quick Response
SABAP: The South African Bird Atlas Project
SQL: Structured Query Language
TOD: Transit-oriented Development
VGI: Volunteered Geographic Information

Glossary

Capture: This is a single frame which is captured from a desired perspective using Google Street View. This single frame, or capture, has embedded geographic co-ordinates from which the street view was captured.

Citizen Science: “The collection and analysis of data relating to the natural world by members of the general public, typically as part of a collaborative project with professional scientists” (Oxford 2016). In this study, the general public were not actively engaged in the analysis of data, but rather in data collection only.

Crowdsource: To “obtain (information or input into a particular task or project) by enlisting the services of a large number of people, either paid or unpaid, typically via the Internet” (Oxford 2016). In this study, the term ‘crowdsourced’ must be read with an understanding of the citizens of the City of Johannesburg (CoJ) as the unpaid volunteers who obtained Jacaranda tree information for the purpose of this study.

Esri: Esri is a private company, founded in 1969, where the first commercial geographic information system was developed. Today, Esri is a world leader in GIS software.

Geographic information system (GIS): A geographic information system produces maps and manages and analyzes a wide variety of types of data.
In this study, a GIS assisted with the successful mapping of Jacaranda tree presence and distribution in the CoJ, whilst simultaneously providing the mapping platform for citizen science data collection.

Geopoint: These points represent the exact geographic co-ordinates of a Jacaranda tree, collected through two field campaigns in the months of August and October using Esri’s QuickCapture application.

Geotag: This tag represents a single citizen science capture of a Jacaranda tree using Esri’s Survey123 Connect, which is a sophisticated geocoded survey platform.

Ground-truth(ing): This process involved ground-based validation of Jacaranda tree presence through physical visits to Jacaranda tree locations.

CHAPTER 1: INTRODUCTION

Own photograph taken in the suburb of Northcliff, on 12 October 2022.

1.1 Background

The City of Johannesburg (CoJ) agglomerates more than one third of the Gauteng City-region population (Schäffler and Swilling, 2013). The CoJ is one of the fastest growing cities in South Africa, with the highest numbers of migrants (Rasmeni and Mudia, 2019). South Africa’s history of apartheid created urban pressures which are still seen today (Risimati and Gumbo, 2018). Racial segregation led to a division of the CoJ into affluent northern suburbs and poor southern suburbs (Rasmeni and Mudia, 2019). This division influenced a tree planting boom in which many invasive tree species were planted, predominantly in the affluent areas, in an effort to reduce dust, air, heat, and noise pollution in the CoJ (Schäffler and Swilling, 2013). One of these tree species was Jacaranda mimosifolia (common name: Jacaranda). Jacaranda is a deciduous tree belonging to the Bignoniaceae family (Xie et al. 2021). Jacaranda trees were brought to South Africa from Brazil in the 1800s, not only to address pollution from increased mining in the CoJ but also to beautify the landscape (Gachet and Schuhly, 2009).
Jacaranda trees are identifiable by their bright purple flowers, which blossom in spring, and form a large part of the urban identity and urban forest of the CoJ (Schäffler and Swilling, 2013). Apart from the excitement that springtime Jacaranda flowering offers citizens, these trees are also notable for their conservation status. Jacaranda trees hold category 3 alien-invasive status and occupy a unique position as an “invasive flagship species” (Fitchett and Fani, 2018, p. 470). Owing to this status, Jacaranda trees may not be replanted, transplanted, or sold; as a result, the ageing population of Jacaranda trees will not always be part of the urban fabric of the CoJ (Schäffler and Swilling, 2013). Spatial information on the distribution of Jacaranda trees around the CoJ is scarce: only one remote sensing-based study has identified the location of Jacaranda trees in the CoJ, and with limited accuracy (Newete et al. 2022). It is therefore imperative to know the spatial distribution of invasive Jacaranda trees, as reflected in the primary and secondary aims of this study. The aim of producing the first Jacaranda distribution map of the CoJ using a three-way combined data collection approach supports accurate and efficient data collection and accurate resulting maps, as seen later in this dissertation.

Citizen science, in this context, has been defined as “the collection and analysis of data relating to the natural world by members of the general public, typically as part of a collaborative project with professional scientists” (Oxford 2016). In this study, the general public were not actively engaged in the analysis of data, but rather in data collection only. Citizen science has been used in vegetation distribution mapping because data are collected at a fine spatial resolution over a large geographic area, at a very low cost (Chandler et al. 2017).
There is an increased use of citizen science for invasive tree mapping (although often for their removal), providing data outputs which are both reliable and accurate in complex urban environments (e.g. Hawthorne et al. 2015; Seiferling et al. 2017). This provides a framework for mapping Jacaranda trees which both contributes to the growing literature on mapping invasive species and assists those monitoring the current distribution of this unique flagship invasive. Moreover, as stated in the secondary aim, a citizen science approach to invasive species mapping in the CoJ is useful both for data collection and for quantifying the accuracy of citizen science data (e.g. Hawthorne et al. 2015).

Citizen science data collection is limited by issues of accuracy in the identification of specific tree species by non-specialists; these are addressed by ground-truthing through physical fieldwork and the use of digital imagery such as Google Street View (Visser et al. 2014). Cross-referencing citizen science geotags with trees identified on Google Street View and through ground-based fieldwork permits both an assessment of the accuracy of citizen science as an approach and a more comprehensive and accurate Jacaranda distribution map. Both outcomes are explored in the research questions of this study, which concern the spatial distribution of Jacaranda trees in the CoJ and the effectiveness of the geospatial approach combining citizen science, ground-truthing and Google Street View as data collection methods.

1.2 Rationale

One of the many purposes of invasive species mapping is to represent species distributions across both local and global landscapes and environments (Pedrotti 2012; Seiferling et al. 2017). Knowledge of the main features of a landscape, such as invasive presence, is imperative for efficient and effective maintenance and monitoring (Qiu et al. 2019; Choi et al. 2022).
Repeated mapping of vegetation or individual species over time yields data on, for example, rate of growth, phenological changes, direction of growth and population presence (Pedrotti 2012; Dujardin et al. 2022). Furthermore, it provides a visual interpretation of long-term trends and assists with future trend analysis (Masocha and Skidmore, 2011; Buldrini et al. 2015). There is an urgent need for accurate mapping of the spatial distribution of invasive species, as their presence leads to a loss of biodiversity and ecosystem services (Visser et al. 2014). As an invasive species, Jacaranda trees similarly need their spatial distribution mapped for the CoJ (Newete et al. 2022). Additionally, given the importance of Jacaranda trees to the identity of the CoJ, and the use of Jacaranda-lined streets in photography and videography during the spring season, a map of their distribution has social and cultural value. These maps provide vital location information for citizens to experience well-known Jacaranda blooms, whilst simultaneously allowing for improved monitoring of these invasive trees for both ageing analysis and the presence of Polyphagous Shot Hole Borer (PSHB), an invasive insect which burrows through tree bark, weakening the tree’s ability to transport moisture and presenting great risk to many urban trees (Paap et al. 2018; Newete et al. 2022).

Remote sensing has been widely used for invasive species mapping; however, challenges arise when mapping heterogeneous landscapes which are vegetation dense (Fonji et al. 2014). These challenges include cost, as ground-truthing is expensive and time-consuming; spectral mixing, as similar spectral reflectance from other land cover types skews the dataset; and misclassification, as coarse spatial resolution results in several land cover types being classified as a single type (Huete et al. 2002).
Citizen science presents the opportunity for large-scale ground-validation of invasive Jacaranda trees, at a high spatial resolution and low cost (Tang and Liu, 2016). The time required for ground-truthing by field visits is reduced as geolocations of Jacaranda trees are collected through citizen science, providing random sampling sites for ground-validation (Hawthorne et al. 2015). Moreover, Google Street View is a free platform compiling street-level views of roads and pavements across the world, which is useful for the identification of streets where these urban trees are located, without being physically present in-field (Visser et al. 2014).

1.3 Aims and Objectives

The primary aim of this study is to produce the first map of the distribution of Jacaranda trees in the CoJ using a geospatial approach through a combination of citizen science, ground-truthing and Google Street View. The secondary aim is to explore the accuracy of citizen science in mapping distinct tree species. This will be achieved through the following objectives:

1) To use a citizen science approach to crowd-source data on the location of Jacaranda trees in the CoJ, to compare this data to ground-truthing and Google Street View data on confirmed Jacaranda tree presence in the CoJ, and then to map all three data collection methods through an integrated geocoded survey and GIS platform.
2) To use both a GIS and Google Street View to validate ground-truthing data of both reported and unreported Jacaranda tree locations in the CoJ, as reported from citizen science.
3) To perform an accuracy assessment of the citizen science derived data and map the level of accuracy on the final product.

1.4 Research Questions

1) What is the spatial distribution of Jacaranda trees in the CoJ?
2) How effective is the geospatial approach in the combined citizen science, Google Street View, and ground-truthing methods of data collection?

1.5 Structure of the Dissertation

This dissertation comprises seven chapters.
Chapter two explores studies that have engaged a GIS, citizen science and Google Street View as tools for successful vegetation mapping, together with the pertinent literature on the CoJ’s urban forest. Chapter three provides contextualization of the study site with reference to the climate, geography, urban forest, and population dynamics. Chapter four provides a detailed breakdown of the methods used in this research project, with reference to studies which have engaged in either similar or alternate methodological approaches. Chapter five provides a critical analysis of all core data components and presents these results along with their limitations. Chapter six provides a detailed discussion of each core component of this study, in which individual data sources are analyzed for trends and patterns, significant discoveries, and their implications for this study and for future applications. The final chapter concludes with a synthesis of the study and the achievement of the aim and objectives, followed by future research avenues and implications arising from this research study. Owing to the hierarchy of sections, sub-sections, and sub-sub-sections, figures and tables are numbered according to the subsection in which they fall, to allow for ease of identification.

CHAPTER 2: LITERATURE REVIEW

Own photograph in the suburb of Waverley, taken on 17 October 2022.

2.1 Introduction

Vegetation mapping is of importance as it provides spatial information on the location, type, direction, and rate of spread of plant species (Bois et al. 2011). In the CoJ, where the urban forest is one of the world’s most extensive, a spatial understanding of various urban forest species is imperative for effective monitoring and management of invasive species (Fitchett and Raik, 2021; Newete et al. 2022). This is especially significant for effective conservation efforts to manage and plan for future trends (Jordan et al. 2012; Thompson 2016).
The literature review provides an overview of the published research reflecting tools that are used in recording vegetation, with a focus on the effectiveness of the geospatial approach towards the combined use of citizen science, Google Street View, and ground-truthing as data collection sources, as stated in the research question. In addition to reviewing existing studies that map vegetation, this literature review also critically reflects on methodologies employed to assess the accuracy of each of the data collection sources used, along with capabilities for spatial distribution mapping of invasive species in both the CoJ and other cities with heterogeneous urban forests.

2.1.1 Background and Previous Research

Today, alien invasive Jacaranda trees are mostly found in the northern suburbs of the CoJ, as these were typically the wealthy suburbs during post-colonial times (Newete et al. 2022). Although Jacaranda trees are abundant in the CoJ’s landscape, studies linked to spatial distribution are few, with many existing studies focusing on remote sensing mapping approaches (Jombo and Adam 2018; Abutaleb et al. 2021; Newete et al. 2022). Newete et al. (2022) present spatial distribution mapping of Jacaranda tree presence in the CoJ using remote sensing, which revealed the four main suburbs of Saxonwold, Houghton, Parkview, and Parkwood as high-prevalence Jacaranda tree suburbs. Though remote sensing is a commonly used tool, that study highlighted two research gaps linked to spectral mixing and misclassification of tree canopies when mapping a heterogeneous landscape such as the CoJ. The first research gap is the exclusion of smaller suburbs within the CoJ with potential Jacaranda tree presence, as remote sensing efforts were unable to detect the small-scale tree canopy cover of these suburbs (Newete et al. 2022).
Since Jacaranda trees are a category three invasive species, it is important to identify all suburbs for their presence and to utilize both small and large-scale mapping techniques. The second research gap concerns a secondary data collection technique through fieldwork, which was aimed at collecting Jacaranda tree data using a hand-held GPS device; however, the areas visited were few in number, chosen at random, and included major parks and suburbs only (Newete et al. 2022). To achieve a spatial distribution map which is accurate, all areas of the CoJ must be visited or considered, irrespective of suburb size and independent of known areas for Jacaranda tree presence. In this respect, citizen science is an effective tool to crowd-source data on both Jacaranda tree location and known presence, as citizen science data is sourced at a fine spatial resolution over the geographic area in focus (Chandler et al. 2017). Citizen science data can form the foundation for other spatial distribution mapping efforts to confirm or validate Jacaranda tree presence. This coincides with the use of ground-truthing and Google Street View as two capable tools for the validation, confirmation, and mapping of the spatial distribution of Jacaranda trees in the CoJ (Ives et al. 2017; Seiferling et al. 2017). Ground-truthing and Google Street View data collection methods can be compared to citizen science data collection to produce a measure of accuracy, together with providing an assessment of the effectiveness of using a geospatial approach in this regard (Bois et al. 2011). These three data sources combined will address any gaps which may present themselves in the data collection period. Therefore, it is important to highlight areas where data has not been collected, as well as to consider additional data collection methodologies to close these research gaps for more accurate spatial distribution mapping.
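The comparison of citizen science geotags against confirmed tree locations described above can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes that each geotag and each confirmed tree is stored as a (latitude, longitude) pair in decimal degrees, that accuracy is taken as the great-circle distance to the nearest confirmed tree, and that distances are grouped into bands. The function names and the 15 m cut-off are hypothetical; only the 7 m and 24 m limits echo the accuracy ranges quoted in the abstract.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

# Hypothetical accuracy bands (upper limit in metres, label).
# Only the 7 m and 24 m limits echo the abstract; 15 m is an assumption.
BANDS = [(7.0, "very high"), (15.0, "high"), (24.0, "moderate")]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_tree_distance_m(geotag, confirmed_trees):
    """Distance in metres from one geotag to the closest confirmed tree."""
    return min(haversine_m(geotag[0], geotag[1], tree[0], tree[1])
               for tree in confirmed_trees)


def accuracy_band(distance_m):
    """Label a distance with the first band whose upper limit exceeds it."""
    for upper_limit, label in BANDS:
        if distance_m < upper_limit:
            return label
    return "low"


if __name__ == "__main__":
    # Illustrative coordinates only; these are not data from this study.
    trees = [(-26.1501, 28.0400), (-26.1600, 28.0400)]
    geotag = (-26.1500, 28.0400)
    d = nearest_tree_distance_m(geotag, trees)
    print(f"{d:.1f} m from nearest confirmed tree: {accuracy_band(d)}")
```

A brute-force nearest-neighbour search of this kind is adequate for a few hundred geotags compared against a few thousand confirmed trees; larger datasets would normally rely on the spatial indexing provided by a GIS.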
2.2 Tools for Recording Vegetation

Vegetation has been observed and recorded by both scientists and members of the general public for hundreds of years (Marsham 1789; Sparks and Carey 1995). This has a range of applications from tracking phenological shifts to monitoring invasive species and identifying biomes. The current toolbox of methods used in recording vegetation includes the use of citizen science, a combination of Google Earth with Google Street View, as well as social media. Each tool provides a fundamental opportunity for high-resolution data collection, at varied geographical scales, which is explored in this section.

2.2.1 Citizen Science as a Tool

Citizens as volunteers provide support for sampling activities which are essential for both biodiversity monitoring and public awareness of ecological changes (Miller-Rushing et al. 2012; van der Wal et al. 2015). Citizen science, in this context, has been defined as “the collection and analysis of data relating to the natural world by members of the general public, typically as part of a collaborative project with professional scientists” (Oxford 2016). In this study, citizens were not actively engaged in the analysis of data, but rather in data collection only. Volunteers for citizen science projects are made up of the public with a focus on the connection of people to science (Wright et al. 2015; Vahidi et al. 2021). When citizens participate, it empowers the public to make direct contributions to scientific research (Dickinson et al. 2012; Miller-Rushing et al. 2012). Citizen science can encompass ‘crowdsourcing’ and ‘public participation’ (Brown and Kytta 2014; Fonji et al. 2014; Vahidi et al. 2021). However, there are distinct differences between these terms. Public participation is the participation by the public-at-large in scientific research, as well as political decision making, urban planning and other applications (Cooper et al. 2018).
Here, scientific research is argued to have never been the exclusive domain of professional scientists, as many ‘amateur’ scientists have been successful in this regard (Cooper et al. 2018). Much of scientific research involves routine tasks such as documenting, monitoring, analysis and writing, to which almost anyone can make useful contributions if they are careful and adhere stringently to protocol (Cooper et al. 2018). In citizen science, a citizen scientist is concerned only with science and not scientific research. In this study, citizens did not have to be literate in the scientific process, but merely capable of using simple data tools, such as user-friendly geocoded surveys. An example of citizen science in this regard is seen in studies of the data logging tool named CyberTracker. CyberTracker was tested in the Karoo National Park, South Africa, where two citizens who could not read or write were successful in the thorough data collection and monitoring of endangered Desert Black Rhino through the CyberTracker field computer (Liebenberg et al. 1999). Crowdsourcing is commonly used to “obtain information or input (into a particular task or project) by enlisting the services of a large number of people, either paid or unpaid, typically via the Internet” (Oxford 2016). Therefore, the complexity and ambiguity between the terms citizen science, crowdsourcing, and public participation, as three separate concepts, must be acknowledged (Cooper et al. 2018). Though citizen science, crowdsourcing and public participation are similar in their involvement of the public for data collection, each term clearly differs in the level at which participants are expected to engage (Harvey 2012). Therefore, citizen science is the most correct term for this study, as the participation of the public-at-large for the purpose of science was evident.
Moreover, the public in this study must be understood as citizens of the CoJ who each participated on a voluntary basis, without the enlistment of their services and without the benefit of an income or incentive, as crowdsourcing approaches would offer. In vegetation recording, a citizen science technique assists with the cross-validation of professional data collection, as a larger database consisting of both professional ground-control data and citizen observation data is created (Roman et al. 2017; Roy-Dufresne et al. 2019). These multiple stakeholder collaborations present new opportunities not only for database management, but also for vegetation distribution modelling (Roy-Dufresne et al. 2019; Azzurro and Cerri, 2021). Information extracted from citizens who live near the environmental region of interest often complements or even replaces professional methods of ecological sampling (Ives et al. 2017; Azzurro and Cerri, 2021). These diverse uses of citizen science have each assisted in increasing public awareness of science by incorporating local knowledge of citizens into scientific research (Wright et al. 2015; Roman et al. 2017). This promotes advocacy among citizens as more learning opportunities are presented and diverse views are considered (Cooper et al. 2008; Vahidi et al. 2021). However, most of these citizen science programs are carried out in North America or Europe, with few programs found in Africa, Asia, and Central and South America (Chandler et al. 2017).

2.2.1.1 Citizen Science Projects

Globally, citizen science projects in biodiversity and conservation management have attracted increased attention due to awareness of citizen science programs (Newman et al. 2012; Nugent 2018). Two global programs, among many others, are those of iNaturalist and the Global Coral Reef Monitoring Network (Chandler et al. 2017).
iNaturalist is a mobile application which serves the global community for nature observation and identification by means of photography with embedded information on the date and location of the photograph, for scientists and society to learn from (Nugent 2018). The Global Coral Reef Monitoring Network is an initiative which provides updated scientific information on the status and trends of coral reef ecosystems throughout the world (GCRMN 2020). Locally, a range of citizen science programs have been implemented by the Animal Demography Unit (ADU), based at the University of Cape Town. The South African Bird Atlas Project (SABAP) is another well-known local citizen science project whereby bird species distribution and abundance are mapped across several countries in southern Africa through digital mapping (SABAP2 2023). Cape Citizen Science is a project based in the City of Cape Town, supported by the Forestry and Agricultural Biotechnology Institute, with a focus on utilizing citizen science to collect information on plant diseases and insect pests, to protect the flora of the greater Cape floristic region (Cape Citizen Science 2023). These are but a few of the many well-known projects which use citizen science for data collection. Though the above-mentioned projects are successful on both global and local scales, certain data collection projects may require location-specific information. In the projects listed above, a mobile application is commonly used as the platform for users to collect data; however, these applications lack a geospatial component whereby a device’s precise location can be accurately and immediately provided, thereby obscuring the location from which the user is reporting. Such an automated, accuracy-controlled approach has been shown to drastically increase data accuracy, as opposed to a user manually inputting their location, where a spatial component is necessary for reporting (Atzmanstorfer et al. 2014).
Hence, a margin for location error, based on citizen inaccuracy, must be considered. Moreover, many of the above-mentioned projects depend on the aggregated answers from citizen science reports as a marker for validity and accuracy of true recordings (Swanson 2016). This too provides a margin for error, as additional validation tools are not utilized to account for possible inaccurate identification by citizens (Hara et al. 2013). Ground-truthing by fieldwork campaigns provides the most accurate form of data validation, since the locations of citizen science reports can be visited physically to confirm the presence or abundance of the research subject in focus (Hawthorne et al. 2015). Further, the validation of citizen science points in-field contributes to accuracy measurements, where citizen science collected data may be directly compared with and assessed against control data in-field (Thompson 2016; Wakie et al. 2016). Similarly, the combined use of aerial imagery and digital street views, as accessed through open-source platforms such as Google Earth and Google Street View, promotes validation opportunities to remotely confirm the presence or abundance of the research subject in focus (Ives et al. 2017; Seiferling et al. 2017). Hence, any inaccuracies found in citizen science data can be identified and corrected to ensure data accuracy within these large citizen science projects.

2.2.2 Google Earth and Google Street View as a Tool

Google Earth’s aerial imagery and Google Street View imagery are two views which complement each other when tasked with recording both vegetation and infrastructure components (Berland and Lange, 2017). Google Earth and Google Street View are publicly available and easily accessible geospatial technologies which are cost-effective and offer precise alternative methods for identifying trees and buildings at local and global scale (Li et al. 2015; Choi et al. 2022).
Google Earth constructs pictures of the surface of the earth by downloading satellite imagery in a user-friendly interface to allow both the public and researchers to ‘zoom-in’ to locations of interest whilst still viewing the satellite image (Lisle 2006). Google Street View is an interface within Google Earth and Google Maps which provides a streetscape view using individual photographs which are ‘stitched’ together to form a continuous view (Li et al. 2015). This continuous view is achieved through vehicles with mounted cameras, which physically drive through the streets of cities, towns, and villages across the world to supply these photographs (Li et al. 2015). Google Street View is a platform which provides on-the-ground street imagery at a high spatial resolution, for visual validation of sites close to roads (Visser et al. 2014; Qiu et al. 2019). Google Street View offers an interactive approach for users to manually ‘drag and drop’ a virtual pin from the software directly onto virtual streets which have been previously mapped using vehicle platforms with photography instrumentation (Anguelov et al. 2010; Li et al. 2015). Users can then maneuver in all directions across a virtual 2D and 3D plane to identify features along streets such as trees and buildings (Berland and Lange, 2017; Ives et al. 2017). In scientific research, Google Earth can be used to capture geographic information ranging from the exact coordinates of individual land features to entire polygon areas which define a perimeter (Fonji et al. 2014; Ferreira-Rodriguez et al. 2021). Google Street View enables researchers to capture geographic coordinates of individual street-level features, validated through 2D and 3D imagery viewing (Qiu et al. 2019; Choi et al. 2022). Google Street View is especially useful for the virtual identification of urban street trees as the presence and species of specific trees at multiple locations can be validated against citizen collected data (Ives et al.
2017; Seiferling et al. 2017). In tree identification, Google Street View assists with identifying numerous factors such as location of trees, number of trees, size of trees and species, to name a few (Visser et al. 2014; Berland and Lange, 2017). However, studies which use Google Street View as a remote validation tool of street-level features also utilize additional data collection tools to ensure data collection is accurate. Citizen science is a common tool used in conjunction with Google Street View, as citizens act as ‘sensors’ to collect large amounts of data from the geographic area in focus whilst also assuming the role of fieldworkers, since citizens are often required to walk various streets to validate Google Street View collected data (Hara et al. 2013). Hence, citizens are not only engaging in digital data collection through online surveys, but also serving as fieldworkers, since certain streets are physically visited and features are validated or confirmed by citizens, often by means of photographic or digital survey verification (Clark 2014; Wegner et al. 2016). Projects such as OpenTreeMap aim to create a publicly available tree inventory for each city in the world by utilizing a combined approach of open-source street view data together with citizen science to catalogue street trees (Wegner et al. 2016; Hamilton et al. 2018). The addition of citizen science to Google Street View has been shown to enhance data quality, specifically for cases where tree species identification or validation is imperative (Clark 2014; Crall et al. 2015; Roman et al. 2017).

2.2.3 Social Media as a Tool

The value of social media as a tool in vegetation recording has only recently been recognized. Social media platforms which lead in photography and visual representation are at the forefront of capturing and sharing visible vegetation with the public (Silva et al. 2018).
Common social media sites such as Facebook, Instagram and Twitter can, when granted permission, capture a user’s GPS location from their mobile device, based on either the user’s physical location or the location at which the photograph was taken (Brown and Kytta, 2014; Sullivan et al. 2014). This function is particularly useful as distinct phenological stages, such as peak flowering, may be identified through photography, ensuring correct species identification (Silva et al. 2018). There are many geotagged photographs on social media sites which circulate across the world through the flexibility of the internet (Wegner et al. 2016; Yang et al. 2019). The inter-site variability encourages the public to participate and therefore supports the citizen science approach, since these social platforms make engagement and accessibility seamless for data collection (Clark 2014; Fonji et al. 2014). This interactivity encourages users to support the creativity, diversity, and openness of various fields, including those focused on biodiversity and ecosystem monitoring as associated with plant and tree mapping (Hawthorne et al. 2015; Wakie et al. 2016; Li et al. 2015).

2.3 Combined Citizen Science and Geospatial Approaches

In urban environments, citizens acting as sensors or ground-collectors of vegetation data contribute greatly to the successful identification of various species across the landscape (Qiu et al. 2019; Dujardin et al. 2022). The fast-growing data source of citizen science provides reliable and accurate invasive plant species mapping data outputs, from both simple and complex urban environments (Hawthorne et al. 2015; Roman et al. 2017; Seiferling et al. 2017). Combined citizen science and geospatial approaches have the potential to accumulate large amounts of data over a wide geographic region within a short timeframe, which may otherwise be difficult to collect (Fonji et al. 2014; Ferreira-Rodriguez et al. 2021; Dujardin et al. 2022).
The accurate identification provided by citizen science assists with the creation and compilation of maps to visualize the spatial distribution of invasive plants upon the landscape (Hawthorne et al. 2015; Thompson 2016). This is highly advantageous in regions or countries which are underdeveloped and limited in resources (Thompson 2016; Ferreira-Rodriguez et al. 2021).

2.3.1 Spatial Distribution Mapping

Combined citizen science and geospatial approaches in invasive species mapping grant opportunities not only for effective biodiversity and ecosystem management, but also for community awareness and education (Theobald et al. 2015; Maynard-Bean et al. 2020). This awareness bridges the gap between citizens, researchers, and policymakers, as each person shares responsibility in the project: stakeholders are given an opportunity to engage with and educate the public on scientific research in the community, whilst citizens are simultaneously encouraged to express their knowledge of the local environment (Schlossberg and Shuford, 2005; Dickinson et al. 2012; Brown and Kytta, 2014; Fonji et al. 2014). Scientists and researchers support combined citizen science and geospatial approaches to invasive species mapping because data accuracy is markedly higher when data are collected by citizens who are familiar with their region, and because citizen participation is higher for projects which may directly impact their community (Bois et al. 2011; Jordan et al. 2012; Fonji et al. 2014; Theobald et al. 2015). This data accuracy relates to the secondary aim of this study, which explores citizens’ accuracy in data collection of invasive Jacaranda tree species in the CoJ.
Scientists and researchers further support the use of internet-based platforms in combined citizen science and geospatial approaches, since digital surveys, websites, and mobile applications permit large-scale instant collection, storage and sharing of both geolocation and attribute data (Bois et al. 2011; Hawthorne et al. 2015; Tang and Liu, 2016; Wakie et al. 2016). Developments in geospatial technology allow digital surveys to access Global Positioning System (GPS) location information from the device used to take the survey, to produce an exact geographic coordinate location in real-time (Buldrini et al. 2015; Wright et al. 2015; Berland and Lange, 2017; Joy et al. 2019). In biodiversity and ecosystem monitoring and management, this GPS-approach with geospatial technology is highly beneficial for locating and mapping out high-risk areas and assists with the identification of any patterns or future trends (Salem 2003; O’Donoghue et al. 2010). This GPS information can be stored in a geospatial database where several analysis tools may be used simultaneously to output information on spatial distribution, rate of spread, affected areas or resources, and possible monitoring solutions, to name a few (Le Maitre et al. 2002; Hawthorne et al. 2015). This relates to the research question of this study, whereby geospatial approaches assist with the analysis and visualization of data collected from citizen science, Google Street View and ground-truthing. Studies which use these real-time combined citizen science and geospatial approaches for data collection stem from various fields, which include but are not limited to: disaster management mapping, trend prediction mapping, biodiversity monitoring and ecosystem monitoring (Salem 2003; O’Donoghue et al. 2010; Joy et al. 2019).
The combined citizen science and geospatial approaches in these fields assist with real-time monitoring, mapping, and management networks for the identification of affected areas and potentially affected areas (Tran et al. 2009; O’Donoghue et al. 2010; Joy et al. 2019; Yang et al. 2019). Invasive species mapping projects are varied and may include careful selection and training of volunteers to correctly identify specific invasive species (Jordan et al. 2012; Wakie et al. 2016). Selection and training of volunteers increase the accuracy of positively identified invasive plant species, whilst also producing much larger sample sizes (Jordan et al. 2012; Thompson 2016). Training may also involve volunteers’ engagement with GPS devices when capturing ground control points (Thompson 2016; Wakie et al. 2016). Though training increases the accuracy of data collected from citizens, the uncertainty of the training time needed often compromises planned timeframes for project start and completion (Fuccillo et al. 2015). This is a major disadvantage for projects which depend on seasonality for the successful identification of invasive plant species (Bois et al. 2011; Clark 2014). Well-known projects which use a combined citizen science and geospatial approach for distribution mapping and monitoring include eBird, SABAP, OpenStreetMap, OakMapper, Abandoned Developments and CoCoRaHS (Sullivan et al. 2014; Tang and Liu, 2016). All six projects use online platforms to collect volunteer inputs over a large geographic and temporal scale (Haklay and Weber, 2008; Tang and Liu, 2016). eBird is a web interface that records recreational and professional bird observations from the Western Hemisphere and beyond (Sullivan et al. 2014). This is similar to the local SABAP bird mapping program in South Africa. OpenStreetMap is a web-based interface which allows citizens to report on the built environment and street conditions on a global scale (Sullivan et al. 2014).
OakMapper uses an interactive map to record sudden oak tree death in California (Connors et al. 2012). Abandoned Developments records abandoned residential construction sites throughout the south-eastern United States using an interactive map (Tang and Liu, 2016). CoCoRaHS is an interactive website used to quickly record precipitation observations across North America (Tang and Liu, 2016; Qiu et al. 2019). Two major kinds of interactive platforms were evident from these six combined citizen science and geospatial approach projects: direct and indirect (Tang and Liu, 2016). The indirect approach, as used by eBird, SABAP and CoCoRaHS, involved indirect submission of data through a checklist, whilst the direct approach, as used in OpenStreetMap, OakMapper, and Abandoned Developments, involved a direct submission of data on an interactive map (Tang and Liu, 2016; Qiu et al. 2019). eBird, SABAP and OpenStreetMap resulted in the highest interaction rates because they attracted citizens who belonged to specific interest groups (Dickinson et al. 2012; Tang and Liu, 2016). eBird and SABAP attracted ‘Birders’, who are typically people with an interest in birdwatching, whilst OpenStreetMap attracted people who were interested in mapping the built environment (Sullivan et al. 2014; Tang and Liu, 2016). The Abandoned Developments and OakMapper projects did not have specific interest groups, which resulted in low levels of citizen participation in these projects (Haklay and Weber, 2008; Tang and Liu, 2016). There are often concerns around the quality of citizen science data; however, studies show that data quality can be maintained when monitoring programs are well developed and carefully implemented (Atzmanstorfer et al. 2014; Crall et al. 2015). Volunteers may be specified or selected based on factors around the project scope to ensure data integrity (Joy et al. 2019; Dujardin et al. 2022).
The specification of volunteers within citizen science projects is imperative to reduce issues around the distribution of observers versus the distribution of species, as high concentrations of data points may reflect a higher density of observers rather than a higher density of the species being sampled (Crall et al. 2015; van der Wal et al. 2015). Additionally, the layout of surveys, websites, and applications contributes to a higher data quality from combined citizen science and geospatial approaches in invasive species mapping (Fonji et al. 2014). Many online surveys follow a common approach which either requires the citizen to identify an invasive plant species from a list of pre-configured invasive species options, or to identify a single invasive plant species only (Fonji et al. 2014; Ferreira-Rodriguez et al. 2021). Questions within surveys, websites and applications often require details on the count of invasive species seen and a geo-tagged point of location from where the sighting was made (Tang and Liu, 2016; Wakie et al. 2016). Geo-tagged points collected from these online surveys of specific invasive species can be represented as spatial distribution maps using a GIS as a tool (Hawthorne et al. 2015; Ferreira-Rodriguez et al. 2021).

2.4 Synthesis

Many conservation decisions made by cities and countries are based on imperfect or varied data (Bird et al. 2014; Barnard et al. 2017). Reliable policymaking depends on updated and accurate data on biodiversity, to predict species’ responses to local and global change (Bled et al. 2013; Altwegg et al. 2014). Although phenological and remote sensing mapping efforts exist within the CoJ, spatial distribution mapping of Jacaranda presence has yet to be accomplished. Knowledge of the spatial presence of invasive trees is pivotal not only for effective management, planning and monitoring, but also for the identification of suburbs and regions which may be at risk of high invasive tree presence.
This offers an additional opportunity to re-model the existing conservation and land management plans of the CoJ, to consider key areas which may pose future risk.

CHAPTER 3: STUDY SITE

Own photograph taken in the suburb of Saxonwold, on 12 October 2022.

3.1 Introduction

The mapping of Jacaranda tree distribution using a combined citizen science and geospatial approach in this study is focused on the CoJ, within the Gauteng province of South Africa. In late spring, the CoJ is a well-known location to experience the distinct purple flowering of invasive Jacaranda street trees (Fitchett and Raik, 2021). The CoJ is rapidly developing, with high-rise buildings and industrial zones taking over vegetated areas, leaving the urban forest at risk due to poor monitoring and management practices (Newete et al. 2022). It is important to identify high-risk areas where old Jacaranda trees may die from invasive pest species attack or may fall onto private property. This chapter describes the CoJ’s history, general geography, climate and population.

3.2 History

The CoJ is one of the largest and fastest-growing cities in Africa, with one of the world’s largest urban forests, an asset which requires attention (Schäffler and Swilling, 2013). Under the apartheid regime, segregation and the separation of land between different races each contributed to shaping the landscape of the CoJ today (Schäffler and Swilling, 2013). This segregation led to white residents living in the northern suburbs of the CoJ, with black residents occupying the southern regions (Turton et al. 2006). Major mining activity in the CoJ began in the 1800s with the discovery of gold in the Rand area (Turton et al. 2006). An influx in mining activities led to a tree-planting boom in the late 19th century to address the increase in dust, air, noise, and heat pollution in the CoJ (Schäffler and Swilling, 2013).
Quick-growing species such as Eucalyptus, Black Wattle, London Planes, Oaks, and Jacaranda were selected, to name but a few of the familiar colonial species which were planted (Turton et al. 2006). As a result, major greening efforts were abundant in the affluent northern suburbs of the CoJ, forming greenbelts and public parks (Schäffler and Swilling, 2013). Today, these trees form large parts of the CoJ’s urban forest, which comprises well over 10 million trees, both native and alien, covering approximately 16.1% of the CoJ’s area (Schäffler and Swilling, 2013; Hardy and Nel, 2015).

3.3 General Geography

The CoJ is classified as part of the South African Highveld, located in the Gauteng province of South Africa (Crétat et al. 2012). The CoJ is the largest city in South Africa, with an area of 1,645 km², and is commonly referred to as ‘The City of Gold’ after gold was discovered in the Rand region in 1886 (Vearey 2010, p.37; Harrison and Zack, 2012). The CoJ is the economic and transport hub of South Africa and contributes approximately 10% of Africa’s economic activity (Risimati and Gumbo, 2018). The main regions of the CoJ are Diepsloot, Midrand, Sandton, Alexandra, Inner City, Johannesburg South, Ennerdale, Soweto, Diepkloof, Rosebank, and Roodepoort (Risimati and Gumbo, 2018). The CoJ is administratively divided into seven regions, namely A, B, C, D, E, F, and G, each of which consists of several suburbs (Risimati and Gumbo, 2018). The exact boundaries of these regions are regularly changed and are therefore difficult to map. The CoJ promotes transit-oriented developments (TODs) in previously marginalized areas by focusing on development of economic and business sectors within Regions F, B and E. Regions G and D are defined as medium to low-income residential spaces, whilst Regions A and C are defined as medium to high-income residential spaces with some commercial activities (Risimati and Gumbo, 2018).
The CoJ is a metropolitan city with two neighboring metropolitan cities, namely the City of Tshwane (commonly known as Pretoria) and the City of Ekurhuleni (Fitchett and Raik, 2021). The CoJ is situated along the watershed of the Witwatersrand and depends on the Lesotho Highlands Water Project (LHWP) transboundary transfer scheme as its main water supply source (Mguni et al. 2022). The Witwatersrand is at the headwaters of two major river basins, namely the Limpopo and the Orange River basins, where any pollution occurring in these basins impacts activities downstream (Turton et al. 2006). Hence, capping of the LHWP transboundary transfer scheme by means of a maximum water capacity limit, until the second phase of the LHWP is completed, adds to water availability pressure in the CoJ (Mguni et al. 2022). The CoJ has a highly variable topography, with the central regions of the CoJ found at an altitude of 1,763 m above sea level and northern regions found at lower altitudes, resulting in different air temperatures between these regions (Newete et al. 2022; Souverijns et al. 2022).

Figure 3.1: Study site map of the CoJ showing the main regions of Diepsloot, Midrand, Sandton, Alexandra, Inner City, Johannesburg South, Ennerdale, Soweto, Diepkloof, Rosebank, Roodepoort and adjacent municipalities.

3.4 Climate

The CoJ has a temperate climate with mean annual temperatures of 22°C (Dyson et al. 2015). Solar radiation and air temperature contribute more to heat stress in the CoJ than humidity does (Souverijns et al. 2022). The variable topography, with central suburbs found at higher altitudes than northern suburbs, contributes to the different air temperatures experienced across these regions (Hardy and Nel, 2015). The central regions of the CoJ emit the highest concentrations of heat due to high building density, increased anthropogenic activity and low vegetation presence (Souverijns et al. 2022).
Urban heat island stress is counteracted by the presence of vegetation, which assists with detoxifying the air and providing shaded cover which lowers temperatures in the vicinity (Hardy and Nel, 2015). The CoJ experiences a mean annual rainfall of approximately 600 mm, with most of the rainfall occurring during the summer months between October and March (Crétat et al. 2012). Rainfall decreases from the east to the west of the country, as influenced by the warm Agulhas current and presence of the Indian Ocean on the east and the cooler Benguela current from the Atlantic Ocean on the west (Crétat et al. 2012). Summer rainfall may lead to intense Highveld thunderstorms which may cause hazardous flash flooding due to poor infiltration rates of the urban environment’s artificial surfaces (Schäffler and Swilling, 2013). The low mean annual rainfall of the CoJ, combined with high evaporative potential and the variability of rainfall, results in South African rivers having among the lowest conversion of mean annual rainfall to mean annual runoff (Turton et al. 2006).

3.5 Population

The CoJ has a current population of approximately 6 million people with an approximate annual growth rate of 2% (United Nations 2022). Population growth and urbanization in the CoJ contribute to changes in the quality and quantity of urban green spaces (Abutaleb et al. 2021). A rapidly growing population leads to decreased availability of land for ‘green spaces’ in the City (Rasmeni and Madyira, 2019). An increase in population, driven by both domestic and international migration, translates to a mounting demand for service infrastructure to meet the growing needs of the City (Abutaleb et al. 2021). The spatial distribution of highly populated suburbs in the CoJ is influenced by apartheid, where urban growth was controlled, and the City was segregated along racial lines (Todes 2012). Spatial change in the CoJ has been rapid after 1994.
In the CoJ, the northern and western edges of the City have higher population densities, following the development of popular residential complexes (Todes 2012). As a result, suburbs which are highly populated usually offer reduced vegetation cover. Rapidly developing suburbs within the CoJ such as Sandton, Melrose Arch and Midrand are just a few examples of areas where increased urbanization leads to reduced presence of vegetation (Mudede et al. 2020).

CHAPTER 4: METHODOLOGY

Own photograph in the suburb of Northcliff, on 12 October 2022.

4.1 Introduction

This dissertation is the first to map ground-level Jacaranda tree spatial distribution in the CoJ. The primary aim of this study is to map the spatial distribution of Jacaranda trees in the CoJ, through a combination of citizen science, ground-truthing using a GIS, and Google Street View techniques. The secondary aim is to explore the accuracy of citizen science in mapping distinct tree species. These aims lead to two research questions: first, “what is the spatial distribution of Jacaranda trees in the CoJ?”; second, “how effective is the geospatial approach in the combined citizen science, Google Street View, and ground-truthing methods of data collection?” The combined approach requires the integration of processes to ensure harmony between the three data sources. This chapter outlines the data acquisition process from two field campaigns, Google Street View efforts and survey responses collected from citizen science, as supported by Esri South Africa. Thereafter, methodological processes for each dataset will be discussed to determine any relationships between data acquisition procedures and resulting patterns or trends. Additionally, the calculation process for citizen science data accuracy will be discussed.
4.2 Data Collection

As this research study aims to produce the first spatial distribution map of Jacaranda tree presence in the CoJ, initial ground-based data acquisition was imperative to the success of the study, since this would form the foundation to identify any data gaps or inaccuracies in data collected through citizen science (Section 2.2.1). Two field campaigns for ground-data collection were necessary for the collection of control points for validated Jacaranda tree presence in the CoJ. In the absence of research studies which use geospatial technology as a tool for invasive tree mapping in the CoJ, a secondary data collection technique to address any missing data from fieldwork campaigns was important. Google Street View served this purpose, enabling virtual data collection of confirmed Jacaranda trees in the CoJ (Section 2.2.2). These two data collection techniques ensured digital standardization of data, so geopoints could be processed and integrated into digital maps (Crall et al. 2015). Citizen science data collection was carried out through the use of digital surveys (Vahidi et al. 2021). This digital component granted seamless transition of both attribute and spatial data, which were collected from individual users and stored on online geospatial databases (Figure 4.1).

Figure 4.1: Methodology diagram

4.2.1 Ground-truth Data

Two field campaigns were carried out from 5-11 July 2022 and 11-17 October 2022 to collect validated Jacaranda tree locations. Based on extensive research prior to field campaigning, a preselection of suburbs with a large number of Jacaranda trees guided the first field campaign in July 2022 (Section 2.1.1; Newete et al. 2022). Jacaranda tree prevalence varies per suburb, with some suburbs having a high prevalence, with Jacaranda trees lining either side of multiple streets, whilst other suburbs have a very low prevalence, with Jacaranda trees located sporadically across the suburb.
Some suburbs, especially the more recently developed suburbs, do not have any Jacaranda trees at all. Since the two field campaigns were conducted during the seasons of winter and spring, Jacaranda trees varied in physical appearance. In winter, Jacaranda trees were identified by their lush green bipinnately compound leaves, which made these trees stand out against the dry winter landscape (Figure 4.2). Similarly, field campaigns during spring blossoming facilitated easy identification of the trees, as Jacaranda trees blossom with bright purple flowers.

Figure 4.2: Own photograph during fieldwork showing a) winter-time green tree, b) winter-time green leaf, c) spring-time purple tree, and d) spring-time purple flower of Jacaranda trees.

Data from field campaigns were captured as geopoints using software and hardware provided by Esri South Africa. Observations of Jacaranda tree presence were made by physically driving to and walking along individual streets per suburb and capturing exact coordinates of Jacaranda tree presence using both the Cedar CT8 Rugged Tablet (CT8 Tablet) and Garmin Glo (Garmin) GPS device, together with Esri ArcGIS QuickCapture software (Figure 4.3).

Figure 4.3: Own photograph of CT8 Tablet and Garmin device provided by Esri South Africa.

A manual log was kept of individual street names which were not completed during the allocated day of fieldwork. These incomplete streets would either be visited the next day or validated remotely using Google Street View. In-field navigation to different streets was assisted by navigation services provided by Google Maps (Fonji et al. 2014). Up to six individual street names were added as ‘stops’ within a single Google Maps journey for routes of ‘best travel’ to be mapped.
The first and last ‘stop’ or street name within Google Maps served as anchor points to ensure fieldwork was conducted within the boundary of the suburb in focus, as navigation without Google Maps became disorientating in larger suburbs or within streets with similar names to others. The extensive field campaigning for control points of confirmed Jacaranda trees formed part of this study’s main objective to have enough control points of known Jacaranda tree presence to measure citizen science data against, and therefore test the accuracy of citizen science data.

4.2.1.1 Field Devices and Software

The CT8 Tablet, which is commonly used with Esri ArcGIS products, is a durable field device powered by Android 8.1 with Google mobile services included (Juniper Systems 2019; Esri 2022a). The CT8 Tablet has a GPS accuracy of between 5-8 m under a dense tree canopy, due to an external GNSS antenna attachment (Juniper Systems 2019). The handheld Garmin device boosts the GPS accuracy of the CT8 Tablet with space-based global navigation satellite system (GLONASS) receivers (Garmin 2023). The Garmin device has a GPS accuracy of between 1-3 m under a dense tree canopy (Garmin 2023). The Garmin device connects wirelessly through Bluetooth to the CT8 Tablet, and these devices together form the hardware used to collect ground-truthing data with an accuracy averaging between 1-3 m under a dense tree canopy. Esri’s QuickCapture application was used for the rapid collection of individual geopoints. QuickCapture is an application which can be easily configured by the user for a tailored fieldwork experience (Esri 2023a). For this study, a template for reporting was used and configured as an application where one-touch button activations were utilized. This configuration process involved the deconstruction of a reporting template, and a reconstruction which was fine-tuned and tailored for optimal user functionality for this research study.
The first button activates a drop-down list of pre-filled regions within the CoJ, so a specific region could be selected during fieldwork. The second button would capture the exact geographic co-ordinates from where the button was pressed in-field. Thereafter, a secondary screen would open to display a high-resolution aerial imagery map, sourced from Esri South Africa, within the application. The Landsat 8 image map would display every recorded point for Jacaranda tree presence in the form of geopoints. This application layout was chosen for its high efficiency and simplicity to easily capture a large number of geopoints in-field with very high accuracy. A third button provided a live-tracking route option. This tracking option was a valuable guide at the end of each day of fieldwork, to keep a record of streets and suburbs visited. All geopoints with live tracking routes were then configured to automatically save every 30 seconds. Automatic uploads at 5-minute intervals were configured for all geopoints to upload onto Esri’s ArcGIS Online for storage, management, and analysis. ArcGIS Online is a cloud-based mapping and analysis solution used for map creation, data analysis, and sharing of data, whether in-field or behind a machine (Esri 2023b; Figure 4.4).

Figure 4.4: Own photograph during fieldwork in Soweto, showing three-button layout of QuickCapture application.

All collected geopoints were accessed through ArcGIS Online at the end of each day of fieldwork. In cases where streets were covered more than once (Section 6.6), these duplicated geopoints were filtered and removed using Esri’s Map Viewer. Map Viewer is an interactive web mapping and data visualization application for creating, exploring, and saving web maps which is available through ArcGIS Online (Esri 2023c). Live route tracking data was removed from the geopoint dataset to appear as its own separate dataset.
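For illustration only, the removal of duplicated geopoints described above could be approximated outside Map Viewer with a simple distance tolerance. The following Python sketch is not the workflow used in this study: the function names and the 1 m tolerance are assumptions introduced for the example, and coordinates are assumed to be in decimal degrees.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two decimal-degree points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drop_duplicates(points, tolerance_m=1.0):
    """Keep a point only if it lies further than tolerance_m from every
    point already kept (hypothetical tolerance, not from the study)."""
    kept = []
    for lat, lon in points:
        if all(haversine_m(lat, lon, k_lat, k_lon) > tolerance_m
               for k_lat, k_lon in kept):
            kept.append((lat, lon))
    return kept


if __name__ == "__main__":
    # Two captures of the same tree plus one tree further along the street.
    geopoints = [(-26.1600, 28.0300), (-26.1600, 28.0300), (-26.1700, 28.0300)]
    print(drop_duplicates(geopoints))
```

A pairwise comparison of this kind is adequate for a few thousand geopoints; a larger dataset would likely call for a spatial index.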
Geopoints were processed for geographic projection to comply with decimal degree standardization throughout the project (Fonji et al. 2014).

4.2.2 Virtual Campaign Data

Virtual campaigns were carried out intermittently from August 2022 to November 2022, with Google Street View as the chosen tool for virtual validation of Jacaranda trees along streets in the CoJ (Section 2.2.2). In this study, a single Google Street View capture contained at least one Jacaranda tree (Section 5.3). Once a capture was taken, it was named chronologically in the software, and the spacebar on a keyboard was pressed once to move in equal intervals down a virtual street. If a view did not contain Jacaranda trees, a capture was still taken and instead named “None” to ensure a log existed for streets visited without Jacaranda tree presence. This process was repeated until the entire street was covered, and ended when an entire suburb was complete. Google Street View captures were manually categorized into virtual folders which were named after the suburb in which they were captured. Entire virtual folders per suburb were saved automatically in Google Earth Pro software. Completed suburbs were stored as Keyhole Markup Language (KML) files within Google Earth Pro (Ives et al. 2017). KML files are a universal format used for storing geographic data; however, these file types are not easily read by ArcGIS products (Esri 2023d). Instead, KML files were uploaded to Esri’s ArcGIS Earth, where they could be saved as compressed files known as KMZ files. ArcGIS Earth is an interactive 3D experience similar to Google Earth Pro, except it does not yet have functionality for street-level analysis (Esri 2023d). In this study, ArcGIS Earth was useful as a conversion and sharing tool, since converted KML files could be uploaded to ArcGIS Online for storage.
This ensured data integrity and standardization across Esri products, since field campaign and virtual campaign data were uploaded to an online geodatabase and could be accessed through ArcGIS Online (Ferreira-Rodriguez et al. 2021). This systematic approach to the rapid collection and efficient storing of Google Street View captures assists with accuracy assessment efforts, since these Google Street View captures form part of the control points against which citizen science data will be compared. Moreover, storing Google Street View captures separately per suburb name assists with detailed accuracy comparisons in cases where citizens have captured Jacaranda tree locations in suburbs which were not covered by ground-truthing efforts. Hence, careful attention is paid to mapping the spatial distribution of possible Jacaranda tree presence in these suburbs, since ground validation points may not exist.

4.2.3 Citizen Science Data

Citizen science data was obtained through digital surveys which were purpose-built, refined, and tailored for this research project. These digital surveys were produced using Esri’s Survey123 Connect. Survey123 Connect is a sophisticated desktop application which permits the user to author surveys in a Microsoft Excel spreadsheet, which is coded with questions, instructions, restrictions, geotagging, and prompts, each enabled through the functionality of Survey123 Connect (Esri 2023e). In this study, an advanced survey format was required whereby each question required tailored styling and functional formatting to guide the user through a series of geospatial instructions. This formatting included ‘drop-down’ lists, ‘select one’ or ‘select many’ options, ‘required questions’ with prompt or instruction messages for the user, and accuracy warnings for geotag questions, among others. Therefore, template survey layouts which exist on ArcGIS Online could not be used.
Survey123 Connect automatically creates the digital survey, with styling and layout as specified in the coded spreadsheet, and links it to the ArcGIS Online portal so that all responses from users are stored and managed digitally (Esri 2023e). Furthermore, Survey123 Connect can be restricted with GPS warnings to ensure a user can only capture a geotag once their device is within a specific accuracy zone. To ensure geolocation accuracy in this study, surveys were configured with a 7 m accuracy threshold and would not allow the user to capture a geopoint if their accuracy was outside this range. Survey123 Connect automatically generates both a survey link which can be accessed using any web browser, and a Quick Response (QR) code which smart devices may scan from either a paper or digital source to automatically redirect the user to a web browser to view the survey (Esri 2023e). Both digital links and QR codes assisted with the dissemination of the survey to a wide and varied sample of respondents through both snowball-sampling and convenience-sampling approaches, in line with this study’s objective to make the survey accessible through social media platforms for citizens of the CoJ to participate in Jacaranda tree data collection (Section 2.2.3). Snowball-sampling was useful in cases where a single user passed the survey link to others, while convenience-sampling was useful since social media formed a major part of attracting a wider audience and increased attention for survey completion. The five social media platforms of WhatsApp, Instagram, Facebook, LinkedIn, and Twitter would each result in different trends of responses from citizens of the CoJ, since each of these platforms distributes website links differently.
Instagram was the preferred platform to share digital survey links, as accompanying Jacaranda photography could be posted along with main headings or attractive text directed at the CoJ audience, in an effort to easily attract a CoJ citizen’s attention. Additionally, physical posters were created for residents of the CoJ to scan the QR code and access the survey. Since surveys were coded and created for this research project, and as geotagging was a key component of the survey, a decision was made to target smart devices for survey collection instead of paper-based approaches. The use of social media platforms and QR codes to disseminate the survey ensured most users who received the survey had access to smart devices that could enable their participation.

4.2.3.1 Social Media Data

Social media assisted with identifying the best days and times of day to share survey links, which correlated with increased user activity around Jacaranda photography. These data trends were collected through manual logging in a Microsoft Excel spreadsheet. Select Jacaranda season inspired events, such as the “Jacaranda In Your Pocket” photo competition, “Johannesburg Free Walking Trails” for Jacaranda season walks in the City and “I Heart Jhb” for information on Jacaranda markets, among others, were logged in Microsoft Excel, capturing key dates for upcoming public events, competition open and close dates, and any information on the location of events. These dates formed the framework for the social media drive, ensuring that survey links were shared in line with upcoming dates of interest. The trend analysis of dates which corresponded to key interest from users was an iterative process which required constant monitoring not only for key dates and events, but also for the on-time creation of posts.
These posts assisted with sharing the survey link to users whilst simultaneously drawing increased attention to the ‘Jacaranda Mapper’ Instagram account, which had unique and attractive Jacaranda content linked to the survey for this research project. The response count from users on days where the survey link was shared was monitored on an online dashboard, accessible through ArcGIS Online. Survey123 Connect integrates with ArcGIS Online not only through automatic saving of users’ submitted responses, but also by providing a virtual dashboard where survey response statistics are updated in real time (Esri 2023e). The strategic monitoring of social media trends resulted in a total of 203 responses and 488 geotags for Jacaranda tree location collected during the 75-day survey collection period.

4.3 Data Analysis

Data standardization was ensured since all data from field campaigns, Google Street View, and citizen science surveys were stored and managed entirely on ArcGIS Online. Although data were transferred from Google Earth Pro to ArcGIS Earth, the strict decision to stay within Esri’s product suites not only ensured data integrity, but also ease of access across the several Esri products and applications used in this study (Ferrara 2020). This was further driven by the use of Esri’s well-known ArcGIS Pro software, which was pivotal for data analysis and visualization within this study as it served as an integration and analysis platform for all collected geopoints from citizen science, Google Street View and ground-truthing efforts.

4.3.1 Geoprocessing

ArcGIS Pro is a powerful single desktop GIS application which includes world-leading geospatial tools for data visualization, advanced analysis, and map creation (Esri 2023f).
ArcGIS Pro hosts a diversity of built-in spatial analysis tools, which each assisted in the analysis of data collected from citizen science, Google Street View and ground-truthing efforts for the successful creation of several high-resolution spatial distribution maps for Jacaranda presence in the CoJ.

4.3.1.1 Structured Query Language

Structured Query Language (SQL) is a tool widely used for spatial modelling and database management in a GIS (Lei 2021). SQL expressions in Esri’s ArcGIS Pro adhere to standard SQL syntax with common use of Boolean statements (Esri 2022b). SQL is particularly useful for relational databases, which was the format used to relate attribute citizen science data to spatial citizen science data collected from geotagging efforts (Aufaure-Portier 1995). Here, several ‘select layer by attribute’ SQL expressions were used to isolate data in unique pairs which were specific to this study. The use of SQL was essential in this study, since attribute questions and the respective unique answers per user (as adopted from citizen science survey responses) could be selected in isolation or cumulatively with associated spatial information. This resulted in unlimited data combinations for deeper regression analysis (Aufaure-Portier 1995).

4.3.1.2 Joins and Relates

Data collected from citizen science surveys were exported from ArcGIS Online in Microsoft Excel format. Thereafter, the Excel spreadsheet was cleaned and standardized to meet formatting requirements for use in ArcGIS Pro. Three Excel spreadsheets were compiled from citizen science responses: the first with attribute information, the second with spatial information of collected geotags from the citizen’s street and other streets, and the third with spatial information from geotags linked to exact location tags.
The separation of these datasets was imperative for analysis of citizen science accuracy, where non-spatial data such as a citizen’s ‘length of stay’ or ‘suburb of residency’ in the CoJ could be isolated and directly compared to the same user’s geotagging accuracy to find any trends which support the study’s secondary aim. A series of spatial joins were made between the three spreadsheets based on both Global IDs and Globally Unique Identifiers (GUIDs), which each represent an individual user through a unique identity code (Esri 2023e). These identity codes were extracted and linked across all three spreadsheets, enabling spatial analysis for single or multiple users simultaneously (Godfrey and Stoddart, 2018). These advanced spatial joins provided an opportunity to cross-compare answers from various users who contributed to the citizen science dataset, for new trend possibilities to be identified.

4.3.1.3 Spatial Analysis

Once all spreadsheets were linked based on Global IDs or GUIDs, geometric analysis to correct spatial co-ordinates from the three datasets was done (Suzuki 2005). Since ground-truth geopoints were collected in-field and configured through QuickCapture, and citizen science data were collected through Survey123 Connect, geometric analysis ensures spatial standardization between the datasets before spatial analysis can be carried out (Breytenbach 2016). In this study, the Lo29 reference system based on the WGS84 reference ellipsoid was the geo-reference framework of choice, since Esri software re-projects Lo29 co-ordinates with accuracy, whilst enforcing strict pixel-to-pixel registration with the gridded reference provided from ground-truthing data (Breytenbach 2016).

Several spatial analysis tools such as proximity, intersect, distance, X-Y to line, buffer, and near-distance, among others, were used.
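As a rough illustration of the join and near-distance steps described above, the following Python sketch links attribute rows to geotag rows on a shared identifier and then measures the distance from each geotag to its nearest control point. It is not the ArcGIS Pro workflow used in this study: the field names (`globalid`, `years_in_coj`, `lat`, `lon`), the sample values and the function names are all hypothetical, and distances are computed with the haversine formula on decimal-degree coordinates rather than in a projected system such as Lo29.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two decimal-degree points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def join_by_id(attributes, geotags, key="globalid"):
    """Inner join: attach each respondent's attribute row to their geotags.
    Geotags whose identifier has no attribute row are dropped."""
    by_id = {row[key]: row for row in attributes}
    return [{**by_id[g[key]], **g} for g in geotags if g[key] in by_id]


def near_distance_m(geotag, controls):
    """Distance in metres from one geotag to its nearest control point."""
    return min(haversine_m(geotag["lat"], geotag["lon"], c_lat, c_lon)
               for c_lat, c_lon in controls)


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    attributes = [{"globalid": "a1", "years_in_coj": 12}]
    geotags = [{"globalid": "a1", "lat": -26.1450, "lon": 28.0400}]
    controls = [(-26.1450, 28.0401), (-26.2000, 28.0000)]
    for row in join_by_id(attributes, geotags):
        print(row["globalid"], row["years_in_coj"],
              round(near_distance_m(row, controls), 1))
```

Keeping the identifier join separate from the distance calculation mirrors the separation of attribute and spatial spreadsheets described in Section 4.3.1.2, so that any respondent attribute can be compared against that respondent's geotagging distance.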
Esri offers extensive and advanced geoprocessing tools built on various statistical approaches such as Pearson’s correlation coefficient, Cohen’s kappa coefficient, multiple regression, and Thiessen polygons, among others (Esri 2022b). Geoprocessing tools are selected and applied on a case-by-case basis. In this study, several geoprocessing toolboxes were utilized, such as the ‘Spatial Statistics Toolbox’, ‘Geostatistical Analyst Toolbox’ and ‘Spatial Analyst Toolbox’, to name a few. Each toolbox contains an array of individual tools which each perform advanced statistical analysis once the dataset has been formatted correctly from the initial collection, cleaning, filtering, and storage of data (Esri 2022b). ArcGIS Pro automatically processes individual or multiple spatial analyses at any given time, as a supervised classification (Esri 2023f). Proximity analysis from citizen science geotags to the nearest ground-truth geopoints from both ground-truthing and Google Street View datasets was completed. For this, a new dataset was created, comprising combined ground-truth geopoints and Google Street View geotags, as control points for validated Jacaran