3. Electronic Theses and Dissertations (ETDs) - All submissions
Permanent URI for this community: https://wiredspace.wits.ac.za/handle/10539/45
12 results
Search Results
Item Deep learning based semantic segmentation of unstructured outdoor environments (2020) Ndlovu, Nkosinathi
The past decade has seen increased interest in, and demand for, autonomous vehicles. Complete and successful autonomy of mobile systems such as unmanned ground vehicles (UGVs) depends on perception. Perception is the ability of an agent to semantically interpret its operational environment through vision. Deep learning approaches have recently been consistently more successful than traditional/classical methods in perception and vision tasks. This is primarily due to their non-reliance on selected hand-crafted features; they adopt a more robust and generalised learned-feature approach through representation learning. Convolutional Neural Networks (CNNs) have been widely used towards the goal of scene parsing and perception. In this research we focus on using CNN architectures for semantic segmentation in unstructured outdoor environments for autonomous navigation. Our first contribution is to provide a novel dataset for unstructured outdoor domains: the CSIR dataset. We seek to establish whether it is possible to semantically segment an unstructured scene into pre-defined classes such as grass, road, sky, trees, etc. This is achieved through an exhaustive comparative study of state-of-the-art CNN architectures on this dataset and on a similar additional dataset: the Freiburg Forest dataset. Furthermore, we seek to establish whether there are any benefits in using transfer learning and pre-trained weights in training CNN architectures for semantic segmentation with limited datasets. Lastly, we identify the important architectural factors necessary for successful semantic segmentation in unstructured outdoor scenes.

Item Learning to backpropagate (2020) Eyono, Roy Henha
The backpropagation algorithm is regarded as the de facto standard for gradient optimization in artificial neural networks.
Since the conception of the method, modifications of the algorithm have been proposed. Adaptive learning rate methods are examples of custom modifications to the backpropagation equation, as they scale the gradient equation during training. Existing gradient modifications are largely based on theoretical fundamentals of gradient optimization, but few methods optimize for modifications that are based on data. We are motivated by the idea of discovering better custom gradient update equations for gradient optimization. In this paper, we present our parametrized backpropagation learning framework (PBLF), which learns modifications of the backpropagation gradient update equation for stochastic gradient optimization. We achieve this by optimizing parts of the backpropagation equation to produce custom gradient update equations for gradient optimization. We evaluate our custom equations by training our target network on a validation dataset. In our dissertation, we provide empirical analysis and evidence to support PBLF as a competitive alternative to standard backpropagation. In our experiments, we report competitive empirical performances on CIFAR10 with our custom gradient update equations sampled from PBLF. Our data-driven method offers promising custom update equations for gradient optimization.

Item Using ensemble learning for the network intrusion detection problem (2019-08-01) Kalonji, Roland Mpoyi
Nowadays, most organizations and platforms employ an intrusion detection system (IDS) to enhance their network security and protocol systems. The IDS has therefore become an essential component of any network system; it is a tool with several applications that can be tuned to specific content in a network by identifying various accesses (normal or attack). However, a network intrusion detection system (NIDS), which focuses on revealing suspicious activities, is not effective in solving various problems such as identifying false IP packets and encrypted traffic.
Hence, this work investigates the use of ensemble learning to solve these types of network intrusion detection problems (NIDPs). Random Forest (RF), Decision Tree (DT) and Support Vector Machine (SVM) are introduced as classifiers based on the Boruta and Principal Component Analysis (PCA) algorithms. In general, the main difficulties in using ensembles for the intrusion problem are to minimize false alarms and to maximize detection accuracy (Anuar et al., 2008). Additionally, the NIDP is divided into five categories, namely the detection of probe attacks, denial of service, remote to local, user to root and normal instances. Each problem is examined by one of the three aforementioned classifiers. In tackling these problems, the three classifiers achieved competitive results compared to the works conducted by Balon-Perin (2012), Zainal et al. (2009) and Kevric et al. (2017). The results revealed that ensemble learning achieved more than 99% accuracy in demarcating attacks from normal connections. In particular, RF, DT and SVM made it possible to safeguard the NIDS from known and unknown attacks by developing reliable techniques. The KDD99 and NSL KDD datasets have been used to implement and measure the system performance (Fan et al., 2000; Dhanabal and Shantharajah, 2015).

Item Design of an adaptive dynamic inversion-based neurocontroller for a tandem-controlled agile surface-to-air missile (2019) Phahlamohlaka, K J
Conventional plant-model dependent controller design approaches such as gain scheduling work well for a simple flight envelope and airframe geometry. For a complex flight envelope and airframe, the approach results in a costly exercise to obtain a high-fidelity plant model. In this study an adaptive controller design approach is taken for an agile dual aerodynamically controlled (DAC) missile autopilot. An adaptive controller approach does not require an exhaustive plant model and has the capability of accommodating plant uncertainty and unmodelled dynamics online.
A direct model reference adaptive control (MRAC) is investigated with different adaptive rules for the DAC missile. A two time-scale separation dynamic inversion controller with a proportional-integral controller was used as the baseline controller of the proposed MRAC controller. The two time-scale separation controller was benchmarked against a gain-scheduled three-loop autopilot. A radial basis function neural network (RBFNN) is used to approximate the unmatched uncertainty of the missile dynamics. Adaptation of the uncertainties is done on the fast dynamics controller to ensure fast recovery. The uncertainty of the slow dynamics is handled with a proportional-integral (PI) controller. The following adaptive rules were used with the RBFNN adaptive loop recovery ALR

Item Ensembles of neural networks for time series with application to climate change prediction (2018) Choma, Joshua
Ensembles of artificial neural networks combining the outputs of individual time series models may have the potential to improve overall predictive performance. Deep and modular artificial neural networks are among recently developed machine learning techniques that have been successfully applied across various domains ranging from speech recognition to image classification. Climate change prediction information is important for planning and managing the impact of global change. However, the generation of climate change predictions from physical or numerical models is computationally very intensive, often requiring supercomputing processing capabilities and producing very large volumes of data. This research focuses on the application of various ensembles of architectures of artificial neural networks (ANNs) to time series. These ensembles are applied to the outputs of six different physical climate change prediction models. The output of these ensembles can be viewed as the consensual output of the individual artificial neural network prediction models.
Six different climate change prediction models are considered for the area of Addis Ababa in Ethiopia. A single parameter, namely the maximum predicted temperature (MaxTemp) aggregated over a quarterly period, is studied. An artificial neural network is individually trained on the output of one of the six climate change prediction models. The predictive performance of different ensembles of these trained ANNs is compared to the actual averaged outputs of the climate change models. Results show that some ensembles have good predictive fidelity compared with the individual model outputs.

Item Deformable part model with CNN features for facial landmark detection under occlusion (2018) Brink, Hanno
Detecting and localizing facial regions in images is a fundamental building block of many applications in the field of affective computing and human-computer interaction. This allows systems to perform a variety of higher-level analyses, such as facial expression recognition. Facial expression recognition is based on the effective extraction of relevant facial features. Many techniques have been proposed to deal with the robust extraction of these features under a wide variety of poses and occlusion conditions. These techniques include Deformable Part Models (DPMs) and, more recently, deep Convolutional Neural Networks (CNNs). Recently, hybrid models based on DPMs and CNNs have been proposed considering the generalization properties of CNNs and DPMs. In this work we propose a combined system, using CNNs as features for a DPM, with a focus on dealing with occlusion. We also propose a method of face localization allowing occluded regions to be detected and explicitly ignored during the detection step.

Item Automatic speech feature extraction using a convolutional restricted boltzmann machine (2017) Anderson, David John
Restricted Boltzmann Machines (RBMs) are a statistical learning concept that can be interpreted as Artificial Neural Networks.
They are capable of learning, in an unsupervised fashion, a set of features with which to describe a data set. Connected in series, RBMs form a model called a Deep Belief Network (DBN), learning abstract feature combinations from lower layers. Convolutional RBMs (CRBMs) are a variation on the RBM architecture in which the learned features are kernels that are convolved across spatial portions of the input data to generate feature maps identifying whether a feature is detected in a portion of the input data. Features extracted from speech audio data by a trained CRBM have recently been shown to compete with the state of the art for a number of speaker identification tasks. This project implements a similar CRBM architecture in order to verify previous work, as well as gain insight into Digital Signal Processing (DSP), Generative Graphical Models, unsupervised pre-training of Artificial Neural Networks, and Machine Learning classification tasks. The CRBM architecture is trained on the TIMIT speech corpus and the learned features are verified by using them to train a linear classifier on tasks such as speaker genetic sex classification and speaker identification. The implementation is quantitatively proven to successfully learn and extract a useful feature representation for the given classification tasks.

Item Short-term hourly load forecasting in South Africa using neutral networks (2018) Ilunga, Elvis Tshiani
Accuracy of load forecasts is critical in the power system industry, which is the lifeblood of the global economy to such an extent that its state-of-the-art management is the focus of Short-Term Load Forecasting (STLF) models. In the past few years, South Africa faced an unprecedented energy management crisis that could be addressed in advance, inter alia, by carefully forecasting the expected load demand.
Moreover, inaccurate or erroneous forecasts may result in either costly over-scheduling or adventurous under-scheduling of energy that may induce heavy economic forfeits to power companies. Therefore, accurate and reliable models are critically needed. Traditional statistical methods have been used in STLF, but they have limited capacity to address the nonlinearity and non-stationarity of electric loads. Also, such traditional methods cannot adapt to abrupt weather changes, and thus they have failed to produce reliable load forecasts in many situations. In this research report, we built an STLF model using Artificial Neural Networks (ANNs) to address the accuracy problem in this field so as to assist energy management decision makers to run their daily operations efficiently and economically. ANNs are a mathematical tool that imitates biological neural networks and produces very accurate outputs. The built model is based on the Multilayer Perceptron (MLP), which is a class of feedforward ANNs using the backpropagation (BP) algorithm as its training algorithm, to produce accurate hourly load forecasts. We compared the built MLP model to a benchmark Seasonal Autoregressive Integrated Moving Average with Exogenous variables (SARIMAX) model using data from Eskom, a South African public utility. Results showed that the MLP model, with a mean absolute percentage error (MAPE) of 0.50%, outperformed the SARIMAX model, which had an error of 1.90%.

Item Automated parking space detection (2018) Nyambal, Julien Cedric
Parking space management is a problem that most big cities encounter. Without parking space management strategies, the traffic can become anarchic. Compared to physical sensors around the parking lot, a camera monitoring it can send images to be processed for vacancy detection. This dissertation implements a system to automatically detect and classify spaces (vacant or occupied) in images of a parking lot. Detection is done using Region-based Convolutional Neural Networks (RCNN).
It reduces the amount of time that would otherwise be spent manually mapping out a parking lot. After the spaces are detected, they are classified as either vacant or occupied. This is accomplished using Histograms of Oriented Gradients (HOG) with Linear and Radial Basis Function (RBF) Support Vector Machines (SVMs), Convolutional Neural Networks (CNNs) and a Hybrid approach. The classifiers are trained, tested and validated using data collected for this research. We compared the results of the Hybrid classifier against the CNN and SVMs. The Hybrid classifier performed better than all the others, with an accuracy of 89.36% and a precision of 82.54%, the best scores among the classifiers used. Novel contributions of this work include the new labeled database, the use of the RCNN for bay detection, and the classification of bays using the hybrid CNN and SVM.

Item The Wits intelligent teaching system (WITS): a smart lecture theatre to assess audience engagement (2017) Klein, Richard
The utility of lectures is directly related to the engagement of the students therein. To ensure the value of lectures, one needs to be certain that they are engaging to students. In small classes, experienced lecturers develop an intuition of how engaged the class is as a whole and can then react appropriately to remedy the situation through various strategies such as breaks or changes in style, pace and content. As both the number of students and the size of the venue grow, this type of contingent teaching becomes increasingly difficult and less precise. Furthermore, relying on intuition alone gives no way to recall and analyse previous classes or to objectively investigate trends over time. To address these problems, this thesis presents the WITS INTELLIGENT TEACHING SYSTEM (WITS) to highlight disengaged students during class. A web-based, mobile application called Engage was developed to try to elicit anonymous engagement information directly from students.
The majority of students were unwilling or unable to self-report their engagement levels during class. This stems from a number of cultural and practical issues related to social display rules, unreliable internet connections, data costs, and distractions. This result highlights the need for a non-intrusive system that does not require the active participation of students. A non-intrusive approach based on computer vision and machine learning is therefore proposed. To support its development, a labelled video dataset of students was built by recording a number of first-year lectures. Students were labelled across a number of affects – including boredom, frustration, confusion, and fatigue – but poor inter-rater reliability meant that these labels could not be used as ground truth. Based on manual coding methods identified in the literature, a number of actions, gestures, and postures were identified as proxies of behavioural engagement. These proxies are then used in an observational checklist to mark students as engaged or not. A Support Vector Machine (SVM) was trained on Histograms of Oriented Gradients (HOG) to classify the students based on the identified behaviours. The results suggest a high temporal correlation between a single subject’s video frames. This leads to extremely high accuracies on seen subjects. However, this approach generalised poorly to unseen subjects, and more careful feature engineering is required. The use of Convolutional Neural Networks (CNNs) improved the classification accuracy substantially, both over a single subject and when generalising to unseen subjects. While more computationally expensive than the SVM, the CNN approach lends itself to parallelism using Graphics Processing Units (GPUs). With GPU hardware acceleration, the system is able to run in near real-time, and with further optimisations a real-time classifier is feasible. The classifier provides engagement values, which can be displayed to the lecturer live during class.
This information is displayed as an Interest Map which highlights spatial areas of disengagement. The lecturer can then make informed decisions about how to progress with the class, what teaching styles to employ, and on which students to focus. An Interest Map was presented to lecturers and professors at the University of the Witwatersrand yielding 131 responses. The vast majority of respondents indicated that they would like to receive live engagement feedback during class, that they found the Interest Map an intuitive visualisation tool, and that they would be interested in using such technology. Contributions of this thesis include the development of a labelled video dataset; the development of a web based system to allow students to self-report engagement; the development of cross-platform, open-source software for spatial, action and affect labelling; the application of Histogram of Oriented Gradient based Support Vector Machines, and Deep Convolutional Neural Networks to classify this data; the development of an Interest Map to intuitively display engagement information to presenters; and finally an analysis of acceptance of such a system by educators.
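
The Interest Map described in the abstract above is not specified in this listing, so the following is only a minimal sketch of the general idea, assuming each student observation arrives as a seat position and an engagement score between 0 and 1. The function names `interest_map` and `disengaged_cells`, the grid layout, and the 0.5 threshold are all hypothetical choices made for illustration and are not taken from the thesis.

```python
from collections import defaultdict


def interest_map(observations, rows, cols):
    """Average engagement scores per seat into a rows x cols grid.

    observations: iterable of (row, col, score) tuples, score in [0, 1]
    (an assumed input format). Seats with no observations are None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r, c, score in observations:
        if not (0 <= r < rows and 0 <= c < cols):
            raise ValueError(f"seat ({r}, {c}) is outside the {rows}x{cols} grid")
        sums[(r, c)] += score
        counts[(r, c)] += 1
    return [
        [sums[(r, c)] / counts[(r, c)] if counts[(r, c)] else None
         for c in range(cols)]
        for r in range(rows)
    ]


def disengaged_cells(grid, threshold=0.5):
    """Return (row, col) seats whose mean score is strictly below threshold."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is not None and value < threshold
    ]


if __name__ == "__main__":
    # Hypothetical classifier outputs for a 2x2 seating area.
    obs = [(0, 0, 1.0), (0, 0, 0.0), (1, 1, 0.2)]
    grid = interest_map(obs, rows=2, cols=2)
    print(grid)
    print(disengaged_cells(grid))
```

Averaging per seat keeps the sketch simple; a real system would more likely smooth scores over neighbouring seats and over time before highlighting an area.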