3. Electronic Theses and Dissertations (ETDs) - All submissions
Permanent URI for this community: https://wiredspace.wits.ac.za/handle/10539/45
Automatic speech feature extraction using a convolutional restricted Boltzmann machine
(2017) Anderson, David John

Restricted Boltzmann Machines (RBMs) are a statistical learning concept that can be interpreted as Artificial Neural Networks. They are capable of learning, in an unsupervised fashion, a set of features with which to describe a data set. Connected in series, RBMs form a model called a Deep Belief Network (DBN), in which each layer learns abstract combinations of the features from the layers below it. Convolutional RBMs (CRBMs) are a variation on the RBM architecture in which the learned features are kernels that are convolved across spatial portions of the input data to generate feature maps, each indicating whether a feature is detected in a given portion of the input.

Features extracted from speech audio data by a trained CRBM have recently been shown to compete with the state of the art on a number of speaker identification tasks. This project implements a similar CRBM architecture in order to verify previous work, as well as to gain insight into Digital Signal Processing (DSP), Generative Graphical Models, unsupervised pre-training of Artificial Neural Networks, and Machine Learning classification tasks. The CRBM architecture is trained on the TIMIT speech corpus, and the learned features are verified by using them to train a linear classifier on tasks such as speaker genetic sex classification and speaker identification. The implementation is shown quantitatively to learn and extract a useful feature representation for the given classification tasks.
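
The feature-map step described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function name `crbm_feature_maps`, the array shapes, the kernel width, and the sigmoid hidden units are all assumptions chosen for illustration, and the sketch omits training (for example contrastive divergence) and any pooling layer. Each kernel is slid along the time axis of a spectrogram-like input, and the result is a map of activation probabilities, one value per kernel per position.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as the hidden-unit activation (an assumption)."""
    return 1.0 / (1.0 + np.exp(-x))


def crbm_feature_maps(v, kernels, hidden_bias):
    """Compute hidden activation probabilities for a 1-D convolutional RBM layer.

    v           : (n_frames, n_channels) input, e.g. spectrogram frames
    kernels     : (n_kernels, width, n_channels) learned filters
    hidden_bias : (n_kernels,) one bias per kernel
    returns     : (n_kernels, n_frames - width + 1) feature maps
    """
    n_frames, n_channels = v.shape
    n_kernels, width, k_channels = kernels.shape
    if k_channels != n_channels:
        raise ValueError("kernel channels must match input channels")
    if width > n_frames:
        raise ValueError("kernel is wider than the input")

    n_positions = n_frames - width + 1
    # Gather every window of `width` consecutive frames: (n_positions, width, n_channels).
    windows = np.stack([v[t:t + width] for t in range(n_positions)])
    # Slide each kernel over the windows ("valid" cross-correlation; a true
    # convolution differs only by flipping the kernel along the time axis).
    pre_activation = np.einsum("twc,kwc->kt", windows, kernels)
    return sigmoid(pre_activation + hidden_bias[:, None])


# Hypothetical sizes: 100 frames of 40 channels, 8 kernels spanning 6 frames each.
rng = np.random.default_rng(0)
v = rng.standard_normal((100, 40))
kernels = 0.01 * rng.standard_normal((8, 6, 40))
maps = crbm_feature_maps(v, kernels, np.zeros(8))
print(maps.shape)  # (8, 95)
```

Values near 1 in a row of `maps` mark positions where that kernel's pattern is strongly present in the input. A linear classifier of the kind the abstract mentions could be trained on a summary of these maps, though how the thesis pools or summarises them is not stated here.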