Generalizing regularization of neural networks to correlated parameter spaces

dc.contributor.author: Jarvis, Devon
dc.date.accessioned: 2022-07-29T07:32:20Z
dc.date.available: 2022-07-29T07:32:20Z
dc.date.issued: 2021
dc.description: A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science
dc.description.abstract: A common assumption of regularization techniques used with artificial neural networks is that their parameters are independently distributed. This assumption is made primarily for simplicity, or in order to enforce independence as a constraint on the model parameters. In this work we provide theoretical and empirical results showing that the independence assumption is unreasonable and unhelpful for regularization. We create and evaluate a novel regularization method, called Metric regularization, which adapts the degree of regularization for each parameter of the network based on how important that parameter is for reducing the loss on the training data. Importantly, Metric regularization accounts for the impact that a parameter has on the other model parameters when determining how important it is for reducing the loss. Thus, our novel regularization method adapts to the correlation of the parameters in the model. We provide theoretical results showing that Metric regularization has the Minimum Mean Squared Error property. We also evaluate the utility of Metric regularization empirically and find that it is damaging to the model, which is consequently unable to fit the training data effectively. We instead find that regularization methods which adaptively regularize only the parameters that are unhelpful for fitting the training data improve the generalizability of the networks without hindering performance on the training data. We provide justifications for the apparent disconnect between our theoretical and empirical results for Metric regularization, and in so doing shed some light on what causes a generalization gap in networks, as well as on the impacts of different initialization regimes used when training networks.
dc.description.librarian: CK2022
dc.faculty: Faculty of Science
dc.identifier.uri: https://hdl.handle.net/10539/33075
dc.language.iso: en
dc.school: School of Computer Science and Applied Mathematics
dc.title: Generalizing regularization of neural networks to correlated parameter spaces
dc.type: Thesis
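The abstract above describes Metric regularization only at a high level and does not give its exact formulation. The following is a minimal illustrative sketch, assuming the parameter-space metric is approximated by the empirical Fisher information matrix G, so that a quadratic penalty lam * theta^T G theta couples correlated parameters rather than treating them independently as a standard L2 penalty lam * ||theta||^2 does. The function names (empirical_fisher, metric_penalty) are hypothetical and not taken from the dissertation.

import numpy as np

def empirical_fisher(per_example_grads):
    """Empirical Fisher information G = (1/n) * sum_i g_i g_i^T,
    where g_i is the loss gradient for training example i.
    (An assumed stand-in for the dissertation's parameter-space metric.)"""
    g = np.asarray(per_example_grads)   # shape (n_examples, n_params)
    return g.T @ g / g.shape[0]         # shape (n_params, n_params)

def metric_penalty(theta, metric, lam=1e-2):
    """Quadratic penalty lam * theta^T G theta. Off-diagonal entries of G
    make the regularizer sensitive to parameter correlations, unlike the
    isotropic L2 penalty lam * ||theta||^2 (the G = identity special case)."""
    return lam * theta @ metric @ theta

# Toy usage: per-example gradients with strongly correlated components.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.9],
                   [0.9, 1.0]])
grads = rng.normal(size=(256, 2)) @ mixing
G = empirical_fisher(grads)
theta = np.array([0.5, -0.5])
print("metric penalty:", metric_penalty(theta, G))
print("plain L2 penalty:", 1e-2 * theta @ theta)

Under this reading, parameters lying along directions the loss depends on most would be penalized most heavily, which is at least consistent with the abstract's empirical finding that the method hinders fitting of the training data.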
Files
Original bundle: 1365149-MSc Dissertation_ Devon Jarvis.pdf (8.28 MB, Adobe Portable Document Format)
License bundle: license.txt (1.71 KB, item-specific license agreed upon at submission)