Comparative study of Machine learning techniques for loan fraud prediction

Date
2022
Abstract
Loan fraud is a major and growing problem, and millions are lost to it every year. Fraud prediction in general has been researched extensively, but loan fraud prediction has received less attention, possibly because of a lack of data available for analysis. This research compared machine learning algorithms for predicting fraudulent practices in loan administration, in order to identify the techniques that produce the most accurate results. The dataset used is the Kaggle fraud detection dataset for the year 2019. Four machine learning algorithms were evaluated: random forest, extreme gradient boosting, adaptive boosting, and multilayer perceptron. In the first attempt, extreme gradient boosting performed best of the four models, with an area under the curve (AUC) score of 0.74. In the second attempt, adaptive boosting performed best with an AUC score of 1.00, followed by extreme gradient boosting with an AUC score of 0.94.
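The models in the abstract are ranked by AUC, so a small illustration of that metric may help. The sketch below computes AUC as the fraction of (fraudulent, legitimate) pairs in which the fraudulent case receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the dissertation or its dataset.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: chance that a random positive outscores a random negative."""
    # Split the scores by true class (1 = fraud, 0 = legitimate).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

Under this definition, a score of 0.5 corresponds to a ranking no better than chance, and 1.00 means every fraudulent case in the evaluation data was scored above every legitimate one.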
Description
A dissertation submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, University of the Witwatersrand, Johannesburg, 2022
Keywords
Machine learning techniques, Loan fraud prediction, Area Under the Curve score (AUC)