Comparative study of machine learning techniques for loan fraud prediction

Date

2022

Abstract

Loan fraud is a major and growing problem that costs millions every year. Although fraud prediction in general has been researched extensively, loan fraud prediction has received less attention, possibly because little data is available for analysis. This research compared machine learning algorithms for predicting fraudulent practices in loan administration, with the aim of finding the techniques that give the most accurate results. The dataset used is the Kaggle fraud detection dataset for the year 2019. Four machine learning algorithms were evaluated: random forest, extreme gradient boosting, adaptive boosting, and multilayer perceptron. In the first attempt, extreme gradient boosting performed best of the four models, with an area under the curve (AUC) score of 0.74. In the second attempt, adaptive boosting performed best with an AUC score of 1.00, followed by extreme gradient boosting with an AUC score of 0.94.
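
The abstract ranks models by AUC score. As a rough illustration of what that metric measures, the following minimal sketch computes it in plain Python, assuming that "area under the curve" refers to the ROC curve. The function name roc_auc, the labels, and the scores for model_a and model_b are hypothetical toy values chosen for illustration; they are not taken from the dissertation or its dataset.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC: the probability that a randomly chosen
    positive (label 1) receives a higher score than a randomly chosen
    negative (label 0), with tied scores counted as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy data only (hypothetical): 1 = fraudulent loan, 0 = legitimate loan.
labels = [0, 0, 1, 1]
toy_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}

# Score every model with the same metric and pick the highest AUC.
aucs = {name: roc_auc(labels, scores) for name, scores in toy_scores.items()}
best = max(aucs, key=aucs.get)
print(aucs, best)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of fraudulent from legitimate cases, so comparing models this way only requires scoring each of them on the same held-out labels.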

Description

A dissertation submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, University of the Witwatersrand, Johannesburg, 2022

Keywords

Machine learning techniques, Loan fraud prediction, Area Under the Curve score (AUC)
