Generative Model Based Adversarial Defenses for Deepfake Detectors

Kavilan Dhavan, Nair

dc.contributor.author	Kavilan Dhavan, Nair
dc.contributor.supervisor	Klein, Richard
dc.date.accessioned	2024-10-13T21:51:46Z
dc.date.available	2024-10-13T21:51:46Z
dc.date.issued	2023-08
dc.description	A Research Report submitted in partial fulfilment of the requirements for the degree of Master of Science (Coursework and Research Report in Computer Science), to the Faculty of Science, School of Computer Science & Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023.
dc.description.abstract	Deepfake videos present a serious threat to society as they can be used to spread misinformation through social media. Convolutional Neural Networks (CNNs) have been effective in detecting deepfake videos, but they are vulnerable to adversarial attacks that can compromise their accuracy. This vulnerability can be exploited by deepfake creators to evade detection. In this study, we evaluate the effectiveness of two generative adversarial defense mechanisms, APE-GAN and MagNet, in the context of deepfake detection. We use the FaceForensics++ dataset and a CNN victim model based on the XceptionNet architecture, which we attack using the iterative fast gradient sign method at two different levels of ε: ε = 0.0001 and ε = 0.01. We find that both APE-GAN and MagNet can purify the adversarial images and restore the performance of the victim model to within 10% of the model’s accuracy on benign fake inputs. However, these methods were less effective at restoring accuracy for adversarial real examples and were not able to significantly restore accuracy when the adversarial attack was aggressive (ε = 0.01). We recommend that an adversarial defense method be used in conjunction with a deepfake detector to improve the accuracy of predictions. APE-GAN and MagNet are effective methods in the deepfake context, but their effectiveness is limited when the adversarial attack is aggressive.
dc.description.submitter	MM2024
dc.faculty	Faculty of Science
dc.identifier	0000-0002-5619-1470
dc.identifier.citation	Kavilan Dhavan, Nair. (2023). Generative Model Based Adversarial Defenses for Deepfake Detectors. [Master's dissertation, University of the Witwatersrand, Johannesburg]. https://hdl.handle.net/10539/41538
dc.identifier.uri	https://hdl.handle.net/10539/41538
dc.language.iso	en
dc.publisher	University of the Witwatersrand, Johannesburg
dc.rights	©2023 University of the Witwatersrand, Johannesburg. All rights reserved. The copyright in this work vests in the University of the Witwatersrand, Johannesburg. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of University of the Witwatersrand, Johannesburg.
dc.rights.holder	University of the Witwatersrand, Johannesburg
dc.school	School of Computer Science and Applied Mathematics
dc.subject	Deepfakes
dc.subject	Adversarial Attacks
dc.subject	Adversarial Defenses
dc.subject	Generative Adversarial Networks
dc.subject	Autoencoders
dc.subject	CNN
dc.subject	UCTD
dc.subject.other	SDG-9: Industry, innovation and infrastructure
dc.title	Generative Model Based Adversarial Defenses for Deepfake Detectors
dc.type	Dissertation

Files

Original bundle

Name: Nair_Generative_2023.pdf
Size: 3.63 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.43 KB
Format: Item-specific license agreed upon to submission

Collections

Electronic Theses and Dissertations (Masters)