Speech deepfakes are artificial voices generated by machine learning models. Previous literature has highlighted deepfakes as one of the biggest security threats arising from progress in artificial intelligence due to their potential for misuse. However, studies investigating human detection capabilities are limited. We presented genuine and deepfake audio to n = 529 individuals and asked them to identify the deepfakes. We ran our experiments in English and Mandarin to understand whether language affects detection performance and decision-making rationale. We found that detection capability is unreliable. Listeners correctly spotted the deepfakes only 73% of the time, and there was no difference in detectability between the two languages. Increasing listener awareness by providing examples of speech deepfakes improved results only slightly. As speech synthesis algorithms improve and become more realistic, we can expect the detection task to become harder. The difficulty of detecting speech deepfakes confirms their potential for misuse and signals that defenses against this threat are needed.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10395974 | PMC |
| http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285333 | PLOS |
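The abstract's headline numbers (73% detection accuracy across n = 529 listeners, no language difference) come from a straightforward per-trial analysis. The sketch below shows one way such results could be computed from raw responses; the CSV file, column names, and the choice of a chi-square test are illustrative assumptions, not the authors' published pipeline.

```python
# Minimal sketch: overall deepfake-detection accuracy and an
# English-vs-Mandarin comparison. File and column names are hypothetical.
import pandas as pd
from scipy.stats import binomtest, chi2_contingency

trials = pd.read_csv("responses.csv")  # one row per listener judgment

# Overall accuracy: fraction of trials where the listener's
# real/fake label matched the ground truth.
correct = (trials["judgment"] == trials["ground_truth"]).astype(int)
print(f"Overall accuracy: {correct.mean():.1%}")

# Is accuracy above the 50% chance level?
test = binomtest(int(correct.sum()), n=len(correct), p=0.5,
                 alternative="greater")
print(f"Binomial test vs chance: p = {test.pvalue:.3g}")

# Language comparison: 2x2 contingency table of correct/incorrect
# counts for English vs Mandarin trials.
table = pd.crosstab(trials["language"], correct)
chi2, p, dof, _ = chi2_contingency(table)
print(f"Language difference: chi2 = {chi2:.2f}, p = {p:.3g}")
```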
Philos Technol
November 2024
University of Sussex School of Law, Politics and Sociology, Brighton, UK.
Artificially generated content threatens to seriously disrupt the public sphere. Generative AI massively facilitates the production of convincing portrayals of fabricated events. We have already begun to witness the spread of synthetic misinformation, political propaganda, and non-consensual intimate deepfakes.
Nat Commun
September 2024
Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA.
JMIR Biomed Eng
March 2024
Klick Labs, Toronto, ON, Canada.
Background: The digital era has witnessed an escalating dependence on digital platforms for news and information, coupled with the advent of "deepfake" technology. Deepfakes, leveraging deep learning models on extensive data sets of voice recordings and images, pose substantial threats to media authenticity, potentially leading to unethical misuse such as impersonation and the dissemination of false information.
Objective: To counter this challenge, this study aims to use innate biological processes to discern authentic human voices from cloned voices.
Commun Biol
June 2024
Cognitive and Affective Neuroscience Unit, Department of Psychology, University of Zurich, Zurich, Switzerland.
Deepfakes are viral ingredients of digital environments, and they can trick human cognition into misperceiving the fake as real. Here, we test the neurocognitive sensitivity of 25 participants to accept or reject person identities as recreated in audio deepfakes. We generate high-quality voice identity clones from natural speakers by using advanced deepfake technologies.
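Accept/reject identity judgments like those in this study are commonly summarized with the signal-detection sensitivity index d′, derived from hit and false-alarm rates. The following is a generic sketch of that computation, not the authors' analysis; the trial counts are placeholders.

```python
# Generic signal-detection sketch: sensitivity (d') for an
# accept/reject identity task. Counts below are placeholders.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' from raw counts, with a log-linear correction so rates of
    exactly 0 or 1 do not produce infinite z-scores."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Example: a participant accepts 42 of 50 genuine-identity trials
# and wrongly accepts 12 of 50 deepfake trials.
print(f"d' = {d_prime(42, 8, 12, 38):.2f}")
```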
PLoS One
August 2023
Department of Computer Science, University College London, London, United Kingdom.