The Dark Side of ChatGPT's New Voice Capabilities
The possibility of a get-out-of-jail-free card for politics' biggest liars
My first piece in a little while - I’m now based at Columbia University in New York City, working on my doctoral thesis about law and AI. This is an explainer written for The Spinoff covering some recent developments in generative AI and what they might mean for politics.
It’s often hard to make sense of what’s happening in the hyperbole-infested world of Silicon Valley. ChatGPT, the generative AI that went viral at the start of the year, is a good example. Is it an entertaining gimmick, an apocalyptic portent for journalism and the arts, or something in between? It’s hard to say, although we do know one thing: recent improvements to ChatGPT have the potential to destabilise politics as we know it.
Hold up. What is ‘generative AI’?
Generative AI is a type of artificial intelligence system trained on an enormous amount of data from the internet. These systems generate new content in response to ‘prompts’, like those typed into ChatGPT’s message box. ChatGPT is designed to generate human-like responses to questions, and is trained on an enormous bank of human conversations and language to help it seem lifelike.
So isn’t ChatGPT just a glorified chatbot?
Well, yes, that’s sort of right. At its core, ChatGPT is a faithful mimic. But it turns out that mimicking human language also means that ChatGPT can mimic human performance on a range of complex tasks, from mathematics and coding to medicine and law. The AI is far from perfect. ChatGPT has a bad habit of ‘hallucinating’ and providing incorrect answers with boundless confidence, but it is still an impressive tool - especially for a work in progress.
What’s changing?
OpenAI, the developer of ChatGPT, recently announced that ChatGPT is being updated with voice capabilities, including the ability to generate realistic synthetic speech - that is, to craft “realistic voices from just a few seconds of real speech”.
What does that have to do with politics or elections?
This technology could make it straightforward to create realistic audio of a politician, in their own voice, saying things they have never said in real life. It isn’t hard to imagine how a political operative could weaponise this kind of tool using social media, spreading lifelike disinformation at scale.
However, the flipside of this problem is just as scary. There is already a track record overseas of politicians routinely denying things that they have said or done. The mainstreaming of realistic voice generation could hand a permanent get-out-of-jail-free card to those politicians. In an interview with the New York Times, Democratic campaign organisers discussed the Access Hollywood tape (in which Donald Trump was recorded bragging about sexually assaulting women) and what would change if the tape leaked in 2023 rather than 2016. While Trump admitted to the authenticity of the tape in 2016, it seems likely that today he would claim it was faked. Matt Hodges, the engineering director on Joe Biden’s 2020 campaign, said that Republicans might swap their shouts of “fake news” for “Woke AI”.
Oh no.
It isn’t great. On the one hand, it seems absurd that politicians could get away with a scandal when there is a literal recording of them admitting to it. But if ChatGPT really can generate a realistic imitation of a politician’s voice, which could be made to say anything, it isn’t clear how the authenticity of a true recording could be proven. This has far-reaching consequences: voice recordings have historically exposed corruption, brought down presidents and held world leaders to account. From this point on, the integrity of those investigations or leaks could be dismissed as AI-generated, along with a laundry list of other important audio, like evidence used in courts. Short of an ironclad technological breakthrough to distinguish true audio from imitations, it’s hard to chart a path forward.
What are the developers doing about this potential villain arc?
OpenAI is alive to the possible risks of its new voice-enabled tool, and specifically the problem that ChatGPT could be used to impersonate public figures or commit fraud. The company says it is tightly limiting the release of the voice capabilities to specific uses, like a collaboration with Spotify that allows podcasters to release translations of their shows in other languages. However, OpenAI’s guardrails have proven porous before, and arguably the mere existence of a mainstream voice-simulation technology will embolden politicians who want to claim that leaks or recordings are fake. Other AI companies have built similar technology too, albeit without the size or influence of ChatGPT’s userbase.
This is stressful. TLDR?
Your friendly neighbourhood AI, ChatGPT, isn’t just good for writing mundane office emails: it will soon be able to generate fake – but realistic – audio of anyone, especially famous people like politicians. Even if OpenAI stops most users from accessing that functionality, the mainstream development of this technology offers politicians an exit route from inconvenient leaks or recordings: they can point to ChatGPT and deny, deny, deny.
The romantic notion that objective facts matter in politics might be in for another body blow.