You know you're in trouble when you've simultaneously angered the White House, Time's Person of the Year, and pop culture's most rabid fans. That's what happened last week to X, the Elon Musk-owned platform formerly known as Twitter, when AI-generated, explicit deepfake images of Taylor Swift went viral.
One of the most widely shared posts of the non-consensual, explicit deepfakes was viewed more than 45 million times and drew hundreds of thousands of likes. That doesn't factor in all the accounts that reshared the images in separate posts – once an image is circulated that widely, it's basically impossible to remove.
X lacks the infrastructure to detect abusive content quickly and at scale. Even in Twitter's heyday this problem was difficult to solve, and it has gotten worse since Musk fired most of Twitter's staff, including its trust and safety teams. So Taylor Swift's huge, passionate fanbase took matters into its own hands, flooding search results for queries like “Taylor Swift AI” and “Taylor Swift deepfake” to make it harder for users to find the abusive images. As the White House press secretary called on Congress to do something, X simply banned the search term “Taylor Swift” for a few days. When users searched the musician's name, they saw a notice saying an error had occurred.
This content moderation failure became a national news story because Taylor Swift is Taylor Swift. But if social platforms can't save one of the world's most famous women, who can?
“If what happened to Taylor Swift happens to you, as it does to a lot of people, you likely won't get the same swell of support based on clout, which means you won't have access to these really important communities of care,” Dr. Carolina R., at Northumbria University's Center for Digital Citizens in the UK, told TechCrunch. “And these communities are where most users have to seek recourse in these situations, which really shows you the failure of content moderation.”
Banning the search term “Taylor Swift” is like putting a piece of Scotch tape over a burst pipe. There are obvious workarounds, like how TikTok users search for “seggs” instead of sex. The search block was something X could implement to look like it was doing something, but it doesn't stop people from searching for “t swift” instead. Mike Masnick, founder of the Copia Institute and TechDirt, called the effort “the sledgehammer version of Trust & Safety.”
“Platforms suck when it comes to giving women, non-binary people, and queer people agency over their bodies, so they replicate offline systems of abuse and patriarchy,” she said. “If your moderation systems can't react in a crisis, or if your moderation systems can't react to users' needs when they report that something is wrong, then we have a problem.”
So, what should X have done to prevent the Taylor Swift fiasco?
R asked these questions as part of her research and proposed a complete overhaul of how social platforms handle content moderation. She recently hosted a series of roundtable discussions with 45 internet users from around the world who have been affected by censorship and abuse, and issued recommendations to platforms on how to implement change.
One recommendation is for social media platforms to be more transparent with individual users about decisions regarding their accounts or their reports about other accounts.
“While the platforms have access to that material, you don't have access to your case record – they don't want to make that public,” she said. “When it comes to abuse, I think people need a more personalized, contextual and speedy response that involves at least direct communication, if not face-to-face help.”
X announced this week that it will hire 100 content moderators to work at a new “trust and safety” center in Austin, Texas. But under Musk's purview, the platform hasn't set a strong precedent for protecting marginalized users from abuse. It's also hard to take Musk at his word, as the mogul has a long history of failing to deliver on his promises. When he first bought Twitter, Musk announced that he would form a content moderation council before making major decisions. This did not happen.
In the case of AI-generated deepfakes, the onus isn't solely on social platforms. It's also on the companies that build consumer-facing generative AI products.
According to an investigation by 404 Media, the abusive images of Swift originated in a Telegram group dedicated to creating non-consensual, explicit deepfakes. Users in the group often relied on Microsoft Designer, which draws on OpenAI's DALL-E 3 to generate images from text prompts. Before Microsoft fixed the loophole, users could generate images of celebrities by typing prompts such as “Taylor 'singer' Swift” or “Jennifer 'actress' Aniston”.
Shane Jones, a principal software engineering lead at Microsoft, wrote a letter to the Washington state attorney general stating that he discovered a vulnerability in DALL-E 3 in December that “allowed the model to bypass certain guardrails designed to prevent it from creating and distributing malicious images.”
Jones warned Microsoft and OpenAI of the vulnerability, but after two weeks he had seen no indication that the problems were being fixed. So he posted an open letter on LinkedIn asking OpenAI to stop making DALL-E 3 available. Jones alerted Microsoft to his letter, but he was quickly asked to take it down.
“We must hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public,” Jones wrote in his letter to the state attorney general. “Concerned employees, like myself, should not be intimidated into staying silent.”
As the world's most influential companies make big bets on AI, platforms need to take a proactive approach to moderating abusive content. But even in an era when making celebrity deepfakes is easier than ever, offenders' behavior readily evades moderation.
“It really shows you that platforms are unreliable,” she said. “Marginalized communities have to trust their followers and fellow users more than the people who are technically in charge of our safety online.”