
The Taylor Swift deepfake debacle was frustratingly preventable


Taylor Swift performs onstage during the 2018 American Music Awards at Microsoft Theater on October 9, 2018 in Los Angeles, California.
Image Credits: Kevin Winter / Getty Images

You know you’ve screwed up when you’ve simultaneously angered the White House, the TIME Person of the Year and pop culture’s most rabid fanbase. That’s what happened last week to X, the Elon Musk-owned platform formerly called Twitter, when AI-generated, pornographic deepfake images of Taylor Swift went viral.

One of the most widespread posts of the nonconsensual, explicit deepfakes was viewed more than 45 million times, with hundreds of thousands of likes. That doesn’t even factor in all the accounts that reshared the images in separate posts — once an image has been circulated that widely, it’s basically impossible to remove.

X lacks the infrastructure to identify abusive content quickly and at scale. Even in the Twitter days, this issue was difficult to remedy, but it’s become much worse since Musk gutted so much of Twitter’s staff, including the majority of its trust and safety teams. So, Taylor Swift’s massive and passionate fanbase took matters into their own hands, flooding search results for queries like “taylor swift ai” and “taylor swift deepfake” to make it more difficult for users to find the abusive images. As the White House’s press secretary called on Congress to do something, X simply banned the search term “taylor swift” for a few days. When users searched the musician’s name, they would see a notice that an error had occurred.

This content moderation failure became a national news story, since Taylor Swift is Taylor Swift. But if social platforms can’t protect one of the most famous women in the world, who can they protect?

“If you have what happened to Taylor Swift happen to you, as it’s been happening to so many people, you’re likely not going to have the same amount of support based on clout, which means you won’t have access to these really important communities of care,” Dr. Carolina Are, a fellow at Northumbria University’s Centre for Digital Citizens in the U.K., told TechCrunch. “And these communities of care are what most users are having to resort to in these situations, which really shows you the failure of content moderation.”

Banning the search term “taylor swift” is like putting a piece of Scotch tape on a burst pipe. There are many obvious workarounds, like how TikTok users search for “seggs” instead of sex. The search block was something X could implement to look like it was taking action, but it doesn’t stop people from simply searching “t swift” instead. Copia Institute and Techdirt founder Mike Masnick called the effort “a sledge hammer version of trust & safety.”

“Platforms suck when it comes to giving women, non-binary people and queer people agency over their bodies, so they replicate offline systems of abuse and patriarchy,” Are said. “If your moderation systems are incapable of reacting in a crisis, or if your moderation systems are incapable of reacting to users’ needs when they’re reporting that something is wrong, we have a problem.”

So, what should X have done to prevent the Taylor Swift fiasco?

Are asks these questions as part of her research, and proposes that social platforms need a complete overhaul of how they handle content moderation. Recently, she conducted a series of roundtable discussions with 45 internet users from around the world who have been impacted by censorship and abuse, with the goal of issuing recommendations to platforms about how to enact change.

One recommendation is for social media platforms to be more transparent with individual users about decisions regarding their account or their reports about other accounts.

“You have no access to a case record, even though platforms do have access to that material — they just don’t want to make it public,” Are said. “I think when it comes to abuse, people need a more personalized, contextual and speedy response that involves, if not face-to-face help, at least direct communication.”

X announced this week that it would hire 100 content moderators to work out of a new “Trust and Safety” center in Austin, Texas. But under Musk’s purview, the platform has not set a strong precedent for protecting marginalized users from abuse. It can also be challenging to take Musk at face value, as the mogul has a long track record of failing to deliver on his promises. When he first bought Twitter, Musk declared he would form a content moderation council before making major decisions. This did not happen.

In the case of AI-generated deepfakes, the onus is not just on social platforms. It’s also on the companies that create consumer-facing generative AI products.

According to an investigation by 404 Media, the abusive depictions of Swift came from a Telegram group devoted to creating nonconsensual, explicit deepfakes. The members of the group often use Microsoft Designer, which draws from OpenAI’s DALL-E 3 to generate images from text prompts. In a loophole that Microsoft has since addressed, users could generate images of celebrities by writing prompts like “taylor ‘singer’ swift” or “jennifer ‘actor’ aniston.”

Shane Jones, a principal software engineering lead at Microsoft, wrote a letter to the Washington state attorney general stating that in December he found vulnerabilities in DALL-E 3 that made it possible to “bypass some of the guardrails that are designed to prevent the model from creating and distributing harmful images.”

Jones alerted Microsoft and OpenAI to the vulnerabilities, but after two weeks, he had received no indication that the issues were being addressed. So, he posted an open letter on LinkedIn to urge OpenAI to suspend the availability of DALL-E 3. Jones alerted Microsoft to his letter, but he was swiftly asked to take it down.

“We need to hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public,” Jones wrote in his letter to the state attorney general. “Concerned employees, like myself, should not be intimidated into staying silent.”

OpenAI told TechCrunch that it immediately investigated Jones’ report and found that the technique he outlined did not bypass its safety systems.

“In the underlying DALL-E 3 model, we’ve worked to filter the most explicit content from its training data including graphic sexual and violent content, and have developed robust image classifiers that steer the model away from generating harmful images,” a spokesperson from OpenAI said. “We’ve also implemented additional safeguards for our products, ChatGPT and the DALL-E API – including declining requests that ask for a public figure by name.”

OpenAI added that it uses external red teaming to test its products for misuse. It’s still not confirmed whether Microsoft’s tool was responsible for the explicit Swift deepfakes, but as of last week, both journalists and bad actors on Telegram were able to use the software to generate images of celebrities.

Jones refutes OpenAI’s claims. He told TechCrunch, “I am only now learning that OpenAI believes this vulnerability does not bypass their safeguards. This morning, I ran another test using the same prompts I reported in December and without exploiting the vulnerability, OpenAI’s safeguards blocked the prompts on 100% of the tests. When testing with the vulnerability, the safeguards failed 78% of the time, which is a consistent failure rate with earlier tests. The vulnerability still exists.”

As the world’s most influential companies bet big on AI, platforms need to take a proactive approach to regulating abusive content — but even in an era when making celebrity deepfakes wasn’t so easy, violative behavior easily evaded moderation.

“It really shows you that platforms are unreliable,” Are said. “Marginalized communities have to trust their followers and fellow users more than the people that are technically in charge of our safety online.”

Updated, 1/30/24 at 10:30 PM ET, with comment from OpenAI
Updated, 1/31/24 at 6:10 PM ET, with additional comment from Shane Jones

