AI

The Taylor Swift deepfake debacle was frustratingly preventable

Comment

Taylor Swift performs onstage during the 2018 American Music Awards at Microsoft Theater on October 9, 2018 in Los Angeles, California.
Image Credits: Kevin Winter / Getty Images

You know you’ve screwed up when you’ve simultaneously angered the White House, the TIME Person of the Year and pop culture’s most rabid fanbase. That’s what happened last week to X, the Elon Musk-owned platform formerly called Twitter, when AI-generated, pornographic deepfake images of Taylor Swift went viral.

One of the most widespread posts of the nonconsensual, explicit deepfakes was viewed more than 45 million times, with hundreds of thousands of likes. That doesn’t even factor in all the accounts that reshared the images in separate posts — once an image has been circulated that widely, it’s basically impossible to remove.

X lacks the infrastructure to identify abusive content quickly and at scale. Even in the Twitter days, this issue was difficult to remedy, but it’s become much worse since Musk gutted so much of Twitter’s staff, including the majority of its trust and safety teams. So, Taylor Swift’s massive and passionate fanbase took matters into their own hands, flooding search results for queries like “taylor swift ai” and “taylor swift deepfake” to make it more difficult for users to find the abusive images. As the White House’s press secretary called on Congress to do something, X simply banned the search term “taylor swift” for a few days. When users searched the musician’s name, they would see a notice that an error had occurred.

This content moderation failure became a national news story, since Taylor Swift is Taylor Swift. But if social platforms can’t protect one of the most famous women in the world, who can they protect?

“If you have what happened to Taylor Swift happen to you, as it’s been happening to so many people, you’re likely not going to have the same amount of support based on clout, which means you won’t have access to these really important communities of care,” Dr. Carolina Are, a fellow at Northumbria University’s Centre for Digital Citizens in the U.K., told TechCrunch. “And these communities of care are what most users are having to resort to in these situations, which really shows you the failure of content moderation.”

Banning the search term “taylor swift” is like putting a piece of Scotch tape on a burst pipe. There are many obvious workarounds, like how TikTok users search for “seggs” instead of sex. The search block was something that X could implement to make it look like they’re doing something, but it doesn’t stop people from just searching “t swift” instead. Copia Institute and Techdirt founder Mike Masnick called the effort “a sledge hammer version of trust & safety.”

“Platforms suck when it comes to giving women, non-binary people and queer people agency over their bodies, so they replicate offline systems of abuse and patriarchy,” Are said. “If your moderation systems are incapable of reacting in a crisis, or if your moderation systems are incapable of reacting to users’ needs when they’re reporting that something is wrong, we have a problem.”

So, what should X have done to prevent the Taylor Swift fiasco?

Are asks these questions as part of her research, and proposes that social platforms need a complete overhaul of how they handle content moderation. Recently, she conducted a series of roundtable discussions with 45 internet users from around the world who are impacted by censorship and abuse to issue recommendations to platforms about how to enact change.

One recommendation is for social media platforms to be more transparent with individual users about decisions regarding their account or their reports about other accounts.

“You have no access to a case record, even though platforms do have access to that material — they just don’t want to make it public,” Are said. “I think when it comes to abuse, people need a more personalized, contextual and speedy response that involves, if not face-to-face help, at least direct communication.”

X announced this week that it would hire 100 content moderators to work out of a new “Trust and Safety” center in Austin, Texas. But under Musk’s purview, the platform has not set a strong precedent for protecting marginalized users from abuse. It can also be challenging to take Musk at face value, as the mogul has a long track record of failing to deliver on his promises. When he first bought Twitter, Musk declared he would form a content moderation council before making major decisions. This did not happen.

In the case of AI-generated deepfakes, the onus is not just on social platforms. It’s also on the companies that create consumer-facing generative AI products.

According to an investigation by 404 Media, the abusive depictions of Swift came from a Telegram group devoted to creating nonconsensual, explicit deepfakes. The members of the group often use Microsoft Designer, which draws from OpenAI’s DALL-E 3 to generate images based on inputted prompts. In a loophole that Microsoft has since addressed, users could generate images of celebrities by writing prompts like “taylor ‘singer’ swift” or “jennifer ‘actor’ aniston.”

A principal software engineering lead at Microsoft, Shane Jones, wrote a letter to the Washington state attorney general stating that he found vulnerabilities in DALL-E 3 in December, which made it possible to “bypass some of the guardrails that are designed to prevent the model from creating and distributing harmful images.”

Jones alerted Microsoft and OpenAI to the vulnerabilities, but after two weeks, he had received no indication that the issues were being addressed. So, he posted an open letter on LinkedIn to urge OpenAI to suspend the availability of DALL-E 3. Jones alerted Microsoft to his letter, but he was swiftly asked to take it down.

“We need to hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public,” Jones wrote in his letter to the state attorney general. “Concerned employees, like myself, should not be intimidated into staying silent.”

OpenAI told TechCrunch that it immediately investigated Jones’ report and found that the technique he outlined did not bypass its safety systems.

“In the underlying DALL-E 3 model, we’ve worked to filter the most explicit content from its training data including graphic sexual and violent content, and have developed robust image classifiers that steer the model away from generating harmful images,” a spokesperson from OpenAI said. “We’ve also implemented additional safeguards for our products, ChatGPT and the DALL-E API – including declining requests that ask for a public figure by name.”

OpenAI added that it uses external red teaming to test products for misuse. It’s still not confirmed if Microsoft’s program is responsible for the explicit Swift deepfakes, but the fact stands that as of last week, both journalists and bad actors on Telegram were able to use this software to generate images of celebrities.

Jones refutes OpenAI’s claims. He told TechCrunch, “I am only now learning that OpenAI believes this vulnerability does not bypass their safeguards. This morning, I ran another test using the same prompts I reported in December and without exploiting the vulnerability, OpenAI’s safeguards blocked the prompts on 100% of the tests. When testing with the vulnerability, the safeguards failed 78% of the time, which is a consistent failure rate with earlier tests. The vulnerability still exists.”

As the world’s most influential companies bet big on AI, platforms need to take a proactive approach to regulate abusive content — but even in an era when making celebrity deepfakes wasn’t so easy, violative behavior easily evaded moderation.

“It really shows you that platforms are unreliable,” Are said. “Marginalized communities have to trust their followers and fellow users more than the people that are technically in charge of our safety online.”

Updated, 1/30/24 at 10:30 PM ET, with comment from OpenAI
Updated, 1/31/24 at 6:10 PM ET, with additional comment from Shane Jones

Swift retaliation: Fans strike back after explicit deepfakes flood X

Ahead of congressional hearing on child safety, X announces plans to hire 100 moderators in Austin

More TechCrunch

Zoox, Amazon’s self-driving unit, is bringing its autonomous vehicles to more cities.  The self-driving technology company announced Wednesday plans to begin testing in Austin and Miami this summer. The two…

Zoox to test self-driving cars in Austin and Miami 

Called Stable Audio Open, the generative model takes a text description and outputs a recording up to 47 seconds in length.

Stability AI releases a sound generator

It’s not just instant-delivery startups that are struggling. Oda, the Norway-based online supermarket delivery startup, has confirmed layoffs of 150 jobs as it drastically scales back its expansion ambitions to…

SoftBank-backed grocery startup Oda lays off 150, resets focus on Norway and Sweden

Newsletter platform Substack is introducing the ability for writers to send videos to their subscribers via Chat, its direct messaging feature, the company announced on Wednesday. The rollout of video…

Substack brings video to its Chat feature

Hiya, folks, and welcome to TechCrunch’s inaugural AI newsletter. It’s truly a thrill to type those words — this one’s been long in the making, and we’re excited to finally…

This Week in AI: Ex-OpenAI staff call for safety and transparency

Ms. Rachel isn’t a household name, but if you spend a lot of time with toddlers, she might as well be a rockstar. She’s like Steve from Blues Clues for…

Cameo fumbles on Ms. Rachel fundraiser as fans receive credits instead of videos  

Cartwheel helps animators go from zero to basic movement, so creating a scene or character with elementary motions like taking a step, swatting a fly or sitting down is easier.

Cartwheel generates 3D animations from scratch to power up creators

The new tool, which is set to arrive in Wix’s app builder tool this week, guides users through a chatbot-like interface to understand the goals, intent and aesthetic of their…

Wix’s new tool taps AI to generate smartphone apps

ClickUp Knowledge Management combines a new wiki-like editor and with a new AI system that can also bring in data from Google Drive, Dropbox, Confluence, Figma and other sources.

ClickUp wants to take on Notion and Confluence with its new AI-based Knowledge Base

New York City, home to over 60,000 gig delivery workers, has been cracking down on cheap, uncertified e-bikes that have resulted in battery fires across the city.  Some e-bike providers…

Whizz wants to own the delivery e-bike subscription space, starting with NYC

This is the last major step before Starliner can be certified as an operational crew system, and the first Starliner mission is expected to launch in 2025. 

Boeing’s Starliner astronaut capsule is en route to the ISS 

TechCrunch Disrupt 2024 in San Francisco is the must-attend event for startup founders aiming to make their mark in the tech world. This year, founders have three exciting ways to…

Three ways founders can shine at TechCrunch Disrupt 2024

Google’s newest startup program, announced on Wednesday, aims to bring AI technology to the public sector. The newly launched “Google for Startups AI Academy: American Infrastructure” will offer participants hands-on…

Google’s new startup program focuses on bringing AI to public infrastructure

eBay’s newest AI feature allows sellers to replace image backgrounds with AI-generated backdrops. The tool is now available for iOS users in the U.S., U.K., and Germany. It’ll gradually roll…

eBay debuts AI-powered background tool to enhance product images

If you’re anything like me, you’ve tried every to-do list app and productivity system, only to find yourself giving up sooner than later because sooner than later, managing your productivity…

Hoop uses AI to automatically manage your to-do list

Asana is using its work graph to train LLMs with the goal of creating AI assistants that work alongside human employees in company workflows.

Asana introduces ‘AI teammates’ designed to work alongside human employees

Taloflow, an early stage startup changing the way companies evaluate and select software, has raised $1.3M in a seed round.

Taloflow puts AI to work on software vendor selection to reduce cost and save time

The startup is hoping its durable filters can make metals refining and battery recycling more efficient, too.

SiTration uses silicon wafers to reclaim critical minerals from mining waste

Spun out of Bosch, Dive wants to change how manufacturers use computer simulations by both using modern mathematical approaches and cloud computing.

Dive goes cloud-native for its computational fluid dynamics simulation service

The tension between incumbents and fintechs has existed for decades. But every once in a while, the two groups decide to put their competition aside and work together. In an…

When foes become friends: Capital One partners with fintech giants Stripe, Adyen to prevent fraud

After growing 500% year-over-year in the past year, Understory is now launching a product focused on the renewable energy sector.

Insurance provider Understory gets into renewable energy following $15M Series A

Ashkenazi will start her new role at Google’s parent company on July 31, after 23 years at Eli Lilly.

Alphabet brings on Eli Lilly’s Anat Ashkenazi as CFO

Tobiko aims to reimagine how teams work with data by offering a dbt-compatible data transformation platform.

With $21.8M in funding, Tobiko aims to build a modern data platform

In 1816, French physician René Laennec invented an instrument that allowed doctors to listen to the heart and lungs. That device — a stethoscope — eventually evolved from a simple…

Eko Health scores $41M to detect heart and lung disease earlier and more accurately

The number of satellites on low Earth orbit is poised to explode over the coming years as more mega-constellations come online. This will create new opportunities for bad actors to…

DARPA and Slingshot build system to detect ‘wolf in sheep’s clothing’ adversary satellites

SAP sees WalkMe’s focus on automating contextual, in-app support as bringing value to its own enterprise customers.

SAP to acquire digital adoption platform WalkMe for $1.5B

The National Democratic Alliance (NDA) has emerged victorious in India’s 2024 general election, but with a smaller majority compared to 2019. According to post-election analysis by Goldman Sachs, JPMorgan, CLSA,…

Modi-led coalition’s election win signals policy continuity in India — and spending cuts

Featured Article

A comprehensive list of 2024 tech layoffs

The tech layoff wave is still going strong in 2024. Following significant workforce reductions in 2022 and 2023, this year has already seen 60,000 job cuts across 254 companies, according to independent layoffs tracker Layoffs.fyi. Companies like Tesla, Amazon, Google, TikTok, Snap and Microsoft have conducted sizable layoffs in the…

22 hours ago
A comprehensive list of 2024 tech layoffs

Featured Article

What to expect from WWDC 2024: iOS 18, macOS 15 and so much AI

Apple is hoping to make WWDC 2024 memorable as it finally spells out its generative AI plans.

23 hours ago
What to expect from WWDC 2024: iOS 18, macOS 15 and so much AI

We just announced the breakout session winners last week. Now meet the roundtable sessions that really “rounded” out the competition for this year’s Disrupt 2024 audience choice program. With five…

The votes are in: Meet the Disrupt 2024 audience choice roundtable winners