Racial bias observed in hate speech detection algorithm from Google

Understanding what makes something offensive or hurtful is difficult enough that many people can’t figure it out, let alone AI systems. And people of color are frequently left out of AI training sets. So it’s little surprise that Alphabet/Google-spawned Jigsaw manages to trip over both of these issues at once, flagging slang used by black Americans as toxic.

To be clear, the study was not specifically about evaluating the company’s hate speech detection algorithm, which has faced issues before. Instead, it is cited as a contemporary attempt to computationally dissect speech and assign a “toxicity score,” and one that appears to fail in a way indicative of bias against black American speech patterns.

The researchers, at the University of Washington, were interested in the idea that databases of hate speech currently available might have racial biases baked in — like many other data sets that suffered from a lack of inclusive practices during formation.

They looked at a handful of such databases, essentially thousands of tweets annotated by people as being “hateful,” “offensive,” “abusive” and so on. These databases were also analyzed to find language strongly associated with African American English or white-aligned English.

Combining these two sets basically let them see whether white or black vernacular had a higher or lower chance of being labeled offensive. Lo and behold, black-aligned English was much more likely to be labeled offensive.

For both datasets, we uncover strong associations between inferred AAE dialect and various hate speech categories, specifically the “offensive” label from DWMW 17 (r = 0.42) and the “abusive” label from FDCL 18 (r = 0.35), providing evidence that dialect-based bias is present in these corpora.
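The r values quoted above are correlation coefficients. As a rough illustration of what such a number measures, here is a minimal sketch of a Pearson correlation between an inferred dialect probability and a binary “offensive” label. The function name `pearson_r`, the variable names and the six toy values are all assumptions made for illustration; they are not the paper's data, and this is not necessarily how the researchers computed their figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up toy values, NOT from the study:
# p_aae[i]     = hypothetical inferred probability that tweet i is in AAE
# offensive[i] = 1 if a hypothetical annotator labeled tweet i offensive
p_aae = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
offensive = [1, 1, 0, 1, 0, 0]

r = pearson_r(p_aae, offensive)
print(f"r = {r:.2f}")
```

A positive r here would mean that tweets with a higher inferred AAE probability tend to carry the “offensive” label more often. It says nothing by itself about why.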

The researchers then sourced their own annotations for tweets and found that similar biases appeared. But by “priming” annotators with the knowledge that the person tweeting was likely black or using black-aligned English, the likelihood that they would label a tweet offensive dropped considerably.

Examples of control, dialect priming and race priming for annotators

This isn’t to say necessarily that annotators are all racist or anything like that. But the job of determining what is and isn’t offensive is a complex one socially and linguistically, and obviously awareness of the speaker’s identity is important in some cases, especially in cases where terms once used derisively to refer to that identity have been reclaimed.

What’s all this got to do with Alphabet or Jigsaw or Google? Well, Jigsaw is a company built out of Alphabet — which we all really just think of as Google by another name — with the intention of helping to moderate online discussion by automatically detecting (among other things) offensive speech. Its Perspective API lets people input a snippet of text and receive a “toxicity score.”
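For readers who want a concrete picture of “input a snippet of text and receive a toxicity score,” the sketch below builds a request body and reads a score out of a response. The endpoint URL, the `requestedAttributes` field and the `attributeScores` response layout are recalled from Perspective's public documentation and should be treated as assumptions to check against the current docs. The helper names and the sample response are hypothetical, and no network call is made.

```python
import json

# Assumed endpoint; a real call would also need an API key, omitted here.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Build the JSON body asking for a TOXICITY score (assumed schema)."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_from_response(response):
    """Extract the summary toxicity score from a parsed response dict."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


# Hypothetical response for illustration only; not real API output.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.12, "type": "PROBABILITY"}}
    }
}

body = json.dumps(build_request("have a nice day"))
print(body)
print(toxicity_from_response(sample_response))
```

The researchers' comparison amounts to collecting scores like this for many tweets and checking how they vary with the inferred dialect of the text.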

As part of the experiment, the researchers fed a bunch of the tweets in question to Perspective. What they saw were “correlations between dialects/groups in our datasets and the Perspective toxicity scores. All correlations are significant, which indicates potential racial bias for all datasets.”

Chart showing that African American English (AAE) was more likely to be labeled toxic by Alphabet’s Perspective API

So basically, they found that Perspective was far more likely to label black speech as toxic than white speech. Remember, this isn’t a model thrown together on the back of a few thousand tweets; it’s an attempt at a commercial moderation product.

As this comparison wasn’t the primary goal of the research, but rather a byproduct, it should not be taken as some kind of massive takedown of Jigsaw’s work. On the other hand, the differences shown are very significant and quite in keeping with the rest of the team’s findings. At the very least it is, as with the other data sets evaluated, a signal that the processes involved in their creation need to be reevaluated.

I asked lead author of the paper, Maarten Sap, for a bit more information on this and received an obliging answer:

The perspective API is indeed not meant to be the focus of our work. Instead, we found widespread bias in a variety of hate speech detection datasets, which if you train machine learning models on them, those models will be biased against African American English and tweets by African Americans. As we see in our controlled experiment, it’s likely that there’s something about the missing context in which the tweet occurred that causes annotators to assume “the worst”. Because this is an annotation and a human bias issue, it is likely that our findings hold for the PerspectiveAPI as well, but since we don’t know exactly what their training data is, or their model, we can only observe the API’s behavior.

Considering the evidence of bias not only in a product headed to market but in the datasets one might base a new system on, Sap suggested that we may be jumping the gun on this whole automated hate speech detection thing.

“We have a very limited understanding of offendedness mechanisms, and how that relates to the speakers, listener, or annotators demographic/group identity, and yet we’re pushing full steam ahead with computational modelling as if we knew how to create a gold standard dataset,” he wrote in his email to me. “I think right now is the right time to start teaming up with political scientists, social psychologists, and other social scientists that will help us make sense of existing hate speech behavior.”

This article has been updated with the study author’s commentary above. You can read the full paper, which was presented at the annual meeting of the Association for Computational Linguistics in Florence, below:

The Risk of Racial Bias in Hate Speech Detection by TechCrunch on Scribd
