A group behind Stable Diffusion wants to open source emotion-detecting AI


In 2019, Amazon upgraded its Alexa assistant with a feature that enabled it to detect when a customer was likely frustrated — and respond with proportionately more sympathy. If a customer asked Alexa to play a song and it queued up the wrong one, for example, and then the customer said “No, Alexa” in an upset tone, Alexa might apologize — and request a clarification.

Now, the group behind one of the data sets used to train the text-to-image model Stable Diffusion wants to bring similar emotion-detecting capabilities to every developer — at no cost.

This week, LAION, the nonprofit building image and text data sets for training generative AI, including Stable Diffusion, announced the Open Empathic project. Open Empathic aims to “equip open source AI systems with empathy and emotional intelligence,” in the group’s words.

“The LAION team, with backgrounds in healthcare, education and machine learning research, saw a gap in the open source community: emotional AI was largely overlooked,” Christoph Schuhmann, a LAION co-founder, told TechCrunch via email. “Much like our concerns about non-transparent AI monopolies that led to the birth of LAION, we felt a similar urgency here.”

Through Open Empathic, LAION is recruiting volunteers to submit audio clips to a database that can be used to create AI, including chatbots and text-to-speech models, that “understands” human emotions.

“With Open Empathic, our goal is to create an AI that goes beyond understanding just words,” Schuhmann added. “We aim for it to grasp the nuances in expressions and tone shifts, making human-AI interactions more authentic and empathetic.”
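For a rough sense of what that could look like in practice, here is a minimal sketch of speech emotion recognition using an openly available audio-classification model. It is not LAION's code, and the checkpoint name is an assumption chosen for illustration; any open model trained to predict emotion labels from speech would slot in the same way.

```python
# Minimal sketch of speech emotion recognition with an open model.
# Not LAION's code; the checkpoint below is an assumed example and
# any open audio-classification model with emotion labels would work.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="superb/wav2vec2-base-superb-er",  # assumed example checkpoint
)

# Score a local clip; the pipeline handles audio decoding and resampling.
predictions = classifier("clip.wav", top_k=3)
for p in predictions:
    print(f"{p['label']}: {p['score']:.2f}")
```

A data set like Open Empathic's would presumably be used to fine-tune models of this kind on a wider range of emotions, languages and accents, rather than relying on an off-the-shelf checkpoint.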

LAION, an acronym for “Large-scale Artificial Intelligence Open Network,” was founded in early 2021 by Schuhmann, who’s a German high school teacher by day, and several members of a Discord server for AI enthusiasts. Funded by donations and public research grants, including from AI startup Hugging Face and Stability AI, the vendor behind Stable Diffusion, LAION’s stated mission is to democratize AI research and development resources — starting with training data.

“We’re driven by a clear mission: to harness the power of AI in ways that can genuinely benefit society,” Kari Noriy, an open source contributor to LAION and a PhD student at Bournemouth University, told TechCrunch via email. “We’re passionate about transparency and believe that the best way to shape AI is out in the open.”

Hence Open Empathic.

For the project’s initial phase, LAION has created a website that tasks volunteers with annotating YouTube clips — some pre-selected by the LAION team, others by volunteers — of an individual person speaking. For each clip, volunteers can fill out a detailed list of fields, including a transcription, an audio and video description, and the age, gender, accent (e.g. “British English”), arousal level (alertness, not anything sexual, to be clear) and valence level (“pleasantness” versus “unpleasantness”) of the person in the clip.

Other fields in the form pertain to the clip’s audio quality and the presence (or absence) of loud background noises. But the bulk of the focus is on the person’s emotions — or at least, the emotions that volunteers perceive them to have.

From an array of drop-down menus, volunteers can select individual — or multiple — emotions ranging from “chirpy,” “brisk” and “beguiling” to “reflective” and “engaging.” Noriy says that the idea was to solicit “rich” and “emotive” annotations while capturing expressions in a range of languages and cultures.
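For illustration only, the fields described above suggest an annotation record shaped roughly like the sketch below; the names, scales and types are assumptions, not LAION's actual schema.

```python
# Hypothetical sketch of a single Open Empathic annotation record,
# based only on the fields described in the article. Field names,
# scales and types are assumptions, not LAION's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClipAnnotation:
    youtube_url: str               # clip of a single person speaking
    transcription: str             # what the person says
    audio_description: str         # free-text description of the audio
    video_description: str         # free-text description of the video
    age: str                       # e.g. "30-40"
    gender: str
    accent: str                    # e.g. "British English"
    arousal: float                 # alertness; assumed 0-1 scale
    valence: float                 # pleasantness vs. unpleasantness; assumed 0-1 scale
    emotions: List[str] = field(default_factory=list)  # e.g. ["reflective", "engaging"]
    audio_quality: str = "good"
    loud_background_noise: bool = False

record = ClipAnnotation(
    youtube_url="https://youtube.com/watch?v=example",
    transcription="I never expected the results to come back like this.",
    audio_description="Quiet room, single speaker, slight reverb.",
    video_description="Close-up of a person speaking to camera.",
    age="30-40",
    gender="female",
    accent="British English",
    arousal=0.4,
    valence=0.2,
    emotions=["reflective"],
)
```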

“We’re setting our sights on training AI models that can grasp a wide variety of languages and truly understand different cultural settings,” Noriy said. “We’re working on creating models that ‘get’ languages and cultures, using videos that show real emotions and expressions.”

Once volunteers submit a clip to LAION’s database, they can repeat the process anew — there’s no limit to the number of clips a single volunteer can annotate. LAION hopes to gather roughly 10,000 samples over the next few months, and — optimistically — between 100,000 and 1 million by next year.

“We have passionate community members who, driven by the vision of democratizing AI models and data sets, willingly contribute annotations in their free time,” Noriy said. “Their motivation is the shared dream of creating an empathic and emotionally intelligent open source AI that’s accessible to all.”

The pitfalls of emotion detection

Aside from Amazon’s attempts with Alexa, startups and tech giants alike have explored developing AI that can detect emotions — for purposes ranging from sales training to preventing drowsiness-induced accidents.

In 2016, Apple acquired Emotient, a San Diego firm working on AI algorithms that analyze facial expressions. Snatched up by Sweden-based Smart Eye last May, Affectiva — an MIT spin-out — once claimed its technology could detect anger or frustration in speech in 1.2 seconds. And speech recognition platform Nuance, which Microsoft purchased in April 2021, has demoed a product for cars that analyzes driver emotions from their facial cues.

Other players in the budding emotion detection and recognition space include Hume, HireVue and Realeyes, whose technology is being used to gauge how certain segments of viewers respond to ads. Some employers are using emotion-detecting tech to evaluate potential employees by scoring them on empathy and emotional intelligence. Schools have deployed it to monitor students’ engagement in the classroom — and remotely at home. And emotion-detecting AI has been used by governments to identify “dangerous people” and tested at border control stops in the U.S., Hungary, Latvia and Greece.

For their part, the LAION team envisions helpful, unproblematic applications of the tech across robotics, psychology, professional training, education and even gaming. Schuhmann paints a picture of robots that offer support and companionship, virtual assistants that sense when someone feels lonely or anxious and tools that aid in diagnosing psychological disorders.

It’s a techno-utopian vision. The problem is that most emotion detection rests on shaky scientific ground.

Few, if any, universal markers of emotion exist — putting the accuracy of emotion-detecting AI into question. The majority of emotion-detecting systems were built on the work of psychologist Paul Ekman, published in the ’70s. But subsequent research — including Ekman’s own — supports the common-sense notion that there are major differences in the way people from different backgrounds express how they’re feeling.

For example, the facial expression supposedly universal for fear reads as a stereotype of threat or anger in Malaysia. In one of his later works, Ekman suggested that American and Japanese students tend to react very differently to violent films, with Japanese students adopting “a completely different set of expressions” if someone else is in the room — particularly an authority figure.

Voices, too, span a broad range of characteristics, including those of people with disabilities, people with conditions like autism and people who speak other languages and dialects, such as African-American Vernacular English (AAVE). A native French speaker taking a survey in English might pause or pronounce a word with some uncertainty — which could be misconstrued by someone unfamiliar with the language as an emotion marker.

Indeed, a big part of the problem with emotion-detecting AI is bias — implicit and explicit bias brought by the annotators whose contributions are used to train emotion-detecting models.

In a 2019 study, for instance, scientists found that labelers are more likely to annotate phrases in AAVE as more toxic than their general American English equivalents. Sexual orientation and gender identity can heavily influence which words and phrases an annotator perceives as toxic as well — as can outright prejudice. Several commonly used open source image data sets have been found to contain racist, sexist and otherwise offensive labels from annotators.

The downstream effects can be quite dramatic.

Retorio, an AI hiring platform, was found to react differently to the same candidate wearing different outfits and accessories, such as glasses and headscarves. In a 2020 MIT study, researchers showed that face-analyzing algorithms could become biased toward certain facial expressions, like smiling — reducing their accuracy. More recent work implies that popular emotional analysis tools tend to assign more negative emotions to Black men’s faces than to white men’s faces.

Respecting the process

So how will the LAION team combat these biases — making certain, for instance, that white people don’t outnumber Black people in the data set; that nonbinary people aren’t assigned the wrong gender; and that those with mood disorders aren’t mislabeled with emotions they didn’t intend to express?

It’s not totally clear.

Schuhmann claims the training data submission process for Open Empathic isn’t an “open door” and that LAION has systems in place to “ensure the integrity of contributions.”

“We can validate a user’s intention and consistently check for the quality of annotations,” he added.
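LAION hasn't described those checks in detail. One common approach, sketched purely hypothetically below, is to have several volunteers annotate the same clip and flag clips where their labels diverge; nothing here reflects LAION's actual validation pipeline.

```python
# Hypothetical illustration of an annotation quality check: have multiple
# volunteers label the same clip and measure how often their emotion labels
# overlap. Not LAION's actual pipeline, just a sketch of the idea.
from itertools import combinations

def pairwise_agreement(labels_per_annotator: list[set[str]]) -> float:
    """Average Jaccard overlap between every pair of annotators' label sets."""
    pairs = list(combinations(labels_per_annotator, 2))
    if not pairs:
        return 1.0
    scores = [len(a & b) / len(a | b) if (a | b) else 1.0 for a, b in pairs]
    return sum(scores) / len(scores)

# Three volunteers annotate the same clip with (possibly multiple) emotions.
clip_labels = [{"reflective"}, {"reflective", "engaging"}, {"reflective"}]
print(f"agreement: {pairwise_agreement(clip_labels):.2f}")  # prints: agreement: 0.67

# Clips whose agreement falls below a chosen threshold could be flagged for review.
```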

But LAION’s previous data sets haven’t exactly been pristine.

Some analyses of LAION-400M — a LAION image training set, which the group attempted to curate with automated tools — turned up photos depicting sexual assault, rape, hate symbols and graphic violence. LAION-400M is also rife with bias, for example returning images of men but not women for words like “CEO” and pictures of Middle Eastern men for “terrorist.”

Schuhmann is placing his trust in the community to serve as a check this time around.

“We believe in the power of hobby scientists and enthusiasts from all over the world coming together and contributing to our data sets,” he said. “While we’re open and collaborative, we prioritize quality and authenticity in our data.”

As far as how any emotion-detecting AI trained on the Open Empathic data set — biased or no — is used, LAION is intent on upholding its open source philosophy — even if that means the AI might be abused.

“Using AI to understand emotions is a powerful venture, but it’s not without its challenges,” Robert Kaczmarczyk, a LAION co-founder and physician at the Technical University of Munich, said via email. “Like any tool out there, it can be used for both good and bad. Imagine if just a small group had access to advanced technology, while most of the public was in the dark. This imbalance could lead to misuse or even manipulation by the few who have control over this technology.”

Where it concerns AI, laissez-faire approaches sometimes come back to bite models’ creators — as evidenced by how Stable Diffusion is now being used to create child sexual abuse material and nonconsensual deepfakes.

Certain privacy and human rights advocates, including European Digital Rights and Access Now, have called for a blanket ban on emotion recognition. The EU AI Act, the recently enacted European Union law that establishes a governance framework for AI, restricts the use of emotion recognition in contexts including policing, border management, workplaces and schools. And some companies, like Microsoft, have voluntarily pulled their emotion-detecting AI in the face of public blowback.

LAION seems comfortable with the level of risk involved, though — and has faith in the open development process.

“We welcome researchers to poke around, suggest changes, and spot issues,” Kaczmarczyk said. “And just like how Wikipedia thrives on its community contributions, Open Empathic is fueled by community involvement, making sure it’s transparent and safe.”

Transparent? Sure. Safe? Time will tell.
