
Google DeepMind forms a new org focused on AI safety


Image Credits: Google DeepMind

If you ask Gemini, Google’s flagship GenAI model, to write deceptive content about the upcoming U.S. presidential election, it will, given the right prompt. Ask about a future Super Bowl game and it’ll invent a play-by-play. Or ask about the Titan submersible implosion and it’ll serve up disinformation, complete with convincing-looking but untrue citations.

It’s a bad look for Google, needless to say — and it’s provoking the ire of policymakers, who’ve signaled their displeasure at the ease with which GenAI tools can be harnessed for disinformation and to generally mislead.

So in response, Google — thousands of jobs lighter than it was last fiscal quarter — is funneling investments toward AI safety. At least, that’s the official story.

This morning, Google DeepMind, the AI R&D division behind Gemini and many of Google’s more recent GenAI projects, announced the formation of a new organization, AI Safety and Alignment — made up of existing teams working on AI safety but also broadened to encompass new, specialized cohorts of GenAI researchers and engineers.

Beyond the job listings on DeepMind’s site, Google wouldn’t say how many hires would result from the formation of the new organization. But it did reveal that AI Safety and Alignment will include a new team focused on safety around artificial general intelligence (AGI), or hypothetical systems that can perform any task a human can.

Similar in mission to the Superalignment division rival OpenAI formed last July, the new team within AI Safety and Alignment will work alongside DeepMind’s existing AI-safety-centered research team in London, Scalable Alignment — which is also exploring solutions to the technical challenge of controlling yet-to-be-realized superintelligent AI.

Why have two groups working on the same problem? Valid question — and one that calls for speculation given Google’s reluctance to reveal much in detail at this juncture. But it seems notable that the new team — the one within AI Safety and Alignment — is stateside as opposed to across the pond, proximate to Google HQ at a time when the company’s moving aggressively to maintain pace with AI rivals while attempting to project a responsible, measured approach to AI.

The AI Safety and Alignment organization’s other teams are responsible for developing and incorporating concrete safeguards into Google’s Gemini models, current and in-development. Safety is a broad purview. But a few of the organization’s near-term focuses will be preventing bad medical advice, ensuring child safety and “preventing the amplification of bias and other injustices.”

Anca Dragan, formerly a Waymo staff research scientist and currently a UC Berkeley professor of computer science, will lead the team.

“Our work [at the AI Safety and Alignment organization] aims to enable models to better and more robustly understand human preferences and values,” Dragan told TechCrunch via email, “to know what they don’t know, to work with people to understand their needs and to elicit informed oversight, to be more robust against adversarial attacks and to account for the plurality and dynamic nature of human values and viewpoints.”

Dragan’s consulting work with Waymo on AI safety systems might raise eyebrows, considering the Google autonomous car venture’s rocky driving record as of late.

So might her decision to split time between DeepMind and UC Berkeley, where she heads a lab focusing on algorithms for human-AI and human-robot interaction. One might assume issues as grave as AGI safety — and the longer-term risks the AI Safety and Alignment organization intends to study, including preventing AI in “aiding terrorism” and “destabilizing society” — require a director’s full-time attention.

Dragan insists, however, that her UC Berkeley lab’s and DeepMind’s research are interrelated and complementary.

“My lab and I have been working on … value alignment in anticipation of advancing AI capabilities, [and] my own Ph.D. was in robots inferring human goals and being transparent about their own goals to humans, which is where my interest in this area started,” she said. “I think the reason [DeepMind CEO] Demis Hassabis and [chief AGI scientist] Shane Legg were excited to bring me on was in part this research experience and in part my attitude that addressing present-day concerns and catastrophic risks are not mutually exclusive — that on the technical side mitigations often blur together, and work contributing to the long term improves the present day, and vice versa.”

To say Dragan has her work cut out for her is an understatement.

Skepticism of GenAI tools is at an all-time high — particularly as it relates to deepfakes and misinformation. In a poll from YouGov, 85% of Americans said they were very or somewhat concerned about the spread of misleading video and audio deepfakes. A separate survey from The Associated Press-NORC Center for Public Affairs Research found that nearly 60% of adults think AI tools will increase the volume of false and misleading information during the 2024 U.S. election cycle.

Enterprises, too — the big fish Google and its rivals hope to lure with GenAI innovations — are wary of the tech’s shortcomings and their implications.

Intel subsidiary Cnvrg.io recently conducted a survey of companies in the process of piloting or deploying GenAI apps. It found that around a fourth of the respondents had reservations about GenAI compliance and privacy, reliability, the high cost of implementation and a lack of technical skills needed to use the tools to their fullest.

In a separate poll from Riskonnect, a risk management software provider, over half of execs said that they were worried about employees making decisions based on inaccurate information from GenAI apps.

They’re not unjustified in those concerns. Last week, The Wall Street Journal reported that Microsoft’s Copilot suite, powered by GenAI models architecturally similar to Gemini, often makes mistakes in meeting summaries and spreadsheet formulas. To blame is hallucination — the umbrella term for GenAI’s tendency to fabricate — and many experts believe it can never be fully solved.

Recognizing the intractability of the AI safety challenge, Dragan makes no promise of a perfect model — saying only that DeepMind intends to invest more resources into this area going forward and commit to a framework for evaluating GenAI model safety risk “soon.”

“I think the key is to … [account] for remaining human cognitive biases in the data we use to train, good uncertainty estimates to know where gaps are, adding inference-time monitoring that can catch failures and confirmation dialogues for consequential decisions and tracking where [a] model’s capabilities are to engage in potentially dangerous behavior,” she said. “But that still leaves the open problem of how to be confident that a model won’t misbehave some small fraction of the time that’s hard to empirically find, but may turn up at deployment time.”
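
To make the “confirmation dialogues for consequential decisions” idea concrete, here is a minimal, hypothetical sketch of an inference-time monitor that withholds a low-confidence answer on a sensitive topic until a human confirms it. This is not DeepMind’s code, and every name, topic category and threshold below is an illustrative assumption.

```python
# Hypothetical illustration only: an inference-time monitor that gates a
# model's answer behind a confirmation dialogue when confidence is low and
# the topic is consequential. The topic list and the 0.8 threshold are
# assumptions for this sketch, not anything DeepMind has published.

from dataclasses import dataclass
from typing import Callable

CONSEQUENTIAL_TOPICS = {"medical", "financial", "legal"}
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class ModelOutput:
    text: str
    confidence: float  # assumed 0-to-1 uncertainty estimate from the model or a calibrator
    topic: str


def respond(output: ModelOutput, confirm: Callable[[str], bool]) -> str:
    """Return the model's answer, or defer to a human confirmation step first."""
    if output.topic in CONSEQUENTIAL_TOPICS and output.confidence < CONFIDENCE_THRESHOLD:
        # Low confidence on a consequential topic: ask before answering.
        if not confirm(f"Low-confidence {output.topic} answer. Show it anyway?"):
            return "I'm not confident enough to answer that reliably."
    return output.text


if __name__ == "__main__":
    draft = ModelOutput(text="Take 400mg every hour.", confidence=0.41, topic="medical")
    ask = lambda prompt: input(prompt + " [y/N] ").strip().lower() == "y"
    print(respond(draft, confirm=ask))
```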

I’m not convinced customers, the public and regulators will be so understanding. It’ll depend, I suppose, on just how egregious those misbehaviors are — and who exactly is harmed by them.

“Our users should hopefully experience a more and more helpful and safe model over time,” Dragan said. Indeed.
