Duplex shows Google failing at ethical and creative AI design

Google CEO Sundar Pichai milked the woos from a clappy, home-turf developer crowd at its I/O conference in Mountain View this week with a demo of an in-the-works voice assistant feature that will enable the AI to make telephone calls on behalf of its human owner.

The so-called ‘Duplex’ feature of the Google Assistant was shown calling a hair salon to book a woman’s hair cut, and ringing a restaurant to try to book a table, only to be told it did not accept bookings for fewer than five people.

At which point the AI changed tack and asked about wait times, earning its owner and controller, Google, the reassuring intel that there wouldn’t be a long wait at the elected time. Job done.

The voice system deployed human-sounding vocal cues, such as ‘ums’ and ‘ahs’, to make the “conversational experience more comfortable”, as Google couches it in a blog post about its intentions for the tech.

The voices Google used for the AI in the demos were not synthesized robotic tones but distinctly human-sounding, in both the female and male flavors it showcased.

Indeed, the AI pantomime was apparently realistic enough to convince some of the genuine humans on the other end of the line that they were speaking to people.

At one point the bot’s ‘mm-hmm’ response even drew appreciative laughs from a techie audience that clearly felt in on the ‘joke’.

But while the home crowd cheered enthusiastically at how capable Google had seemingly made its prototype robot caller — with Pichai going on to sketch a grand vision of the AI saving people and businesses time — the episode is worryingly suggestive of a company that views ethics as an after-the-fact consideration.

One it does not allow to trouble the trajectory of its engineering ingenuity.

A consideration which only seems to get a look in years into the AI dev process, at the cusp of a real-world rollout — which Pichai said would be coming shortly.

Deception by design

“Google’s experiments do appear to have been designed to deceive,” agreed Dr Thomas King, a researcher at the Oxford Internet Institute’s Digital Ethics Lab, discussing the Duplex demo. “Because their main hypothesis was ‘can you distinguish this from a real person?’. In this case it’s unclear why their hypothesis was about deception and not the user experience… You don’t necessarily need to deceive someone to give them a better user experience by sounding naturally. And if they had instead tested the hypothesis ‘is this technology better than preceding versions or just as good as a human caller’ they would not have had to deceive people in the experiment.

“As for whether the technology itself is deceptive, I can’t really say what their intention is — but… even if they don’t intend it to deceive you can say they’ve been negligent in not making sure it doesn’t deceive… So I can’t say it’s definitely deceptive, but there should be some kind of mechanism there to let people know what it is they are speaking to.”

“I’m at a university and if you’re going to do something which involves deception you have to really demonstrate there’s a scientific value in doing this,” he added, agreeing that, as a general principle, humans should always be able to know that an AI they’re interacting with is not a person.

Because who, or what, you’re interacting with “shapes how we interact”, as he put it. “And if you start blurring the lines… then this can sow mistrust into all kinds of interactions, where we would become more suspicious as well as needlessly replacing people with meaningless agents.”

No such ethical conversations troubled the I/O stage, however.

Yet Pichai said Google had been working on the Duplex technology for “many years”, and went so far as to claim the AI can “understand the nuances of conversation” — albeit still evidently in very narrow scenarios, such as booking an appointment or reserving a table or asking a business for its opening hours on a specific date.

“It brings together all our investments over the years in natural language understanding, deep learning, text to speech,” he said.

What was yawningly absent from that list, and seemingly also lacking from the design of the tricksy Duplex experiment, was any sense that Google has a deep and nuanced appreciation of the ethical concerns at play around AI technologies that are powerful and capable enough to pass themselves off as human, thereby playing lots of real people in the process.

The Duplex demos were pre-recorded, rather than live phone calls, but Pichai described the calls as “real” — suggesting Google representatives had not in fact called the businesses ahead of time to warn them its robots might be calling in.

“We have many of these examples where the calls quite don’t go as expected but our assistant understands the context, the nuance… and handled the interaction gracefully,” he added after airing the restaurant unable-to-book example.

So Google appears to have trained Duplex to be robustly deceptive — i.e. to be able to reroute around derailed conversational expectations and still pass itself off as human — a feature Pichai lauded as ‘graceful’.

And even if the AI’s performance was more patchy in the wild than Google’s demo suggested it’s clearly the CEO’s goal for the tech.

While trickster AIs might bring to mind the iconic Turing Test — where chatbot developers compete to develop conversational software capable of convincing human judges it’s not artificial — it should not.

Because the application of the Duplex technology does not sit within the context of a high profile and well understood competition. Nor was there a set of rules that everyone was shown and agreed to beforehand (at least so far as we know — if there were any rules Google wasn’t publicizing them). Rather it seems to have unleashed the AI onto unsuspecting business staff who were just going about their day jobs. Can you see the ethical disconnect?

“The Turing Test has come to be a bellwether of testing whether your AI software is good or not, based on whether you can tell it apart from a human being,” is King’s suggestion on why Google might have chosen a similar trick as an experimental showcase for Duplex.

“It’s very easy to say look how great our software is, people cannot tell it apart from a real human being — and perhaps that’s a much stronger selling point than if you say 90% of users preferred this software to the previous software,” he posits. “Facebook does A/B testing but that’s probably less exciting — it’s not going to wow anyone to say well consumers prefer this slightly deeper shade of blue to a lighter shade of blue.”

Had Duplex been deployed within Turing Test conditions, King also makes the point that it’s rather less likely it would have taken in so many people — because, well, those slightly jarringly timed ums and ahs would soon have been spotted, uncanny valley style.

Ergo, Google’s PR flavored ‘AI test’ for Duplex is also rigged in its favor — to further supercharge a one-way promotional marketing message around artificial intelligence. So, in other words, say hello to yet another layer of fakery.

How could Google introduce Duplex in a way that would be ethical? King reckons it would need to state up front that it’s a robot and/or use an appropriately synthetic voice so it’s immediately clear to anyone picking up the phone the caller is not human.

“If you were to use a robotic voice there would also be less of a risk that all of your voices that you’re synthesizing only represent a small minority of the population speaking in ‘BBC English’ and so, perhaps in a sense, using a robotic voice would even be less biased as well,” he adds.

And of course, not being up front that Duplex is artificial embeds all sorts of other knock-on risks, as King explained.

“If it’s not obvious that it’s a robot voice there’s a risk that people come to expect that most of these phone calls are not genuine. Now experiments have shown that many people do interact with AI software that is conversational just as they would another person but at the same time there is also evidence showing that some people do the exact opposite — and they become a lot ruder. Sometimes even abusive towards conversational software. So if you’re constantly interacting with these bots you’re not going to be as polite, maybe, as you normally would, and that could potentially have effects for when you get a genuine caller that you do not know is real or not. Or even if you know they’re real perhaps the way you interact with people has changed a bit.”

Safe to say, as autonomous systems get more powerful and capable of performing tasks that we would normally expect a human to be doing, the ethical considerations around those systems scale as rapidly as the potential applications. We’re really just getting started.

But if the world’s biggest and most powerful AI developers believe it’s totally fine to put ethics on the backburner then risks are going to spiral up and out and things could go very badly indeed.

We’ve seen, for example, how microtargeted advertising platforms have been hijacked at scale by would-be election fiddlers. But the overarching risk where AI and automation technologies are concerned is that humans become second class citizens vs the tools that are being claimed to be here to help us.

Pichai said the first — and still, as he put it, experimental — use of Duplex will be to supplement Google’s search services by filling in information about businesses’ opening times during periods when hours might inconveniently vary, such as public holidays.

Though for a company on a general mission to ‘organize the world’s information and make it universally accessible and useful’, what’s to stop Google from, down the line, deploying a vast phalanx of phone bots to ring and ask humans (and their associated businesses and institutions) for all sorts of expertise which the company can then liberally extract and inject into its multitude of connected services, monetizing the freebie human-augmented intel via our extra-engaged attention and the ads it serves alongside?

During the course of writing this article we reached out to Google’s press line several times to ask to discuss the ethics of Duplex with a relevant company spokesperson. But ironically — or perhaps fittingly enough — our hand-typed emails received only automated responses.

Pichai did emphasize that the technology is still in development, and said Google wants to “work hard to get this right, get the user experience and the expectation right for both businesses and users”.

But that’s still ethics as a tacked on afterthought — not where it should be: Locked in place as the keystone of AI system design.

And this at a time when platform-fueled AI problems, such as algorithmically fenced fake news, have snowballed into huge and ugly global scandals with very far reaching societal implications indeed — be it election interference or ethnic violence.

You really have to wonder what it would take to shake the ‘first break it, later fix it’ ethos of some of the tech industry’s major players…

Ethical guidance relating to what Google is doing here with the Duplex AI is actually pretty clear if you bother to read it — to the point where even politicians are agreed on foundational basics, such as that AI needs to operate on “principles of intelligibility and fairness”, to borrow phrasing from just one of several political reports that have been published on the topic in recent years.

In short, deception is not cool. Not in humans. And absolutely not in the AIs that are supposed to be helping us.

Transparency as AI standard

The IEEE technical professional association put out a first draft of a framework to guide ethically designed AI systems at the back end of 2016 — which included general principles such as the need to ensure AI respects human rights, operates transparently and that automated decisions are accountable. 

In the same year the UK’s BSI standards body developed a specific standard — BS 8611 Ethics design and application robots — which explicitly names identity deception (intentional or unintentional) as a societal risk, and warns that such an approach will eventually erode trust in the technology.  

“Avoid deception due to the behaviour and/or appearance of the robot and ensure transparency of robotic nature,” the BSI’s standard advises.

It also warns against anthropomorphization due to the associated risk of misinterpretation. So Duplex’s ums and ahs don’t just suck because they’re fake but because they are misleading and therefore deceptive, and so carry the knock-on risk of undermining people’s trust not only in your service but, more widely still, in other people generally.

“Avoid unnecessary anthropomorphization,” is the standard’s general guidance, with the further steer that the technique be reserved “only for well-defined, limited and socially-accepted purposes”. (Tricking workers into remotely conversing with robots probably wasn’t what they were thinking of.)

The standard also urges “clarification of intent to simulate human or not, or intended or expected behaviour”. So, yet again, don’t try and pass your bot off as human; you need to make it really clear it’s a robot.

For Duplex the transparency that Pichai said Google now intends to think about, at this late stage in the AI development process, would have been trivially easy to achieve: It could just have programmed the assistant to say up front: ‘Hi, I’m a robot calling on behalf of Google — are you happy to talk to me?’

Instead, Google chose to prioritize a demo ‘wow’ factor, showing Duplex pulling the wool over busy and trusting humans’ eyes, and by doing so showed itself tone-deaf on the topic of ethical AI design.

Not a good look for Google. Nor indeed a good outlook for the rest of us who are subject to the algorithmic whims of tech giants as they flick the control switches on their society-sized platforms.

“As the development of AI systems grows and more research is carried out, it is important that ethical hazards associated with their use are highlighted and considered as part of the design,” Dan Palmer, head of manufacturing at BSI, told us. “BS 8611 was developed… alongside scientists, academics, ethicists, philosophers and users. It explains that any autonomous system or robot should be accountable, truthful and unprejudiced.

“The standard raises a number of potential ethical hazards that are relevant to the Google Duplex; one of these is the risk of AI machines becoming sexist or racist due to a biased data feed. This surfaced prominently when Twitter users influenced Microsoft’s AI chatbot, Tay, to spew out offensive messages.

“Another contentious subject is whether forming an emotional bond with a robot is desirable, especially if the voice assistant interacts with the elderly or children. Other guidelines on new hazards that should be considered include: robot deception, robot addiction and the potential for a learning system to exceed its remit.

“Ultimately, it must always be transparent who is responsible for the behavior of any voice assistant or robot, even if it behaves autonomously.”

Yet despite all the thoughtful ethical guidance and research that’s already been produced, and is out there for the reading, here we are again being shown the same tired tech industry playbook applauding engineering capabilities in a shiny bubble, stripped of human context and societal consideration, and dangled in front of an uncritical audience to see how loud they’ll cheer.

Leaving important questions — over the ethics of Google’s AI experiments and also, more broadly, over the mainstream vision of AI assistance it’s so keenly trying to sell us — to hang and hang.

Questions like how much genuine utility there might be for the sorts of AI applications it’s telling us we’ll all want to use, even as it prepares to push these apps on us, because it can — as a consequence of its great platform power and reach.

A core ‘uncanny valley-ish’ paradox may explain Google’s choice of deception for its Duplex demo: Humans don’t necessarily like speaking to machines. Indeed, oftentimes they prefer to speak to other humans. It’s just more meaningful to have your existence registered by a fellow pulse-carrier. So if an AI reveals itself to be a robot the human who picked up the phone might well just put it straight back down again.

“Going back to the deception, it’s fine if it’s replacing meaningless interactions but not if it’s intending to replace meaningful interactions,” King told us. “So if it’s clear that it’s synthetic and you can’t necessarily use it in a context where people really want a human to do that job. I think that’s the right approach to take.

“It matters not just that your hairdresser appears to be listening to you but that they are actually listening to you and that they are mirroring some of your emotions. And to replace that kind of work with something synthetic — I don’t think it makes much sense.

“But at the same time if you reveal it’s synthetic it’s not likely to replace that kind of work.”

So really Google’s Duplex sleight of hand may be trying to conceal the fact AIs won’t be able to replace as many human tasks as technologists like to think they will. Not unless lots of currently meaningful interactions are rendered meaningless. Which would be a massive human cost that societies would have to — at very least — debate long and hard.

Trying to prevent such a debate from taking place by pretending there’s nothing ethical to see here is, hopefully, not Google’s designed intention.

King also makes the point that the Duplex system is (at least for now) computationally costly. “Which means that Google cannot and should not just release this as software that anyone can run on their home computers.

“Which means they can also control how it is used, and in what contexts — and they can also guarantee it will only be used with certain safeguards built in. So I think the experiments are maybe not the best of signs but the real test will be how they release it — and will they build the safeguards that people demand into the software,” he adds.

As well as a lack of visible safeguards in the Duplex demo, there’s also — I would argue — a curious lack of imagination on display.

Had Google been bold enough to reveal its robot interlocutor it might have thought more about how it could have designed that experience to be both clearly not human but also fun or even funny. Think of how much life can be injected into animated cartoon characters, for example, which are very clearly not human yet are hugely popular because people find them entertaining and feel they come alive in their own way.

It really makes you wonder whether, at some foundational level, Google lacks trust in both what AI technology can do and in its own creative abilities to breathe new life into these emergent synthetic experiences.
