
Where is voice tech going?


Image Credits: Luis Alvarez / Getty Images

Mark Persaud

Contributor

Mark Persaud is digital product manager and practice lead at Moonshot by Pactera, a digital innovation company that leads global clients through the next era of digital products with a heavy emphasis on artificial intelligence, data and continuous software delivery.

2020 has been anything but normal. For businesses and brands. For innovation. For people.

The trajectories of business growth strategies, travel plans and lives have been drastically altered by the COVID-19 pandemic, a global economic downturn with supply chain and market issues, and a fight for equality in the Black Lives Matter movement, on top of everything that already complicated lives and businesses.

One of the biggest stories in emerging technology is the growth of different types of voice assistants:

  • Niche assistants such as Aider that provide back-office support.
  • Branded in-house assistants such as those offered by BBC and Snapchat.
  • White-label solutions such as Houndify that provide lots of capabilities and configurable tool sets.

With so many assistants proliferating globally, voice will become a commodity like a website or an app. And that’s not a bad thing, at least in the name of progress. It will soon (read: over the next couple of years) become table stakes for a business to offer voice as an interaction channel, part of the lovable experience that users expect. Consider that feeling you get when you realize a business doesn’t have a website: It makes you question its validity and reputation for quality. Voice isn’t quite there yet, but it’s moving in that direction.

Voice assistant adoption and usage are still on the rise

Adoption is key for any new technology. Distribution is often the main inhibitor, but that has not been the case with voice. Apple, Google and Baidu have reported hundreds of millions of devices using voice, and Amazon has 200 million users. Amazon has a slightly more difficult job because it isn’t in the smartphone market, which gives Apple and Google greater voice assistant distribution.

Image Credits: Mark Persaud

But are people using these devices? Google said recently there are 500 million monthly active users of Google Assistant. Not far behind are active Apple users with 375 million. Large numbers of people are using voice assistants, not just owning them. That’s a sign of technology gaining momentum: the technology is at a price point, and within digital and personal ecosystems, that make it right for user adoption. The pandemic has only accelerated usage, as Edison reported an increase between March and April, a peak time for sheltering in place across the U.S.

Image Credits: Mark Persaud

When we look at the adoption cycle, voice is evolving in different stages. Measured by monthly active users, we are still in the early stages of voice’s overall adoption lifecycle with devices such as smartwatches. But use of smartphones has penetrated half the U.S. population. Voice search is mature, with two-thirds of the U.S. population using it because they’re comfortable with it. As with most technologies, change happens unevenly. “Voice first” doesn’t mean everyone is using voice the same way, but rather in a breadth of ways, which speaks to its applicability across contexts.

Voice is global

It’s all too easy to think of voice just in the context of the U.S. market, but voice is a global phenomenon. China accounts for 30%-40% of smart speaker sales, and its total installed base is catching up. That said, the digital context for using voice is different in China, where it’s usually tied to a super app’s ecosystem.

Regional differences become even more striking when you examine the different assistants catching on globally. The big voice assistants such as Alexa, Cortana, Google Assistant and Siri do not speak for the world.

Image Credits: Mark Persaud

This is a global technology adoption and consumer behavior movement, which makes it exceedingly exciting to be involved with and continue to explore for businesses around the world.

Voice design and sonic branding are becoming more prevalent

With all these (perhaps commoditized) voice experiences, remember that value is created by the experience and the relationship established with users. Voice design and voice user interface (VUI) creation still matter greatly, and will continue to grow in importance. It’s far too easy to create poor voice experiences. Unfortunately, the public has seen many, many poor Alexa skills or Google Actions that leave you stuck in a voice interaction loop or unable to course-correct. A poor voice user experience is frustrating for users and more harmful to a brand than a bad text-based website interaction.

That’s because a voice-based experience is less forgiving. With a poorly designed VUI, the user lacks a way to decipher the content or information further. User comments like “Where do I go from here?”, “That’s not what I asked” and “I’m not sure what to do with that information” are statements that VUI designers do not want to hear. This is, of course, provided that the user was understood by the automated speech recognition (ASR) and natural language understanding (NLU), and received a response from the voice application.

All of this decreases the user’s trust in the medium and pushes them back to, say, websites or phone calls. As a result of the bad brand experience, the user might not want to interact with the brand via the voice interface again, which will be a major setback when competitors are thriving in the space and voice commerce becomes more prevalent. It’s tempting and easy for users to try voice and say, “I like the old way better” because the old way is more reliable, or they know how to navigate it. That’s the common issue with anything new, and with change altogether.

The uptake of voice assistants reminds me of the adoption of websites into mainstream society. Websites weren’t always as helpful or as beautiful as they are today. While many factors influenced the proliferation of websites (the internet, internet speed, browser compatibility, mobile versions, etc.), it all started with content sharing and simple functionality. Over time, websites have evolved into aesthetically beautiful, eye-luring, easily navigable media.

Voice will be no different. It started with a very wide breadth of voice experiences, and it is now homing in on what works and what doesn’t for users and the brands that serve them. Next comes contextual relevancy for where those experiences are being used, and last comes personality and sonic branding.

Some brands (McDonald’s and CBS to name a few) have adopted a jingle or sonic brand. When you hear their familiar notes, you think of the brands. Those moments of familiarity pay off years of effort and user training with the voice medium.

Additionally, consider brands with strong personalities, such as Slim Jim, Headspace and Airbnb, which can draw on them to create voice-based experiences with personalities that complement their visual identities. This comes to life when the brand voice experience considers tone, timbre, intonation and lexicon, literally exuding the brand voice straight into a user’s ears. When done correctly, this will make the brand-user relationship even stronger (perhaps even reestablishing loyalty in newer generations).

Addressing 2020 head-on with voice

Contactless (commercial, public, retail) interactions

As brands address the health and safety concerns of consumers to restart their businesses, contactless interactions rise to the top. Removing (or minimizing) the physical touchpoints of a business is making people think digital-first in a quick, prioritized way because, for many businesses, their livelihood depends on it in a way not felt before. Businesses are adapting their mindset from “when I have time for digital” to “digital has to happen now.”

Using voice-enabled applications has now become a part of that transformation, covering everything from browsing, getting information and navigating to ordering products and checking out. From a personal health standpoint, using our voices is less risky than using an interface that requires touching a shared screen, or paying with and receiving unsanitized cash (activities that usually require you to be within six feet of others, especially strangers). The airport and restaurant industries will likely be the first to address these issues, as they’ve been hit hard by the pandemic and the recessionary economy.

Assisting at-home education

In the spring of 2020, many parents everywhere suddenly became de facto home schoolers as schools shut down and kids were sent home. This unbelievably stressful burden may continue into the fall. The situation is untenable. A recently published New York Times article says it all: “In the Covid-19 Economy, You Can Have a Kid or a Job. You Can’t Have Both.”

Voice is attempting to provide some relief. Google showed us one example. Earlier in 2020, Google launched a new voice assistant that helps parents who are home-schooling their kids. Titled Diya, the assistant is designed to teach children how to read. Diya uses stories and word games to help kids five and up. Diya uses Google’s speech recognition technology to spot mistakes and areas that are challenging kids. I imagine there are more ways voice can and will help parents as they attempt to manage the demands of working and home-schooling.

Empowering physical and mental health

As people sought ways to understand the health threat created by COVID-19, the Mayo Clinic introduced an Alexa skill for people to get answers to questions about COVID-19. This was an important example of how voice could contribute to the well-being of others while simplifying access.

Of course, the pandemic has created unprecedented levels of stress as people manage the health threat of an unchecked pandemic, forced isolation, and the threat of job loss and economic instability. People are struggling to cope. I see a meaningful opportunity for voice to help people manage mental health. For example, MoonPie created a virtual roommate that entertains people stuck at home in isolation — a whimsical example, to be sure, but in 2020, entertainment has taken on a more meaningful role.

Meanwhile, meditation app Headspace provides a voice-based interface to make it easier to meditate with a voice command. That kind of a tool could be a lifesaver for anyone who counts themselves among the surging numbers of people fighting mental exhaustion and stress.

Sharing workplace culture at home

The future of the workplace remains uncertain. Some companies are slowly opening their brick-and-mortar locations and offices. Others are not. Twitter famously told employees they can work at home indefinitely. This dramatic change in how we work creates new challenges, such as maintaining a sense of culture when people are not in the same place.

Voice can help here, for example by sharing customized messages among colleagues, or by using random voice Easter eggs to mimic someone stopping by your desk to share an inside joke. We miss our colleagues and their ad hoc banter, their interesting insights and their supportive attitudes (the terms “work-wife” or “work-husband” exist for a reason). Voice can help people working apart enjoy more lovable teammate moments and reinvigorate the culture we’re missing.

Supporting social awareness (and justice)

In the wake of the social equality unrest that erupted around the world, Amazon, Apple and Google made some important changes to Alexa, Siri and Google Assistant. As a number of news outlets reported, if you ask Google Assistant whether Black lives matter, it now provides a more thoughtful reply, such as, “Black people deserve the same freedoms afforded to everyone in this country, and recognizing the injustice they face is the first step towards fixing it.” If you ask whether “all lives matter,” Google Assistant replies, “Saying ‘Black Lives Matter’ doesn’t mean that all lives don’t. It means Black lives are at risk in ways that others are not.” Both Alexa and Siri respond with similarly sensitive, nuanced answers instead of “of course,” or “I don’t understand your question.”

Enterprises might do well to listen to ideas bubbling up at a grassroots level. I recently read about a Reddit user who developed a Siri shortcut that lets someone being pulled over by the police say, “Hey, Siri, I’m getting pulled over,” which prompts Siri to send their current location to a designated person and automatically start recording a video.

How might businesses go beyond using voice to make us more aware of Black Lives Matter to actually helping protect social justice and civic responsibility?

What does this all mean?

The possibilities for voice are ever expanding: It is getting smarter, more personalized, present in more contexts and able to assist with broader messaging, especially in how it fits into a brand’s digital ecosystem and, more importantly, the consumer’s ecosystem. Start investigating your voice ideas by running a voice design sprint. It’s a new world, and voice technology is shaping it.
