We’re Not Ready for a Voice Assistant That Understands Us

By Alyssa Bereznak
Nov. 3, 2016, 2:13 pm UTC • 4 min

This spring, at an open-air amphitheater typically reserved for major Bay Area concerts, Google’s vice president of product management Mario Queiroz took the stage and whipped out a cylindrical device he described as “a voice-activated remote control to the real world.” With the familiar rhetoric that most tech execs use to sell unnecessary but cool toys, he touted its abilities as a home entertainment system, its utility as a hub for smart appliances, and its capacity to learn your personal schedule. “What makes Google Home really shine is that it has search built in,” Queiroz said. “It draws on 17 years of innovation in organizing the world’s information to answer questions which are difficult for other assistants to handle.”

I would agree, based on the few days I’ve had with Google Home. But the fact that it’s a competent communicator doesn’t change the incredibly low standards we have for voice-activated technology. Before there was Google Home, or Alexa or Siri, the average person’s most common encounter with a robotic voice was over the phone, likely on some enraging Time Warner Cable customer service quest. However easy to talk to those voice bots promised to be at the beginning of a call, they buckled under the slightest complication. They were programmed to understand maybe a handful of commands. And if you even slightly mumbled, they’d ask you to repeat yourself. They were psychological torture devices in disguise.

Apple’s 2011 premiere of Siri didn’t help with our pessimism in this category, either. As my colleague Victor Luckerson noted earlier this year, Apple’s iOS-based assistant was wildly overhyped. Despite a promise from the company’s marketing head that Siri would be “amazing right out of the box,” she fumbled with our most basic inquiries. She often punted questions to a generic and incomplete web search, which always felt like a roundabout waste of time. And even today, people continue to post screenshots of her being surprisingly dumb.

But things are looking up. This Friday, Google will release its very first voice-activated speaker, a smooth little contraption called Home. Its main selling point, aside from being able to play you sweet, sweet whale sounds, is that you can speak to it like you’d speak to an actual person. No need to ask questions in the form of disjointed, caveman-like search queries like “how cook chicken fast,” or add excessive details about your question. You don’t even have to yell at it. As long as anyone in the same room begins a sentence with “OK Google” in a voice that’s above a whisper, Home will answer. Whether it has the information you need is another question entirely (it sometimes doesn’t!). But for the first time in a while, you’ll feel confident that this voice-activated assistant actually understands you. That doesn’t mean, though, that you’ll necessarily understand how to talk to it.

Our past with voice-activated technology has been clunky, slow, and aggravating at best. At the same time, we’ve become programmed to use online search engines in a specific way: stiffly and formally. Combine these two trained traits, and it figures that even if a digital assistant is good (or better) at listening to us, we are bad at talking to it. If our past experiences with voice-automated gadgets have been clunky, our own attempts to interact with them now will be, too. For real-life proof of this, I needed to look no further than my coworkers at The Ringer’s New York office.

At my request that they ask Home some questions, they, all digital-savvy people who likely have smartphone addictions, began interacting with the speaker in all the ways Google promised you wouldn’t have to. They crowded around, as if they were worried the thing wouldn’t hear them from afar. And once up close, a few people still felt the need to yell. Others worded their inquiries carefully, adding specific years and pronouns as if they were typing something into a search bar. They also asked a lot of random stuff that Home couldn’t always answer. “OK Google, do you like the Ringer dot com?” one coworker asked. “Why did N.W.A break up?” “Is Tom Hanks black?” “What does woke mean?” (Answer: “Past of wake.” Google’s not wrong.) “How do I microdose MDMA?” another queried very loudly. (Home inexplicably offered a recipe for chili, and then, when asked again, referred us to a Wikipedia page on microdosing.) “How tall is Barbara Bush?” “How tall is Hillary Clinton?” “How tall is Jake Gyllenhaal?” (Google might want to consider a partnership with celebheights.com.)

As the only person in the room who’d been briefed on Home’s capabilities by its own creators, I swooped in to demonstrate its more dazzling skills. “What sound does an anteater make?” I asked. The speaker promptly replied by playing a horrifying sound — sort of a cross between a bird chirping and a rat squeaking — and my coworkers ooohed and ahhed. Clearly, a little human training goes a long way in communicating with robots.
