No, I’m not talking about their sound. I mean, sure, developers can definitely help voice assistants relate to the average user by fine-tuning their voices to sound like people and not cyborgs. I think, though, that they should retain some robotic quality so that users clearly know what they’re interacting with as opposed to who.
How Google Now, Siri, Cortana, or any other voice service interacts with you is more important. It’s important to you, the person who wants data from the internet. It’s most important to the companies behind the services, which make money off of you via data mining and advertising.
So, voice assistants, if you want to help yourselves help me and the rest of the world, then you’ll have to pay attention to the things people say and how they say them. Because if we hate having to repeat things to humans, you can bet it’s all the more frustrating with a freaking machine. I’ll be mainly focusing on Google Now since it provides the most robust search data.
All three of the main services rely on an internet connection to interpret what you’re saying. It’s not device-side, but server-side voice processing that contextualizes any dictation mistakes either you or the service has made. This is especially true of names. It’s the difference between searching for Natalie Price (a name using a recognized word) and looking up who Natalie Prass is.
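To make the Natalie Price versus Natalie Prass point concrete, here’s a minimal sketch of matching a misheard name against names the service already knows. It assumes a tiny, hypothetical catalog (`KNOWN_ARTISTS`) and a made-up helper (`resolve_name`), and it leans on plain string similarity from Python’s standard library. It is not how Google, Apple, or Microsoft actually do it.

```python
from difflib import get_close_matches

# Hypothetical catalog of known names; a real service would use a far larger index.
KNOWN_ARTISTS = ["Natalie Prass", "My Morning Jacket", "Natalie Merchant"]

def resolve_name(transcript, known_names=KNOWN_ARTISTS, cutoff=0.6):
    """Return the closest known name, or the transcript itself if nothing is close."""
    # Compare in lowercase, but hand back the properly cased name.
    lowered = {name.lower(): name for name in known_names}
    matches = get_close_matches(transcript.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else transcript

print(resolve_name("Natalie Price"))  # Natalie Prass
```

The `cutoff` value is a guess. Set it too low and every dictated name gets "corrected" into something you never said.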
And when we ask for something, we like to get answers as fast as possible. If there doesn’t have to be another tap and another page loaded, then save me the hassle. That’s why search engines have integrated first-line results in which the answer you’re looking for, aggregated from multiple sources (or maybe just Wikipedia), is placed up top in a special card for voice assistants to read to you. Depending on what you search for, the card can be very specific or pretty broad.
When I search for “My Morning Jacket,” for example, on Google Now, a pop-up card shows me a slew of information, from social media accounts to where I can listen to The Waterfall to when the concert dates closest to me are, if applicable.
Using the data you have
But when I ask Google Now “when’s the next My Morning Jacket concert in Boston,” it just piles on a couple of featured ticket sellers who bought their way to the top of the listings I’m looking at.
The fact that search engines have fast and easy access to specific bits of data is great. But as a searcher, I don’t want to have to jump through hoops if I don’t have to.
Uber fare estimates are a perfect example of this. Now, I might want to wait the 10 seconds it takes to open up the app, figure out where I want to go and then ask for a fare estimate. The advantage is that I’d know about any applicable surge pricing happening right there and then. But I tend to plan my trips out in advance on Google Maps and usually throw Uber into the mix if I can’t take public transit. Conveniently enough, one of Maps’s transit options is to Uber, complete with estimated price, pickup and trip length details.
And while I can ask Google Now to show me most of the same navigation results Maps provides, it won’t show me Uber fares straight from the card. I can’t even ask Google Now how much a fare might be.
If your search engine can connect pieces of data together in your native suite of apps (like humans do), shouldn’t the voice assistant naturally do the same?
Don’t take this the wrong way…
If you see the post time above, let me tell you that I’ve been working all night. Irrelevant to you, I suppose. Around my parts (which I’m more than happy to share with you), though, it’s actually quite easy to go out and take a break thanks to a relative plenty of 24-hour businesses. So, if I want to pick up a soda from the nearest 24-hour convenience store …
I’d have to kick up the car I don’t have or take the public transit that isn’t running to get to those gas stations. Also, that dot closest to me is just Google teasing me that the 7-Eleven’s closed. Yep. Well, maybe if I loosen up my query a bit …
That’s getting somewhere. I mean, two of the three main results don’t fit the criteria of being open all the livelong day and that odd one out seems to just point out a pharmacist at the CVS and not the CVS itself. Terrific.
Hey, you know what, I can get a cold one at the supermarket! I wonder if one’s open right now.
Well, there you go. It doesn’t matter that I said “store,” the widest term I could use for a place to purchase things. I’d have to search for a 24-hour supermarket to get the result I’m looking for.
Clearly defining terms is important when you’re reading the fine print for instructions on how to defuse a nuclear warhead. But I just want some soda, man! Give me, and everyone else who wants to find something even the slightest bit vague, a break.
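Here’s roughly what I mean by letting “store” cover the narrower stuff, as a minimal sketch. The category tree, the listings and both function names are all made up for illustration. The only idea is that a broad term should expand to everything beneath it before the 24-hour filter kicks in.

```python
# Hypothetical category tree: a broad term maps to the narrower place types it covers.
CATEGORY_CHILDREN = {
    "store": ["convenience store", "supermarket", "pharmacy"],
}

# Made-up listings for illustration only.
PLACES = [
    {"name": "Corner Mart", "category": "convenience store", "open_24h": False},
    {"name": "Fresh Grocer", "category": "supermarket", "open_24h": True},
    {"name": "Main St. Pharmacy", "category": "pharmacy", "open_24h": False},
]

def expand_category(term):
    """Return the term plus every narrower category beneath it."""
    expanded = [term]
    for child in CATEGORY_CHILDREN.get(term, []):
        expanded.extend(expand_category(child))
    return expanded

def find_open_all_night(term, places=PLACES):
    """List names of 24-hour places whose category falls under the spoken term."""
    wanted = set(expand_category(term))
    return [p["name"] for p in places if p["category"] in wanted and p["open_24h"]]

print(find_open_all_night("store"))  # ['Fresh Grocer']
```

With that in place, asking for a 24-hour “store” turns up the supermarket without me having to guess the magic word.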
Like, um… don’t interrupt me
It’s especially egregious on Android Wear to this day: when you’re dictating a request for Google Now on the go, it’s kinda hard getting your thoughts together to shoot off a request or a text in one straight go. You probably have a few “uhh” moments in there, along with other speech fillers.
I’m not saying it should wait an hour and a half after hearing the last of your words to run with the query. Rather, if a search engine can be patient enough to actually let you finish your filler to get to the rest of your thought and then be smart enough to remove those fillers, we’d be living in a less angsty world right now. This could also apply to being rudely interrupted, where you’d follow that with a vocal (and visceral) reaction.
Sure, maybe you’d want to start a new search after something like that. But I’m just saying that if voice assistant developers want to get their products to where they think enough like humans do for me to use them more often, that’d be something on my checklist.
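The filler-stripping half of that wish isn’t exotic, either. Below is a minimal sketch that assumes the transcript arrives as plain, unpunctuated text. The filler list (`FILLERS`) and the helper name (`strip_fillers`) are my own inventions, not anything Google Now exposes.

```python
import re

# Hypothetical filler list; a real assistant would need a broader, language-aware set.
FILLERS = ["um", "umm", "uh", "uhh", "er", "erm"]

# Match fillers only as whole words so "Uber" or "concert" are left alone.
_FILLER_RE = re.compile(r"\b(?:" + "|".join(FILLERS) + r")\b\s*", re.IGNORECASE)

def strip_fillers(transcript):
    """Remove standalone filler words and tidy leftover whitespace."""
    cleaned = _FILLER_RE.sub("", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("when's the next uh My Morning Jacket concert in umm Boston"))
# when's the next My Morning Jacket concert in Boston
```

The patience part, knowing that an “uhh” means more words are coming, is the harder problem, and this sketch doesn’t touch it.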
What’s still on yours? Are you satisfied with what your voice assistant’s got? Leave a comment below.