By Jaime Rivera | July 1, 2012 9:29 PM
There’s no denying that the definition of a smartphone has shifted quite a few times in less than a decade. At first it was a phone with PIM functionality, then a PDA with phone functionality, then a phone smart enough to handle anything developers could throw at it, and as of a few months ago, it’s all about a phone that talks to you. I know many of you started reading this thinking that Voice Control is the big thing now, but as we’ve gone through the different alternatives on the market lately, it’s clear that the technology is not there yet.
Our first impression of Siri back in October 2011 was “Let’s wait and see”. We had seen so many companies try to implement voice command before and fail. If I asked any of you now whether you ever use Google Voice Actions on either an Android device or the iPhone, or the Bing Voice Search functionality on Windows Phone, I’m sure more than half of you don’t care much about it or didn’t even know it was there.
At first we were curious as to why Apple wanted to innovate in a market that hadn’t yet succeeded, but as history has shown, that’s exactly how they behave. They like to serve markets that others haven’t figured out yet. Just as Apple didn’t invent the first computer, music player, smartphone or tablet, they didn’t invent voice control either. Where Apple has shone, though, is in taking an idea and executing it as it should be. I know some of you don’t think much of Voice Control, but competitors like Samsung like the idea enough to try to compete with it.
Has Apple done it right again? Well, you could say they have, since they’ve sold more iPhones with the 4S than ever before, and the differences are mainly down to Siri. Sadly, I’ll admit that I used it very lightly during its iOS 5 period. For me, anything from a stylus to SmartStay is a gimmick if it doesn’t make my experience with the phone any better. Siri’s first iteration wasn’t there yet, but iOS 6 changes a lot of things. I carry 243 apps on my iPhone, and being able to launch an app through Siri, and watching it rarely fail even on beta 2, is an example of what I mean.
So, what would a voice assistant have to do to get you to use it? I’ll start by sharing my needs, and I hope you join me in the comments with yours:
1. It should get things done, and fast
If it can’t set an appointment, set a reminder, set an alarm or tell me the weather, it should die a quick death. If I have to learn specific commands to interact with it, and talk to it only in a certain way, it should follow suit. That was last decade’s technology. Sure, any machine talking back to you is nice to have, but it defeats its own purpose if it doesn’t help you do something better than you could on your own.
One of our biggest gripes with Samsung’s S Voice is that even though it was neat, and even better than Siri at certain commands, there were moments when it was so slow that it was faster to just do things ourselves. That shouldn’t be, no matter how new the software is.
2. It should be able to adapt
Siri does something cool that’s sadly still in its infancy. If I tell it where my house is, where my office is, or which contact card belongs to my son, it’ll remember and refer to these places and people the way I do. All I have to do is tell it to call my son, and it remembers whom to call.
Adaptability is sadly more than just applying nicknames to people and places. It should be artificially intelligent, or at least pretend it is. Google Now is an example of a step in the right direction. The phone uses your location and search history to more or less predict what your next move will be. If you’re on the road, it’ll intelligently adapt and give you traffic information, even if you haven’t asked for it. If you’re at a restaurant, it’ll give you tips about it.
Where Google Now fails is in not asking me for more. Search history and location are not enough to determine my behavior patterns. It would be awesome if Google let me tell the service what I like to do when I get home, when I get to work, and so on.
3. If it doesn’t know everything, it should at least handle the phone completely
The most difficult part of interacting with any phone is digging through the settings. I find it extremely annoying that even though Siri knows how to launch any app on iOS 6, I can’t use it to switch off Wi-Fi. How hard can that be?
For me, Voice Control has to really be in control. I want it to activate Bluetooth, password-lock my phone, switch “Do Not Disturb” on, or even run a Spotlight search for something on my phone. I want it to make Bluetooth pairing easier, know my battery status, and even tell me how long the battery will last given my current usage. Again, if I have to dig around for it, then it’s not really assisting me at all.
4. It needs to learn how to read
Something that always frustrates me about Siri is that it can only read text messages. I can’t even remember the last time I sent a text message. I’ve been using email and online chats on my phone for the last five years, so somebody should really lighten up in Cupertino. If I ask it to read me my latest email, it just pulls up the excuse that I have more than 25 emails and that it can’t read them all, even though I asked about only one.
Companies have invested a lot in dictation, so it’s clear that their software can hear and transcribe. Their next challenge should be teaching it to read. Most phones can already do this through accessibility options, but by design the phone reads the buttons as well as the contents of any email. You’d think they could apply this same concept with less detail, but they haven’t. When I’m working out, or driving, or simply being lazy, I wish my phone were smart enough to read things to me. There’s no point in voice dictation if I have to interact with the phone to read a message before I can respond.
5. If I have to press and hold a button, it sucks
Pressing and holding a button is seriously annoying. Android’s recent UI changes with Ice Cream Sandwich are kind of weird. I do voice searches a lot more often than I switch between apps. At first, I thought the whole idea of going with soft keys instead of persistent buttons on the Galaxy Nexus was that you’d be able to program them to your needs, but sadly, that requires root. I’d have more use for voice control than for the multitasking UI, so I’d rather tap and hold to multitask than to call up Voice Control. I seriously don’t understand why Google ditched the persistent search button either. They could’ve just left it there and used it as a quick shortcut to voice services.
Another good but difficult idea is to have the phone respond to a name. Sadly this is still a challenging solution, since one of our major priorities in any smartphone is for the battery to last through the day. Free services like Vlingo do a decent job of waking up to your voice, but it only works half the time. It’s a step in the right direction, even though the technology is not there yet.
The bottom line
I’ll quote Steve Jobs’ phrase from Macworld 2007: “Every once in a while, a product comes along that changes everything”. In my opinion, that product is not here yet, but Siri, S Voice, Google’s new Voice Search and Vlingo are all steps in the right direction. None is powerful enough to make voice interaction ubiquitous, but it’s bound to become part of our future if everybody keeps pushing it this vehemently. The great thing about all of these services is that their shortcomings can be fixed with software, since that’s mainly what they are. Whether they’re fixed this year or the next is just a matter of time and bright ideas. Be sure to share your thoughts on how it should work in the comments down below.