Serial entrepreneur and social media expert Gary Vaynerchuk said it best: “I don’t think anybody who’s a major internet company can live without having a major voice strategy.”
Voice will account for nearly 50% of search on mobile in the next three years, and will be the primary way consumers search for and make purchases on their phones and voice-enabled devices. Everything we do will involve voice in some way, from shopping to ordering groceries to asking what the weather is like outside. And thanks to the nearly 20 million voice-enabled smart speakers sold thus far by Amazon and Google, we’re already seeing the fight for real estate in consumers’ homes.
The smart speaker movement is taking the world by storm. We’re seeing some of the largest companies in the world, like Amazon and Google, set the tone for how and where voice can impact consumers. And why wouldn’t they? Nearly 30 million households are projected to have a voice-first device by the end of the year. While they are key in kickstarting the revolution, these smart devices won’t be the last stop on the voice train.
While the adoption of smart speakers has risen rapidly, the adoption of third-party applications, or “skills,” has been less than impressive. Fewer than 3% of skills built on top of Alexa are actually getting used, which in most cases is a result of the limited functionality of these applications. For example, Domino’s users want more than just their “easy order” or something they order over and over again. Reviews of the Starbucks integration with Alexa note that many orders never make it to the barista, which sends users back to the mobile app instead. And those who want flowers for that special someone are limited to seasonal or occasion-based options with the 1-800 Flowers integration, without a way to actually see the flowers or gifts they’re ordering.
Voice is capable of so much more.
The large platform companies have focused their energies on developing voice assistants that offer broad yet shallow integrations with third parties. As a result, “skills” just scratch the surface of what’s possible with voice. As the examples above demonstrate, while it’s relatively simple to build and launch an application for these platforms, you’re limited in what that application can actually do.
To realize the full potential of voice, companies need a more deeply integrated voice experience that is brand-specific and knows their business inside and out. In retail, for example, the art of the consumer experience lies in discovery and personalization. Very rarely do you know the exact product you’re looking for when you go to a website or browse the “most popular” products section. That’s why companies invest millions in their mobile apps and websites, helping guide you through an experience to find exactly what you’re looking for. Voice is going through an evolution of its own, helping companies understand how to build much more meaningful customer experiences. It could be as simple as knowing that when you want a new sweatshirt, repeating your last order just won’t do. You want to see it in front of you, most likely on a mobile device, and will ask a series of questions about color, price, size, material and reviews.
Imagine a voice experience where you can search for flowers for your mother, or someone special, and watch the results on the screen in front of you change based on your budget, your color preference or the type of flower. Imagine your favorite coffee chain releasing its fall selection of flavors, and being able to add pumpkin spice and a shot of vanilla, then switch from cold to hot once you’ve stepped outside.
There’s no question that Google and Amazon have ushered in a truly revolutionary time in technology, conditioning us to be comfortable using a voice-powered device or experience, but we’re only on the cusp of voice’s potential. Voice, just like the introduction of the mobile device before it, will fundamentally change how we interact with the world around us. Only when you understand that will you see how powerful voice will be.
Dr. Peter Cahill is the founder and CEO of Voysis. He has over 15 years’ experience in speech technology and neural network R&D. Peter is an active member of the speech research community, where he chairs SynSIG, the global speech synthesis special interest group, in addition to being a reviewer for all leading journals and conferences in his field.