Every day there is a new headline preaching the importance of voice search. It makes sense: competition in manufacturing has driven the cost of entry into voice-activated virtual assistants under fifty dollars and produced 14 new voice-activated virtual assistants in less than two years. The headlines are fueled by reputable sources, including Forbes, TechCrunch, and Harvard Business Review, which have researched user behavior and analyzed market trends and adoption rates of voice-enabled devices. These sources then create hyperbolic headlines that send marketers touting the findings as gospel in conferences, boardrooms, and blogs, amplifying the importance of voice search. Unfortunately, the findings are based on data that is too general to be applied to specific verticals or demographics.
Here are a few recent headlines:
- “Optimizing For Voice Search Is More Important Than Ever” – Forbes
- “The Voice Search Explosion…Will Change Local Search” – SEL
- “By 2020, 50% of Search Will Be Voice Conducted” – Geo Marketing
- “How Voice Search Changes Everything” – Yext
- “Adoption rate for Google Assistant will be 95% by 2019” – The Economist
These exciting headlines make very little sense because they fail to address the why and the how. There are key behavioral differences in the types of voice searches users perform, and none of the sources address the actions a user takes after a voice command, which for many B2C companies is the most important aspect of voice search.
Let’s define a typical voice search:
- Device receives a single spoken query.
- Device returns a single spoken result.
If the majority of our searches could be answered in a single sentence, then yes, I would believe we need to prepare for a voice search revolution. However, this is not the case: the vast majority of search queries are navigational in nature. Many other queries require refinement, or are brand-new queries that Google has never seen.
Mass adoption of voice search requires:
- Understanding of complex spoken queries.
- Capacity to deliver conversational & actionable spoken results or screen results.
The typical voice search example assumes the spoken query has been understood and answered to the user's satisfaction. A spoken result may differ from a screen result and omit valuable information. Until there is parity between spoken results and screen results, the adoption rate of voice search will continue to be hindered. Users will only fully adopt voice search when they feel they are getting the same or better experience than if they typed the query into a search engine. This is similar to the behavior of early adopters of m-dot sites, where the majority of users chose the full www experience over a slimmed-down mobile one.
The mass adoption scenario is a significant step in search technology. Google announced at the Google I/O conference in May that Google Assistant would soon be able to answer queries with conditional statements. If artificially intelligent voice-activated virtual assistants are on the horizon, then Google Home may become the leader in voice search. This may impact our marketing tactics because it could remove the visual nature of product listing ads (shopping ads), but it would not change our marketing strategy of ranking well for search queries.
Future adoption rates of voice search are highly dependent on the very difficult process of voice recognition. Passing an understood spoken query to the search engine as text is critical for adoption as misunderstood queries will lead to frustration, abandonment, or latency in adoption.
Voice Search Is Good for Single Response Queries
Voice search with a spoken result works well for single response queries, provided the query is understood and the answer satisfies the user's question. For example:
“Who was the 17th president of the U.S.?”
This is a perfect query for a voice response. Google Home will respond with: “The 17th President of the United States was Andrew Johnson.”
Another single response voice search is:
“When was Zumiez founded?”
Google Home will use Wikipedia to deliver the answer:
“Zumiez was founded in 1978 in Seattle, Washington.”
Voice Search Struggles at Decision Support
However, voice search with a spoken result does not work well for decision support: open-ended search queries where we want help making a decision. The results in these cases can be ambiguous and require review. For example:
“What is the best skate shop in Seattle?” or “Where is the best coffee near me?”
Search engines can’t interpret “best” because it is subjective, so the results are often presented as a ratings-based list in the form of:
“Here is a list of results for…”
Shopping through voice search doesn’t work well, either:
“Buy a Thrasher Tee Shirt.”
Buying a product through voice search is nearly impossible unless you have previously purchased the item through Amazon. Even with a screen result, the Amazon Echo Show forces users to fall back to the touch screen to navigate.
Improving the User Experience
In the next four years, the AI industry's growth will explode and its impact on business and society will begin to emerge. Voice response for multiple-option query results is an awful user experience, and it will be the first thing to change through improved voice recognition, context, and localization. Introducing a visual or heads-up display on a TV, monitor, watch, or contact lens could revolutionize the voice search user experience. The Amazon Echo Spot and Show both suggest Amazon is headed in this direction.
There’s no voice replacement for a decision support page. Google Home or Alexa could speak a list of results, letting us say something like “that one” when we hear a result we like.
That’s the best user experience 1985 had to offer: We used to call 411, ask a question, and have an operator give us a list of possibilities over the phone while we frantically scribbled them down.
I didn’t buy a Google Home so that I could brush up on my shorthand skills.
Who Is Using Voice-Activated Virtual Assistants?
Teens use voice-activated virtual assistants (VAVAs), including those on their phones, more than the average adult does. More than half of 1,400 teens surveyed used voice search daily.
Teens:
- 43% used a VAVA to call someone
- 38% used a VAVA to ask for directions
- 31% used a VAVA for help with homework
- 30% used a VAVA to play a song

Adults:
- 40% used a VAVA to ask for directions
- 39% used a VAVA to dictate texts
- 31% used a VAVA to call someone
We should continue to watch how voice search evolves. The most consistent published figure for the share of users conducting voice searches is 20%, and, as noted above, very few users are using voice search for shopping.