How Google engineered Android to recognize your Voice

Have you ever wondered how your Android phone recognizes the voice commands you casually throw at it with such ease and accuracy? Voice recognition on earlier versions of Android may have been below par, but it is remarkably accurate on the newer version, Android 4.1 Jelly Bean.

So, how exactly did Google raise the bar on voice recognition? How did it come up with a system that’s strikingly accurate and impressively fast? Well, the answer to those questions lies in your head. No pun intended.

According to Vincent Vanhoucke, a Google research scientist who steered the development of Google’s voice recognition system, the secret to its speed and accuracy is its design. The scientists at Google designed a neural network loosely modeled on the way the human brain works. The design change improved the accuracy of voice searches by more than 25 percent. Besides, people no longer have to talk to their phones like robots. They can now talk to their phones as casually as they talk to other people.

People are getting more comfortable with voice commands, and things that were once done at the fingertips are now done by merely moving the lips.

“It really is changing the way that people behave,” says Vanhoucke.

When you search using Google’s voice search, the spectrogram of your speech is split up and sent to eight different computers. Each piece is processed by the neural network developed by Vanhoucke, the results are collected, and the response is sent back to your smartphone. Every step occurs in the blink of an eye. Because the pieces of the spectrogram can be worked on at the same time, splitting it up speeds up the search and reduces the turnaround time for fetching results.
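To make the split-and-collect idea concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: threads stand in for the eight computers, the "spectrogram" is a toy list of frames, and `score_chunk` is a hypothetical placeholder for the neural network, not Google’s actual code.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 8  # the article mentions eight computers; threads stand in for them here


def split_frames(frames, parts):
    """Split a list of spectrogram frames into `parts` contiguous chunks."""
    size, extra = divmod(len(frames), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(frames[start:end])
        start = end
    return chunks


def score_chunk(chunk):
    """Hypothetical stand-in for the neural network: average energy per frame."""
    return [sum(frame) / len(frame) for frame in chunk]


def process_spectrogram(frames):
    """Fan the chunks out to the workers, then stitch the results back together."""
    chunks = split_frames(frames, NUM_WORKERS)
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        # map() yields results in chunk order, so the pieces reassemble correctly
        partial_results = list(pool.map(score_chunk, chunks))
    return [score for part in partial_results for score in part]


# Toy spectrogram: 20 time frames, each with 4 frequency bins
frames = [[float(t + f) for f in range(4)] for t in range(20)]
scores = process_spectrogram(frames)
print(len(scores), "frames scored")
```

The point of the sketch is the shape of the work: cut the input into pieces, process the pieces side by side, and put the answers back in their original order.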


Every language has its own neural network. For instance, the neural network for English is different from the one for German. The reason is simple: the pronunciation of words differs from one language to another. The network is developed using real-world data rather than computer-simulated data, which helps explain why it’s so accurate. It is built from sets of inputs, outputs, test cases and scenarios. The network is not programmed to understand the language. Rather, it learns to understand the language from the examples it is given. That is the key feature of a neural network: it learns.
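The idea of learning from examples rather than being programmed can be shown with a very small sketch. The code below trains a single artificial neuron (a classic perceptron) on four example input/output pairs. The task, the function names and the learning rate are all assumptions chosen for illustration; a real speech network is vastly larger and is trained differently.

```python
def train_perceptron(examples, epochs=20, learning_rate=1):
    """Adjust weights from labeled examples; no rule for the task is hard-coded."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # 0 when the guess was right
            # Nudge each weight in the direction that reduces the error
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Apply the learned weights to a new input."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Toy training data: the output should be 1 if either input is 1
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(examples)

for inputs, target in examples:
    print(inputs, "->", predict(weights, bias, inputs), "(expected", target, ")")
```

Nothing in the code says what the right answer is for any input. The neuron starts out knowing nothing and corrects itself each time it gets an example wrong.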

Like the brain’s neural network, Google’s neural network is multi-layered. The first layer tries to break the speech down into the vowels and consonants being used. Once the signal is through the first layer, the next layer tries to decipher what those vowels and consonants collectively mean.
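Layering can be sketched with a toy two-layer network. In the example below, the first layer contains two simple detectors, and the second layer combines their outputs into a decision that neither detector could make alone. The weights are hand-picked for illustration and the task (fire when exactly one input is on) is an assumption; it is not how Google’s speech network is built.

```python
def step(value):
    """Fire (1) if the weighted input is positive, otherwise stay quiet (0)."""
    return 1 if value > 0 else 0


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then a threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def two_layer_network(a, b):
    # Layer 1: two simple detectors looking at the raw inputs
    either = neuron([a, b], [1, 1], -0.5)  # fires if at least one input is on
    both = neuron([a, b], [1, 1], -1.5)    # fires only if both inputs are on
    # Layer 2: works only with layer 1's outputs, never the raw inputs
    return neuron([either, both], [1, -1], -0.5)  # "either, but not both"


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", two_layer_network(a, b))
```

The second layer never sees the raw inputs. It only interprets what the first layer found, much as the article describes a later layer making sense of the vowels and consonants picked out by an earlier one.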

What works for speech works for images too. These neural networks can be used to find structure in the pixels of an image. The first layer analyzes the edges in the image, and then another layer digs deeper into the matches found by the first layer. In this way, a neural network passes results from one layer to the next, which makes searching by voice and by image simpler.
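A rough sketch of the image case: the first stage below marks where brightness changes sharply between neighboring pixels (an edge), and the second stage looks only at those edge responses to find where a vertical boundary runs. The tiny image, the threshold and both function names are assumptions for illustration. A real network learns its edge detectors from data rather than having them written by hand.

```python
def horizontal_edges(image):
    """Stage 1: difference between each pixel and its right-hand neighbor."""
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def strongest_edge_column(edge_map, threshold=5):
    """Stage 2: find the column where the most rows report a strong edge."""
    width = len(edge_map[0])
    votes = [sum(1 for row in edge_map if row[x] > threshold) for x in range(width)]
    return votes.index(max(votes))


# Toy 3x4 grayscale image: dark on the left, bright on the right
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

edges = horizontal_edges(image)
print(edges)                         # edge strength between neighboring pixels
print(strongest_edge_column(edges))  # index of the boundary between columns
```

Here the boundary falls between the second and third pixel of every row, so the second stage reports edge column 1.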

Source: Gizmodo

Google has been researching neural networks for quite some time. Believe it or not, Google has also designed a neural network program that can recognize cats in YouTube videos.

There’s been a long-running debate over which is the better voice search engine: Android’s or Siri. When Apple released the iPhone 4S, Google’s voice recognition system was crippled and inaccurate. However, Google has continued developing its voice search engine since then. Today, Android’s voice search feature is much more advanced and accurate than Siri. Hands down.

What’s worth noting is that, unlike Apple, Google did not make voice search the centerpiece of its innovation, or launch a new flagship device to show off its ingenuity. Indeed, people need better voice search features so that they can get things done easily, but is that a good enough reason to launch a new smartphone? We don’t think so. With features like Google Now surpassing Siri in both accuracy and speed, we wonder what new trick the Cupertino giant will come up with to take down its archrival.
