While one of the less heralded parts of Apple’s impending iOS 7 update is expected to be a set of minor improvements to the Siri voice recognition system, rivals Microsoft are working on an advance of their own, with the American company’s research division claiming to have produced an enhanced smartphone voice recognition method.
The Windows Phone maker states that their new system is able to ‘process vocal commands’ at double the speed of ‘current-generation devices’ (services such as Siri and Samsung’s S Voice), using a method based upon ‘deep neural networks’ that ‘mimics human brain function’.
The service is able to write text messages and perform smart web searches using Bing, with Microsoft claiming that the results of their voice recognition tests demonstrate a 15% improvement in accuracy, measured as a reduction in word error rate.
Microsoft research program manager Michael Tjalve summarised: “For a normal sentence, you will have one less word to correct.”
Despite the obvious temptation for anyone hearing this story, Microsoft have declined to compare their system with those of Apple, Samsung or other voice-recognition software providers, claiming that other companies have taken different approaches to ‘processing vocal information’.
The Bing Voice team explained the process behind the improvements in a blog post, writing: “Over the past year, we’ve been working closely with Microsoft Research (MSR) to address limitations of the previous voice experience. To achieve the speed and accuracy improvements, we focused on an advanced approach called Deep Neural Networks (DNNs). DNN is a technology that is inspired by the functioning of neurons in the brain. In a similar way, DNN technology can detect patterns akin to the way biological systems recognize patterns.
“By coupling MSR’s major research breakthroughs in the use of DNNs with the large datasets provided by Bing’s massive index, the DNNs were able to learn more quickly and help Bing voice capabilities get noticeably closer to the way humans recognize speech. We also made a few improvements under the hood that allowed Bing to more easily identify speech patterns and cut through ambient and background noise – cutting down response time by half and improving the word error rate by 15 percent, even in noisy situations.”
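For readers unfamiliar with the metric quoted above, word error rate is conventionally calculated as the number of word-level substitutions, insertions and deletions needed to turn the recognised text into what was actually said, divided by the number of words spoken. The Python sketch below illustrates that standard calculation only; it is not Microsoft's code, the function names and sample sentences are invented for illustration, and it assumes the 15 percent figure is a relative reduction, which the blog post does not spell out.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def after_relative_reduction(baseline_wer: float, reduction: float) -> float:
    """Error rate after a relative reduction (assumed reading of '15 percent')."""
    return baseline_wer * (1.0 - reduction)


# Invented example: two of the eight spoken words are misrecognised.
spoken = "send a text message to my mother now"
recognised = "send a text massage to my brother now"
baseline = word_error_rate(spoken, recognised)
print(baseline)  # 0.25
print(after_relative_reduction(baseline, 0.15))
```

On this reading, a recogniser that previously got one word in four wrong would get a little over one in five wrong after the improvement; the true baseline for Bing's service is not given in the post.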
Set to go live in data centres in America in the near future, the service is said to return results around half a second quicker than before, but will that time saving be enough to win converts from rival smartphone voice platforms?