Microsoft Researchers Say Speech Recognition System At Par With Humans

This would mean that the speech recognition system can recognize words in a conversation just as well as humans do.

 

With machine learning set to change the landscape of our daily lives, and that of technology companies around the world, it is imperative that enterprises invest in it now and build their own capabilities in the area. Microsoft has done so, and its researchers now claim that the speech recognition system they have been working on is almost at par with humans.

What this means is that the speech recognition system can now pick out words from conversations almost as well as humans do. In technical terms, the system's error rate is almost at par with that of humans: people who transcribed the same recorded speech reached a similar level of accuracy to the Microsoft speech recognition system.

During testing, Microsoft's team of researchers found that the system's word error rate was 5.9%, which is just about at human parity, that is, the rate at which humans make errors when transcribing an audio recording.
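For readers curious about what a word error rate actually measures, here is a minimal sketch in Python. It assumes the conventional definition (substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference). The article does not describe how Microsoft scored its system, so the function name `word_error_rate` and the sample sentences are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: assumes the conventional WER definition,
    not Microsoft's actual evaluation pipeline.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one wrong word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"{wer:.1%}")  # prints 16.7%
```

By this measure, a 5.9% error rate corresponds to roughly six word errors for every hundred words in the reference transcript.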

In its official blog, the Redmond-based company said, "The 5.9% error rate is about equal to that of people who were asked to transcribe the same conversation, and it's the lowest ever recorded against the industry standard Switchboard speech recognition task." The company’s Chief Speech Scientist, Xuedong Huang, said, “We have reached human parity.”

This advance will have wide-ranging implications. From business to consumer products, speech recognition could become a real-time ‘smart’ feature of products such as the Xbox interface, where users would be able to control their gaming experience by voice. Cortana, the company’s own digital assistant, would also become sharper with the use of this speech recognition system.

But does this mean that the system will come to products soon? Not yet, and the reason is real-world conditions. The system still needs refinement to cope with background noise, the varied accents of speakers, and conversations involving multiple people. The next stage for Microsoft would be to work on the accuracy of the speech recognition system in these conditions.


TAGS: Microsoft Cortana, speech translation, Machine Learning