AI Milestone: Microsoft Voice Transcription Beats Humans

Isaac Cain
August 22, 2017

Microsoft has also highlighted challenges that still need to be addressed, such as achieving human levels of recognition in noisy environments with distant microphones, and recognizing accented speech, speaking styles or languages for which only limited training data is available.

The company's research team reached a 5.1 percent word error rate with its speech recognition system, reaching human-level accuracy and improving on the 5.9 percent word error rate the same researchers recorded a year ago. This puts the recognition engine in the same class as professional human transcribers, even though the humans can re-listen to speech as many times as they want, while the computer hears it only once.
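For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below shows that standard calculation in Python; it is an illustration only, not Microsoft's evaluation code, and the function name `word_error_rate` and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of four.
print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

On this definition, a 5.1 percent error rate corresponds to roughly one word-level error for every 20 words of reference transcript.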

Researchers at Microsoft reduced the error rate by improving the system's neural-net-based acoustic and language models. Microsoft Research says the improved accuracy, which matches the level researchers have found professional transcribers achieve, came from improved machine learning algorithms that are better able to predict words in conversation, as well as from better recognition.

The system can now transcribe human speech with a 5.1% error rate, the same error rate as humans, Microsoft technical fellow Xuedong Huang wrote in a blog post.

So while Microsoft's tech is impressive, it won't be on a par with humans in all real-world situations just yet.

The achievement arrived much sooner than Microsoft expected: in 2015, Huang told the media that a system able to surpass a human at transcription was "four to five years away".
