How AI can translate mouth movements into speech

I recently wrote about a fascinating project from a team of researchers at Oxford University who were developing an AI that could effectively lip read.  Their system, called LipNet, is, they believe, capable of lip reading considerably better than existing software.

The system was capable of reading lips with 93.4% accuracy, which is not only considerably better than most other lip-reading applications on the market, but also far superior to the best human lip readers, who manage an accuracy of around 52%.  It achieves this by looking at entire sentences rather than individual words.

Further signs of the progress being made in this area come from a second study, this one looking at speech synthesis.  It reads the movements of the mouth, just as the Oxford project did, and then translates those movements directly into intelligible speech, bypassing the need for a person to actually speak the words themselves.

Automated speech synthesis

Whilst there are obvious applications for those who cannot currently speak, this kind of lip-reading capability also opens up the possibility of more accurate annotation tools that can reliably record conversations, such as in legal or medical scenarios.

The system uses nine sensors to capture the full range of movements in the mouth area, including those of the tongue, lips and jaw.  A neural network is then trained on a large dataset to translate these movements into words, which are then sent to a vocoder to be emitted as sounds.
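To make the shape of that pipeline concrete, here is a minimal sketch of the three stages: sensor frames in, a classifier in the middle, audio-generation at the end. This is purely illustrative, not the researchers' implementation: the real system uses a trained neural network and an actual vocoder, whereas here a nearest-centroid lookup stands in for the network, the two example words and their 9-value sensor readings are invented, and the "vocoder" simply returns a text marker instead of sound.

```python
import math

# Hypothetical training data: each word paired with an averaged reading
# from the nine articulator sensors (tongue, lips, jaw). Values invented.
CENTROIDS = {
    "hello": [0.1, 0.8, 0.3, 0.5, 0.2, 0.9, 0.4, 0.6, 0.1],
    "world": [0.9, 0.2, 0.7, 0.1, 0.8, 0.3, 0.6, 0.4, 0.9],
}


def classify(frame):
    """Map one 9-sensor frame to the closest known word.

    Stands in for the trained neural network in the real system.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(CENTROIDS, key=lambda word: dist(frame, CENTROIDS[word]))


def vocoder(word):
    """Stand-in for the vocoder stage: emits a text marker, not audio."""
    return f"<audio:{word}>"


def movements_to_speech(frames):
    """Run the full pipeline: sensor frames -> words -> synthesised output."""
    return [vocoder(classify(frame)) for frame in frames]
```

A frame close to the first centroid would come out as `movements_to_speech([[0.1, 0.8, 0.3, 0.5, 0.2, 0.9, 0.4, 0.6, 0.1]])` → `["<audio:hello>"]`.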

Now, whilst the sounds that emerge are undoubtedly robotic and monotonous, they are nonetheless accurate and distinguishable, and as tech companies make headway with the speech quality of services like Siri, it's reasonable to expect this side of things to improve significantly.

Suffice it to say, there is still work to be done before this kind of technology finds its way onto the marketplace, but it's a nice sign of what is being developed in our universities, and of what might soon find its way into tangible products.

Check out the video below to see the device in action.
