Speech is a fundamental aspect of human behavior, yet it remains something that many of us struggle with. It’s believed that around 1 in 14 adults in the United States has some kind of voice disorder, and our limited understanding of such disorders makes them difficult to both diagnose and treat.
A team from MIT and the Massachusetts General Hospital believe that machine learning can play a part in better understanding speech disorders.
In a recent paper, they describe using a wearable device to collect accelerometer data and detect differences between people with Muscle Tension Dysphonia (MTD) and a control group. After the individuals with MTD received therapy for the condition, their behaviors appeared to converge with those of the control group.
“We believe this approach could help detect disorders that are exacerbated by vocal misuse, and help to empirically measure the impact of voice therapy,” the authors say. “Our long-term goal is for such a system to be used to alert patients when they are using their voices in ways that could lead to problems.”
Machine learning
The team used unsupervised learning to try to gain a better understanding of just when vocal misuse was occurring, and of the correlation between misuse and accelerometer data.
“People with vocal disorders aren’t always misusing their voices, and people without disorders also occasionally misuse their voices,” the authors explain. “The difficult task here was to build a learning algorithm that can determine what sort of vocal cord movements are prominent in subjects with a disorder.”
So, the team divided participants into two groups depending on whether they had a voice disorder or not. The two groups then went about their normal lives while wearing an accelerometer device that captured the motion of their vocal folds.
The data was then crunched, with over 110 million glottal pulses captured during the test period. There was a noticeable difference in the clustering of these pulses between the two groups, but this difference was shown to vanish when the voice disorder group received therapy for their condition.
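To make the idea of comparing pulse clusters between groups more concrete, here is a minimal sketch in Python. It is not the team's method: the article does not say which features or which algorithm were used, so plain k-means stands in for the unsupervised step, the single per-pulse feature is hypothetical, and all of the data is synthetic. The sketch clusters the pooled pulses, works out what share of each group's pulses falls into each cluster, and measures how far each patient group sits from the control group.

```python
import random


def nearest(value, centroids):
    """Index of the centroid closest to a scalar value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalar values; returns the centroids."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centroids evenly over the data range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[nearest(v, centroids)].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [sum(b) / len(b) if b else centroids[i]
                   for i, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def cluster_shares(values, centroids):
    """Fraction of a group's pulses assigned to each cluster."""
    counts = [0] * len(centroids)
    for v in values:
        counts[nearest(v, centroids)] += 1
    return [c / len(values) for c in counts]


def total_variation(p, q):
    """Distance between two cluster-share distributions (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def synthetic_pulses(rng, n, share_high):
    """Hypothetical per-pulse feature drawn from two overlapping modes."""
    return [rng.gauss(3.0, 0.3) if rng.random() < share_high
            else rng.gauss(1.0, 0.3) for _ in range(n)]


rng = random.Random(0)
# Assumed mixtures, chosen only to illustrate the comparison.
groups = {
    "control": synthetic_pulses(rng, 2000, 0.20),
    "patients_pre": synthetic_pulses(rng, 2000, 0.60),
    "patients_post": synthetic_pulses(rng, 2000, 0.25),
}

pooled = [v for vals in groups.values() for v in vals]
centroids = kmeans_1d(pooled, k=2)
shares = {name: cluster_shares(vals, centroids) for name, vals in groups.items()}

gap_pre = total_variation(shares["patients_pre"], shares["control"])
gap_post = total_variation(shares["patients_post"], shares["control"])
print(f"gap before therapy: {gap_pre:.3f}")
print(f"gap after therapy:  {gap_post:.3f}")
```

With these assumed mixtures, the gap between patients and controls shrinks after therapy, which mirrors the pattern the study reports. Real accelerometer data would need many more features per pulse and a more careful choice of model.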
The study is important as it’s the first of its kind to use machine learning to provide clear evidence of the impact voice therapy has on a patient.
“When a patient comes in for therapy, you might only be able to analyze their voice for 20 or 30 minutes to see what they’re doing incorrectly and have them practice better techniques,” the team explain. “As soon as they leave, we don’t really know how well they’re doing, and so it’s exciting to think that we could eventually give patients wearable devices that use round-the-clock data to provide more immediate feedback.”
The team hope that they will be able to further develop the approach such that it can help to diagnose specific disorders and potentially even provide insight into how disorders work.
This could potentially be done via a smartphone app that provides a level of biofeedback to help patients better manage their conditions and adopt healthier vocal behaviors in daily life.