Using Social Data To Mine For Drug-Related Side Effects

Social media provides a window into our minds in all manner of ways, with researchers using what we share online to understand a wide range of phenomenon. A recent study from the First Moscow Medical University suggests that we can do a similar thing and gain insights into the side effects of taking medicine.

The researchers propose a system whereby complaints such as feeling a little giddy that are posted on social media can now be translated into medical terms such as vertigo.

The researchers utilized recurrent neural networks and semantic vector word representation to help with their translation. The team uploaded medical texts to their system, whereby a special vocabulary was forged. This data was then used to assign a vector to each word.

“We used user comments from the web. Our network is a recurrent one, so it’s capable of memorizing. Of course, not in the literal sense of that word, because the network is not a thinking system, but there is a specific mechanism it uses to memorize texts. We upload texts to it, and it then compares them to the International Classification of Diseases (ICD). The outputs are word vectors, and words and terms often encountered in a similar context are assigned similar coordinates. Thus, the neural network “compares” user texts and official medical terms,” the researchers explain.

Text analysis

The system was able to map words such as queasy, with symptoms such as nausea, but the team believe it goes beyond mere comparison of vocabulary. This is crucial as many of the complaints people make online, don’t resemble medical terms at all.

“The importance of this research is caused by a growing demand for text data analysis. In our project, we use text analysis methods and machine learning to extract useful information from the available data,” the researchers explain.

The team believe that their work can help to improve communication in healthcare, as the language used by patients and the medicine industry are often very different, which can result in confusion and misunderstanding.

“Algorithmically speaking, this task is more like translating between different languages, albeit very similar ones. The solution lies within natural language processing. In the last years, the most successful solution for most tasks in speech and text processing have been based on deep neural networks which help determine complex regularities in data. In particular, recurrent neural networks work well with serialized data because they can find links between elements while taking consideration of the context,” they say.

The researchers are confident that their work can provide the medical industry with a greater understanding of how patients respond to the various treatments they’re given, and especially help to uncover any unwanted side affects. They plan to do more work to fine-tune the technology so that it can also take into account lifestyle factors, including diet.