The Algorithm That Could Stamp Out Online Abuse For Women

Online abuse targeted at women has become all too common on social media in recent years, often including threats of sexual violence or physical harm. New research from the Queensland University of Technology proposes an algorithm to help eradicate it.

The researchers trawled through around a million tweets before homing in on those containing three particular keywords: slut, whore, and rape.
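The article doesn't spell out the collection pipeline, but this first step amounts to a keyword filter over the tweet text. A minimal sketch, assuming a deliberately simple tokeniser and made-up sample tweets, might look like this:

```python
import re

# Keywords named in the study; the tokenisation below is an assumption,
# chosen only to keep the sketch self-contained.
KEYWORDS = {"slut", "whore", "rape"}

def contains_keyword(tweet_text: str) -> bool:
    """Return True if the tweet contains any target keyword."""
    tokens = re.findall(r"[a-z']+", tweet_text.lower())
    return any(token in KEYWORDS for token in tokens)

tweets = [
    "what a lovely day",
    "rape is never acceptable",  # keyword present, but not abusive
    "go back to the kitchen",    # abusive, but contains no filter keyword
]
candidates = [t for t in tweets if contains_keyword(t)]
print(candidates)  # -> ['rape is never acceptable']
```

Note how a pure keyword match both over- and under-selects: as the researchers point out later, the presence of a word by itself doesn't correlate with intent, which is why a classification stage weighing context has to follow.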

“At the moment, the onus is on the user to report abuse they receive. We hope our machine-learning solution can be adopted by social media platforms to automatically identify and report this content to protect women and other user groups online,” they explain.  “The key challenge in misogynistic tweet detection is understanding the context of a tweet. The complex and noisy nature of tweets makes it difficult.”

Understanding tweets

The researchers developed a text-mining system in which the algorithm learned the language as it went. It did this by first developing a base-level understanding of language and then building on this with tweet-specific knowledge and an understanding of abusive terminology.

“We implemented a deep learning algorithm called Long Short-Term Memory with Transfer Learning, which means that the machine could look back at its previous understanding of terminology and change the model as it goes, learning and developing its contextual and semantic understanding over time,” the researchers say.
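The article doesn't publish the architecture's details, but the general pattern it describes, an LSTM model whose learned layers are carried over and fine-tuned on the tweet task, can be sketched in Keras. The layer sizes, the two-stage split, and the random placeholder data below are all assumptions for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE, EMBED_DIM, SEQ_LEN = 20_000, 128, 50

# Shared layers: these hold the "previous understanding" that is
# carried over from the first task to the second.
embedding = layers.Embedding(VOCAB_SIZE, EMBED_DIM)
lstm = layers.LSTM(64)

# Stage 1: learn a base-level understanding of language on a general
# corpus (random placeholder data here, purely so the sketch runs).
base = keras.Sequential([embedding, lstm, layers.Dense(1, activation="sigmoid")])
base.compile(optimizer="adam", loss="binary_crossentropy")
x_general = np.random.randint(0, VOCAB_SIZE, size=(256, SEQ_LEN))
y_general = np.random.randint(0, 2, size=(256, 1))
base.fit(x_general, y_general, epochs=1, verbose=0)

# Stage 2: transfer -- reuse the trained embedding and LSTM layers with
# a fresh output head, then fine-tune on the smaller labelled tweet set.
# The shared layers stay trainable, so the model keeps adjusting its
# contextual understanding as it sees tweet-specific language.
transfer = keras.Sequential([embedding, lstm, layers.Dense(1, activation="sigmoid")])
transfer.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
x_tweets = np.random.randint(0, VOCAB_SIZE, size=(64, SEQ_LEN))
y_tweets = np.random.randint(0, 2, size=(64, 1))
transfer.fit(x_tweets, y_tweets, epochs=1, verbose=0)
```

Keeping the transferred layers trainable, rather than freezing them, mirrors the researchers' description of a model that "changes as it goes" rather than one fixed after pretraining.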

Although the algorithm had a robust base dictionary to work from, it was important to monitor context and intent so that it could successfully discern sarcastic comments from genuine abuse.

“Take the phrase ‘get back to the kitchen’ as an example: devoid of context of structural inequality, a machine’s literal interpretation could miss the misogynistic meaning,” they explain. “But seen with the understanding of what constitutes abusive or misogynistic language, it can be identified as a misogynistic tweet.”

The key to the project's success was this ability to teach the system to differentiate context through text alone. One of the first signs of real progress came when the algorithm correctly identified the phrase “go back to the kitchen” as misogynistic.

Spotting sexists

The algorithm was able to spot sexist and misogynistic content with an accuracy of around 75%, outperforming many similar approaches.

“Other methods based on word distribution or occurrence patterns identify abusive or misogynistic terminology, but the presence of a word by itself doesn’t necessarily correlate with intent,” the researchers explain.  “Once we had refined the 1M tweets to 5000, those tweets were then categorised as misogynistic or not based on context and intent, and were input to the machine learning classifier, which used these labelled samples to begin to build its classification model.”
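The article gives only the outline of this labelling-and-training pipeline. The sketch below uses a simple TF-IDF and logistic-regression baseline standing in for the deep model, with a tiny made-up labelled sample in place of the 5,000 hand-labelled tweets:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical labelled sample: in the study, ~5,000 tweets were
# hand-labelled misogynistic (1) or not (0) based on context and intent.
texts = ["go back to the kitchen", "I cooked dinner in the kitchen",
         "you're a slut", "rape is never acceptable"] * 50
labels = [1, 0, 1, 0] * 50

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0)

# Fit a baseline classifier on the labelled samples, then measure
# accuracy on held-out tweets, as the study reports (~75% for its model).
vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X_train), y_train)
preds = clf.predict(vec.transform(X_test))
print(f"accuracy: {accuracy_score(y_test, preds):.2f}")
```

A bag-of-words baseline like this can only memorise which words co-occur with abuse; the study's point is that its context-aware model goes beyond such word-occurrence patterns.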

The ultimate aim is for the algorithm to become part of platforms such as Twitter, giving them the ability to automatically remove misogynistic tweets. The researchers also believe the project could be expanded to cover other forms of abuse, such as racism and homophobia.

“Our end goal is to take the model to social media platforms and trial it in place,” they conclude. “If we can make identifying and removing this content easier, that can help create a safer online space for all users.”
