Can Hate Speech Be Contained Like A Virus?

The spread of hate speech has been one of the more evident negative side effects of the rise of social media, with daily examples of abuse and mistreatment visible on our screens. New research from the University of Cambridge suggests that an approach similar to the one used to quarantine computer viruses could help stem the tide.

The researchers highlight how the very definition of hate speech varies by country, and indeed by platform, which makes blocking it automatically a difficult endeavor: simply filtering keywords is a poor approach, and there is ongoing debate around freedom of speech versus the publication of harmful language.

The new research leans upon techniques honed in the cyber security world to give greater control to those who are both targeted and harmed by the hate speech, without resorting to the blunt instrument of censorship.

The researchers used a database of threats and violent insults to develop and train algorithms that detect whether messages shared online contain hate speech. The algorithm gives each message a score on a likelihood scale, and the researchers believe their ‘Hate-O-Meter’ could power a warning system that alerts users to potential exposure to hate speech, shows the severity of the warning and the name of the sender, and then offers the option to view the content or delete it unseen.
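The article doesn’t publish any implementation details, but the flow it describes (score each incoming message, warn the recipient, and let them decide) might look something like the minimal Python sketch below. The scoring function, the quarantine threshold, and the message fields are all hypothetical stand-ins, not the researchers’ actual system.

```python
from dataclasses import dataclass

# Hypothetical sketch of the quarantine flow described above.
# The classifier, threshold, and field names are illustrative only.

@dataclass
class Message:
    sender: str
    text: str

def hate_score(text: str) -> float:
    """Stand-in for the researchers' classifier: returns a 0-1 likelihood
    that the text contains hate speech. A real system would use a trained
    model; here we fake it with a trivial keyword check."""
    flagged_terms = {"<slur-1>", "<slur-2>"}  # placeholder vocabulary
    return 1.0 if any(w in flagged_terms for w in text.lower().split()) else 0.0

QUARANTINE_THRESHOLD = 0.7  # assumed cut-off; not taken from the research

def deliver(message: Message) -> None:
    score = hate_score(message.text)
    if score < QUARANTINE_THRESHOLD:
        print(message.text)  # low likelihood: show the message as normal
        return
    # Quarantined: warn the user with the severity and sender,
    # then let them choose to view it or delete it unseen.
    print(f"Warning: message from {message.sender} scored {score:.0%} "
          f"for likely hate speech.")
    choice = input("View it anyway, or delete unseen? [view/delete] ")
    if choice.strip().lower() == "view":
        print(message.text)
    else:
        print("Message deleted without being shown.")
```

The key design point the researchers stress is that the system never blocks anything outright; the final call always rests with the person receiving the message.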

Filtering malware

The approach has many similarities to that used by cybersecurity vendors to protect people from spam and malware, and the researchers believe it could significantly reduce the amount of hate speech people are exposed to as they browse the web. The system is still at an early stage, but they hope to have a prototype ready in early 2020.

“Hate speech is a form of intentional online harm, like malware, and can therefore be handled by means of quarantining,” the researchers say. “In fact, a lot of hate speech is actually generated by software such as Twitter bots.”

The researchers highlight how the main online platforms typically respond reactively to any hate speech encountered on their sites. This can work for people who only encounter hate speech sporadically, but for many, it’s far too little, far too late.

Indeed, members of many minority groups in the public eye receive hate speech simply for having an online presence, which can act as a significant deterrent to continuing in public life (or entering it in the first place). This has serious consequences, as these groups are often most in need of greater representation.

Hillary Clinton has repeatedly described such online hate speech as a threat to democracy itself, and numerous female politicians have cited online abuse as a reason for withdrawing from public life.

A complex picture

The researchers are humble enough to accept that their work doesn’t provide a complete solution to what is an incredibly complex problem. It sits within a spectrum of approaches that ranges from an extreme libertarian, laissez-faire stance that allows everything to an authoritarian one that prohibits everything.

Where their approach stands out is that the individual ultimately becomes the arbiter of their own experience. It removes the need for companies or governments to decide on their behalf what content they are exposed to.

“Our system will flag when you should be careful, but it’s always your call. It doesn’t stop people posting or viewing what they like, but it gives much needed control to those being inundated with hate,” the researchers explain.

They also believe that their approach affords a greater degree of accuracy than many systems in use today.  Indeed, they suggest that many detection algorithms struggle to get past 60% accuracy, which is scarcely better than chance.  By contrast, their system was capable of achieving accuracy levels of around 80%, with further improvements on the horizon.

This will be supported by the growing collection of training data they have available on verified hate speech, which helps to refine and improve the confidence scores that ultimately determine whether content should be quarantined or not.

Detecting nuance

The researchers highlight a seemingly straightforward challenge: identifying when a word such as ‘bitch’ is being used as a misogynistic slur and when it refers to dog breeding. They believe their algorithm can understand where such words sit in relation to the words surrounding them and detect the semantic relationships between them. This then informs the hate speech score assigned to the content.
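The article doesn’t specify the model, but one common way to capture the surrounding-word context it describes is to classify on short context windows rather than isolated keywords. The sketch below is a toy illustration of that general idea, not the researchers’ method; the tiny training set, labels, and bigram features are invented for the example.

```python
# Toy illustration of context-sensitive classification: the same word is
# labelled differently depending on the words around it. This is NOT the
# researchers' model; the training data and features are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "the breeder's bitch had a litter of six puppies",  # benign usage
    "she registered the bitch with the kennel club",    # benign usage
    "shut up you stupid bitch",                          # abusive usage
    "nobody asked you, bitch",                           # abusive usage
]
train_labels = [0, 0, 1, 1]  # 0 = benign, 1 = hateful

# Word pairs (bigrams) give the model a crude view of surrounding context,
# so "kennel ... bitch" and "stupid bitch" receive different weights.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(train_texts, train_labels)

for text in ["the bitch won best in show", "you absolute bitch"]:
    prob = model.predict_proba([text])[0][1]
    print(f"{text!r} -> hate likelihood {prob:.2f}")
```

A production system would use far richer features and much larger training data, but the principle is the same: the score depends on the whole sentence, not on any single keyword.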

“Identifying individual keywords isn’t enough, we are looking at entire sentence structures and far beyond. Sociolinguistic information in user profiles and posting histories can all help improve the classification process,” the researchers say.  “Through automated quarantines that provide guidance on the strength of hateful content, we can empower those at the receiving end of the hate speech poisoning our online discourses.”

As with the cybersecurity industry, there’s a constant arms race between those who wish to produce hate speech, and those who wish to restrict it.  Projects such as this at least show the work being done to fight the good fight.
