Social media data has been used to track real-world trends for several years now. Can the same approach be used to track the spread of COVID-19? A new dataset from researchers at Georgia State University aims to help find out.
The dataset, which has been made publicly available, contains over 140 million tweets about the virus, and builds upon the team’s previous work using this approach to track mobility patterns during natural disasters. It’s updated every two days, and the team hopes that releasing it to the public will enable other researchers to build upon the work.
“It was a big decision to make to release the data before having a few papers prepared on it, but it is for the common good,” the researchers say. “We are all on the same planet together, and any additional data that could be easily available for other researchers to analyze can make the difference. I am a big believer in open science, and this is definitely a time where it’s important to have the greatest number of eyes on the research.”
Unique insights
The team believes the data offers valuable insight into the pandemic, whether on the movement of people, diagnoses of the virus or the treatments administered. It also provides a running record of the event as it unfolds.
“This dataset,” they say, “will allow researchers to investigate the spread of misinformation relating to COVID-19, study the change in population behaviors and sentiments as the virus spreads in different geographic areas, and quantify the effects of social distancing efforts and changes in human mobility patterns over the course of the pandemic.”
The team believes the data, which has been collected since March 10, could shed light on several aspects of the pandemic, including public sentiment towards government responses and how people search for information on the outbreak.
With around 4.5 million tweets collected every day, the dataset offers a new source of real-time insight into the pandemic. The team hopes the approach might eventually lead to changes in both public behavior and policy making.
“Indirectly, by being able to tackle sources of disinformation and highlight instances of people not following rules, I believe we can get everybody to do their part in flattening the curve,” they say. “In a future scenario, having this data will allow researchers to be better prepared and build systems to detect community transmission, and devise interventions to not be in the current position we are now.”