Computer scientists have developed a new tweet-screening method that not only delivers real-time data on flu cases, but also filters out online chatter that is not linked to actual flu infections.
Sifting through social media messages has become a popular way to track when and where flu cases occur, but a key hurdle hampers the process: how to identify the tweets that report an actual flu infection. Some tweets are posted by people who have been sick with the virus, while others come from people who are merely talking about the illness. If you are tracking actual flu cases, this general conversation about the flu can skew the results.
To address this problem, Johns Hopkins computer scientists and researchers in the School of Medicine have developed a new tweet-screening method that not only delivers real-time data on flu cases, but also filters out online chatter that is not linked to actual flu infections. The method is based on analysis of 5,000 publicly available tweets per minute. Compared with other Twitter-based tracking tools, the Johns Hopkins researchers say, its real-time results track more closely with government disease data, which takes much longer to compile.
"When you look at Twitter posts, you can see people talking about being afraid of catching the flu or asking friends if they should get a flu shot or mentioning a public figure who seems to be ill," said Mark Dredze, an assistant research professor in the Department of Computer Science who uses tweets to monitor public health trends. "But posts like this don't measure how many people have actually contracted the flu. We wanted to separate hype about the flu from messages from people who truly become ill."