I know, “Who? Bought what?”
Connotate is a data fusion company that uses software bots (agents) to harvest information. Fetch Technologies, founded more than a decade ago, processes structured data.
luiy's curator insight,
January 7, 5:12 PM
TREC's tools for downloading tweets also have to allow for Twitter's strict rate limiting. TREC attendees trying to download the dataset of 16 million tweets are subject to a maximum of 180 API calls every 15 minutes. Under those limits, it would take more than two weeks to get all 16 million tweets. Researchers trying to gather their own datasets would use Twitter's streaming API, which returns a real-time feed of tweets posted to Twitter, but the publicly available version of that API is limited to a small fraction of the tweets posted to the service, around 1 percent. There is an API that allows all tweets to be collected, the "firehose," but Twitter strongly limits access to it and charges a fee that is well outside the budget of most academic research. Gnip, Twitter's reseller for firehose access, charges $0.10 per thousand tweets. It would cost TREC, whose participants have no funding at all, $16,000 to get their 16 million tweets that way.
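The rate-limit arithmetic in this insight can be sketched roughly as follows. This is a minimal illustration, not TREC's actual tooling: the function name `estimated_download_days` and the `tweets_per_call` parameter are hypothetical, and the text does not say how many tweets a single API call returns, so that value is an assumption the reader must supply.

```python
import math


def estimated_download_days(total_tweets, tweets_per_call,
                            calls_per_window=180, window_minutes=15):
    """Rough time, in days, to fetch total_tweets under a rate limit.

    calls_per_window and window_minutes default to the limits quoted in
    the text (180 calls every 15 minutes). tweets_per_call is an
    assumption: the text does not state it.
    """
    if tweets_per_call <= 0:
        raise ValueError("tweets_per_call must be positive")
    # Number of API calls needed to cover every tweet.
    calls_needed = math.ceil(total_tweets / tweets_per_call)
    # Number of rate-limit windows those calls occupy (fractional).
    windows = calls_needed / calls_per_window
    # Convert minutes to days.
    return windows * window_minutes / (60 * 24)


if __name__ == "__main__":
    # Hypothetical batch sizes, purely for comparison.
    for batch in (50, 100):
        days = estimated_download_days(16_000_000, batch)
        print(f"{batch} tweets per call: about {days:.1f} days")
```

The estimate is very sensitive to the assumed batch size: halving the number of tweets per call doubles the download time, so the "more than two weeks" figure depends on a per-call value that the text leaves unstated.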
Patrizia Bertini's curator insight,
January 8, 3:31 AM
With millions of tweets transmitted every day, Twitter has become an important historical and cultural record and an immensely useful resource for researchers of politics, history, literature, language, and anything else you can imagine.