Bits 'n Pieces on Big Data
Innovative information and insight into Big Data (if you like the content, please consider donating to my bitcoin address #3Pjof6N9xRAYXXSPZ4EAFLfHGn51ZdPcxi)
Curated by onur savas
Scooped by onur savas

Flying faster with Twitter Heron | Twitter Blogs

A new real-time analytics platform at Twitter scale.
onur savas's insight:

Possibly replacing Apache Storm.

Scooped by onur savas

Twitter Cuts Off DataSift To Step Up Its Own Big Data Business
In the push for more revenue growth, Twitter has been building up its business in areas like advertising and commerce, but a move made late Friday night..
Scooped by onur savas

All-pairs similarity via DIMSUM | Twitter Blogs
Given a dataset of sparse vector data, we solve the problem of finding all similar vector pairs according to a similarity function.
onur savas's insight:

Twitter uses this algorithm to find users, hashtags and ads that are very similar to one another, so they may be recommended and shown to users and advertisers. Source code is also provided:

Scalding GitHub pull request: https://github.com/twitter/scalding/pull/833
Spark GitHub pull request: https://github.com/apache/spark/pull/336
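
To make the problem statement concrete, here is a minimal sketch of the all-pairs similarity task itself. It is a brute-force baseline, not DIMSUM and not the code in the pull requests above: it compares every pair directly, which is exactly the cost a sampling scheme is meant to avoid at scale. The function names, the dict-based sparse format, the cosine similarity choice, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {index: value} dicts."""
    # Iterate over the shorter vector so the dot product touches fewer entries.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(val * v.get(idx, 0.0) for idx, val in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """Return (name_a, name_b, similarity) for every pair at or above threshold.

    Brute force: O(n^2) comparisons. Fine for a toy example only.
    """
    pairs = []
    for (a, u), (b, v) in combinations(sorted(vectors.items()), 2):
        s = cosine(u, v)
        if s >= threshold:
            pairs.append((a, b, s))
    return pairs


# Hypothetical toy data: each key is an item, each value a sparse feature vector.
toy = {
    "alice": {0: 1.0, 2: 1.0},
    "bob": {0: 1.0, 2: 1.0, 3: 1.0},
    "carol": {1: 1.0},
}
result = similar_pairs(toy, threshold=0.5)
```

With this toy input, only the "alice"/"bob" pair shares any nonzero coordinates, so it is the only pair that can clear the threshold.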

Scooped by onur savas

Fighting spam with BotMaker | Twitter Blogs
Spam on Twitter is different from traditional spam primarily because of two aspects of our platform: Twitter exposes developer APIs to make it easy to interact with the platform and real-time conte...
Scooped by onur savas

CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises

Locating timely and useful information during crises is critical for making potentially life-saving decisions. As the use of Twitter to broadcast useful information during such situations becomes more widespread, the problem of locating it becomes more difficult. CrisisLex is a lexicon of terms that frequently appear in crisis-relevant tweets. CrisisLex can be used to collect crisis-related messages from Twitter, and to automatically identify new terms that describe a specific crisis.
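
As a rough illustration of lexicon-based collection, the sketch below keeps only the messages that contain at least one lexicon term. The terms and messages are hypothetical placeholders, not actual CrisisLex entries, and the matching rule (whole-word, case-insensitive phrase matching) is an assumption rather than the method described in the paper.

```python
import re


def matches_lexicon(text, lexicon):
    """True if the text contains any lexicon term as a whole word or phrase."""
    # Lowercase and tokenize, then pad with spaces so only whole terms match.
    tokens = re.findall(r"[a-z0-9#']+", text.lower())
    joined = " " + " ".join(tokens) + " "
    return any(" " + term + " " in joined for term in lexicon)


def filter_messages(messages, lexicon):
    """Keep the messages that mention at least one lexicon term."""
    return [m for m in messages if matches_lexicon(m, lexicon)]


# Hypothetical lexicon and messages, for illustration only.
lexicon = {"flood warning", "evacuated", "death toll"}
messages = [
    "Flood warning issued for the river valley",
    "Great coffee this morning",
    "Thousands evacuated overnight",
]
kept = filter_messages(messages, lexicon)
```

Here `kept` contains the first and third messages; the second mentions no lexicon term and is dropped.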

Scooped by onur savas

The Happiness of Cities: Do Happy People Take Happy Images?
A team of researchers from the University of California, San Diego and The Graduate Center, City University of New York (CUNY) is one of only six groups to win one of Twitter’s inaugural #DataGrants. To do so, they beat out more than 1,300 rival proposals from around the world.
onur savas's insight:

Also, check http://selfiecity.net/

Scooped by onur savas

How Your Tweets Reveal Your Home Location | MIT Technology Review
IBM researchers have developed an algorithm that predicts your home location using your last 200 tweets.
onur savas's insight:

The paper: http://arxiv.org/ftp/arxiv/papers/1403/1403.2345.pdf

 

Scooped by onur savas

The Languages of Twitter Users

Twitter styles itself as the “global town square” for public conversations: a place for fans to gossip about an actress falling at the Academy Awards and for critics of Tunisia’s government to publicly decry the death of an opposition leader.

To measure Twitter’s global impact, Gnip, a social data firm, studied the firehose of posts over the years. The above chart tracks tweets from users who selected a primary language in their profiles since the service went live in 2006. Last year, among people who told Twitter they had a preferred language, almost 49 percent of tweets were from users who chose Japanese, Spanish, Portuguese and other languages other than English.

Scooped by onur savas

Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters
People connect to form groups on Twitter for a variety of purposes. The networks they create have identifiable contours that are shaped by the topic being discussed, the information and influencers driving the conversation, and the social network structures of the participants.
Rescooped by onur savas from Papers

Twitter Trends Help Researchers Forecast Viral Memes

What makes a meme (an idea, a phrase, an image) go viral? For starters, the meme must have broad appeal, so it can spread not just within communities of like-minded individuals but can leap from one community to the next. Researchers, by mining public Twitter data, have found that a meme's “virality” is often evident from the start. After only a few dozen tweets, a typical viral meme (as defined by tweets using a given hashtag) will already have caught on in numerous communities of Twitter users. In contrast, a meme destined to peter out will resonate in fewer groups.

 


Via Claudia Mihai, Complexity Digest
june holley's curator insight, January 23, 2014 8:31 AM

Some important ideas here for people interested in change.

Premsankar Chakkingal's curator insight, January 30, 2014 8:58 AM

Forecasting the Future Twitter Trends in hashtags

Christian Verstraete's curator insight, February 3, 2014 4:48 AM

Twitter, what happens when things go viral?

Scooped by onur savas

The Curses of Heterogeneity in Big Data
“Both theoretical and empirical research may be unnecessarily complicated by failure to recognize the effects of heterogeneity” - Vaupel & Yashin. Big Data is a daily topic of conversation among data analysts, with much said and written about its...
Scooped by onur savas

Graphics Processors Speed Up Twitter Visualizations | MIT Technology Review
A new database tool dramatically improves processing speeds using technology that’s already in your computer.
Scooped by onur savas

Twitter's Influence Problem, Visualized
BuzzFeed's internal data shows that Twitter is a trendsetter, but competing social networks drive more traffic.
Scooped by onur savas

How the #BattleForNumber10 played out on Twitter | Twitter Blogs
The campaign debate unfolds on Twitter as election season heats up.
Scooped by onur savas

The Secret Science of Retweets | MIT Technology Review
There’s a secret to persuading strangers to retweet your messages. And a machine learning algorithm has discovered it.
onur savas's insight:

The paper is at http://arxiv.org/abs/1405.3750

Scooped by onur savas

Web Science 2014 Data Challenge

The web has generated huge amounts of data at massive scale, but making sense of these datasets and representing them in a compact and easily interpretable way remains very difficult. The goal of this challenge is to encourage innovative visualizations of web data. To enable these visualizations, several large-scale, easy-to-use, publicly available datasets have been prepared:

1. Web traffic data, including more than 200 million HTTP requests from browsers to servers;
2. Twitter data, including a sample of more than 22 million tweets;
3. Social bookmarking data, consisting of about 430,000 bookmarked pages;
4. Co-authorship of academic papers, consisting of about 21.5 million papers and 10.8 million authors.

onur savas's insight:

Rules:
1. For fairness, the visualization must be primarily based on the data
that we provide. Other datasets may be used to augment ours, but these
datasets must be publicly-available and described in detail in the
documentation (see #4 below).

2. The visualization must be a static image, and must be submitted as
a PDF. In addition to the main PDF, please submit a PNG version at a
resolution of about 640x480, for display on web pages, social media
sites, mobile devices, etc. This PNG version need not contain the full
visualization, but should be an appropriate representation (e.g. a
subset of the full PDF).

3. Please include a separate PDF file containing a description of the
visualization, including: (1) name(s), affiliation(s), and contact
information of the creator(s), (2) the purpose of the visualization,
(3) which dataset(s) were used, (4) a brief description of how the
visualization was created, and (5) any other information you would
like to share with the judges.

4. By submitting your visualization, you agree to allow us to display
your visualization at the conference and on the Web Science website
and social media channels. (We will give proper attribution, of
course.) You also certify that you are the copyright holder of the
visualization and are authorized to give us this permission.

5. Entries are due by 11:59PM Hawaii time on April 15, 2014. Please
e-mail your entry to David Crandall. (If you do not receive a
confirmation email within 24 hours, your entry has not been received
and should be re-sent.)

Scooped by onur savas

Twitter #DataGrants selections | Twitter Blogs
Learn more about the six institutions we’ve selected to receive Twitter #DataGrants.
onur savas's insight:

These are the selections:

Harvard Medical School / Boston Children’s Hospital (US): Foodborne Gastrointestinal Illness Surveillance using Twitter Data
NICT (Japan): Disaster Information Analysis System
University of Twente (Netherlands): The Diffusion And Effectiveness of Cancer Early Detection Campaigns on Twitter
UCSD (US): Do happy people take happy images? Measuring happiness of cities
University of Wollongong (Australia): Using GeoSocial Intelligence to Model Urban Flooding in Jakarta, Indonesia
University of East London (UK): Exploring the relationship between Tweets and Sports Team Performance

 

Scooped by onur savas

Analyzing Tweets on Malaysia Flight #MH370
My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370 that went missing several days ...
Rescooped by onur savas from Papers

The Bursty Dynamics of the Twitter Information Network

In online social media systems users are not only posting, consuming, and resharing content, but also creating new and destroying existing connections in the underlying social network. While each of these two types of dynamics has individually been studied in the past, much less is known about the connection between the two. How does user information posting and seeking behavior interact with the evolution of the underlying social network structure?
Here, we study ways in which network structure reacts to users posting and sharing content. We examine the complete dynamics of the Twitter information network, where users post and reshare information while they also create and destroy connections. We find that the dynamics of network structure can be characterized by steady rates of change, interrupted by sudden bursts. Information diffusion in the form of cascades of post re-sharing often creates such sudden bursts of new connections, which significantly change users' local network structure. These bursts transform users' networks of followers to become structurally more cohesive as well as more homogenous in terms of follower interests. We also explore the effect of the information content on the dynamics of the network and find evidence that the appearance of new topics and real-world events can lead to significant changes in edge creations and deletions. Lastly, we develop a model that quantifies the dynamics of the network and the occurrence of these bursts as a function of the information spreading through the network. The model can successfully predict which information diffusion events will lead to bursts in network dynamics.

 

The Bursty Dynamics of the Twitter Information Network
Seth A. Myers, Jure Leskovec

http://arxiv.org/abs/1403.2732


Via Complexity Digest
Scooped by onur savas

Twitter Data Grants

With more than 500 million Tweets a day, Twitter has an expansive set of data from which we can glean insights and learn about a variety of topics, from health-related information such as when and where the flu may hit to global events like ringing in the new year. To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data. The Twitter Data Grants program aims to change that by connecting research institutions and academics with the data they need.

Scooped by onur savas

Twitter buzz about papers does not mean citations later
Analysis of science on social media service finds little correlation with standard measures of academic success.