Bits 'n Pieces on Big Data
Innovative information and insight into Big Data (if you like the content, please consider donating to my bitcoin address #3Pjof6N9xRAYXXSPZ4EAFLfHGn51ZdPcxi)
Curated by onur savas
Scooped by onur savas

Who’s More Famous Than Jesus?

And more questions answered by a just-released interactive catalog of fame compiled by a team of researchers at M.I.T.

Detecting Emotional Contagion in Massive Social Networks

Happiness and other emotions spread between people in direct contact, but it is unclear whether massive online social networks also contribute to this spread. Here, we elaborate a novel method for measuring the contagion of emotional expression. With data from millions of Facebook users, we show that rainfall directly influences the emotional content of their status messages, and it also affects the status messages of friends in other cities who are not experiencing rainfall. For every one person affected directly, rainfall alters the emotional expression of about one to two other people, suggesting that online social networks may magnify the intensity of global emotional synchrony.

Rescooped by onur savas from Papers

The Bursty Dynamics of the Twitter Information Network

In online social media systems users are not only posting, consuming, and resharing content, but also creating new and destroying existing connections in the underlying social network. While each of these two types of dynamics has individually been studied in the past, much less is known about the connection between the two. How does user information posting and seeking behavior interact with the evolution of the underlying social network structure?
Here, we study ways in which network structure reacts to users posting and sharing content. We examine the complete dynamics of the Twitter information network, where users post and reshare information while they also create and destroy connections. We find that the dynamics of network structure can be characterized by steady rates of change, interrupted by sudden bursts. Information diffusion in the form of cascades of post re-sharing often creates such sudden bursts of new connections, which significantly change users' local network structure. These bursts transform users' networks of followers to become structurally more cohesive as well as more homogenous in terms of follower interests. We also explore the effect of the information content on the dynamics of the network and find evidence that the appearance of new topics and real-world events can lead to significant changes in edge creations and deletions. Lastly, we develop a model that quantifies the dynamics of the network and the occurrence of these bursts as a function of the information spreading through the network. The model can successfully predict which information diffusion events will lead to bursts in network dynamics.

 

The Bursty Dynamics of the Twitter Information Network
Seth A. Myers, Jure Leskovec

http://arxiv.org/abs/1403.2732


Via Complexity Digest

BigData Top100

The BigData Top100 List initiative is an open community-based effort for benchmarking big data systems.

onur savas's insight:

The goal is to create a benchmarking suite that ranks the top 100 Big Data systems, much as the TOP500 list ranks supercomputers.

A related article: http://www.datanami.com/datanami/2013-03-06/a_new_benchmark_for_big_data.html?featured=top 


Yahoo! Webscope

The Yahoo Webscope Program is a reference library of interesting and scientifically useful datasets for non-commercial use by academics and other scientists. All datasets have been reviewed to conform to Yahoo’s data protection standards, including strict controls on privacy. We offer data in the following categories: Graph and Social Data, Ratings and Classification Data, Advertising and Market Data, Competition Data, Computing Systems Data, Image Data, and Language Data.

onur savas's insight:

Only available for academia though.


50+ Open Source Tools for Big Data

Open source software tools have become all the rage, especially around big data, and that is a GOOD thing. I have accumulated 3 lists that are very popular. Please let me know if you see things missing.

An Analysis of Facebook Photo Caching

onur savas's insight:

"This paper examines the workload of Facebook’s photo-serving stack and the effectiveness of the many layers of caching it employs. Facebook’s image-management infrastructure is complex and geographically distributed. It includes browser caches on end-user systems, Edge Caches at ~20 PoPs, an Origin Cache, and for some kinds of images, additional caching via Akamai. The underlying image storage layer is widely distributed, and includes multiple data centers."


Data-intensive ecology needed to understand what makes the biosphere tick

Have you looked closely at a local pond, meadow or forest--or at nature in your suburb or city--and observed changes in it over time? That's exactly what scientists are trying to do on a larger, regional to continental scale--a macrosystems biology scale.

Macrosystems biology might be called "biological sciences writ large."

Scientists funded by the National Science Foundation's (NSF) MacroSystems Biology Program are working to better detect, understand and predict the effects of climate and land-use change on organisms and ecosystems at regional to continental scales.

The researchers have published new results in this month's special issue of the journal Frontiers in Ecology and the Environment, published by the Ecological Society of America.


Evolution: quantity over quality?

When you think about evolution, 'survival of the fittest' is probably one of the first things that comes into your head. However, new research from Oxford University finds that the 'fittest' may never arrive in the first place and so aren’t around to survive.


Elsevier opens its 11M papers to text-mining

Elsevier says that it has now made it easy for scientists to extract facts and data computationally from its more than 11 million online research papers.

Ralph Poole's curator insight, February 5, 2014 10:27 AM

This is important news for those of us who work with clients in knowledge-intensive scientific industries. Ingesting and analyzing this content will have a profound impact on our ability to make connections and see patterns in scientific literature.


A co-Relational Model of Data for Large Shared Data Banks - ACM Queue

Contrary to popular belief, SQL and NoSQL are really just two sides of the same coin.
Rescooped by onur savas from Papers

Who is Dating Whom: Characterizing User Behaviors of a Large Online Dating Site

Online dating sites have become popular platforms for people to look for potential romantic partners. It is important to understand users' dating preferences in order to make better recommendations on potential dates. The message sending and replying actions of a user are strong indicators for what he/she is looking for in a potential date and reflect the user's actual dating preferences. We study how users' online dating behaviors correlate with various user attributes using a large real-world dataset from a major online dating site in China. Many of our results on user messaging behavior align with notions in social and evolutionary psychology: males tend to look for younger females while females put more emphasis on the socioeconomic status (e.g., income, education level) of a potential date. In addition, we observe that the geographic distance between two users and the photo count of users play an important role in their dating behaviors. Our results show that it is important to differentiate between users' true preferences and random selection. Some user behaviors in choosing attributes in a potential date may largely be a result of random selection. We also find that both males and females are more likely to reply to users whose attributes come closest to the stated preferences of the receivers, and there is significant discrepancy between a user's stated dating preference and his/her actual online dating behavior. These results can provide valuable guidelines to the design of a recommendation engine for potential dates.

 

Who is Dating Whom: Characterizing User Behaviors of a Large Online Dating Site
Peng Xia, Kun Tu, Bruno Ribeiro, Hua Jiang, Xiaodong Wang, Cindy Chen, Benyuan Liu, Don Towsley

http://arxiv.org/abs/1401.5710


Via Complexity Digest

Why Big Data Isn’t Necessarily Better Data | Observations, Scientific American Blog Network

Tech companies—Facebook, Google and IBM, to name a few—are quick to tout the world-changing powers of “big data” gleaned from mobile devices, Web searches, citizen science ...

Presentations - 2014 NIST Data Science Symposium

Presentations from the NIST Data Science Symposium, which took place on March 4-5, 2014.

The Parable of Google Flu: Traps in Big Data Analysis

onur savas's insight:

The paper in PDF is at http://gking.harvard.edu/files/gking/files/0314policyforumff.pdf.

 

 


BG Benchmark

BG is a benchmark to evaluate the performance of a data store for interactive social networking actions and sessions. These actions and sessions either read or update a very small amount of the entire data set.

One may use BG to compute either a Social Action Rating (SoAR) or a Socialites rating of a data store. These ratings compute the number of concurrent actions performed by a system when a fixed percentage of requests (say 95%) observe a latency equal to or lower than a pre-specified threshold (say 100 msec) with the amount of unpredictable data less than a fixed threshold (say 0.01%) for some fixed duration of time (say 10 minutes). The values in the parentheses are inputs to BG. BG's output is the SoAR and Socialites rating of its target data store.
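
To make the rating rule above concrete, here is a minimal Python sketch. It is not BG's code or API: the `Sla` and `Run` records, their field names, and the idea of picking the best rate from a list of already-measured runs are all assumptions made for illustration. It also does not model the difference between SoAR and Socialites. It only checks the four example inputs from the description (95%, 100 msec, 0.01%, 10 minutes) and reports the highest load that satisfied them.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Sla:
    """The four inputs named in the description (field names are hypothetical)."""
    min_fraction_within_latency: float = 0.95   # "say 95%"
    max_latency_ms: float = 100.0               # "say 100 msec"
    max_unpredictable_fraction: float = 0.0001  # "say 0.01%"
    min_duration_s: float = 600.0               # "say 10 minutes"


@dataclass(frozen=True)
class Run:
    """One measured run at a fixed load (hypothetical record layout)."""
    actions_per_second: float
    latencies_ms: Sequence[float]
    unpredictable_reads: int
    total_reads: int
    duration_s: float


def satisfies_sla(run: Run, sla: Sla) -> bool:
    """True if the run meets the latency, consistency, and duration limits."""
    if not run.latencies_ms or run.total_reads == 0:
        return False
    if run.duration_s < sla.min_duration_s:
        return False
    within = sum(1 for ms in run.latencies_ms if ms <= sla.max_latency_ms)
    fraction_within = within / len(run.latencies_ms)
    unpredictable = run.unpredictable_reads / run.total_reads
    return (fraction_within >= sla.min_fraction_within_latency
            and unpredictable < sla.max_unpredictable_fraction)


def rating(runs: Iterable[Run], sla: Sla) -> Optional[float]:
    """Highest action rate among runs that satisfy the SLA, or None."""
    passing = [r.actions_per_second for r in runs if satisfies_sla(r, sla)]
    return max(passing) if passing else None


if __name__ == "__main__":
    runs = [
        Run(1000, [50.0] * 95 + [200.0] * 5, 0, 10_000, 600),   # meets the SLA
        Run(2000, [50.0] * 90 + [200.0] * 10, 0, 10_000, 600),  # too many slow requests
        Run(1500, [50.0] * 100, 5, 10_000, 600),                # too much unpredictable data
    ]
    print(rating(runs, Sla()))
```

A real benchmark would have to generate the load and search for the highest passing rate itself; this sketch assumes those measurements already exist.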


Stratosphere » Next Generation Big Data Analytics Platform

Stratosphere is an Open Source platform for massively parallel big data analytics. It features a rich set of operators, advanced, iterative data flows, an efficient runtime, and automatic program optimization.

Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters

People connect to form groups on Twitter for a variety of purposes. The networks they create have identifiable contours that are shaped by the topic being discussed, the information and influencers driving the conversation, and the social network structures of the participants.

The Science of Social Interactions on the Web

Last September, Google Research Scientist Ed Chi gave the talk The Science of Social Interactions on the Web at the 3rd Stanford Conference for Computational Social Science (CompSS, http://goo.gl/v2sNTI).

In his ~30 minute talk, Ed gives a broad overview of the active research in modeling successful online social systems, how they start, and ideas to improve them, touching upon ecological population growth models, cultural anthropologist Edward Hall’s proxemics (http://goo.gl/14ZI7L), and information diffusion across language barriers.  

Hosted by the Center for Computational Social Science (http://goo.gl/BJ1Jrc), CompSS provides a platform for collaboration across the social science and computer science research communities, advancing both with theories and understandings that can guide computational analysis.


Social Media Analysis of the Syrian Conflict

But the event most heavily covered by social media is the civil war in Syria, which has now raged for almost three years. The conflict has been extensively recorded on videos which are regularly uploaded to YouTube and then tweeted around the world. All sides in the conflict seem to be engaged with numerous social media accounts.

So an interesting question is to what extent social media activity reflects the situation on the ground. That’s exactly the problem addressed today by Derek O’Callaghan at University College Dublin and a few pals. Their conclusion is that “social media activity in Syria is considerably more convoluted than reported in many other studies of online political activism that find a straightforward polarization effect.”

onur savas's insight:

The paper, "Online Social Media in the Syria Conflict: Encompassing the Extremes and the In-Betweens", can be found at arxiv.org/abs/1401.7535.


Twitter Data Grants

With more than 500 million Tweets a day, Twitter has an expansive set of data from which we can glean insights and learn about a variety of topics, from health-related information such as when and where the flu may hit to global events like ringing in the new year. To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data. Twitter Data Grants program aims to change that by connecting research institutions and academics with the data they need.


1st International Workshop on Scalable Computing For Real-Time Big Data Applications (SCRAMBL'14)

1st International Workshop on Scalable Computing For Real-Time Big Data Applications. This workshop aims at providing a venue for designers, practitioners, researchers, developers, and industrial/governmental partners to come together, present and discuss leading research results, use cases, innovative ideas, challenges, and opportunities that arise from real-time big data applications.


DARPA Open Catalog

DARPA Open Catalog contains a curated list of DARPA-sponsored software and peer-reviewed publications.

onur savas's insight:

Mostly from DARPA XDATA.

Rescooped by onur savas from Papers

Twitter Trends Help Researchers Forecast Viral Memes

What makes a meme (an idea, a phrase, an image) go viral? For starters, the meme must have broad appeal, so it can spread not just within communities of like-minded individuals but can leap from one community to the next. Researchers, by mining public Twitter data, have found that a meme's “virality” is often evident from the start. After only a few dozen tweets, a typical viral meme (as defined by tweets using a given hashtag) will already have caught on in numerous communities of Twitter users. In contrast, a meme destined to peter out will resonate in fewer groups.

 


Via Claudia Mihai, Complexity Digest
june holley's curator insight, January 23, 2014 8:31 AM

Some important ideas here for people interested in change.

Premsankar Chakkingal's curator insight, January 30, 2014 8:58 AM

Forecasting the Future Twitter Trends in hashtags

Christian Verstraete's curator insight, February 3, 2014 4:48 AM

Twitter, what happens when things go viral?