Scooped by onur savas
onto Bits 'n Pieces on Big Data R&D

Time-resolved contact data to estimate potential infection routes in hospitals

A research project that aims to uncover fundamental patterns in social dynamics and coordinated human activity through a data-driven approach.
Bits 'n Pieces on Big Data R&D
Information and insight into Big Data R&D
Curated by onur savas

Experimental evidence of massive-scale emotional contagion through social networks

"We show, via a massive (N = 689,003) experiment on Facebook, that emotional states can be transferred to others via emotional contagion, leading people to experience the same emotions without their awareness. We provide experimental evidence that emotional contagion occurs without direct interaction between people (exposure to a friend expressing an emotion is sufficient), and in the complete absence of nonverbal cues."

One Hundred Million Creative Commons Flickr Images for Research

The Flickr Creative Commons dataset is available to researchers as part of Yahoo Webscope’s datasets. It is one of the largest public multimedia datasets ever released: 99.3 million images and 0.7 million videos, all from Flickr and all under Creative Commons licensing.

The dataset (about 12 GB) consists of a photo_id, a JPEG URL or video URL, and corresponding metadata such as the title, description, camera type, and tags. About 49 million of the photos are also geotagged. What’s not there, such as comments, favorites, and social network data, can be queried from the Flickr API.

onur savas's insight:

"A back of the envelope estimation reports 10% of all photos in the world were taken in the last 12 months, and that was calculated three years ago."



From mobile phone data to the spatial structure of cities : Scientific Reports : Nature Publishing Group

Pervasive infrastructures, such as cell phone networks, make it possible to capture large amounts of human behavioral data, but they also provide information about the structure of cities and their dynamical properties. In this article, we focus on these last aspects by studying phone data recorded during 55 days in 31 Spanish cities. We first define an urban dilatation index which measures how the average distance between individuals evolves during the day, allowing us to highlight different types of city structure. We then focus on hotspots, the most crowded places in the city. We propose a parameter-free method to detect them and to test the robustness of our results. The number of these hotspots scales sublinearly with the population size, a result in agreement with previous theoretical arguments and measures on employment datasets. We study the lifetime of these hotspots and show in particular that the hierarchy of permanent ones, which constitute the 'heart' of the city, is very stable whatever the size of the city. The spatial structure of these hotspots is also of interest and allows us to distinguish different categories of cities, from monocentric and “segregated”, where the spatial distribution is very dependent on land use, to polycentric, where the spatial mixing between land uses is much more important. These results point towards the possibility of a new, quantitative classification of cities using high-resolution spatio-temporal data.
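
As a rough illustration of the dilatation idea in the abstract above, the sketch below computes a user-weighted average distance between cell locations for each hour and reports the ratio of the largest to the smallest hourly value. The function names, the (x_km, y_km, users) tuple layout, the max/min summary, and the sample counts are all assumptions made for this example; they are not taken from the paper and may differ from its exact definitions.

```python
from itertools import combinations
from math import hypot


def mean_user_distance(cells):
    """User-weighted mean distance between distinct cells.

    cells: list of (x_km, y_km, users) tuples (an assumed layout).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x1, y1, n1), (x2, y2, n2) in combinations(cells, 2):
        weight = n1 * n2  # number of user pairs spanning the two cells
        weighted_sum += weight * hypot(x2 - x1, y2 - y1)
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


def dilatation_ratio(hourly_cells):
    """Ratio of the largest to the smallest hourly mean distance.

    This max/min summary is an assumption chosen for illustration.
    """
    distances = [mean_user_distance(cells) for cells in hourly_cells.values()]
    smallest = min(distances)
    if smallest == 0:
        raise ValueError("mean distance is zero for at least one hour")
    return max(distances) / smallest


# Made-up positions and counts, for illustration only.
example = {
    "04:00": [(0.0, 0.0, 10), (6.0, 8.0, 10)],
    "12:00": [(0.0, 0.0, 10), (3.0, 4.0, 10)],
}
print(dilatation_ratio(example))
```

A ratio close to 1 would mean the typical distance between users barely changes over the day, while larger values would indicate a city that contracts and expands more strongly.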

StreetScore

"This is a collection of map visualizations of perceived safety of street views from cities in the United States. We will be releasing a map of perceived safety for a new city each week. These maps are based on StreetScore — a machine learning algorithm designed to predict how safe a street view looks to a human observer (see FAQ). The StreetScore algorithm was created by Nikhil Naik as part of a collaboration between the Macro Connections group and the Camera Culture group at MIT Media Lab. Jade Philipoom created the visualizations presented in the StreetScore website."

onur savas's insight:

As of today (6/4/14), maps for NYC, Boston, Chicago, and Detroit are available.


Finding the Zebra in a Herd of Ponies- A new look at anomaly detection

The second publication in the O’Reilly Practical Machine Learning series, subtitled A New Look at Anomaly Detection by Ted Dunning and me, is being released this week. In the previous book, which focused on practical approaches to recommendation, we started with the idea that everyone thinks “I want a pony”. Here in the second book, what we want is to find the outlier, the zebra in a herd of ponies, the fish swimming against the school of fish, the rare event.

CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises

Locating timely and useful information during crises is critical for making potentially life-saving decisions. As the use of Twitter to broadcast useful information during such situations becomes more widespread, the problem of locating it becomes more difficult. CrisisLex is a lexicon of terms that frequently appear in crisis-relevant tweets. CrisisLex can be used to collect crisis-related messages from Twitter, and to automatically identify new terms that describe a specific crisis.


RecSys 2014 (Silicon Valley) - RecSys

Foster City, Silicon Valley, USA, 6th-10th October 2014

The ACM Recommender Systems conference (RecSys) is the premier international forum for the presentation of new research results, systems and techniques in the broad field of recommender systems.

onur savas's insight:

Also of interest is the First Workshop on Recommendation Systems for Television and Online Video (RecSysTV), held in conjunction with this conference: http://boxfish.com/recsys


GraphLab | GraphLab Conference 2014

Monday, July 21, 2014 from 8:00 AM to 7:00 PM (PDT) at Hotel Nikko San Francisco 


Data Mining Reddit Posts Reveals How to Ask For a Favor--And Get it | MIT Technology Review

There’s a secret to asking strangers for something and getting it. Now data scientists say they’ve discovered it by studying successful requests on the web.
onur savas's insight:

The paper: http://arxiv.org/abs/1405.3282


Geotagging One Hundred Million Twitter Accounts with Total Variation Minimization


How One Woman Hid Her Pregnancy From Big Data

A Princeton University professor tried to hide her pregnancy from targeted online advertising for the past nine months. It wasn't easy.
onur savas's insight:

"...Pregnant women are incredibly valuable to marketers. For example, if a woman decides between Huggies and Pampers diapers, that's a valuable, long-term decision that establishes a consumption pattern. According to Vertesi, the average person's marketing data is worth 10 cents; a pregnant woman's data skyrockets to $1.50. And once targeted advertising finds a pregnant woman, it won't let up..."


Projects | DATAINTERFACES

Data Interfaces is an experimental research laboratory that aims to merge the competences of communication design, complex systems science, and computer science in the creation of interfaces between data and people.


The Happiness of Cities: Do Happy People Take Happy Images?

A team of researchers from the University of California, San Diego and The Graduate Center, City University of New York (CUNY) is one of only six groups to win one of Twitter’s inaugural #DataGrants. To do so, they beat out more than 1,300 rival proposals from around the world.
onur savas's insight:

Also, check http://selfiecity.net/


Google Replaces MapReduce With New Hyper-Scale Cloud Analytics System

Google says its old distributed computing system does not handle petabyte-scale analytics well enough.

KDD Workshop on Learning Emergencies from Social Information 2014

Mobile phone data and the content generated by hundreds of millions of users on social media such as Twitter or Facebook present continuous data streams of human social activities, and offer an unprecedented opportunity to mine and understand the structure and dynamics of social and information behavior in various situations. In this workshop we call attention to research on situations following large-scale emergencies, including natural disasters and terrorist attacks. These emergency events are now among the largest threats to national security. Over the last decade, natural disasters have affected more than 2.4 billion people. There is an indisputably increasing need for new tools to strengthen disaster resilience at all levels of society.

How can we deal with data collected from heterogeneous and potentially biased sources? How can we properly understand social dynamics during emergencies? How can we turn such understanding into tools for decision makers? To better prepare for future emergencies, it is valuable to deeply understand the context within which the research can be applied.

Data Mining Reveals the Factors Driving the Price of Bitcoins | MIT Technology Review

Two years ago a single bitcoin was worth around $5. Today it is worth around $600. Now one economist has worked out exactly what forces are behind this dramatic increase.
onur savas's insight:

Ref: arxiv.org/abs/1406.0268 : What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet Coherence Analysis


Open data saves New York City drivers from parking tickets

Here’s a great example of how making government data open can directly benefit you.


Bitcoin Transaction Network Dataset

"Bitcoin (bitcoin.org) is a digital, cryptographically secure currency. Transactions between public-key "addresses" maintained in a distributed, verified public ledger form a transaction network that can be studied by network scientists. This code processes binary-format Bitcoin .dat files generated by the Bitcoin client (bitcoin.org, tested on v0.5.3.1 or lower) into human-readable flat-file formats, retaining all available information. Furthermore, we provide a data model to facilitate storage and querying in a relational database."


Cloud computing beckons scientists

Price and flexibility appeal as data sets grow.

RMOA: Massive online data stream classifications with R & MOA

For those of you who don't know MOA: it stands for Massive On-line Analysis and is an open-source framework for building and running machine learning or data mining experiments on evolving data streams. The MOA website (http://moa.cms.waikato.ac.nz) indicates that it contains machine learning algorithms for classification, regression, clustering, outlier detection, and recommendation engines.

onur savas's insight:
MOA is especially recommended for R users who work with a lot of data or run into RAM issues when building models on large datasets. It uses a limited amount of memory.

The Secret Science of Retweets | MIT Technology Review

There’s a secret to persuading strangers to retweet your messages. And a machine learning algorithm has discovered it.
onur savas's insight:

The paper is at http://arxiv.org/abs/1405.3750


Emotions Are Data, Too

And they’re not all about us.
onur savas's insight:

Also: http://blogs.wsj.com/experts/2014/04/28/the-limits-of-emotional-intelligence/


Massachusetts Open Cloud » Rafik Hariri Institute for Computing and Computational Science & Engineering | Boston University

With the governor of Massachusetts pledging $3 million in state support, BU leaders Friday announced plans for development of a pathbreaking computing cloud that could spur economic growth and technology innovation.


Web Science 2014 Data Challenge

The web has generated huge amounts of data at massive
scale, but making sense of these datasets and representing them in a
compact and easily interpretable way remains very difficult. The goal
of this challenge is to encourage innovative visualizations of web
data. To enable this, several large-scale, easy-to-use, publicly
available datasets have been prepared:

1. Web traffic data, including more than 200 million HTTP requests
from browsers to servers;
2. Twitter data, including a sample of more than 22 million tweets;
3. Social bookmarking data, consisting of about 430,000 bookmarked pages;
4. Co-authorship of academic papers, consisting of about 21.5 million papers
and 10.8 million authors.

onur savas's insight:

Rules:
1. For fairness, the visualization must be primarily based on the data
that we provide. Other datasets may be used to augment ours, but these
datasets must be publicly available and described in detail in the
documentation (see #3 below).

2. The visualization must be a static image, and must be submitted as
a PDF. In addition to the main PDF, please submit a PNG version at a
resolution of about 640x480, for display on web pages, social media
sites, mobile devices, etc. This PNG version need not contain the full
visualization, but should be an appropriate representation (e.g. a
subset of the full PDF).

3. Please include a separate PDF file containing a description of the
visualization, including: (1) name(s), affiliation(s), and contact
information of the creator(s), (2) the purpose of the visualization,
(3) which dataset(s) were used, (4) a brief description of how the
visualization was created, and (5) any other information you would
like to share with the judges.

4. By submitting your visualization, you agree to allow us to display
your visualization at the conference and on the Web Science website
and social media channels. (We will give proper attribution, of
course.) You also certify that you are the copyright holder of the
visualization and are authorized to give us this permission.

5. Entries are due by 11:59PM Hawaii time on April 15, 2014. Please
e-mail your entry to David Crandall. (If you do not receive a
confirmation email within 24 hours, your entry has not been received
and should be re-sent.)


Looking for the Needle in a Stack of Needles: Tracking Shadow Economic Activities in the Age of Big Data | MIT Technology Review

The undocumented guys hanging out in the home-improvement-store parking lot looking for day labor, the neighborhood kids running a lemonade stand, and Al Qaeda terrorists plotting to do harm all have one thing in common: They operate in the underground economy, a shadowy zone where businesses, both legitimate and less so, transact in the currency of opportunity, away from traditional institutions and their watchful eyes.

onur savas's insight:

Check the tool from SynerScope mentioned in the article: http://www.synerscope.com/
