Bits 'n Pieces on Big Data
Innovative information and insight into Big Data (if you like the content, please consider donating to my bitcoin address #3Pjof6N9xRAYXXSPZ4EAFLfHGn51ZdPcxi)
Curated by onur savas
Scooped by onur savas

Web Science 2014 Data Challenge

The web has generated huge amounts of data at massive
scale, but making sense of these datasets and representing them in a
compact and easily-interpretable way remains very difficult. The goal
of this challenge is to encourage innovative visualizations of web
data. To enable this visualization, several large-scale, easy-to-use, publicly available datasets have been prepared:

1. Web traffic data, including more than 200 million HTTP requests
from browsers to servers;
2. Twitter data, including a sample of more than 22 million tweets;
3. Social bookmarking data, consisting of about 430,000 bookmarked pages;
4. Co-authorship of academic papers, consisting of about 21.5 million papers
and 10.8 million authors.

onur savas's insight:

Rules:
1. For fairness, the visualization must be primarily based on the data
that we provide. Other datasets may be used to augment ours, but these
datasets must be publicly available and described in detail in the
documentation (see #3 below).

2. The visualization must be a static image, and must be submitted as
a PDF. In addition to the main PDF, please submit a PNG version at a
resolution of about 640x480, for display on web pages, social media
sites, mobile devices, etc. This PNG version need not contain the full
visualization, but should be an appropriate representation (e.g. a
subset of the full PDF).

3. Please include a separate PDF file containing a description of the
visualization, including: (1) name(s), affiliation(s), and contact
information of the creator(s), (2) the purpose of the visualization,
(3) which dataset(s) were used, (4) a brief description of how the
visualization was created, and (5) any other information you would
like to share with the judges.

4. By submitting your visualization, you agree to allow us to display
your visualization at the conference and on the Web Science website
and social media channels. (We will give proper attribution, of
course.) You also certify that you are the copyright holder of the
visualization and are authorized to give us this permission.

5. Entries are due by 11:59PM Hawaii time on April 15, 2014. Please
e-mail your entry to David Crandall. (If you do not receive a
confirmation email within 24 hours, your entry has not been received
and should be re-sent.)


Looking for the Needle in a Stack of Needles: Tracking Shadow Economic Activities in the Age of Big Data | MIT Technology Review

The undocumented guys hanging out in the home-improvement-store parking lot looking for day labor, the neighborhood kids running a lemonade stand, and Al Qaeda terrorists plotting to do harm all have one thing in common: They operate in the underground economy, a shadowy zone where businesses, both legitimate and less so, transact in the currency of opportunity, away from traditional institutions and their watchful eyes.

onur savas's insight:

Check the tool from SynerScope mentioned in the article: http://www.synerscope.com/


The Limits of Big Data: A Review of Social Physics | MIT Technology Review

Tapping into big data, researchers and planners are building mathematical models of personal and civic behavior. But the models may hide rather than reveal the deepest sources of social ills.
Rescooped by onur savas from CxConferences

Massive Data Flow: Understanding the Complex Dynamics of the Web

The Web is perhaps the most complex system that we know. Its massive scale, complex dynamism, open richness, and social character mean that it may be more profitable to study it using tools and concepts appropriate for understanding nervous systems, organisms, ecosystems and society, rather than approaches more traditionally employed to engineer technology. Simultaneously, the scientists trying to understand this wide array of complex natural systems may have much to gain by considering the emerging study of the Web.

 

Massive Data Flow: Understanding the Complex Dynamics of the Web
Workshop at the ACM Web Science Conference 2014 (http://www.websci14.org)
10:00 - 18:00, June 23rd, 2014
Indiana University, Bloomington

http://sacral.c.u-tokyo.ac.jp/event/MDF_WebSci/


Via Complexity Digest

Eight (No, Nine!) Problems With Big Data

It’s a valuable tool for analysis, but don’t believe all the hype.

onur savas's insight:

Opinions and insights on Big Data by Gary Marcus and Ernest Davis. Gary Marcus is a professor of psychology at New York University and an editor of the forthcoming book “The Future of the Brain.” Ernest Davis is a professor of computer science at New York University.


How Your Tweets Reveal Your Home Location | MIT Technology Review

IBM researchers have developed an algorithm that predicts your home location using your last 200 tweets.
onur savas's insight:

The paper: http://arxiv.org/ftp/arxiv/papers/1403/1403.2345.pdf

 


Yelp Dataset Challenge | Yelp

How well can you guess a review's rating from its text alone? Can you take all of the reviews of a business and predict when it will be the most busy, or when the business is open? Can you predict if a business is good for kids? Has Wi-Fi? Has Parking? What makes a review useful, funny, or cool? Can you figure out which business a user is likely to review next? How much of a business's success is really just location, location, location? What businesses deserve their own subcategory (i.e., Szechuan or Hunan versus just "Chinese restaurants"), and can you learn this from the review text? What makes a tip useful? There is a myriad of deep, machine learning questions to tackle with this rich dataset.

onur savas's insight:

Note that the challenge is targeted at academic research. The deadline is Thursday, July 31, 2014.


Data Mining Reveals How Conspiracy Theories Emerge on Facebook | MIT Technology Review

Some people are more susceptible to conspiracy theories than others, say computational social scientists who have studied how false ideas jump the “credulity barrier” on Facebook.
onur savas's insight:

The paper, "Collective attention in the age of (mis)information," is at http://arxiv.org/abs/1403.3344 and was written by a group of scientists from NEU, INRIA, and IMT (Italy).


Facebook Creates Software That Matches Faces Almost as Well as You Do | MIT Technology Review

Facebook’s new AI research group reports a major improvement in face-processing software.
onur savas's insight:

The software uses deep learning. It looks like Google is not the only one investing in deep learning.


Massive Visualizations at CeBIT Depict The Scale of “Big Data”

At this year’s CeBIT computer trade fair in Hannover, Germany, the world’s most impressive and eccentric new technology has been on display. But between the pole-dancing droids and the robot moon monkeys, the massive data visualizations on display at the fair’s CODE_n exhibition in Hall 16 have turned heads with their artistry, execution and scale.


Lab41/Dendrite

It turns out that much of the world, both physical and virtual, can be represented as a graph. Graphs describe things that are linked together such as web pages and human societies. Like many other topics, Web technologies can make these types of powerful mathematical concepts more accessible to everyday users. Dendrite is a Lab41 exploration of ways to analyze, manipulate, version, and share extremely large graphs:

The Web frontend leverages AngularJS to provide a responsive data-driven experience.
The UI interacts with a backend instance of the Titan Distributed Graph Database.
The backend uses GraphLab, Faunus, and Jung for graph analytics.
onur savas's insight:

For Lab41: https://www.lab41.org/

 

Director Bob Gleichauf's talk at DG'13: http://www.youtube.com/watch?v=L4FiVuUckJc

Strata 2014: Joe Hellerstein and Tutti Taygerly, "Big Data Moonshots and Ground Control"

http://strataconf.com/strata2014/public/schedule/detail/33714
If Big Data is the grand challenge of our time, most analytic effort is like ground control: the...

Streamtools: A Graphical Tool for Working with Streams of Data

We see a moment coming when the collection of endless streams of data is commonplace. As this transition accelerates it is becoming increasingly apparent that our existing toolset for dealing with streams of data is lacking. Over the last 20 years we have invested heavily in tools that deal with tabulated data, from Excel, MySQL, and MATLAB to Hadoop, R, and Python+Numpy. These tools, when faced with a stream of never-ending data, fall short and diminish our creative potential.

In response to this shortfall we have created streamtools, a new, open source project by the New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It offers a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.


Projects | DATAINTERFACES

Data Interfaces is an experimental research laboratory that aims to merge the competences of communication design, complex systems science, and computer science in the creation of interfaces between data and people.


The Happiness of Cities: Do Happy People Take Happy Images?

A team of researchers from the University of California, San Diego and The Graduate Center, City University of New York (CUNY) is one of only six groups to win one of Twitter’s inaugural #DataGrants. To do so, they beat out more than 1,300 rival proposals from around the world.
onur savas's insight:

Also, check http://selfiecity.net/


Twitter #DataGrants selections | Twitter Blogs

Learn more about the six institutions we’ve selected to receive Twitter #DataGrants.
onur savas's insight:

These are the selections:

Harvard Medical School / Boston Children’s Hospital (US): Foodborne Gastrointestinal Illness Surveillance using Twitter Data
NICT (Japan): Disaster Information Analysis System
University of Twente (Netherlands): The Diffusion And Effectiveness of Cancer Early Detection Campaigns on Twitter
UCSD (US): Do happy people take happy images? Measuring happiness of cities
University of Wollongong (Australia): Using GeoSocial Intelligence to Model Urban Flooding in Jakarta, Indonesia
University of East London (UK): Exploring the relationship between Tweets and Sports Team Performance

 


The Data Mining Techniques That Reveal Our Planet's Cultural Links and Boundaries | MIT Technology Review

Studying cultural variation around the world has always been expensive, time-consuming work. Which is why the newfound ability to mine the data from location-based social networks is revolutionizing this science.
onur savas's insight:

The paper, "You Are What You Eat (and Drink): Identifying Cultural Boundaries By Analyzing Food & Drink Habits In Foursquare," is available at http://arxiv.org/abs/1404.1009


Scientific method: Statistical errors

P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume.

Data Analytics Handbook

On-the-job experiences in the Big Data Industry with employees from LinkedIn, Facebook, Yelp, Cloudera, and more.


The Last 20 Inches: Data’s Treacherous Journey from the Screen to the Mind | MIT Technology Review

Data is crucial to our lives, but it can be hard to make sense of. That’s what makes these visualization tools potentially transformative.

DIMACS Workshop 2014

DIMACS Workshop on Building Communities for Transforming Social Media Research Through New Approaches for Collecting, Analyzing, and Exploring Social Media Data
April 10 - 11, 2014
DIMACS Center, CoRE Building, Rutgers University
onur savas's insight:

Many interesting papers. For example: Matthew J. Salganik, "Wiki Surveys: Open and Quantifiable Social Data Collection."

Presented under the auspices of the DIMACS Special Focus on Information Sharing and Dynamic Data Analysis.


Zooniverse - Real Science Online

The Zooniverse is home to the internet's largest, most popular and most successful citizen science projects. Our current projects are here but plenty more are on the way. If you're new to the Zooniverse, we suggest picking a project and diving in - the same account will get you into all of our projects, and you can keep track of what you've contributed by watching 'My Zooniverse'.


Making Sense of Data (Course)

This self-paced, online course is intended for anyone who wants to learn more about how to structure, visualize, and manipulate data. This includes students, educators, researchers, journalists, and small business owners.

onur savas's insight:

A short course (March 18-April 4), though it covers the basics of data science.


Analyzing Tweets on Malaysia Flight #MH370

My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370 that went missing several days ...

The Languages of Twitter Users

Twitter styles itself as the “global town square” for public conversations, a place for fans to gossip about an actress falling at the Academy Awards and for critics of Tunisia’s government to publicly decry the death of an opposition leader.

To measure Twitter’s global impact, Gnip, a social data firm, studied the firehose of posts over the years. The above chart tracks tweets from users who selected a primary language in their profiles since the service went live in 2006. Last year, among people who told Twitter they had a preferred language, almost 49 percent of tweets were from users who chose Japanese, Spanish, Portuguese and other languages other than English.
