Research
Rescooped by Hugo Marcelo Muriel Arriaran from e-Xploration
Scoop.it!

Clustering Datasets by Complex Networks Analysis | #datascience #complexity

This paper proposes a method based on complex network analysis, devised to perform clustering on multidimensional datasets. In particular, the method maps the elements of the dataset at hand to a weighted network according to the similarity that holds among the data. Network weights are computed by transforming the Euclidean distances measured between data points according to a Gaussian model. Notably, this model depends on a parameter that controls the shape of the resulting functions. Running the Gaussian transformation with different values of the parameter makes it possible to perform multiresolution analysis, which gives important information about the number of clusters expected to be optimal or suboptimal.
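
The mapping from dataset to weighted network can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact Gaussian model, so the common form exp(-d^2 / (2 * sigma^2)) is assumed here, and the names `gaussian_weights` and `sigma` are hypothetical rather than taken from the paper.

```python
import numpy as np

def gaussian_weights(X, sigma):
    """Build a weighted adjacency matrix from a dataset.

    X     : array of shape (n_samples, n_features)
    sigma : shape parameter of the assumed Gaussian model (hypothetical name)
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting.
    diff = X[:, None, :] - X[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=-1)
    # Assumed Gaussian transformation: close points get weights near 1,
    # distant points get weights near 0.
    W = np.exp(-dist_sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

# Toy data: two nearby points and one distant point.
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])

# Sweeping the parameter gives one network per resolution.
networks = {sigma: gaussian_weights(points, sigma) for sigma in (0.5, 1.0, 5.0)}
```

Each network in `networks` could then be passed to a community detection algorithm; comparing the partitions found at different values of `sigma` is one way to read the multiresolution analysis described above.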

 

Clustering datasets by complex networks analysis
Giuliano Armano and Marco Alberto Javarone

Complex Adaptive Systems Modeling 2013, 1:5 http://dx.doi.org/10.1186/2194-3206-1-5


Via Complexity Digest, luiy
luiy's curator insight, July 28, 2013 8:11 AM

The proposed method, called DAN (standing for Datasets as Networks), makes a step forward in investigating the possibility of using complex network analysis as a proper machine learning tool. The remainder of the paper is structured as follows: Section Methods describes how to model a dataset as a complex network and gives details about multiresolution analysis. For the sake of readability, the section also briefly recalls some notions about the adopted community detection algorithm. Section Results and discussion illustrates the experiments and analyzes the corresponding results. The section also recalls some relevant notions of clustering, including two well-known algorithms, used therein for the sake of comparison. Section Conclusions ends the paper.

Rescooped by Hugo Marcelo Muriel Arriaran from e-Xploration

#Virality Prediction and Community Structure in Social Networks | #dataviz #prediction


Via luiy
luiy's curator insight, September 1, 2013 5:47 PM

How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be predicted by quantifying its early spreading pattern in terms of community concentration. The more communities a meme permeates, the more viral it is. We present a practical method to translate data about community structure into predictive knowledge about what information will spread widely. This connection contributes to our understanding in computational social science, social media analytics, and marketing applications.
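
As a rough illustration of "community concentration", the sketch below counts how many communities a meme's early adopters fall into and measures how evenly they are spread using Shannon entropy. The choice of these two measures, the function name `community_concentration`, and the toy data are assumptions made for illustration; the summary above does not specify the exact features used.

```python
from collections import Counter
from math import log

def community_concentration(early_adopters, community_of):
    """Summarize how an early cascade is spread across communities.

    early_adopters : iterable of user ids that adopted the meme early
    community_of   : dict mapping each user id to a community label
                     (every adopter must appear in this mapping)
    Returns (number of distinct communities, entropy of adopters over them).
    """
    counts = Counter(community_of[user] for user in early_adopters)
    total = sum(counts.values())
    # Shannon entropy in nats: 0 when all adopters share one community,
    # larger when adopters are spread evenly over many communities.
    entropy = sum((c / total) * log(total / c) for c in counts.values())
    return len(counts), entropy

# Hypothetical toy network with three communities.
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 2}

concentrated = community_concentration(["a", "b", "c"], community_of)
spread_out = community_concentration(["a", "d", "e"], community_of)
```

Under the claim quoted above, the second meme, whose early adopters span three communities, would be the stronger candidate for going viral.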

Rescooped by Hugo Marcelo Muriel Arriaran from e-Xploration

Twitter Data #Analytics. Data Mining and Machine Learning Lab | #Crawling

Via luiy
luiy's curator insight, August 31, 2013 2:11 PM

Social media has become a major platform for information sharing. Due to its openness in sharing data, Twitter is a prime example of social media in which researchers can verify their hypotheses, and practitioners can mine interesting patterns and build real-world applications. This book takes a reader through the process of harnessing Twitter data to find answers to intriguing questions. We begin with an introduction to the process of collecting data through Twitter's APIs and proceed to discuss strategies for curating large datasets. We then guide the reader through the process of visualizing Twitter data with real-world examples, present challenges and complexities of building visual analytic tools, and provide strategies to address these issues. We show by example how some powerful measures can be computed using various Twitter data sources. This book is designed to provide researchers, practitioners, project managers, and graduate students new to the field with an entry point to jump-start their endeavors. It also serves as a convenient reference for readers seasoned in Twitter data analysis.

Terry Woodward's curator insight, September 1, 2013 10:52 AM

Paper with code samples

Rescooped by Hugo Marcelo Muriel Arriaran from e-Xploration

Big data journalism - NYTimes analysing every tweet | #virality #dataviz

The New York Times is documenting every tweet, retweet and click on every shortened URL for Twitter and Facebook that points back to its site, according to Sinan Aral writing for the HBR Blog Network.


Via Irina Radchenko, luiy
luiy's curator insight, August 30, 2013 12:24 PM

The point of this big data project, according to Aral, is to “understand and predict when an online cascade or conversation will result in a tidal wave of content consumption on the Times, and also when it won’t,” and then to turn this knowledge into actionable intelligence to drive sales and product development.


He describes the scale of the data project as Herculean and outlines the importance of visualisation in helping to interpret the data.


Aral gives three examples in the article: in the first, the tweets and article views seem to operate independently of each other; in the second, the Twitter conversation is intense but translates into very little traffic for the Times; and in the third (see images below), “an intense Twitter conversation moves in lockstep with engagement.”
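
One simple way to put a number on "lockstep" is the Pearson correlation between the tweet and page-view time series. The sketch below is an assumption made for illustration, not the method used by the Times or described by Aral, and the hourly counts are invented toy data.

```python
import numpy as np

def lockstep_score(tweets, views):
    """Pearson correlation between two equally sampled time series.

    Values near 1 mean the series rise and fall together; values near 0
    mean they move independently.
    """
    tweets = np.asarray(tweets, dtype=float)
    views = np.asarray(views, dtype=float)
    return float(np.corrcoef(tweets, views)[0, 1])

# Invented hourly counts, for illustration only.
tweets = [1, 3, 8, 20, 12, 5]
views_lockstep = [10, 30, 80, 200, 120, 50]   # tracks the conversation
views_independent = [50, 52, 49, 51, 50, 51]  # ignores the conversation

in_step = lockstep_score(tweets, views_lockstep)
unrelated = lockstep_score(tweets, views_independent)
```

Correlation alone says nothing about volume, so it would not separate the second example (intense conversation, little traffic) from the others; comparing the absolute number of views per tweet would be needed for that.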

Fàtima Galan's curator insight, September 17, 2013 11:20 AM

"The point of this big data project, according to Aral, is to 'understand and predict when an online cascade or conversation will result in a tidal wave of content consumption on the Times, and also when it won’t,' and then to turn this knowledge into actionable intelligence to drive sales and product development."

 

"When people talk about data journalism or about what Amazon boss Jeff Bezos may have in mind for the Washington Post, this HBR Blog is a pretty good jumping off point."

Rescooped by Hugo Marcelo Muriel Arriaran from e-Xploration

#Visualization: The Simple Way to Simplify #BigData | #dataviz


Via luiy
luiy's curator insight, September 1, 2013 5:52 PM

Data visualization has an amazing ability to make the complex simple, and the latest tools can do much more than give everyone the same view of data. It’s only through visualization that we can take something as abstract as symbols and turn it into a physical image that has dimensions that our eyes can quickly see and our brains understand. We can grasp data’s meaning more quickly. Visualization tells us when trends are heading in the wrong direction and we need to intervene.

 

But even more, when we visualize the scope and scale of today’s enormous data sets, we pick up things with the naked eye that would otherwise be hidden. We can see data’s previously untold story. We can see where Pennsylvania’s Voter ID law is disproportionately affecting minorities and students. Visualization of Twitter data can show us a revolution as it unfolds in Cairo. What was being reported in the media as mass crowds forming in Tahrir Square comes to life when the network data is visualized as it grows.

 

The two images below, for example, show early Twitter traffic around the protests and then traffic as word gets out that Mubarak is stepping down.