e-Xploration

 Rescooped by luiy from Politique des algorithmes onto e-Xploration

# SLAVES TO THE ALGORITHM

luiy's insight:

Worse though, far worse, would be if someone in Hollywood filched his computer. It is here that the iconoclasm happens. When Meaney is given a job by a studio, the first thing he does is quantify thousands of factors, drawn from the script. Are there clear bad guys? How much empathy is there with the protagonist? Is there a sidekick? The complex interplay of these factors is then compared by the computer to their interplay in previous films, with known box-office takings. The last calculation is what it expects the film to make. In 83% of cases, this guess turns out to be within \$10m of the total. Meaney, to all intents and purposes, has an algorithm that judges the value—or at least the earning power—of art.

To explain how, he shows me a two-dimensional representation: a grid in which each column is an input, each row a film. "Curiously," Meaney says, "if we block this column…" With one hand, he obliterates the input labelled "star", casually rendering everyone from Clooney to Cruise, Damon to De Niro, an irrelevancy. "In almost every case, it makes no difference to the money column."

"For me that’s interesting. The first time I saw that I said to the mathematician, ‘You’ve got to change your program—this is wrong.’ He said, ‘I couldn’t care less—it’s the numbers.’" There are four exceptions to his rules. If you hire Will Smith, Brad Pitt or Johnny Depp, you seem to make a return. The fourth? As far as Epagogix can tell, there is an actress, one of the biggest names in the business, who is actually a negative influence on a film. "It’s very sad for her," he says. But hers is a name he cannot reveal.


# e-Xploration

antropologo.net, dataviz, collective intelligence, algorithms, social learning, social change, digital humanities
Curated by luiy


 Scooped by luiy

## Oycib ::: Collective Intelligence. "Kaan". Network Visualisation.

luiy's insight:

To begin with the origins: in the Mayan language, Oycib means "the place of honey". In this project, Oycib is an e-Research infrastructure for Collective Intelligence analysis.

With the Oycib infrastructure we propose an analysis model, based on digital practices and collaboration profiles, for the development of Social Learning and Context Awareness in the Collective Intelligence process.

The infrastructure design and the profiles proposed here are based on historical studies of social organization glyphs in Mayan culture by Montgomery (2002) and Calvin (2012).

Initially we worked with four collaboration profiles: the "Itzaat", the "Pitziil", the "Ayuxul" and the "Sajal", but others can be found depending on the organizational context. Each profile is derived from the e-Xploración model and is a qualitative and quantitative interpretation of collaborative practices. In this way, we propose methods based on Social Network Analysis for learning and knowledge management.

The network in Oycib is called "Kaan" (sky or network in the Mayan language). In "Kaan" we present a visualization of the subjects and objects, such as persons, forums, blogs, files and groups, and all the interactions among them. Each profile and its interactions are also presented.

... you can interact with "Kaan" here.

http://viz.oycib.org/net_all_3/network/index.html

 Scooped by luiy

## #Apocalypse when? Infographic guide to Doomsday threats | #nano #bioterrorism

Which apocalyptic threats are most likely to wipe out Earth’s population and when? Our infographic reveals all.
 Scooped by luiy

## #memex : Human Traffickers Caught on Hidden Internet | #deepWeb

A new set of search tools called Memex, developed by DARPA, peers into the “deep Web” to reveal illegal activity
luiy's insight:

DARPA has said very little about Memex and its use by law enforcement and prosecutors to investigate suspected criminals.

According to published reports, including one from Carnegie Mellon University, the NYDA’s Office is one of several law enforcement agencies that have used early versions of Memex software over the past year to find and prosecute human traffickers, who coerce or abduct people—typically women and children—for the purposes of exploitation, sexual or otherwise. “Memex”—a combination of the words “memory” and “index” first coined in a 1945 article for The Atlantic—currently includes eight open-source, browser-based search, analysis and data-visualization programs as well as back-end server software that perform complex computations and data analysis.

Such capabilities could become a crucial component of fighting human trafficking, a crime with low conviction rates, primarily because of strategies that traffickers use to disguise their victims’ identities (pdf). The United Nations Office on Drugs and Crime estimates there are about 2.5 million human trafficking victims worldwide at any given time, yet putting the criminals who press them into service behind bars is difficult. In its 2014 study on human trafficking (pdf) the U.N. agency found that 40 percent of countries surveyed reported fewer than 10 convictions per year between 2010 and 2012. About 15 percent of the 128 countries covered in the report did not record any convictions.

http://www.scientificamerican.com/slideshow/scientific-american-exclusive-darpa-memex-data-maps/

http://www.cmu.edu/news/stories/archives/2015/january/detecting-sex-traffickers.html

http://www.darpa.mil/NewsEvents/Releases/2014/02/09.aspx

 Scooped by luiy

## #DataMining Reveals a Global Link Between #Corruption and #Wealth | #dataviz

Social scientists have never understood why some countries are more corrupt than others. But the first study that links corruption with wealth could help change that.

One question that social scientists and economists have long puzzled over is how corruption arises in different cultures and why it is more prevalent in some countries than others. But it has always been difficult to find correlations between corruption and other measures of economic or social activity.

Michal Paulus and Ladislav Kristoufek at Charles University in Prague, Czech Republic, have for the first time found a correlation between the perception of corruption in different countries and their economic development.

The data they use comes from Transparency International, a nonprofit campaigning organisation based in Berlin, Germany, which defines corruption as the misuse of public power for private benefit. Each year, this organisation publishes a global list of countries ranked according to their perceived levels of corruption. The list is compiled using at least three sources of information but does not directly measure corruption, because of the difficulties in gathering such data.

Instead, it gathers information from a wide range of sources such as the African Development Bank and the Economist Intelligence Unit. But it also places significant weight on the opinions of experts who are asked to assess corruption levels.

The result is the Corruption Perceptions Index, which ranks countries from 0 (highly corrupt) to 100 (very clean). In 2014, Denmark occupied the top spot as the world’s least corrupt nation, while Somalia and North Korea propped up the table in an unenviable tie for the most corrupt countries on the planet.

 Scooped by luiy

## artoo.js · The client-side #scraping companion | #ddj

luiy's insight:

Features

- Scrape everything, everywhere: invoke artoo in the JavaScript context of any web page.

- Loaded with helpers: Scrape data quick & easy with powerful methods such as artoo.scrape.

- Spiders: Crawl pages through ajax and retrieve accumulated data with artoo's spiders.

- Content expansion: Expand pages' content programmatically thanks to artoo.autoExpand utilities.

- Store: stash persistent data in the localStorage with artoo's handy abstraction.

- Sniffers: hook on XHR requests to retrieve circulating data with a variety of tools.

- Instructions: record the instructions typed into the console and save them for later use.

- jQuery: jQuery is injected alongside artoo in the pages you visit so you can handle the DOM easily.

- Custom bookmarklets: you can use artoo as a framework and easily create custom bookmarklets to execute your code.

- User Interfaces: build parasitic user interfaces easily with a creative usage of Shadow DOM.

- Chrome extension: trying to scrape a nasty page abiding by some sneaky HTML5 rules? Here, have a chrome extension.

 Scooped by luiy

## Abridged List of #MachineLearning Topics. #Resources #tools #datascience

luiy's insight:

- Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.

- Online machine learning is a model of induction that learns one instance at a time, thus reducing the amount of memory required.

- Natural Language Toolkit (NLTK) - a leading tool for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

- Computer Vision. OpenCV – a popular computer vision library designed for computational efficiency, with a strong focus on real-time applications.
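
To make the "one instance at a time" idea of online machine learning from the list above concrete, here is a minimal sketch of an online perceptron in plain Python. The class name `OnlinePerceptron`, its method names and the toy data stream are assumptions made for illustration; they are not taken from the linked list of topics or from any particular library.

```python
class OnlinePerceptron:
    """Hypothetical minimal online learner: it updates from one example at a time."""

    def __init__(self, n_features, learning_rate=1.0):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        # Labels are +1 / -1; a score of exactly zero is treated as -1.
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if score > 0 else -1

    def learn_one(self, x, y):
        # Only the current example is needed, so no dataset is kept in memory.
        if self.predict(x) != y:
            self.weights = [w + self.learning_rate * y * xi
                            for w, xi in zip(self.weights, x)]
            self.bias += self.learning_rate * y


# Toy stream (illustrative values): a logical OR with labels +1 / -1.
stream = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

model = OnlinePerceptron(n_features=2)
for _ in range(10):              # replay the stream a few times
    for x, y in stream:
        model.learn_one(x, y)
```

Because each update touches only the weights and the current example, memory use stays constant however long the stream runs, which is the property the definition above points to.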

 Scooped by luiy

## Overview of #Python Visualization Tools | #dataviz #datascience

Overview of common python visualization tools
luiy's insight:

Introduction

In the python world, there are multiple options for visualizing your data. Because of this variety, it can be really challenging to figure out which one to use when. This article contains a sample of some of the more popular ones and illustrates how to use them to create a simple bar chart. I will create examples of plotting data with:

- Pandas

- Seaborn

- ggplot

- Bokeh

- pygal

- Plotly

 Rescooped by luiy from ESN - RSE & SocBiz

## Success factors for an enterprise social network | #CI #analytics #RSE

The consulting firm Lecko compared the performance of the internal networks of some twenty companies to determine what leads them to success... or not.

Via Eric Laurent
luiy's insight:

Many companies have enterprise social networks (ESN; "RSE" in French), but not all of them achieve the same success with these projects. To try to identify the success factors of these collaborative spaces, the consulting firm Lecko carried out a benchmark for the second consecutive year. Some twenty large companies were compared using the Lecko RSE Analytics tool, which returns metrics on the social activity recorded on the platforms (creating a profile, adding a comment or a "like"). To round things out, more than 90 community managers were surveyed to compare their practices. The importance of community managers is confirmed once again: 71% of high-performing spaces were started on the initiative of a community manager (see volume 7 of Lecko's study on the state of the art of enterprise social networks).

 Scooped by luiy

## OpenGraphiti : Data Visualization Framework | #SNA #open #dataviz

luiy's insight:
Description

OpenGraphiti is a free and open source 3D data visualization engine for data scientists to visualize semantic networks and to work with them. It offers an easy-to-use API with several associated libraries to create custom-made datasets. It leverages the power of GPUs to process and explore the data and sits on a homemade 3D engine.

 Scooped by luiy

## The Emerging Science of Human-Data Interaction | #bigdata #HDI

The rapidly evolving ecosystems associated with personal data are creating an entirely new field of scientific study, say computer scientists. And this requires a much more powerful ethics-based infrastructure.
luiy's insight:

... Richard Mortier at the University of Nottingham in the UK and a few pals say the increasingly complex, invasive and opaque use of data should be a call to arms to change the way we study data, interact with it and control its use. Today, they publish a manifesto describing how a new science of human-data interaction is emerging from this “data ecosystem” and say that it combines disciplines such as computer science, statistics, sociology, psychology and behavioural economics.

They start by pointing out that the long-standing discipline of human-computer interaction research has always focused on computers as devices to be interacted with. But our interaction with the cyber world has become more sophisticated as computing power has become ubiquitous, a phenomenon driven by the Internet but also through mobile devices such as smartphones. Consequently, humans are constantly producing and revealing data in all kinds of different ways.

Mortier and co say there is an important distinction between data that is consciously created and released such as a Facebook profile; observed data such as online shopping behaviour; and inferred data that is created by other organisations about us, such as preferences based on friends’ preferences.

Original Article : http://arxiv.org/abs/1412.6159

 Scooped by luiy

## What Is “ #OpenAccess ”? | #OpenScience

Imagine the progress that can happen—in health, science, education—when scholarly research is made freely available among scientists, patients, inventors, and others.
luiy's insight:

Before the open access model existed, almost all peer-reviewed articles based on scholarly research were published in corporate-owned print journals, whose subscription fees were often prohibitively expensive—despite the fact that authors are not paid for their articles. Publishers rarely invest in the actual research and typically provide little added value in the articles’ preparation and distribution.

These journals were available to the general public only at university libraries in wealthy countries. This meant that doctors treating patients with HIV and AIDS in remote regions of Africa, for instance, could not access complete articles describing the results of the latest medical research on treatments, even when the research upon which these articles were based was undertaken in their remote regions.

 Scooped by luiy

## Ethnography for the Internet | #Anthropology #CyberEthnography

The internet has become embedded into our daily lives, no longer an esoteric phenomenon, but instead an unremarkable way of carrying out our interactions with one another. Online and offline are interwoven in everyday experience. Using the internet has become accepted as a way of being present in the world, rather than a means of accessing some discrete virtual domain. Ethnographers of these contemporary Internet-infused societies consequently find themselves facing serious methodological dilemmas: where should they go, what should they do there and how can they acquire robust knowledge about what people do in, through and with the internet?

This book presents an overview of the challenges faced by ethnographers who wish to understand activities that involve the internet. Suitable for both new and experienced ethnographers, it explores both methodological principles and practical strategies for coming to terms with the definition of field sites, the connections between online and offline and the changing nature of embodied experience. Examples are drawn from a wide range of settings, including ethnographies of scientific institutions, television, social media and locally based gift-giving networks. - See more at: http://www.bloomsbury.com/uk/ethnography-for-the-internet-9780857855701/#sthash.q1UHC7O1.dpuf

luiy's insight:

1 Introduction
2 The E3 Internet: The Embedded, Embodied, Everyday Internet
3 Ethnographic Strategies for the Embedded, Embodied, Everyday Internet
4 Observing and Experiencing Online/Offline Connections
5 Connective Ethnography in Complex Institutional Landscapes
6 The Internet in Ethnographies of the Everyday
7 Conclusion
References

 Scooped by luiy

## #Fractals in D3: Dragon Curves | #dataviz

luiy's insight:

In this post we're looking at examples of generating some really cool fractals called dragon curves (also referred to as Heighway dragons).  This post is a continuation of the previous one on fractal ferns.  Take a look at that post if you want some basic info on fractals and some links I found useful.  Fractals are a world unto themselves, so there are plenty of interesting things to be investigated in this area.  We are just scratching the surface with these two posts.
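
As a rough illustration of how a Heighway dragon can be generated, the sketch below builds the curve's turn sequence by repeated "unfolding" and then walks it on a grid. It is written in Python rather than D3, and the function names `dragon_turns` and `dragon_points` are assumptions for this example, not code from the original post.

```python
def dragon_turns(order):
    """Return the left/right turn sequence of a dragon curve of the given order."""
    turns = []
    for _ in range(order):
        # Unfold: keep the old turns, add a left turn, then append the old
        # turns reversed and with left and right swapped.
        mirrored = ["R" if t == "L" else "L" for t in reversed(turns)]
        turns = turns + ["L"] + mirrored
    return turns


def dragon_points(order):
    """Walk the turn sequence on a unit grid and return the visited points."""
    x, y = 0, 0
    dx, dy = 1, 0                      # start heading along the positive x axis
    points = [(x, y)]
    for turn in dragon_turns(order) + [None]:   # None marks the final segment
        x, y = x + dx, y + dy
        points.append((x, y))
        if turn == "L":
            dx, dy = -dy, dx           # rotate 90 degrees counter-clockwise
        elif turn == "R":
            dx, dy = dy, -dx           # rotate 90 degrees clockwise
    return points


points = dragon_points(10)             # 2**10 segments, ready to plot as a polyline
```

The resulting list of points can be handed to any plotting tool as a single polyline; a curve of order n has 2**n unit segments.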

 Rescooped by luiy from Politique des algorithmes

## Connecting the Dots Behind the 2016 Candidates | #ddj #politics

How the teams behind some likely and announced 2016 candidates are connected to previous campaigns, administrations and organizations.

Via Dominique Cardon
 Scooped by luiy

## District Data Labs - How to Transition from Excel to #R | #datascience

How to Transition from Excel to R - An Intro to R for Microsoft Excel Users
luiy's insight:

In today's increasingly data-driven world, business people are constantly talking about how they want more powerful and flexible analytical tools, but are usually intimidated by the programming knowledge these tools require and the learning curve they must overcome just to be able to reproduce what they already know how to do in the programs they've become accustomed to using. For most business people, the go-to tool for doing anything analytical is Microsoft Excel.

 Scooped by luiy

## Introducing the #streamgraph htmlwidget #R Package | #datascience

We were looking for a different type of visualization for a project at work this past week and my thoughts immediately gravitated towards streamgraphs. The TLDR on streamgraphs is that they are generalized versions of stacked area graphs with free baselines across the x axis. They are somewhat controversial but have a “draw you in” […]
luiy's insight:

Streamgraphs require a continuous variable for the x axis, and the streamgraph widget/package works with years or dates (support for xts objects and POSIXct types coming soon). Since they display categorical values in the area regions, the data in R needs to be in long format, which is easy to do with dplyr & tidyr.

The package recognizes when years are being used and does all the necessary conversions for you. It also uses a technique similar to expand.grid to ensure all categories are represented at every observation (not doing so makes d3.stack unhappy).
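
The two data-shaping steps mentioned above (reshaping to long format, then making sure every category appears at every x value) can be sketched as follows. This is an illustration in plain Python rather than R, it is not the package's actual implementation, and the helper names `to_long` and `complete_grid` as well as the sample values are assumptions.

```python
from itertools import product


def to_long(wide_rows, id_key):
    """Turn one-row-per-x 'wide' records into one-row-per-(x, category) records."""
    long_rows = []
    for row in wide_rows:
        for key, value in row.items():
            if key != id_key:
                long_rows.append({id_key: row[id_key], "category": key, "value": value})
    return long_rows


def complete_grid(long_rows, id_key, fill=0):
    """Ensure every (x, category) pair exists, filling gaps with a default value."""
    ids = sorted({r[id_key] for r in long_rows})
    categories = sorted({r["category"] for r in long_rows})
    seen = {(r[id_key], r["category"]): r["value"] for r in long_rows}
    return [
        {id_key: i, "category": c, "value": seen.get((i, c), fill)}
        for i, c in product(ids, categories)   # same idea as R's expand.grid
    ]


# Illustrative values only: category "b" is missing for 2001.
wide = [{"year": 2000, "a": 1, "b": 2}, {"year": 2001, "a": 3}]
long_rows = to_long(wide, "year")
full_rows = complete_grid(long_rows, "year")
```

After `complete_grid`, each year carries one row per category, so a stacking layout receives layers of equal length.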

 Scooped by luiy

## How MIT Visualizes Supply Chain Risk | #SNA #predictive

MIT Supply Chain Management Director Bruce Arntzen discusses risk visualization and Sourcemap [Video]

How does a company keep tabs on thousands of suppliers? That’s the question Bruce Arntzen tried to answer when he started the Hi-Viz Research Project. As Executive Director of MIT’s Supply Chain Management Program, Arntzen works with corporations to find innovative solutions to supply chain problems. The idea for the Hi-Viz project came during a 2011 meeting of the Supply Chain Risk Leadership Council. A survey of attendees listed Supply Chain Visibility as the top concern. Why? With thousands of suppliers and sub-suppliers, it can be very time-consuming to find the weakest link in a supply chain. Arntzen’s solution: an automatic visualization of the end-to-end supply chain where the weakest links could be seen in real time. Watch his interview to learn how MIT and Sourcemap developed the first automated risk visualization [more details below the fold].

In 2015, the Hi-Viz project is partnering with actuarial data providers to provide predictive risk analytics. Sourcemap is making available inventory risk mapping as part of its enterprise software-as-a-service. Want to get involved? Learn more about the Hi-Viz project, or contact Sourcemap for a demo.

 Scooped by luiy

## The SHOGUN #MachineLearning #Toolbox | #datascience

The Shogun Machine Learning Toolbox provides a wide range of unified and efficient Machine Learning (ML) methods. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general-purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. We combine modern software architecture in C++ with both efficient low-level computing backends and cutting-edge algorithm implementations to solve large-scale Machine Learning problems (yet) on single machines.

One of Shogun's most exciting features is that you can use the toolbox through a unified interface from C++, Python, Octave, R, Java, Lua, C#, etc. This not only means that we are independent of trends in computing languages, but also lets you use Shogun as a vehicle to expose your algorithm to multiple communities. We use SWIG to enable bidirectional communication between C++ and target languages. Shogun runs under Linux/Unix, MacOS, Windows.

 Scooped by luiy

Reporters love Twitter and geeks love coding. Today, I’m merging the best of both worlds! On the menu: Python scripts to use Twitter to its full potential!

luiy's insight:

When my friend @TerraCiolfe showed me the @WeAreTheDeads project, I said to myself that I really needed to learn how to control Twitter through Python. @WeAreTheDeads is a Twitter account publishing the name of a fallen soldier at the 11th minute of each hour.

Of course, nobody is working behind the screen. A program chooses the soldier in a database and publishes his name, hour after hour. With 119,000 names to publish, the script will run until 2023, according to the author of this great idea, the reporter @GlenMcGregor from the Ottawa Citizen.

With a little bit of research (my sources are at the end of the article), I learnt how to work with Twitter from a Python script. Actually, we can do way more than automatically publish tweets! It’s also possible to extract a lot of data about users and their tweets. For example, you can research specific tweets in a specific location. I created a nice animated map at the end. You’ll see!
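
The posting schedule described above (one name at the 11th minute of every hour) can be sketched with the standard library alone. The snippet below only computes when the next post is due and roughly how long a backlog would last; it does not call Twitter, and the function names `next_post_time` and `years_to_publish` are assumptions for illustration, not the original script.

```python
from datetime import datetime, timedelta


def next_post_time(now, minute=11):
    """Return the next moment strictly after `now` that falls on the given minute."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(hours=1)   # this hour's slot has already passed
    return candidate


def years_to_publish(total_names, posts_per_day=24):
    """Rough duration, in years, of publishing one name per post."""
    return total_names / posts_per_day / 365.25


upcoming = next_post_time(datetime(2015, 1, 1, 10, 5))
duration = years_to_publish(119_000)      # figure quoted in the article
```

At one name per hour, 119,000 names take a little over thirteen years, which is why such a script has to run unattended for so long.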

 Scooped by luiy

## TULIP : Data Visualization Software | #SNA #dataviz

Tulip is a software system dedicated to the visualization of huge graphs. It enables 3D visualizations, 3D modifications, plugin support, support for clusters and navigation, and automatic graph drawing.
luiy's insight:

Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems he or she is addressing.

Written in C++, the framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of Tulip's goals is to facilitate the reuse of components and allow developers to focus on programming their application. This development pipeline makes the framework efficient for research prototyping as well as the development of end-user applications.

 Rescooped by luiy from Social Network Analysis #sna

## Davos on Twitter: who do the attendees follow? | #dataviz #SNA

Via ukituki
luiy's insight:

Every year, the World Economic Forum brings together the most recognisable figures of business and politics. With all eyes on Davos, we decided to turn the optics upside down and see who the twitterati gathered in Switzerland follow on social media.

The inner ring of circles represent the 20 most-followed accounts by Davos attendees, while the outer circles are individual attendees.
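
The ranking behind the inner ring (the accounts followed by the most attendees) amounts to a simple tally. Here is a minimal sketch, assuming a hypothetical mapping from each attendee to the accounts they follow; the function name `most_followed` and the sample data are illustrative and do not come from the visualization's source.

```python
from collections import Counter


def most_followed(follows_by_attendee, top_n=20):
    """Count how many attendees follow each account and return the top entries."""
    counts = Counter()
    for accounts in follows_by_attendee.values():
        counts.update(set(accounts))      # count each attendee at most once per account
    return counts.most_common(top_n)


# Placeholder data: attendee handle -> accounts that attendee follows.
follows = {
    "attendee_a": ["account_x", "account_y"],
    "attendee_b": ["account_x"],
    "attendee_c": ["account_x", "account_y", "account_z"],
}
ranking = most_followed(follows, top_n=2)
```

Each attendee-to-account pair in the input also corresponds to one link between an outer circle and an inner one in a network drawing of this kind.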

ukituki's curator insight,

Network Visualization by Financial Times

 Scooped by luiy

## After Ayotzinapa's Disappeared, Locals Are Taking Power In Tecoanapa | #democracy #socialchange

luiy's insight:

On Sept. 26, 2014, municipal police attacked a group of students from Ayotzinapa school in Mexico’s Guerrero state. Of the 43 disappeared students, eight came from Tecoanapa. Now their fellow citizens have shut down the local government buildings and set up a people’s council. It’s a movement that is gathering momentum across Guerrero.

 Scooped by luiy

## #BigData, new epistemologies and paradigm shifts | #socialscience #DH

luiy's insight:

Whilst Jim Gray envisages the fourth paradigm of science to be data-intensive and a radically new extension of the established scientific method, others suggest that Big Data ushers in a new era of empiricism, wherein the volume of data, accompanied by techniques that can reveal their inherent truth, enables data to speak for themselves free of theory. The empiricist view has gained credence outside of the academy, especially within business circles, but its ideas have also taken root in the new field of data science and other sciences. In contrast, a new mode of data-driven science is emerging within traditional disciplines in the academy. In this section, the epistemological claims of both approaches are critically examined, mindful of the different drivers and aspirations of business and the academy, with the former preoccupied with employing data analytics to identify new products, markets and opportunities rather than advance knowledge per se, and the latter focused on how best to make sense of the world and to determine explanations as to phenomena and processes.

http://bds.sagepub.com/content/1/1/2053951714528481.full

 Scooped by luiy

## How Diversity Makes Us Smarter | #IntelligenceCollective

luiy's insight:

Information and Innovation

The key to understanding the positive influence of diversity is the concept of informational diversity. When people are brought together to solve problems in groups, they bring different information, opinions and perspectives. This makes obvious sense when we talk about diversity of disciplinary backgrounds—think again of the interdisciplinary team building a car. The same logic applies to social diversity. People who are different from one another in race, gender and other dimensions bring unique information and experiences to bear on the task at hand. A male and a female engineer might have perspectives as different from one another as an engineer and a physicist—and that is a good thing.

 Scooped by luiy

## Your Digital Image: Factors Behind #Demographic and #Psychometric #Predictions from Social Network Profiles | #identity

luiy's insight:

Our system allows users to examine the factors influencing the predictions, so users can determine how “Liking” a certain item changes the predictions regarding their intelligence, or how changing the number of friends they have affects the predictions regarding their personality. Clearly, these factors are under the control of the user, and users may modify their behavior on Facebook to be perceived in a positive manner. As people can form judgments on others based on their social media profiles [4], this phenomenon is not new. However, we believe an automated tool can allow people to easily determine how others may perceive them based on their behavior on social networks.

 Rescooped by luiy from Parlons Data !

## The Graphic Continuum | #dataviz

[This is a guest post by Jon Schwabish and Severino Ribecca, about the informational poster The Graphic Continuum]

How many different graph types exist? How do they relate to one another?

Via AymericBds
luiy's insight:

How many different graph types exist? How do they relate to one another? Can you use the same graphic type for different types of data? These are the questions that we tried to tackle in our recent project, The Graphic Continuum.

Documenting the many chart types is something we were both working on independently for the past couple of years. Severino was building his Data Visualisation Catalogue, an online reference tool of data visualizations. At the same time, I was teaching data visualization to different audiences and was thinking about how to best show my students different graphic types and how they relate to one another.
