e-Xploration

 Scooped by luiy onto e-Xploration

An introduction to #machinelearning with scikit-learn - #datamining #algorithms

Machine learning: the problem setting

In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features.

luiy's insight:

We can separate learning problems into a few large categories:

- Supervised learning, in which the data comes with additional attributes that we want to predict (see the scikit-learn supervised learning page). This problem can be either:

  - Classification: samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data. An example of a classification problem is handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning, where there is a limited number of categories and, for each of the n samples provided, one tries to label it with the correct category or class.

  - Regression: if the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem is predicting the length of a salmon as a function of its age and weight.

- Unsupervised learning, in which the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization.
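
The three kinds of task above can be sketched in a few lines. This is only a minimal illustration in plain Python (standard library only), not scikit-learn code; scikit-learn provides ready-made estimators for each of these tasks. The function names and the toy data are assumptions made up for this example.

```python
# Illustrative sketch only: helper names and toy data are hypothetical,
# and this is plain Python rather than the scikit-learn API.
from math import dist


def nearest_neighbor_classify(train_X, train_y, x):
    """Supervised classification: return the label of the closest labeled sample."""
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[best]


def fit_line(xs, ys):
    """Supervised regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_means(points, iterations=10):
    """Unsupervised clustering: k-means with k=2 on one-dimensional points."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to the nearer of the two centers.
            nearer = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearer].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Toy labeled data for classification: two features per sample, one label each.
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_y = ["a", "a", "b", "b"]
```

For instance, `nearest_neighbor_classify(train_X, train_y, (0.9, 1.1))` picks the label of the closest training sample, `fit_line([1, 2, 3, 4], [2, 4, 6, 8])` recovers a continuous relationship, and `two_means([1.0, 1.5, 2.0, 9.0, 9.5, 10.0])` groups the points without using any labels.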


e-Xploration

antropologiaNet, dataviz, collective intelligence, algorithms, social learning, social change, digital humanities
Curated by luiy


 Scooped by luiy

Oycib ::: Collective Intelligence. "Kaan". Network Visualisation.

luiy's insight:

To begin with its origins: Oycib means "the place of honey" in the Mayan language. In this project, Oycib is an e-Research infrastructure for Collective Intelligence analysis.

With the Oycib infrastructure we propose an analysis model, based on digital practices and collaboration profiles, for the development of Social Learning and Context Awareness in the Collective Intelligence process.

The infrastructure design and the profiles proposed here are based on historical studies of social organization glyphs in Mayan culture by Montgomery (2002) and Calvin (2012).

Initially we worked with four collaboration profiles: the "Itzaat", the "Pitziil", the "Ayuxul" and the "Sajal", but others can be found depending on the organizational context. Each profile is derived from the e-Xploración model and is a qualitative and quantitative interpretation of collaborative practices. In this way, we propose methods based on Social Network Analysis for learning and knowledge management.

The network in Oycib is called "Kaan" (sky or network in the Mayan language). In "Kaan" we present a visualization of the subjects and objects, such as persons, forums, blogs, files and groups, and all the interactions among them. Additionally, each profile and its interactions are presented.

You can interact with "Kaan" here:

http://viz.oycib.org/net_all_3/network/index.html

 Scooped by luiy

Connected: The Power of Six Degrees - YouTube I #networks #SNA

CONNECTED: THE POWER OF SIX DEGREES (alternate title: How Kevin Bacon Cured Cancer) is a 2008 documentary film by Annamaria Talas. It was first aired in 2009 on the Science Channel. The documentary introduces the audience to the main ideas of network science through the exploration of the concept of six degrees of separation. Stars: Nyaloka Auma, Kevin Bacon, Albert-László Barabás
 Scooped by luiy

SkyLens: Visual Analysis of Skyline on Multi-dimensional Data

 Scooped by luiy

Creating Data Visualizations | NodeBox #dataviz #tools

 Scooped by luiy

dc.js - Dimensional Charting Javascript Library

dc.js is a javascript charting library with native crossfilter support, allowing highly efficient exploration on large multi-dimensional datasets (inspired by crossfilter's demo). It leverages d3 to render charts in CSS-friendly SVG format. Charts rendered using dc.js are data driven and reactive and therefore provide instant feedback to user interaction.

dc.js is an easy yet powerful javascript library for data visualization and analysis in the browser and on mobile devices.
 Scooped by luiy

A Complete Tutorial on Tree Based Modeling from Scratch (in R & Python)

This tutorial is meant to help beginners learn tree based modeling from scratch. After successfully completing this tutorial, one is expected to become proficient at using tree based algorithms and building predictive models.
 Rescooped by luiy from 1-2-3, A-B-C : Infographics, «cheatsheets», Manifestos, tips, tricks, how-to, diagrams and shortcuts

Cognitive Bias Infographic

Via Claude Emond
 Scooped by luiy

Top-Secret NSA Report Details Russian Hacking Effort Days Before 2016 Election

The spear-phishing email contained a link directing the employees to a malicious, faux-Google website that would request their login credentials and then hand them over to the hackers. The NSA identified seven “potential victims” at the company. While malicious emails targeting three of the potential victims were rejected by an email server, at least one of the employee accounts was likely compromised, the agency concluded. The NSA notes in its report that it is “unknown whether the aforementioned spear-phishing deployment successfully compromised all the intended victims, and what potential data from the victim could have been exfiltrated.”

VR Systems declined to respond to a request for comment on the specific hacking operation outlined in the NSA document. Chief Operating Officer Ben Martin replied by email to The Intercept’s request for comment with the following statement:

Phishing and spear-phishing are not uncommon in our industry. We regularly participate in cyber alliances with state officials and members of the law enforcement community in an effort to address these types of threats. We have policies and procedures in effect to protect our customers and our company.
 Scooped by luiy

Graph Commons – Map networks together

Graph Commons is a collaborative platform for making, analyzing and publishing network maps.
 Scooped by luiy

Ina Research: Logo recognition and retrieval... (VF) - Dailymotion video

Research area: "Visualization, Indexing & Data Mining"
 Scooped by luiy

Tommaso Venturini / we love #networks, but why? I #SNA

ERG 2012 SEMINAR: INFINITE CONVERSATION - ART AND SCIENCES (WHAT DIALOGUES?), March 5, 6 and 7, 2012 / 10 a.m. to 11 p.m., Halles de Schaerbeek,…
 Scooped by luiy

Fighting #Censorship with #ProtonMail Encrypted Email Over Tor - ProtonMail Blog - #privacy

In the past two years, ProtonMail has grown enormously, especially after the recent US election, and today we are the world’s largest encrypted email service with over 2 million users. We have come a long way since our user community initially crowdfunded the project.

ProtonMail today is much larger in scope than what was originally envisioned when our founding team met at CERN in 2013. As ProtonMail has evolved, the world has also been changing around us.

Civil liberties have been increasingly restricted in all corners of the globe. Even Western democracies such as the US have not been immune to this trend, which is most starkly illustrated by the forced enlistment of US tech companies into the US surveillance apparatus.

In fact, we have reached the point where it is simply not possible to run a privacy and security focused service in the US or in the UK. At the same time, the stakes are also higher than ever before. As ProtonMail has grown, we have become increasingly aware of our role as a tool for freedom of speech, and in particular for investigative journalism.

Last fall, we were invited to the 2nd Asian Investigative Journalism Conference and were able to get a firsthand look at the importance of tools like ProtonMail in the field.

 Scooped by luiy

Sciences du Design: #Algorithms - Editorial - Cairn.info I #cartography #dataviz

In this issue 04, the Visualisation section features three works. The first, by Sonia Pelloux, is titled "Cartographie des données : donner à voir l'open data SNCF" (Mapping data: making SNCF open data visible). It shows, in particular, the many possibilities and advantages of graph-based navigation within the data structure (visual exploration of various concepts and databases). The second work, titled "Cloud Map", is by Ianis Lallemand. Starting from an analysis of 2998 GPS coordinates of data centers, it locates the worldwide geographic distribution of the hardware infrastructure allocated to "Cloud Computing". The result is a "volumetric" cartographic representation that lets the viewer see as well as feel the weight, the imposing presence, of what has become a massive deployment of this type of facility. Finally, the third work was composed in 2014 by Yann Le Guennec †; it complements, or "resonates" with, the text published posthumously in the Supplément section. Titled "Les Paysages des erreurs" (Landscapes of Errors), it is an artistic creation with data (Data Art), which here becomes the raw material of a programmed pictorial/plastic "activation" that modifies and alters the "support" image and its meaning. These three projects can also be found on the journal's Visualisation mini-site at visu.sciences-du-design.org, which also hosts a video presenting the software application introduced here by Sonia Pelloux.
 Scooped by luiy

The Internet's Own Boy: The Story of Aaron Swartz (CC available) - YouTube I #History #Hackers

The film follows the story of programming prodigy and information activist Aaron Swartz. From Swartz's help in the development of the basic internet protocol RSS to his co-founding of Reddit, his fingerprints are all over the internet. But it was Swartz's groundbreaking work in social justice and political organizing combined with his aggressive approach to information access that ensnared him in a two-year legal nightmare. It was a battle that ended with the taking of his own life at the age of 26. Aaron's story touched a nerve with people far beyond the online communities in which he was a celebrity. This film is a personal story about what we lose when we are tone deaf about technology and its relationship to our civil liberties.
 Scooped by luiy

Barabási Albert-László - Books

Network Science

ALBERT-LÁSZLÓ BARABÁSI

Networks are everywhere, from the Internet, to social networks, and the genetic networks that determine our biological existence. Illustrated throughout in full colour, this pioneering textbook, spanning a wide range of topics from physics to computer science, engineering, economics and the social sciences, introduces network science to an interdisciplinary audience. From the origins of the six degrees of separation to explaining why networks are robust to random failures, the author explores how viruses like Ebola and H1N1 spread, and why it is that our friends have more friends than we do. Using numerous real-world examples, this innovatively designed text includes clear delineation between undergraduate and graduate level material. The mathematical formulas and derivations are included within Advanced Topics sections, enabling use at a range of levels. Extensive online resources, including films and software for network analysis, make this a multifaceted companion for anyone with an interest in network science.
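
The claim that "our friends have more friends than we do" (often called the friendship paradox) can be checked on a small example. The sketch below is only an illustration in plain Python; the toy star-shaped network and the function names are assumptions made for this example and are not taken from the book.

```python
# Illustrative sketch: the toy graph and function names are hypothetical.

def degrees(graph):
    """Map each node to its number of friends (its degree)."""
    return {node: len(friends) for node, friends in graph.items()}


def mean_degree(graph):
    """Average number of friends per person."""
    deg = degrees(graph)
    return sum(deg.values()) / len(deg)


def mean_friend_degree(graph):
    """Average, over all people, of the mean number of friends their friends have."""
    deg = degrees(graph)
    per_person = [
        sum(deg[f] for f in friends) / len(friends)
        for friends in graph.values()
        if friends  # skip isolated nodes, which have no friends to average over
    ]
    return sum(per_person) / len(per_person)


# Undirected star network: one hub (node 0) connected to four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

print(mean_degree(star))         # average number of friends per person
print(mean_friend_degree(star))  # average number of friends of one's friends
```

In this toy network most people are leaves whose only friend is the well-connected hub, so the second average comes out larger than the first. Highly connected people are over-represented in friend lists, which is the intuition behind the effect.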

 Scooped by luiy

Topogram - Visualize how networks change over time and space

"Topogram"

Topogram is a web-based app to visualize the evolution of networks over time and space.

Try it online at app.topogram.io or install your own.

Features:
- Time-based navigation in graph
- Network layouts + geographic data
- Online/real-time data update via API

 Scooped by luiy

Data trails: “The comparative machine” (1836)

This post is part of a blog series on historical data vis titled “Data Trails. Snapshots from the history of data visualisation” and originally appeared on the blog of my friends at Idalab, a Berlin-based specialist in data science and machine learning.

There is a very basic joy in roaming through atlases and in looking at maps. Atlases are rich collections of places, and if there is one thing they can do it is making you travel around the world, to places near and far. Cartography – an age-old discipline combining science and art – has created a unique way of looking at our planet, of flying over continuous landscapes and endless oceans. The 19th century was a golden age of atlases. Beautiful pieces for the educated household were printed in many different countries. Many of them feature one very special type of map: the comparative tableau showing the longest rivers and highest mountains of the world.
 Scooped by luiy

Interactive flow visualization in R

Exploring flows between origins and destinations visually is a common task, but can be difficult to get right. In R, there are many tutorials on the web that show how to produce static flow maps (see here, here, here, and here, among others).

Over the past couple years, R developers have created an infrastructure to bridge R with JavaScript using the htmlwidgets package, allowing for the generation of interactive web visualizations straight from R. I’d like to demonstrate here a few examples for exploratory interactive flow graphics that use this infrastructure.
 Scooped by luiy

Learning Analytics: what are the data of the problem? | LINC

Education and lifelong learning are among the new El Dorados of the digital world. Online learning methods are developing in the form of MOOCs and e-learning platforms, spread through social networks, and are available on computers, tablets, and smartphones.
 Rescooped by luiy from 1-2-3, A-B-C : Infographics, «cheatsheets», Manifestos, tips, tricks, how-to, diagrams and shortcuts

50 Research Methods for Innovation Infographic

Via Claude Emond
 Scooped by luiy

Interactive timeline of the PRISM scandal - Virostatiq

Constructing the visualization.

The visualization was constructed entirely in HTML5 and JavaScript. Four major libraries were used:

- Sigma.js for displaying the networks. The latest version lacks some key functionality for dynamically and additively loading and unloading subgraphs into the main graph, so the source code was updated with the required methods. A separate article on that topic is upcoming.

- Three.js for the rotating Earth and all geographically related work.

- Simile Timeline for the timeline.

- Flot for the bar graph.

 Scooped by luiy

56Kast #94: Archiving digital reactions to the attacks

► Subscribe: http://bit.ly/abo56kast Researcher Valérie Schafer coordinates the Asap project, which aims to analyze internet users' reactions to the
 Scooped by luiy

Ina Research - Video exploration (VF) - Dailymotion video

Research area: "Visualization, Indexing & Data Mining"
 Scooped by luiy

#Visualizing the network of Donald #Trump - @Linkurious - Understand the connections in your data I #publicpolicy #SNA

On January 15, BuzzFeed released a large dataset of Donald Trump’s connections, including people, organizations and the nature of their relationships.

In their article “Help Us Map TrumpWorld”, the four authors of the investigation, John Templon, Anthony Cormier, Alex Campbell and Jeremy Singer-Vine, asked the public to help them understand and analyse the data.

“Now we are asking the public to use our data to find connections we may have missed, and to give us context we don’t currently understand. We hope you will help us — and the public — learn more about TrumpWorld and how this unprecedented array of businesses might affect public policy.”

So we decided to see what it would look like in Linkurious, our graph analysis and visualization tool.

The dataset is publicly available in a Google spreadsheet. We imported it into a Neo4j graph database using the following script inspired by Michael Hunger’s work:

 Scooped by luiy

Spatial History Project I #dataviz #cartographie

This research measures and maps the production of nineteenth-century space using the tools of the digital age. Computational analysis allowed me to quantify how late nineteenth-century newspapers crafted a view of the world for their readers. Specifically, I examine the Houston Daily Post from 1894 to 1901 to study how late nineteenth-century America appeared from a specific vantage point in time and space. What places loomed large in the paper’s imagined geography? How did large-scale processes of incorporation, standardization, and nationalization shape the paper’s production of space and place? What was the relationship between region and nation? To answer these questions I combine traditional historical research with digital analysis of the paper.

 Scooped by luiy

Predict Crime | Predictive Policing Software | PredPol

PredPol, "Predictive Policing Software", is a policing technology that helps law enforcement predict and prevent crime.
