e-Xploration
antropologo.net, dataviz, collective intelligence, algorithms, social learning, social change, digital humanities
Curated by luiy
Scooped by luiy

The Five Graphs of Love | #Neo4j #SNA #algorithms

The iDating industry cares about interactions and connections. Those two concepts are closely linked. If someone has a connection to another person, through a shared…
luiy's insight:

Dating sites and apps worldwide have begun to use graph databases to gain a competitive advantage. Neo4j provides thousand-fold performance improvements and massive agility benefits over relational databases, enabling new levels of performance and insight. Join us for a webinar, presented by Amanda Laucher, that discusses the five graphs of love and how companies like eHarmony, Hinge and AreYouInterested.com are now using graph algorithms to create more interactions and connections.
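One classic graph query behind such matching features is "people two hops away whom you are not yet connected to". A minimal sketch in plain Python, with an in-memory adjacency list standing in for Neo4j (in Cypher this would be a short friend-of-a-friend pattern match; all names here are illustrative):

```python
from collections import defaultdict

# In-memory stand-in for a graph database: user -> set of connections.
connections = defaultdict(set)

def connect(a, b):
    """Record an undirected connection between two users."""
    connections[a].add(b)
    connections[b].add(a)

def suggestions(user):
    """Users exactly two hops away who are not yet connected to `user`."""
    direct = connections[user]
    candidates = set()
    for friend in direct:
        candidates |= connections[friend]
    return candidates - direct - {user}

connect("ann", "bob")
connect("bob", "carol")
connect("ann", "dave")
connect("dave", "erin")

print(sorted(suggestions("ann")))  # ['carol', 'erin'], via bob and dave
```

A graph database evaluates this kind of traversal per-node instead of via table joins, which is where the performance claims for connection-heavy workloads come from.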

 


Learning Graph - #Neo4j GraphGist

luiy's insight:

This graph visualizes the knowledge a person has in a certain area: in this example, a Java developer's knowledge of a stack of technologies. The purpose is to document acquired knowledge and to help further educate oneself in a structured way. Dependencies between technologies are graphed, along with resources that can be used to learn each technology, so that possible learning paths through the graph can be determined. A learning path shows a way to reach a specific technology by first learning its prerequisite technologies, in order. The graph is not meant to be static; it is updated as new connections between technologies are discovered and new knowledge is acquired.

The main use case explored here is how to learn Spring Data Neo4j (SDN) in a structured manner.

Node types

Person: green

Technology: yellow

Concept: orange

Resources: blue
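The "learn prerequisites first, in order" idea is a topological ordering of the dependency graph. A small sketch under assumed edges (the technology names follow the SDN example above, but this particular prerequisite list is illustrative, not taken from the gist):

```python
# Hypothetical prerequisite edges: technology -> what to learn first.
prerequisites = {
    "Spring Data Neo4j": ["Spring", "Neo4j"],
    "Spring": ["Java"],
    "Neo4j": ["Cypher"],
    "Java": [],
    "Cypher": [],
}

def learning_path(target, seen=None):
    """Depth-first walk returning a prerequisites-first order ending at `target`."""
    if seen is None:
        seen = []
    for dep in prerequisites.get(target, []):
        if dep not in seen:
            learning_path(dep, seen)
    if target not in seen:
        seen.append(target)
    return seen

print(learning_path("Spring Data Neo4j"))
# ['Java', 'Spring', 'Cypher', 'Neo4j', 'Spring Data Neo4j']
```

In the actual GraphGist the same ordering would come out of a Cypher path query over the prerequisite relationships rather than an in-memory walk.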

 

Comparison of Popular #NoSql databases (#MongoDb, #CouchDb, #Hbase, #Neo4j, #Cassandra)

luiy's insight:

There are many SQL databases, but I personally feel SQL's long run as the default choice is coming to an end as everyone moves into the era of Big Data. As experts say, SQL databases are not the best fit for Big Data; NoSQL databases came into the picture as a better fit, offering more flexibility in how data is stored.

I just want to compare a few popular NoSQL databases available at this point in time. A few well-known NoSQL databases are:

 

- MongoDb,

- Cassandra,

- Hbase,

- CouchDb,

- Neo4j.

 

NoSQL databases differ from each other far more than SQL databases differ among themselves. It is one's responsibility to choose the appropriate NoSQL database for an application based on its use case.
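The use-case dependence is easy to see by modeling the same data two ways. A small illustrative sketch in plain Python: a document shape (MongoDB-style, relationships embedded in each record) versus a graph shape (Neo4j-style, relationships indexed in both directions); the records are made up for illustration:

```python
# Document model: each user document embeds its outgoing "follows" list.
documents = [
    {"_id": "ann",   "follows": ["bob", "carol"]},
    {"_id": "bob",   "follows": ["carol"]},
    {"_id": "carol", "follows": []},
]

# Answering "who follows carol?" means scanning every document.
followers_doc = {d["_id"] for d in documents if "carol" in d["follows"]}

# Graph model: relationships are first-class, so the reverse direction
# is as cheap to traverse as the forward one. Build the reverse index.
incoming = {}
for d in documents:
    for target in d["follows"]:
        incoming.setdefault(target, set()).add(d["_id"])

followers_graph = incoming.get("carol", set())
print(followers_doc == followers_graph)  # True: both find ann and bob
```

Document stores shine when data is read the way it is written; graph stores shine when queries traverse relationships in arbitrary directions. Column stores like HBase and Cassandra make yet another trade-off.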


#MongoDB, #Neo4j and #Gephi project I #datascience

luiy's insight:

The first step in this process, presented by Showk, is importing the cables. Luckily, the WikiLeaks cables follow a simple structure that makes this relatively easy. Showk based his work on the cablegate Python code by Mark Matienzo that scrapes data from the cables in HTML form and converts this to Python objects. For the HTML scraping, the code is using Beautiful Soup, a well-known Python HTML/XML parser that automatically converts the web pages to Unicode and can cope with errors in the HTML tree. Moreover, with a SoupStrainer object, you can tell the Beautiful Soup parser to target a specific part of the document and forget about all the boilerplate parts such as the header, footer, sidebars, and supporting information.
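The targeting idea — parse only the part of the page you care about and skip the boilerplate — can be sketched with the stdlib `HTMLParser` standing in for Beautiful Soup's `SoupStrainer`. The `cable` class name and the sample HTML are assumptions for illustration:

```python
from html.parser import HTMLParser

class CableBodyParser(HTMLParser):
    """Collect text only inside the element with class="cable"."""

    def __init__(self):
        super().__init__()
        self.in_body = False  # are we inside the cable element?
        self.depth = 0        # nesting depth within it
        self.chunks = []      # collected text fragments

    def handle_starttag(self, tag, attrs):
        if self.in_body:
            self.depth += 1
        elif ("class", "cable") in attrs:
            self.in_body = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.in_body:
            self.depth -= 1
            if self.depth == 0:
                self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)

html = ('<html><div id="header">menu</div>'
        '<div class="cable">SECRET <b>memo</b> text</div></html>')
parser = CableBodyParser()
parser.feed(html)
print("".join(parser.chunks))  # SECRET memo text
```

Beautiful Soup adds what this sketch lacks: tolerance for broken HTML trees and automatic Unicode conversion, which is why the real code relies on it.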

 

After the parsing, the Python natural language toolkit NLTK is used on the text body to bring more structure to the word scramble, with the goal of extracting some topics. The first step is tokenization: NLTK allows easily breaking up a text into sentences and each sentence into its separate words. Then for each word the stem is determined, which means that all words are grouped by their root. For example, to analyze the topics of the WikiLeaks cables, it doesn't matter if the word in a text is "language" or "languages", so they are both grouped by their root "languag". A SHA-256 hash value of each stem is then used as a database index.
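A rough sketch of that tokenize / stem / hash pipeline, with a naive suffix-stripping stemmer standing in for NLTK's (enough to send "language" and "languages" to the same root) and `hashlib` producing the SHA-256 index key:

```python
import hashlib
import re

def tokenize(text):
    """Lowercase word tokens; NLTK's tokenizers are far more careful."""
    return re.findall(r"[a-z]+", text.lower())

def stem(word):
    """Crudely strip common English suffixes -- illustration only."""
    for suffix in ("ations", "ation", "ings", "ing", "es", "s", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def index_key(word):
    """SHA-256 hex digest of the stem, used as the database index value."""
    return hashlib.sha256(stem(word).encode("utf-8")).hexdigest()

print(stem("languages"), stem("language"))              # languag languag
print(index_key("languages") == index_key("language"))  # True
```

Hashing the stem gives a fixed-length, uniformly distributed key, which is convenient for indexing variable-length word roots.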

 

MongoDB, a document-oriented database, is used as document storage for all this data. MongoDB allows transparently inserting and reading records as Python dictionaries, as well as automatic serializing and deserializing of the objects. Then Showk queried the MongoDB database to extract the heaviest occurrences and co-occurrences of words, and converted that to a graph using the Neo4j graph database.
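The co-occurrence step can be sketched in memory: where the real pipeline queries MongoDB for one document per cable, a plain list of stemmed-word lists stands in here, and a `Counter` tallies how often word pairs share a cable. The sample words are invented for illustration:

```python
from collections import Counter
from itertools import combinations

# Stand-in for MongoDB documents: stemmed words per cable.
cables = [
    ["iran", "nuclear", "sanction"],
    ["iran", "nuclear", "embassy"],
    ["embassy", "visa"],
]

# Count each unordered word pair once per cable it appears in.
cooc = Counter()
for words in cables:
    for a, b in combinations(sorted(set(words)), 2):
        cooc[(a, b)] += 1

# Heaviest pairs first: these become weighted edges in the graph.
edges = cooc.most_common()
print(edges[0])  # (('iran', 'nuclear'), 2)
```

In the real project those heaviest pairs are then written to Neo4j as relationships between word nodes, with the count as the edge weight.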

 

For the final step, visualizing and analyzing the data, Bilcke used Gephi, an open source desktop application for the visualization of complex networks. Gephi, to which Bilcke is an active contributor, is a research-oriented graph visualization tool that has been used in the past to visualize some interesting graphs, like open source communities and social networks on LinkedIn. It's based on Java and OpenGL, but it also has a headless library, the Gephi Toolkit.
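One common way to hand a graph like this to Gephi is the GEXF file format, which Gephi reads natively. A minimal stdlib-only writer, sketched against the GEXF 1.2 draft schema (the tiny sample graph is illustrative):

```python
import xml.etree.ElementTree as ET

def to_gexf(nodes, edges):
    """nodes: list of labels; edges: list of (source, target) label pairs."""
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    nodes_el = ET.SubElement(graph, "nodes")
    ids = {}
    for i, label in enumerate(nodes):
        ids[label] = str(i)
        ET.SubElement(nodes_el, "node", id=str(i), label=label)
    edges_el = ET.SubElement(graph, "edges")
    for i, (src, dst) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i),
                      source=ids[src], target=ids[dst])
    return ET.tostring(gexf, encoding="unicode")

xml_text = to_gexf(["iran", "nuclear", "embassy"],
                   [("iran", "nuclear"), ("iran", "embassy")])
print(xml_text[:60])
```

Edge weights and visual attributes can be added the same way; Gephi's layout algorithms then take over from there.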
