Big Data Technology, Semantics and Analytics
Trends, success and applications for big data including the use of semantic technology
Curated by Tony Agresta
Scooped by Tony Agresta

SWJ: 5 years in, most cited papers | www.semantic-web-journal.net

Tony Agresta's insight:

The Semantic Web Journal was launched 5 years ago. There's a wealth of information here on semantic technology.


Below is the abstract for the number two entry - GraphDB (formerly OWLIM). You can download the paper on the site.


"An explosion in the use of RDF, first as an annotation language and later as a data representation language, has driven the requirements for Web-scale server systems that can store and process huge quantities of data, and furthermore provide powerful data access and mining functionality. This paper describes OWLIM (now called GraphDB), a family of semantic repositories that provide storage, inference and novel data-access features delivered in a scalable, resilient, industrial-strength platform."


Ontotext Delivers Semantic Publishing Solutions to the World’s Largest Media & Publishing Companies

Washington DC (PRWEB) August 27, 2014 -- Ontotext Media & Publishing delivers semantic publishing solutions to the world’s largest media and publishing companies including automated content enrichment, data management, content and user analytics and natural language processing. Recently, Ontotext Media and Publishing has been enhanced to include contextually-aware reading recommendations based on content and user behavior, delivering an even more powerful user experience.
Tony Agresta's insight:

Semantic Recommendations are personalized, contextual recommendations based on a blend of search history, user profiles and, most importantly, semantically enriched content. This refers to content that has been analyzed using natural language processing. Entities are extracted from the text, classified and indexed inside a graph database. When a visitor comes to a website or information portal, "Semantic Recommendations" understands more than just the past browsing history. It understands which other articles have relevant, contextual information of interest to the reader. This, in turn, creates a fantastic user experience because visitors get much more than they originally thought would be available in search results. This news release talks more about Semantic Recommendations and Ontotext Media and Publishing. By the way, this same technology can be used for any website, any information product, any search and discovery application. The basic premise is that once all of your content has been semantically enriched, search engines deliver highly relevant results.
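
To make the idea concrete, here is a minimal sketch in Python of recommending articles by the entities they share. It is an illustration only, not how Ontotext's product works: the article IDs, the entity sets and the choice of Jaccard overlap as the score are all assumptions, and it ignores the search history and user profile signals mentioned above.

```python
def jaccard(a, b):
    """Overlap between two entity sets: 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(current_id, articles, top_n=2):
    """Rank other articles by how many extracted entities they share
    with the article the visitor is reading right now."""
    current = articles[current_id]
    scored = [
        (jaccard(current, entities), article_id)
        for article_id, entities in articles.items()
        if article_id != current_id
    ]
    # Drop articles with no entity in common, then sort best-first
    # (ties broken by article ID so the order is deterministic).
    scored = [(score, article_id) for score, article_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for _, article_id in scored[:top_n]]


# Hypothetical enriched content: article ID -> entities found in its text.
ARTICLES = {
    "a1": {"GraphDB", "Ontotext", "RDF"},
    "a2": {"GraphDB", "RDF", "SPARQL"},
    "a3": {"Ontotext", "Publishing"},
    "a4": {"Football"},
}

print(recommend("a1", ARTICLES))
```

A reader of "a1" is pointed to "a2" first because the two articles share the most entities, while "a4" never appears because it has nothing in common.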


Thought Leaders in Big Data: Atanas Kiryakov, CEO of Ontotext (Part 1)

This segment is part 1 in the series: Thought Leaders in Big Data: Atanas Kiryakov, CEO of Ontotext
Tony Agresta's insight:

This interview was with Atanas Kiryakov, founder and CEO of Ontotext. He is an expert in semantic technology and discusses use cases for text mining, graph databases, semantic enrichment and content curation. This is a five-part series, and I would recommend it to anyone interested in taking the next step in big data: semantic analysis of text leading to contextual search and discovery applications.


Ontotext Improves Its RDF Triplestore, GraphDB™ 6.0: Enterprise Resilience, Faster Loading Speeds and Connectors to Full-Text Search Engines Top the List of Enhancements

Sofia, Bulgaria (PRWEB) August 20, 2014 -- Today, Ontotext released GraphDB™ 6.0 including enhancements to the high availability enterprise replication cluster, faster loading speeds, higher update rates and connectors for Lucene, SOLR and Elasticsearch. GraphDB™ 6.0 is the next major release of OWLIM – the triplestore known for its outstanding support for OWL 2 and SPARQL 1.1 that already powers some of the most impressive RDF database showcases.
Tony Agresta's insight:

This press release from PRWEB summarizes the latest enhancements to GraphDB from Ontotext, including improvements in load speeds, the enterprise high availability replication cluster and connectors to Lucene, Solr and Elasticsearch.


Triplestores Rise In Popularity


Triplestores:  An Introduction and Applications

Tony Agresta's insight:

Triplestores are gaining in popularity. This article does a nice job of describing what triplestores are and how they differ from graph databases. But there isn't much in the article on how triplestores are used. So here goes:

Some organizations are finding that when they combine semantic facts (triples) with other forms of unstructured and structured data, they can build extremely rich content applications. In some cases, content pages are constructed dynamically. These context-based applications deliver targeted, relevant results, creating a unique user experience. Single unified architectures that can store and search semantic facts, documents and values at the same time require fewer IT and data processing resources, resulting in shorter time to market. Enterprise-grade technology provides security, replication, availability, role-based access and the assurance that no data is lost in the process. Real-time indexing provides instant results.

Other organizations are using triplestores and graph databases to visually show connections useful in uncovering intelligence about their data. Visualization tools connect easily to triplestores and NoSQL databases, allowing users to configure graphs that show how the data is connected. There's wide applicability for this, but common use cases include identifying fraud and money laundering networks, counter-terrorism, social network analysis, sales performance, cyber security and IT asset management. The triples, documents and values provide the fuel for the visualization engine, allowing for comprehensive data discovery and faster business decisions.

Other organizations focus on semantic enrichment and then ingest resulting semantic facts into triplestores to enhance the applications mentioned above.  Semantic enrichment extracts meaning from free flowing text and identifies triples.

Today, the growth in open data - pre-built triplestores - is allowing organizations to integrate semantic facts to create richer content applications. There are hundreds of open data sources that together contain tens of billions of triples, all free.

What's most important about these approaches?  Your organization can easily integrate all forms of data in a single unified architecture.  The data is driving smart websites, rich search applications and powerful approaches to data visualization.   This is worth looking at more closely since the end results are more customers, lower costs, greater insights and happier users.


Not All Graph Databases Are Created Equally - An Interview with Atanas Kiryakov - Ontotext

Graph databases help enterprise organizations transform the management of unstructured data and big data.
Tony Agresta's insight:

Atanas Kiryakov is a 15-year veteran of semantic technology and graph databases. He will be interviewed on September 30th at 11 AM EDT. I would suggest you sign up for this webinar, which will focus on the following:


  • Significant use cases for semantic technology - How are they transforming business applications today?
  • The importance of graph databases - What makes them unique?
  • Creating text mining pipelines - How are they used in conjunction with graph databases?
  • The Semantic Platform - What other tools make up a complete semantic platform and how are they used?


You can review the webinar using the link above and sign up.  Details about the webinar itself will be e-mailed to you around the middle of September.



Why are graph databases hot? Because they tell a story... - Ontotext

Graph databases, text mining and inference allow you to extract meaning from text, perform semantic analysis and aid in knowledge management and data discovery
Tony Agresta's insight:

Inference is the ability to infer new facts using existing facts.  For example, if you know that Susan lives in Texas and Texas is in the USA, you can infer that Susan lives in the USA.   Inference can take on much more complex scenarios the results of which can be stored inside a graph database.  As these new facts are "materialized" they can inform websites, search applications and various forms of analysis.  This is where the real power of inference comes into play.
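
The Susan example can be sketched in a few lines of Python. This is a toy forward-chaining loop over a single hand-written rule, not how a production graph database performs inference; the predicate names `lives_in` and `located_in` are made up for illustration.

```python
def materialize(facts):
    """Apply one rule until no new facts appear:
    (x, lives_in, y) and (y, located_in, z)  =>  (x, lives_in, z).
    Returns the original facts plus every inferred fact."""
    known = set(facts)
    while True:
        new_facts = set()
        for person, pred, place in known:
            if pred != "lives_in":
                continue
            for inner, pred2, outer in known:
                if pred2 == "located_in" and inner == place:
                    candidate = (person, "lives_in", outer)
                    if candidate not in known:
                        new_facts.add(candidate)
        if not new_facts:
            return known  # fixed point reached: nothing left to infer
        known |= new_facts


FACTS = {
    ("Susan", "lives_in", "Texas"),
    ("Texas", "located_in", "USA"),
}

for fact in sorted(materialize(FACTS)):
    print(fact)
```

The inferred triple ("Susan", "lives_in", "USA") is stored alongside the original facts, which is what "materialized" means here: the new fact can be queried directly without re-running the rule.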


Doing this "at scale" requires a high performance graph database that can infer new facts while users are simultaneously querying the database and new facts are being loaded - all within an enterprise resilient environment. This blog post explains more about graph databases, inference and how the semantic integration of data can improve productivity and results.


Text Mining & Graph Databases - Two Technologies that Work Well Together - Ontotext

Graph databases, also known as triplestores, have a very powerful capability – they can store hundreds of billions of semantic facts (triples) from any subject imaginable. The number of free semantic facts on the market today from sources such as DBpedia, GeoNames and others is high and continues to grow every day. Some estimates have this total between 150 and 200 billion right now. As a result, Linked Open Data can be a good source of information with which to load your graph databases. Linked Open Data is one source of data. When does it become really powerful? When you create your own semantic triples from your own data and use them in conjunction with linked open data to enrich your database. This process, commonly referred to as text mining, extracts the salient facts from free flowing text and typically stores the results in some database. With this done, you can analyze your enriched data, visualize it, aggregate it and report on it. In a recent project Ontotext undertook on behalf of FIBO (Financial Industry Business Ontology), we enhanced the FIBO ontologies with Linked Open Data allowing us to query company names and stock prices at the same time to show the lowest trading prices for all public stocks in North America in the last 50 years. To do this, we needed to combine semantic data sources, something that’s easy to do with the Ontotext Semantic Platform. We have found that the optimal way to apply text mining is in conjunction with a graph database. Many of our customers use our Text Mining to do just that. Some vendors only sell graph databases and leave it up to you to figure out how to mine the text. Other vendors only sell the text mining part and leave it up to…
Tony Agresta's insight:

Here's a summary of how text mining works with graph databases. It describes the major steps in the text mining process and ends with how entities, articles and relationships are indexed inside the graph database. The blend of these two major classes of technology allows all of your unstructured data to be discoverable. Search results are informed by much more than just the metadata associated with the document or e-mail. They are informed by the meaning inside the document, the text itself, which contains important insights about people, places, organizations, events and their relationships to other things.
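
As a rough illustration of that last step, the Python sketch below turns entity mentions into triples and then searches documents by entity rather than by metadata. Real text mining pipelines use natural language processing; this sketch assumes a tiny hand-made lookup table (a gazetteer) and plain substring matching, and the document IDs and sample sentences are hypothetical.

```python
# Hypothetical gazetteer: known entity name -> entity class.
GAZETTEER = {
    "Atanas Kiryakov": "Person",
    "Ontotext": "Organization",
    "Sofia": "Place",
}


def extract_triples(doc_id, text, gazetteer=GAZETTEER):
    """Emit (subject, predicate, object) triples for every known
    entity whose name appears in the text."""
    triples = []
    for name, entity_class in sorted(gazetteer.items()):
        if name in text:  # naive substring match, stands in for real NLP
            triples.append((name, "type", entity_class))
            triples.append((doc_id, "mentions", name))
    return triples


def find_documents(triples, entity):
    """Return the IDs of all documents that mention the given entity."""
    return sorted({s for s, p, o in triples if p == "mentions" and o == entity})


DOCS = {
    "doc1": "Atanas Kiryakov is the founder of Ontotext.",
    "doc2": "Ontotext announced the release from Sofia.",
}

index = []
for doc_id, text in DOCS.items():
    index.extend(extract_triples(doc_id, text))

print(find_documents(index, "Ontotext"))
```

Because the search runs over "mentions" triples, a document is found by what its text says, even when its metadata says nothing about the entity.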


September 30th at 11 AM EDT: Not All Graph Databases Are Created Equally - An Interview with Atanas Kiryakov - Ontotext

Graph databases help enterprise organizations transform the management of unstructured data and big data.
Tony Agresta's insight:

Graph databases store semantic facts used to describe entities and relationships to other entities. This educational webinar will be hosted by Ontotext in an interview format with Atanas Kiryakov, an expert in this field. If you want to learn about use cases for graph databases and how you can extract meaning from free flowing text and store the results in a graph database, this webinar is a must.


RDF 101 - Cambridge Semantics

Semantic University tutorial and introduction to RDF.
Tony Agresta's insight:

This post is a bit technical but is recommended reading since it represents one of the most important aspects of data science today: the ability to discern meaning from unstructured data.


To truly appreciate the significance of this technology, consider the following questions -  Since over 80% of the data being created today is unstructured in forms that include text, videos, images, documents, and more, how can organizations interpret the meaning behind this data?   How can they pull out the facts in the data and show relationships between those facts leading to new insights?

 

This post provides you with the foundation on how semantic processing works.   Cutting to the chase, the technologies are referred to as RDF (Resource Description Framework), SPARQL and OWL. They allow us to create underlying data models that understand relationships in unstructured data, even across web sites, document repositories and disparate applications like Facebook and LinkedIn.

 

These data models store data that has properties extracted from the unstructured data. Consider the following sentence: "The monkeys are destroying Tom's garden." Semantic processing of this text deconstructs the sentence, identifying the subject, predicate and object while also building relationships among the three. The subject is "monkeys", and they are taking an action on "the garden". The garden is therefore the object and the predicate is "destroying".
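
In code, that decomposition is just a three-part record. The Python sketch below writes the triple out by hand to show its shape; it does not parse the sentence, and the `Triple` type and arrow notation are illustrative choices rather than RDF syntax.

```python
from collections import namedtuple

# A semantic fact: who or what (subject), the relationship (predicate),
# and what it points at (object).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# "The monkeys are destroying Tom's garden."
fact = Triple(subject="monkeys", predicate="destroying", object="garden")


def as_statement(triple):
    """Render a triple as a labeled edge between two nodes."""
    return f"{triple.subject} --{triple.predicate}--> {triple.object}"


print(as_statement(fact))
```

Thinking of the predicate as a labeled edge between two nodes is what lets a collection of such facts be stored and drawn as a graph.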

 

Most importantly, there is a connection made between the monkeys and the garden allowing us to show relationships between specific facts pulled from text.   How can this help us? 

 

Assume for a second that you’re working for a government agency tracking a suspicious person who is on a watch list. Crawling the web for that person's name is one way to find additional information about the person. Technology to do this exists today. When the name is detected, identifying relationships between the person being investigated and other subjects (or objects in the text) can lead you to new people who may also be of interest. For example, if Sam is on the watch list, a sentence like this would be of interest: "Sam works with Steve at ABC Home Builders Corp." A relationship between the suspect (Sam) and someone new to the investigation (Steve) could be identified. The fact that they both work for the same employer allows analysts to connect the subjects through this employer.
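
Here is a small Python sketch of that connection step, assuming the sentence has already been turned into triples. The `works_for` predicate, the extra person "Alice" and the company "Other Co" are hypothetical additions used only to show that unrelated people are not flagged.

```python
from collections import defaultdict
from itertools import combinations


def colleagues(triples):
    """Link every pair of people who share an employer.
    Returns a set of (person_a, person_b, employer) tuples."""
    staff = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "works_for":
            staff[obj].add(subject)
    links = set()
    for employer, people in staff.items():
        for a, b in combinations(sorted(people), 2):
            links.add((a, b, employer))
    return links


def persons_of_interest(triples, watch_list):
    """People not on the watch list who are linked to someone who is."""
    found = set()
    for a, b, _employer in colleagues(triples):
        if a in watch_list and b not in watch_list:
            found.add(b)
        if b in watch_list and a not in watch_list:
            found.add(a)
    return found


# Facts extracted from: "Sam works with Steve at ABC Home Builders Corp."
TRIPLES = [
    ("Sam", "works_for", "ABC Home Builders Corp"),
    ("Steve", "works_for", "ABC Home Builders Corp"),
    ("Alice", "works_for", "Other Co"),  # hypothetical, unrelated person
]

print(persons_of_interest(TRIPLES, watch_list={"Sam"}))
```

Steve surfaces because he and Sam point at the same employer node; Alice does not, since nothing connects her to anyone on the watch list.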

 

These interesting facts help investigators make connections within e-mail, phone conversations, in house data and other sources, all of which can be displayed visually in a graph to show the subjects and how they are linked. 

 

Data models to store, search and analyze this data will become one of the primary tools to interpret massive amounts of data being collected today.  This technology allows computers to understand relationships in unstructured data and display those relationships to analysts in the form of visual diagrams that clearly show connections to other data including phone calls, events, accounts, and more.  The implications of this extend far beyond counter terrorism to include social networking, marketing, fraud, cyber security and sales to name a few.


We are at an inflexion point in big data – data stored in silos can now be consolidated with external data from the open web.  Most importantly, the unstructured data can be interpreted as we form connections that are integral in understanding how things are related to each other.  Data visualization technology is the vehicle to display those connections allowing analysts to explore any form of data in a single application. 


Learn more about this technology and other advances in Enterprise NoSQL here.

