Big Data Technology, Semantics and Analytics
Trends, success and applications for big data including the use of semantic technology
Curated by Tony Agresta
Scooped by Tony Agresta!

Defining Big Data Visualization and Analysis Use Cases -- TDWI - The Data Warehousing Institute

Use these five use cases to spark your thinking about how to combine big data and visualization tools in your enterprise.
Tony Agresta's insight:

One form of data visualization that is underutilized by sales and marketing professionals is the relationship graph, which shows connections between people, places, things, events... any attributes you want to see in the graph. This form of visualization has long been used by the intelligence community to find bad guys and identify fraud networks, but it also has practical applications in sales and marketing.


Let's say you're trying to improve your lead conversion process and accelerate sales cycles. Wouldn't it be important to analyze relationships between campaigns, the qualified leads created, the business development people who created those leads, and how fast each lead progressed through sales stages?


Imagine a network graph that showed the campaigns, the business development people who worked the lead pool, the qualified leads and the number of opportunities created. Imagine if components of the graph (nodes) were scaled based on the amount of money spent on each campaign, the number of leads each person worked and the value of each opportunity. Your eye would be immediately drawn to a number of insights.


You could quickly see which campaigns provided the most bang for your buck - the ones with relatively low cost and high qualified lead production. You could quickly see which business development reps generated a high volume of qualified leads and how many of those turned into real opportunities. Now imagine if you could play the creation of the graph over time. You could see when campaigns started to generate qualified leads. How long did it take? How soon could sales expect qualified leads? Should your campaign planning cycles change? Are your more expensive campaigns having the impact you expected? Is this all happening fast enough to meet sales targets?


This form of data visualization is easier to apply than you might think. There are tools on the market that let you connect to CSV files exported from your CRM system and draw the graph in seconds. As data visualization becomes more common in business, sales and marketing professionals will start to use this approach to measure the performance of campaigns and employees while better understanding the influencing factors in each stage of the sales cycle.
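As a rough sketch of the idea, here is how such a campaign-to-rep-to-opportunity relationship graph could be assembled from CRM export data before handing it to a drawing tool. Every column name, campaign, rep and dollar figure below is a hypothetical example, and the scaling rule (spend for campaigns, leads worked for reps, deal value for opportunities) is just one way to size the nodes:

```python
# Sketch: build a campaign -> rep -> opportunity relationship graph from
# CRM-style records. All field names and figures are hypothetical examples.

# Hypothetical rows exported from a CRM: (campaign, rep, opportunity_value)
leads = [
    ("Webinar Q1", "Alice", 12000),
    ("Webinar Q1", "Bob", 0),        # lead worked but no opportunity yet
    ("Trade Show", "Alice", 30000),
    ("Email Blast", "Carol", 5000),
]
campaign_spend = {"Webinar Q1": 4000, "Trade Show": 15000, "Email Blast": 1000}

nodes = {}   # node id -> {"type": ..., "size": ...}
edges = []   # (source, target) pairs

for campaign, rep, value in leads:
    # Scale campaign nodes by spend, rep nodes by leads worked,
    # opportunity nodes by deal value (these are the "scaled" nodes).
    nodes.setdefault(campaign, {"type": "campaign",
                                "size": campaign_spend[campaign]})
    rep_node = nodes.setdefault(rep, {"type": "rep", "size": 0})
    rep_node["size"] += 1  # one more lead worked by this rep
    edges.append((campaign, rep))
    if value > 0:
        opp = f"opp:{campaign}/{rep}"
        nodes[opp] = {"type": "opportunity", "size": value}
        edges.append((rep, opp))

# A quick "bang for your buck" summary: opportunity value per campaign dollar.
for campaign, spend in campaign_spend.items():
    won = sum(v for c, _, v in leads if c == campaign and v > 0)
    print(campaign, round(won / spend, 2))
```

The same node/edge structure can be fed to any graph-drawing library; the point is that the sizing attributes are computed once, up front, so the eye is drawn to expensive campaigns, busy reps and big deals as soon as the graph renders.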

Scooped by Tony Agresta!

The 5 most influential data visualizations of all time

Data visualization allows us all to see and understand our data more deeply. That understanding breeds good decisions. Without data visualization and data analysis...
Tony Agresta's insight:

Data visualization is the key to unlocking meaning inside big data. The ability to explore the data in an unconstrained way through interactive charting lets analysts uncover insights rapidly. Related features such as filtering let users focus on subsets of big data for further analysis. Collaboration extends this process by multiplying the efforts of individual analysts.

Tony Agresta's comment, December 16, 2012 9:49 AM
Most of the data visualization approaches today fall into five classes: charting, tables, geo-spatial, networks and time lines. The vast majority of business analysts focus on charting and, as you know, there are hundreds of different ways to express data visually using charts. Tables have also been used extensively in conjunction with icons that show direction for a specific metric using arrows or lines, answering the question "Is the metric in the table trending upward or downward?"

Geo-spatial visualizations continue to grow in popularity, typically showing concentrations of events or people within a geographic area alongside other key landmarks or metrics. Networks of connected entities (people, places, events such as phone calls, log-in access or database access) are beginning to be used more extensively outside of core markets such as government intelligence, law enforcement, cyber security and commercial fraud analysis. Most of these "link-node" or "link analysis" networks are used to identify groups of people that would otherwise have gone undetected without this form of visualization. And with the tremendous growth in social networks, these forms of visualization will continue to be used by commercial and government organizations. Time line analysis, or temporal analysis, is one of the less common forms of visualization and also one of the most revealing, since analysts can detect patterns or trends in activity over time.

But the fact is that no single visualization can tell the entire story. This is where interactivity between the visualizations, and the ability to explore the big data space in an unconstrained manner including looking across data sets, is very important. This interactivity leads to faster learning. The human mind is able to recognize outliers and interesting patterns. Subsets of data can be created to examine the attributes of each new data spin-off. Conclusions can then be drawn and shared with the business to make informed decisions.
Scooped by Tony Agresta!

Importance of NoSQL to Discovery - A Data Analysis Road Map You Can Apply Today

When you use the analytical process known as discovery, I recommend that you look for tools and environments that allow you to connect to NoSQL platforms.
Tony Agresta's insight:

The convergence of data visualization and NoSQL is becoming a hotter topic every day. We're at the very beginning of this movement as organizations integrate many forms of data with technology to visualize relationships and detect patterns across and within data sets. There aren't many vendors that do this well today, and demand is growing. Some organizations are trying to achieve big data visualization through data science as a service. Some software companies have created connectors to NoSQL (and other) data sources to reach this goal. As you would expect, deployment options run the gamut.

Examples of companies that offer data visualization generated from a variety of data sources, including NoSQL, are Centrifuge Systems, which displays results in the form of relationship graphs; Pentaho, which provides a full array of analytics including data visualization and predictive analytics; and Tableau, which supports dozens of data sources along with great charting and other forms of visualization. Regardless of which you choose (and there are others), the process you apply to select and analyze the data will be important.

In the article, John L. Myers discusses some of the challenges users face with data discovery technology (DDT). Since DDT operates from the premise that you don't know all the answers in advance, it's more difficult to pinpoint the sources needed in the analysis. Analysts discover insights as they navigate through the data visualizations. This challenge isn't too distant from what predictive modelers face as they decide which variables to feed into models. They often don't know what the strongest predictors will be, so they apply their experience to carefully select data. They sometimes transform specific fields, allowing an attribute to exhibit greater explanatory power. BI experts have long struggled with the same issue as they try to decide which metrics and dashboards will be most useful to the business.

Here are some guidelines that may help you solve the problem. They can be used to plan your approach to data analysis.

  • Start by writing down a hypothesis you want to prove before you connect to specific sources. What do you want to explore? What do you want to prove? In some cases, you'll want to prove many things. That's fine. Write down your top ones.
  • For each hypothesis, create a list of specific questions you want to ask of the data that could prove or disprove the hypothesis. You may have 20 or 30 questions for each hypothesis.
  • Find the data sources that have the data you need to answer the questions. What data will you need to arrive at a conclusion?
  • Begin to profile each field to see how complete the data is. In other words, take an inventory of the data, checking whether there are missing values, data quality errors or values that make the specific source a good one. This may point back to changes needed in data collection by your current systems or processes.
  • Go a layer deeper in your charting and profiling, beyond histograms, to show relationships between variables you believe will be helpful as you attempt to answer your list of questions and prove or disprove your hypothesis. Show relationships between two or more variables using heat maps, cross tabs and drill charts.
  • Reassess your original hypothesis. Do you have the necessary data? Or do you need to request additional types of data?
  • Once you are set on the inventory of data and you have the tools to connect to those sources, create a set of visualizations that answer each of the questions. In some cases it may take 4 or 5 visualizations per question; sometimes you will be able to answer a question with one.
  • Assemble the results for each question to prove or disprove the hypothesis. You should arrive at a storyboard that, when assembled in the right order, allows you to articulate the steps in the analysis and draw the conclusions needed to run your business.
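The field-profiling step in the list above can be sketched in a few lines. The file contents and column names here are hypothetical stand-ins for a real CRM export, which would have far more rows and columns:

```python
import csv
import io

# Hypothetical CRM export; in practice you would open a real CSV file
# instead of this in-memory stand-in.
raw = io.StringIO(
    "campaign,lead_source,opportunity_value\n"
    "Webinar Q1,web,12000\n"
    "Trade Show,,30000\n"       # missing lead_source
    "Email Blast,email,\n"      # missing opportunity_value
)

rows = list(csv.DictReader(raw))

# Profile each field: how complete is it? A high missing rate may point
# back to gaps in how your current systems collect the data.
for field in rows[0]:
    filled = sum(1 for r in rows if r[field].strip())
    print(f"{field}: {filled}/{len(rows)} populated")
```

A simple completeness report like this is usually enough to decide whether a source can answer your questions or whether you need to request additional data before building visualizations.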

If you take these steps up front and work with a tool that allows you to easily connect to a variety of data sources, you can quickly test your theory, profile and adjust the variables used in your analysis, and create meaningful results the organization can use. But if you go into the exercise without any data planning, without any goals in mind, you are bound to waste cycles trying to decide what to include in your analysis and what to leave out. Granted, you won't be able to account for every data analysis issue your department or company has. The purpose of this exercise is to frame the questions you want to ask of the data in support of a more directed approach to data visualization.

Intelligence-led decisions should be well received by your cohorts and applied more readily with this type of up-front planning. The steps you take to analyze the data will run more smoothly. You will be able to explain and better defend the data visualization path you've taken to arrive at conclusions. In other words, the story will be clearer when you present it.

Consider the types of visualizations supported by the analytics technology when you do this. Will you need temporal analysis? Will you require relationship graphs that show connections between people, events, organizations and more? Do you need geospatial visualizations to prove your hypothesis? A little planning when using data discovery and NoSQL technology will go a long way toward meeting your analytical needs.
