In his latest book, statistician and predictive analytics expert Nate Silver describes his approach to forming forecasts out of data.
Scooped by Tony Agresta
“Big data is not a cure-all, and it is inherently filled with noise and uncertainty, but it does have tremendous potential if people approach it the right way. ‘The world is not lacking for techniques, it's more about the right goals and right attitudes,’ Silver said.” Having goals associated with big data analysis is a must. Applying technology and techniques to achieve those goals is not far behind.
Different approaches to analysis, some of which are presented in this article, complement one another and allow you to reach those goals faster. Let's take three classic approaches (dashboards, predictive models and data visualization) and apply them to the problem of fraud detection. Let's say our goals include improved fraud detection for incoming insurance claims and more efficient allocation of resources to investigate those claims. If analysts can prioritize the workload for investigators, they can find fraud faster and reduce costs.
BI dashboards typically show key metrics that may lead the analyst to spot trends worth modeling with predictive analysis. They also point analysts to independent variables that may have some explanatory power in the model. For example, a BI dashboard showing recent insurance claims by postal code may show a spike in certain areas, which could lead to deeper analysis where geographic indicators (city, ZIP+4) are selected as attributes to predict fraudulent claims. A model built on those attributes can then score each incoming claim for its likelihood of fraud. While knowing that an insurance claim has a higher likelihood of being fraudulent is important, understanding the ring of people linked to that claim is potentially more important. Are those people linked to other claims that have been investigated and found to be fraudulent? Do these people share the same address? Are they using the same doctor or pharmacy? Have they worked together in the past?
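To make the dashboard step concrete, here is a minimal sketch of how a spike in claims by postal code might be flagged. The field names (`claim_id`, `postal_code`), the sample counts and the z-score threshold of 2.0 are all assumptions chosen for illustration; a real dashboard would more likely compare each area against its own history.

```python
from collections import Counter
from statistics import mean, pstdev


def claims_by_postal_code(claims):
    """Count claims per postal code, as a dashboard tile might."""
    return Counter(claim["postal_code"] for claim in claims)


def flag_spikes(counts, z_threshold=2.0):
    """Return postal codes whose claim count is unusually high.

    A code is flagged when its count sits more than `z_threshold`
    population standard deviations above the mean count. This rule
    is an illustrative assumption, not a recommended standard.
    """
    values = list(counts.values())
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every area has the same count, so nothing stands out
    return sorted(
        code for code, n in counts.items() if (n - mu) / sigma > z_threshold
    )


# Hypothetical sample: five quiet areas and one with a burst of claims.
sample_counts = {
    "10001": 3, "10002": 2, "10003": 3,
    "10004": 2, "10005": 3, "60601": 20,
}
claims = [
    {"claim_id": f"{code}-{i}", "postal_code": code}
    for code, n in sample_counts.items()
    for i in range(n)
]

counts = claims_by_postal_code(claims)
spikes = flag_spikes(counts)
print(spikes)
```

Postal codes flagged this way are candidates for the geographic attributes mentioned above, not evidence of fraud on their own.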
Data visualization allows you to explore those relationships and picks up where predictive models leave off. In this case, all of the major types of analysis were used to achieve the goal of identifying suspicious claims and ultimately identifying a fraud ring.
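Behind a visualization of this kind there is usually some notion of linked records. The sketch below, under the assumption that each claim carries `address`, `doctor` and `pharmacy` fields (hypothetical names), groups claims that share any of those values, directly or through a chain of other claims. It uses a plain union-find structure and is a simplification of what a link analysis tool would do.

```python
from collections import defaultdict


def find(parent, x):
    """Return the representative of x's group, compressing the path."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def link_claims(claims, keys=("address", "doctor", "pharmacy")):
    """Group claims that share a value for any of `keys`.

    Returns a list of groups (each a sorted list of claim ids),
    largest group first.
    """
    parent = {c["claim_id"]: c["claim_id"] for c in claims}
    first_seen = {}  # (key, value) -> first claim id seen with that value
    for c in claims:
        for key in keys:
            value = c.get(key)
            if value is None:
                continue  # a missing attribute links nothing
            marker = (key, value)
            if marker in first_seen:
                root_a = find(parent, c["claim_id"])
                root_b = find(parent, first_seen[marker])
                parent[root_a] = root_b  # merge the two groups
            else:
                first_seen[marker] = c["claim_id"]

    groups = defaultdict(list)
    for c in claims:
        groups[find(parent, c["claim_id"])].append(c["claim_id"])
    return sorted(
        (sorted(g) for g in groups.values()), key=lambda g: (-len(g), g)
    )


# Hypothetical claims: C1 and C2 share an address, C2 and C3 share a doctor.
sample_claims = [
    {"claim_id": "C1", "address": "12 Oak St", "doctor": "Dr. A"},
    {"claim_id": "C2", "address": "12 Oak St", "doctor": "Dr. B"},
    {"claim_id": "C3", "address": "9 Elm Rd", "doctor": "Dr. B"},
    {"claim_id": "C4", "address": "44 Pine Ave", "doctor": "Dr. C"},
]

rings = link_claims(sample_claims)
print(rings)
```

Here C1 and C3 share nothing directly but end up in the same group through C2, which is the kind of indirect connection that is hard to see in a table and easy to see in a graph.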
Different approaches to analysis can complement one another. Business Intelligence and dashboards provide one level of visibility. They point the analyst to key trends and relationships that may require a model to be built. Results of those models (scores or yes/no indicators) can be used with data discovery tools to understand relationships, identify patterns of behavior, show connections between seemingly disparate data and rapidly draw conclusions. Identifying goals up front will allow analysts to formulate questions they want to ask of the data. Using different types of analysis helps address challenges with big data.
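One way model results and relationship data might be combined to prioritize investigator workload is sketched below. The inputs (groups of linked claim ids, a fraud score per claim, and a set of claims already confirmed as fraudulent) and the ranking rule are assumptions for illustration only; the function name `rank_groups` is hypothetical.

```python
def rank_groups(groups, scores, confirmed_fraud=frozenset()):
    """Order groups of linked claims for investigation.

    Groups with more links to confirmed fraud come first; ties are
    broken by the highest model score in the group. Both rules are
    illustrative assumptions.
    """
    ranked = []
    for group in groups:
        group_scores = [scores.get(claim_id, 0.0) for claim_id in group]
        ranked.append({
            "claims": list(group),
            "max_score": max(group_scores, default=0.0),
            "known_fraud_links": sum(
                1 for claim_id in group if claim_id in confirmed_fraud
            ),
        })
    ranked.sort(key=lambda g: (-g["known_fraud_links"], -g["max_score"]))
    return ranked


# Hypothetical inputs: linked groups, model scores, one confirmed case.
groups = [["C4"], ["C5", "C6"], ["C1", "C2", "C3"]]
scores = {"C1": 0.35, "C2": 0.2, "C3": 0.4, "C4": 0.9, "C5": 0.6, "C6": 0.1}
confirmed = {"C2"}

queue = rank_groups(groups, scores, confirmed)
for entry in queue:
    print(entry["claims"], entry["known_fraud_links"], entry["max_score"])
```

In this sample the three-claim group moves to the front despite its modest scores, because one of its members is already a confirmed case.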
To learn more about how you achieve your goals using Enterprise NoSQL, you can go here: