I am uncomfortable with the growing emphasis on big data and its stylist, visualization. Don't get me wrong: I love infographic representations of large data sets. The value of representing information concisely and effectively dates back to Florence Nightingale, who developed a new type of pie chart to show clearly that more soldiers were dying from preventable illnesses than from their wounds. On the other hand, I see beautiful exercises in special effects that show off statistical and technical skills but do not clearly serve an informing purpose. That's what makes me squirm.
Ultimately, data visualization is about communicating an idea that will drive action. Understanding the criteria for information to provide valuable insights and the reasoning behind constructing data visualizations will help you do that with efficiency and impact.
For information to provide valuable insights, it must be interpretable, relevant, and novel. With so much unstructured data today, it is critical that the data being analyzed generate interpretable information. Collecting lots of data without the associated metadata (what it is, where it was collected, when, how, and by whom) reduces the opportunity to play with, interpret, and gain insights from the data. The information must also be relevant to the people who are looking to gain insights, and to the purpose for which it is being examined. Finally, it must be original, or shed new light on an area. If the information fails any one of these criteria, then no visualization can make it valuable. That means that only a tiny slice of the data we can bring to life visually will actually be worth the effort.
Once we've narrowed the universe of data down to those that satisfy these three requirements, we must also understand the legitimate reasons to construct data visualizations, and recognize what factors affect the quality of data visualizations. There are three broad reasons for visualizing data: