Big Data: Predictions versus Insights?


This complimentary excerpt from the new Principles Express course Working with Secondary Data: Syndicated and Big Data, authored by Bill Bean, teaches you how to handle big data when it overwhelms even the most experienced market researchers, especially when deciding where to put your trust.

Big Data: Prediction versus Insight?

As consumers leave more and more digital trails as clues to their behavior, and as databases grow to hold all of that information, marketers and researchers inevitably turn to machines and algorithms for help in making sense of it all. In the digital world, marketing happens very quickly: digital advertising is sold, bought, targeted, delivered, and evaluated at speeds that were previously impossible. Data at this volume and speed is usually difficult for traditional statistical approaches to handle, so it is no wonder that we rely on machines to help sort things out.

Prediction and understanding are related but independent goals of market research, and it is quite possible to have one without the other. Even so, researchers still need to strike the right balance between the two. While accurate and timely prediction is often essential to leveraging opportunity and minimizing risk, understanding is just as often the foundation for innovation.

Whether or not they draw on Big Data, predictions also require that the future behave at least something like the past. In a famous and much-publicized case from 2012, Google touted its ability to predict the spread of flu in the United States in near real time using Google search trends information. This ability would certainly be very valuable to public health officials, pharmacy owners, and the manufacturers of cough and cold remedies. Yet in 2013 the forecasts fell apart completely, with predictions missing by as much as 140%. An extensive post-mortem analysis (Lazer et al., 2014) revealed two major errors that spelled the end of the project: 1) the relationship between the search terms and external events had changed from the previous year, that is, Google had changed the search algorithm so the past data did not carry through to the future; 2) the original algorithm was built on many meaningless random correlations that also did not carry over to the next year, a classic example of what is called overfitting a model.
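The overfitting problem is easy to reproduce with purely synthetic data. The sketch below is not Google's method or data. It is a minimal illustration that assumes a made-up "flu" series and 2,000 made-up "search term" series, all random noise with no real relationship to one another. The function name, the number of terms, and the other parameters are hypothetical choices made for the example. The code picks the terms that correlate best with flu in year one, then checks how those same terms fare in year two.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spurious_selection_demo(n_terms=2000, n_weeks=52, top_k=20, seed=0):
    """Select the 'best' predictors from pure noise and re-test them.

    All series are synthetic random noise (an assumption of this sketch),
    so any correlation found in year one is spurious by construction.
    Returns (mean year-one correlation, mean year-two correlation)
    for the top_k selected terms.
    """
    rng = random.Random(seed)
    flu = [rng.gauss(0, 1) for _ in range(2 * n_weeks)]
    terms = [[rng.gauss(0, 1) for _ in range(2 * n_weeks)]
             for _ in range(n_terms)]

    year1 = slice(0, n_weeks)
    year2 = slice(n_weeks, 2 * n_weeks)

    # "Train": rank every term by how well it tracks flu in year one.
    ranked = sorted(terms,
                    key=lambda t: pearson(t[year1], flu[year1]),
                    reverse=True)
    chosen = ranked[:top_k]

    in_sample = sum(pearson(t[year1], flu[year1]) for t in chosen) / top_k
    out_of_sample = sum(pearson(t[year2], flu[year2]) for t in chosen) / top_k
    return in_sample, out_of_sample


if __name__ == "__main__":
    in_sample, out_of_sample = spurious_selection_demo()
    print(f"mean correlation, year one (used for selection): {in_sample:.2f}")
    print(f"mean correlation, year two (never seen):         {out_of_sample:.2f}")
```

Because the terms were selected for their fit to year one, their average correlation there looks respectable, while in year two it falls back toward zero. Holding out data that the model never saw during selection is one of the long-established practices that exposes this kind of problem before a forecast is trusted.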

Google’s mistake with flu prediction is not a warning against the use of Big Data or an indictment of prediction. Big Data is very useful. It is instead an endorsement of sound technique in any form of research. Both of the failures in question could have been avoided by employing any number of sound model-building and research practices that were well known before the appearance of “Big Data”.

There are a couple of lessons in this cautionary example that apply to both big and small data:

  1. Quality of data and technique always matters. A lot of data might not be a great fit for your problem. Always look at any data with a critical eye for potential problems that will bias your answers, and apply any reasonable corrections to improve its power to help you make the right decision.
  2. Data may be objective but may still give you the wrong answer. Analysis of a more granular level of data often reveals patterns that are invisible at a higher level. Products with the same sales and share can have completely different patterns of purchase. Algorithms based on a biased history carry that bias into their forecasts. The underlying assumptions of a model can change, rendering it useless. Pay attention. Insights are more valuable than facts in the long run.
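The point about granularity in the second lesson can be shown with a small sketch. The purchase records below are hypothetical, invented only for illustration, and the `summarize` helper is likewise an assumption of this example and not part of any particular tool. Two products sell the same number of units, yet household-level data shows very different buying patterns.

```python
from collections import Counter


def summarize(purchases):
    """Summarize a list of (household_id, units) purchase records."""
    units_by_household = Counter()
    for household, units in purchases:
        units_by_household[household] += units

    total_units = sum(units_by_household.values())
    buyers = len(units_by_household)
    return {
        "total_units": total_units,
        "buyers": buyers,
        "units_per_buyer": total_units / buyers,
    }


# Hypothetical data: both products sell 100 units in the period.
# Product A: 100 households buy one unit each (broad, light buying).
product_a = [(f"hh{i}", 1) for i in range(100)]
# Product B: 10 households buy ten units each (narrow, heavy buying).
product_b = [(f"hh{i}", 10) for i in range(10)]

if __name__ == "__main__":
    print("Product A:", summarize(product_a))
    print("Product B:", summarize(product_b))
```

At the aggregate level the two products are indistinguishable, since both show 100 units sold. Only the household-level view shows that one depends on many light buyers and the other on a few heavy ones, which would call for quite different marketing decisions.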

The information described above is just some of what is taught in our affordable online Principles Express course, Working with Secondary Data: Syndicated and Big Data. Interested in learning more? Principles Express courses are totally self-paced, online and easy to complete in 9 to 14 hours. For only $359, you’ll learn about the value and challenges you might encounter when linking primary and secondary data, and receive a certificate from the University of Georgia and MRII for completing the course.  Click here or call +1 706 542 3537 to learn more! 

We are grateful to Kantar Health for sponsoring this course. Kantar Health is a leading consulting and research provider that helps communicate the value and potential of products and services that fall under healthcare regulation and legislation. Such sponsorships have funded the development of our new line of Principles Express courses, a portfolio of $359 online courses that let you master a research skill at your own pace, with just 9 to 14 hours of study.
