5 Ways to Deal with Missing Data

UGA and the MRII are proud to offer a new online course, Introduction to Data Analysis, authored by Ray Poynter. This 12-hour, $359, at-your-own-pace online course will introduce you to the critical concepts common to the analysis of quantitative research data, with special attention to survey data analysis. The following is an excerpt from the course that shows how missing data should be dealt with.

The decision about how missing responses should be dealt with is something that the person conducting the data analysis needs to be involved in. In some cases, the researcher will make different decisions about the treatment of missing responses for different analyses conducted as part of the same data analysis plan.

In many cases, missing data can be left as missing. However, the researcher then has to accept either that the totals may not add up to 100%, or that the totals are based on ‘valid answers’ only. But if variables with missing data are going to be used in multivariate analysis, a decision needs to be made about how to treat them.
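To make the point about totals concrete, here is a minimal Python sketch. The answers, the `percentages` function, and the use of `None` to mark a missing response are all made up for illustration. It shows the same question reported on the total base (where the percentages fall short of 100%) and on the valid base (where they sum to 100%).

```python
from collections import Counter

# Hypothetical answers to a single-choice question; None marks a missing response.
answers = ["Yes", "No", "Yes", None, "Yes", None, "No", "Yes"]

def percentages(responses, base="total"):
    """Return the percentage for each answer, on the total or the valid base."""
    valid = [r for r in responses if r is not None]
    denominator = len(responses) if base == "total" else len(valid)
    counts = Counter(valid)
    return {answer: 100 * n / denominator for answer, n in counts.items()}

# Total base: missing responses count in the denominator, so the sum is below 100%.
total_base = percentages(answers, base="total")

# Valid base: only participants who answered count, so the sum is 100%.
valid_base = percentages(answers, base="valid")

print(total_base)
print(valid_base)
```

Whichever base is reported, the table should state it, so that readers know what the percentages refer to.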

The researcher needs to understand and remember the five techniques below:

  1. Substitute a neutral value – this is common, but can result in problems caused by the data being changed
  2. Substitute an imputed response – this is less common and can be complex; look for patterns in the data to assess what the person might have said; in conjoint analysis, use Hierarchical Bayes (often called just HB)
  3. Listwise deletion – in listwise deletion, the software looks at each participant in turn and, if they have a missing value on any of the key variables, they are excluded
  4. Pairwise deletion – in pairwise deletion, the system excludes only those participants who have a missing value for one of the variables being used in the current analysis; this process means that different analyses, or different variable selections, can change the number of participants included
  5. Use techniques that permit missing data – the final option is to leave the data as missing and use techniques that will cope with the missing elements
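Two of the techniques above, substituting a neutral value (1) and listwise deletion (3), can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a recommended implementation: the respondent records, the question names, and the choice of 3 as the neutral midpoint of a 1-to-5 scale are all hypothetical.

```python
# Hypothetical 5-point scale ratings; None marks a missing response.
respondents = [
    {"id": 1, "q1": 4, "q2": 5, "q3": 3},
    {"id": 2, "q1": None, "q2": 2, "q3": 4},
    {"id": 3, "q1": 5, "q2": None, "q3": None},
    {"id": 4, "q1": 2, "q2": 3, "q3": 1},
]
QUESTIONS = ["q1", "q2", "q3"]
NEUTRAL = 3  # assumed midpoint of a 1-5 scale

def substitute_neutral(rows, questions, neutral):
    """Technique 1: replace each missing answer with a neutral value."""
    return [
        {**row, **{q: neutral if row[q] is None else row[q] for q in questions}}
        for row in rows
    ]

def listwise_delete(rows, questions):
    """Technique 3: drop any participant with a missing answer to a key question."""
    return [row for row in rows if all(row[q] is not None for q in questions)]

def mean(values):
    return sum(values) / len(values)

filled = substitute_neutral(respondents, QUESTIONS, NEUTRAL)
complete = listwise_delete(respondents, QUESTIONS)

# Substitution changes the data: the q1 mean moves toward the neutral value.
q1_mean_observed = mean([r["q1"] for r in respondents if r["q1"] is not None])
q1_mean_filled = mean([r["q1"] for r in filled])

print(q1_mean_observed, q1_mean_filled)
print([r["id"] for r in complete])
```

In this made-up data, substitution keeps all four participants but pulls the q1 mean toward the midpoint, while listwise deletion leaves the observed answers untouched but keeps only two of the four participants.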

In commercial market research, pairwise deletion is usually preferred to listwise deletion because more of the data is used. Listwise can result in many participants’ data not being included. In many cases, the software may default to listwise deletion.
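The difference in how much data each approach keeps can be seen in a small sketch. The data and the helper `cases_with_answers` are hypothetical. Listwise deletion is modeled here as requiring an answer to every question, and pairwise deletion as requiring answers only to the variables in the current analysis.

```python
from itertools import combinations

# Hypothetical ratings; None marks a missing response.
respondents = [
    {"id": 1, "q1": 4, "q2": 5, "q3": 3},
    {"id": 2, "q1": None, "q2": 2, "q3": 4},
    {"id": 3, "q1": 5, "q2": 1, "q3": None},
    {"id": 4, "q1": 2, "q2": 3, "q3": 1},
    {"id": 5, "q1": 3, "q2": None, "q3": 2},
]
QUESTIONS = ["q1", "q2", "q3"]

def cases_with_answers(rows, variables):
    """Keep participants who answered every variable in the given list."""
    return [row for row in rows if all(row[v] is not None for v in variables)]

# Listwise: one fixed set of participants, complete on all questions.
listwise_n = len(cases_with_answers(respondents, QUESTIONS))

# Pairwise: the set of participants is rebuilt for each analysis,
# here for each pair of questions.
pairwise_n = {
    pair: len(cases_with_answers(respondents, pair))
    for pair in combinations(QUESTIONS, 2)
}

print("listwise n:", listwise_n)
for pair, n in pairwise_n.items():
    print("pairwise n for", pair, ":", n)
```

With this data, listwise deletion keeps two participants for every analysis, while each pairwise analysis keeps three, although not the same three each time.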

Online, mobile, CAPI, and CATI surveys can be configured to reduce or eliminate missing data by forcing responses to all questions. However, there are ethical and methodological concerns about forcing participants to enter an answer to every question, and for many statistical purposes, a “Don’t know”, “None of these”, or “Not applicable” answer has the same impact as a missing response.

For many statistical analyses, “Don’t Know” responses will need to be re-coded as missing data and then treated in one of the ways described above. The key issue is the difference between a code and a numerical value. In many data collection and analysis packages, the default numerical value of a scale point is the same as its code, but when conducting statistical analyses, the numerical values for answers such as “Don’t Know” need to be excluded.
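The distinction between a code and a numerical value can be illustrated with a short sketch. It assumes a 1-to-5 scale on which “Don’t Know” has been stored under the code 99. That code, the data, and the function names are hypothetical.

```python
DONT_KNOW_CODE = 99  # hypothetical code used for "Don't Know" in the data file

# Hypothetical stored codes for a 1-5 rating question.
codes = [4, 5, 99, 3, 99, 2]

def recode_as_missing(values, missing_codes):
    """Replace codes that carry no numerical meaning with None (missing)."""
    return [None if v in missing_codes else v for v in values]

def mean_of_valid(values):
    """Average only the answers that are not missing."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid)

# Treating the code 99 as if it were a rating gives a meaningless mean.
naive_mean = sum(codes) / len(codes)

# Recoding first restricts the mean to genuine scale values.
recoded = recode_as_missing(codes, {DONT_KNOW_CODE})
valid_mean = mean_of_valid(recoded)

print(naive_mean, valid_mean)
```

Here the naive mean lies far outside the 1-to-5 scale, while the mean of the recoded data is 3.5 and is based on the four participants who gave a rating.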

The information described above on finding and dealing with missing data is just some of what is taught in our affordable online Principles Express course, Introduction to Data Analysis, available for $359. This course will help analysts, buyers of research services, and those designing research.  

Register online by clicking here or call +1 706 542 3537.

Ray Poynter, Fellow of the Market Research Society, is the founder of NewMR.org, editor of the ESOMAR book Answers to Contemporary Market Research Questions, and the Managing Director of The Future Place, a UK-based consultancy specializing in training.

We are grateful that the course is sponsored by b3Intelligence, “a leading edge, advanced data analytics, integrated research, and intelligence company fulfilling an integral role in their clients’ growth by maximizing investment with innovative products and services.” Such sponsorships have funded the development of our new line of Principles Express courses, a portfolio of $359 online courses that let you master a research skill at your own pace, with just 9 to 14 hours of study.
