Statistical Primer: how to deal with missing data in scientific research?

Grigorios Papageorgiou, Stuart W. Grant, Johanna J.M. Takkenberg, Mostafa Mokhles

February 2018


Abstract

Missing data are a common challenge in research and can compromise the results of statistical inference when not handled appropriately. This paper aims to introduce the basic concepts of missing data to a non-statistical audience, to list and compare some of the most popular approaches for handling missing data in practice, and to provide guidelines and recommendations for dealing with and reporting missing data in scientific research. Complete case analysis and single imputation are simple approaches for handling missing data and are popular in practice; however, in most cases they are not guaranteed to provide valid inferences. Multiple imputation is a robust and general alternative that is appropriate for data missing at random and overcomes the disadvantages of the simpler approaches, but it should always be conducted with care. These approaches are illustrated and compared in an example application using Cox regression.
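
As a minimal illustration of one reason single imputation can give misleading inferences, the sketch below compares complete case analysis with mean imputation on simulated data. The data, the 30% missingness rate and all variable names are assumptions made for this example and are not taken from the paper. The sketch does not reproduce the paper's multiple imputation or Cox regression analysis.

```python
import random
import statistics

random.seed(0)

# Hypothetical complete data: 1000 draws from a normal distribution.
full = [random.gauss(50.0, 10.0) for _ in range(1000)]

# Delete roughly 30% of the values completely at random.
# None marks a missing value.
observed = [x if random.random() > 0.3 else None for x in full]

# Complete case analysis: keep only the observed values.
complete_cases = [x for x in observed if x is not None]

# Single (mean) imputation: fill every gap with the observed mean.
mean_obs = statistics.fmean(complete_cases)
mean_imputed = [mean_obs if x is None else x for x in observed]

# Sample standard deviations under the two approaches.
sd_complete = statistics.stdev(complete_cases)
sd_imputed = statistics.stdev(mean_imputed)

print(f"complete cases: n={len(complete_cases)}, sd={sd_complete:.2f}")
print(f"mean imputed:   n={len(mean_imputed)}, sd={sd_imputed:.2f}")
```

Mean imputation leaves the estimated mean unchanged but adds values with zero deviation from it, so the estimated standard deviation shrinks and any standard errors derived from it look more precise than the data justify. Multiple imputation addresses this by drawing several plausible values for each gap and combining the resulting estimates.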

Type

Journal article

Publication

In Interactive CardioVascular and Thoracic Surgery (ICVTS)