Clinical Epidemiology, Volume 9

Missing data and multiple imputation in clinical epidemiological research

Authors Pedersen AB, Mikkelsen EM, Cronin-Fenton D, Kristensen NR, Pham TM, Pedersen L, Petersen I

Received 8 December 2016

Accepted for publication 12 February 2017

Published 15 March 2017 Volume 2017:9 Pages 157–166

DOI https://doi.org/10.2147/CLEP.S129785

Checked for plagiarism Yes

Review by Single-blind

Peer reviewers approved by Dr Colin Mak

Peer reviewer comments 3

Editor who approved publication: Dr M. Alan Brookhart


Alma B Pedersen,1 Ellen M Mikkelsen,1 Deirdre Cronin-Fenton,1 Nickolaj R Kristensen,1 Tra My Pham,2 Lars Pedersen,1 Irene Petersen1,2

1Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus N, Denmark; 2Department of Primary Care and Population Health, University College London, London, UK

Abstract: Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.

Keywords: missing data, observational study, multiple imputation, MAR, MCAR, MNAR
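The abstract's point that multiple imputation accounts for the uncertainty associated with missing data can be illustrated with the pooling step commonly known as Rubin's rules. The abstract itself does not spell out this step, so the following is only a minimal sketch under stated assumptions: the function name `pool_rubin` and the five estimates (imagined as log odds ratios from five imputed datasets) are hypothetical and are not taken from the article.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Combine results from m imputed datasets using Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need at least two estimates and one variance per estimate")

    pooled = q.mean()           # pooled point estimate
    within = u.mean()           # average within-imputation variance
    between = q.var(ddof=1)     # between-imputation variance
    # The between-imputation term carries the extra uncertainty due to missing data.
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical values for illustration only (not from the article).
est = [0.50, 0.55, 0.45, 0.60, 0.40]
var = [0.04, 0.04, 0.04, 0.04, 0.04]
pooled, total = pool_rubin(est, var)
print(pooled, total, np.sqrt(total))
```

In this sketch the total variance exceeds the average within-imputation variance whenever the estimates differ across imputed datasets. A single value imputation has no between-imputation term and so would tend to understate the standard error.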

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
