Open Access Medical Statistics, Volume 2

Average effect estimation with dichotomized events when the missing data mechanism is not missing at random

Authors Kwon A, Ren D

Received 17 October 2012

Accepted for publication 16 November 2012

Published 18 December 2012 Volume 2012:2 Pages 85–92


Checked for plagiarism Yes

Review by Single-blind

Amy M Kwon,¹ Dianxu Ren²

¹Biostatistics and Bioinformatics Core, James Graham Brown Cancer Center, University of Louisville, Louisville, KY, ²Department of Biostatistics, University of Pittsburgh Center for Research and Evaluation, School of Nursing, University of Pittsburgh, Pittsburgh, PA, USA

Background: The purpose of this work was to estimate the average effect of a covariate of interest when the outcome variable is dichotomized from a continuous variable and the data are incomplete, with a missing data mechanism that is not missing at random (NMAR). The motivating example is estimating the effect of vitamin D levels on secondary hyperparathyroidism among patients with chronic kidney disease.
Methods: The average effect of the covariate of interest is computed by a two-step procedure. In the first step, we identify the conditional distribution of the original variable given the covariates by obtaining the parameter estimates. In the second step, we draw the predictive values from the identified distribution, and create binary values from the predictive values by dichotomizing them at the threshold.
Results: In the simulations, the bias of the effect estimated by logistic regression on the converted binary variable, relative to the effect from logistic regression on the complete data, was negligible. In the application example, the effect of vitamin D on the occurrence of secondary hyperparathyroidism was highly significant in the complete case analysis, but only a modest effect of vitamin D on secondary hyperparathyroidism was observed under the NMAR assumption.
Conclusion: When the missing data mechanism is NMAR, it is impossible to find consistent estimates without knowing the exact nature of the missing data. In addition, because the outcome variable is binary, we may face an unidentifiability problem under NMAR. To avoid this problem, we estimated the average effect of the covariate of interest in the framework of a generalized linear model, using the relationship between the dichotomized outcome and the original continuous outcome. The estimated effect showed negligible bias in the simulation.
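The two-step procedure described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the NMAR model used in the first step, so the sketch substitutes an ordinary normal linear model fitted to the observed cases, which on its own does not correct for NMAR missingness. The function names (`fit_logistic`, `two_step_effect`), the choice to draw predictive values for every subject, and the averaging over repeated draws are all assumptions made for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def two_step_effect(x, y_cont, threshold, n_draws=20, seed=0):
    """Hypothetical sketch: average log-odds effect of x on the event y_cont > threshold.

    y_cont may contain NaN for missing outcomes.
    """
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    obs = ~np.isnan(y_cont)

    # Step 1 (assumption): identify y | x as a normal linear model from the
    # observed cases. An NMAR-adjusted estimator would be substituted here.
    coef, *_ = np.linalg.lstsq(X[obs], y_cont[obs], rcond=None)
    resid = y_cont[obs] - X[obs] @ coef
    sigma = np.sqrt(resid @ resid / (obs.sum() - X.shape[1]))

    # Step 2: draw predictive values from the identified distribution,
    # dichotomize them at the threshold, and fit a logistic regression.
    slopes = []
    for _ in range(n_draws):
        y_pred = X @ coef + rng.normal(0.0, sigma, size=len(x))
        y_bin = (y_pred > threshold).astype(float)
        slopes.append(fit_logistic(X, y_bin)[1])

    # Assumption: summarize by averaging the slope over the repeated draws.
    return float(np.mean(slopes))


if __name__ == "__main__":
    # Simulated data for illustration only; missingness here is completely at random.
    rng = np.random.default_rng(1)
    x = rng.normal(size=500)
    y = 1.0 + 0.8 * x + rng.normal(size=500)
    y[rng.random(500) < 0.2] = np.nan
    print(two_step_effect(x, y, threshold=1.0))
```

Drawing from the fitted continuous distribution and dichotomizing afterwards keeps the estimation on the continuous scale, which is the device the Conclusion describes for sidestepping the identifiability problem of a binary outcome under NMAR.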

Keywords: average effect, NMAR, not missing at random, dichotomized events, secondary hyperparathyroidism

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
