Clinical Epidemiology, Volume 7
Evaluating the evaluation
Authors Berger V
Received 20 December 2014
Accepted for publication 22 December 2014
Published 22 January 2015 Volume 2015:7 Pages 117–118
DOI https://doi.org/10.2147/CLEP.S79643
Checked for plagiarism Yes
Editor who approved publication: Professor Henrik Sørensen
Vance W Berger
National Cancer Institute, University of Maryland Baltimore County, Biometry Research Group, Rockville, MD, USA
Dear editor
Zhang et al1 sought to determine which adjustment method is the best. That is a laudable objective, but their approach leaves quite a bit to be desired. When we cut to the chase, we find that they presupposed that the analysis of covariance (ANCOVA) was ideal, and, presumably, confirmed this empirically by noting that the ANCOVA results were most aligned with the ANCOVA gold standard. This is fairly perplexing logic. Had any of the other methods been chosen instead as the gold standard, then that method would have been found to be the best by virtue of agreeing with its own results. This is hardly a compelling endorsement. Beyond that, even if the authors had used a more reasoned approach, how can one trial be used to validate an analysis?
An analysis is good or bad based on how well its results align with the underlying reality of the situation. In a simulation study, we would know this reality. In actual trials, we do not. There is no gold standard. Moreover, there is only one trial being considered. This is most assuredly not the way to compare analysis techniques. It is worth noting, however, that ANCOVA relies on normality, among other assumptions, for its validity. Since the data are never actually normally distributed, the method is never technically valid.2 This indisputable fact should give us pause before we blindly accept so fanciful a method. There is a valid and exact method that is based on a ranking of the pairs, pre and post, without having to make any assumptions at all.3 Surely, this method, which was developed for categorical data but applies equally well to continuous data, might have been considered as well.

Finally, it is stated that “no methods are available for analysis of data that are ‘missing not at random’.” This is patently untrue4 and almost reaches the level of lunacy attained two sentences later when it is stated that because there was not much missing data, the data were assumed to be missing at random. Missing data can never be demonstrated to be missing at random. This is an entirely academic construct with no application in the real world.5
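The distinction drawn above, that in a simulation the true treatment effect is known and so adjustment methods can be judged against the truth rather than against one another, can be illustrated with a minimal sketch in Python. The data-generating model, effect size, and sample sizes below are illustrative assumptions, not values taken from the trial under discussion:

```python
# Illustrative simulation: with a known true effect, adjustment methods can
# be compared against reality. All parameters here are assumed for the sketch.
import numpy as np

rng = np.random.default_rng(0)
TRUE_EFFECT = 2.0   # assumed true treatment effect, known only in simulation
N = 200             # patients per simulated trial
N_SIMS = 500        # number of simulated trials

def one_trial():
    """Simulate one randomized trial with a baseline covariate."""
    baseline = rng.normal(50, 10, N)
    arm = rng.integers(0, 2, N)          # 0 = control, 1 = treatment
    post = 0.6 * baseline + TRUE_EFFECT * arm + rng.normal(0, 5, N)
    return baseline, arm, post

def ancova(baseline, arm, post):
    """ANCOVA via least squares: coefficient on the treatment indicator."""
    X = np.column_stack([np.ones(N), arm, baseline])
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    return beta[1]

def change_score(baseline, arm, post):
    """Unadjusted comparison of mean change from baseline between arms."""
    d = post - baseline
    return d[arm == 1].mean() - d[arm == 0].mean()

est_ancova = np.array([ancova(*one_trial()) for _ in range(N_SIMS)])
est_change = np.array([change_score(*one_trial()) for _ in range(N_SIMS)])

# Both estimators can be checked against TRUE_EFFECT, which no single real
# trial permits; their spread around the truth can also be compared.
print("ANCOVA mean estimate:", round(est_ancova.mean(), 2))
print("Change-score mean estimate:", round(est_change.mean(), 2))
print("ANCOVA variance:", round(est_ancova.var(), 3))
print("Change-score variance:", round(est_change.var(), 3))
```

Under this particular data-generating model both estimators center on the true effect, and the comparison of their variances is meaningful precisely because the truth is known; no analogous check is available when a single real trial is analyzed several ways.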
Disclosure
The author reports no conflict of interest in this work.
References
Zhang S, Paul J, Nantha-Aree M, et al. Empirical comparison of four baseline covariate adjustment methods in analysis of continuous outcomes in randomized controlled trials. Clin Epidemiol. 2014;6:227–235.
Berger VW. Pros and cons of permutation tests in clinical trials. Stat Med. 2000;19:1319–1328.
Berger VW, Zhou YY, Ivanova A, Tremmel L. Adjusting for ordinal covariates by inducing a partial ordering. Biomet J. 2004;46(1):48–55.
Lachin JM. Worst-rank score analysis with informatively missing observations in clinical trials. Control Clin Trials. 1999;20(5):408–422.
Berger VW. Conservative handling of missing data. Contemp Clin Trials. 2012;33:460.

Authors’ reply

Shiyuan Zhang, Lehana Thabane

Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada

Correspondence: Lehana Thabane, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada, Email [email protected]

Dear editor

We thank Dr Vance Berger for his interest in and insightful discussion of our paper.1 Essentially, we agree with his sentiments; however, we need to clarify the central points of his discussion, namely: 1) the objective of the study; and 2) the handling of missing data in our statistical analyses.

The objective of the study: we designed and conducted the study to assess the sensitivity or robustness2 of the findings from the original trial (MOBILE trial)3 by varying a specific factor in the analysis, in this case the method of analysis. We judged robustness based on the magnitude, direction, and statistical significance of the effect estimate. The stated objectives of the study are clear on this goal. We did not intend to compare the methods on the basis of their statistical properties; we agree with the author that such a comparison can be done only through simulation. We also discuss this issue in the Discussion section, supplemented with findings from published simulation studies.

The “missing at random (MAR)” assumption in multiple imputation: again, we agree with the author that the assumption of MAR cannot be verified. Given this potential limitation, a commonly used approach to handling missing data is to assess the impact of “missingness” on the findings through sensitivity analyses. The goal is to check whether, under certain assumptions about the missingness, the findings would remain robust if the missing data were imputed through some imputation strategy. We did this in our study, and we found that the results remained robust irrespective of the method of handling missing data.
We also discuss the limitations of the imputation methods used to assess robustness in the Discussion section. We make no claims about the validity of the MAR assumption. We hope that this response provides some clarity on the objectives and context of our paper.

Disclosure

The authors report no conflicts of interest in this communication.

References
© 2015 The Author(s). This work is published and licensed by Dove Medical Press Limited under the Creative Commons Attribution – Non Commercial (unported, v3.0) License.