Imbalance p values for baseline covariates in randomized controlled trials: a last resort for the use of p values? A pro and contra debate
Authors Stang A, Baethge C
Received 4 January 2018
Accepted for publication 2 March 2018
Published 8 May 2018 Volume 2018:10 Pages 531–535
Checked for plagiarism Yes
Review by Single-blind
Peer reviewers approved by Ms Justinn Cochran
Peer reviewer comments 3
Editor who approved publication: Professor Irene Petersen
Andreas Stang,1,2 Christopher Baethge3,4
1Center of Clinical Epidemiology, Institute of Medical Informatics, Biometry and Epidemiology, Medical Faculty, University Hospital of Essen, Hufelandstr, Essen, Germany; 2Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA; 3Department of Psychiatry and Psychotherapy, University of Cologne Medical School, Cologne, Germany; 4Editorial Offices, Deutsches Ärzteblatt and Deutsches Ärzteblatt International, Deutscher Ärzte-Verlag, Cologne, Germany
Background: Results of randomized controlled trials (RCTs) are usually accompanied by a table that compares covariates between the study groups at baseline. Sometimes, the investigators report p values for imbalanced covariates. The aim of this debate is to illustrate the arguments for and against the use of these p values in RCTs.
Pro: Low p values can indicate biased or fraudulent randomization and can therefore serve as a warning sign. They can be considered a screening tool with low positive-predictive value. Low p values should prompt us to ask for the reasons and for potential consequences, especially in combination with hints of methodological problems.
Contra: A fair randomization produces the expectation that the p values follow a flat (uniform) distribution. It does not produce an expectation related to a single p value. The distribution of p values in RCTs can be influenced by the correlation among covariates and by differential misclassification or differential mismeasurement of baseline covariates. Given only a small number of reported p values in the reports of RCTs, judging whether the realized p value distribution is, indeed, a flat distribution becomes difficult. If p values ≤0.005 or ≥0.995 were used as a sign of alarm, the false-positive rate would be 5.0%, provided that randomization was done correctly and five p values per RCT were reported.
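The false-positive arithmetic in the Contra argument can be checked with a short sketch. It rests on assumptions that are not in the abstract: the five p values are treated as independent and exactly uniform, and the simulated covariate is standard normal and compared with a two-sided z test of known variance. The function names `alarm_rate` and `simulate_baseline_p_values` are hypothetical. Under independence the exact rate is about 4.9%, close to the stated 5.0%, which matches the simple bound of 5 × 0.01.

```python
import random
from statistics import NormalDist


def alarm_rate(n_p_values, lower=0.005, upper=0.995):
    """Chance that at least one of n independent uniform p values is extreme."""
    # Probability that a single uniform p value falls in either tail.
    per_p_value = lower + (1.0 - upper)
    return 1.0 - (1.0 - per_p_value) ** n_p_values


def simulate_baseline_p_values(n_trials, n_per_arm, seed=0):
    """Baseline p values for one covariate under fair randomization (assumed setup)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_trials):
        # Assumption: covariate is standard normal in the whole study population.
        covariate = [rng.gauss(0.0, 1.0) for _ in range(2 * n_per_arm)]
        rng.shuffle(covariate)  # fair random allocation to two equal arms
        arm_a = covariate[:n_per_arm]
        arm_b = covariate[n_per_arm:]
        diff = sum(arm_a) / n_per_arm - sum(arm_b) / n_per_arm
        # Two-sided z test with known variance 1 in each arm.
        z = diff / (2.0 / n_per_arm) ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


if __name__ == "__main__":
    print(round(alarm_rate(5), 4))  # 0.049
    p = simulate_baseline_p_values(n_trials=4000, n_per_arm=50)
    # Under a flat distribution, roughly 5% of p values should be <= 0.05.
    print(sum(v <= 0.05 for v in p) / len(p))
```

With correlated covariates, which the Contra argument names as one influence on the p value distribution, the independence assumption fails and the rate computed by `alarm_rate` no longer applies exactly.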
Conclusion: Use of a low p value as a warning sign that randomization is potentially biased can be considered a vague heuristic. The authors of this debate differ in their enthusiasm for this heuristic and in the consequences they propose.
Keywords: randomized controlled trial, distribution, statistical, random allocation
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.