Therapeutics and Clinical Risk Management, Volume 3, Issue 3

Creating diagnostic scores using data-adaptive regression: An application to prediction of 30-day mortality among stroke victims in a rural hospital in India

Authors Merrill D Birkner, SP Kalantri, Vaishali Solao, Priya Badam, Rajnish Joshi, Ashish Goel, Madhukar Pai, Alan E Hubbard

Published 15 July 2007 Volume 2007:3(3) Pages 475–484

Merrill D Birkner1, SP Kalantri2, Vaishali Solao2, Priya Badam2, Rajnish Joshi2,3, Ashish Goel4, Madhukar Pai5, Alan E Hubbard1

1University of California, Berkeley, School of Public Health, Division of Biostatistics, University Hall, Berkeley, CA, USA; 2Department of Medicine, Mahatma Gandhi Institute of Medical Sciences, Sevagram, Maharashtra, India; 3University of California, Berkeley, School of Public Health, Division of Epidemiology, Warren Hall, Berkeley, CA, USA; 4Department of Medicine, All India Institute of Medical Sciences, New Delhi, India; 5Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada

Abstract: Developing diagnostic scores for prediction of clinical outcomes draws on medical knowledge of which variables are most important and on empirical/statistical learning to find the functional form of these covariates that provides the most accurate prediction (eg, highest specificity and sensitivity). Given the variables chosen by the clinician as most relevant, or available given limited resources, the job is a purely statistical one: which model, among competitors, provides the most accurate prediction of clinical outcomes, where accuracy is defined relative to some loss function? An optimal algorithm for choosing a model (1) provides a flexible sequence of models that can “twist and bend” to fit the data and (2) uses a validation procedure that optimally balances bias and variance by choosing models of the right size (complexity). We propose a solution to creating diagnostic scores that, given the available variables, appropriately trades off model complexity against variability of estimation; the algorithm combines machine learning, logistic regression (POLYCLASS), and cross-validation. As an example, we apply the procedure to data collected from stroke victims in a rural clinic in India, where the outcome of interest is death within 30 days. A quick and accurate diagnosis of stroke is important for immediate resuscitation. Equally important is giving patients and their families an indication of the prognosis. Accurate predictions of clinical outcomes made soon after the onset of stroke can also help guide decisions about appropriate supportive treatment. Severity scores have been created in developed nations (for instance, Guy’s Prognostic Score, the Canadian Neurological Score, and the National Institutes of Health Stroke Scale). However, we propose a method for developing scores appropriate to local settings in possibly very different medical circumstances.
Specifically, we used a freely available and easy-to-use exploratory regression technique (POLYCLASS) to predict 30-day mortality following stroke in a rural Indian population and compared its accuracy with that of these existing stroke scales. POLYCLASS yielded more accurate prediction than the existing scores (sensitivity and specificity of 90% and 76%, respectively). This method can easily be extended to different clinical settings and to different disease outcomes. In addition, the software and algorithms used are open-source (free), and we provide the code in the appendix.
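
The two ingredients named in the abstract, a sequence of models of increasing flexibility and a cross-validation step that picks the right size, can be illustrated with a minimal Python sketch. This is not POLYCLASS and not the authors' appendix code: as a stand-in, it assumes a piecewise-constant risk model on a single covariate whose complexity is the number of bins, log-loss as the loss function, and purely synthetic data (not the stroke data). All function names here are hypothetical.

```python
import math
import random

def bin_index(x, n_bins, lo=0.0, hi=1.0):
    """Map x to one of n_bins equal-width bins on [lo, hi], clamping at the ends."""
    b = int((x - lo) / (hi - lo) * n_bins)
    return min(max(b, 0), n_bins - 1)

def fit_binned_risk(xs, ys, n_bins):
    """Piecewise-constant risk model: smoothed event rate in each bin."""
    events = [0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = bin_index(x, n_bins)
        events[b] += y
        counts[b] += 1
    # Add-one smoothing keeps every estimate strictly inside (0, 1).
    return [(e + 1) / (c + 2) for e, c in zip(events, counts)]

def predict(model, x):
    """Predicted risk for covariate value x."""
    return model[bin_index(x, len(model))]

def cv_log_loss(xs, ys, n_bins, k=5):
    """Mean held-out log-loss over k folds (fold = observation index mod k)."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit_binned_risk([xs[i] for i in train],
                                [ys[i] for i in train], n_bins)
        for i in range(fold, n, k):  # held-out observations of this fold
            p = predict(model, xs[i])
            total -= math.log(p) if ys[i] == 1 else math.log(1 - p)
    return total / n

def select_complexity(xs, ys, candidates, k=5):
    """Return the candidate size with the smallest cross-validated loss."""
    losses = {m: cv_log_loss(xs, ys, m, k) for m in candidates}
    return min(losses, key=losses.get), losses

def sensitivity_specificity(ps, ys, threshold=0.5):
    """Sensitivity and specificity of the rule 'predict death if p >= threshold'."""
    tp = sum(1 for p, y in zip(ps, ys) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(ps, ys) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(ps, ys) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(ps, ys) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic data (an assumption for illustration): one covariate in [0, 1]
# with a smooth logistic relationship to the binary outcome.
random.seed(0)
xs = [random.random() for _ in range(400)]
ys = [1 if random.random() < 1 / (1 + math.exp(-6 * (x - 0.5))) else 0
      for x in xs]

candidates = [1, 2, 4, 8, 16, 32, 64]  # model sizes, from rigid to very flexible
best, losses = select_complexity(xs, ys, candidates)
model = fit_binned_risk(xs, ys, best)

# In-sample figures are optimistic; they only show how the summary is computed.
sens, spec = sensitivity_specificity([predict(model, x) for x in xs], ys)
print("selected number of bins:", best)
print("sensitivity: %.2f, specificity: %.2f" % (sens, spec))
```

Very small models cannot follow the change in risk across the covariate (bias), while very large ones estimate each bin from only a handful of observations (variance); the cross-validated loss is what arbitrates between the two. An honest accuracy estimate for the selected model would itself need held-out data rather than the in-sample summary printed here.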

Keywords: prediction, mortality, stroke, prognostic model, accuracy
