Risk Management and Healthcare Policy, Volume 12

Machine Learning For Tuning, Selection, And Ensemble Of Multiple Risk Scores For Predicting Type 2 Diabetes

Authors Liu Y, Ye S, Xiao X, Sun C, Wang G, Wang G, Zhang B

Received 2 August 2019

Accepted for publication 8 October 2019

Published 5 November 2019 Volume 2019:12 Pages 189–198

DOI https://doi.org/10.2147/RMHP.S225762

Checked for plagiarism Yes

Review by Single-blind

Peer reviewers approved by Dr Shashank Kaushik (PT)

Peer reviewer comments 2

Editor who approved publication: Professor Marco Carotenuto


Yujia Liu,1 Shangyuan Ye,2 Xianchao Xiao,1 Chenglin Sun,1 Gang Wang,1 Guixia Wang,1 Bo Zhang3

1Department of Endocrinology and Metabolism, The First Hospital of Jilin University, Changchun, Jilin 130021, People’s Republic of China; 2Department of Population Medicine, Harvard Pilgrim Health Care and Harvard Medical School, Boston, MA, USA; 3Department of Neurology and ICCTR Biostatistics and Research Design Center, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA

Correspondence: Bo Zhang
Department of Neurology and ICCTR Biostatistics and Research Design Center, Boston Children’s Hospital and Harvard Medical School, 21 Autumn Street, Boston, MA 02115, USA
Email bo.zhang@childrens.harvard.edu
 
Guixia Wang
Department of Endocrinology and Metabolism, The First Hospital of Jilin University, 71 Xinmin Street, Changchun, Jilin 130021, People’s Republic of China
Email gwang168@jlu.edu.cn

Background: This study proposes the use of machine learning algorithms to improve the accuracy of type 2 diabetes predictions using non-invasive risk score systems.
Methods: We evaluated and compared the prediction accuracies of existing non-invasive risk score systems using data from the REACTION study (Risk Evaluation of Cancers in Chinese Diabetic Individuals: A Longitudinal Study). Two simple risk scores were established on the basis of logistic regression. Machine learning techniques (ensemble methods) were used to improve prediction accuracy by combining the individual score systems.
Results: In general, existing score systems derived from Western populations performed worse than those derived from Eastern populations. The two newly established score systems performed better than most existing score systems but slightly worse than the Chinese score system. Ensemble methods combined with model selection algorithms yielded better prediction accuracy than any of the simple score systems.
Conclusion: Our proposed machine learning methods can be used to improve the accuracy of screening for undiagnosed type 2 diabetes and identifying high-risk patients.

Keywords: type 2 diabetes, risk score, machine learning, voting, stacking, prediction
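
To make the idea of combining several risk scores concrete, the following is a minimal sketch of the two ensemble strategies named in the keywords, voting and stacking. It is not the authors' implementation: the data are synthetic (not REACTION data), the three scores are hypothetical stand-ins for published risk score systems, the model selection step mentioned in the abstract is omitted, and all function names are illustrative assumptions. Voting is approximated here by averaging rescaled scores, and stacking by a logistic regression meta-learner fitted on a training half and evaluated on a held-out half.

```python
import math
import random

def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def min_max(column):
    """Rescale one score column to [0, 1] so scores are comparable."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]

def soft_vote(rows):
    """Voting sketch: average the rescaled scores for each person."""
    columns = [min_max(col) for col in zip(*rows)]
    return [sum(vals) / len(vals) for vals in zip(*columns)]

def fit_stacker(rows, labels, lr=0.1, epochs=300):
    """Stacking sketch: logistic regression on the scores, fitted by
    batch gradient descent on the mean log-loss."""
    k, n = len(rows[0]), len(rows)
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            grad_b += p - y
            for j in range(k):
                grad_w[j] += (p - y) * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def stack_predict(rows, w, b):
    """Linear predictor of the meta-learner (enough for ranking by AUC)."""
    return [b + sum(wi * xi for wi, xi in zip(w, x)) for x in rows]

# Synthetic cohort (assumption): a latent risk drives both the outcome
# and three noisy, hypothetical risk scores of differing quality.
rng = random.Random(0)
rows, labels = [], []
for _ in range(400):
    latent = rng.gauss(0, 1)
    labels.append(1 if latent + rng.gauss(0, 0.5) > 0.8 else 0)
    rows.append(tuple(latent + rng.gauss(0, sd) for sd in (0.8, 1.0, 1.5)))

# Fit the meta-learner on one half, evaluate everything on the other half.
train_x, test_x = rows[:200], rows[200:]
train_y, test_y = labels[:200], labels[200:]
w, b = fit_stacker(train_x, train_y)

single_aucs = [auc(test_y, col) for col in zip(*test_x)]
vote_auc = auc(test_y, soft_vote(test_x))
stack_auc = auc(test_y, stack_predict(test_x, w, b))

print("single-score AUCs:", [round(a, 3) for a in single_aucs])
print("voting AUC:", round(vote_auc, 3))
print("stacking AUC:", round(stack_auc, 3))
```

The held-out split matters for stacking, because a meta-learner evaluated on the data it was fitted to would look better than it is. In practice a library such as scikit-learn, with cross-validated stacking, would likely replace the hand-written gradient descent shown here.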


Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
