Clinical Epidemiology, Volume 8

An algorithm for identification and classification of individuals with type 1 and type 2 diabetes mellitus in a large primary care database

Authors Sharma M, Petersen I, Nazareth I, Coton SJ

Received 23 May 2016

Accepted for publication 23 June 2016

Published 12 October 2016 Volume 2016:8 Pages 373–380


Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Henrik Toft Sørensen

Manuj Sharma,¹ Irene Petersen,¹,² Irwin Nazareth,¹ Sonia J Coton¹

¹Department of Primary Care and Population Health, University College London, London, UK; ²Department of Clinical Epidemiology, Aarhus University, Aarhus, Denmark

Background: Research into diabetes mellitus (DM) often requires a reproducible method for identifying and distinguishing individuals with type 1 DM (T1DM) and type 2 DM (T2DM). 
Objectives: To develop a method to identify individuals with T1DM and T2DM using UK primary care electronic health records. 
Methods: Using data from The Health Improvement Network primary care database, we developed a two-step algorithm. The first step identified individuals with potential T1DM or T2DM based on diagnostic records, treatment, and clinical test results. We excluded individuals who had records only for rarer DM subtypes. To be considered to have DM, individuals needed at least two records indicative of DM, one of which had to be a diagnostic record. The second step then classified individuals as having T1DM or T2DM using a combination of diagnostic codes, medication prescribed, age at diagnosis, and whether the case was incident or prevalent. We internally validated this classification algorithm by comparing it against an independent clinical examination of The Health Improvement Network electronic health records for a random sample of 500 individuals with DM.
Results: Out of 9,161,866 individuals aged 0–99 years from 2000 to 2014, we classified 37,693 individuals with T1DM and 418,433 with T2DM, while 1,792 individuals remained unclassified. A small proportion were classified with some uncertainty (1,155 [3.1%] of all individuals with T1DM and 6,139 [1.5%] with T2DM) due to unclear health records. During validation, manual assignment of DM type based on clinical assessment of the entire electronic record and algorithmic assignment led to equivalent classification in all instances. 
Conclusion: The majority of individuals with T1DM and T2DM can be readily identified from UK primary care electronic health records. Our approach can be adapted for use in other health care settings.
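
The two-step structure described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. Step 1 follows the criteria stated in the abstract (at least two records indicative of DM, at least one of them diagnostic, and not only records for rarer subtypes), although reading "rarer subtypes only" as "every type-specific record is for a rare subtype" is an interpretation. The abstract does not give the step 2 rules, so the `Record` structure, the `insulin_only` flag, the `age_cutoff` value of 35, and the decision order in `classify` are all hypothetical placeholders. The incident/prevalent criterion is omitted because the abstract does not say how it was applied.

```python
from dataclasses import dataclass
from typing import List, Optional

# Record kinds named in the abstract; the string labels are hypothetical.
DIAGNOSIS, TREATMENT, TEST = "diagnosis", "treatment", "test"


@dataclass
class Record:
    kind: str                      # DIAGNOSIS, TREATMENT, or TEST
    dm_type: Optional[str] = None  # "T1DM", "T2DM", "rare", or None if unspecified


def is_potential_case(records: List[Record]) -> bool:
    """Step 1: two or more DM-indicative records, at least one diagnostic,
    and not solely records for rarer DM subtypes (an interpretation)."""
    if len(records) < 2:
        return False
    if not any(r.kind == DIAGNOSIS for r in records):
        return False
    typed = [r.dm_type for r in records if r.dm_type is not None]
    if typed and all(t == "rare" for t in typed):
        return False
    return True


def classify(records: List[Record], age_at_diagnosis: int,
             insulin_only: bool, age_cutoff: int = 35) -> str:
    """Step 2 with placeholder rules; the real rules are not in the abstract."""
    if not is_potential_case(records):
        return "not a case"
    types = {r.dm_type for r in records if r.dm_type in ("T1DM", "T2DM")}
    if len(types) == 1:
        # All type-specific diagnostic codes agree.
        return types.pop()
    # Codes are missing or conflicting: fall back on treatment and age.
    if insulin_only and age_at_diagnosis < age_cutoff:
        return "T1DM (uncertain)"
    if not insulin_only:
        return "T2DM (uncertain)"
    return "unclassified"


if __name__ == "__main__":
    example = [Record(DIAGNOSIS, "T2DM"), Record(TREATMENT)]
    print(classify(example, age_at_diagnosis=58, insulin_only=False))
```

The four possible outcomes (a definite type, a type assigned with some uncertainty, unclassified, and not a case) mirror the categories reported in the Results, but the thresholds that separate them here are illustrative only.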

Keywords: diabetes and endocrinology, epidemiology, public health, databases, algorithm 

A Letter to the Editor has been received and published for this article.

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
