Clinical Epidemiology, Volume 11

Measuring prevalence and incidence of chronic conditions in claims and electronic health record databases

Authors Rassen JA, Bartels DB, Schneeweiss S, Patrick AR, Murk W

Received 24 July 2018

Accepted for publication 24 October 2018

Published 17 December 2018 Volume 2019:11 Pages 1–15


Checked for plagiarism Yes

Review by Single-blind

Peer reviewer comments 2

Editor who approved publication: Professor Henrik Toft Sørensen

Jeremy A Rassen,1 Dorothee B Bartels,2 Sebastian Schneeweiss,1,3,4 Amanda R Patrick,1 William Murk1,5

1Aetion, Inc, New York, NY, USA; 2BI X GmbH, Ingelheim, Germany; 3Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Boston, MA, USA; 4Harvard Medical School, Boston, MA, USA; 5Jacobs School of Medicine, University at Buffalo, Buffalo, NY, USA

Background: Health care databases are natural sources for estimating prevalence and incidence of chronic conditions, but substantial variation in estimates limits their interpretability and utility. We evaluated the effects of design choices when estimating prevalence and incidence in claims and electronic health record databases.
Methods: Prevalence and incidence for five chronic diseases at increasing levels of expected frequencies, from cystic fibrosis to COPD, were estimated in the Clinical Practice Research Datalink (CPRD) and MarketScan databases from 2011 to 2014. Estimates were compared using different definitions of lookback time and contributed person-time.
Results: Variation in lookback time substantially affected estimates. In 2014, for CPRD, an all-time lookback window yielded 4.3–8.3 times higher prevalence (depending on disease) and 1.9–3.3 times lower incidence than a 1-year lookback window. All-time lookback also produced strong temporal trends: COPD prevalence in MarketScan increased by 25% between 2011 and 2014 with an all-time lookback but stayed relatively constant with a 1-year lookback. Varying observability did not substantially affect estimates.
Conclusion: This framework draws attention to the underrecognized potential for widely varying incidence and prevalence estimates, with implications for care planning and drug development. Although prevalence and incidence are seemingly straightforward concepts, careful consideration of methodology is required to obtain meaningful estimates from health care databases.
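
The direction of the lookback effect reported in the Results can be illustrated with a toy calculation. The sketch below is not the authors' implementation: the patient records, the function names (`seen_in_lookback`, `estimate`), and the use of simple patient-level proportions in place of contributed person-time are all assumptions made for illustration. It also assumes every patient is observable for the whole year.

```python
from datetime import date

# Hypothetical diagnosis dates per patient for one chronic condition.
# Toy data for illustration only; not drawn from CPRD or MarketScan.
DIAGNOSES = {
    "A": [date(2009, 5, 1), date(2014, 2, 1)],  # old diagnosis, coded again in 2014
    "B": [date(2013, 9, 1)],                    # diagnosed shortly before 2014
    "C": [date(2014, 6, 1)],                    # first diagnosis in 2014
    "D": [],                                    # never diagnosed
    "E": [date(2010, 3, 1)],                    # old diagnosis, no recent codes
    "F": [],                                    # never diagnosed
}


def seen_in_lookback(dx_dates, year_start, lookback_days):
    """Return True if a diagnosis falls in the lookback window before year_start.

    lookback_days=None means an all-time lookback.
    """
    for d in dx_dates:
        if d < year_start:
            if lookback_days is None or (year_start - d).days <= lookback_days:
                return True
    return False


def estimate(diagnoses, year, lookback_days):
    """Return (prevalence, incidence) as simple proportions for one calendar year.

    Assumption: a patient is prevalent if diagnosed in the lookback window or
    during the year, and incident if diagnosed during the year with no
    diagnosis in the lookback window.
    """
    year_start, year_end = date(year, 1, 1), date(year, 12, 31)
    prevalent = at_risk = incident = 0
    for dx_dates in diagnoses.values():
        prior = seen_in_lookback(dx_dates, year_start, lookback_days)
        in_year = any(year_start <= d <= year_end for d in dx_dates)
        if prior or in_year:
            prevalent += 1
        if not prior:
            # Only patients without a known prior diagnosis are at risk.
            at_risk += 1
            if in_year:
                incident += 1
    prevalence = prevalent / len(diagnoses)
    incidence = incident / at_risk if at_risk else 0.0
    return prevalence, incidence


one_year = estimate(DIAGNOSES, 2014, lookback_days=365)
all_time = estimate(DIAGNOSES, 2014, lookback_days=None)
print("1-year lookback:   prevalence=%.3f incidence=%.3f" % one_year)
print("all-time lookback: prevalence=%.3f incidence=%.3f" % all_time)
```

In this toy cohort, the all-time lookback picks up patient E (an old diagnosis with no recent codes) as prevalent and reclassifies patient A from incident to prevalent, so prevalence rises and incidence falls relative to the 1-year window. The magnitudes here are arbitrary and say nothing about the ratios reported in the study.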

Keywords: epidemiology, epidemiologic methods, epidemiological monitoring, sentinel surveillance, pharmacoepidemiology, cross-sectional studies, secondary databases, prevalence, prevalence studies, incidence

Creative Commons License: This work is published and licensed by Dove Medical Press Limited. The full terms of this license incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
