Spirometry evaluation to assess performance of a claims-based predictive model identifying patients with undiagnosed COPD
Received 18 September 2018
Accepted for publication 28 December 2018
Published 15 February 2019 Volume 2019:14 Pages 439–446
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 2
Editor who approved publication: Dr Richard Russell
Chad Moretz,1 Srinivas Annavarapu,1 Rakesh Luthra,2 Seth Goldfarb,1 Andrew Renda,3 Asif Shaikh,2 Shuchita Kaila2
1Comprehensive Health Insights, Louisville, KY, USA; 2Boehringer Ingelheim, Ridgefield, CT, USA; 3Humana Inc., Louisville, KY, USA
Background: A claims-based model to predict patients likely to have undiagnosed COPD was developed by Moretz et al in 2015. This study aims to assess the performance of the aforementioned model using prospectively collected spirometry data.
Methods: A study population aged 40–89 years enrolled in a Medicare Advantage plan with prescription drug coverage or commercial health plan and without a claim for COPD diagnosis was identified from April 1, 2012 to March 31, 2016 in the Humana claims database. This population was stratified into subjects likely or unlikely to have undiagnosed COPD using the claims-based predictive model. Subjects were randomly selected for spirometry evaluation of FEV1 and FVC. The predictive model was validated using airflow limitation ratio (FEV1/FVC <0.70).
Results: A total of 218 subjects classified by the predictive model as likely and 331 not likely to have undiagnosed COPD completed spirometry evaluation. Those predicted to have undiagnosed COPD had a higher mean age (70.2 vs 67.9 years, P=0.0012) and a lower mean FEV1/FVC ratio (0.724 vs 0.753, P=0.0002) compared to those predicted not to have undiagnosed COPD. Performance metrics for the predictive model were: area under the curve =0.61, sensitivity =52.5%, specificity =64.6%, positive predictive value =33.5%, and negative predictive value =80.1%.
Conclusion: The claims-based predictive model identifies those not at risk of having COPD eight out of ten times, and those who are likely to have COPD one out of three times.
Keywords: COPD, exacerbation, predictive model, clinical validation, prevention
Introduction
COPD is an umbrella term used to describe a number of lung conditions characterized by persistent limited airflow to the lungs.1 According to the World Health Organization, ~5% of all deaths globally (estimated at 3 million) in 2015 were caused by COPD.1 The worldwide prevalence of the disease is expected to rise due to the aging population, decreased likelihood of dying from other diseases, and the burgeoning epidemic of smoking.1 In 2011, nearly 6.4% of adults in the US (15.7 million adults) indicated they had been diagnosed with COPD.2
COPD is a preventable and treatable condition.3 Early detection of the condition is essential for implementing behavioral changes (smoking cessation) and initiating therapies that can slow the progression of the disease.3 Despite the importance of early detection of the disease, delay in diagnosis is unfortunately common. For example, an analysis of data from the Third National Health and Nutrition Examination Survey showed that 63% of patients with low lung function are undiagnosed.4 There are several explanations for delayed diagnosis of COPD. Early-stage COPD is often not associated with any noticeable symptoms.3 Once respiratory symptoms start to occur, patients may gradually and physiologically adapt to the slow decline in lung function and not seek the advice of a practitioner.5,6 In addition, health care providers may fail to detect or respond to changes in patients’ lung function.6
The clinical and economic consequences of the progression to advanced stages of COPD are substantial. The early identification of patients with undiagnosed COPD could help to target interventions to better manage or treat COPD. To that end, administrative claims-based predictive models have been previously developed to identify and characterize COPD patients.7–11 Such predictive models developed using administrative claims databases – which include demographic, clinical, and enrollment information collected by health plans and organized providers – may provide an efficient way to enable identification of patients with COPD.
A predictive model was developed by Moretz et al using Humana’s administrative claims database for identifying patients likely to have undiagnosed COPD.10 This claims database includes members enrolled in commercial plans or Medicare Advantage plans with prescription drug benefit (MAPD). Demographic and clinical characteristics and health care resource utilization information derived from these administrative claims data were used to develop a predictive model to identify subjects likely to have undiagnosed COPD. The model demonstrated a positive predictive value (PPV) of 0.73; that is, of the subjects identified by the predictive model as having undiagnosed COPD, 73% were correctly identified according to an International Classification of Diseases, Clinical Modification, Ninth Revision (ICD-9-CM) diagnosis code for COPD. Other model performance measures included a negative predictive value (NPV) of 0.66 and an area under the curve (AUC) of 0.76.10
A limitation of the model developed by Moretz et al was that a confirmed diagnosis of COPD could not be established since spirometry test results were not available in the claims database.10 Spirometry assessment helps assess lung function and the presence of airflow limitation. Persistent airflow limitation identified through administration of a bronchodilator during spirometry testing is required for a clinical diagnosis of COPD.12 This current study aimed to assess the performance of the claims-based predictive model by Moretz et al in identification of patients likely to have undiagnosed COPD using prospectively collected spirometry data. This study consisted of three parts: 1) identification of persons likely or unlikely to have undiagnosed COPD using Moretz et al’s claims-based predictive model, 2) prospective data collection to assess airflow limitation in a subset of persons identified by the predictive model in the first part, and 3) assessment of predictive model performance using spirometry data.
Materials and methods
Identification of study subjects likely or unlikely to have undiagnosed COPD using a claims-based predictive model
Methodologies used for the identification of study population and claims-based predictive model were based on the study by Moretz et al.10 An initial study population consisting of subjects likely or unlikely to have undiagnosed COPD was selected from the Humana Inc. administrative claims database during the identification period from April 1, 2012 to March 31, 2016. The Humana database contains integrated medical claims, pharmacy claims, and enrollment data, representing more than 12 million current and former Humana members enrolled in commercial and MAPD plans. The data have national coverage, with a high proportion of members from Texas, Florida, and Ohio. For this study, Medicare Advantage and commercially insured populations were examined.
Subjects aged 40–89 years and who were enrolled in an MAPD or commercial health plan with ≥2 years of continuous enrollment were included. Subjects with one or more medical claims with a diagnosis of COPD (ICD-9-CM code 491.xx, 492.xx, or 496.xx) or diagnosis of cystic fibrosis (ICD-9-CM code 277.0x), pulmonary tuberculosis (ICD-9-CM code 011.xx), or malignant neoplasms (ICD-9-CM codes 140.xx–172.xx, 174.xx–209.3x, or 209.7x) during the study period were excluded. The continuous enrollment period was used to confirm that patients were not previously diagnosed with COPD and to confirm they did not have any of the conditions listed in the exclusion criteria. A graphical representation of the population identification and sample selection process is shown in Figure 1.
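The inclusion and exclusion criteria above can be sketched as a simple claims filter. This is a minimal illustration only, not the authors' implementation: the function names, the representation of claims as ICD-9-CM code strings with a decimal point, and the expression of the 2-year enrollment requirement as 24 months are all assumptions made for the example.

```python
# Hypothetical sketch of the eligibility screen; code ranges are transcribed
# from the text, but the matching logic and data layout are assumptions.

def icd9_parts(code):
    """Split 'NNN.dd' into (category, decimals); None for non-numeric codes such as V or E codes."""
    category, _, decimals = code.strip().partition(".")
    if not category.isdigit():
        return None
    return int(category), decimals


def is_exclusion_code(code):
    """True if the code falls in one of the exclusion ranges listed in the text."""
    parts = icd9_parts(code)
    if parts is None:
        return False
    category, decimals = parts
    # COPD: 491.xx, 492.xx, 496.xx
    if category in (491, 492, 496):
        return True
    # Cystic fibrosis: 277.0x
    if category == 277 and decimals.startswith("0"):
        return True
    # Pulmonary tuberculosis: 011.xx
    if category == 11:
        return True
    # Malignant neoplasms: 140.xx-172.xx and 174.xx-208.xx
    if 140 <= category <= 172 or 174 <= category <= 208:
        return True
    # Remainder of the 174.xx-209.3x range (209.0x-209.3x), plus 209.7x
    if category == 209 and decimals[:1] in ("0", "1", "2", "3", "7"):
        return True
    return False


def is_eligible(age, months_enrolled, claim_codes):
    """Age 40-89, at least 24 months of continuous enrollment, no exclusion claim."""
    return (
        40 <= age <= 89
        and months_enrolled >= 24
        and not any(is_exclusion_code(c) for c in claim_codes)
    )
```

Note that category 173 is deliberately left out, because the stated neoplasm ranges skip from 172.xx to 174.xx.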
Figure 1 Attrition diagram.
This initial study population was grouped into persons likely or unlikely to have undiagnosed COPD based on the probabilities generated by the predictive model described by Moretz et al.10 Anticholinergic bronchodilators (odds ratio: 3.336) and tobacco cessation counseling (odds ratio: 2.871) had the largest influence on the model. Key parameters from the COPD predictive model are presented in Table 1. Subjects with a probability of ≥0.027 of having undiagnosed COPD were classified into the likely COPD group and those with a probability of <0.027 of having undiagnosed COPD were classified into the unlikely COPD group.
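The grouping rule can be illustrated in a few lines. The sketch below assumes the model is a logistic regression, so that each coefficient is the natural logarithm of the corresponding odds ratio; only the two odds ratios quoted in the text are included, and the dictionary keys and function name are hypothetical. The intercept and remaining parameters are not reproduced here, so the sketch does not compute a probability itself.

```python
import math

CUTOFF = 0.027  # probability threshold stated in the text

# Odds ratios quoted in the text; under a logistic model (an assumption of
# this sketch) the regression coefficient is ln(odds ratio).
ODDS_RATIOS = {
    "anticholinergic_bronchodilator": 3.336,
    "tobacco_cessation_counseling": 2.871,
}
COEFFICIENTS = {name: math.log(value) for name, value in ODDS_RATIOS.items()}


def classify(probability, cutoff=CUTOFF):
    """Return 'likely' when the model probability is at or above the cut-off."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "likely" if probability >= cutoff else "unlikely"
```

For example, `classify(0.031)` returns `"likely"` and `classify(0.012)` returns `"unlikely"`.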
Table 1 Parameter estimates for the COPD predictive model
Guidance centers are located throughout the US for the benefit of Humana members to provide health and well-being programs and services. The 89,940 (44,970 predicted to have undiagnosed COPD and 44,970 predicted to not have undiagnosed COPD) subjects classified as likely and unlikely to have COPD were then matched based on geographical proximity to three Humana guidance centers (Knoxville, Tennessee; Tampa, Florida; and Tamarac, Florida) using zip code matching. The three guidance centers were selected based on membership density, feedback from Humana’s subject matter experts, and geographic distribution around the guidance centers. There were 3,213 subjects predicted to have COPD and 3,774 subjects predicted to not have COPD located near one of the three guidance centers. In order to recruit these subjects for spirometry assessment, an advance notice letter was mailed initially, followed by a cover letter, which formally invited the selected subjects to participate in the study. Thereafter, a centralized call center (operated by ANA Research, Minneapolis, MN, USA) attempted to recruit the subjects through a phone call and followed up with a mailed information packet, informed consent form, and appointment reminder letter. Of the subjects contacted for participation in the study, some no longer met the eligibility criteria while others were not included for other reasons (ie, not interested, refused to participate, unable to participate, or invalid contact information).
The following patient characteristics were evaluated on identification date: age, gender, race/ethnicity, dual eligibility (Medicaid and Medicare), low income subsidy status, and line of business (Medicare or commercial). The distinction of line of business is required as the majority of Humana’s data is comprised of patients enrolled in Medicare. This group of patients has characteristics distinct from patients enrolled in commercial offerings, which necessitates separation of analysis by insurance program. Demographic characteristics were reported for study subjects stratified based on prediction of COPD. The following clinical characteristics were evaluated on the day of spirometry evaluation: body mass index (BMI), smoking status (smoker or non-smoker), and smoking history (number of years and number of cigarettes per day), and reported for subjects stratified based on prediction of COPD.
Baseline patient characteristics were reported using summary statistics: mean, SD, median, and interquartile ranges for continuous variables and proportion and frequencies for categorical variables. The baseline patient characteristics were then compared by performing univariate statistical comparisons (chi-squared tests and Wilcoxon rank-sum test).
Prospective data collection to assess airflow limitation in a subset of persons identified by the claims-based predictive model
Prospective spirometry evaluations were conducted in series from November 2016 to February 2017 at the study sites where the Humana guidance centers were located (Knoxville, Tennessee; Tampa, Florida; and Tamarac, Florida). Registered respiratory therapists were trained by clinical experts in spirometry testing and by study investigators on the collection and entry of data, in compliance with Institutional Review Board approval. Study subjects completed informed consent and health information release forms, followed by BMI measurement and recording of smoking history. Spirometry evaluation was then conducted using two EasyOne Plus spirometers manufactured by NDD Medical Technologies (Andover, MA, USA). To ensure an adequate test, the procedure was performed three times, and the spirometers were set up to report results only if three acceptable test results were obtained. From the three readings, the best two were to be within 150 mL or 5% of each other. Finally, the best FEV1 and FVC test was recorded in an electronic case report form. A copy of spirometry results was provided to study subjects. A second copy of spirometry results and a reference sheet along with a cover letter were provided to the subject to deliver to their physician or were mailed directly to the study subject’s physician (if the subject agreed and provided the physician’s medical clinic address). A US$50 incentive was provided to compensate study subjects for time and travel to participate in the research study.
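The repeatability check described above might look like the following sketch. It is not the spirometer's firmware logic: the function names are hypothetical, volumes are assumed to be in litres, the 5% tolerance is assumed to be relative to the larger of the two best readings, and the "best" FEV1 and FVC are assumed to be the largest values of each.

```python
def is_repeatable(values_l, abs_tol_l=0.150, rel_tol=0.05):
    """Check that the two largest readings (litres) agree within 150 mL or 5%."""
    if len(values_l) < 2:
        return False
    best, second = sorted(values_l, reverse=True)[:2]
    diff = best - second
    return diff <= abs_tol_l or diff <= rel_tol * best


def best_result(fev1_l, fvc_l):
    """Return (best FEV1, best FVC, FEV1/FVC) if both series pass the check, else None."""
    if not (is_repeatable(fev1_l) and is_repeatable(fvc_l)):
        return None
    fev1, fvc = max(fev1_l), max(fvc_l)
    return fev1, fvc, fev1 / fvc
```

With three hypothetical manoeuvres, `best_result([2.40, 2.35, 2.30], [3.50, 3.45, 3.40])` yields a ratio of about 0.686, which would fall below the 0.70 threshold used in this study.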
Model performance assessment
COPD status predicted by the model was compared to spirometry data to determine misclassification rates using a range of cut-off values from 0.005 to 1.000. Predicted and actual values of a diagnosis of COPD were classified as follows: false negative (FN, predicted by the model as not having COPD but has airflow limitation based on spirometry), false positive (FP, predicted by the model as having COPD but does not have airflow limitation based on spirometry), true negative (TN, predicted by the model as not having COPD and does not have airflow limitation based on spirometry), and true positive (TP, predicted by the model as having COPD and has airflow limitation based on spirometry). Airflow limitation was assessed for the study subjects using a threshold of FEV1/FVC <0.70. The following model performance parameters were reported as percentages for a cut-off value of 0.027 (selected to maximize classification rate [sensitivity + specificity]): sensitivity ([TP/(TP+FN)] ×100), specificity ([TN/(FP+TN)] ×100), PPV ([TP/(TP+FP)] ×100, where TP+FP is the number predicted to have COPD), and NPV ([TN/(TN+FN)] ×100, where TN+FN is the number predicted to not have COPD).
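The four-way classification defined above can be expressed compactly. The following is a minimal sketch, with hypothetical function names and a hypothetical input layout of (model probability, FEV1/FVC ratio) pairs; the two thresholds are those stated in the text.

```python
from collections import Counter

MODEL_CUTOFF = 0.027     # model probability cut-off from the text
RATIO_THRESHOLD = 0.70   # airflow limitation defined as FEV1/FVC < 0.70


def outcome(probability, fev1_fvc):
    """Label one subject as TP, FP, FN, or TN."""
    predicted = probability >= MODEL_CUTOFF
    limited = fev1_fvc < RATIO_THRESHOLD
    if predicted and limited:
        return "TP"
    if predicted:
        return "FP"
    if limited:
        return "FN"
    return "TN"


def tally(subjects):
    """Count outcomes over an iterable of (probability, FEV1/FVC) pairs."""
    counts = Counter(outcome(p, r) for p, r in subjects)
    return {key: counts.get(key, 0) for key in ("TP", "FP", "TN", "FN")}
```

Passing a different cut-off in place of `MODEL_CUTOFF` would give the misclassification counts at each point of the 0.005 to 1.000 range.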
The AUC index based on the receiver operating characteristic (ROC) curve, which measures the predictive accuracy of the model, was computed. ROC curves are obtained by plotting 1-specificity on the horizontal axis and sensitivity on the vertical axis for a range of cut-off values. Each point on the ROC curve corresponds to a particular cut-off value. In terms of model comparison, the ideal curve passes through the upper left-hand corner of the plot, where sensitivity is 1 and 1-specificity is 0. The AUC index assesses overall model performance for a range of cut-off values.
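To make the construction of the ROC curve concrete, the sketch below computes one (1-specificity, sensitivity) point per cut-off and integrates the curve with the trapezoidal rule. It is illustrative only: the source does not state which software or integration rule was used, and the function names are hypothetical. Labels are 1 for airflow limitation on spirometry and 0 otherwise, and both classes are assumed to be present.

```python
def roc_points(scores, labels, cutoffs):
    """Return sorted (1 - specificity, sensitivity) points, including (0, 0) and (1, 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = {(0.0, 0.0), (1.0, 1.0)}
    for cutoff in cutoffs:
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and not y)
        points.add((fp / negatives, tp / positives))
    return sorted(points)


def auc(points):
    """Trapezoidal area under a sorted list of ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

A model that separates the two groups perfectly gives an area of 1.0, while a curve lying on the diagonal gives 0.5.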
The study protocol and informed consent form were approved by Schulman Institutional Review Board and the study was conducted in accordance with the Declaration of Helsinki.
Results
Retrospective identification of study subjects likely or unlikely to have undiagnosed COPD using a claims-based predictive model
A total of 2,432,651 subjects met the study inclusion and exclusion criteria (Figure 1). Of the initial study population, 44,970 subjects were classified by the predictive model as having undiagnosed COPD and 1,479,407 as not having undiagnosed COPD. From the latter group, control subjects were randomly selected, stratified by index year and line of business, yielding 44,970 control subjects. A total of 6,987 study subjects residing within 20 miles of the three Humana guidance centers were randomly selected for spirometry assessment: 3,213 subjects likely to have undiagnosed COPD and 3,774 subjects not likely to have undiagnosed COPD (Table 2). A subset of subjects contacted for participation in the study attended spirometry appointments (Table 2) for evaluation of airflow limitation: 218 subjects predicted by the model to have undiagnosed COPD and 331 subjects predicted to not have undiagnosed COPD. The rates of attendance of spirometry visits among those recruited were 7% (218/3,213) in the group likely to have COPD and 9% (331/3,774) in the group unlikely to have COPD.
Table 2 Number of subjects who were recruited and who attended spirometry evaluation
A comparison of demographic characteristics of study subjects predicted and not predicted to have COPD is presented in Table 3. Subjects predicted to have COPD had a higher mean age and included a similar proportion of females compared to those predicted to not have COPD. All subjects were from the geographic location classified as South. A higher proportion of subjects predicted to have COPD were dual eligible and were low income subsidy recipients compared to those predicted to not have COPD. A lower proportion of subjects predicted to have COPD were enrolled in Medicare health plans compared to those predicted to not have COPD.
A comparison of clinical characteristics of study subjects predicted and not predicted to have COPD is presented in Table 4. There was no difference in BMI between the two cohorts. A higher proportion of subjects predicted to have COPD were classified as smokers compared to those predicted to not have COPD. Among smokers, there was no difference in mean number of years of smoking or number of cigarettes per day between subjects predicted to have COPD compared to those predicted to not have COPD.
Table 4 Baseline clinical characteristics of study population
Prospective data collection to assess airflow limitation in a subset of persons identified by the predictive model
Mean FEV1 and mean FVC values were lower among subjects predicted to have COPD compared to those predicted to not have COPD (Table 5). Mean FEV1/FVC ratio was also lower among subjects predicted to have COPD compared to those predicted to not have COPD.
Model performance assessment
Among the 218 subjects predicted to have COPD by the model, 73 (33.5%) were demonstrated to have airflow limitation by spirometry. Among the 331 subjects predicted to not have COPD by the model, 265 (80.1%) were shown to not have airflow limitation by spirometry. There were 145 false positives (patients predicted to have COPD by the model that did not have airflow limitation assessed by spirometry) and 66 false negatives (patients predicted to not have COPD by the model who had airflow limitation assessed by spirometry). The proportion of subjects correctly identified by the model was lower among subjects predicted to have COPD compared to those predicted to not have COPD. Based on the proportion of subjects correctly predicted by the model as having COPD or not having COPD, the performance parameters were as follows: AUC =0.61 (Figure 2), sensitivity =52.5%, specificity =64.6%, PPV =33.5%, and NPV =80.1%.
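The reported performance parameters can be reproduced directly from the four counts given above (73 true positives, 145 false positives, 265 true negatives, and 66 false negatives). The short sketch below is a worked check rather than the authors' analysis code; the function name is hypothetical.

```python
def performance(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Counts reported in the text: 73 TP, 145 FP, 265 TN, 66 FN.
metrics = performance(tp=73, fp=145, tn=265, fn=66)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
# sensitivity: 52.5%
# specificity: 64.6%
# ppv: 33.5%
# npv: 80.1%
```

The denominators are 139 subjects with airflow limitation (73+66), 410 without (265+145), 218 predicted to have COPD, and 331 predicted to not have COPD.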
Figure 2 ROC curve.
Discussion
Claims-based algorithms to identify patients with undiagnosed COPD have been developed previously.7,8,11 The model developed by Moretz et al10 used medical and pharmacy claims from a primarily Medicare population. The strongest predictors of COPD were use of anticholinergic bronchodilators, tobacco cessation counseling, anticholinergic beta-agonist combination agents, and smoking cessation medications. Comorbid heart failure was also predictive of COPD (Table 1). These findings are in agreement with prior studies that describe an association between COPD diagnosis and a history of smoking and increased comorbidity burden.7,8,11 While the claims-based model described by Moretz et al had a PPV of 73%, the database used did not contain spirometry values for validation. The current study had a specific aim to validate the claims-based COPD predictive model developed by Moretz et al, using prospectively collected spirometry data from a predominately Medicare population.
We observed that the model has a clinically validated PPV of 33.5%, which implies that for every three patients identified by the model as likely to have COPD, one of them will have undiagnosed COPD based on the operational definition of a COPD diagnosis used in this study. Further, the NPV was 80.1%, indicating that eight out of ten patients identified by the model as unlikely to have COPD will not have undiagnosed COPD. The rate of visit completion in the study was low (7% in the group likely to have COPD and 9% in the group unlikely to have COPD), and there exists a possibility of bias if a proportion of the subjects who responded to the study invitation did so out of concern for their respiratory health. The most obvious effect of such bias would be that the “not likely to have COPD” arm of this study would be sicker than the general population, which could result in a downwardly biased estimate of the model’s performance.
While the ability of the predictive model validated in this study to predict COPD is low (PPV of 33.5%), it compares favorably to other published predictive models aimed at identifying those with undiagnosed COPD. The algorithm developed by Mapel et al7 used medical and pharmacy claims from a health plan database containing commercial and managed Medicare and Medicaid enrollees. Key variables predictive of COPD were history of tobacco use, pulmonary heart disease, asthma, asphyxia, and edema. The model was validated by reviewing medical records for presence of two of the following findings to support COPD diagnosis: history of chronic respiratory complaints, spirometry results, chest radiographs, or history of cigarette smoking. However, the model was not validated by prospective spirometry evaluation of test subjects. Another model developed by Mapel et al8 used pharmacy claims only to identify patients with undiagnosed COPD. This model was shown to have a PPV of 25% after validation using spirometry evaluation and a respiratory symptoms questionnaire. Key variables associated with COPD diagnosis were use of respiratory medications, antibiotics, and cardiovascular medications.
Studies have demonstrated effectiveness of interventions to slow disease progression in patients with COPD.12 This clinically validated predictive model may be useful as a cost-effective method to identify persons with undiagnosed COPD in a managed care environment enabling identification earlier in the disease continuum. A follow-up evaluation including spirometry may be conducted to confirm airflow limitation and initiate disease management and education programs such as smoking cessation counseling, medication therapy management, and transitions of care programs to improve lung function, dyspnea, and quality of life, and reduce risk of exacerbations.13
The following limitations should be considered when interpreting the results of this study. While we did not use post-bronchodilator spirometry as recommended by GOLD guidelines in our study, the predictive value of pre-bronchodilator spirometry to diagnose COPD is high and close to the predictive value of post-bronchodilator spirometry based on two large studies.14,15 However, this procedure limits the ability to differentiate between COPD and asthma, and may result in misclassification. Also, patients with unclassified spirometry or Preserved Ratio Impaired Spirometry were not evaluated in the study.16,17 Even though the persistent nature of airflow limitation was not evaluated in this study, confirmation of airflow obstruction through spirometry testing further enhances the value and application of the claims-based predictive model.
Patients with diagnosis codes for asthma were not excluded from the study. However, it may be valuable to target patients with airflow limitation for further evaluation or interventions.
All subjects were from the southern region of the US, and may have different characteristics from the initial study population identified by the predictive model as likely or unlikely to have undiagnosed COPD. Study subjects that completed the appointment had different characteristics compared to the potential study subjects.
The results of this study are partially based on administrative claims data from a large national health plan in the US. Retrospective database studies using administrative claims are prone to coding errors of omission and commission and to incomplete claims information, which may lead to misclassification.
Conclusion
The current study provides prospective spirometry data for assessment of performance of a claims-based COPD predictive model, which may be used for early identification of undiagnosed patients with COPD.
This study was funded by Boehringer Ingelheim.
Rakesh Luthra, Asif Shaikh, and Shuchita Kaila are employees of Boehringer Ingelheim. Chad Moretz, Srinivas Annavarapu, and Seth Goldfarb are employees of Comprehensive Health Insights, Inc., which conducted the study. Andrew Renda is an employee of Humana Inc., and provided project consultation. The authors report no other conflicts of interest in this work.
References
1. World Health Organization. Chronic obstructive pulmonary disease (COPD). 2017 October 25. Available from: http://www.who.int/mediacentre/factsheets/fs315/en/. Accessed December 3, 2017.
2. Wheaton AG, Cunningham TJ, Ford ES, Croft JB. Centers for Disease Control and Prevention (CDC). Employment and activity limitations among adults with chronic obstructive pulmonary disease – United States, 2013. MMWR Morb Mortal Wkly Rep. 2015;64(11):289–295.
3. Global strategy for the diagnosis, management and prevention of COPD (2018 report), Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2018. Available from: http://www.goldcopd.org/. Accessed August 1, 2018.
4. Mannino DM. COPD: epidemiology, prevalence, morbidity and mortality, and disease heterogeneity. Chest. 2002;121(5 Suppl):121S–126S.
5. van den Boom G, Rutten-van Mölken MP, Tirimanna PR, van Schayck CP, Folgering H, van Weel C. Association between health-related quality of life and consultation for respiratory symptoms: results from the DIMCA programme. Eur Respir J. 1998;11(1):67–72.
6. van Schayck CP, van der Heijden FM, van den Boom G, Tirimanna PR, van Herwaarden CL. Underdiagnosis of asthma: is the doctor or the patient to blame? The DIMCA project. Thorax. 2000;55(7):562–565.
7. Mapel DW, Frost FJ, Hurley JS, et al. An algorithm for the identification of undiagnosed COPD cases using administrative claims data. J Manag Care Pharm. 2006;12(6):458–465.
8. Mapel DW, Petersen H, Roberts MH, Hurley JS, Frost FJ, Marton JP. Can outpatient pharmacy data identify persons with undiagnosed COPD? Am J Manag Care. 2010;16(7):505–512.
9. Macaulay D, Sun SX, Sorg RA, et al. Development and validation of a claims-based prediction model for COPD severity. Respir Med. 2013;107(10):1568–1577.
10. Moretz C, Zhou Y, Dhamane AD, et al. Development and validation of a predictive model to identify individuals likely to have undiagnosed chronic obstructive pulmonary disease using an administrative claims database. J Manag Care Spec Pharm. 2015;21(12):1149–1159.
11. Mapel DW, Dutro MP, Marton JP, Woodruff K, Make B. Identifying and characterizing COPD patients in US managed care. A retrospective, cross-sectional analysis of administrative claims data. BMC Health Serv Res. 2011;11:43.
12. Bettoncelli G, Blasi F, Brusasco V, et al. The clinical and integrated management of COPD. Sarcoidosis Vasc Diffuse Lung Dis. 2014;31(Suppl 1):3–21.
13. Rennard SI, Drummond MB. Early chronic obstructive pulmonary disease: definition, assessment, and prevention. Lancet. 2015;385(9979):1778–1788.
14. Fortis S, Eberlein M, Georgopoulos D, Comellas AP. Predictive value of prebronchodilator and postbronchodilator spirometry for COPD features and outcomes. BMJ Open Respir Res. 2017;4(1):e000213.
15. Mannino DM, Diaz-Guzman E, Buist S. Pre- and post-bronchodilator lung function as predictors of mortality in the lung health study. Respir Res. 2011;12:136.
16. Wan ES, Fortis S, Regan EA, et al. Longitudinal phenotypes and mortality in preserved ratio impaired spirometry in the COPDGene study. Am J Respir Crit Care Med. 2018;198(11):1397–1405.
17. Guerra S, Sherrill DL, Venker C, Ceccato CM, Halonen M, Martinez FD. Morbidity and mortality associated with the restrictive spirometric pattern: a longitudinal study. Thorax. 2010;65(6):499–504.
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.