Investigating sensitivity, specificity, and area under the curve of the Clinical COPD Questionnaire, COPD Assessment Test, and Modified Medical Research Council scale according to GOLD using St George's Respiratory Questionnaire cutoff 25 (and 20) as reference
Received 3 November 2015
Accepted for publication 30 December 2015
Published 18 May 2016 Volume 2016:11(1) Pages 1045–1052
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 2
Editor who approved publication: Dr Richard Russell
Ioanna G Tsiligianni,1,2 Harma J Alma,1,2 Corina de Jong,1,2 Danijel Jelusic,3 Michael Wittmann,3 Michael Schuler,4 Konrad Schultz,3 Boudewijn J Kollen,1 Thys van der Molen,1,2 Janwillem WH Kocks1,2
1Department of General Practice, 2GRIAC Research Institute, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; 3Klinik Bad Reichenhall, Center for Rehabilitation, Pulmonology and Orthopedics, Bad Reichenhall, 4Department of Medical Psychology, Psychotherapy and Rehabilitation Sciences, University of Würzburg, Würzburg, Germany
Background: In the GOLD (Global initiative for chronic Obstructive Lung Disease) strategy document, the Clinical COPD Questionnaire (CCQ), COPD Assessment Test (CAT), or modified Medical Research Council (mMRC) scale are recommended for the assessment of symptoms using the cutoff points of CCQ ≥1, CAT ≥10, and mMRC scale ≥2 to indicate symptomatic patients. The current study investigates the criterion validity of the CCQ, CAT and mMRC scale based on a reference cutoff point of St George’s Respiratory Questionnaire (SGRQ) ≥25, as suggested by GOLD, following sensitivity and specificity analysis. In addition, areas under the curve (AUCs) of the CCQ, CAT, and mMRC scale were compared using two SGRQ cutoff points (≥25 and ≥20).
Materials and methods: Two data sets were used: study A, 238 patients from a pulmonary rehabilitation program; and study B, 101 patients from primary care. Receiver-operating characteristic (ROC) curves were used to assess the correspondence between the recommended cutoff points of the questionnaires.
Results: Sensitivity, specificity, and AUC scores for cutoff point SGRQ ≥25 were: study A, 0.99, 0.43, and 0.96 for CCQ ≥1, 0.92, 0.48, and 0.89 for CAT ≥10, and 0.68, 0.91, and 0.91 for mMRC ≥2; study B, 0.87, 0.77, and 0.9 for CCQ ≥1, 0.76, 0.73, and 0.82 for CAT ≥10, and 0.21, 1, and 0.81 for mMRC ≥2. Sensitivity, specificity, and AUC scores for cutoff point SGRQ ≥20 were: study A, 0.99, 0.73, and 0.99 for CCQ ≥1, 0.91, 0.73, and 0.94 for CAT ≥10, and 0.66, 0.95, and 0.94 for mMRC ≥2; study B, 0.8, 0.89, and 0.89 for CCQ ≥1, 0.69, 0.78, and 0.8 for CAT ≥10, and 0.18, 1, and 0.81 for mMRC ≥2.
Conclusion: Based on data from these two different samples, this study showed that the suggested cutoff point for the SGRQ (≥25) did not seem to correspond well with the established cutoff points of the CCQ or CAT scales, resulting in low specificity levels. The correspondence with the mMRC scale seemed satisfactory, though not optimal. The SGRQ threshold of ≥20 corresponded slightly better than SGRQ ≥25, recently suggested by GOLD 2015, with the established cutoff points for the CCQ, CAT, and mMRC scale.
Keywords: pulmonary disease, chronic obstructive, health status
COPD is a prevalent disease worldwide, characterized by persistent airflow limitation.1 COPD patients suffer for years and die prematurely due to its complications.1,2 Patients’ experiences vary independently of airflow limitation, with some able to cope well with daily activities, while others are completely handicapped.2 Until 2009, GOLD (Global initiative for chronic Obstructive Lung Disease) classification of COPD disease severity was based on spirometry alone, with no regard for health status or dyspnea assessment. Management of the disease was based on forced expiratory volume in 1 second (FEV1), which does not adequately reflect patients’ well-being or disease impact.2,3 From 2011 onward, GOLD suggested COPD patients be classified into a risk-category system according to FEV1, number of exacerbations, and health status or dyspnea assessment.1,4–6 Health status can be evaluated using the Clinical COPD Questionnaire (CCQ)7 and the COPD Assessment Test (CAT);8 alternatively, the level of dyspnea can be evaluated by using the modified British Medical Research Council (mMRC) dyspnea scale.9 Based on this classification, patients are grouped into four categories: A, B, C, and D. This grouping consequently influences treatment decisions.1,4–6 If a discrepancy in risk assessment exists in this classification system, GOLD recommends assignment to the higher-risk category.1,4–6
In the GOLD recommendations, the cutoff points of the CCQ, CAT, and mMRC scale have been set to 1, 10, and 2 points, respectively, while the recent GOLD 2015 update suggests that the CCQ cutoff point could be 1–1.5.1,4–6 However, recent publications have shown a limited correspondence between these cutoff points, resulting in differences in patient classification.10–18
Clinicians and researchers use several health-status questionnaires. St George’s Respiratory Questionnaire (SGRQ),19 the CCQ,7 and the CAT8 are the most commonly used. The CCQ has been ranked first on the International Primary Care Respiratory Group ranking as the most appropriate tool for use in primary care.20 In the same tool guide, the CAT, mMRC scale, and SGRQ follow the CCQ in this ranking.20 The SGRQ is the most widely used questionnaire in clinical trials to assess health status in COPD patients, and is considered the gold standard.19 However, the use of the SGRQ in daily clinical practice is limited, due to its length, the difficulty of administering it, and its complex score-calculation process.20 Therefore, the well-validated and more practical CCQ and CAT have both been proposed as alternatives by GOLD.1,4–6 Nevertheless, many researchers and clinicians still use the SGRQ to assess health status in their COPD patients. Until recently, there was no information about which cutoff points of the CCQ and CAT should be used in relation to the SGRQ reference.
Recently, an SGRQ cutoff point of ≥25 has been suggested as the gold standard.6 The current study aimed to investigate whether an SGRQ cutoff point of ≥25 or another cutoff point is more appropriate for use in practice. This study also aimed to investigate the criterion validity of the CCQ, CAT, and mMRC scale cutoff points in differentiating between high- and low-symptom groups using the suggested cutoff point of the SGRQ as the gold standard, based on sensitivity and specificity analyses.
Materials and methods
Data from two different studies were used. In study A, participants were recruited from Clinic Bad Reichenhall, Center for Rehabilitation, Pulmonology, and Orthopedics in Germany between February and November 2013. During this period, 238 patients were enrolled from an ongoing clinical trial conducted to evaluate the effectiveness of a 3-week pulmonary rehabilitation program on health status, psychological well-being, and physiological status. Participants with spirometry-confirmed COPD GOLD category II–IV were included. Patients with relevant hypercapnic respiratory failure (CO2 partial pressure ≥50 mmHg at rest or an indication for noninvasive ventilation), linguistic or cognitive limitations, or lack of motivation were excluded. The study was approved by the Ethik-Kommission der Bayerischen Landesärztekammer (12107) and registered in the German Clinical Trial Register (DRKS00004609). All participants gave written informed consent. For this analysis, only prerehabilitation measurements of the CCQ, CAT, mMRC scale, and SGRQ were used.
In study B, 101 patients, recruited mainly from primary care, participated in a three-visit observational study assessing the relationship and responsiveness of four patient-reported outcomes: the CCQ, CAT, SGRQ, and mMRC scale. Patients 45 years of age and older with a smoking history of at least 10 years were included. Exclusion criteria were patients with concomitant asthma, unstable cardiovascular disease, or any respiratory disease other than COPD.21 For the purposes of this study, analysis of the first visit’s measurements was used. Therefore, in total 101 patients were included here instead of the 90 patients who completed all three visits. More details have been published elsewhere.21
The questionnaires recommended by the GOLD strategy document were used: the CCQ, the CAT, and the mMRC scale (study A) or the original MRC scale (study B).7–9 The SGRQ, the most widely used questionnaire in COPD research, was also administered.19
The CCQ is a ten-item questionnaire that consists of three domains: symptoms, functional status, and mental state.7 Total scores range from 0 to 6 (0= no impairment). The minimum clinically important difference (MCID) is 0.4.22 The CCQ cutoff point suggested by GOLD for symptomatic patients is ≥1,4,5 and only recently it has been proposed that it may be 1–1.5.1,6
The CAT is an eight-item one-dimensional questionnaire of health-status impairment in COPD. Total scores range from 0 to 40 (0= no impairment). The MCID is suggested to be approximately 2 points.23 The CAT cutoff point suggested by GOLD for symptomatic patients is ≥10.1,4–6
The mMRC is a one-dimensional tool assessing dyspnea during exercise in five levels. Several versions circulate. The mMRC ranges from 0 to 4, and is recommended by GOLD. The original MRC scale ranges from 1 to 5 and has similar wording.9 To acquire compatible data, the study B scores were lowered by 1 point to calculate representative mMRC scores. The MCID is 1. The mMRC cutoff point suggested by GOLD for symptomatic patients is ≥2.1,4–6
The SGRQ is a self-administered questionnaire that measures health status in patients with chronic airflow limitation. For this study, the 50-item version was used. The total score ranges from 0 (perfect health) to 100, and has three domains: symptoms, activity, and impact.19 The MCID is 4. The suggested SGRQ cutoff point for highly symptomatic patients is ≥25.6
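The cutoff points described above can be collected into a small lookup, shown here as a minimal Python sketch. The function names and the example patient are hypothetical and are not part of the study's analysis, which was run in SPSS. The conversion helper reflects the 1-point shift applied to the 1-to-5 version of the dyspnea scale used in study B.

```python
# Cutoff points quoted in the text; a score at or above the cutoff
# counts as "symptomatic" for that instrument.
CUTOFFS = {"CCQ": 1.0, "CAT": 10, "mMRC": 2, "SGRQ": 25}


def mrc_to_mmrc(mrc_grade: int) -> int:
    """Shift a grade on the 1-5 scale onto the 0-4 mMRC range."""
    if not 1 <= mrc_grade <= 5:
        raise ValueError("grade on the 1-5 scale expected")
    return mrc_grade - 1


def is_symptomatic(instrument: str, score: float) -> bool:
    """Return True when the score reaches the instrument's cutoff."""
    return score >= CUTOFFS[instrument]


# Hypothetical patient, not taken from either data set.
patient = {"CCQ": 1.2, "CAT": 9, "mMRC": mrc_to_mmrc(2), "SGRQ": 22.0}
flags = {name: is_symptomatic(name, value) for name, value in patient.items()}
print(flags)
```

A patient like this one is symptomatic on one instrument and not on the others, which is the kind of disagreement the study quantifies.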
Data analysis was performed using SPSS 20 (IBM, Armonk, NY, USA). Both data sets were assessed for normal distribution. Patient characteristics between the studies were compared using χ2 tests, independent t-tests, or Mann–Whitney U-tests where appropriate. In both studies, correspondence was assessed between the recommended cutoff point of the SGRQ (≥25) and the cutoff points of the CCQ (≥1), CAT (≥10), and mMRC scale (≥2), expressed in sensitivity and specificity using receiver-operating characteristic (ROC) curves.
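To make the correspondence analysis concrete, the following Python sketch computes sensitivity, specificity, and a rank-based AUC for one questionnaire against a dichotomized SGRQ reference. It is an illustration only: the study used SPSS, the function names are hypothetical, and the scores are invented rather than taken from either data set.

```python
def sensitivity_specificity(test_scores, test_cutoff, ref_scores, ref_cutoff):
    """Sensitivity and specificity of test >= test_cutoff against
    the reference classification ref >= ref_cutoff."""
    tp = fp = tn = fn = 0
    for test, ref in zip(test_scores, ref_scores):
        test_pos = test >= test_cutoff
        ref_pos = ref >= ref_cutoff
        if test_pos and ref_pos:
            tp += 1
        elif test_pos:
            fp += 1
        elif ref_pos:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(test_scores, ref_scores, ref_cutoff):
    """Area under the ROC curve, computed as the probability that a
    reference-positive patient has a higher test score than a
    reference-negative patient (ties count one half)."""
    pos = [t for t, r in zip(test_scores, ref_scores) if r >= ref_cutoff]
    neg = [t for t, r in zip(test_scores, ref_scores) if r < ref_cutoff]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented scores for eight hypothetical patients.
ccq = [0.4, 0.8, 1.1, 1.6, 2.3, 3.0, 0.9, 1.3]
sgrq = [10, 18, 22, 30, 45, 60, 27, 24]

sens, spec = sensitivity_specificity(ccq, 1.0, sgrq, 25)
area = auc(ccq, sgrq, 25)
print(sens, spec, area)
```

The AUC uses the full range of questionnaire scores, whereas sensitivity and specificity describe a single operating point (here CCQ ≥1), which is why the two can diverge.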
To assess alternative and possibly superior cutoff points to the suggested SGRQ cutoff point of 25, we used SGRQ cutoff points of 15–30 to split the data sets. The mean differences in the other questionnaires (CCQ, CAT) between the low and high SGRQ groups were then listed. The SGRQ cutoff point yielding the maximal difference in the CAT or CCQ was then used to create additional ROC curves.
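The cutoff scan can be sketched in Python as follows. This is a hedged illustration under assumed details: the function names are hypothetical, the scores are invented, and the exact grouping rule (a score at or above the cutoff counts as "high") is an assumption based on the ≥ notation used elsewhere in the text.

```python
from statistics import mean


def mean_difference_by_cutoff(ref_scores, other_scores, cutoffs):
    """For each reference cutoff, return the mean of other_scores in the
    high group (ref >= cutoff) minus the mean in the low group."""
    differences = {}
    for cutoff in cutoffs:
        high = [o for r, o in zip(ref_scores, other_scores) if r >= cutoff]
        low = [o for r, o in zip(ref_scores, other_scores) if r < cutoff]
        if high and low:  # skip cutoffs that leave one group empty
            differences[cutoff] = mean(high) - mean(low)
    return differences


def cutoff_with_max_difference(differences):
    """Return the first cutoff at which the difference is largest."""
    return max(differences, key=differences.get)


# Invented scores for nine hypothetical patients.
sgrq = [10, 14, 18, 19, 22, 26, 31, 40, 55]
cat = [3, 5, 6, 8, 15, 17, 19, 24, 30]

diffs = mean_difference_by_cutoff(sgrq, cat, range(15, 31))
best = cutoff_with_max_difference(diffs)
print(best, diffs[best])
```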
To assess whether the areas under the curve (AUCs) were significantly different from one another, which would imply that one questionnaire was better able to discriminate between high and low symptoms than another, the difference between two AUCs was tested using the formula:
$$z = \frac{AUC_1 - AUC_2}{\sqrt{SE_1^2 + SE_2^2 - 2r\,SE_1 SE_2}}$$
where SE is the standard error of the respective AUC and r is the Pearson product-moment correlation.24
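As a rough illustration of the test from Hanley and McNeil (reference 24), the sketch below divides the difference between two AUCs by the standard error of that difference, which is reduced by a correlation term because both curves come from the same patients. The function name is hypothetical. The two AUC values are the study A figures reported for the CCQ and CAT, but the standard errors and the correlation are assumed values chosen only to make the example run.

```python
import math


def compare_correlated_aucs(auc1, se1, auc2, se2, r):
    """Return the z statistic and two-sided P value for the difference
    between two AUCs estimated on the same cases."""
    variance = se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2
    if variance <= 0:
        raise ValueError("variance of the difference must be positive")
    z = (auc1 - auc2) / math.sqrt(variance)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# AUCs as reported for study A (CCQ 0.96, CAT 0.89);
# se1, se2 and r below are assumptions, not values from the paper.
z, p = compare_correlated_aucs(0.96, 0.02, 0.89, 0.03, r=0.5)
print(round(z, 2), round(p, 4))
```

A larger positive correlation shrinks the denominator, so the same AUC difference becomes easier to detect when two questionnaires are completed by the same patients.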
Study A was approved by the Ethik-Kommission der Bayerischen Landesärztekammer. Study B was approved by the local medical ethics committee of the University Hospital of Crete, Greece.
Results
Patient characteristics of data sets A and B are shown in Table 1. Patients in study A were significantly younger and included more female patients. Patients in study B had significantly more pack-years. In general, health-status scores were higher in the pulmonary rehabilitation group (study A) than in the primary care group (study B). Mean baseline levels in study groups A and B for the recommended health-status instruments were (respectively) 2.85 and 1.52 (CCQ), 20.18 and 12.65 (CAT), and 50.13 and 35.24 (SGRQ), while mean mMRC-scale scores were 2.53 in group A and 0.85 in group B.
The AUC of the CCQ was significantly higher than the AUC of the CAT in study A for both SGRQ cutoff points. In study B, the AUC of the CCQ was superior to that of the CAT only for the cutoff point of 25 (Table 2). In addition, in study B the AUC of the CCQ was superior to that of the mMRC scale for both cutoff points.
In study A, sensitivity, specificity, and AUC for the cutoff point SGRQ ≥25 were (respectively) 0.99, 0.43, and 0.96 for CCQ ≥1; 0.92, 0.48, and 0.89 for CAT ≥10; and 0.68, 0.91, and 0.91 for mMRC ≥2. In study B, the corresponding values for SGRQ ≥25 were 0.87, 0.77, and 0.9 for CCQ ≥1; 0.76, 0.73, and 0.82 for CAT ≥10; and 0.21, 1, and 0.81 for mMRC ≥2 (Table 3). The maximal difference between high and low CCQ or CAT scores across the SGRQ cutoff points of 15–30 was 2.01 for the CCQ and 11.5 for the CAT, both at the SGRQ cutoff point of 20.
When the SGRQ cutoff point was adjusted to ≥20, sensitivity, specificity, and AUC in study A were (respectively) 0.99, 0.73, and 0.99 for CCQ ≥1; 0.91, 0.73, and 0.94 for CAT ≥10; and 0.66, 0.95, and 0.94 for mMRC ≥2. In study B, these results were 0.8, 0.89, and 0.89 for CCQ ≥1; 0.69, 0.78, and 0.8 for CAT ≥10; and 0.18, 1, and 0.81 for mMRC ≥2. The ROC curves are shown in Figures 1–4. Overall, the percentage of symptomatic patients in the SGRQ cutoff-25 group amounted to 86.1%, while in the SGRQ cutoff-20 group this percentage was 91.7%. These percentages differed significantly (P<0.001).
Discussion
This study, using data from two samples with somewhat different characteristics, showed that the suggested cutoff point for the SGRQ (≥25) did not appear to correspond well with the established cutoff points of either the CCQ or the CAT, resulting in low specificity levels. This study examined whether an SGRQ cutoff point of ≥25, which was recently proposed as an equivalent to the CAT cutoff point of ≥10, could indeed be considered the gold standard.6,13 Next, this study investigated the criterion validity of the CCQ, CAT, and mMRC cutoff points in differentiating between high- and low-symptom groups using the suggested cutoff point of the SGRQ as the gold standard based on sensitivity and specificity analysis. The correspondence with the mMRC scale seemed satisfactory, though not optimal. An SGRQ threshold ≥20 results in better sensitivity and specificity for the CCQ and CAT, as well as improved specificity for the mMRC scale.
The GOLD classification of patients according to the CCQ, CAT, or mMRC scale has already been shown to lead to discrepancies in categorization, mainly between the health-status instruments and the dyspnea instrument (mMRC).10–16 This dissociation was also found in the ROC profiles in the current study. The crucial difference between the SGRQ cutoff points of 25 and 20 was primarily due to the patients from primary care (study B), as shown in Figures 1–4, meaning that more studies are needed before firm conclusions can be drawn.
The 2015 GOLD statement included a paragraph on the choice of cutoff points.6 Two arguments were put forward for choosing the SGRQ cutoff of 25: firstly, healthy individuals seldom score >25, while COPD patients often score >25; and secondly, in trials of long-acting bronchodilators, 26 is one standard deviation below the studies’ baseline SGRQ scores. Although the SGRQ was developed long before the CAT and CCQ, the CAT, CCQ, and mMRC scale were the first to be included in treatment guidelines and studies.4 Determining the cutoff for the SGRQ based on the differentiation between healthy subjects and COPD patients or on statistical grounds is thus less obvious than relating it to the CAT, CCQ, or mMRC scale cutoff points used in the guidelines.
Several comparisons have been reported. Recent research suggested that a CAT score of 10 corresponds to a CCQ score of 1.5.25 Mean CCQ and CAT scores in that study were 2.9 and 21.2, respectively, which corresponds with our rehabilitation cohort.25 Kim et al showed that an mMRC-scale score of 2 corresponded to a mean CAT score of 21 (standard deviation 8), while an mMRC-scale score of 1 corresponded to a mean CAT score of 13 (standard deviation 6).16 On the other hand, Jones et al demonstrated that an mMRC-scale score of 1 corresponded to a CAT score of 10.15 Han et al performed a study that defined categories using the mMRC scale (0–1 versus ≥2) and the SGRQ (≥25 versus <25 as a surrogate for the CAT ≥10 versus <10), and showed that the choice of symptom measure influenced category assignment.13 In the current study, formal testing of the AUCs showed superior performance of the CCQ compared to the CAT at both cutoff points. This suggests that the CCQ is better at correctly categorizing low- and high-symptomatic COPD patients using the SGRQ as the gold standard, which is important for the interpretation of clinical trial results.
The discrepancy in patient classification between the CCQ, CAT, SGRQ, and mMRC scale may be explained by the differences in content: the CCQ has ten questions that can be divided into three domains measuring symptoms (dyspnea, cough, sputum) and mental (fear and depression due to respiratory symptoms) and functional status (limitations due to respiratory symptoms). The CAT has been designed as a one-dimensional tool to assess such parameters as cough, phlegm, chest tightness, breathlessness going up hills/stairs, activity limitation, sleep, energy, and confidence leaving home. Apart from symptoms, the SGRQ also assesses activities and impact of the disease similarly to the CCQ. Finally, the mMRC scale measures dyspnea due to exercise. In the original development studies of both the CCQ and CAT, their correlations with dyspnea were moderate.7,8 Although the CAT, SGRQ, and CCQ measure the same construct, due to differences in content it may be difficult to find cutoff points that allow the division of all patient populations in exactly the same groups. For example, patients with exercise limitations due to breathlessness but few other symptoms might score slightly higher on the CCQ than on the CAT, resulting in being classified symptomatic on the CCQ and not symptomatic on the CAT. Outcomes from questionnaires in daily clinical practice should not be regarded as carved in stone; they are indicators that should initiate a discussion between clinicians and patients. In the aforementioned example, the patient may be advised to spend some time each day exercising based on the results of the functional domain of the CCQ or on the one question about activity limitation of the CAT.
Strengths and limitations
Some limitations of this study should be mentioned. First, the study did not use symptoms as the gold standard, but instead defined symptomatic patients according to their score on other scales. Second, the subjects in study A were selected to participate in a randomized controlled trial regarding rehabilitation. Usually, patients participating in rehabilitation studies are largely symptomatic and are not the most appropriate population in which to study cutoff points that separate symptomatic from nonsymptomatic patients. However, subjects in study B were selected largely from primary care, and may not have been subject to this selection bias. Third, we decided for the sake of clarity not to use follow-up data. Although follow-up data would clarify the importance of different cutoff values in clinical practice, the numbers were too small to draw conclusions. One of the strengths of this study is that it is the first real-life study to assess cutoff points for the CCQ, CAT, SGRQ, and mMRC scale.
Conclusion
In these two samples, the suggested cutoff point for the SGRQ (≥25) did not appear to correspond well with the established cutoff points of either the CCQ or CAT, resulting in low specificity levels, while the correspondence with the mMRC scale seemed satisfactory. An SGRQ threshold ≥20 showed better sensitivity and specificity for the CCQ and CAT, as well as improved specificity for the mMRC scale. Based on our findings, we recommend the GOLD committee reconsider its cutoff point for the SGRQ.
This study was funded by the Department of General Practice, University Medical Center of Groningen, Groningen, the Netherlands.
All authors contributed toward data analysis, drafting and revising the paper and agree to be accountable for all aspects of the work.
TvdM developed the CCQ, and holds the copyright. The other authors report no conflicts of interest in this work.
References
1. GOLD (Global initiative for chronic Obstructive Lung Disease). Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease. Bethesda (MD): GOLD; 2014.
2. Tsiligianni I, Kocks J, Tzanakis N, Siafakas N, van der Molen T. Factors that influence disease-specific quality of life or health status in patients with COPD: a review and meta-analysis of Pearson correlations. Prim Care Respir J. 2011;20:257–268.
3. Jones PW. Health status measurement in chronic obstructive pulmonary disease. Thorax. 2001;56:880–887.
4. GOLD (Global initiative for chronic Obstructive Lung Disease). Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease. Bethesda (MD): GOLD; 2011.
5. GOLD (Global initiative for chronic Obstructive Lung Disease). Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease. Bethesda (MD): GOLD; 2013.
6. GOLD (Global initiative for chronic Obstructive Lung Disease). Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease. Bethesda (MD): GOLD; 2015.
7. van der Molen T, Willemse BW, Schokker S, ten Hacken NH, Postma DS, Juniper EF. Development, validity and responsiveness of the Clinical COPD Questionnaire. Health Qual Life Outcomes. 2003;1:13.
8. Jones PW, Harding G, Berry P, Wiklund I, Chen WH, Kline Leidy N. Development and first validation of the COPD Assessment Test. Eur Respir J. 2009;34:648–654.
9. Bestall JC, Paul EA, Garrod R, Garnham R, Jones PW, Wedzicha JA. Usefulness of the Medical Research Council (MRC) dyspnoea scale as a measure of disability in patients with chronic obstructive pulmonary disease. Thorax. 1999;54:581–586.
10. Leivseth L, Brumpton BM, Nilsen TI, Mai XM, Johnsen R, Langhammer A. GOLD classifications and mortality in chronic obstructive pulmonary disease: the HUNT study, Norway. Thorax. 2013;68:914–921.
11. Lange P, Marott JL, Vestbo J, et al. Prediction of the clinical course of chronic obstructive pulmonary disease, using the new GOLD classification: a study of the general population. Am J Respir Crit Care Med. 2012;186:975–981.
12. Soriano JB, Alfageme I, Almagro P, et al. Distribution and prognostic validity of the new Global Initiative for Chronic Obstructive Lung Disease grading classification. Chest. 2013;143:694–702.
13. Han MK, Muellerova H, Curran-Everett D, et al. GOLD 2011 disease severity classification in COPDGene: a prospective cohort study. Lancet Respir Med. 2013;1:43–50.
14. Agusti A, Edwards L, Celli B, et al. Characteristics, stability and outcomes of the GOLD 2011 COPD groups in the ECLIPSE cohort. Eur Respir J. 2013;42:636–646.
15. Jones PW, Adamek L, Nadeau G, Banik N. Comparisons of health status scores with MRC grades in COPD: implications for the GOLD 2011 classification. Eur Respir J. 2013;42:647–654.
16. Kim SM, Oh JS, Kim YI, et al. Differences in classification of COPD group using COPD assessment test (CAT) or modified Medical Research Council (mMRC) dyspnea scores: a cross-sectional analyses. BMC Pulm Med. 2013;13:35.
17. Wedzicha JA. GOLD and ABCD – a good start, but now for the evidence? Lancet Respir Med. 2013;1:4–5.
18. Jones R, Price D, Chavannes N, et al. GOLD COPD categories are not fit for purpose in primary care. Lancet Respir Med. 2013;1:e17.
19. Jones PW, Quirk FH, Baveystock CM. The St George’s Respiratory Questionnaire. Respir Med. 1991;85 Suppl B:25–31; discussion 33–37.
20. Cave AJ, Atkinson L, Tsiligianni IG, Kaplan AG. Assessment of COPD wellness tools for use in primary care: an IPCRG initiative. Int J Chron Obstruct Pulmon Dis. 2012;7:447–456.
21. Tsiligianni IG, van der Molen T, Moraitaki D, et al. Assessing health status in COPD: a head-to-head comparison between the COPD assessment test (CAT) and the clinical COPD questionnaire (CCQ). BMC Pulm Med. 2012;12:20.
22. Kocks JW, Tuinenga MG, Uil SM, van den Berg JW, Ståhl E, van der Molen T. Health status measurement in COPD: the minimal clinically important difference of the clinical COPD questionnaire. Respir Res. 2006;7:62.
23. Jones PW, Price D, van der Molen T. Role of clinical questionnaires in optimizing everyday care of chronic obstructive pulmonary disease. Int J Chron Obstruct Pulmon Dis. 2011;6:289–296.
24. Hanley J, McNeil B. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839–843.
25. Kon SS, Canavan JL, Nolan CM, et al. The clinical chronic obstructive pulmonary disease questionnaire: cut point for GOLD 2013 classification. Am J Respir Crit Care Med. 2014;189:227–228.
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.