Reliability and usability of a weighted version of the Functional Comorbidity Index
Received 23 August 2018
Accepted for publication 14 November 2018
Published 11 February 2019 Volume 2019:14 Pages 289–299
Checked for plagiarism Yes
Review by Single anonymous peer review
Editor who approved publication: Dr Richard Walker
Anouk D Kabboord,1 Monica van Eijk,1,2 Lisette van Dingenen,1 Monique Wouters,1 Marieke Koet,1 Romke van Balen,1 Wilco P Achterberg1
1Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, the Netherlands; 2Department of Old-Age Medicine Hubertusduin, HMC Bronovo, The Hague, the Netherlands
Purpose: To investigate the reliability of a weighted version of the Functional Comorbidity Index (w-FCI) compared with that of the original Functional Comorbidity Index (FCI) and to test its usability.
Patients and methods: Sixteen physicians collected data from 102 residents who lived in 16 different nursing homes in the Netherlands. A multicenter, prospective observational study was carried out in combination with a qualitative part using the three-step test interview, in which participants completed the w-FCI while thinking aloud and being observed, and were then interviewed afterward. To analyze inter-rater reliability, a subset of 41 residents participated. The qualitative part of the study was completed by eleven elderly care physicians and one advanced nurse practitioner.
Measurements: The w-FCI was composed of the original FCI supplemented with a severity rating per comorbidity, ranging from 0 (disease absent) to 3 (severe impact on daily function). The w-FCI was filled out at baseline by 16 physicians and again 2 months later to establish intra-rater reliability (intraclass correlations; ICCs). For inter-rater reliability, four pairs of raters completed the w-FCI independently from each other.
Results: The ICCs were 0.90 (FCI) and 0.94 (w-FCI) for intra-rater reliability, and 0.61 (FCI) and 0.55 (w-FCI) for inter-rater reliability. Regarding usability of the w-FCI, five meaningful themes emerged from the qualitative data: 1) sources of information; 2) deciding on the presence or absence of disease; 3) severity of comorbidities; 4) usefulness; and 5) content.
Conclusion: The intra-rater reliability of the FCI and the w-FCI was excellent, whereas the inter-rater reliability was moderate for both indices. Based on the present results, a modified w-FCI is proposed that is acceptable and feasible for use in older patients and requires further investigation to study its (predictive) validity.
Keywords: older patients, multimorbidity, personalized medicine, function, disease impact
Plain language summary
In this study, we present a new comorbidity index, which is a modified version of the original Functional Comorbidity Index (FCI). In the assessment of comorbidity, simply summing the conditions that are present does not reflect the total burden of disease. It is also important to assess severity: a condition can be mild, moderate, or severe. In other words, it can have hardly any, a moderate, or a severe impact on the patient’s life, activities, and physical and/or psychological well-being.
This is why we have modified the original FCI and designed a severity-weighted FCI (w-FCI). To study the usability and reliability of this weighted FCI, we collected data from 102 nursing home residents and formed rater pairs to calculate the intraclass correlations (ICCs), in order to determine the intra-rater and inter-rater reliability of the index. Furthermore, we interviewed eleven physicians and one advanced nurse practitioner using a three-step test interview, to test its usability in nursing home practice.
We found that the intra-rater reliability of the FCI and w-FCI was excellent, whereas the inter-rater reliability was moderate. On the basis of the results, we composed a brief and practical tool that is suitable for use in older patients to evaluate their comorbidities, and possibly also to aid in making a functional prognosis after an acute event or hospital admission. This modified w-FCI can be used both in research and in practice to assess comorbidity in the vulnerable older patient population.
Chronic diseases and their interaction – as in multimorbidity – have an impact on a person’s functional abilities and may delay recovery after acute diseases, or complicate rehabilitation.1–4 With an aging population, clinicians and therapists are increasingly confronted with multimorbidity in their patients. However, assessment of comorbidity is complex and should include more than simply the accumulation of single diseases.5–8 The NICE guideline Multimorbidity: Clinical Assessment and Management confirms this, stating that: “… multimorbidity involves personalized assessment and the development of an individualized management plan”.9
Indices such as the Cumulative Illness Rating Scale, the Index of Co-Existing Diseases (ICED), or the Geriatric Index of Comorbidity include disease severity but are complex, time-consuming, and require training and access to a comprehensive manual.5,10–12 A brief and practical method may support clinicians in assessing individual multimorbidity as part of comprehensive geriatric assessment and, subsequently, in making a functional prognosis when acute diseases occur.
In 2005, the FCI became available.13 That index was specifically designed for use in studies investigating physical function, and included 18 prevalent diagnoses related to physical function. Although the authors discussed whether “… severity ratings are likely to provide better adjustment …” the available FCI does not include severity evaluation.13 This original FCI was developed in a community-dwelling adult population. However, severity-weighted comorbidity might be more strongly related to functional status in older vulnerable patients, such as nursing home residents. In addition, a survey study (2013) showed that most practitioners agreed that the severity of disease affected physical function following hip fracture. The authors concluded that the FCI needs modification to be useful in older patient populations, such as patients with hip fracture.14 Therefore, we investigate an FCI that is supplemented with a severity-weighted rating scale.
The present study aims to examine the reliability of this weighted FCI (w-FCI) by analyzing the intra-rater and inter-rater reliability of the original FCI and the w-FCI. A second aim is to test the usability of the w-FCI by examining its feasibility, acceptability, and completeness in clinical practice. Based on the results, a w-FCI is presented that is ready to be evaluated in both geriatric practice and prognostic research.
Patients and methods
The initial w-FCI was composed of the original index (Figure S1) supplemented with a severity rating for each of the 18 comorbidities, based on the physician’s knowledge about the comorbidities of their patients and their impact on functioning.13 This rating had four categories (Figure 1).8,12 In item 8, an extra example was included, ie, neurodegenerative disorder such as dementia was added after Parkinson’s disease, because dementia is prevalent among nursing home residents and this addition was also recommended in an earlier study.14 A three-page manual was appended as a guide in case of doubt when completing the w-FCI.
Figure 1 Rating scale for functional severity.
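The relation between the two sum scores can be illustrated with a short sketch. It assumes that the original FCI counts every condition that is present (rating above 0) and that the w-FCI is the sum of the severity ratings; only the endpoints 0 (disease absent) and 3 (severe impact on daily function) are stated in the text, so the meaning of ratings 1 and 2, the function names, and the example patient are hypothetical.

```python
from typing import Dict

# Four-category severity rating: 0 = disease absent, 3 = severe impact on
# daily function (from the text). Ratings 1 and 2 are assumed to be the
# intermediate categories shown in Figure 1.
VALID_RATINGS = {0, 1, 2, 3}


def _validate(ratings: Dict[str, int]) -> None:
    """Reject any rating outside the four-category scale."""
    for item, rating in ratings.items():
        if rating not in VALID_RATINGS:
            raise ValueError(f"Invalid rating {rating!r} for item {item!r}")


def fci_score(ratings: Dict[str, int]) -> int:
    """Unweighted FCI (assumed): number of conditions with a rating above 0."""
    _validate(ratings)
    return sum(1 for rating in ratings.values() if rating > 0)


def w_fci_score(ratings: Dict[str, int]) -> int:
    """Weighted FCI (assumed): sum of the severity ratings over all items."""
    _validate(ratings)
    return sum(ratings.values())


# Hypothetical patient with four of the index items filled out.
example = {"diabetes": 2, "arthritis": 1, "heart failure": 3, "osteoporosis": 0}
print(fci_score(example), w_fci_score(example))
```

Under these assumptions the w-FCI can never be lower than the FCI for the same patient, which agrees with the reported means (5.0 and 8.6).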
Data collection and measurements
The present study is part of the BeCaf study, a prospective multicenter cohort study.15 Sixteen physicians in training to be an elderly care physician (ECP), working in 16 nursing homes, collected data on patients under their responsibility.16,17 Eligible participants were selected when diabetes mellitus had been diagnosed. All eligible participants, their proxy, and the educational nursing homes received adequate oral and written information about the study and were given reasonable time to opt-out. Data collection included anonymous patient data and complied with the Personal Data Protection Act and the Medical Treatment Agreement Act. The study was conducted in accordance with the Declaration of Helsinki and its protocol was approved by the Medical Ethics Committee of Leiden University Medical Center.
To analyze ICCs for intra-rater reliability, comorbidity indices were completed by the same physicians at baseline and again 2 months later. This 2-month interval was considered optimal because it was short enough for the comorbidities to be stable, but long enough for physicians to have forgotten the baseline measurements.11,18–20 The Barthel index was completed by a nurse and was used to assess functional status.21
Furthermore, four different pairs of raters scored the w-FCI in a subset of patients (Table S1). The w-FCI was completed in duplicate, first by an ECP trainee and subsequently by the supervising ECP, independently from each other.16
To test usability of the w-FCI, the three-step test interview (TSTI) was conducted.22 The TSTI combines the “think aloud” and “probing” methods and “is a powerful tool with which to establish whether a measurement is filled out in a consistent way and whether the questions and tasks are understood”.23 Qualitative data were collected by four researchers (AK, LvD, MK, and MW), while interviewing experienced ECPs who worked in various types of nursing homes (Table S2). An ECP is “a medical practitioner who has specialized as a primary care expert in geriatric medicine”.16,17
Per TSTI session, an ECP filled out the index and exchanged thoughts with the researcher. The ECP was asked to verbally express all thoughts while filling out the w-FCI.22 The researchers recorded all observations, ie, the verbally expressed thoughts as well as nonverbal expressions (step 1). This was followed by a retrospective interview during which the observations were discussed (step 2), and an in-depth discussion addressed any difficulties concerning the comorbidities, the descriptions, the understanding of the content, and highlighted further considerations or opinions (step 3).
All data were processed anonymously. Inclusion of ECPs continued until data saturation was achieved. Data were recorded ad verbum for further analysis.
A statistician specialized in reliability studies advised on the appropriate sample size and assisted in analyzing the ICCs; at least 40 participants were necessary to ensure statistical power.24 SPSS version 23 was used for the analyses. The ICCs were calculated for the FCI and the w-FCI sum scores as the ratio of case variance to total variance, using a linear mixed model with the Barthel index as a fixed factor. This model adjusted for nested data and for true functional decline due to intercurrent disease. An ICC of <0.50 was deemed to represent poor, 0.50–0.74 moderate, 0.75–0.89 good, and ≥0.90 excellent agreement.25 The scores of the two different rater groups were tested for a significant difference (P<0.05) using a paired t-test. Finally, the relation between the FCI and w-FCI sum scores and the Barthel index was studied by calculating correlation coefficients (Spearman’s rho).
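As an illustration of the idea behind the ICC (between-case variance relative to total variance) and of the interpretation thresholds above, the following is a minimal sketch. It is not the analysis used in this study: it computes a simplified one-way random-effects ICC from ANOVA mean squares, without the linear mixed model or the Barthel index adjustment, and the function names and example scores are hypothetical.

```python
from typing import Sequence


def one_way_icc(scores: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for a single rating.

    scores[i] holds the k ratings (e.g. by two raters, or at two time
    points) for patient i. Raises ZeroDivisionError if all scores are equal.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("Need at least 2 patients with the same number (>= 2) of ratings")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Mean square between patients and mean square within patients.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def classify_icc(icc: float) -> str:
    """Interpretation bands as stated in the text (reference 25)."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.90:
        return "good"
    return "excellent"


# Hypothetical sum scores for three patients, each rated twice.
demo = [[1, 2], [3, 4], [5, 6]]
icc = one_way_icc(demo)
print(round(icc, 3), classify_icc(icc))
```

Applied to the values reported in this study, `classify_icc` labels 0.90 and 0.94 as excellent and 0.55 and 0.61 as moderate.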
For the qualitative part, data from the TSTIs were summarized in a table to keep track of data saturation. The content was discussed and analyzed by two researchers (AK, MK) who combined, analyzed, and structured the data into meaningful themes.
The study population consisted of 102 residents who had lived in a nursing home for an average of 21 months (Table 1); their mean age was 82.5 years and 60% were female. The median Barthel index was 8, the mean FCI score was 5.0, and the mean w-FCI score was 8.6. The mean time interval between T1 and T2 was 2.4 months. During the study, 7 patients died and 12 patients were lost to follow-up.
Table 1 Characteristics of included patients
The ICCs (intra-rater) were 0.94 for the w-FCI and 0.90 for the FCI. Duplicate comorbidity indices from a subset of 41 patients were completed and the resulting ICCs (inter-rater) were 0.55 for the w-FCI and 0.61 for the FCI. Although the mean FCI was 4.7 in both groups of raters, the mean w-FCI differed between the raters, ie, the ECP trainees assessed a mean of 8.0 and the supervising ECPs 9.3; this difference was significant (P=0.021). Spearman’s rho was −0.103 (P=0.307) between FCI and Barthel index and was −0.240 (P=0.015) for the w-FCI.
After interviewing 12 participants, data saturation was achieved and five themes were extracted.
Discrepancies due to various sources of information
Essential information was collected to decide whether a disease was present or absent. ECPs used various sources for this, ie, medical history (general practitioner), specialist letters, (electronic) patient records, and the current medication list, and also considered the results of recent interviews and physical examinations. Clinical knowledge of the patient was used to decide on the severity of the comorbidities that were present. However, the sources did not always correspond with each other. Furthermore, ECPs had experienced that, after a patient was admitted to a care home or geriatric rehabilitation facility, it could take days or weeks until the full medical history was received. One question they raised was “What is an appropriate time to complete a comorbidity index?”
Inconsistency in interpretation and deciding on presence or absence
Information from the different sources was sometimes confusing:
Sometimes the medication list includes a particular medication, whereas no matching indication can be retrieved from the medical history.
Many COPD patients have clinical symptoms of anxiety but don’t have an official diagnosis; in this case, should I decide present or absent?
Furthermore, information was sometimes interpreted in different ways. For example, if a patient had had a disease many years ago, without any residual symptoms, it was considered as currently not invalidating and therefore scored as “absent”, whereas other participants scored this as “present without causing any functional impairments”.
Experienced difficulties during the rating of functional severity
To complete the w-FCI, ECPs needed to know the patient’s medical, physical, and functional situation: ie, comorbidities and their impact. Various problems were experienced when rating the severity:
Who determines what causes functional impairment: the patient or the doctor?
I only see the more severely impaired patients – one can imagine that scoring severity depends on my frame of reference.
Severity of a disease is not static, but changes from day to day. Also, the impact on function can depend on the availability of supportive aids.
Some noted that different diseases may have the same symptoms and cause similar functional impairment, thereby affecting the choice of a rating:
How do we determine whether functional disabilities are caused by disease A or B?
Exacerbation of heart failure and COPD both cause shortness of breath, which causes functional impairment irrespective of the underlying pathophysiological etiology.
In this case, ECPs were inclined to choose “the happy medium”, ie, partly causing functional impairment. Others did not experience this difficulty and indicated that physicians are trained to evaluate symptoms and diagnose diseases; thus, a physician is the appropriate professional to decide what symptom belongs to what disease.
Acceptability and usefulness of the w-FCI
Depending on the availability of information, the conscientiousness of the ECP and the complexity of the patient’s condition, the time spent on filling out the w-FCI ranged from 4 to 13 minutes. None of the participants used the manual. ECPs who took the most time were positive about the usefulness of the w-FCI, whereas ECPs who needed the least time referred to themselves as “quick deciders” and experienced few problems. Others indicated that the w-FCI would need several adaptations to be useful in the care of older patients (see section “Considerations regarding the content and layout”). Finally, there were doubts about the usefulness of the w-FCI in long-term care practice, when gradual and progressive functional decline is expected. However, the index was seen as being potentially useful in the practice of geriatric rehabilitation, where functional recovery is expected.
Considerations regarding the content and layout
Dementia was considered an important cause of functional impairment in an older patient population. The following conditions were also suggested: fractures, liver and kidney failure, malignancies, chronic wounds, alcohol/substance abuse, and/or other psychiatric diseases. Furthermore, it was unclear whether or where diseases such as atrial fibrillation and valve dysfunction should be scored. Regarding the layout: the w-FCI did not allow scoring the primary diagnosis (main reason why the patient was admitted in the nursing home) separately from the co-existing morbidities, whereas this distinction is commonly made. Finally, because some experienced difficulty with the rating of severity, a threefold rating was suggested: (0) absent or present in medical history without any residual symptoms, (1) partly impairing function, and (2) severe impact.
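The suggested threefold rating can be read as a collapse of the original four-category scale. The sketch below shows one such mapping; it assumes that rating 1 on the four-category scale means "present without functional impairment" and rating 2 means "partly impairing function", which the text implies but does not state explicitly. The names are hypothetical.

```python
# Assumed four-category scale: 0 = absent, 1 = present without functional
# impairment, 2 = partly impairing function, 3 = severe impact.
# Suggested threefold scale: 0 = absent or present without residual
# symptoms, 1 = partly impairing function, 2 = severe impact.
FOUR_TO_THREE = {0: 0, 1: 0, 2: 1, 3: 2}


def to_threefold(rating: int) -> int:
    """Map a four-category severity rating onto the suggested threefold rating."""
    if rating not in FOUR_TO_THREE:
        raise ValueError(f"Invalid four-category rating: {rating!r}")
    return FOUR_TO_THREE[rating]


print([to_threefold(r) for r in (0, 1, 2, 3)])
```

With this mapping, raters who disagreed only on "absent" versus "present without functional impairment" would arrive at the same score.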
The w-FCI and the considerations that led to the amendments are presented in Figure 2A and B. The major amendments were as follows: COPD and asthma were combined into one pulmonary item; dementia was added to the index as a separate comorbid condition; upper gastrointestinal disease was changed into gastrointestinal disease (the lower intestinal tract was also considered important in older persons); some of the additional explanations or examples below the items were adjusted, supplemented, or removed; and some items were reordered (degenerative disc disease and obesity).
In this population of vulnerable nursing home residents characterized by diabetes, multimorbidity, and high functional dependency, the intra-rater reliability of the FCI and w-FCI was excellent, whereas the inter-rater reliability was moderate. Based on these results, we present a modified and weighted version of the FCI (Figure 2A).
Strengths and limitations
The present study has several strengths. It is the first study to add a rating based on functional impact to the FCI, whereas few of the available comorbidity indices integrate the impact of disease. Another strength is the addition of a qualitative part to gain insight into actual clinical practice and decision-making, and to extract information on factors that may have reduced reliability. To our knowledge, the TSTI method has not been used before to collect qualitative data when investigating comorbidity indices. Furthermore, this study provides insight into the clinical practice of assessing comorbidity, which enhances its external validity. However, this strength also brings some limitations: the ECPs were not trained in completing the w-FCI but received only a brief explanation and, although a manual was available, it was not used by any participant. Furthermore, deciding on “impact on function” is a relatively intuitive process and depends on the opinion of the clinician and his/her knowledge of the patient. Although providing decision rules (as in the New York Heart Association classification of heart failure) might improve reliability, such classifications are lacking for most of the diseases included in the FCI. Another limitation may be that we included only nursing home residents with diabetes, a choice made to create a more homogeneous group among the rather heterogeneous population of nursing home residents.15 We believe it is unlikely that this has influenced the reliability or usability results, and in our view the w-FCI could be used in all older patients. Finally, an unexpected finding was that the supervising ECPs gave a higher overall w-FCI sum score than the trainees. However, a difference of 1.3 does not necessarily indicate a clinical difference.26 In this context, the following limitation needs to be considered: in our study, the raters for inter-rater reliability could only be the ECP trainee and the supervising ECP, because the w-FCI needs to be completed by someone who has insight into the patient’s diseases and functioning. This condition limits who is eligible to fill out the w-FCI. A possible explanation for the significant difference might be that trainees usually discuss the medical problems with their supervisor and less often the patient’s successful recovery or positive well-being. As a result, supervisors may have scored a more severe impact.
Interpretation of findings
The reason why both indices had moderate inter-rater reliability is probably related to our study design, ie, using a variety of sources from which comorbidities were extracted rather than related to the severity-weighted rating. Our reliability results are in line with those of an earlier study that investigated the reliability of the ICED (a comparable comorbidity index).11 Completion of the ICED requires training; however, in that study, despite using a 20-page manual, the ICCs still ranged from 0.35 to 0.71. Moreover, no improvement in reliability was achieved after extra training of the raters.11 In the present study, none of the physicians used the three-page manual, which may be understandable bearing in mind that: “an index has to be simple to use and not be stressful in any … time consuming way, to be useful in practice”.27
The inter-rater reliability of the FCI was lower than that in a study investigating patients with acute lung injury (ICC: 0.91).26 However, these two studies clearly differ in design and population, eg, comorbidity and age differed widely (in the present study the mean FCI was 5, compared with 1 in the earlier study). Furthermore, in that study the comorbidities were extracted from one retrospective record: an electronic hospital discharge summary.26 Although using one record as the sole source of information may improve reliability (higher ICCs), it is less representative of clinical practice. The present study aimed to investigate reliability in the practice of a nursing home. The results of the correlation analysis suggest that the w-FCI is more strongly correlated with function than the FCI, although the effect sizes are rather small. This result is in line with some studies, but a higher correlation between comorbidity and function was found in others.19,28–30
Our second aim was to study the feasibility, acceptability, and completeness of the w-FCI. The five themes that emerged provided insight into its usability, ie, the ability to complete the index, its usefulness, and its imperfections.
Sources of information: Information from different sources did not always fully match or provided conflicting information on the presence/absence of diseases. This may lead to different scores on the index, for both the FCI and w-FCI. This difficulty applies to all comorbidity assessments when various sources of information are used. Moreover, in daily practice a patient file always consists of different medical sources (eg, medication list, specialist letters, GP medical history, and recent laboratory results).
Presence of comorbidity: Even when the medical history was conclusive, the ECPs could differ in their opinion, mainly when residual symptoms were absent. To address this, some ECPs suggested that a threefold rating would be more practical: ie, rating “zero” for disease absence as well as for diseases without impact on function (ie, without residual symptoms).
Severity rating: Completing the w-FCI requires knowledge of the patient’s medical and functional status. Some inconsistencies emerged that may complicate rating the impact of a disease on function and, therefore, contribute to disagreement. First, severity may be dynamic and change over time, eg, due to the nature of disease progression, or due to the relief of symptoms after successful treatment. In addition, severity can also depend on the environment, eg, the availability of effective supportive aids and social support. Furthermore, who should decide on severity: the doctor or the patient? Originally, the FCI was designed as a self-report index. However, in another study (by the same author) the FCI was completed by research nurses.13,31 In the present study, due to the high prevalence of cognitive impairment in the study population, the w-FCI was not self-reported but was completed by a physician. Finally, some ECPs experienced difficulty in distinguishing between different diseases that may cause similar symptoms and/or impairments. However, others were of the opinion that a physician is specifically trained to recognize diagnoses and to differentiate between symptoms and diseases and thus has the necessary skills to fill out the w-FCI. Although rating the severity of a disease is more complex than registering its presence, physicians recognize its importance in relation to functional recovery. In an earlier study, the opinions of various experts in the area of hip fracture and functional recovery were surveyed. For 11 of the 18 FCI comorbidities, a consensus of >85% on the importance of severity was observed.14 Furthermore, the concept of “functional severity” was described as early as 1987 as “the impact of a disorder on an individual’s ability to perform age-appropriate activities”. That publication stresses that “persons with equal physiological or morphological disorders may vary widely in the impairments they experience” and “functional severity relates to a person rather than to an organ system”.32
Acceptability, usefulness, and content: We consider the amount of time needed to complete the w-FCI acceptable. Although the majority found completing the list to be feasible, they thought the content needed to be adapted to be useful in an older patient population. Dementia is probably the most important comorbidity to be added to the modified index, because it affects functional abilities and is prevalent in older persons. Another study also stressed the importance of dementia in the FCI.14 The authors of that study also reported that the majority of practitioners suggested that “upper gastrointestinal disease” was not related to physical function (neither its presence nor its severity). We argue that changing “upper gastrointestinal” into “gastrointestinal” would be more suitable, since bowel disease (eg, constipation) is prevalent in older patients.33 The decision to combine COPD and asthma was based on their prevalence in the cohort. A declining prevalence of asthma and an increasing prevalence of COPD with advancing age have been described.34 Furthermore, we could not find convincing supportive literature for the other suggestions (kidney and liver failure, malignancies, substance abuse, and chronic wounds). Kidney failure and chronic wounds, at least, can be considered in the severity-rated part of the w-FCI when they are a consequence of peripheral vascular disease or diabetes, but further research will be needed to determine whether additional comorbidities, in relation to function, should be included in the index. This could be investigated using a survey method or Delphi procedure that focuses on this specific question.
Conclusion and implications
In this study, the intra-rater reliability of the FCI and w-FCI was excellent, whereas the inter-rater reliability was moderate. Based on the results of this study, we modified the initial w-FCI into a definitive w-FCI that is acceptable and feasible for use in a vulnerable older patient population. The w-FCI presented here allows the impact of comorbidities in older patients to be evaluated and may be used for comprehensive geriatric assessment, eg, in post-acute care and geriatric rehabilitation. However, the predictive validity of this modified index needs further investigation.
Laurens, a nursing care organization in Rotterdam, supported this work. The authors thank Ron Wolterbeek for his supervision regarding the statistical analyses and Laraine Visser for language editing.
ADK, MvE, RvB, and WPA contributed to the study concept and design. ADK, LvD, MW, and MK performed the subject and data analysis. ADK, MvE, LvD, MW, MK, RvB, and WPA interpreted the data. ADK, LvD, MW, and MK helped in the preparation of the manuscript. MvE, RvB, and WPA helped in the review of the manuscript. All authors contributed toward data analysis, drafting and critically revising the paper, gave final approval of the version to be published, and agree to be accountable for all aspects of the work.
The authors report no conflicts of interest in this work.
Boeckxstaens P, Vaes B, Legrand D, Dalleur O, de Sutter A, Degryse JM. The relationship of multimorbidity with disability and frailty in the oldest patients: a cross-sectional analysis of three measures of multimorbidity in the BELFRAIL cohort. Eur J Gen Pract. 2015;21(1):39–44.
Ferriero G, Franchignoni F, Benevolo E, Ottonello M, Scocchi M, Xanthi M. The influence of comorbidities and complications on discharge function in stroke rehabilitation inpatients. Eura Medicophys. 2006;42(2):91–96.
Bertozzi B, Barbisoni P, Franzoni S, Rozzini R, Frisoni GB, Trabucchi M. Factors related to length of stay in a geriatric evaluation and rehabilitation unit. Aging. 1996;8(3):170–175.
Di Bari M, Virgillo A, Matteuzzi D, et al. Predictive validity of measures of comorbidity in older community dwellers: the Insufficienza Cardiaca negli Anziani Residenti a Dicomano Study. J Am Geriatr Soc. 2006;54(2):210–216.
Rozzini R, Frisoni GB, Ferrucci L, et al. Geriatric Index of Comorbidity: validation and comparison with other measures of comorbidity. Age Ageing. 2002;31(4):277–285.
de Groot V, Beckerman H, Lankhorst GJ, Bouter LM. How to measure comorbidity. a critical review of available methods. J Clin Epidemiol. 2003;56(3):221–229.
Kabboord AD, van Eijk M, Fiocco M, van Balen R, Achterberg WP. Assessment of comorbidity burden and its association with functional rehabilitation outcome after stroke or hip fracture: a systematic review and meta-analysis. J Am Med Dir Assoc. 2016;20(16):30306–30301.
Bayliss EA, Ellis JL, Steiner JF. Subjective assessments of comorbidity correlate with quality of life health outcomes: initial validation of a comorbidity assessment instrument. Health Qual Life Outcomes. 2005;3(51):51.
National Institute for Health and Care Excellence. NICE Guideline Multimorbidity: clinical assessment and management; 2016. Available from: https://www.nice.org.uk/guidance/ng56/chapter/Recommendations. Accessed April 24, 2018.
Miller MD, Paradis CF, Houck PR, et al. Rating chronic medical illness burden in geropsychiatric practice and research: application of the Cumulative Illness Rating Scale. Psychiatry Res. 1992;41(3):237–248.
Imamura K, McKinnon M, Middleton R, Black N. Reliability of a comorbidity measure: the Index of Co-Existent Disease (ICED). J Clin Epidemiol. 1997;50(9):1011–1016.
Greenfield S, Sullivan L, Dukes KA, Silliman R, D’Agostino R, Kaplan SH. Development and testing of a new measure of case mix for use in office practice. Med Care. 1995;33(4 Suppl):47–55.
Groll DL, To T, Bombardier C, Wright JG. The development of a comorbidity index with physical function as the outcome. J Clin Epidemiol. 2005;58(6):595–602.
Hoang-Kim A, Busse JW, Groll D, Karanicolas PJ, Schemitsch E. Co-morbidities in elderly patients with hip fracture: recommendations of the ISFR-IOF hip fracture outcomes working group. Arch Orthop Trauma Surg. 2014;134(2):189–195.
Kromhout MA, van Eijk M, Pieper MJC, Chel VGM, Achterberg WP, Numans ME. BeCaf study: caffeine and behaviour in nursing homes, a study protocol and EBM training program. Neth J Med. 2018;76(3):138–140.
Koopmans RT, Lavrijsen JC, Hoek JF, Went PB, Schols JM. Dutch elderly care physician: a new generation of nursing home physician specialists. J Am Geriatr Soc. 2010;58(9):1807–1809.
Koopmans R, Pellegrom M, van der Geer ER. The Dutch move beyond the concept of nursing home physician specialists. J Am Med Dir Assoc. 2017;18(9):746–749.
Fortin M, Hudon C, Dubois MF, Almirall J, Lapointe L, Soubhi H. Comparative assessment of three different indices of multimorbidity for studies on health-related quality of life. Health Qual Life Outcomes. 2005;3(74):74.
Extermann M. Measurement and impact of comorbidity in older cancer patients. Crit Rev Oncol Hematol. 2000;35(3):181–200.
Crabtree HL, Gray CS, Hildreth AJ, O’Connell JE, Brown J. The Comorbidity Symptom Scale: a combined disease inventory and assessment of symptom severity. J Am Geriatr Soc. 2000;48(12):1674–1678.
Collin C, Wade DT, Davies S, Horne V. The Barthel ADL Index: a reliability study. Int Disabil Stud. 1988;10(2):61–63.
Pool JJ, Hiralal SR, Ostelo RW, van der Veer K, de Vet HC. Added value of qualitative studies in the development of health related patient reported outcomes such as the Pain Coping and Cognition List in patients with sub-acute neck pain. Man Ther. 2010;15(1):43–47.
de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine: A Practical Guide. 1st ed. New York: Cambridge University Press; 2011.
Donner A, Eliasziw M. Sample size requirements for reliability studies. Stat Med. 1987;6(4):441–448.
Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163.
Fan E, Gifford JM, Chandolu S, Colantuoni E, Pronovost PJ, Needham DM. The functional comorbidity index had high inter-rater reliability in patients with acute lung injury. BMC Anesthesiol. 2012;12(21):21.
Bouter LM. Leerboek Epidemiologie. 7th ed. Houten: Bohn Stafleu van Loghum; 2016.
Levine CG, Davis GE, Weaver EM. Functional Comorbidity Index in chronic rhinosinusitis. Int Forum Allergy Rhinol. 2016;6(1):52–57.
Mandelblatt JS, Bierman AS, Gold K, et al. Constructs of burden of illness in older patients with breast cancer: a comparison of measurement methods. Health Serv Res. 2001;36(6 Pt 1):1085–1107.
Levine CG, Weaver EM. Functional comorbidity index in sleep apnea. Otolaryngol Head Neck Surg. 2014;150(3):494–500.
Groll DL, Heyland DK, Caeser M, Wright JG. Assessment of long-term physical function in acute respiratory distress syndrome (ARDS) patients: comparison of the Charlson Comorbidity Index and the Functional Comorbidity Index. Am J Phys Med Rehabil. 2006;85(7):574–581.
Stein RE, Gortmaker SL, Perrin EC, et al. Severity of illness: concepts and measurements. Lancet. 1987;2(8574):1506–1509.
Andy UU, Vaughan CP, Burgio KL, Alli FM, Goode PS, Markland AD. Shared risk factors for constipation, fecal incontinence, and combined symptoms in older U.S. adults. J Am Geriatr Soc. 2016;64(11):e183–e188.
Oraka E, Kim HJ, King ME, Callahan DB. Asthma prevalence among US elderly by age groups: age still matters. J Asthma. 2012;49(6):593–599.
Table S1 Characteristics of the rater pairs
Table S2 Characteristics of participants in the TSTI