Selecting score types for longitudinal evaluations: the responsiveness of the Comprehensive Developmental Inventory for Infants and Toddlers in children with developmental disabilities

Yu-Pei Tsai; Li-Chen Tung; Ya-Chen Lee; Yu-Lin Wang; Yun-Shan Yen; Kuan-Lin Chen

doi:10.2147/NDT.S99171

Neuropsychiatric Disease and Treatment, Volume 12

Original Research


Received 26 October 2015

Accepted for publication 1 March 2016

Published 4 May 2016 Volume 2016:12 Pages 1103–1109

Editor who approved publication: Professor Wai Kwong Tang

Yu-Pei Tsai,¹,² Li-Chen Tung,¹,³ Ya-Chen Lee,⁴ Yu-Lin Wang,¹,⁵ Yun-Shan Yen,¹ Kuan-Lin Chen⁴,⁶

¹Department of Physical Medicine and Rehabilitation, Chi-Mei Medical Center, Tainan, ²Department of Special Education, National Chiayi University, Chiayi, ³School of Medicine, Chung Shan Medical University, Taichung, ⁴Department of Occupational Therapy, College of Medicine, National Cheng Kung University, ⁵Department of Sports Management, Chia Nan University of Pharmacy and Science, ⁶Department of Physical Medicine and Rehabilitation, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan

Objective: The objective of this study was to examine the responsiveness of the Comprehensive Developmental Inventory for Infants and Toddlers (CDIIT) in children with developmental disabilities (DD).
Methods: Responsiveness, the ability of a measure to detect change over time, is fundamental to an outcome measure. We compared the responsiveness of four types of scores (ie, raw scores, developmental ages [DAs], percentile ranks [PRs], and developmental quotients [DQs]) in the five subtests of the CDIIT. The CDIIT was administered three times at intervals of 3 months to 32 children with DD aged between 5 months and 64 months (mean =30.6 months, standard deviation [SD] =17.8 months). The CDIIT is a pediatric norm-referenced assessment commonly used for clinical diagnosis of developmental delays in five developmental areas: cognition, language, motor, social, and self-care skills. Responsiveness was analyzed using three methods: effect size, standardized response mean, and paired t-test.
Results: The effect size results showed that at the 3-month and 6-month follow-ups, responsiveness was small or moderate in the raw scores and DAs of most of the subtest scores of the CDIIT, but the level of responsiveness varied in the PRs and DQs. The standardized response mean results showed that most of the subtest scores of the CDIIT had moderate responsiveness at the 3-month follow-up and large responsiveness at the 6-month follow-up in raw scores and DAs, but the responsiveness varied (from none to large) in PRs and DQs.
Conclusion: The findings generally support the use of the CDIIT as an outcome measure. We also suggest using the raw scores and DAs when using a norm-referenced pediatric developmental assessment to evaluate developmental changes and program effectiveness in children with DD.

Keywords: responsiveness, developmental assessment, developmental disabilities

Introduction

Developmental disabilities (DD) are a group of chronic conditions that are attributable to physical and mental impairments during the developmental period.¹ Common examples are intellectual disability, cerebral palsy, and autism spectrum disorder.² Children with DD often manifest lifelong disabilities in cognition, language, motor, social, and self-care skills.² According to a report from the US Centers for Disease Control and Prevention, the prevalence of DD is approximately one in six, which means that ~15% of children aged 3–17 years have one or more DD.³ These children have various degrees of severity and need coordinated services for their special health care, education, and social welfare, such as early intervention and continuing special education. Therefore, a comprehensive measure is warranted to detect the area and extent of developmental delays, to evaluate the effectiveness of the early interventions or education programs, and to predict the prognosis and needs for future health care and services in children with DD.

The Comprehensive Developmental Inventory for Infants and Toddlers (CDIIT)⁴ is specifically designed for infants and children aged 3–71 months. It is commonly used to assess five important developmental areas: cognition, language, motor, social, and self-care skills. The CDIIT was designed to be used as a diagnostic and screening test to identify strengths and weaknesses in the five developmental areas and to establish developmental levels.⁴ The CDIIT is included among the recommended measures for children with DD in child developmental centers in Taiwan because of its comprehensive coverage of pediatric development, concrete and interesting materials, complete norm establishment, and clinical applicability. The CDIIT has been shown to be psychometrically sound, with good internal consistency, test–retest and interrater reliabilities, construct validity, concurrent validity, predictive validity, and diagnostic accuracy,^4–10 and may have the potential for use as an outcome measure to assess and monitor developmental skills when children with DD are the subjects of intervention.

Responsiveness is fundamental to an outcome measure for detecting changes over time (its evaluative purpose).^11,12 The responsiveness of a measure is its ability to detect change over time, especially in response to an intervention.^11,13 Therefore, in both clinical practice and research, an outcome measure must have sufficient responsiveness to detect treatment effects.^11,13–15 However, the responsiveness of the CDIIT has yet to be established, so the potential of the CDIIT for use as an outcome measure for evaluating children’s development and treatment effects longitudinally is unknown.

The purpose of this study was to examine the responsiveness of the CDIIT longitudinally and thoroughly in children with DD. We compared the responsiveness of the four types of CDIIT scores: the raw scores, developmental ages (DAs), percentile ranks (PRs), and developmental quotients (DQs). The results may serve as a reference in determining which scores to use as outcome indicators of the CDIIT and also as a reference for choosing scores for a norm-referenced pediatric developmental assessment.

Methods

Participants

A total of 32 children with DD aged between 5 months and 64 months were recruited from the Child Development and Assessment Center of the Chi Mei Medical Center in Taiwan between March 2012 and December 2014. These children were receiving early intervention programs at the time of the study. Each early intervention program was individualized on the basis of the child’s assessment results, including observation of free play or play in a group, interviews of the caregiver, and standard assessment tools, but did not target specific activities in the CDIIT. Written informed consent was given by their primary caregivers, and the Institutional Review Board of the Chi Mei Medical Center approved the protocol for this study.

Measures

Comprehensive Developmental Inventory for Infants and Toddlers

The CDIIT consists of two parts: the diagnostic test (CDIIT-DT) and the screening test (CDIIT-ST).^4,5 Only the CDIIT-DT was used in this study. The CDIIT-DT includes five subtests and a behavior rating scale for assessing a child’s developmental capacities and behavioral characteristics in five developmental areas: cognition, language, motor, social, and self-care skills. The cognition subtest assesses a child’s mental capacities, including attention; perception; memory; reasoning; and concepts of color, shape, size, and number. The language subtest consists of expression and comprehension subdomains. The motor subtest includes two subdomains: gross motor and fine motor. The gross motor subdomain includes items to assess gravity compensation, locomotion, and body-movement coordination, while the fine motor subdomain includes items for basic hand use and visual-motor coordination. The social subtest has sections on interpersonal communication, affection, personal responsibility, and environmental adaptation. The self-help subtest comprises items about feeding, dressing, and hygiene skills.

Every item on the CDIIT-DT is scored 0 or 1, respectively, indicating whether the child “fails” or “passes” that item. Scores can be assigned based on clinical testing or home observation by the caregivers. In the present study, items in the cognition and motor subtests and part of the language subtest were individually and directly assessed by a trained administrator. The social and self-help subtests were scored by the primary caregivers. Based on the CDIIT manual, the raw score of each subtest and total score can be transformed into three other types of scores: DA, PR, and DQ. Altogether, the four types of scores were obtained for each subtest, for the gross motor and fine motor subdomains, and for the whole test.

As regards the reliability of the CDIIT-DT, the internal consistency,⁹ test–retest reliability (intraclass correlation coefficient =0.76–1.00), and interrater reliability (intraclass correlation coefficient =0.76–1.00) of the subtests and composites are good.⁵ The CDIIT-DT has good accuracy, similar to that of the Peabody Developmental Motor Scales – Second Edition for motor development evaluation in preschool children.⁷ With respect to its validity, the construct validity, concurrent validity, and predictive validity have all been supported. The construct validity has been validated with exploratory factor analysis.⁸ Regarding the concurrent validity, the scores of the CDIIT subtests have been shown to be significantly and moderately correlated with the scores of the Bayley Scales of Infant Development-II in preterm and full-term infants.^6,10 In addition, the CDIIT also has fairly good predictive validity for diagnostic results and later school performance or special education needs, as measured by the Child Problems Referral Survey and Preschool Children Development Checklist.¹⁶

Procedures

The children were administered the CDIIT three times at intervals of 3 months by trained administrators in clinical settings. The administrators were occupational therapists, physical therapists, speech therapists, and psychologists, all of whom were trained in the standard procedures of the developmental center. Demographic information was collected from the caregivers of the children, and the administrators, children, and caregivers were blinded to the purpose of the study.

Statistical analysis

Children’s CDIIT raw scores were transformed into DAs, PRs, and DQs according to the norms of normally developing children presented in the original manual. The demographic properties of the participants and the CDIIT scores were then characterized with descriptive analysis. The four types of scores (ie, raw scores, DAs, PRs, and DQs) were used for analyzing the responsiveness.

The responsiveness of the CDIIT was examined with the effect size (ES), standardized response mean (SRM), and paired t-test. All statistical analyses were performed using SPSS 17.0 (SPSS Inc., Chicago, IL, USA).

Effect size

The ES, a measure of change, is calculated by dividing the mean difference between baseline and follow-up measurements by the pooled SD of the baseline and follow-up measurements.¹⁷ Values of 0.20, 0.50, and 0.80 indicate small, moderate, and large ES, respectively.¹⁸
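As an illustration only, the ES calculation described above can be sketched in Python. The pooled SD here is assumed to be the square root of the average of the two sample variances (a common equal-n form; the exact pooling formula is not stated in the text), and the scores are hypothetical values, not study data.

```python
import math
import statistics


def effect_size(baseline, follow_up):
    """ES = mean change divided by the pooled SD of baseline and follow-up."""
    mean_change = statistics.mean(follow_up) - statistics.mean(baseline)
    # Assumption: equal-n pooled SD, i.e. sqrt of the mean of the two sample variances.
    pooled_sd = math.sqrt(
        (statistics.variance(baseline) + statistics.variance(follow_up)) / 2
    )
    return mean_change / pooled_sd


# Hypothetical raw scores for five children (illustrative, not study data).
baseline = [10, 14, 18, 22, 26]
follow_up = [12, 15, 21, 24, 28]

es = effect_size(baseline, follow_up)
print(f"ES = {es:.2f}")
```

Because the denominator reflects the spread of scores between children, a sample that is heterogeneous in age or ability would tend to yield a small ES even when every child improves.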

Standardized response mean

The SRM is the mean difference in the scores of two consecutive measurements divided by the SD of that difference.¹⁹ Thus, the SRM gives an estimate of change in the measure that is standardized relative to the variability of change scores. As with ES, values of 0.20, 0.50, and 0.80, respectively, are considered to show small, moderate, and large responsiveness.¹⁸
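A minimal sketch of the SRM and of the threshold-based labels follows. The function names and the scores are hypothetical, and treating 0.20, 0.50, and 0.80 as lower bounds of the small, moderate, and large bands is an assumption about how the cutoffs are applied.

```python
import statistics


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the paired change scores divided by their SD."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


def classify(value):
    """Label an ES or SRM value, assuming the cutoffs are lower bounds."""
    magnitude = abs(value)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "small"
    return "not responsive"


# Hypothetical raw scores for five children (illustrative, not study data).
baseline = [10, 14, 18, 22, 26]
follow_up = [12, 15, 21, 24, 28]

srm = standardized_response_mean(baseline, follow_up)
print(f"SRM = {srm:.2f} ({classify(srm)})")
```

In this toy sample every child gains a similar amount, so the SD of the change scores is small and the SRM is large even though the mean change is only two points.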

Paired t-test

The statistical significance of the change in scores was determined using the paired t-test.²⁰ The alpha level was set at 0.05.
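The paired t statistic can be sketched in the same way. This example uses hypothetical scores and computes only the statistic and its degrees of freedom; obtaining a P-value would additionally require the t distribution from a statistics package, which is not shown here.

```python
import math
import statistics


def paired_t(baseline, follow_up):
    """Return the paired t statistic and its degrees of freedom."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    n = len(changes)
    mean_change = statistics.mean(changes)
    sd_change = statistics.stdev(changes)
    # t = mean change divided by the standard error of the change scores.
    t_stat = mean_change / (sd_change / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical raw scores for five children (illustrative, not study data).
baseline = [10, 14, 18, 22, 26]
follow_up = [12, 15, 21, 24, 28]

t_stat, df = paired_t(baseline, follow_up)
print(f"t({df}) = {t_stat:.2f}")
```

Algebraically, the paired t statistic equals the SRM multiplied by the square root of the sample size, so for a fixed sample the two indices order the score types identically.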

Results

Participant characteristics

A total of 32 children with DD (23 boys and nine girls) ranging in age from 5 months to 64 months (mean: 30.6 months, SD: 17.8 months) and their caregivers participated in the study. The diagnoses of the children with disabilities consisted of psychomotor retardation (n=18), cerebral palsy (n=6), attention deficit hyperactivity disorder (n=4), Prader–Willi syndrome (n=2), Rubinstein–Taybi syndrome (n=1), and Marfan syndrome (n=1). The characteristics of the 32 children are presented in Table 1. Table 2 presents the mean and SD of the raw scores, the DAs, the PRs, and the DQs for each subtest of the CDIIT.

Table 1 Sample characteristics of children with developmental delays (N=32)
Abbreviations: SD, standard deviation; M, male; F, female.

Table 2 Descriptive statistics of the CDIIT for the children with developmental delays (N=32)
Notes: ^aMean difference (3-month follow up – baseline). ^bMean difference (6-month follow up – baseline).
Abbreviations: CDIIT, Comprehensive Developmental Inventory for Infants and Toddlers; DA, developmental age; PR, percentile rank; DQ, developmental quotient; SD, standard deviation.

Responsiveness

Table 3 shows the responsiveness for the four types of scores (raw scores, DAs, PRs, and DQs) for each subtest of the CDIIT.

Table 3 The responsiveness of the CDIIT: ES, SRM, and paired t-test (N=32)
Note: *P<0.05.
Abbreviations: CDIIT, Comprehensive Developmental Inventory for Infants and Toddlers; ES, effect size; SRM, standardized response mean; DA, developmental age; PR, percentile rank; DQ, developmental quotient.

Effect size

At 3-month follow-up, all the subtests had small responsiveness in the raw scores, except for the language subtest, which was not responsive. Regarding the DAs, all the subtests had small responsiveness (0.21–0.30). However, in the PRs, only the language and the motor subtests had small responsiveness. The other subtests were not responsive in the PRs. For the DQs, only the motor (0.30) and self-care (0.23) subtests had small responsiveness.

At 6-month follow-up, all the subtests had small responsiveness in the raw scores (0.34–0.47). Regarding the DAs, all the subtests had small responsiveness (0.32–0.45), although it was greater than that at 3-month follow-up. In the PRs, the social and self-care subtests were not responsive (0.04 and 0.19, respectively), the cognition and language subtests had small responsiveness (0.43 and 0.38, respectively), and the motor subtest had moderate responsiveness (0.66). For the DQs, the social and self-care subtests were not responsive (0.10 and 0.12, respectively), and the other three subtests had small responsiveness (0.24–0.48).

Standardized response mean

At 3-month follow-up, with regard to the SRMs of the raw scores of the subtests, the social and self-care subtests had small responsiveness (0.39 and 0.48), the language subtest had moderate responsiveness (0.55), and the cognition and motor subtests had large responsiveness (1.22 and 1.24). For the DAs, all subtests had at least moderate responsiveness; the cognition and motor subtests had large responsiveness (1.07 and 1.13) and the other three had moderate responsiveness (0.62–0.78). As regards the SRMs of the PRs, only the language and motor subtests were responsive (0.34 and 0.36). For the DQs, the cognition, motor, and self-care subtests were responsive (0.20–0.29) and the other two were not responsive (0.17 and 0.19).

At 6-month follow-up, the SRMs of the raw scores showed extremely large responsiveness for all the subtests (1.38–1.96), except for the social and self-care subtests, which were moderately responsive (0.57 and 0.65). For the DAs, all the subtests had at least moderate responsiveness; the cognition, language, and motor subtests had extremely large responsiveness (1.29–1.68) and the other two had moderate responsiveness (0.72 and 0.78). As regards the SRMs of the PRs, the motor subtest was moderately responsive (0.54) and the cognition and language subtests had small responsiveness (0.49 and 0.35). The other two were not responsive (0.04 and 0.19). For the DQs, the cognition subtest was moderately responsive (0.54) and the language and motor subtests had small responsiveness (0.26 and 0.44). The other two were not responsive (0.12 and 0.14).

Paired t-test

At 3-month follow-up, all the changes in subtest scores were significant (P<0.01) for the raw scores and DAs, but not for the PRs and DQs. Furthermore, at 6-month follow-up, the results were similar to those at 3-month follow-up, but with additional significant changes in the PRs and DQs in the cognition and language subtests.

Discussion

We believe that this is the first study to examine the responsiveness of the CDIIT in children with DD. In this study, the responsiveness of the CDIIT was thoroughly analyzed. The raw scores, DAs, PRs, and DQs were examined with three statistical methods of responsiveness. Regarding the variability of the scores of the initial assessment (ES), the results of the 3-month and 6-month follow-ups showed that most of the subtest scores of the CDIIT had small responsiveness in raw scores and DAs, but the responsiveness varied in PRs and DQs. Regarding the variability of the change scores (SRM), the results of the 3-month and 6-month follow-ups showed that most of the subtest scores of the CDIIT had moderate and large responsiveness, respectively, in raw scores and DAs, but the responsiveness varied (from none to large) in PRs and DQs. These findings about responsiveness support the use of the raw scores and DAs of the CDIIT by clinicians and researchers as outcome indicators to track change over time and to evaluate program effectiveness and developmental changes for children with DD.

Based on the results, both the raw scores and the DAs of the CDIIT are suggested for evaluative purposes because the two types of scores have different purposes. A raw score represents how many items a child passes (1) or fails (0) in a subtest. The DA refers to a child’s level of development within a subtest.²¹ Therefore, changes in raw scores reflect the degree to which the child has mastered items of functional skills and behaviors in relation to the results of a previous assessment. On the other hand, a change in DA reflects the degree to which the level of development has changed in the intervening time between repeated assessments. Therefore, both the raw scores and the DAs can be used for different purposes to track changes in children’s performance over time, depending on the focus on mastery or development of skills/behaviors.

In this study, the PRs and DQs were less responsive than the raw scores and DAs or not responsive at all. Thus, the PR of the CDIIT is not recommended for use as an outcome measure. The PR of a score is the percentage of scores in the normative sample that are the same as or lower than it, which might explain why PRs were less responsive to children’s changes. The functional performance and behaviors of the children with DD did improve, possibly due to intervention or normal development, as indicated by the raw scores and DAs. However, these improvements of the children with DD did not surpass those of normally developing children of the same age in the normative sample provided in the CDIIT manual.⁴
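The following sketch illustrates this reasoning with invented numbers. It takes a PR to be the percentage of normative scores at or below the child's score (an assumed, common definition), and the norm samples are hypothetical values, not the CDIIT norms.

```python
def percentile_rank(score, norm_scores):
    """Percentage of normative scores that are at or below the given score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical same-age norm samples (illustrative, not CDIIT norms).
norms_at_baseline = [20, 24, 28, 32, 36, 40, 44, 48, 52, 56]
# Assumption: typically developing peers also gain 4 raw-score points.
norms_at_follow_up = [s + 4 for s in norms_at_baseline]

child_baseline, child_follow_up = 24, 28  # the child's raw score rises by 4

pr_before = percentile_rank(child_baseline, norms_at_baseline)
pr_after = percentile_rank(child_follow_up, norms_at_follow_up)
print(child_follow_up - child_baseline, pr_after - pr_before)
```

In this toy case the raw score registers a gain of four points while the PR does not move, because the comparison group advanced by the same amount over the interval.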

Because no single index of responsiveness is well accepted for evaluative purposes,¹⁸ especially in the pediatric field, we used three indices to examine the responsiveness of the CDIIT. We found that in general, the values of ES were smaller than those of SRM. This systematic difference can be ascribed to the different denominators of the formulas. The formula for ES is ES = X_change/SD_pooled; its counterpart for SRM is SRM = X_change/SD_change. The two formulas have the same numerator (X_change, the mean change between baseline and follow-up measurements). The denominators, however, are different. That for ES is the pooled SD of the baseline and follow-up measurements (SD_pooled), while that for SRM is the SD of the change in scores (SD_change). In our study, SD_change was smaller than SD_pooled for every type of score (raw scores, DAs, PRs, and DQs). Thus, in our study, the ES values were smaller than the SRM values. From these observations, it appears that multiple indices should be used to examine the responsiveness of a measure for better interpretation in different contexts.¹⁸
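The relation between the two denominators can be made explicit with a short derivation. The symbols SD_1, SD_2 (the SDs at baseline and follow-up) and r (the correlation between baseline and follow-up scores) are introduced here for illustration and do not appear in the original analysis; the simplification assumes equal SDs at the two time points.

```latex
\begin{align*}
SD_{\text{change}}^{2} &= SD_{1}^{2} + SD_{2}^{2} - 2\,r\,SD_{1}\,SD_{2} \\
\text{If } SD_{1} = SD_{2} = SD:\quad
SD_{\text{pooled}} &= SD, \qquad SD_{\text{change}} = SD\,\sqrt{2\,(1 - r)} \\
\frac{\mathrm{SRM}}{\mathrm{ES}} &= \frac{SD_{\text{pooled}}}{SD_{\text{change}}} = \frac{1}{\sqrt{2\,(1 - r)}}
\end{align*}
```

Under this simplification the SRM exceeds the ES whenever r is greater than 0.5, which is plausible for repeated developmental assessments of the same children a few months apart.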

One possible explanation exists for why the responsiveness of the social and self-care subtests was generally smaller than that of the other three subtests, especially in the PRs and DQs. The social and self-care subtests assess composite skills, which are comparatively advanced and build on the fundamental component skills in the other three developmental areas (cognition, language, and motor). Component skills must first improve and be integrated before advanced skills can improve. Therefore, children’s social and self-care skills are unlikely to improve a great deal in a short period of time (eg, 6 months in this study) along with the other three developmental areas. The children in our study, as expected, improved by a smaller amount in the development of social and self-care skills.

Limitations

This study has several limitations. First, the children with DD with various diagnoses were recruited from a single medical center in southern Taiwan, so the representativeness of our sample was limited. Second, we did not examine whether differential responsiveness existed in subgroups with different diagnoses because of the small sample size. The responsiveness of the CDIIT may require further investigation in specific groups of children with DD or with a population-based sample. Third, despite the interval of 3 months between assessments, the possibility of a practice effect cannot be excluded nor can the possible inflation of the responsiveness of the CDIIT as a result of this effect. Fourth, although the interrater reliability has been examined in the clinical setting,⁵ it was not specifically examined in the present study. The fifth limitation is the small sample size of this pilot study. Therefore, additional studies in a larger cohort with equal representation across diagnostic categories need to be carried out to generalize the findings and recommendations.

Conclusion

Our results revealed that the CDIIT was responsive in terms of raw scores and DAs, and they supported the use of the CDIIT as an outcome measure for assessing the developmental areas at intervals of 3 months and 6 months in children with DD. In addition, the raw scores and DAs are suggested for evaluative purposes in norm-referenced pediatric developmental measures. Additional studies with a larger sample size are needed to support our findings.

Disclosure

The authors report no conflicts of interest in this work.

References

1.		Yeargin-Allsopp M, Murphy CC, Oakley GP, Sikes RK. A multiple-source method for studying the prevalence of developmental disabilities in children: the metropolitan Atlanta developmental disabilities study. Pediatrics. 1992;89(4 pt 1):624–630.
2.		Boulet SL, Boyle CA, Schieve LA. Health care use and health and functional impact of developmental disabilities among US children, 1997–2005. Arch Pediatr Adolesc Med. 2009;163(1):19–26.
3.		Boyle CA, Boulet S, Schieve LA, et al. Trends in the prevalence of developmental disabilities in US children, 1997–2008. Pediatrics. 2011;127(6):1034–1042.
4.		Wang TM. The Comprehensive Developmental Inventory for Infants and Toddlers – Manual. Taipei: Special Education Division, Ministry of Education; 2003.
5.		Liao HF, Pan YL. Test-retest and inter-rater reliability for the comprehensive developmental inventory for infants and toddlers diagnostic and screening tests. Early Hum Dev. 2005;81(11):927–937.
6.		Liao HF, Wang TM, Yao G, Lee WT. Concurrent validity of the comprehensive developmental inventory for infants and toddlers with the Bayley scales of infant development-II in preterm infants. J Formos Med Assoc. 2005;104(10):731–737.
7.		Wu HY, Liao HF, Yao G, Lee WT, Wang TM, Hsieh JY. Diagnostic accuracy of the motor subtest of comprehensive developmental inventory for infants and toddlers and the Peabody developmental motor scales-second edition. J Formos Med Assoc. 2005;9(3):312–322.
8.		Hwang AW, Weng LJ, Liao HF. Construct validity of the comprehensive developmental inventory for infants and toddlers. Pediatr Int. 2010;52(4):598–606.
9.		Wang TM, Su CW, Liao HF, Lin LY, Chou KS, Lin SH. The standardization of the comprehensive developmental inventory for infants and toddlers. Psychol Test. 1998;45:19–46.
10.		Liao HF, Yao G, Wang TM. Concurrent validity in Taiwan of the comprehensive developmental inventory for infants and toddlers who were full-term infants. Percept Mot Skills. 2008;107(1):29–44.
11.		Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic Dis. 1985;38(1):27–36.
12.		Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR; Clinical Significance Consensus Meeting Group. Methods to explain the clinical significance of health status measures. Mayo Clin Proc. 2002;77(4):371–383.
13.		Guyatt GH, Kirshner B, Jaeschke R. Measuring health status: what are the necessary measurement properties? J Clin Epidemiol. 1992;45(12):1341–1345.
14.		Wilkin D, Hallam L, Doggett M. Measures of Need and Outcome for Primary Health Care. Oxford: Oxford University Press; 1992.
15.		Tieman BL, Palisano RJ, Sutlive AC. Assessment of motor development and function in preschool children. Ment Retard Dev Disabil Res Rev. 2005;11(3):189–196.
16.		Wang TM. Predictive validity of comprehensive developmental inventory for infants and toddlers (CDIIT). Bull Spec Educ. 2005;29:1–24.
17.		Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press; 1977.
18.		Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53(5):459–468.
19.		Liang MH, Fossel AH, Larson MG. Comparisons of five health status instruments for orthopedic evaluation. Med Care. 1990;28(7):632–642.
20.		Deyo RA, Centor RM. Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis. 1986;39(11):897–906.
21.		Anastasi A, Urbina S. Norms and the meaning of test scores. In: Anastasi A, Urbina S, editors. Psychological Testing. Upper Saddle River, NJ: Prentice-Hall; 1997:48–83.

Creative Commons License © 2016 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.