Psychology Research and Behavior Management » Volume 12

Examining psychometric properties and measurement invariance of a Chinese version of the Self-Compassion Scale – Short Form (SCS-SF) in nursing students and medical workers

Authors Meng R, Yu Y, Chai S, Luo X, Gong B, Liu B, Hu Y, Luo Y, Yu C

Received 20 May 2019

Accepted for publication 15 August 2019

Published 30 August 2019 Volume 2019:12 Pages 793–809


Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Igor Elman

Runtang Meng,1,2,* Yong Yu,2,3,* Shouxia Chai,4,* Xiangyu Luo,5 Boxiong Gong,6 Bing Liu,2,3 Ying Hu,1,7 Yi Luo,8 Chuanhua Yu1,7

1Department of Preventive Medicine, School of Health Sciences, Wuhan University, Wuhan 430071, People’s Republic of China; 2Centre of Health Administration and Development Studies, Hubei University of Medicine, Shiyan 442000, People’s Republic of China; 3School of Public Health and Management, Hubei University of Medicine, Shiyan 442000, People’s Republic of China; 4School of Nursing, Hubei University of Medicine, Shiyan 442000, People’s Republic of China; 5Department of Cardiothoracic Surgery, Taihe Hospital, Hubei University of Medicine, Shiyan 442000, People’s Republic of China; 6Department of Oncology, Taihe Hospital, Hubei University of Medicine, Shiyan 442000, People’s Republic of China; 7Global Health Institute, Wuhan University, Wuhan 430072, People’s Republic of China; 8School of Nursing, Ningbo College of Health Sciences, Ningbo 315100, People’s Republic of China

Correspondence: Chuanhua Yu
Department of Preventive Medicine, School of Health Sciences, Wuhan University, Wuhan 430071, People’s Republic of China
Tel +86 276 875 9299

*These authors contributed equally to this work

Background: Self-compassion has been regarded as a key psychological construct and a protective factor of mental health status. The focus of the present study was to adapt the Self-Compassion Scale (SCS) into Chinese, assess the validity and reliability of the measure and test measurement invariance (MI) across nursing students and medical workers.
Methods: The current study assessed the psychometric properties and invariance of the SCS-Short Form (SCS-SF) in two samples totaling 2676 nursing students and medical workers. For construct validity, confirmatory and exploratory factor analyses (CFAs and EFAs) were conducted. Using the Perceived Stress Questionnaire, Short Form-8 Health Survey (SF-8) and Goldberg Anxiety and Depression Scale, we evaluated concurrent validity and convergent/divergent validity. For reliability, internal consistency and test–retest analyses were employed. Multi-group analyses were conducted to examine MI of the different SCS models across populations.
Results: CFA showed that the proposed six-factor second-order model could not be replicated and that the six-factor first-order model was a reasonable-to-mediocre fit in both samples. EFA supported a three-factor structure consisting of one positive and two negative factors. CFA confirmed the hypothesized three-factor structure with 10 items as the optimal model. The SCS-SF-10 (10-item form) also demonstrated acceptable internal consistency and test–retest reliability, as well as strong concurrent validity with measures of stress perception, health status, and anxious and depressive symptoms. Convergent/divergent validity was not satisfactory. Multi-group CFAs provided support for the validity of the established models.
Conclusion: The Chinese version of the SCS-SF‐10 has sound psychometric properties and can be applied to efficiently assess self-compassion in Chinese-speaking populations. The current study contributes to the identification and measurement of self-compassion after adversities.

Keywords: medical workers, nursing students, self-compassion, Chinese, measurement invariance, psychometric assessment


Drawing on various thoughts and principles of Buddhist psychology, Neff proposed the multifaceted construct of self-compassion, which simply represents compassion turned inward and involves approaching one’s perceived failures and inadequacies with kindness, or alleviating personal suffering through acceptance of unfavorable events.1–3 Self-compassion entails three main theoretical facets, each represented by a pair with a positive and a negative pole that distinguishes compassionate from uncompassionate behavior: self-kindness versus self-judgment, a sense of common humanity versus isolation, and mindfulness versus over-identification.3 These elements combine and mutually interact to create a self-compassionate frame of mind.4 Self-compassion means being benevolent and warm toward oneself when confronted with personal shortcomings and challenges, understanding the ubiquity of suffering, and being attuned to negative emotions without suppressing or exaggerating them. An ever-increasing body of research indicates that this construct enables people to suffer less while also helping them to thrive.4 Hence, the practice of self-compassion has garnered a great deal of attention in the scientific arena, especially in the fields of psychological health and psychopathology.5,6

Systematic studies suggest that self-compassion can be beneficial for target populations and can be considered a protective factor against adverse mental health outcomes. Self-compassion may be a critical quality to cultivate for promoting positive health behaviors such as eating habits, exercise, sleep behaviors, and stress management, due in part to its association with adaptive emotions.7 Self-compassion can buffer psychological stress and enhance psychological well-being8; a meta-analysis revealed a moderate effect size for the association between self-compassion and well-being.9 Self-compassion was negatively correlated with depression, anxiety and stress.5 Another recent meta-analysis found that self-compassion is increasingly explored as a protective factor in relation to psychopathology.6 Empirical studies and study protocols have focused on specific populations such as medical samples. One study found that self-compassion is associated with better medical adherence among people with fibromyalgia, chronic fatigue syndrome, and cancer, due in part to lower stress.10 A study protocol proposed a controlled clinical trial to evaluate whether mindfulness and self-compassion interventions could reduce work stress and burnout in family and community medicine physicians and nurses.11

Put differently, self-compassion involves sensitivity to one’s own experience of suffering and encompasses three basic components: self-kindness, common humanity, and mindfulness. Self-kindness pertains to support and understanding toward oneself rather than self-criticism; common humanity alludes to recognizing that everyone fails, makes mistakes, and gets it wrong sometimes; and finally, mindfulness refers to awareness of our negative thoughts and emotions so that they are approached with balance and equanimity.4 Additionally, it is important to highlight that self-compassion as a total construct is broader in scope than mindfulness because it includes the additional elements of self-kindness and common humanity: actively soothing and comforting oneself when painful experiences arise, and remembering that such experiences are part of being human.

The Self-Compassion Scale (SCS) has emerged as the most common instrument to measure this key psychological construct. Based on theoretical analysis of the literature, the developer of the SCS initially divided self-compassion into these three dimensions. The subscales are grouped under each dimension and can finally be combined into a total scale. We did not find a report of an exploratory factor analysis (EFA) of all SCS items in the relevant papers published by the original author. To date, the SCS has been translated into at least 17 languages and validated independently (or jointly) by international research groups (retrieved from:, timestamp: 10/08/2019). There are two forms of this scale: the SCS – Long Form (SCS-LF, commonly just referred to as the SCS)2 and the SCS – Short Form (SCS-SF).12 The SCS-LF and SCS-SF were developed by Neff and by Raes and coworkers, respectively, to assess how people treat themselves in difficult times.2,12 The original SCS (SCS-LF) includes 26 items measuring six dimensions of self-compassion (i.e., Self-Kindness = SK, Self-Judgment = SJ, Common Humanity = CH, Isolation = IS, Mindfulness = MI and Over-Identification = OI),2 whereas the SCS-SF comprises a 12-item subset of the SCS. This shorter form has been validated over time and is used as a brief mental health assessment tool.12

In the original validation study, the developer identified a six-factor first-order structure with six intercorrelated factors and a six-factor second-order model (also called “higher-order”; the terms are used interchangeably) with a general self-compassion factor behind the six components,2 and subsequently chose the latter as the default model. One of the most comprehensive recent examinations of the SCS factor structure, using secondary data from 20 diverse samples (N=11,658) of Neff and coworkers, supported the use of a total SCS score or six subscale scores.13 Another study in four distinct populations suggested that the six-factor correlated model demonstrated the best fit across samples and that a total SCS score can be used as an overall measure of self-compassion.14 Several replications have been conducted across the globe, but the findings are not without contradictions;15 that is, this factor structure has not been consistently confirmed. For instance, two studies in Dutch16 and Portuguese17 samples suggested a two-factor solution with one positive factor and one negative factor (i.e., formed by the positively and negatively formulated items, respectively); one study in a Buddhist sample presented a model with three intercorrelated positive factors and a general negative factor representing all negative aspects;18 and a study of the Chinese revision of the SCS supported a three-factor structure.19 Still another study, in a Turkish sample, confirmed a one-factor model with a general self-compassion factor.20 Furthermore, four studies in Greek-speaking populations found factor solutions similar to those of the English versions, but noted that the first sharp break in the scree plot emerged after the third factor.21

The English version of the SCS-LF was judged easy to read and has shown relatively good validity and high reliability in diverse samples around the world. The SCS-SF has been widely applied to different population groups, including children,22 adolescents,23,24 and adults.25–27 The findings from different countries need to be compared and discussed. In Dutch, Spanish, Portuguese and Slovenian samples, the SCS-SF supported the same six-factor structure found in the SCS-LF, showing adequate goodness of fit, as well as a single higher-order factor of self-compassion.12,25–27 The SCS-SF in these four studies also demonstrated adequate internal consistency, test–retest reliability and a high correlation with the SCS-LF. In Chinese undergraduate students, the full version (SCS-LF) showed good reliability, with a Cronbach’s α coefficient of 0.84 and a test–retest reliability of 0.89 (2-week interval), and supported a six-factor structure model.28 However, the factor loadings of the items on each factor were not clear, which may be caused by the close relationship between the three dimensions (self-kindness, common humanity, and mindfulness) of the SCS, the wording of the items and East-West cultural differences. Hence, this full Chinese version does not name the factors (subscales).28 Another study in high school students19 reported that the Chinese version of the SCS finally retained 12 items (deleting 14 items) based on the quality criteria mentioned in the literature. This shortened version supported a three-factor structure comprising the sense of common humanity, self-kindness and mindfulness, and had acceptable internal consistency for the total scale (0.77).

The medical professions, including nursing, are widely considered challenging occupations. Medical workers are particularly vulnerable to stress overload and compassion fatigue due to an emotionally exhausting environment.29 Nursing students are also confronted with high levels of adverse mental health outcomes, such as stress, anxiety and depression.30,31 Self-compassion has been recognized as a protective factor against adversity, and training programs (or courses) that cultivate it aim to improve the mental health of medical workers and nursing students.11,32 Having compassion for others entails self-compassion. A stronger focus on developing self-compassion holds promise for reducing perceived stress and increasing compassionate clinical care among health professionals.29 As a feasible intervention strategy, self-compassion is especially relevant for health professionals. Hence, this study focused on two key groups, nursing students and medical workers.

The aim of this study was to validate and assess some psychometric properties of the Chinese version of the SCS-SF. We validated only the shorter form because Chinese versions of the long form had already been reported;19,28 that is, both of those studies translated the SCS-LF. To investigate the psychometric indicators of the SCS-SF, samples were collected from several different settings and analyzed.

Materials and methods

Measuring instruments

Self-Compassion Scale (SCS)

Forward–backward translation (back-translation) is one of the most commonly used translation processes and has specific guidelines.33,34 We established a translation group comprising two PhD/MD students, one master’s student majoring in translation and interpreting, one master’s student majoring in nursing psychology, and the first author and the corresponding author, who served as moderator. First, the instrument was forward-translated from the original language (English) into the target language (Chinese). Next, the Chinese text was back-translated into English and compared with the original version. Inaccuracies in the target language were identified through differences in meaning that appeared in the back-translation. Items that differed were retranslated and modified until full agreement was reached between the authors and the two groups of independent translators. To make the items easier to understand and avoid potential errors, the pre-final Chinese version was tested in a hospital with nine nursing students (trainees). After further modifications, we finalized the Chinese version of the SCS-SF (see Supplementary material). The response scale ranged from 1 (almost never) to 5 (almost always). To calculate a total self-compassion score, the negative subscale items are reverse-scored2 and the raw total is rescaled to the range 0–1, i.e., (raw score − the lowest possible score)/(the highest possible score − the lowest possible score), with higher scores denoting greater levels of self-compassion.
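The scoring rule just described can be sketched in Python. This is only an illustration: the function name and the example respondent are hypothetical, and the set of negatively worded items is an assumption inferred from the two negative factors reported in the results (C1, C4, C8, C9, C11, C12), not a scoring key given in this section.

```python
def scs_total_score(responses, negative_items, low=1, high=5):
    """Return the 0-1 rescaled total score for one respondent.

    responses: dict mapping item number -> raw response (low..high).
    negative_items: item numbers that are reverse-scored before summing.
    """
    for item, value in responses.items():
        if not low <= value <= high:
            raise ValueError(f"item {item}: response {value} outside {low}-{high}")
    # Reverse-score negatively worded items: 1<->5, 2<->4, 3 stays 3.
    scored = [
        (low + high - value) if item in negative_items else value
        for item, value in responses.items()
    ]
    raw = sum(scored)
    lowest = low * len(scored)    # every item at its minimum
    highest = high * len(scored)  # every item at its maximum
    return (raw - lowest) / (highest - lowest)


# Assumed negative item set (see lead-in) and a hypothetical respondent
# who answers 4 on all 12 items.
NEGATIVE = {1, 4, 8, 9, 11, 12}
answers = {i: 4 for i in range(1, 13)}
score = scs_total_score(answers, NEGATIVE)
```

Because six of the twelve answers are reversed, this respondent's high ratings on the negative items cancel the high ratings on the positive items, which is the point of reverse scoring before rescaling.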

Perceived Stress Questionnaire (PSQ)

The PSQ was developed by Levenstein and colleagues to assess perceived stress over the last month or over the longer term.35 It asks participants to respond to 30 statements based on their subjective experiences and feelings rather than on specific, objective states. Each item is scored from 1 (almost never) to 4 (usually), and eight items are reverse-scored to ensure accuracy of response. The resulting PSQ index is likewise linearly transformed to range between 0 (lowest possible level of stress) and 1 (highest possible level of stress): index = (raw score − 30)/90.35 Originally designed in English and Italian, this questionnaire has been translated into multiple languages and validated in different populations. The PSQ’s authors granted us permission to introduce this instrument and authorized the final version in China.36 A Chinese version of the PSQ (C-PSQ) was recently psychometrically evaluated and applied.
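The constants in the PSQ index follow directly from the item count and the response range given above: 30 items scored 1 to 4 give a raw total between 30 and 120. Written out as a worked formula (the symbol "raw" for the reverse-scored total is ours):

```latex
\[
\text{PSQ index}
  = \frac{\text{raw} - 30 \times 1}{30 \times 4 - 30 \times 1}
  = \frac{\text{raw} - 30}{90},
\qquad 30 \le \text{raw} \le 120 .
\]
```

For instance, a hypothetical raw total of 75 gives (75 − 30)/90 = 0.5, the midpoint of the scale.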

Short Form-8 Health Survey (SF-8)

The SF-8 Health Survey (SF-8), an alternate form of the widely used SF-36 Health Survey (SF-36), is one of the instruments for assessing quality of life.37 It measures eight dimensions of health functioning, which are used to compute two summary scores (physical component summary, PCS, and mental component summary, MCS).37 The SF-8 was translated into Chinese in prior studies, and the Chinese version has been validated in large representative samples and is available.38,39 The Medical Outcomes Study scoring system was applied in this study.40 Total scores are computed as the weighted sum of the scores for all items, and the final score ranges between 0 and 100, with higher scores indicating better health status.

Goldberg Anxiety and Depression Scale (GADS)

The GADS consists of the Goldberg Anxiety Scale (GAS) and the Goldberg Depression Scale (GDS); each scale provides a count of symptoms consistent with anxiety or depression.41 The GAS and the GDS ask respondents to report “Yes” or “No” (scored as one or zero points, respectively) for whether they had experienced each of nine anxiety symptoms and nine depression symptoms over the past month. Item scores are summed within each subscale, to a maximum of nine. The higher the total score, the greater the level of symptomatology. Using the translation approach described above, the Chinese version of the GADS has been reported elsewhere and showed a strong correlation with the C-PSQ, though only in nursing students.36 The GADS offers a simple, quick and accurate method of screening for depression and anxiety in the general population.

Participants and procedure

A self-administered questionnaire survey was used to collect data from 2676 valid respondents in Ningbo and Shiyan, China. The two samples were named NB and SY, respectively. Participants were recruited from universities (colleges) and hospitals, all closely related to the medical field (see Table 1). Two of our earlier papers give details of the sampling process in Ningbo.36,42 Briefly, using traditional paper-and-pencil questionnaires and centralized data collection, we surveyed nursing students from a higher vocational college (sample NB) anonymously and confidentially, enrolling them with the class as the unit. For the test–retest analysis, one randomly selected class (50 students) in sample NB completed a pretest and, after a 1-week interval, a retest. Similarly, questionnaire data were collected from medical workers with the hospital and department as the unit. These participants came from three tertiary hospitals in Shiyan (sample SY). The only difference was that most of the medical workers were busy with work and widely dispersed, so the questionnaire was administered in a decentralized manner for some respondents. Each participant received a small incentive as compensation for their time on completing the questionnaire: a piece of chocolate (or a small notebook and pen) worth 5 Chinese RMB (about 0.8 US dollars).

Table 1 Basic statistics on the sample and socio-demographic characteristics of participants

Before data collection, the ethics committee of Wuhan University School of Medicine approved the study design. The study was conducted in accordance with the Declaration of Helsinki (revised version, 2013).43 All respondents in our study were informed that participation was voluntary. We obtained written informed consent from all medical workers and nursing students prior to their participation.

Analytical procedure

Given the question of cross-cultural equivalence between translated versions of an instrument and the inconsistent factor analysis results in previous studies,12,19,28,33 we performed confirmatory factor analysis (CFA) and EFA to examine the factor structure of the SCS-SF, and then tested measurement invariance (MI)44 across nursing students and medical workers using multi-group confirmatory factor analysis (MGCFA). A sample size of at least 200 is probably an adequate threshold, and 500 or more cases are strongly recommended in factor analysis.45,46 The effective samples of the current study met these requirements for CFA and EFA.

First, CFAs were used to evaluate the fit of the previously proposed SCS models. We tested the original first-order and second-order six-factor structures, which reflect the theoretical operationalization/dimensionality of the construct,12 fitting the CFA models separately in the nursing students’ sample and the medical workers’ sample, as well as in their combined sample. Maximum likelihood (ML) estimation was applied to the CFAs because the item distributions were multivariate normal.

For the single- and multi-group CFAs, a combination of common indexes was used to assess the global fit of the models to the data, each with its own rule for good fit:47–49 the normed chi-square (NC), with “good fit” if 3 > NC > 2,48 used because the chi-square test is sensitive to sample size;49 the Tucker–Lewis index (TLI)50 and the comparative fit index (CFI),51 with “good fit” if >0.90; the root mean square error of approximation (RMSEA), with “good fit” if ≤0.06 and “adequate fit” if in the 0.06–0.08 range;52 and the P of Close Fit (PClose), with “good fit” if >0.05.53 To compare goodness of fit between the nested MI models, we followed the recommendation of using differences in RMSEA, CFI, and TLI. Models with a change in CFI (ΔCFI) ≤0.010, a change in RMSEA (ΔRMSEA) ≤0.015, and a change in TLI (ΔTLI) ≤0.010 were favored.54–56 In addition, models were compared with a chi-square difference test. However, this is widely regarded as an overly stringent criterion, since Δχ2, like χ2, depends on sample size and rejects models with trivial practical misfit in large samples.57,58
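As a rough illustration of how these cut-offs are applied, the sketch below labels a model's TLI, CFI, RMSEA and PClose and compares two nested invariance models. The function names, the "below cut-off" label and all numeric inputs are hypothetical. Reading each "change" as a worsening of fit (a drop in CFI or TLI, a rise in RMSEA) is our assumption about how the thresholds are meant to be used.

```python
def global_fit(tli, cfi, rmsea, pclose):
    """Label each index using the cut-offs quoted in the text."""
    if rmsea <= 0.06:
        rmsea_label = "good"
    elif rmsea <= 0.08:
        rmsea_label = "adequate"
    else:
        rmsea_label = "below cut-off"
    return {
        "TLI": "good" if tli > 0.90 else "below cut-off",
        "CFI": "good" if cfi > 0.90 else "below cut-off",
        "RMSEA": rmsea_label,
        "PClose": "good" if pclose > 0.05 else "below cut-off",
    }


def invariance_holds(base, constrained):
    """Compare two nested models given dicts with keys 'CFI', 'TLI', 'RMSEA'.

    The more constrained model is favored when fit does not worsen by more
    than the quoted thresholds (CFI/TLI drop <= 0.010, RMSEA rise <= 0.015).
    """
    d_cfi = base["CFI"] - constrained["CFI"]
    d_tli = base["TLI"] - constrained["TLI"]
    d_rmsea = constrained["RMSEA"] - base["RMSEA"]
    return d_cfi <= 0.010 and d_tli <= 0.010 and d_rmsea <= 0.015


# Hypothetical fit values, not taken from the paper's tables.
labels = global_fit(tli=0.93, cfi=0.95, rmsea=0.07, pclose=0.02)
configural = {"CFI": 0.950, "TLI": 0.940, "RMSEA": 0.050}
metric = {"CFI": 0.946, "TLI": 0.938, "RMSEA": 0.052}
metric_ok = invariance_holds(configural, metric)
```

The comparison deliberately ignores the chi-square difference test, in line with the caveat above about its sensitivity to sample size.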

Afterward, following suggestions on EFA,59 the distribution of the items was examined to ensure that there were no severe non-normalities; none of the SCS items showed a severely non-normal distribution. Ideally, two independent samples should be used when exploring (EFA) and then confirming (CFA) the internal structure of a tool. We had planned to perform EFAs to explore the SCS-SF factorial structure in case the CFAs did not support the six-factor structure, especially the higher-order structure. Based on the prior literature, the EFAs investigated several different factor structures, with one, two or three factors, as mentioned above.16,17,19–21 The ML method with an oblique rotation (promax, power coefficient = 4) was used, since the objective was to identify latent underlying constructs and previous studies assumed the factors to be related.12 Principal component analysis (PCA), by contrast, is not a factor extraction technique but simply an approach to data reduction, as it uses all the variance in the observed variables without discriminating between shared and unique variance. In fact, the factor extraction results were the same for varimax rotation and oblique rotation. We performed the EFAs on the data of nursing students and medical workers to explore the underlying structure of the SCS-SF. Then, CFAs were performed in turn on the nursing students’ sample, the medical workers’ sample and the combined sample to verify the identified factor solution.

The number of factors extracted was determined by reviewing the scree plot and considering the following criteria: eigenvalues (>1), item content and interpretability, and the proportion of total variance explained by the extracted factors combined (usually >60%, at least 50%).47 Adequate item-total correlations should lie above 0.20 and below 0.80; items with lower correlations should be discarded.60,61 Factor loadings above 0.45 were considered important, based on the 20% overlapping variance (“fair”) proposed by Comrey and Lee.45 Given the large sample size, loadings above 0.30 (only for sample sizes of 350 or greater) or even 0.20 were considered significant.47,62 Cross-loadings (loadings of an item on dimensions other than its target) should not exceed 0.32, with a gap of at least 0.2 between the target loading and each cross-loading. Additionally, the communalities of all variables should be over 0.50 to ensure sufficient explanation.47
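The item-level criteria above can be collected into a simple screening check. The sketch below is illustrative only: the function name, the flag wording and the example values are hypothetical, and the default minimum target loading of 0.30 is our choice among the thresholds mentioned in the text (0.45, 0.30 or 0.20).

```python
def screen_item(item_total_r, loadings, communality,
                min_r=0.20, max_r=0.80, min_loading=0.30,
                max_cross=0.32, min_gap=0.20, min_communality=0.50):
    """Return the list of criteria an item fails (empty list = passes).

    loadings: the item's pattern-matrix loadings on every factor; the
    largest absolute loading is treated as the target loading.
    """
    flags = []
    if not (min_r < item_total_r < max_r):
        flags.append("item-total correlation")
    abs_loads = sorted((abs(x) for x in loadings), reverse=True)
    target, cross = abs_loads[0], abs_loads[1:]
    if target < min_loading:
        flags.append("target loading")
    if any(c > max_cross for c in cross):
        flags.append("cross-loading")
    if any(target - c < min_gap for c in cross):
        flags.append("loading gap")
    if communality <= min_communality:
        flags.append("communality")
    return flags


# Hypothetical items, not values from Table 4.
clean_item = screen_item(0.45, [0.70, 0.10, 0.05], 0.55)
weak_item = screen_item(0.10, [0.40, 0.35, 0.05], 0.30)
```

A check like this only flags candidates for removal; content and interpretability still have to be judged by hand, as the criteria above make clear.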

Finally, MGCFAs were performed to test whether the factorial structure replicated in a separate sample, that is, across the nursing students and medical workers groups, for purposes of cross-validation. A two-step procedure, fitting first the first-order model and then the second-order model, was applied to these samples. Three commonly tested levels of invariance, namely factor loadings, intercepts, and residual variances, were investigated by fitting multi-group factor models and examining the sensitivity of the goodness-of-fit indexes.55

Reliability refers to the consistency and precision of a measurement.63 Internal consistency was assessed with Cronbach’s alpha (α), Guttman’s lambda-2 (λ2, a better reliability estimation method64), and item-total correlations. Both alpha and lambda-2 were reported as internal consistency coefficients in the present study. Test–retest reliability (reproducibility) was assessed with Pearson’s correlation and the intraclass correlation coefficient (ICC) at an interval of 7 days. The ICC was calculated for single measures using a two-way mixed-effects model with absolute agreement, in view of the method and range of the retest data collection.65,66 To evaluate measurement precision, the standard error of measurement was also calculated to quantify the variability of measurement errors.67 Validity refers to how accurately a measure captures the true value it attempts to measure.63 Validity was assessed in terms of construct validity, concurrent validity and convergent/divergent validity. Construct validity and dimensional structure were evaluated through CFAs and EFAs, covering exploration, item reduction, confirmation and validation. Concurrent validity and convergent/divergent validity were examined with Pearson’s correlations between the different forms of the SCS-SF and the instruments described above. Mean differences (Student’s t-test) were used to compare symptom levels between the key groups. Besides statistical significance, effect size was also considered when interpreting the magnitude of the associations with the criterion measures, as well as the differences in mean SCS-SF scores between nursing students and medical workers. The level of statistical significance was set at 5%.
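For readers who want to see how the internal consistency coefficients relate to one another, the following is a minimal pure-Python sketch using the standard textbook formulas for Cronbach's alpha, Guttman's lambda-2 and the standard error of measurement. It is not the SPSS procedure used in the study; the function names and the small data set are hypothetical, and basing the SEM on alpha is our assumption.

```python
from math import sqrt


def _cov(x, y):
    """Sample covariance (n - 1 denominator)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def reliability(rows):
    """rows: list of respondents, each a list of k item scores.

    Returns (alpha, lambda2, sem) for the summed total score.
    """
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple per item
    cov = [[_cov(items[i], items[j]) for j in range(k)] for i in range(k)]
    total_var = sum(sum(row) for row in cov)      # variance of the sum score
    item_var = sum(cov[i][i] for i in range(k))
    # Sum of squared off-diagonal covariances (both triangles).
    off_sq = sum(cov[i][j] ** 2 for i in range(k) for j in range(k) if i != j)

    lambda1 = 1 - item_var / total_var
    alpha = k / (k - 1) * lambda1
    lambda2 = lambda1 + sqrt(k / (k - 1) * off_sq) / total_var
    # SEM = SD of the total score * sqrt(1 - reliability), here using alpha.
    sem = sqrt(total_var) * sqrt(max(0.0, 1 - alpha))
    return alpha, lambda2, sem


# Hypothetical responses of five people to three items.
data = [[4, 3, 5], [2, 2, 3], [5, 4, 4], [3, 3, 2], [1, 2, 1]]
alpha, lambda2, sem = reliability(data)
```

Lambda-2 is never smaller than alpha, and the two coincide when all inter-item covariances are equal, which is why lambda-2 is often preferred when items are not equally related.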

The author set up the database using EpiData (version 3.1; Jens M. Lauritsen & Michael Bruus, Odense, Denmark) software. All data were analyzed in SPSS/PASW (version 18.0; SPSS Inc., Chicago, IL, USA) and AMOS, as well as AMOS Plugin from Gaskination’s StatWiki.


Validity analysis

Following the sequence of nursing students, medical workers and combined sample, we ran a series of CFAs for the proposed six-factor model (first- and second-order) according to the steps outlined previously. The SCS’s proposed second-order six-factor structure was a poor-fitting model and could not be replicated in this study. Some fit indices for the first-order six-factor structure were marginal, indicating a reasonable-to-mediocre fit (see the first three rows of Tables 2 and 3). For that reason, EFAs were conducted to analyze the factor structure of the SCS-SF in the separate samples. Sampling adequacy for factor analysis was first tested in the nursing students and medical workers groups. The Kaiser–Meyer–Olkin indexes were 0.778 (nursing students) and 0.770 (medical workers), both >0.60, and Bartlett’s tests of sphericity were significant (P<0.001), indicating that the items were suitable for factor analysis.68 Factor analysis in the two separate samples extracted nearly identical three-factor solutions, and the total cumulative variance explained was 35.63% (nursing students) and 36.82% (medical workers), respectively. Inspection of eigenvalues, the scree plot, and item content and interpretability suggested a three-factor solution. A summary of the EFA and item-total correlations is presented in Table 4, with loadings (obtained from the pattern matrix) shown for the entire set of items and for the selected items. Two items, item 2 and item 10, showed problematic cross-loadings (item 2 loaded −0.402 on Factor 2 in medical workers). In addition, the item-total correlations of item 2 and item 10 were lower than 0.20 (0.096 and −0.009 as well as 0.025 and 0.093, respectively). Still, we decided to accept the relatively low communalities and keep most of the items, in order to keep the scale sufficiently long and in light of their otherwise satisfactory indexes. According to the criteria mentioned earlier, we retained 10 items, excluding items 2 and 10, as the Chinese version of the SCS-SF-10 (i.e., 10-item form).

Table 2 First‐order model fitting results for each measurement invariance level of the SCS-SF across nursing students and medical workers

Table 3 Second‐order model fitting results for each measurement invariance level of the SCS-SF across nursing students and medical workers

Table 4 Summary of exploratory factor analysis with maximum likelihood and promax-rotation of the SCS-SF items

Using these samples, we carried out a series of CFAs to validate the various factor structures (see Table 2). Self-compassion or the self-compassion subscales were entered as endogenous variables, whereas the 12 items (or 10 items, with items 2 and 10 removed) were treated as exogenous variables. Because item 2 loaded on Factor 1 (nursing students) or Factor 2 (medical workers), we ran CFAs in sequence on the nursing students, medical workers and combined samples to confirm the factor structure of the SCS-SF. Owing to the poor fit of the higher-order model, and to keep the results concise, this test checked only the first-order model.

Finally, we validated in turn the three-factor structures (10-item and 12-item forms) derived from the EFA results, using several goodness-of-fit indices. On the whole, the three-factor models fitted better than the six-factor models. Further exploration revealed that the three-factor model without items 2 and 10 fitted better than the three-factor model with all 12 items. The three-factor model with 10 items was therefore considered the optimal model on the basis of the fit results, and these 10 items could be assigned consistently to three factors. We renamed the three factors in line with the literature:16,17 one positive factor (Factor 1: C3, C5, C6, C7) and two negative factors (Factor 2: C1, C4, C8, C9; Factor 3: C11, C12). To evaluate the concurrent and convergent/divergent validity of the SCS-SF-10, we used existing gold standard scales, namely the Chinese versions of the PSQ, SF-8 and GADS. Self-compassion scores were negatively associated with the PSQ index (r=−0.517, P<0.001) and GADS scores (r=−0.451, P<0.001), and positively associated with SF-8 scores (r=0.405, P<0.001). The intercorrelations of the SCS-SF-10 and SCS-SF and their subscales, their correlations with the gold standard scales, and the reliability results are presented in Tables 5 and 6.

Table 5 SCS-SF‐10 subscales intercorrelations and concurrent as well as convergent/divergent validity

Table 6 SCS-SF subscales intercorrelations and concurrent as well as convergent/divergent validity

In the second phase of the analyses, MI was tested across the key groups (nursing students vs medical workers). Tables 2 and 3 present the details of the CFA models and the model fit results for each level of MI. Some Δχ2 (Δdf) values were significant, but these are difficult to interpret given our large sample sizes. Although most χ2 and Δχ2 tests were significant, with few exceptions, the changes in the other model fit indices (ΔTLI, ΔCFI, ΔRMSEA) in the first‐ and second‐order three-factor models (items 2 and 10 removed) did not exceed the recommended cut-off values, indicating invariance across the key groups at three levels: factor loadings, intercepts and residual variances. That is, MI was achieved across the two groups. In contrast, some ΔTLI values exceeded the recommended cut-off value in the first‐ and second‐order six-factor models. Further, most of the model fit indices (ΔTLI, ΔCFI, ΔRMSEA) in the first‐order three-factor model (12 items retained) also met the standard. In short, MI across nursing students and medical workers was best supported for the three-factor model with items 2 and 10 removed.
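The decision rule described above, in which each more constrained model is compared with the previous one and the fit indices must not deteriorate beyond a cut-off, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis: the cut-off values (0.01 for ΔCFI and ΔTLI, 0.015 for ΔRMSEA) are commonly cited conventions assumed here, `Fit` and `invariance_steps` are hypothetical names, and the fit values are invented rather than taken from Tables 2 and 3.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Fit indices of one multi-group CFA model (hypothetical container)."""
    name: str
    cfi: float
    tli: float
    rmsea: float


def invariance_steps(fits, max_d_cfi=0.01, max_d_tli=0.01, max_d_rmsea=0.015):
    """Compare each model with the previous, less constrained model.

    The default cut-offs are assumed conventions, not values from the article.
    Returns one tuple per step: (name, CFI drop, TLI drop, RMSEA rise, supported).
    """
    steps = []
    for prev, cur in zip(fits, fits[1:]):
        d_cfi = prev.cfi - cur.cfi        # positive means fit got worse
        d_tli = prev.tli - cur.tli        # positive means fit got worse
        d_rmsea = cur.rmsea - prev.rmsea  # positive means fit got worse
        supported = (d_cfi <= max_d_cfi and d_tli <= max_d_tli
                     and d_rmsea <= max_d_rmsea)
        steps.append((cur.name, round(d_cfi, 3), round(d_tli, 3),
                      round(d_rmsea, 3), supported))
    return steps


# Invented fit values, ordered from least to most constrained model.
example = [
    Fit("configural", cfi=0.950, tli=0.940, rmsea=0.050),
    Fit("metric (loadings)", cfi=0.946, tli=0.938, rmsea=0.051),
    Fit("scalar (intercepts)", cfi=0.941, tli=0.935, rmsea=0.053),
    Fit("strict (residuals)", cfi=0.920, tli=0.915, rmsea=0.060),
]

steps = invariance_steps(example)
for step in steps:
    print(step)
```

In this invented example the loadings and intercepts steps stay within the assumed cut-offs while the residual-variance step does not, which is the pattern one would describe as scalar but not strict invariance.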

Reliability analysis

The Chinese version of the SCS-SF consists of 12 items. Cronbach’s alpha and Guttman’s lambda-2 for the SCS-SF were 0.639 and 0.672, respectively, which indicates marginally acceptable reliability in the present study sample. After items 2 and 10 were removed, Cronbach’s α increased to 0.686 and Guttman’s lambda-2 increased to 0.706, an acceptable reliability value. The reliability coefficients for the separate subscales were 0.741 (Factor 1), 0.696 (Factor 2) and 0.540 (Factor 3) for Cronbach’s alpha, and 0.744 (Factor 1), 0.697 (Factor 2) and 0.540 (Factor 3) for Guttman’s lambda-2. Item-total correlations increased and showed a strong central tendency after these two items were deleted, which meets the quality criteria. In addition, test–retest reliability in the pretest sample of 50 nursing students was 0.617 (Pearson’s correlation, 95% CI = 0.436–0.790) and 0.618 (ICC, 95% CI = 0.413–0.764), indicating fair agreement. However, the 12-item form had higher test–retest reliability. In brief, the 10‐item form showed acceptable reliability; detailed results are given in Table 7. Because the SCS-SF‐10 (like the SCS-SF) is not unidimensional, the reliability coefficients for each subscale are also provided (see Table 8).

Table 7 Reliability of the 10 and 12 items form SCS-SF for whole scale in total sample

Table 8 Reliability of the 10 and 12 items form SCS-SF for each subscale in total sample
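For readers who want to see how the two internal-consistency coefficients reported above are computed, the following Python sketch implements the standard textbook formulas for Cronbach’s alpha and Guttman’s lambda-2 from an item covariance matrix. It is an illustration under stated assumptions, not the authors’ procedure: the function names are hypothetical, the responses are invented, and the software actually used for Tables 7 and 8 is not specified in this section.

```python
from math import sqrt


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of respondents x items."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    cov = covariance_matrix(rows)
    k = len(cov)
    total_var = sum(sum(row) for row in cov)  # variance of the summed score
    item_var = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1 - item_var / total_var)


def guttman_lambda2(rows):
    """lambda-2 = (sum of off-diagonal covariances
    + sqrt(k/(k-1) * sum of squared off-diagonal covariances)) / total variance."""
    cov = covariance_matrix(rows)
    k = len(cov)
    total_var = sum(sum(row) for row in cov)
    off = [cov[i][j] for i in range(k) for j in range(k) if i != j]
    return (sum(off) + sqrt(k / (k - 1) * sum(c * c for c in off))) / total_var


# Invented responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 4, 5],
    [1, 2, 1, 2],
    [4, 3, 4, 3],
]

alpha = cronbach_alpha(responses)
lambda2 = guttman_lambda2(responses)
print(f"alpha = {alpha:.3f}, lambda-2 = {lambda2:.3f}")
```

Lambda-2 is never smaller than alpha for the same data, which matches the pattern in the coefficients reported above (for example, 0.672 vs 0.639 for the 12-item form).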

Levels of self-compassion, perceived stress and anxious and depressive symptoms

The difference in self-compassion level between medical workers and nursing students was statistically significant (t=2.263, P=0.024): medical workers’ self-compassion level (0.547±0.111) was lower than that of nursing students (0.557±0.108). In addition, independent t-tests revealed statistically significant differences between medical workers and nursing students in levels of perceived stress (t=−11.222, P<0.001), anxious symptoms (t=−15.057, P<0.001) and depressive symptoms (t=−13.805, P<0.001). Medical workers’ levels of stress and of anxious and depressive symptoms were higher than those of nursing students (all P<0.001), whereas their level of self-compassion was lower (P=0.024). SF-8 results for nursing students are not reported because this study did not collect those data. Levene’s test for equality of variances indicated that equal variances could not be assumed for stress and depressive symptoms. Additionally, the effect sizes for the SCS-SF (10-item and 12-item forms) and its subscales were all below 0.10, whereas those for the PSQ and the GADS (GAS and GDS) were all above 0.10 and below 0.30. Detailed results, which allow mean scores on the different forms of the SCS to be compared, are shown in Table 9.

Table 9 Nursing students’ and medical workers’ mean scores in measure instruments
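The group comparisons above combine a t statistic with a correlation-type effect size. The Python sketch below shows one common way to obtain both: Welch’s t-test, which does not assume equal variances, and the conversion r = sqrt(t² / (t² + df)). It is a hedged illustration only. The function names are hypothetical, the data are invented, and the article does not state in this section which effect size formula was used, so the conversion is an assumption. The P value is omitted because it requires the t distribution (available, for example, through `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)  # squared standard error of group b's mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def effect_size_r(t, df):
    """Correlation-type effect size from t and df (assumed conversion)."""
    return sqrt(t * t / (t * t + df))


# Invented scores for two small groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]

t_value, df_value = welch_t(group_a, group_b)
r_value = effect_size_r(t_value, df_value)
print(f"t = {t_value:.3f}, df = {df_value:.1f}, r = {r_value:.3f}")
```

Because r shrinks as df grows for a fixed t, a modest t statistic in a large sample yields a very small r, which is how a difference can be statistically significant while its effect size stays below 0.10.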


Regarding psychometric properties, especially structural validity, our findings could not replicate the proposed six-factor second-order model (the default model). This appears to confirm earlier findings that there is still no agreement on the SCS’s original factor structure.69 Our more comprehensive CFAs and EFAs reported the main fit indices of the six‐ and three-factor models (first‐ and second‐order) and indicated a good fit of the three-factor models to the survey data. The best-fitting solution was a three-factor second-order model with 10 items (i.e., the factor structure obtained from EFA was confirmed in CFA), and only at this point did we label our subscales: one positive factor (Factor 1, positively formulated items: C3, C5, C6, C7) and two negative factors (Factor 2, negatively formulated items: C1, C4, C8, C9; Factor 3, negatively formulated items: C11, C12). However, a comparison of the three-factor solution obtained from EFA (Table 4) with the original subscales in the study by Raes et al did not support such labeling.12 Namely, Factor 3 consisted only of two self-judgment items, Factor 2 consisted of two over-identification and two isolation items, and Factor 1 consisted of two mindfulness items, one common humanity item and one self-kindness item. In effect, this best three-factor model with 10 items still retained the main information of the SCS’s six-factor structure, with at least one item from each original subscale, although the information was incomplete. Like another shortened version developed in China, our shortened version supported a three-factor structure; the difference is that the earlier study used the SCS-LF and extracted three clear factors but removed more than half of the items.19 Notably, our best model, with one positive and two negative factors, differed somewhat from models with one positive and one negative factor, because the latter were based on 24 items (Dutch16) and 26 items (Portuguese17). An identical set of factors could not be established across studies and national settings, probably because of cultural and linguistic diversity, sampling factors, the translation of the items and/or other subjective and objective factors.70,71

For MI, high levels of invariance in the first‐ and second‐order three-factor models were supported across the key groups (i.e., invariance at three levels: factor loadings, intercepts and residual variances). The three-factor structure, especially the high‐order model, could be generalized across nursing students and medical workers, which allows the groups to be compared. In contrast, the six-factor structure supported generalization across subgroups of participants only marginally, owing to some substandard indices.

Criterion validity of the measure was evaluated by examining the associations between the factors of the SCS-SF (10-item and 12-item forms) and stress, anxiety and depression. The positive subscale of self-compassion was negatively associated with stress, anxiety and depression and positively associated with PCS and MCS. The two negative subscales showed the same pattern of results because their reverse‐coded items had been recoded. Acceptable psychometric performance implies a Pearson correlation <0.60 (convergent) and a correlation <0.20 (divergent);72 by this standard, the convergent/divergent validity of the SCS-SF (10-item and 12-item forms) was not satisfactory, perhaps because of an imbalance in the number of items per factor. This would mean that the scale being validated is not specific enough to measure the construct of interest in a given population.73

With regard to reliability, Neff has repeatedly noted that the SCS-SF is not recommended for examining the six components separately because its subscales have poor reliability;3 the current study is consistent with this point. The reliability coefficients of this adaptation were acceptable (near the 0.70 threshold). Although reliability (Cronbach’s α) improved for the 10-item form, it was 0.69, which is very close to but still below the cut‐off value. The reliability coefficients of Factors 1 and 2 were above the minimum quality criteria. Factor 3 consisted of only two items, which led to a lower reliability coefficient; good practice dictates a minimum of three items per factor.47 Internal consistency was unacceptable for each subscale in the six-dimension structure.

Additionally, differences in mean scores between the key groups were found. The level of self-compassion was lower in medical workers than in nursing students, whereas levels of adversity (i.e., stress, anxiety and depression) were higher in medical workers than in nursing students. Unfortunately, this study did not collect quality-of-life data from nursing students, so the key groups could not be compared on this measure.

As for levels of self-compassion, perceived stress and anxious and depressive symptoms, statistically significant differences were found between medical workers and nursing students. The findings may provide clues to the interaction between self-compassion and other variables, such as adverse mental health outcomes. Unfortunately, the effect sizes for the SCS-SF (10-item and 12-item forms) and its subscales were all below 0.10 (a correlation coefficient of 0.10 is thought to represent a weak or small association74), indicating that differences may be statistically significant without being clinically important in practice settings. Still, our results supported the use of both the total score and the subscale scores of the SCS-SF and/or SCS-SF‐10, while showing that some subscales and the scale as a whole have lower, though still acceptable, reliability. Both the advantages and the disadvantages of the short form must be mentioned. The short form has a near-perfect correlation with the long form when total scores are examined.12 However, we do not recommend the short form when subscale scores are of interest, since they are less reliable in the short form.3 Although the total score of either the long or the short form can be used to assess clients’ self-compassion, the short form (SCS-SF) may be a better, easy-to-use and economical alternative when the aim is to use the total score.

Implications for practice

Admittedly, there is a surge of interest in the well-being of medical professionals, including medical and nursing students, and of other health care providers who deliver treatment.29 Monitoring the well-being of these key groups, especially medical workers, should be put on the agenda. Cultivating self-compassion may not be easy, but it is a worthwhile, empowering and liberating way to live. Additionally, interventions could be proposed, such as the Mindful Self-Compassion (MSC) training program, using either informal practices such as the Self-Compassion Break or formal meditation practices such as Affectionate Breathing (retrieved from:

Limitations and future directions

The present study aimed to validate, explore and cross-validate the psychometric properties of this adaptation, the Chinese version of the SCS-SF in its 10-item and 12-item forms, in two samples from the Chinese population. Nevertheless, the authors acknowledge some weaknesses of the study. First, a single self-report questionnaire may be insufficient to assess self-compassion and its analogous concepts. It would therefore be useful to employ more integrated measurement, such as a combination of questionnaires covering mindfulness, self-esteem, self-criticism and/or self-concept. Second, the sample may not be representative of the Chinese population, since it was confined to two areas of the medical field, and this limits the generalizability of the findings. In addition, although the stability of the instrument over time was measured, the sample size of 50 for test–retest reliability was small. Third, one could argue that not all aspects of validity were analyzed in this study; however, we have reported our results as fully as possible and compared them with similar studies. Additionally, because this was an observational study, various biases were inevitable. That only medical workers completed the SF-8 was another drawback. Since validation is an ongoing process,75 further validation of the Chinese SCS is clearly needed. In future studies, the instrument may need revision when it is tested with large samples in clinical practice and research settings.


Despite the limitations of this study, the Chinese version of the SCS-SF was found to be a valid and reliable measure for assessing self‐compassion in psychological research with Chinese-speaking populations. Briefly, self-compassion scores can be used in research to evaluate the self-compassion level of adult populations such as college students and medical workers.

The short form of the SCS, as the authors who created it suggest, might be particularly useful in settings where time constraints make the use of the long form less feasible or advisable. However, these authors recommend using the full scale if information about subscales is crucial.12 Further investigation of possible links between the SCS-SF and the SCS-LF in a large sample survey is greatly needed but is beyond the scope of the present study. Future research should focus on other developmental stages (e.g., children, adolescents or the elderly) and use more heterogeneous samples.


We are indebted to Prof. Dr Kristin Neff of the University of Texas at Austin for providing this instrument. Special thanks go to our friends Dr Jingjing Li (PhD, Department of Behavioral Sciences and Health Education, Rollins School of Public Health, Emory University), Dr Zhenkun Wang (PhD, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology), Yucong Ma (MTI, then a student at Southeast University-Monash University Joint Graduate School (Suzhou)), Di Zhang (MSN, RN, Quality Control Department, Wuhan Asia General Hospital) and Yongyong Xi (MM, Department of Environment and Occupational Hazard Control, Center for Disease Control and Prevention of Pudong New District) for their valuable assistance with forward and back-translation. The authors express their appreciation to all respondents who took part in the present study and to the friends who supported data collection. We also thank the anonymous reviewers for their constructive suggestions for improvement. This research was funded by the National Natural Science Foundation of China (Grant No. 81773552 and Grant No. 81273179) and the National Key Research and Development Program of China (Grant No. 2018YFC1315302, 2017YFC1200502). Additionally, this project was supported by the Key Research Center for Humanities and Social Sciences in Hubei Province (Hubei University of Medicine) (Grant No. 2016YB06).

Author contributions

Chuanhua Yu and Runtang Meng conceived and designed the project. Runtang Meng collected, analyzed and interpreted the data and compiled the initial draft of the manuscript. Runtang Meng, Yong Yu and Shouxia Chai jointly edited the manuscript. Xiangyu Luo, Shouxia Chai, Boxiong Gong, Bing Liu and Yi Luo helped provide the data. Chuanhua Yu and Ying Hu advised on statistical analysis. Runtang Meng directed all facets of the study. Runtang Meng and Chuanhua Yu are responsible for the overall project. All authors contributed to data analysis, drafting or revising the article, gave final approval of the version to be published, and agree to be accountable for all aspects of the work.


The authors declare no conflicts of interest in this work.


1. Neff KD. Self-compassion: an alternative conceptualization of a healthy attitude toward oneself. Self Identity. 2003;2(2):85–101. doi:10.1080/15298860309032

2. Neff KD. The development and validation of a scale to measure self-compassion. Self Identity. 2003;2(3):223–250. doi:10.1080/15298860309027

3. Neff KD. The self-compassion scale is a valid and theoretically coherent measure of self-compassion. Mindfulness. 2016;7(1):264–274. doi:10.1007/s12671-015-0479-3

4. Neff KD, Dahm KA. Self-compassion: what it is, what it does, and how it relates to mindfulness. In: Ostafin BD, Robinson MD, Meier BP, editors. Handbook of Mindfulness and Self-Regulation. New York: Springer New York; 2015:121–137.

5. MacBeth A, Gumley A. Exploring compassion: a meta-analysis of the association between self-compassion and psychopathology. Clin Psychol Rev. 2012;32(6):545–552. doi:10.1016/j.cpr.2012.06.003

6. Muris P, Petrocchi N. Protection or vulnerability? A meta‐analysis of the relations between the positive and negative components of self‐compassion and psychopathology. Clin Psychol Psychother. 2017;24(2):373–383. doi:10.1002/cpp.2005

7. Sirois FM, Kitner R, Hirsch JK. Self-compassion, affect, and health-promoting behaviors. Health Psychol. 2015;34(6):661. doi:10.1037/hea0000158

8. Leary MR, Tate EB, Adams CE, Batts Allen A, Hancock J. Self-compassion and reactions to unpleasant self-relevant events: the implications of treating oneself kindly. J Pers Soc Psychol. 2007;92(5):887. doi:10.1037/0022-3514.92.5.887

9. Zessin U, Dickhäuser O, Garbade S. The relationship between self‐compassion and well‐being: a meta‐analysis. Appl Psychol Health Well Being. 2015;7(3):340–364. doi:10.1111/aphw.12051

10. Sirois FM, Hirsch JK. Self-compassion and adherence in five medical samples: the role of stress. Mindfulness. 2019;10(1):46–54. doi:10.1007/s12671-018-0945-9

11. Pérula-de Torres L-A, Atalaya JCV-M, García-Campayo J, et al. Controlled clinical trial comparing the effectiveness of a mindfulness and self-compassion 4-session programme versus an 8-session programme to reduce work stress and burnout in family and community medicine physicians and nurses: MINDUUDD study protocol. BMC Fam Pract. 2019;20(1):24. doi:10.1186/s12875-019-0913-z

12. Raes F, Pommier E, Neff KD, Van Gucht D. Construction and factorial validation of a short form of the self‐compassion scale. Clin Psychol Psychother. 2011;18(3):250–255. doi:10.1002/cpp.702

13. Neff KD, Tóth-Király I, Yarnell LM, et al. Examining the factor structure of the self-compassion scale in 20 diverse samples: support for use of a total score and six subscale scores. Psychol Assess. 2019;31(1):27. doi:10.1037/pas0000629

14. Neff KD, Whittaker TA, Karl A. Examining the factor structure of the self-compassion scale in four distinct populations: is the use of a total scale score justified? J Pers Assess. 2017;99(6):596–607. doi:10.1080/00223891.2016.1269334

15. Tóth-Király I, Bőthe B, Orosz G. Exploratory structural equation modeling analysis of the Self-Compassion Scale. Mindfulness. 2017;8(4):881–892. doi:10.1007/s12671-016-0662-1

16. López A, Sanderman R, Smink A, et al. A reconsideration of the self-compassion scale’s total score: self-compassion versus self-criticism. PLoS One. 2015;10(7):e0132940. doi:10.1371/journal.pone.0132940

17. Costa J, Marôco J, Pinto‐Gouveia J, Ferreira C, Castilho P. Validation of the psychometric properties of the self‐compassion scale. Testing the factorial validity and factorial invariance of the measure among borderline personality disorder, anxiety disorder, eating disorder and general populations. Clin Psychol Psychother. 2016;23(5):460–468. doi:10.1002/cpp.1974

18. Zeng X, Wei J, Oei TP, Liu X. The self-compassion scale is not validated in a Buddhist sample. J Relig Health. 2016;55(6):1996–2009. doi:10.1007/s10943-016-0205-z

19. Gong H, Jia H, Guo T, Zou L. The revision of self-compassion scale and its reliability and validity in adolescents (in Chinese). Psychol Res. 2014;7(1):36–40.

20. Deniz M, Kesici Ş, Sümer AS. The validity and reliability of the Turkish version of the self-compassion scale. Soc Behav Pers. 2008;36(9):1151–1160. doi:10.2224/sbp.2008.36.9.1151

21. Mantzios M, Wilson JC, Giannou K. Psychometric properties of the Greek versions of the self-compassion and mindful attention and awareness scales. Mindfulness. 2015;6(1):123–132. doi:10.1007/s12671-013-0237-3

22. Sutton E, Schonert-Reichl KA, Wu AD, Lawlor MS. Evaluating the reliability and validity of the self-compassion scale short form adapted for children ages 8–12. Child Indic Res. 2018;11(4):1217–1236. doi:10.1007/s12187-017-9470-y

23. Marshall SL, Parker PD, Ciarrochi J, Sahdra B, Jackson CJ, Heaven PC. Self-compassion protects against the negative effects of low self-esteem: a longitudinal study in a large adolescent sample. Pers Individ Dif. 2015;74:116–121. doi:10.1016/j.paid.2014.09.013

24. Muris P. A protective factor against mental health problems in youths? A critical note on the assessment of self-compassion. J Child Fam Stud. 2016;25(5):1461–1465. doi:10.1007/s10826-015-0315-3

25. Garcia-Campayo J, Navarro-Gil M, Andrés E, Montero-Marin J, López-Artal L, Demarzo MMP. Validation of the Spanish versions of the long (26 items) and short (12 items) forms of the Self-Compassion Scale (SCS). Health Qual Life Outcomes. 2014;12(1):4. doi:10.1186/1477-7525-12-4

26. Castilho P, Pinto‐Gouveia J, Duarte J. Evaluating the multifactor structure of the long and short versions of the self‐compassion scale in a clinical sample. J Clin Psychol. 2015;71(9):856–870. doi:10.1002/jclp.22187

27. Uršič N, Kocjančič D, Žvelc G. Psychometric properties of the slovenian long and short version of the self-compassion scale. Psihologija. 2018;OnlineFirst:1–19.

28. Chen J, Liangshi Yan ZL. Reliability and validity of Chinese version of Self-compassion Scale (in Chinese). Chin J Clin Psychol. 2011;19(6):734–736.

29. Raab K. Mindfulness, self-compassion, and empathy among health care professionals: a review of the literature. J Health Care Chaplain. 2014;20(3):95–108. doi:10.1080/08854726.2014.913876

30. Cheung T, Wong SY, Wong KY, et al. Depression, anxiety and symptoms of stress among baccalaureate nursing students in Hong Kong: a cross-sectional study. Int J Environ Res Public Health. 2016;13(8):779. doi:10.3390/ijerph13121252

31. Zeng Y, Wang G, Xie C, Hu X, Reinhardt JD. Prevalence and correlates of depression, anxiety and symptoms of stress in vocational college nursing students from Sichuan, China: a cross-sectional study. Psychol Health Med. 2019;24(7):798–811. doi:10.1080/13548506.2019.1574358

32. Pizutti LT, Carissimi A, Valdivia LJ, et al. Evaluation of Breathworks’ Mindfulness for Stress 8‐week course: effects on depressive symptoms, psychiatric symptoms, affects, self‐compassion, and mindfulness facets in Brazilian health professionals. J Clin Psychol. 2019;75(6):970–984. doi:10.1002/jclp.22749

33. Efstathiou G. Translation, adaptation and validation process of research instruments. In: Suhonen R, Stolt M, Papastavrou E, editors. Individualized Care: Theory, Measurement, Research and Practice. Cham: Springer International Publishing; 2019:65–78.

34. Sousa VD, Rojjanasrirat W. Translation, adaptation and validation of instruments or scales for use in cross‐cultural health care research: a clear and user‐friendly guideline. J Eval Clin Pract. 2011;17(2):268–274. doi:10.1111/j.1365-2753.2010.01434.x

35. Levenstein S, Prantera C, Varvo V, et al. Development of the perceived stress questionnaire: a new tool for psychosomatic research. J Psychosom Res. 1993;37(1):19–32.

36. Luo Y, Gong B, Meng R, et al. Validation and application of the Chinese version of the Perceived Stress Questionnaire (C-PSQ) in nursing students. PeerJ. 2018;6:e4503. doi:10.7717/peerj.4503

37. Turner-Bowker DM, Bayliss MS, Ware JE, Kosinski M. Usefulness of the SF-8™ health survey for comparing the impact of migraine and other conditions. Qual Life Res. 2003;12(8):1003–1012.

38. Wang S, Luan R, Lei Y, Kuang C, He C, Chen Y. Development and evaluation of Chinese version of Short Form-8 (in Chinese). Mod Prev Med. 2007;34(6):1022–1023, 1026.

39. Lang L, Zhang L, Zhang P, Li Q, Bian J, Guo Y. Evaluating the reliability and validity of SF-8 with a large representative sample of urban Chinese. Health Qual Life Outcomes. 2018;16(1):55. doi:10.1186/s12955-018-0880-4

40. Ware JE, Kosinski M, Dewey JE, Gandek B. How to Score and Interpret Single-item Health Status Measures: a Manual for Users of the SF-8 Health Survey (with a Supplement on the SF-6 Health Survey). Lincoln, RI; Boston, MA: QualityMetric Inc.; Health Assessment Lab; 2001.

41. Goldberg D, Bridges K, Duncan-Jones P, Grayson D. Detecting anxiety and depression in general medical settings. BMJ. 1988;297(6653):897–899. doi:10.1136/bmj.297.6653.897

42. Luo Y, Meng R, Li J, Liu B, Cao X, Ge W. Self-compassion may reduce anxiety and depression in nursing students: a pathway through perceived stress. Public Health. 2019;174:1–10. doi:10.1016/j.puhe.2019.05.015

43. World Medical Association. World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA. 2013;310(20):2191–2194. doi:10.1001/jama.2013.281053

44. Meredith W. Measurement invariance, factor analysis and factorial invariance. Psychometrika. 1993;58(4):525–543. doi:10.1007/BF02294825

45. Comrey AL, Lee HB. A First Course in Factor Analysis. New York, NY: Psychology Press; 2013.

46. MacCallum RC, Widaman KF, Zhang S, Hong S. Sample size in factor analysis. Psychol Methods. 1999;4(1):84. doi:10.1037/1082-989X.4.1.84

47. Hair JF, Black WC, Babin BJ, Anderson RE. Multivariate Data Analysis: Pearson New International Edition. 7th ed. London: Pearson Higher Education; 2014.

48. Kline RB. Principles and Practice of Structural Equation Modeling. 4th ed. New York (NY): Guilford publications; 2016.

49. Marsh HW, Hau KT, Grayson D. Goodness of fit in structural equation models. In: Maydeu-Olivares A, McArdle JJ, editors. Contemporary Psychometrics: A Festschrift for Roderick P. McDonald. Mahwah, NJ: Lawrence Erlbaum Associates; 2005:275–340.

50. Tucker LR, Lewis C. A reliability coefficient for maximum likelihood factor analysis. Psychometrika. 1973;38(1):1–10. doi:10.1007/BF02291170

51. Bentler PM. Comparative fit indexes in structural models. Psychol Bull. 1990;107(2):238. doi:10.1037/0033-2909.107.2.238

52. Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6(1):1–55. doi:10.1080/10705519909540118

53. Joreskog KG, Sorbom D. LISREL 8: User’s Reference Guide. Chicago: Scientific Software International; 1996.

54. Cheung GW, Rensvold RB. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Modeling. 2002;9(2):233–255. doi:10.1207/S15328007SEM0902_5

55. Chen FF. Sensitivity of goodness of fit indexes to lack of measurement invariance. Struct Equ Modeling. 2007;14(3):464–504. doi:10.1080/10705510701301834

56. Meade AW, Johnson EC, Braddy PW. Power and sensitivity of alternative fit indices in tests of measurement invariance. J Appl Psychol. 2008;93(3):568. doi:10.1037/0021-9010.93.3.568

57. Brannick MT. Critical comments on applying covariance structure modeling. J Organ Behav. 1995;16(3):201–213. doi:10.1002/(ISSN)1099-1379

58. Kelloway EK. Structural equation modelling in perspective. J Organ Behav. 1995;16(3):215–224. doi:10.1002/(ISSN)1099-1379

59. Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ. Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods. 1999;4(3):272. doi:10.1037/1082-989X.4.3.272

60. Everitt BS, Skrondal A. The Cambridge Dictionary of Statistics. 4th ed. New York (NY): Cambridge University Press; 2010.

61. Streiner DL, Norman GR, Cairney J. Health Measurement Scales: a Practical Guide to Their Development and Use. New York (NY): Oxford University Press, USA; 2015.

62. Stevens JP. Applied Multivariate Statistics for the Social Sciences. 5th ed. New York: Routledge; 2012.

63. Streiner DL, Norman GR. “Precision” and “accuracy”: two terms that are neither. J Clin Epidemiol. 2006;59(4):327–330. doi:10.1016/j.jclinepi.2005.09.005

64. Sijtsma K, Emons WH. Advice on total-score reliability issues in psychosomatic measurement. J Psychosom Res. 2011;70(6):565–572. doi:10.1016/j.jpsychores.2010.11.002

65. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163. doi:10.1016/j.jcm.2016.02.012

66. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods. 1996;1(1):30. doi:10.1037/1082-989X.1.1.30

67. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. 2005;19(1):231–240.

68. Tabachnick BG, Fidell LS. Using Multivariate Statistics. Boston: Pearson Education; 2013.

69. Muris P, Otgaar H, Petrocchi N. Protection as the mirror image of psychopathology: further critical notes on the self-compassion scale. Mindfulness. 2016;7(3):787–790. doi:10.1007/s12671-016-0509-9

70. Rönnlund M, Vestergren P, Stenling A, Nilsson LG, Bergdahl M, Bergdahl J. Dimensionality of stress experiences: factorial structure of the Perceived Stress Questionnaire (PSQ) in a population‐based Swedish sample. Scand J Psychol. 2015;56(5):592–598. doi:10.1111/sjop.12214

71. Davidov E, Meuleman B, Cieciuch J, Schmidt P, Billiet J. Measurement equivalence in cross-national research. Annu Rev Sociol. 2014:40.

72. Luján-Tangarife J, Cardona-Arias J. Construction and validation of measurement scales in health: a review of psychometric properties. Arch Med. 2015;11(3):1–10.

73. Carvajal A, Centeno C, Watson R, Martínez M, Rubiales AS. How is an instrument for measuring health to be validated? An Sist Sanit Navar. 2011;34(1):63–72.

74. Durlak JA. How to select, calculate, and interpret effect sizes. J Pediatr Psychol. 2009;34(9):917–928. doi:10.1093/jpepsy/jsp004

75. Hubley AM, Zumbo BD. Validity and the consequences of test interpretation and use. Soc Indic Res. 2011;103(2):219. doi:10.1007/s11205-011-9843-4

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
