Reliability and Validity of the Korean Version of the Somatic Symptom Disorder-B Criteria Scale in a Clinical Population
Abstract
Objective
This study aimed to develop and validate the Korean version of the Somatic Symptom Disorder-B Criteria Scale (SSD-12) in outpatients at a psychiatric clinic and assess its diagnostic accuracy.
Methods
A total of 207 patients completed SSD-12. For the diagnostic accuracy of SSD-12, the somatic symptom disorder (SSD) section of the structured clinical interview for DSM-5 disorders-research version (SCID-5-RV) was used. The SSD-12 construct and concurrent validity were assessed by examining the correlations with Generalized Anxiety Disorder-7 (GAD-7), Patient Health Questionnaire-9 (PHQ-9), PHQ-15, 5-level EQ-5D version (EQ-5D-5L), and World Health Organization Quality of Life Brief Version (WHOQOL-BREF).
Results
The SSD-12 had excellent internal consistency (Cronbach α=0.90). Confirmatory factor analysis revealed good fit indices for a general factor model (comparative fit index [CFI]=0.92, Tucker-Lewis index [TLI]=0.88, root mean square error of approximation [RMSEA]=0.10; 95% confidence interval [CI], 0.08–0.11) and a three-factor model (CFI=0.94, TLI=0.91, RMSEA=0.08; 95% CI, 0.07–0.10). The total SSD-12 score was significantly correlated with anxiety (GAD-7: r=0.53, p<0.001), depression (PHQ-9: r=0.52, p<0.001), physical symptom burden (PHQ-15: r=0.36, p<0.001), and quality of life (EQ-5D-5L: r=-0.40, p<0.001; WHOQOL-BREF: r=-0.51, p<0.001). SSD-12 demonstrated good accuracy (area under the curve=0.75, standard error=0.04; 95% CI, 0.68–0.82) with an optimal cut-off of 29.
Conclusion
The Korean SSD-12 demonstrates reliability and validity for diagnosing SSD in clinical settings.
INTRODUCTION
The Diagnostic and Statistical Manual of Mental Disorders-Fifth Edition (DSM-5) introduced a new diagnosis, somatic symptom disorder (SSD), which marked a significant change from the concept of somatoform disorders in the DSM-Fourth Edition (DSM-IV) [1]. The DSM-IV emphasized that a somatoform disorder could not be diagnosed if there was an underlying medically explainable condition [2]. However, in the new diagnostic classification, the diagnosis of SSD is based on the presence of distressing physical symptoms and positive symptoms, such as abnormal thoughts, feelings, and behaviors in response to those physical symptoms, rather than on the absence of a medical explanation for the physical symptoms. In other words, patients diagnosed with SSD are characterized by “excessiveness in the way they express and interpret physical symptoms”, which is what Criterion B implies [1].
Several studies have supported the clinical use of DSM-5 diagnostic criteria for SSD, particularly the introduction of Criterion B [3-7]. The psychological characteristics of SSD assessed by Criterion B have been identified as risk factors for the development of SSD [8-10]. Moreover, the severity of SSD, as assessed by the number of symptoms in Criterion B, has also been associated with the degree of global functional impairment [5,11]. Therefore, Criterion B in DSM-5 is likely important for diagnosing and assessing SSD.
In Korea, the Patient Health Questionnaire-15 (PHQ-15), Somatic Symptom Severity Scale-8, Hamilton Depression Rating Scale, and the Depression and Somatic Symptom Scale were developed to assess the type and severity of physical symptoms [12-15]. However, for the DSM-5 SSD, these scales are only useful for Criterion A, which assesses the presence of one or more physical symptoms that are distressing or significantly interfere with daily life. Therefore, a need exists for a scale to assess Criterion B symptoms for DSM-5 SSD. The Somatic Symptom Disorder-B Criteria Scale (SSD-12) was developed as a self-report questionnaire to assess Criterion B. The questionnaire has demonstrated good reliability and validity in studies with different population samples from multiple countries [11,16-19]. A study conducted among community-dwelling adults in Korea also demonstrated significant reliability [20]. However, no studies have standardized SSD-12 in a clinical population in Korea. Therefore, we developed a Korean version of the SSD-12 and examined the reliability and validity of the scale in patients visiting an outpatient psychiatric clinic.
METHODS
Sampling strategy and subjects
This study included outpatients who visited the Department of Psychiatry at Seoul National University Hospital, a tertiary hospital in Seoul, South Korea, between March 2021 and February 2022. Patients aged 18 years or older with sufficient cognitive capacity to understand and follow the researcher’s instructions were eligible if they had a history of having been diagnosed with somatic symptom and related disorders or if their current complaints included distress related to physical symptoms. Participation was voluntary and consensual. Those who did not agree to participate in the study, who were unable to maintain a sitting position for more than 30 minutes due to disability, or who had difficulty communicating because they did not speak Korean were excluded from the study. Furthermore, those who had difficulty maintaining attention and alertness, such as those with major neurocognitive disorders, delirium, acute episodes of psychotic disorders, substance addiction, or withdrawal, and those who were currently at a high risk of suicide and required psychiatric crisis intervention, were also excluded. The number of participants was calculated to be a minimum of 200 to ensure the stability of the factor analysis results, and the recruitment target was set at 250 to allow for a 25% dropout rate [21,22]. A minimum of 101 subjects was required to obtain a moderate intraclass correlation coefficient with a 95% confidence interval (CI) in two repeated measures; therefore, we aimed to retest 101 of the total study population (n=250) for test-retest reliability [23-25]. This study was approved by the Institutional Review Board of Seoul National University Hospital (IRB No: H-2109-166-1260) and conducted following the principles of the Declaration of Helsinki.
Instruments
Development of the Korean version of the SSD-12
The SSD-12 consists of 12 items reflecting DSM-5 diagnostic Criterion B and is divided into three sub-criteria (cognitive: 1, 4, 7, 10; affective: 2, 5, 8, 12; and behavioral: 3, 6, 9, 11). Each item is measured on a five-point Likert scale (0, never; 4, very often), with a total score ranging from 0 to 48. In the original validation study, internal consistency reliability was excellent, with Cronbach’s α=0.95 [11]. The adaptation process followed the “Guidelines for Test Translation and Adaptation, Second Edition” of the International Test Commission [26]. Permission was obtained from the original author to use the original text before adaptation. The translation committee consisted of six English-Korean bilingual psychiatrists with subspecialties in psychosomatic medicine and one clinical psychologist. Three psychiatrists independently conducted forward translations (English to Korean). The committee identified inconsistencies between the Korean translations and reconciled them into a single version. Subsequently, a psychiatrist who had not seen the original English version of SSD-12 performed the backward translation (Korean to English). The committee then compared the backward translation with the original English version for accuracy. The final Korean version of SSD-12 (Supplementary Material in the online-only Data Supplement) was developed using this process.
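The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the item-to-subscale mapping and score range given in the text; the function name `score_ssd12` and the example responses are hypothetical and do not come from the study.

```python
# Item numbers (1-based) for each sub-criterion, as listed in the text above.
SUBSCALES = {
    "cognitive": (1, 4, 7, 10),
    "affective": (2, 5, 8, 12),
    "behavioral": (3, 6, 9, 11),
}


def score_ssd12(responses):
    """Return subscale and total scores for one respondent.

    `responses` is a sequence of 12 integers (item 1 first), each from 0 to 4.
    """
    if len(responses) != 12:
        raise ValueError("SSD-12 requires exactly 12 item responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }
    scores["total"] = sum(responses)  # possible range: 0 to 48
    return scores


# Hypothetical responses for a single participant.
example = score_ssd12([2, 3, 1, 2, 4, 0, 1, 3, 2, 1, 2, 3])
print(example)
```

Each subscale therefore ranges from 0 to 16, and the three subscale scores always add up to the total score.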
Other measurements
The structured clinical interview for DSM-5 disorders-research version (SCID-5-RV) is a semi-structured interview tool for DSM-5 diagnosis and is regarded as the gold standard [27]. We received approval from the American Psychiatric Association to use the English version of the SCID-5-RV and translated the tool based on the diagnostic criteria for SSD in the Korean version of DSM-5.
PHQ-15 is a self-reported questionnaire that assesses the extent to which one has been bothered by physical symptoms in the past 4 weeks and consists of 15 physical symptom items. The questionnaire has been used to diagnose somatoform disorders. Each item is rated on a 3-point scale (0 to 2, not bothered at all to bothered a lot), with a total score of 0 to 30, which determines the severity of physical symptom burden [28]. The PHQ-15 has been validated in Korea, with a Cronbach’s α of 0.87 [12].
PHQ-9 is a self-reported questionnaire used to screen for depression in primary care settings. It consists of nine questions designed to align with the DSM-IV diagnostic criteria for major depressive disorder. Each item asks respondents to indicate how often they have experienced depressive symptoms in the last 2 weeks on a 4-point scale (0 to 3, not at all to nearly every day), with a total score ranging from 0 to 27 [29]. The internal consistency of the Korean version of PHQ-9 was excellent, with a Cronbach’s α of 0.86 [30].
Generalized Anxiety Disorder-7 (GAD-7) is a seven-item self-reported questionnaire that requires subjects to rate their experience of anxiety-related problems in the past 2 weeks. The questionnaire is based on a 4-point scale (0 to 3, not at all to nearly every day), with a total score of 0 to 21, and has been used as a valid measurement for general anxiety [31]. A validation study of the Korean version of GAD-7 exhibited good internal consistency, with Cronbach’s α of 0.92 [32].
The 5-level EQ-5D version (EQ-5D-5L) was used to assess the health-related quality of life. EQ-5D-5L consists of five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each of which is evaluated at five levels (no problems, slight problems, moderate problems, severe problems, and extreme problems) and is represented by a 1-digit number. The digits of the five dimensions can be combined into a five-digit number that describes the patient’s health [33]. A Korean version was released and weights were provided in EQ-5D-5L validity studies. The EQ-5D-5L index was calculated using the mapping method proposed by the EuroQol group [34].
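As a small illustration of how the five dimension levels combine into a five-digit health state, the following Python sketch builds the profile string. The function name, the dictionary keys, and the example respondent are hypothetical. Converting a profile into the EQ-5D-5L index additionally requires a value set, which is not reproduced here.

```python
# Dimension order used to build the five-digit health state.
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)


def eq5d5l_profile(levels):
    """Build the five-digit health-state code from a dict of levels (1 to 5)."""
    digits = []
    for dim in DIMENSIONS:
        level = levels[dim]
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"{dim}: level must be 1-5, got {level!r}")
        digits.append(str(level))
    return "".join(digits)


# Hypothetical respondent: no problems with mobility or self-care, slight
# problems with usual activities, moderate pain, slight anxiety/depression.
profile = eq5d5l_profile({
    "mobility": 1,
    "self_care": 1,
    "usual_activities": 2,
    "pain_discomfort": 3,
    "anxiety_depression": 2,
})
print(profile)
```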
The World Health Organization Quality of Life Brief Version (WHOQOL-BREF) was used to assess quality of life. This is a 26-item instrument consisting of four domains (physical health, psychological health, social relationships, and environment). Each question is rated on a 5-point scale, with higher scores indicating a more positive response to quality of life [35]. Standardization studies were conducted in Korea [36]. In this study, 18 of the 26 items were used, excluding eight items related to the environmental domain.
Procedures
The study consisted of the completion of self-reported questionnaires (SSD-12, PHQ-15, PHQ-9, GAD-7, EQ-5D-5L, and WHOQOL-BREF) and a diagnostic interview. A psychiatrist with a subspecialty in psychosomatic medicine diagnosed SSD using SCID-5-RV. The psychiatrist who conducted the interview was blinded to the participants’ questionnaire results. The same self-reported questionnaires were re-administered at an interval of 1 to 4 weeks to establish test-retest reliability [37].
Statistical analysis
Exploratory data analysis for SSD-12 was performed to examine mean scores, standard deviations (SD), skewness, and kurtosis [38]. For internal consistency, Cronbach’s α and corrected item-total correlations were examined [39]. Pearson’s correlation analysis was performed between the scores obtained from the initial test and those from the retest of the same self-reported questionnaire to evaluate test-retest reliability [40]. For factorial validity, confirmatory factor analyses (CFA) were performed to determine the comparative fit index (CFI), Tucker-Lewis index (TLI), and root mean square error of approximation (RMSEA) [41]. In the development and validation study of SSD-12, two models were proposed based on the diagnostic criteria of the DSM-5 SSD, specifically the “Criterion B” structure. The first model is a general factor model in which all items load onto a single “general factor” representing the DSM-5 SSD Criterion B, and the second model is a three-factor model with latent variables corresponding to the three sub-criteria [11]. We tested whether our data fit these two models. Receiver operating characteristic (ROC) curve analysis was performed to assess the criterion validity of the SSD-12. Through this analysis, we determined the accuracy with which SSD-12 could predict the diagnosis of SSD and calculated the optimal cut-off value [42]. For construct validity, Pearson’s correlation coefficients were calculated between the SSD-12 score and the PHQ-15, PHQ-9, and GAD-7 scores, which reflect the burden of physical symptoms, depression, and anxiety, respectively [43]. Concurrent validity was examined with Pearson’s correlation analysis between SSD-12 and the WHOQOL-BREF and EQ-5D-5L. A hierarchical multiple linear regression analysis was performed using WHOQOL-BREF and EQ-5D-5L as dependent variables to explore incremental validity. In the first step, PHQ-15 and sociodemographic characteristics were included as predictor variables, followed by SSD-12 in the second step. All data were analyzed using SPSS, version 23.0 and SPSS AMOS, version 23.0 (IBM Corp., Armonk, NY, USA).
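For readers who want to see what the reliability statistics compute, the following is a minimal Python sketch of Cronbach’s α and the corrected item-total correlation using the textbook formulas. It is not the SPSS procedure used in the study; the helper names and the toy data (four respondents, three items) are hypothetical.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])                                  # number of items
    items = list(zip(*rows))                          # one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])      # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


def corrected_item_total(rows, item_index):
    """Correlate one item with the sum of all other items."""
    item = [r[item_index] for r in rows]
    rest = [sum(r) - r[item_index] for r in rows]
    return pearson(item, rest)


# Hypothetical toy data, not study data.
toy = [[1, 2, 3], [2, 2, 4], [3, 4, 4], [4, 4, 5]]
alpha = cronbach_alpha(toy)                # 27/29, about 0.931 for this toy data
r_item1 = corrected_item_total(toy, 0)
print(round(alpha, 3), round(r_item1, 3))
```

The same `pearson` helper applied to first-administration and retest total scores would give a test-retest coefficient of the kind described above.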
RESULTS
A total of 214 subjects consented to participate in the study and were enrolled. Of these, seven dropped out and 207 patients were included in the analysis. Among them, 75.0% were female, with a mean age of 54.5±15.3 years. The sociodemographic characteristics of the participants are presented in Table 1.
Reliability of the SSD-12
High internal consistency reliability was demonstrated with a Cronbach’s α of 0.90. Corrected item-total correlation coefficients were above 0.50 for all items except items 7 and 10, indicating a high correlation with the total score. The corrected item-total correlation for item 7 (“Others tell me that my physical problems are not serious”) on the cognitive aspects subscale was 0.02, indicating little correlation with the total score. The corrected item-total correlation for item 10 (“I think that doctors do not take my physical complaints seriously”) in the same subscale was 0.38, suggesting moderate correlation with the total score. When item 7 was removed, the Cronbach’s α increased to 0.92. The item characteristics are listed in Table 2. The test-retest reliability was high, with a Pearson’s correlation coefficient of 0.89.
Validity of the SSD-12
Factorial validity
In the general factor model and three-factor model, the path coefficient values of all items were statistically significant, except for item 7. When evaluating the fit of the general factor model, the RMSEA value, a measure of absolute fit, was below 0.1, indicating a mediocre fit, and the incremental fit indices, TLI and CFI, were both above 0.9, indicating an acceptable fit. Similarly, in the three-factor model, the absolute fit index RMSEA was below 0.1, and the χ2 (chi-square)/df (degrees of freedom) was below 3, indicating a favorable fit. The incremental fit indices, TLI and CFI were also above 0.9, indicating an acceptable fit. Strong correlations were present between the three subscales (cognitive and behavioral domains: effect size [ES]=0.87; affective and behavioral domains: ES=0.90; cognitive and affective domains: ES=0.91). The CFA results are displayed in Table 3 and the three-factor model is illustrated in Figure 1.
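For reference, the three fit indices discussed above are commonly defined as follows. This is a sketch of the standard textbook definitions rather than a statement of the exact computation in AMOS. The symbols are assumptions introduced here: the subscript M denotes the fitted model, B the baseline (independence) model, and N the sample size. Some software uses N instead of N − 1 in the RMSEA denominator.

```latex
\[
\begin{aligned}
\mathrm{RMSEA} &= \sqrt{\frac{\max\left(\chi^2_M - df_M,\, 0\right)}{df_M\,(N-1)}} \\[4pt]
\mathrm{CFI} &= 1 - \frac{\max\left(\chi^2_M - df_M,\, 0\right)}
                        {\max\left(\chi^2_M - df_M,\; \chi^2_B - df_B,\; 0\right)} \\[4pt]
\mathrm{TLI} &= \frac{\chi^2_B/df_B \;-\; \chi^2_M/df_M}{\chi^2_B/df_B \;-\; 1}
\end{aligned}
\]
```

Lower RMSEA values and CFI and TLI values closer to 1 indicate better fit, which is the direction of the thresholds applied in the text.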
Criterion validity
Diagnostic evaluation using SCID-5-RV and in-depth interview revealed that 65 (30.4%) participants were diagnosed with SSD. In our study, the mean (SD) SSD-12 score was 25.9 (10.8). The optimal cut-off point was 29, with a Youden index of 0.396 (sensitivity=0.656, specificity=0.739). The sensitivity and specificity of SSD-12 in the moderate range are displayed in Table 4. The ROC analysis demonstrated that the area under the curve (AUC) was 0.75 (standard error=0.04; 95% CI, 0.68–0.82), indicating a favorable level of predictive ability (Figure 2). This corresponds to a 75% probability that a randomly selected participant with SSD has a higher SSD-12 total score than a randomly selected participant without SSD.
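The cut-off selection can be illustrated with a short Python sketch. It assumes that a score at or above the cut-off counts as a positive screen and picks the cut-off that maximizes the Youden index (sensitivity + specificity − 1). The AUC is computed as the proportion of (SSD, non-SSD) pairs in which the participant with SSD has the higher score, so it does not depend on any single cut-off. All function names and the toy data are hypothetical; this is not the SPSS ROC procedure used in the study.

```python
def sens_spec(scores, has_ssd, cutoff):
    """Sensitivity and specificity when `score >= cutoff` counts as positive."""
    tp = sum(s >= cutoff and d for s, d in zip(scores, has_ssd))
    fn = sum(s < cutoff and d for s, d in zip(scores, has_ssd))
    tn = sum(s < cutoff and not d for s, d in zip(scores, has_ssd))
    fp = sum(s >= cutoff and not d for s, d in zip(scores, has_ssd))
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Youden's J statistic."""
    return sensitivity + specificity - 1


def best_cutoff(scores, has_ssd):
    """Scan every observed score and keep the cut-off with the largest J."""
    return max(
        sorted(set(scores)),
        key=lambda c: youden_index(*sens_spec(scores, has_ssd, c)),
    )


def auc(scores, has_ssd):
    """Share of (SSD, non-SSD) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, d in zip(scores, has_ssd) if d]
    neg = [s for s, d in zip(scores, has_ssd) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: SSD-12 totals and interview diagnoses (not study data).
toy_scores = [12, 18, 22, 25, 27, 30, 31, 35, 38, 41]
toy_ssd = [False, False, False, True, False, False, True, True, False, True]
print(best_cutoff(toy_scores, toy_ssd), round(auc(toy_scores, toy_ssd), 3))

# Rounded values reported in the text: 0.656 + 0.739 - 1 = 0.395, which agrees
# with the reported Youden index of 0.396 up to rounding of the inputs.
print(round(youden_index(0.656, 0.739), 3))
```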
Construct validity
PHQ-15, which evaluates the burden of physical symptoms, had a mean (SD) of 11.4 (5.8) and exhibited a weak positive correlation with the SSD-12 total score (r=0.36, p<0.001). In contrast, PHQ-9, which evaluates depressive symptoms, had a mean (SD) of 10.2 (7.1), and GAD-7, which evaluates anxiety symptoms, had a mean (SD) of 6.8 (5.9); both displayed moderate positive correlations with the SSD-12 total score (PHQ-9: r=0.52, p<0.001; GAD-7: r=0.53, p<0.001).
Concurrent validity
Significant correlations were observed between the SSD-12 total score and the WHOQOL-BREF and EQ-5D-5L total scores, which reflect the degree of quality of life and impairment in daily functioning. Both the total scores of WHOQOL-BREF (r=-0.48, p<0.001) and EQ-5D-5L (r=-0.40, p<0.001) had moderate negative correlations with the SSD-12 total score. When examining the correlations between the SSD-12 total score and the WHOQOL-BREF subdomain scores, a significant negative correlation was present with the total score of the physical health domain (r=-0.55, p<0.001) and the total score of the psychological health domain (r=-0.43, p<0.001), but not with the total score of the social relationships domain.
Incremental validity
Hierarchical multiple linear regression analysis was performed to test the incremental validity of SSD-12 beyond PHQ-15 in predicting the quality of life and daily life functioning impairment evaluated by WHOQOL-BREF and EQ-5D-5L. The regression models were significant at each step when the WHOQOL-BREF score was the dependent variable (Step 1: F=75.364, p<0.001; Step 2: F=65.189, p<0.001). Furthermore, we observed a significant increase in the explained variance when the SSD-12 total score was introduced as an independent variable at Step 2 (Step 1: R=0.52, adjusted R²=0.27; Step 2: R=0.63, adjusted R²=0.39). Using EQ-5D-5L scores as the dependent variable, the regression models were also significant at each step (Step 1: F=83.641, p<0.001; Step 2: F=52.452, p<0.001). The explained variance increased significantly when the SSD-12 total score was added as an independent variable at Step 2 (Step 1: R=0.54, adjusted R²=0.30; Step 2: R=0.59, adjusted R²=0.35). When controlling for PHQ-15, SSD-12 had a negative influence of 37% on quality of life assessed using WHOQOL-BREF (PHQ-15: β=-0.35; SSD-12: β=-0.36), and a negative influence of 24% on daily life functioning impairment assessed using EQ-5D-5L (PHQ-15: β=-0.46; SSD-12: β=-0.24).
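The two-step structure of this analysis can be written out as follows. This is a sketch in generic notation, and the symbols are assumptions introduced here: y is the quality-of-life outcome (WHOQOL-BREF or EQ-5D-5L), z is the vector of sociodemographic covariates, n is the sample size, and p₂ is the number of predictors in Step 2. Only one predictor (SSD-12) is added at Step 2, so the numerator of the F-change statistic has one degree of freedom.

```latex
\[
\begin{aligned}
\text{Step 1:}\quad y_i &= \beta_0 + \beta_1\,\mathrm{PHQ15}_i
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \varepsilon_i \\
\text{Step 2:}\quad y_i &= \beta_0 + \beta_1\,\mathrm{PHQ15}_i
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \beta_2\,\mathrm{SSD12}_i + \varepsilon_i \\[4pt]
\Delta R^2 &= R^2_{\text{Step 2}} - R^2_{\text{Step 1}} \\[4pt]
F_{\text{change}} &= \frac{\Delta R^2 / 1}{\left(1 - R^2_{\text{Step 2}}\right)/\left(n - p_2 - 1\right)}
\end{aligned}
\]
```

Incremental validity is supported when the change in explained variance is significant. In the results above, the adjusted R² rose by 0.12 for WHOQOL-BREF (0.27 to 0.39) and by 0.05 for EQ-5D-5L (0.30 to 0.35).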
DISCUSSION
This study aimed to develop a Korean version of the SSD-12, designed for screening of SSD, and to assess its reliability and validity to determine its suitability as an evidence-based assessment instrument. In this study, the SSD-12 items were strongly interrelated and consistently measured the diagnostic Criterion B of SSD with a high internal consistency reliability coefficient. This is comparable to the values reported in other regions, such as Europe and China [11,17-19,44]. Unlike other items in SSD-12, item 7 “Others tell me that my physical problems are not serious” displayed a very weak correlation with the total score, and item 10 “I think that doctors do not take my physical complaints seriously” had only a moderate correlation with the total score. In a reliability and validity study targeting community-dwelling adults in Korea, items 7 and 10 also exhibited weak correlations with the total score, with corrected item-total correlations of 0.04 and 0.26, respectively [20]. Furthermore, this is similar to studies in other countries that have consistently identified that item 7 has the lowest correlation with the total score compared to other items, followed by item 10 [11,17-19,44]. These consistent results are likely to occur because items 7 and 10 ask for thoughts about the perspectives or reactions of others to the physical symptoms experienced by patients with SSD. Patients with SSD are characterized by an excessive focus on their physical symptoms and may not be aware of the discrepancy between their perceived severity of physical symptoms and the perspectives of others [1]. In previous studies, patients with somatoform disorder have displayed functional impairment in a theory-of-mind task that assesses their ability to recognize and interpret other perspectives in social interactions [45,46]. Consequently, items 7 and 10 may not adequately reflect the cognitive aspects of Criterion B for SSD.
The test-retest reliability of the SSD-12 was high, confirming that the SSD-12 is an instrument that produces relatively consistent results over time.
Factorial validity
CFA indicated an acceptable fit for the general factor and three-factor models, encompassing the three sub-criteria of Criterion B for SSD as latent constructs. Therefore, the total score of SSD-12 can be used, and it was confirmed that the structure of SSD-12 is consistent with Criterion B for SSD. Strong correlations were observed between the cognitive, affective, and behavioral subscales of the SSD-12, suggesting that some overlap may exist in the content of items within these three subscales. This is consistent with the results of previous studies [11,19,44]. Further research is needed to explore the structural implications of categorizing symptoms into three sub-criteria for diagnosing SSD and to investigate how they interact and manifest in clinical practice.
Criterion validity
In our study, the SSD-12 cut-off score was 29. Cut-off scores for SSD-12 have varied according to the characteristics of the participants, such as sex and age, as well as the setting in which the participants were recruited. In a study of population-based norms conducted by the original author, a 55-year-old female had a cut-off of 29 for a very high psychological burden [47]. In a study involving patients referred by primary care physicians for rare and undiagnosed diseases, the cut-off was 23 [48]. Furthermore, in a study involving outpatients who came to a psychosomatic medicine clinic, a cut-off score of 26 displayed the highest diagnostic efficiency value [16]. The mean age of our study population was 54 years and 75% were female; these characteristics, together with the fact that our study only included outpatients in the Department of Psychiatry in a tertiary hospital, may have influenced the cut-off score.
Construct validity
When convergent validity was examined, SSD-12 demonstrated a moderate level of correlation with the PHQ-9 and GAD-7. This suggests that depression and anxiety about health or physical symptoms, latent constructs of SSD-12, are related to, but not completely redundant with, depression and anxiety as symptoms of depressive and anxiety disorders, as measured by PHQ-9 and GAD-7. In other words, SSD-12 appears to reflect the unique characteristics of the symptoms observed in SSD. In contrast, a low correlation was present between SSD-12 and PHQ-15 scores. A study in the Netherlands also found a significant but low correlation between SSD-12 and PHQ-15 scores [49]. This can be interpreted as providing evidence of discriminant validity between SSD-12, which reflects the revised DSM-5 diagnostic criteria for SSD, and PHQ-15, which reflects the severity of the physical symptom burden. In other words, the SSD-12 measures the excessive cognitive, affective, and behavioral symptoms related to physical symptoms rather than the severity of physical symptom burden.
Concurrent validity and incremental validity
The concurrent validity analysis demonstrated moderate negative correlations between SSD-12 and both EQ-5D-5L and WHOQOL-BREF, confirming that SSD symptoms, as defined by Criterion B, were significantly associated with poor daily functioning and reduced quality of life. This finding is consistent with previous studies reporting negative correlations between the SSD-12 total score and physical and mental quality of life as measured by SF-12 [19]. Therefore, based on the theoretical prediction that the SSD-12 score would affect daily functioning and quality of life, we examined the incremental validity with the EQ-5D-5L and WHOQOL-BREF scores as dependent variables. The results revealed that SSD-12 could explain the decline in daily functioning and quality of life beyond that explained by PHQ-15.
The use of SSD-12, which can screen for SSD, can enhance the diagnostic approach to SSD in clinical settings. The cut-off of the SSD-12 presented in this study could provide some evidence of the degree of “excessiveness” that may warrant psychiatric evaluation and treatment. How the degree of “excessiveness” specified in Criterion B should be defined with respect to the thoughts, emotions, and behaviors associated with physical symptoms remains unclear. For an accurate diagnosis of SSD, the most crucial requirement is to establish a measurable operational definition of the degree of “excessiveness” [6,50]. Appropriate screening for SSD is important because these disorders are associated with increased individual and societal healthcare burdens [51,52]. Additionally, SSD-12 is a self-reported questionnaire that can be applied not only in psychiatry but also in other departments, making it easy to identify SSD and make referrals for consultation.
Limitations
The study has some limitations. First, this study was conducted in a psychiatric outpatient setting, which may limit the generalizability of the results to other populations. Second, since reliability, validity, and cut-off values vary between different samples, more studies should be conducted with diverse samples, including psychiatric inpatients and outpatients from other medical departments.
In conclusion, the results of this study confirmed that the Korean version of the SSD-12 is a reliable and valid instrument in a clinical setting. Furthermore, we provided a cut-off point of 29 for diagnosing SSD, enabling the utilization of SSD-12 as a screening tool. Therefore, this scale is expected to be useful for diagnosing and evaluating SSD in clinical settings.
Supplementary Materials
The online-only Data Supplement is available with this article at https://doi.org/10.30773/pi.2023.0352.
Notes
Availability of Data and Material
The datasets generated or analyzed during the study are available from the corresponding author on reasonable request.
Conflicts of Interest
The authors have no potential conflicts of interest to disclose.
Author Contributions
Conceptualization: Saim Jung, Bong-Jin Hahm. Data curation: Saim Jung, Bong-Jin Hahm, Chan-Woo Yeom. Formal analysis: Saim Jung, Chan-Woo Yeom. Funding acquisition: Saim Jung. Supervision: Bong-Jin Hahm. Writing—original draft: Saim Jung, Chan-Woo Yeom. Writing—review & editing: all authors.
Funding Statement
This study was supported by the Jisan Cultural Psychiatry Research Fund from the Korean Neuropsychiatric Association.