Predicting Efficacy of Virtual Reality-Based Stabilization for Individuals With Posttraumatic Stress Symptoms: A Machine Learning Approach
Abstract
Objective
The global impact of respiratory infectious diseases led to significant mental health challenges, highlighting the need for proactive psychological interventions to prepare for future pandemics. In response, virtual reality-based stabilization (VRS) was developed to mitigate posttraumatic stress symptoms (PTSS) and related comorbidities.
Methods
This study evaluated and predicted the effectiveness of VRS in 43 coronavirus disease-2019 (COVID-19) survivors and healthcare workers from COVID-19 treatment units. The effectiveness of VRS, conducted over five sessions, was measured using pre- and post-intervention psychological assessments for PTSS, depression, anxiety, COVID-related fear, posttraumatic growth, and quality of life. Additionally, a machine learning model was used to predict the impact of the intervention on PTSS and depression based on pre-intervention psychological assessments and heart rate variability tests.
Results
The post-intervention results showed significant improvements in all psychological outcomes. The machine learning-based model demonstrated good predictive accuracy for changes in PTSS and depression (R²=0.414–0.723). Notably, individuals with higher pre-intervention scores for PTSS and related comorbidities, as well as elevated heart rate variability and younger age, exhibited greater improvements.
Conclusion
These findings suggest that VRS is effective in addressing PTSS and related conditions, and incorporating clinical and demographic data can enhance prediction models, enabling more personalized intervention strategies.
INTRODUCTION
Various respiratory infectious diseases, including severe acute respiratory syndrome, Middle East respiratory syndrome, and the recent coronavirus disease-2019 (COVID-19), not only posed significant threats to physical health but also precipitated a range of mental health issues. These outbreaks led to numerous stressors, such as fear of infection, social isolation, disruptions to daily life, and financial hardships, all of which contributed to depression, anxiety, and posttraumatic stress disorder (PTSD) [1-3]. Patients with infectious diseases and frontline healthcare workers (HCWs) were at an increased risk of developing these mental health issues, with PTSD prevalence during pandemics significantly higher than in the general population [2,4,5]. The mental health challenges experienced during these crises not only diminish individuals’ quality of life (QoL), but for HCWs, can also impair the healthcare system by limiting workforce availability and the quality of care. Given that pandemics are recurrent phenomena [6,7], ongoing preparedness in developing and implementing effective interventions is crucial to mitigate the mental health impacts of future pandemics, particularly for high-risk groups such as patients and HCWs.
Stabilization techniques are readily adaptable across various trauma types and survivor profiles [8], and thereby have the potential for broad and efficacious application to traumatized individuals. These techniques are therapeutic methods aimed at helping individuals manage distressing trauma-related symptoms by promoting emotional regulation, grounding, and a sense of safety [9]. Integrating stabilization early in the therapeutic process is particularly important during the acute psychological trauma phase [10]; indeed, it significantly enhances overall treatment effectiveness [11]. In certain instances, employing stabilization interventions alone successfully reduces the symptoms of PTSD and its associated comorbidities [8].
However, restrictions on physical contact presented challenges in the delivery of this intervention in a face-to-face setting during pandemics. One solution to such an obstacle is the use of virtual reality (VR) technologies, which offer significant advantages for teletherapy by enabling users to interact with immersive and synchronized multisensory three-dimensional environments via head-mounted displays [12]. Several VR-based therapeutic programs have demonstrated effectiveness in treating mental health issues, including PTSD, depression, and anxiety [13,14]. However, to the best of our knowledge, there has been no development of VR-based stabilization (VRS) for use either before or during the treatment of posttraumatic stress symptoms (PTSS). To date, a significant proportion of VR research has concentrated on presenting serene and pleasing imagery, such as natural landscapes, to mitigate pandemic-related psychological distress, aiming to boost positive emotions or relaxation but often lacking evidence-based treatment components for mental disorders [12]. Therefore, there is an evident need to develop and validate VRS.
In addition, it is crucial to recognize that the psychological effects of therapeutic interventions may vary based on individual differences. Recent studies have shown that personal attributes, such as gender [15] and emotion regulation [16], significantly influence responses to VR stimuli, suggesting VR-based therapeutic programs may not be equally effective for everyone. Therefore, diverse user characteristics must be considered when applying VR-based interventions. This view is well-established in traditional face-to-face psychotherapy. Recognizing that conventional therapies, such as cognitive behavioral therapy, are not universally beneficial has spurred interest in understanding which individual characteristics synergize with specific interventions [17]. This interest dovetails with the precision medicine approach [18], which aims to customize healthcare through a detailed analysis of a wide range of health-related data.
Aligned with the precision medicine paradigm, contemporary research increasingly uses machine learning (ML) algorithms to analyze patients’ demographic, clinical, and neurobiological data to predict treatment outcomes [19]. ML, which focuses on making accurate predictions [20], offers several advantages over traditional statistical methods, which aim to infer relationships among variables. Beyond observing statistical associations, ML enables the construction of predictive models based on combinations of multiple features. These models can then be interpreted to assess the contribution of each feature to the outcome. This approach allows researchers to integrate biomarkers and demographic information from different domains, providing insights into complex and multifactorial relationships that emerge when multiple types of data are combined [21].
Recent ML research has focused on predicting post-treatment outcomes based on baseline data for disorders such as PTSD [22], depression [23], and generalized anxiety disorder [24]. These studies also predict outcomes of digital therapies such as smartphone applications [25] and internet interventions [26]. The ability to accurately predict the most beneficial treatment approach for each patient offers the potential to maximize the use of time and financial resources for both patients and clinicians.
Indeed, some studies have been conducted to predict the effectiveness of VR therapy programs. For example, ML has been employed to predict the effectiveness of VR exposure therapy for arachnophobia [27]. Similarly, structural equation modeling has been utilized to analyze the effectiveness of VR intervention for PTSD [28]. However, there remains a significant lack of research on using ML to predict treatment outcomes in VR therapy. Specifically, no study has focused on employing ML to predict the effectiveness of VRS in individuals with traumatic experiences.
Against this background, we aimed to evaluate the effectiveness of VRS for patients with respiratory infectious diseases, specifically COVID-19, and HCWs, populations vulnerable to mental health problems during pandemics, and to predict its effectiveness using ML algorithms. Although these two groups differ in terms of occupation and the nature of their experiences, they both encountered trauma within the same extraordinary social context created by COVID-19. Therefore, we considered them as a single, unified research population affected by pandemic-related mental health challenges, and proceeded to evaluate the effectiveness of the VRS program accordingly. To the best of our knowledge, this is the first investigation of its kind. Thus, we adopted an exploratory approach given the lack of prior research on the development and validation of VRS programs as well as the prediction of the effectiveness of these interventions using ML techniques.
In the present study, self-report measures and clinical assessments were used to determine the safety and the effectiveness of VRS. The scales for pre-intervention predictors and post-intervention outcomes were selected based on the pathology and relevant theoretical frameworks [3,29,30]. The primary (PTSS) and secondary (depression, anxiety, and fear of COVID-19) outcomes were evaluated along with the positive changes resulting from the intervention, including posttraumatic growth (PTG) and QoL. In addition, to predict the effectiveness of VRS, demographics, the primary and secondary outcomes, the positive change outcomes, and heart rate variability (HRV) were used as predictors. HRV was used as a biomarker for autonomic nervous system changes induced by participants’ PTSS, complementing the questionnaires designed to assess the participants’ mental health. In the ML analysis of intervention effects, PTSS and depression were selected as outcome variables because of their significant comorbidity [31].
METHODS
Participants
Based on previous VR-based clinical research related to psychological trauma, which reported an effect size of d=0.77 (medium to large effect size) [32], a conservative effect size was hypothesized. A power analysis with G*Power version 3.1.9.7 (Heinrich Heine University), assuming an effect size (d) of 0.50 (medium effect size), a significance level (alpha) of 0.05, a power of 0.80, and a two-tailed test, indicated that 34 participants were required. The recruitment target was set at 40 participants to account for a potential dropout rate of 20%.
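The required sample size can be roughly cross-checked by hand. The sketch below is not the G*Power computation (which uses the exact noncentral t distribution); it assumes a paired (one-sample) design and uses the normal approximation with a z²/2 small-sample correction. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def paired_t_sample_size(d, alpha=0.05, power=0.80):
    """Approximate n for a two-tailed paired (one-sample) t-test.

    Normal approximation plus a z^2/2 small-sample correction. This is an
    approximation, not the exact noncentral-t calculation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)


print(paired_t_sample_size(0.50))  # 34
```

Under these assumptions the approximation reproduces the 34 participants reported above for d=0.50.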
The inclusion criteria for the recovered patients were: 1) admission to the pulmonology or infectious disease department at a COVID-19-designated hospital in Cheonan, Republic of Korea, with a confirmed COVID-19 diagnosis and reporting COVID-19-induced distress (e.g., depressive mood, anxiety, perceived stress) at discharge; 2) study volunteers reporting COVID-19-induced distress post-recovery, irrespective of hospital admission; and 3) aged 18 years or older. The HCWs were selected based on: 1) employment in COVID-19 treatment units at the same hospital with reported symptoms of COVID-19-induced stress and 2) aged 18 years or older. The exclusion criteria included pregnancy, intellectual disability or other neurological disorders, and physical conditions that hindered participation in the study. Consequently, the final study cohort comprised 43 participants, including 18 recovered patients and 25 HCWs. Participants’ characteristics are presented in Table 1.
VRS
VRS was developed using Unity 3D software (ver. 2021.3.30f1 long-term support; Unity Technologies). The participants were instructed on stabilization techniques in a virtual environment and provided with tactile temperature feedback using the HTC Vive Pro Eye head-mounted display (HTC) and a ThermoReal temperature feedback sleeve (TEGWAY Co. Ltd.). Only two parts of the sleeve were worn across both arms to reduce the effort and weight of wearing every part of the device. Professional voice actors narrated all the instructions to ensure natural comprehension.
We implemented three stabilization techniques using VR technology: the “lightstream technique,” “mindful breathing,” and “container skills” (Figure 1). The lightstream technique is effective in managing the physical discomfort and emotional distress associated with trauma. Mindful breathing focuses attention on the physical sensations of breathing, which aids in redirecting attention from traumatic memories and associated emotional or sensory distress back to the present moment. Container skills are employed to foster a sense of control and security over traumatic memories by having individuals visualize a container in which they can confine and manage distressing memories, feelings, thoughts, and sensations.
Figure 1. Rendered images of the VR-based stabilization. A: Lightstream technique: participants can adjust the color and brightness of a healing light to their personal preference early in the intervention program (left image). As they progress through the program, they can feel the approach of a warm beam of light through visual imagery and tactile temperature responses (right image). B: Mindful breathing: participants are guided through the breathing process using auditory narration and onscreen instructions, both provided in Korean (left image). The visual representation of inhalation and exhalation is intuitive and easy to follow (right image). C: Container skills: participants can select a box to contain their uncomfortable memories according to the Korean narration (left image). The process of containing and sealing memories is presented with explanations, providing visually and auditorily intuitive expressions (right image).
In the lightstream technique, participants were first instructed to visualize a healing light capable of alleviating their distressing emotions and bodily sensations and to select the color and brightness they found most soothing within the VR environment. They were then asked to recall their current unpleasant feelings and somatic sensations, imagining the shape, size, color, and temperature of these sensations as vividly as possible. Following a brief relaxation exercise, participants experienced the chosen colored light shining upon them in VR, with warmth sensations enhanced by the ThermoReal device, initially set at 25°C and gradually increasing to a maximum of 42°C. Finally, participants were invited to observe any changes in their discomfort.
For mindful breathing in VRS, a virtual environment is specifically crafted to support breathing exercises in a tranquil, natural setting. This setup not only allows participants to breathe in synchronization with a three-dimensional representation of inhalation and exhalation but also intensifies the sensory experience of the breathing technique.
Regarding container skills in VRS, participants select a container of their preferred color and material from a provided selection of options. Using a virtual interface, they place a picture frame—a visualized symbol of their disturbing memories, thoughts, and emotions—into the container and securely lock it away. This interactive process mitigates the perceived threat posed by distressing symptoms and enhances individuals’ perceived control over them.
These techniques were chosen for their efficiency in VR environments and their ability to fully leverage VR’s unique benefits compared to other stabilization techniques. These techniques employ visual, tactile, and auditory elements to enhance both immersion and the effectiveness of interventions, addressing the shortcomings of traditional training methods that depend solely on the imagination. Furthermore, beyond merely having participants passively experience visual and auditory stimuli, we sought to employ techniques that maximize the advantages of VR, including the utilization of a comprehensive range of virtual environments, tracking of body movements, and provision of customized responses.
User safety and psychological assessment
User safety during VRS sessions was verified by analyzing the total score of the 16-item Simulator Sickness Questionnaire [33]. Additionally, to evaluate and predict the effectiveness of VRS, self-reported and clinician-administered assessments were conducted for PTSS, depression, anxiety, fear of COVID-19, PTG, and QoL. Specifically, PTSS was assessed using the total score of the PTSD Checklist-5 (PCL-5) [34,35] and the symptom intensity and frequency scores from the Clinician-Administered PTSD Scale for the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (CAPS-5) [36,37]; in the following, “CAPS-5score” refers to symptom intensity, while “CAPS-5symptom” refers to symptom frequency. Depression was measured using the total scores on the Beck Depression Inventory-II (BDI-II) [38,39] and the 17-item Hamilton Depression Rating Scale (HDRS-17) [40,41]. State anxiety and fear of COVID-19 were evaluated with the state anxiety subfactor scores from the State–Trait Anxiety Inventory-Y (STAI-Y) [42,43] and the total score on the Fear of COVID-19 Scale (F-COVID-19S) [44,45]. Finally, PTG and QoL were gauged using the total scores from the Posttraumatic Growth Inventory–Expanded (PTGI-X) [46,47] and the World Health Organization Quality of Life Scale Abbreviated Version [48,49]. Further details on these measures are provided under “User safety” and “Psychological assessment” in Supplementary Material.
HRV
HRV is a comprehensive term referring to various indices derived from electrocardiogram (ECG) signals through multiple analytical methods, including time-domain, frequency-domain, and nonlinear techniques. Since fluctuations in heart rate originate from changes in the impulse activity of the sinoatrial (SA) node—and the autonomic nervous system’s cardiac regulation is one of the key factors directly influencing SA node activity—HRV has been applied in numerous fields of research as an indirect indicator of autonomic nervous system dynamics [50]. Focusing particularly on stress responses to external psychological stimuli such as trauma, numerous studies in the field of psychology examining the relationship between stress and HRV have reported significant changes in HRV indices in both healthy individuals and patients with PTSD [51-53].
HRV was measured using the SA-3000P (MEDICORE Co., Ltd.). During the HRV test, the participant sat in a chair in front of the device, the HRV measurement cable was attached, and the measurement was performed with restricted movement and conversation for approximately 3 minutes. The HRV indicators were output as an automated test result sheet, which was extracted as coded data in an Excel file and stored in a database. Two groups of HRV metrics were collected. The first group, calculated by power spectrum analysis together with normalized ratio values across frequency bands, comprised very low frequency (VLF), low frequency (LF), high frequency (HF), total power (TP), normalized LF, normalized HF, LF/HF, and log-transformed features. The second group, calculated by time series analysis, comprised AbnormalECG, the standard deviation of the NN interval (SDNN), the square root of the mean of the squared differences between adjacent NN intervals (RMSSD), the physical stress index (PSI), approximate entropy (ApEn), and the successive R-R interval (RRI) difference (SRD). A detailed description of the HRV metrics and their associated interpretations can be found elsewhere [54].
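To make two of the time-domain indices concrete, the following minimal sketch computes SDNN and RMSSD from a list of NN intervals. It follows the textbook definitions; the SA-3000P's internal implementation (for example, artifact handling or the choice of sample versus population standard deviation) is not described in the text, and the interval values are hypothetical.

```python
import math
from statistics import stdev


def sdnn(nn_ms):
    """Standard deviation of the NN intervals (sample SD, in ms)."""
    return stdev(nn_ms)


def rmssd(nn_ms):
    """Root mean square of differences between adjacent NN intervals (ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical NN intervals in milliseconds, for illustration only.
nn_intervals_ms = [812, 790, 845, 830, 801, 798, 860, 822]
print(f"SDNN  = {sdnn(nn_intervals_ms):.1f} ms")
print(f"RMSSD = {rmssd(nn_intervals_ms):.1f} ms")
```

SDNN reflects overall variability across the recording, whereas RMSSD is driven by beat-to-beat changes.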
Study design and procedure
To ensure a stable experimental environment, experimenters, limited to certified mental health and welfare workers, were instructed on preparation, procedural adherence, and participant handling per a predetermined manual and checklist. Additionally, a separate room in the hospital was set up with the help of the VRS program developers to guide the setup and use of the VR equipment and computer.
Prior to the VRS sessions, it was made clear to all the participants that they would be fully supported in overcoming any potential fears or discomfort during the VR experience. Furthermore, they were informed that they could discontinue the study at any time should they experience discomfort or dizziness during the experiment. The participants were informed of the following precautions to control for variables in the HRV measurement: 1) no smoking and no consumption of stimulants, such as caffeinated beverages, within 6 hours before the experiment; 2) no alcohol consumption the day before the experiment; and 3) at least 8 hours of sleep the day before the experiment.
Following the acquisition of informed consent from the participants, VRS was implemented weekly, encompassing five sessions over a period of 5 weeks. Demographic data were collected prior to initiating the comprehensive VR program. Both self-reported psychological scales and clinician assessments were administered at the beginning and after the completion of the entire VR program. HRV tests were conducted before the VRS sessions, while the user safety assessment was completed after the entire program concluded. The sequence of the VRS included the lightstream technique, mindful breathing, and container skills, each conducted in separate sessions. The modalities for the fourth and fifth sessions were selected by the participants, offering them autonomy in their treatment choices to potentially enhance individual effectiveness. This study was approved by the Ethical Committee of Soon Chun Hyang University Cheonan Hospital (IRB registration number: 2022-08-032).
Data analysis
To examine the effectiveness of VRS, pairwise comparisons of pre- and post-intervention outcomes were performed using a paired t-test. Beyond statistical testing, we aimed to use ML to predict changes in PTSS-related assessment indicators following the intervention compared to their pre-intervention values.
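As an illustration of the paired comparison, the sketch below computes the paired t statistic and a standardized effect size from pre- and post-intervention scores. It assumes differences are taken as pre minus post (so that a reduction gives a positive t) and uses Cohen's d for paired differences (d_z); the text does not state which variant of d was reported. The scores and the function name are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_and_dz(pre, post):
    """Return (t, degrees of freedom, d_z) for paired pre/post scores."""
    diffs = [a - b for a, b in zip(pre, post)]  # pre minus post
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)            # effect size of the differences
    t = d_z * math.sqrt(n)                      # equals mean / (sd / sqrt(n))
    return t, n - 1, d_z


# Hypothetical scores for four participants, for illustration only.
pre_scores = [10, 12, 14, 16]
post_scores = [8, 11, 11, 12]
t, df, d_z = paired_t_and_dz(pre_scores, post_scores)
print(f"t({df}) = {t:.2f}, d_z = {d_z:.2f}")
```

With this definition, t and d_z are linked by t = d_z·√n, so either can be recovered from the other when n is known.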
Subsequently, a regression prediction model was developed to estimate the changes in PTSS and depression scores. The model utilized eXtreme Gradient Boosting (XGBoost), a high-performance and widely used gradient-boosting algorithm for tree-based ML analyses [55]. To address the interpretability of the ML results from the gradient boosting and ensemble methods, SHapley Additive exPlanations (SHAP) values were used to describe the contribution of each feature to the model [56]. Additionally, the OPTUNA framework was employed for efficient hyperparameter optimization based on the Tree-structured Parzen Estimator algorithm [57]. The ML analysis was conducted using Scikit-learn 1.3.0, XGBoost 1.7.3, OPTUNA 3.4.0, and SHAP 0.42.1.
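For readers unfamiliar with SHAP, its additivity property can be written as follows. The notation is ours rather than the authors': ŷ(x) is the model prediction for a participant with features x, M is the number of features, φ_j(x) is the SHAP value of feature j for that participant, and φ_0 is the average model output over the reference data.

```latex
\hat{y}(x) = \phi_0 + \sum_{j=1}^{M} \phi_j(x),
\qquad
\phi_0 = \mathbb{E}\left[\hat{y}(X)\right]
```

Each prediction is thus decomposed into a baseline plus one signed contribution per feature, which is what allows the contribution of individual features to be read off.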
The regression model construction and evaluation process employed 36 features, including patient demographic data, clinical endpoints (self-reported and expert-rated), and HRV derivatives. The recommended optimal number of features for good ML performance is √N, where N is the number of samples in the dataset, assuming that the features are relatively highly correlated [58,59]. In this study, however, correlations among a larger number of features were examined, which led to the use of 2√N features to build the final model. Given that the analysis involved 43 participants, this required reducing the feature set to approximately 13 features through feature selection. For interpretation, we then focused on the top six most important features (√N features) for each indicator change prediction model.
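The feature budget can be checked with simple arithmetic, taking N = 43 (the number of participants, which is the value the paragraph above uses to arrive at 13). Rounding down to the nearest integer is our reading of how the counts of 6 and 13 were obtained.

```latex
\sqrt{N} = \sqrt{43} \approx 6.56 \;\Rightarrow\; 6 \text{ features},
\qquad
2\sqrt{N} = 2\sqrt{43} \approx 13.11 \;\Rightarrow\; 13 \text{ features}
```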
To apply feature selection while keeping the same model structure and hyperparameter optimization, the ML process consisted of two phases. In the first phase, an XGBoost regression model using all features was tuned within the specified hyperparameter range, and SHAP values were calculated for the best-performing model. These values determined feature importance, and up to 13 features with the highest contributions were selected. In the second phase, the best-performing model was rebuilt with the selected features under the same hyperparameter range and model configuration. This approach enabled model performance-based feature selection. Owing to the limited available data, non-nested five-fold cross-validation was employed without a separate validation dataset to maximize data utilization for model building. The performances of the best models from the second phase were evaluated by calculating the R² score on the entire dataset. Additionally, SHAP values were calculated and visualized to determine each feature’s contribution to the model and its influence on the results.
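Two steps of this procedure can be sketched without any ML library: ranking features from a matrix of SHAP values and scoring predictions with R². The sketch assumes that importance is summarized as the mean absolute SHAP value per feature, a common convention that the text does not confirm. The function names, feature names, and numbers are hypothetical, and the model fitting and hyperparameter search are omitted.

```python
def top_k_features(shap_matrix, feature_names, k):
    """Rank features by mean |SHAP| across participants and keep the top k."""
    n = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical SHAP values: 3 participants x 4 features.
names = ["PCL-5", "age", "SDNN", "HF"]
shap_values = [
    [-4.0, 0.5, 0.1, -1.0],
    [3.0, -0.2, 0.0, 1.5],
    [-2.0, 0.1, -0.3, 0.5],
]
print(top_k_features(shap_values, names, k=2))  # ['PCL-5', 'HF']

# Hypothetical observed and predicted score changes (post minus pre).
observed = [-10, -4, 0, -6]
predicted = [-8, -5, -1, -6]
print(round(r2_score(observed, predicted), 3))  # 0.885
```

In the study's setting, k would be 13 for model building and 6 for interpretation. Because R² is computed here on the same data used to build the model, it describes fit rather than out-of-sample performance.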
RESULTS
User safety and effectiveness of VRS for improvement in post-intervention outcomes
Cybersickness scores assessing user safety averaged 3.28 (standard deviation=3.47). The results for effectiveness of VRS revealed significant improvements across all measured outcomes with effect sizes ranging from medium to large (Table 2). Specifically, reductions in PTSS scores, assessed using the CAPS-5score, were significant (t(42)=6.01, p<0.001, d=0.92), alongside decreases in CAPS-5symptom (t(42)=5.96, p<0.001, d=0.91) and PCL-5 scores (t(42)=4.42, p<0.001, d=0.68). Regarding the secondary outcomes, depression levels, measured by the HDRS-17 and BDI-II, showed significant reductions (HDRS-17: t(42)=4.86, p<0.001, d=0.74; BDI-II: t(42)=4.85, p<0.001, d=0.74), as did state anxiety scores (t(42)=4.64, p<0.001, d=0.71) and scores indicating fear of COVID-19 (t(42)=2.49, p=0.017, d=0.38). Furthermore, significant increases were observed in PTG scores (t(42)=-4.44, p<0.001, d=0.94) and QoL scores (t(42)=-6.19, p<0.001, d=0.68).
ML analysis
Table 3 presents the metrics that showed significant performance among the ML predictions. Models predicting changes in the CAPS-5score, CAPS-5symptom, PCL-5, HDRS-17, and BDI-II scores performed well in the R² scores based on the full dataset (R² score [total]). In particular, the R² scores (total) of the regression models for the PCL-5, CAPS-5score, and CAPS-5symptom were 0.525, 0.723, and 0.686, respectively. The R² scores (total) of the BDI-II and HDRS-17 were 0.414 and 0.588, respectively.
The results of the SHAP analysis for interpreting the predictive models are as follows. For the PCL-5 score changes, a higher pre-intervention PCL-5 score, greater F-COVID-19S score, and higher PTGI-X score, together with lower SRD, HF, and RMSSD HRV metrics, predicted greater intervention effectiveness. For the CAPS-5score changes, higher pre-intervention CAPS-5score, CAPS-5symptom, and BDI-II scores and VLF and TP HRV metrics predicted greater intervention effectiveness. These results indicated that lower HF HRV contributed to the prediction of smaller intervention effects, although the results were inconsistent. For changes in CAPS-5symptom scores, higher pre-intervention CAPS-5score, CAPS-5symptom, STAI-Y, and F-COVID-19S scores and VLF and TP HRV metrics were predictive of greater intervention effectiveness. For BDI-II score changes, higher pre-intervention questionnaire scores, including BDI-II, PCL-5, and HDRS-17, and HRV metrics, such as LF and SRD, were identified as the most influential features for greater intervention effectiveness. For HDRS-17 score changes, higher pre-intervention HDRS-17, PCL-5, PTGI-X, and F-COVID-19S scores and the ApEn HRV metric were identified as predictors of greater intervention effectiveness. Additionally, younger age contributed to the prediction of greater intervention effectiveness (Figures 2 and 3).
Figure 2. SHAP value beeswarm plots for the PTSS metrics with the high-performing models in predicting pre- vs. post-treatment change. Each horizontal row represents the distribution of actual pre-intervention values for that feature. The values increase as they approach red and decrease as they approach blue, following the vertical color scheme on the right. The bottom horizontal axis shows that the amount of change (post-intervention value) decreases as the values move left toward the negative. If the red dot for feature “A” is left of 0 on the horizontal axis, it indicates that a higher value of A predicts a decrease in the outcome in the regression model. SHAP, SHapley Additive exPlanations; PTSS, posttraumatic stress symptoms; PCL-5, PTSD Checklist-5; PTSD, posttraumatic stress disorder; RRI, R-R interval; HRV, heart rate variability; RMSSD, square root of the mean of the sum of the square of differences between adjacent NN intervals; CAPS-5, Clinician-Administered PTSD Scale for DSM-5; DSM-5, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition; BDI-II, Beck Depression Inventory-II; Δ, (post-intervention score)-(pre-intervention score).
Figure 3. SHAP value beeswarm plots for the depression metrics with the high-performing models in predicting pre- vs. post-treatment change. Each horizontal row represents the distribution of actual pre-intervention values for that feature. The values increase as they approach red and decrease as they approach blue, following the vertical color scheme on the right. The bottom horizontal axis shows that the amount of change (post-intervention value) decreases as values move left toward the negative. If the red dot for feature “A” is left of 0 on the horizontal axis, it indicates that a higher value of A predicts a decrease in the outcome in the regression model. BDI-II, Beck Depression Inventory-II; PTSS, posttraumatic stress symptoms; PCL-5, PTSD Checklist-5; PTSD, posttraumatic stress disorder; RRI, R-R interval; HRV, heart rate variability; HDRS-17, 17-item Hamilton Depression Rating Scale; SHAP, SHapley Additive exPlanations; Δ, (post-intervention score)-(pre-intervention score).
DISCUSSION
We aimed to assess the safety and effectiveness of VRS in mitigating PTSS and related comorbidities in patients with respiratory infectious diseases and HCWs. Furthermore, the effectiveness of the intervention after five VRS sessions was predicted using ML methodologies based on pre-intervention sociodemographic and clinical baseline variables. The average cybersickness score, used to assess VRS safety, was comparable to those reported in other VR-based programs, indicating that the VRS program is as safe as existing programs [60,61]. Participants showed significant improvements in PTSS, depression, anxiety, fear of COVID-19, PTG, and QoL, with medium-to-large effect sizes through VRS, aligning with the literature on stabilization techniques for PTSD and related comorbidities [8,62]. These findings suggest that the effectiveness of VRS is comparable to that of traditional face-to-face therapy and highlight its potential as an effective noncontact therapeutic tool during pandemics.
ML-based prediction of intervention outcomes is contingent on the distinct attributes of individuals. We found that initial symptom severity significantly predicted treatment effectiveness in the ML model. However, the relationship between initial symptom severity and treatment outcomes remains complex, with previous studies yielding inconsistent results [63]. Meta-analysis research suggests that a higher initial symptom intensity predicts better treatment response [64], while other studies link it to reduced benefits [65,66]. Consistent with meta-analytic findings, our results revealed that participants with higher initial levels of PTSS, depression, anxiety, and COVID-19-related fear had significantly improved PTSS and depressive symptoms after VRS. One potential explanation for this finding is that increased psychological distress due to heightened symptom severity may enhance motivation for symptom improvement [24,67].
However, the mixed outcomes in predicting treatment effectiveness across studies suggest the need to consider additional factors alongside symptom severity. Intrapersonal factors, such as the extent of functional impairment, symptom chronicity, and the risk of self-harm or harm to others, may influence therapeutic outcomes, with lower levels of these factors generally associated with more favorable results [63,66]. Furthermore, external factors, particularly the nature of the therapeutic intervention, warrant careful consideration. For instance, emotional distress such as depression and anxiety in patients with trauma may reduce the effectiveness of exposure-based therapies [65,68]. This reduction is attributed to deficits in emotion regulation and coping skills, potentially establishing a negative correlation between symptom severity and treatment effectiveness in certain therapeutic contexts.
The interplay between symptom severity, intrinsic and extrinsic factors, and treatment motivation appears to significantly impact intervention outcomes. It is important to note that the participants in this study did not have severe or chronic mental disorders requiring hospitalization, nor did they demonstrate significant functional impairments in daily life. Additionally, VRS is primarily designed to enhance emotional stabilization with the aim of alleviating acute stress and negative affect. Considering these contextual factors, it is plausible that individuals with more severe symptoms may derive greater benefits from VRS.
Similar to the positive relationship between symptom severity and intervention effectiveness, higher PTG levels before the intervention predicted more positive outcomes in alleviating PTSS and depression. These results support previous research on the role of PTG in mitigating the prevalence of PTSS after disasters [69] and its moderating effect on PTSS and depression [70]. PTG is associated with finding meaning in negative events and with positive coping mechanisms, such as reframing and active coping [71,72]. Consequently, our results suggest that VRS may enhance reductions in PTSS and depression by interacting synergistically with the adaptive characteristics inherent to PTG.
SHAP analysis also revealed the distinct impacts of various HRV indicators in the ML model. Higher pre-intervention values of VLF, LF, SRD, and ApEn predicted a greater reduction in post-intervention PTSS and depressive symptoms, suggesting that individuals with elevated baseline HRV levels might benefit more from VRS. This relationship between HRV levels and intervention outcomes may be attributed to the association of high HRV with enhanced parasympathetic nervous activity and emotion regulation abilities, which, in turn, may boost intervention effectiveness [73,74]. Previous research supports this, showing that HRV biofeedback, which regulates the ANS, can improve outcomes in psychotherapy for major depressive disorder [75] and that higher vagally mediated HRV at baseline is predictive of lower symptom severity following short-term psychotherapy [76].
In contrast, HF and RMSSD had smaller SHAP values and showed inconsistent trends. While the ML model generally revealed that higher pre-intervention HF and RMSSD values correlated with better treatment outcomes, some participants exhibited the opposite pattern, which reduced the overall SHAP importance of these indices. Despite this, HF and RMSSD remain relevant, as previous research has shown clear differences between patients with PTSD and healthy individuals for these indices [53]. Further studies are required to validate the roles of HF and RMSSD.
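To illustrate how SHAP attributes a prediction to individual baseline features, the following minimal sketch computes exact Shapley values for a toy linear model. It is not the model used in this study: the feature names, weights, and data are hypothetical, and features left out of a coalition are replaced by their background mean, a simplification that assumes independent features.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model: predicted symptom reduction as a linear function
# of three baseline features. Weights are illustrative, not from the study.
WEIGHTS = {"baseline_ptss": 0.5, "lf_power": 0.3, "age": -0.2}
INTERCEPT = 1.0


def predict(x):
    """Return the toy model's prediction for one feature dictionary."""
    return INTERCEPT + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)


def shapley_values(x, background):
    """Exact Shapley values for one participant `x`.

    Features outside a coalition are set to their background mean
    (an assumption of independent features).
    """
    names = list(WEIGHTS)
    n = len(names)
    means = {k: sum(row[k] for row in background) / len(background) for k in names}

    def value(coalition):
        # Model output when only the features in `coalition` are "known".
        mixed = {k: (x[k] if k in coalition else means[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                base = set(subset)
                total += weight * (value(base | {name}) - value(base))
        phi[name] = total
    return phi


# Hypothetical background sample and participant.
background = [
    {"baseline_ptss": 10.0, "lf_power": 4.0, "age": 30.0},
    {"baseline_ptss": 20.0, "lf_power": 6.0, "age": 50.0},
]
participant = {"baseline_ptss": 25.0, "lf_power": 7.0, "age": 28.0}

phi = shapley_values(participant, background)
background_mean = {k: sum(r[k] for r in background) / len(background) for k in WEIGHTS}
baseline_prediction = predict(background_mean)
```

In this sketch the attributions sum to the difference between the participant's prediction and the prediction at the background mean. A feature whose contribution changes sign across participants, as described for HF and RMSSD, would have a lower mean absolute SHAP value even if its effect were large for some individuals.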
Among the demographic variables, only age significantly predicted intervention effects on clinician-rated depressive symptoms. Younger age indicated better outcomes with VRS, aligning with trends showing that young adults benefit more from psychotherapy than middle-aged adults [77]. This may be because older adults are less likely to modify emotion regulation strategies in response to negative emotions [78,79]. Consequently, older individuals may persist in using established and potentially maladaptive emotion regulation strategies. Given the negative relationship between variability in emotion regulation strategies and depressive symptoms [80], the effectiveness of VRS in enhancing emotion regulation may diminish with age, making symptom alleviation more pronounced in younger individuals.
Our ML results suggest that younger individuals with higher trauma-related symptom severity, PTG, and HRV metrics benefit more from VRS. However, it seems paradoxical that individuals with both high HRV and high symptom severity derive significant benefits from VRS. Meta-analyses and other studies regarding HRV in PTSD have consistently reported lower HRV in individuals with PTSD than in healthy controls [53,81,82], indicating a negative correlation between HRV and symptom severity. This discrepancy might be due to differences in sample characteristics across studies. Unlike typical patients with PTSD, the participants in our study did not have severe or chronic symptoms requiring hospitalization and could manage their daily activities, indicating minimal functional impairment. Therefore, even if participants reported severe symptoms or were clinically assessed as having them, their HRV function may still have been relatively intact. This finding suggests that patients with acute or mild-to-moderate trauma might benefit more from VRS than those with severe or chronic PTSD.
Our study had several limitations. First, although our sample size was adequate for evaluating the effectiveness of VRS, the absence of a control group and follow-up sessions made it challenging to determine the relative differences and long-term durability of the effects of VRS. Second, more extensive studies are required to generalize our predictive models to diverse PTSD patients. Third, the predominance of female participants highlights the need for a more balanced gender distribution in future research. Fourth, we used a limited set of predictors including demographics, psychological assessments, and HRV. Including genetic and neuroimaging data could improve the precision of ML-based predictions [23]. Fifth, in this study, two potentially distinct groups (COVID-19 patient group and HCWs) were combined into a single trauma-experienced group based on their shared exposure to COVID-19. Therefore, differences in treatment responses that may arise from the distinct clinical characteristics of each group were not examined, and future studies may need to conduct additional group-based analyses to account for these differences. Finally, the ML models are based on a relatively small dataset, which raises concerns about overfitting and limits the generalizability of the models to larger and more diverse populations. Although we applied parameter optimization and model interpretability techniques to aid in understanding the ML results, these methods do not fundamentally resolve the underlying limitations of small-sample modeling.
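The overfitting concern in small samples can be made concrete by comparing in-sample fit with held-out fit. The sketch below is not the study's pipeline: it uses synthetic data, a single hypothetical predictor, and leave-one-out cross-validation chosen here purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, preds):
    """Coefficient of determination relative to the mean of `ys`."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def loocv_predictions(xs, ys):
    """Predict each observation from a model fitted without it."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


# Synthetic example: baseline symptom score vs. symptom reduction.
baseline = [12.0, 18.0, 25.0, 30.0, 34.0, 41.0, 47.0, 52.0]
reduction = [2.0, 5.0, 4.0, 9.0, 8.0, 13.0, 11.0, 16.0]

a, b = fit_line(baseline, reduction)
in_sample_r2 = r_squared(reduction, [a + b * x for x in baseline])
loocv_r2 = r_squared(reduction, loocv_predictions(baseline, reduction))
```

For ordinary least squares, the held-out R2 computed this way cannot exceed the in-sample R2, and the gap tends to widen as the sample shrinks or the number of predictors grows, which is why estimates from small datasets should be interpreted cautiously.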
Despite these limitations, this study has several strengths. To our knowledge, it is the first to develop and evaluate a VRS program targeting groups vulnerable to mental health problems during pandemics. The study demonstrates the effectiveness of VRS, providing valuable insights into its potential role in addressing mental health problems during future pandemics. This opens up the possibility of combining VRS with existing VR-based exposure therapy programs to deliver remote and evidence-based treatments for PTSD [83]. However, this remains a prospective assertion, necessitating further investigation into the combined effectiveness of VRS and other evidence-based therapies. Furthermore, to our knowledge, this is the first study to use ML to predict the effectiveness of VR programs for PTSS and related comorbidities. Our results indicate that specific characteristics, identifiable through questionnaires and HRV measures, could predict who would benefit more from VRS. This insight might streamline clinical practice, making the selection of therapeutic interventions more efficient and reducing treatment time and costs.
In conclusion, the present study aimed to evaluate the intervention effects of VRS on individuals with PTSS and related comorbidities, as well as to predict these effects using ML techniques. Our findings demonstrate that VRS was effective in ameliorating PTSS and associated comorbidities. Moreover, we observed that the intervention effectiveness of VRS was more pronounced in younger individuals and those with higher pre-intervention symptom levels, PTG, and HRV, suggesting that VRS may be particularly beneficial for individuals with acute or mild-to-moderate trauma triggered by events such as pandemics. Consequently, this study not only establishes the potential of VRS as an effective tool for addressing mental health challenges in future pandemics but also identifies specific individual characteristics that may predict better outcomes in VR-based interventions for PTSS.
Supplementary Materials
The Supplement is available with this article at https://doi.org/10.30773/pi.2025.0168.
Notes
Availability of Data and Material
The clinical data from this study will not be shared as a public dataset in order to protect patient privacy. For inquiries regarding the data, please contact the corresponding author.
Conflicts of Interest
The authors have no potential conflicts of interest to disclose.
Author Contributions
Conceptualization: Bin-Na Kim, Ji Sun Kim, Sungkean Kim, Kibum Kim. Data curation: Ji Sun Kim. Formal analysis: Euijin Kim, Sungkean Kim, Kibum Kim, Yongmin Shin. Funding acquisition: Bin-Na Kim, Ji Sun Kim, Sungkean Kim, Kibum Kim. Methodology: Sungkean Kim, Kibum Kim, Euijin Kim. Visualization: Yongmin Shin, Euijin Kim. Writing—original draft: Yongmin Shin, Euijin Kim. Writing—review & editing: Bin-Na Kim, Sungkean Kim, Yongmin Shin, Euijin Kim.
Funding Statement
This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute, funded by the Ministry of Health and Welfare, Republic of Korea (grant number: RS-2022-KH125605), and the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports, and Tourism in 2023 (RS-2023-00224524).
Acknowledgments
None
