Psychiatry Investig > Volume 22(8); 2025 > Article
Kim, Yang, Dong, and Lee: Enhancing Electroencephalogram-Based Prediction of Posttraumatic Stress Disorder Treatment Response Using Data Augmentation

Abstract

Objective

This study aimed to improve the prediction of treatment response in patients with posttraumatic stress disorder (PTSD) by applying a variational autoencoder (VAE)-based data augmentation (DA) approach to electroencephalogram (EEG) data.

Methods

EEG spectrograms were computed from resting-state recordings of patients diagnosed with PTSD. A VAE model was pretrained on the original spectrograms and used to generate augmented samples. The augmented spectrograms, together with the originals, were then used to train a deep neural network (DNN) classifier. Performance was evaluated by comparing the area under the receiver operating characteristic curve (AUC) between models trained with and without DA.

Results

The DNN trained with VAE-augmented EEG data achieved an AUC of 0.85 in predicting treatment response, which was 0.11 higher than that of the model trained without augmentation (0.74). This indicates a substantial improvement in classification performance and model generalization.

Conclusion

VAE-based DA effectively addresses the challenge of limited EEG data in clinical settings and enhances the performance of DNN models for treatment response prediction in PTSD. This approach presents a promising direction for future EEG-based neuropsychiatric research involving small datasets.

INTRODUCTION

An electroencephalogram (EEG) is a non-invasive electrophysiological method that measures brain activity via electrodes attached to the scalp. It is widely used to detect abnormal electrical signals for diagnosing neurological disorders such as epilepsy, sleep disorders, and brain death [1-4]. In neuropsychiatry, EEG also serves as a valuable biomarker to detect and classify brain dysfunctions in conditions such as posttraumatic stress disorder (PTSD) [5], schizophrenia [6], major depressive disorder [7], and Alzheimer’s disease [8].
While EEG holds great promise for identifying neuropsychiatric disorders, predicting treatment response remains a significant clinical challenge. Current treatments often yield highly individualized outcomes [9], and patients may undergo prolonged interventions with uncertain effectiveness. If clinicians could predict treatment responsiveness in advance, it would enable the development of personalized therapeutic strategies and reduce unnecessary clinical burden.
Machine learning (ML) techniques have been employed in predictive modeling for treatment outcomes in psychiatric disorders [10]. With recent advances in artificial intelligence, deep neural networks (DNNs) have emerged as powerful tools for EEG analysis [11]. However, DNNs require large, well-distributed datasets for effective training, which is difficult to achieve in clinical settings due to patient recruitment challenges and cost-intensive data collection processes. As a result, models trained on scarce clinical EEG data are prone to overfitting and weak generalization performance.
To overcome these limitations, data augmentation (DA) techniques have been introduced in EEG-based deep learning studies to increase data diversity and model robustness [12]. Conventional DA methods, including geometric and photometric transformations, have been commonly used. More recently, generative approaches using models such as generative adversarial networks (GANs) and autoencoders have demonstrated improved generalization in EEG classification tasks [13-17]. CNN-based models such as EEGNet [18], Deep ConvNet, and hybrid CNN-LSTM architectures [19-23] have also shown high classification accuracy for tasks like epilepsy detection and user authentication. However, despite these advances, variational autoencoder (VAE)-based augmentation has been underutilized in EEG classification, particularly for spectrogram representations and treatment response prediction.
In this study, we propose a novel EEG DA approach using a VAE [24] to address the issue of limited data in predicting treatment response for PTSD. Our method involves generating synthetic EEG spectrograms via a pretrained VAE, which are then used to train a DNN classifier. By improving generalization performance and reducing overfitting, this approach aims to enhance the accuracy and clinical utility of EEG-based prediction models. We also evaluate the robustness of our framework through subject-wise cross-validation to demonstrate its applicability in real-world neuropsychiatric settings.

METHODS

EEG data acquisition and preprocessing

Resting-state EEG recordings were obtained from 48 patients diagnosed with PTSD, both before and after transcranial direct current stimulation (tDCS) treatment. The recordings were collected using a 62-channel EEG system. Electrooculogram (HEO/VEO) and electrocardiogram (EKG) channels were excluded from the analysis. Preprocessing was conducted using EEGLAB [25] and MATLAB R2020a (MathWorks). The raw EEG signals were referenced to the average reference and bandpass filtered from 1 to 50 Hz using a Butterworth filter. Independent component analysis was performed to identify and remove artifacts. Subsequently, noisy segments were visually inspected and rejected. For each subject, a clean 150-second EEG segment was retained for further analysis.

Definition of treatment response

Subjects were classified into responders and non-responders based on symptom changes measured by the Clinician-Administered PTSD Scale for DSM-5 (CAPS-5) [17]. Classification was based on two criteria: the total symptom severity score and the total number of PTSD-related symptoms. Patients demonstrating a 50% or greater reduction in both metrics post-treatment were defined as responders. This threshold is commonly used in clinical PTSD research to reflect clinically meaningful improvement [26]. As a result, 17 patients were classified as responders and 31 as non-responders.
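The responder rule described above can be expressed as a small check. The following is a minimal sketch rather than the authors' code: the function names and the example scores are hypothetical, and it assumes both CAPS-5 metrics are positive at baseline.

```python
def percent_reduction(pre: float, post: float) -> float:
    """Percentage drop from the pre-treatment to the post-treatment value."""
    if pre <= 0:
        raise ValueError("baseline value must be positive")
    return (pre - post) / pre * 100.0


def is_responder(pre_severity, post_severity, pre_count, post_count, threshold=50.0):
    """True only if BOTH metrics fall by at least `threshold` percent."""
    return (
        percent_reduction(pre_severity, post_severity) >= threshold
        and percent_reduction(pre_count, post_count) >= threshold
    )


# Hypothetical patients (values invented for illustration only).
patient_a = is_responder(pre_severity=40, post_severity=18, pre_count=12, post_count=6)
patient_b = is_responder(pre_severity=40, post_severity=18, pre_count=12, post_count=7)
```

In this illustration, patient_a meets the criterion on both metrics, whereas patient_b falls short on the symptom count and would be labeled a non-responder.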

EEG spectrogram construction

EEG signals were segmented into 1-second epochs using a 50% overlap. Each segment was windowed with a Hamming window, and power spectra were computed using fast Fourier transform. The resulting spectrograms were stored as 3D arrays with dimensions [subject (48)×frequency (30)×time (299)].
The structure of the extracted spectrogram dataset is illustrated in Figure 1.
Based on prior findings [27], only the CZ and O1 channels were used in this study due to their superior discriminative power for treatment response prediction. CZ, located at the vertex, reflects global brain activity, while O1 over the occipital lobe is sensitive to visual and sensory processing.
Spectrograms derived from pre-treatment EEG were used to predict treatment outcomes. The final model input consisted of concatenated spectrograms from CZ and O1, providing both cognitive and affective signal features relevant to PTSD.
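The spectrogram construction described in this section can be sketched with NumPy. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the sampling rate (250 Hz), the retained band (1 to 30 Hz, chosen to yield 30 bins), and the concatenation of the two channels along the frequency axis are assumptions, and random noise stands in for real EEG.

```python
import numpy as np


def spectrogram(signal, fs, win_sec=1.0, overlap=0.5, f_lo=1.0, f_hi=30.0):
    """Power spectrogram from overlapping, Hamming-windowed epochs.

    Returns an array of shape (n_freq_bins, n_epochs).
    """
    win = int(round(win_sec * fs))            # samples per epoch
    hop = int(round(win * (1.0 - overlap)))   # step between epoch starts
    n_epochs = (len(signal) - win) // hop + 1
    window = np.hamming(win)
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    # Small tolerance guards against floating-point error at the band edges.
    keep = (freqs >= f_lo - 1e-9) & (freqs <= f_hi + 1e-9)
    out = np.empty((int(keep.sum()), n_epochs))
    for i in range(n_epochs):
        epoch = signal[i * hop : i * hop + win] * window
        power = np.abs(np.fft.rfft(epoch)) ** 2
        out[:, i] = power[keep]
    return out


FS = 250          # assumed sampling rate in Hz (not given in the text)
DURATION = 150    # seconds of clean EEG per subject (from the text)

rng = np.random.default_rng(0)
cz = rng.standard_normal(FS * DURATION)   # stand-in for the CZ channel
o1 = rng.standard_normal(FS * DURATION)   # stand-in for the O1 channel

spec_cz = spectrogram(cz, FS)
spec_o1 = spectrogram(o1, FS)
# Assumption: the two channels are concatenated along the frequency axis.
model_input = np.concatenate([spec_cz, spec_o1], axis=0)
```

With 1-second epochs and 50% overlap, a 150-second segment yields (150 - 1) / 0.5 + 1 = 299 epochs, which matches the time dimension reported for the dataset.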

DA using VAE

Given the limited sample size, a VAE-based DA strategy was applied to enhance generalizability and mitigate overfitting. VAEs are generative models capable of learning latent feature representations and generating new samples from the same distribution as the original data [24].
In this study, the pretrained VAE was used to synthesize additional EEG spectrograms for each subject. The VAE encoder maps each input spectrogram to the parameters of a Gaussian distribution over a latent vector z; a value of z is sampled from this distribution and passed through the decoder to reconstruct a spectrogram. The augmented samples were used to expand the training set, effectively doubling the original data size.
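The latent sampling step can be illustrated with the reparameterization trick from the original VAE formulation [24]. The sketch below is not the authors' network: it assumes the encoder outputs a mean and a log-variance per latent dimension, uses a two-dimensional latent space (two latent nodes are reported in the Results), and omits the encoder and decoder themselves. The numeric values are hypothetical.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)


rng = np.random.default_rng(0)
mu = np.array([[0.3, -1.2]])        # hypothetical encoder output (mean)
log_var = np.array([[-0.5, 0.1]])   # hypothetical encoder output (log-variance)

z = reparameterize(mu, log_var, rng)     # one latent sample per input
kl = kl_to_standard_normal(mu, log_var)  # regularization term of the VAE loss
```

Under this scheme, decoding several draws of z for the same input would yield several synthetic spectrograms per subject, which is one plausible way to obtain the augmented samples.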
The overall structure of the dataset and augmentation process is summarized in Table 1.

Classification framework and evaluation

The augmented spectrogram data, along with the original samples, were used to train a DNN based on EEGNet. Hyperparameters such as learning rate, batch size, and number of training epochs were optimized through grid search using a validation split from the training set in each fold. This tuning aimed to balance model complexity and generalization performance. To ensure robust evaluation, subject-wise cross-validation was used. In each fold, all data from a single subject were withheld for testing, preventing data leakage and enhancing generalizability to unseen individuals. This approach mimics real-world clinical scenarios where models are applied to new patients and ensures independence between training and testing sets.
The overall classification framework is illustrated in Figure 2.
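The subject-wise splitting described above can be sketched in a few lines. This is a minimal illustration, assuming one fold per subject; the function name and the subject identifiers are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical sample-to-subject mapping; a subject may contribute several samples.
subject_ids = ["s01", "s01", "s02", "s03", "s03", "s03"]
folds = list(leave_one_subject_out(subject_ids))
```

Because the split is made on subject identifiers rather than on individual samples, no sample from the held-out subject can appear in the training indices of that fold.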

Ethical approval

All participants provided written informed consent. The study protocol was approved by the Institutional Review Board of Inje University, Ilsan Paik Hospital (IRB No. 2015-07-025), and was conducted in accordance with the principles of the Declaration of Helsinki.

RESULTS

Performance comparison with other DA methods

To evaluate the effect of DA on EEG spectrogram classification, we applied a VAE-based DA method and compared it with other commonly used approaches, including noise injection (NI) and time-segmentation. DA was applied only to the training dataset, while the raw, unaugmented data were used for testing. Following augmentation, the size of the training set was doubled, resulting in a total of 78 samples.
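Restricting augmentation to the training set can be sketched as follows, using Gaussian noise injection as the example method. This is an illustration under stated assumptions, not the authors' implementation: the noise scale (`rel_sigma`), the array shapes, and the function names are hypothetical, and the figure of 39 training samples is only inferred from the statement that doubling produced 78 samples.

```python
import numpy as np


def noise_injection(x, rng, rel_sigma=0.05):
    """Add zero-mean Gaussian noise scaled to a fraction of the data's std."""
    sigma = rel_sigma * x.std()
    return x + rng.normal(0.0, sigma, size=x.shape)


def augment_training_set(x_train, y_train, rng, n_copies=1):
    """Append `n_copies` noisy copies of the training data; labels are reused."""
    xs, ys = [x_train], [y_train]
    for _ in range(n_copies):
        xs.append(noise_injection(x_train, rng))
        ys.append(y_train)
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)


rng = np.random.default_rng(0)
# Small stand-in arrays; 39 training samples is an inferred assumption.
x_train = rng.random((39, 6, 10))
y_train = rng.integers(0, 2, size=39)
x_test = rng.random((9, 6, 10))      # test data are never passed to the augmenter

x_aug, y_aug = augment_training_set(x_train, y_train, rng, n_copies=1)
```

Setting `n_copies` to a larger value gives a simple way to vary the augmentation factor while leaving the test data unchanged.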
The VAE model was configured with a 2D CNN-based encoder and decoder. Training was performed for up to 500 epochs with a batch size of 32, kernel size of 5, filter size of 16, two latent nodes, and a learning rate of 0.001. Hyperparameters were selected based on prior studies [18,24] and empirical tuning, and optimization was conducted using the Adam optimizer.
The classification model trained with the VAE-augmented dataset achieved an area under the receiver operating characteristic curve (AUC) of 0.81±0.16, with sensitivity of 0.48±0.23, specificity of 0.88±0.15, and balanced accuracy of 68.2%±15.1%. This performance was compared against three other augmentation settings, as summarized in Table 2.
Training performance and validation trends for each DA method are illustrated in Figure 3.
Although the NI-based model demonstrated slightly higher training accuracy and lower training loss, the VAE-based model achieved better validation performance, indicating improved generalization. In contrast, NI-based models showed limited improvement in validation accuracy after early epochs, suggesting possible overfitting.
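For reference, the metrics reported in this section can be computed as in the sketch below. The labels and scores are invented for illustration (1 denotes a responder), and the AUC is obtained from the standard rank-based definition: the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity, balanced accuracy) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec, (sens + spec) / 2.0


def auc(y_true, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels, classifier scores, and thresholded predictions.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec, bal_acc = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Balanced accuracy averages sensitivity and specificity, so it is less affected by the imbalance between responders and non-responders than plain accuracy.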

Performance with N-fold VAE-based augmentation

To determine the optimal augmentation scale, we empirically evaluated model performance using different VAE-based augmentation factors (N). As shown in Figure 4, performance improved with increased augmentation, reaching the best results at 45-fold augmentation: AUC of 0.85, sensitivity of 0.85±0.08, specificity of 0.81±0.17, and balanced accuracy of 64.9%±9.4%.
Beyond 45-fold augmentation, performance declined, indicating a threshold beyond which additional augmentation no longer improves and may degrade performance.
To ensure model generalizability, subject-wise cross-validation was employed, where each test fold included data from previously unseen individuals.

DISCUSSION

Treatment response prediction

In comparison to conventional approaches such as clinical interviews or neuroimaging, our EEG-based classification framework offers several advantages for predicting PTSD treatment response. Clinical assessments like the CAPS-5 are subject to interviewer bias and patient self-report limitations, while neuroimaging methods such as fMRI or PET, although informative, are expensive and logistically demanding.
In contrast, EEG is non-invasive, cost-effective, and offers high temporal resolution, making it well-suited for clinical environments.
Our study demonstrated that combining EEG with deep learning and VAE-based DA enables reliable prediction even with limited sample sizes, especially when using subject-wise cross-validation to ensure generalizability. This approach provides a scalable and objective tool that can complement traditional assessments in identifying likely responders at an early stage.
However, individual variability in EEG patterns remains a limitation. Differences in baseline neural activity may affect model robustness. Future work should incorporate longitudinal EEG data (pre-, mid-, and post-treatment) to capture treatment-induced changes and further reduce inter-individual variability. Testing the framework in more diverse populations will also enhance its clinical applicability.

Limitations

Our results suggest that EEG-based classification, enhanced by VAE augmentation, is a viable approach for predicting PTSD treatment response despite a relatively small dataset. The VAE-augmented model showed improved generalization performance, supported by subject-wise cross-validation.
However, some limitations remain. First, the model exhibited higher specificity than sensitivity, raising concerns about false negatives in clinical use. This imbalance suggests that responders may be misclassified, which is problematic in treatment planning. Future studies should explore decision threshold adjustments or architectural modifications to improve sensitivity. Including metrics such as the F1 score may also provide a more balanced performance evaluation. Second, although hyperparameter tuning was performed, details were not explicitly presented. To avoid data leakage, it is important to ensure that tuning is restricted to the training set. Third, while qualitative comparisons were provided, direct benchmarking against alternative generative models—such as GANs, diffusion models, or transformer-based architectures—was not conducted. Future research should include such comparisons to better contextualize model performance. Fourth, although performance peaked at a 45-fold augmentation, the effect of increasing augmentation scale on overfitting was not systematically analyzed. Further studies should examine whether high augmentation levels risk introducing non-physiological artifacts. Finally, the sample was limited to PTSD patients, restricting generalizability. Expanding to more diverse diagnostic groups and increasing the sample size will be essential for broader validation.
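The decision threshold adjustment and F1 score mentioned in the first limitation can be illustrated as follows. This is a hypothetical sketch, not an analysis performed in the study: the labels and scores are invented, and in practice the threshold would need to be chosen on training or validation data only.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = responder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


def threshold_for_sensitivity(y_true, scores, target):
    """Highest score threshold whose sensitivity reaches `target`."""
    n_pos = sum(1 for t in y_true if t == 1)
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
        if tp / n_pos >= target:
            return thr
    raise ValueError("target sensitivity must be at most 1.0")


# Hypothetical validation labels and classifier scores.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.6]

default_pred = [1 if s >= 0.5 else 0 for s in scores]
thr = threshold_for_sensitivity(y_true, scores, target=1.0)
adjusted_pred = [1 if s >= thr else 0 for s in scores]

f1_default = f1_score(y_true, default_pred)
f1_adjusted = f1_score(y_true, adjusted_pred)
```

Lowering the threshold trades specificity for sensitivity, so any such adjustment would have to be weighed against the cost of additional false positives.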

Response threshold selection

A 50% reduction in PTSD symptoms was used to define responders, consistent with clinical standards and prior studies [26,27]. This threshold reflects clinically meaningful improvement and allows comparability with existing literature.
While we did not perform formal sensitivity analysis on the threshold, future work should explore how altering the threshold (e.g., 40% or 60%) impacts classification outcomes and whether subgroup-specific thresholds enhance model personalization.

Clinical validity of augmented data

Validating the clinical plausibility of the VAE-generated data is an important next step. We propose two complementary approaches: 1) qualitative expert review by neurologists to assess the physiological realism of generated EEG spectrograms and 2) the use of discriminator-based validation techniques, similar to those used in GAN frameworks [15,16], to detect non-physiological artifacts.
Although our results suggest that VAE-augmented data contributed meaningfully to model performance, no explicit validation was performed to confirm the integrity of the generated signals. Future studies should incorporate expert evaluations and statistical similarity assessments, as well as visual comparisons between real and synthetic spectrograms, to verify the fidelity of the augmentation process.
In conclusion, deep learning applications in clinical neuropsychiatry are often constrained by limited sample sizes and class imbalances, which can hinder model generalization. To address these challenges, this study proposed a VAE-based DA strategy for EEG spectrogram classification to predict PTSD treatment response. By generating synthetic data intended to follow the distribution of the original EEG signals, the proposed method improved the model’s generalization performance. When trained on the VAE-augmented dataset, the classification model achieved an AUC of 0.85 in the delta band, an improvement of approximately 0.11 over models trained without augmentation. These findings highlight VAE-based augmentation as an effective strategy to enhance deep learning performance in clinical EEG analysis, especially when working with small and imbalanced datasets. This approach may serve as a practical tool for improving early prediction of treatment outcomes in PTSD and potentially other psychiatric disorders.

Notes

Availability of Data and Material

The datasets generated or analyzed during the study are not publicly available due to patient privacy concerns and institutional data protection policies, but are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors have no potential conflicts of interest to disclose.

Author Contributions

Conceptualization: Seung-Hwan Lee. Data curation: Chaeyeon Yang. Methodology: Suh-Yeon Dong. Supervision: Seung-Hwan Lee, Suh-Yeon Dong. Writing—original draft: Sangha Kim, Suh-Yeon Dong. Writing—review & editing: all authors.

Funding Statement

This work was supported by the National Research Foundation of Korea’s Brain Korea 21 FOUR Program at the Sookmyung Women’s University and by the MSIT (Ministry of Science and ICT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) program (IITP-2025-RS-2022-00156299) supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation).

Acknowledgments

None

Figure 1.
Spectrogram dataset represented as a three-dimensional array extracted from pre-treatment electroencephalogram data.
Figure 2.
Overview of the EEG classification framework using dual-channel spectrogram input and VAE-based data augmentation. EEG, electroencephalogram; VAE, variational autoencoder; tDCS, transcranial direct current stimulation.
Figure 3.
EEGNet model performance under different data augmentation (DA) conditions. A: Training accuracy comparison among baseline, time-segmentation, noise injection (NI), and variational autoencoder (VAE) augmentation. B: Training loss across the same augmentation methods. C: Training and validation accuracy for baseline (none) versus VAE-based augmentation. D: Training and validation accuracy for NI versus VAE-based augmentation.
Figure 4.
Classification performance at different levels of VAE-based augmentation (N-fold), showing peak performance at 45-fold augmentation. AUC, area under the receiver operating characteristic curve; VAE, variational autoencoder; DA, data augmentation.
Table 1.
Summary of the EEG dataset used in the study
Dataset information                     Details
Number of subjects                      48
Length of EEG recordings per subject    150 seconds
Data types                              EEG spectrograms
Data augmentation method                Variational autoencoder
Features used                           Power spectral density
Augmentation scale                      Doubled the original training data size

EEG, electroencephalogram.

Table 2.
Classification performance with different data augmentation methods
Data augmentation method   AUC         Sensitivity   Specificity   Balanced accuracy (%)
None                       0.74±0.18   0.48±0.33     0.80±0.27     61.0±8.2
Time-segmentation          0.63±0.13   0.66±0.12     0.60±0.23     63.2±12.8
NI (Gaussian)              0.74±0.16   0.61±0.27     0.70±0.22     65.7±12.8
VAE                        0.81±0.16   0.48±0.23     0.88±0.15     68.2±15.1

Data are presented as mean±standard deviation. AUC, area under the receiver operating characteristic curve; NI, noise injection; VAE, variational autoencoder.

REFERENCES

1. Smith SJ. EEG in the diagnosis, classification, and management of patients with epilepsy. J Neurol Neurosurg Psychiatry 2005;76 Suppl 2:ii2-ii7.
2. Petit D, Gagnon JF, Fantini ML, Ferini-Strambi L, Montplaisir J. Sleep and quantitative EEG in neurodegenerative disorders. J Psychosom Res 2004;56:487-496.
3. Szurhaj W, Lamblin MD, Kaminska A, Sediri H. EEG guidelines in the diagnosis of brain death. Neurophysiol Clin 2015;45:97-104.
4. Thakor NV, Tong S. Advances in quantitative electroencephalogram analysis methods. Annu Rev Biomed Eng 2004;6:453-495.
5. Kim YW, Kim S, Shim M, Jin MJ, Jeon H, Lee SH, et al. Riemannian classifier enhances the accuracy of machine-learning-based diagnosis of PTSD using resting EEG. Prog Neuropsychopharmacol Biol Psychiatry 2020;102:109960.
6. Kim JY, Lee HS, Lee SH. EEG source network for the diagnosis of schizophrenia and the identification of subtypes based on symptom severity-a machine learning approach. J Clin Med 2020;9:3934.
7. Al-Kaysi AM, Al-Ani A, Loo CK, Powell TY, Martin DM, Breakspear M, et al. Predicting tDCS treatment outcomes of patients with major depressive disorder using automated EEG classification. J Affect Disord 2017;208:597-603.
8. Dauwels J, Vialatte F, Cichocki A. Diagnosis of Alzheimer’s disease from EEG signals: where are we standing? Curr Alzheimer Res 2010;7:487-505.
9. Hsu TY, Juan CH, Tseng P. Individual differences and state-dependent responses in transcranial direct current stimulation. Front Hum Neurosci 2016;10:643.
10. Albizu A, Fang R, Indahlastari A, O’Shea A, Stolte SE, See KB, et al. Machine learning and individual variability in electric field characteristics predict tDCS treatment response. Brain Stimul 2020;13:1753-1764.
11. Craik A, He Y, Contreras-Vidal JL. Deep learning for electroencephalogram (EEG) classification tasks: a review. J Neural Eng 2019;16:031001.
12. He C, Liu J, Zhu Y, Du W. Data augmentation for deep neural networks model in EEG classification task: a review. Front Hum Neurosci 2021;15:765525.
13. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Commun ACM 2020;63:139-144.
14. Zhang K, Xu G, Han Z, Ma K, Zheng X, Chen L, et al. Data augmentation for motor imagery signal classification based on a hybrid neural network. Sensors (Basel) 2020;20:4485.
15. Zhang A, Su L, Zhang Y, Fu Y, Wu L, Liang S. EEG data augmentation for emotion recognition with a multiple generator conditional Wasserstein GAN. Complex Intell Syst 2022;8:3059-3071.
16. Zhang S, Mao X, Sun L, Yang Y. EEG data augmentation for personal identification using SF-GAN. 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA); 2022 May 20-22; Changchun. Changchun: IEEE; 2022. p.1-6.
17. Akbarimajd A, Hoertel N, Hussain MA, Neshat AA, Marhamati M, Bakhtoor M, et al. Learning-to-augment incorporated noise-robust deep CNN for detection of COVID-19 in noisy X-ray images. J Comput Sci 2022;63:101763.
18. Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ. EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces. J Neural Eng 2018;15:056013.
19. Schirrmeister RT, Springenberg JT, Fiederer LDJ, Glasstetter M, Eggensperger K, Tangermann M, et al. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum Brain Mapp 2017;38:5391-5420.
20. Wang T, Dong E, Du S, Jia C. A shallow convolutional neural network for classifying MI-EEG. 2019 Chinese Automation Congress (CAC); 2019 Nov 22-24; Hangzhou. Hangzhou: IEEE; 2019. p.5837-5841.
21. Yu T, Wei CS, Chiang KJ, Nakanishi M, Jung TP. EEG-based user authentication using a convolutional neural network. 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER); 2019 Mar 20-23; San Francisco. San Francisco: IEEE; 2019. p.1011-1014.
22. Wang X, Wang Y, Liu D, Wang Y, Wang Z. Automated recognition of epilepsy from EEG signals using a combining space-time algorithm of CNN-LSTM. Sci Rep 2023;13:14876.
23. Shanmugam S, Dharmar S. A CNN-LSTM hybrid network for automatic seizure detection in EEG signals. Neural Comput Appl 2023;35:20605-20617.
24. Kingma DP, Welling M. Auto-encoding variational Bayes. arXiv [Preprint] December 20, 2013. Available at: https://arxiv.org/abs/1312.6114. Accessed April 18, 2025.
25. Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 2004;134:9-21.
26. Varker T, Kartal D, Watson L, Freijah I, O’Donnell M, Forbes D, et al. Defining response and nonresponse to posttraumatic stress disorder treatments: a systematic review. Clin Psychol Sci Pract 2020;27:e12355.
27. Kim S, Yang C, Dong SY, Lee SH. Predictions of tDCS treatment response in PTSD patients using EEG based classification. Front Psychiatry 2022;13:876036.
Copyright © 2025 by Korean Neuropsychiatric Association.