Detecting Manic State of Bipolar Disorder Based on Support Vector Machine and Gaussian Mixture Model Using Spontaneous Speech

Article information

Psychiatry Investig. 2018;15(7):695-700
Publication date (electronic) : 2018 July 4
doi : https://doi.org/10.30773/pi.2017.12.15
1Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
2Shanghai Key Laboratory of Forensic Medicine, Institute of Forensic Science, Ministry of Justice, Shanghai, China
3School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
4Jiading District Mental Health Center, Shanghai, China
5Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China
Correspondence: Donghong Cui, MD Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, No. 3210 Humin Road, Shanghai 201108, China Tel: +86 18017311051, E-mail: wxhurol@126.com
Received 2017 September 6; Revised 2017 November 11; Accepted 2017 December 15.

Abstract

Objective

This study aimed to compare the accuracy of Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) in detecting the manic state of bipolar disorder (BD) for single patients and for multiple patients.

Methods

Twenty-one hospitalized BD patients (14 females, average age 34.5±15.3 years) were recruited after admission. Spontaneous speech was collected through a preloaded smartphone. First, speech features [pitch, formants, Mel-frequency cepstrum coefficients (MFCC), linear prediction cepstral coefficients (LPCC), gammatone frequency cepstral coefficients (GFCC), etc.] were preprocessed and extracted. Speech features were then selected using the ratio of between-class variance to within-class variance. The manic state of patients was detected by the SVM and GMM methods.

Results

LPCC demonstrated the best discrimination efficiency. The accuracy of manic state detection for single patients was higher with the SVM method than with the GMM method, whereas the detection accuracy for multiple patients was higher with the GMM method than with the SVM method.

Conclusion

SVM provided an appropriate tool for detecting the manic state of single patients, whereas GMM worked better for detecting the manic state of multiple patients. Both could help clinicians and patients with diagnosis and mood state monitoring in different situations.

INTRODUCTION

Bipolar disorder (BD) is a common but severe mental illness characterized by cyclic mood variations with manic, depressive and euthymic states. BD is the sixth leading cause of disability worldwide, has a lifetime prevalence of about 3% in the general population, and is associated with high recurrence, morbidity and risk of suicide [1,2]. Relying mainly on clinicians' interviews and patients' self-reports, current BD diagnosis and treatment methods are often time-consuming and subject to a range of subjective biases. In the clinic, approximately 25% of BD cases are misdiagnosed as major depression. The failure of timely diagnosis often leads to delayed treatment, increased costs and poor outcomes [3]. Therefore, an objective biomarker to assist clinicians in diagnosis and treatment is urgently needed.

It has long been known that the speech characteristics of patients with mental disorders differ from those of healthy individuals, and that the speech pattern can be influenced by patients' mood and neurophysiological state [4]. Numerous studies have confirmed that the speech signal could serve as an objective biomarker to differentiate major depression from the normal state [5-7]. According to the speech production model, current speech features related to major depression can be grouped into three categories: glottal features [e.g., glottal timing (GT) and glottal frequency (GF)], spectral and cepstral features [e.g., spectral flux, spectral centroid, Mel-frequency cepstrum coefficients (MFCC), linear prediction cepstral coefficients (LPCC), shifted delta cepstrum (SDC), PLPCC and gammatone frequency cepstral coefficients (GFCC)], and prosodic features [e.g., pitch, the first three formants, jitter, shimmer, loudness, harmonic-to-noise ratio (HNR), log of energy (LogE) and Teager energy operator (TEO)] [8-10]. Accumulating evidence suggests that different feature types and classifiers can yield moderate to high accuracy in major depression detection [9-11].

Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) are well-known classifiers in speech, emotion and vision recognition and have been used in mood state detection. Automatic speech recognition approaches have employed a variety of classifiers, both generative and discriminative. SVM is a discriminative classifier whose maximum discrimination is obtained by optimal placement of the separating hyperplane between the borders of two classes; it solves non-linear problems by transforming the input feature vectors into a generally higher-dimensional feature space. GMM is a generative classifier that directly models low-level features regardless of speech duration. A comparative study of the detection accuracies of different classifiers, including GMM, SVM and Multilayer Perceptron (MLP) neural networks, on a 60-subject spontaneous speech dataset for major depression detection reported that the hybrid GMM-SVM classifier performed well (accuracy 81.61%) [11]. In addition, Low et al. [12] found that although SVM yielded results very similar to those of GMM, it required more training time and was less efficient than GMM in their 139-adolescent cohort study. Given that SVM and GMM have different working mechanisms, the efficacy of these two machine learning techniques needs to be studied further, especially in the context of manic state detection in BD patients.
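To make this distinction concrete, the minimal sketch below (written in Python with scikit-learn, which is not the toolchain used in this study, and using synthetic placeholder features) classifies an utterance of arbitrary length with one GMM per class by comparing frame-level log-likelihoods, while the SVM operates on a single fixed-length summary vector of the same frames.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_utterance(mood, n_frames):
    """Toy frame-level features (e.g., cepstral coefficients) for one utterance."""
    centre = 1.0 if mood == "manic" else -1.0
    return rng.normal(loc=centre, size=(n_frames, 12))

train = [(mood, make_utterance(mood, rng.integers(100, 400)))
         for mood in ["manic", "euthymic"] for _ in range(20)]

# Generative route: one GMM per class models the frames directly,
# so utterance duration does not matter.
gmm = {mood: GaussianMixture(n_components=4, random_state=0).fit(
           np.vstack([u for m, u in train if m == mood]))
       for mood in ("manic", "euthymic")}

def gmm_classify(frames):
    """Pick the class whose GMM gives the higher average frame log-likelihood."""
    return max(gmm, key=lambda mood: gmm[mood].score(frames))

# Discriminative route: the SVM needs one fixed-length vector per utterance,
# e.g., the per-dimension mean and standard deviation of its frames.
def summarize(frames):
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

X = np.array([summarize(u) for _, u in train])
y = [m for m, _ in train]
svm = SVC(kernel="rbf").fit(X, y)

test = make_utterance("manic", 250)          # an unseen utterance
print(gmm_classify(test), svm.predict([summarize(test)])[0])
```

The sketch only illustrates why the GMM route is indifferent to recording length, whereas the SVM route depends on how an utterance is condensed into one vector; the study itself used LIBSVM and HTK, as described in the Methods.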

BD patients show more fluctuating emotions and speech pattern changes than patients with major depression. Recently, a comparative study revealed that pitch and jitter showed statistically significant differences between different mood states among BD patients [13]. When BD patients are in the manic state, they often show emotional outbursts, repeat the same idea, and display wit and irritability. Studies have also shown that pauses, intonation and emotional tension during speech can help detect whether a BD patient is in the manic state. Smartphones have been used for detecting different mood states and mood changes in BD by analyzing patients' physiological activities, such as heart rate variability (HRV) and electrodermal response (EDR) [14,15], and behavioral activities, such as geospatial information and phone call activities [16]. Low-level speech features correlate with BD mood states, and depressive and manic states can be detected using smartphones, although the detection accuracy has been moderate [17,18].

We aimed to establish and explore a speech recognition system that could be used to monitor and eventually predict the manic state of BD to aid diagnosis, adjust therapy and avoid dangerous events. Here we present the primary study endpoints: the manic state detection accuracies for BD using SVM and GMM with spontaneous speech. The novelty of this work lies in selecting speech features that represent the manic state as fully as possible. This article builds on our previous work. To the best of our knowledge, this is the first report to detect the manic state using different classifiers with optimized speech features. Our findings provide evidence that speech signals can act as biomarkers and serve as auxiliary monitoring tools for BD manic state detection.

METHODS

Patients

The study group consisted of 21 hospitalized patients (14 females and 7 males, average age 34.52±15.32 years) who were diagnosed with BD and in a manic episode after admission. Recruited patients were aged between 18 and 65 and were able and willing to operate a modern smartphone. Patients were selected by the ward's psychiatrists who were familiar with the study. Psychiatric assessment and the psychological state examination were performed while the patients were in the euthymic state at the Shanghai Jiaotong University School of Medicine Mental Health Center (Shanghai, China) from October 2014 to January 2015. The study was approved by the Shanghai Jiaotong University School of Medicine Mental Health Center (approval number: 2011-15). All participants signed informed consent, and the study strictly followed the guidelines of the hospital.

BRMS score system

The Bech-Rafaelsen Mania Rating Scale (BRMS) was used by a psychiatrist to assess the patients and determine the manic state. The BRMS was first developed by Bech et al. [19] in 1978. This 11-item scale was designed to quantitatively assess the severity of the manic state and includes items such as social contact, sleep and work activity [20]. Scores range from 0 points (not symptomatic) to 44 points (highly symptomatic). In order to classify the extracted features, we binned the BRMS scores into manic and euthymic categories: patients with scores under 6 points (the threshold) were considered euthymic, and those with scores above 22 points were considered manic. All recruited patients were manic in this study. Table 1 shows the patients' clinical and sociodemographic characteristics.
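As a minimal sketch of this binning rule (the thresholds are taken from the text above; the function name and the handling of intermediate scores are illustrative assumptions):

```python
def brms_to_label(score: int) -> str:
    """Bin a BRMS total score into a mood-state label (thresholds from the text)."""
    if score < 6:
        return "euthymic"      # below the euthymic threshold
    if score > 22:
        return "manic"         # clearly manic
    return "unlabelled"        # intermediate scores are not assigned a state here

print([brms_to_label(s) for s in (3, 15, 32)])  # ['euthymic', 'unlabelled', 'manic']
```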

Clinical and sociodemographic characteristics of patients

Speech collection

Each patient was provided with a preloaded Samsung GALAXY Mega 6.3 (sampling frequency 44 kHz, resolution 32 bit; purchased from Samsung China, Shanghai, China). The clinician held a free, open conversation with each patient through the phone. To reduce noise interference, the patients sat comfortably alone in a double-layer sound-insulated glass room while talking to the clinician. Speech was recorded twice in each mood state (manic and euthymic), in the morning, on 1–2 consecutive days. Each recording lasted about 10–25 minutes. All collected speech was encrypted and transferred securely through Wi-Fi to a cloud database for further analysis. Implementation, data transfer and handling followed the security and encryption guidelines approved by the internal review board to ensure the integrity and privacy of the collected data. In this study, approximately 50% of the speech data were used to train the manic and euthymic models, and the rest were used for testing. The total duration of manic speech was 775 minutes.

Speech features and classifiers

The speech recognition system for the manic state in this work consisted of two main parts; the workflow is illustrated in Figure 1. Speech features were extracted from the speech signal in the front-end. Detection was then achieved through stochastic modelling and matching carried out in the back-end, which comprised a training phase and a testing phase. In the training phase, a model was established to predict the speaker's mood state from a given input (labelled speech samples). In the testing phase, the model was used to detect the mood state.

Figure 1.

Structure of the manic state speech classifier system. The speech recognition system for the manic state consisted of front-end and back-end parts. Speech features (primarily prosodic and spectral features) were extracted from the speech samples in the front-end. Mental state recognition was then achieved through stochastic modelling [support vector machine (SVM) and Gaussian mixture model (GMM)] and matching processing in the back-end.

Features extraction

We restricted ourselves to automatically extracted features, regardless of the content of speech. The speech features explored in this study included prosodic features and spectral features, each consisting of a number of sub-categories. All speech features were extracted using the publicly available openSMILE software (audEERING, Munich, Bavaria, Germany) [21]. During the preprocessing stage, only frames containing vocal speech were concatenated. The frame size was set to 25 ms with a shift of 10 ms, using a Hamming window. The main extracted features were pitch, formants, LPCC, MFCC and GFCC.
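The study used openSMILE for feature extraction; as a rough, hedged sketch of the same framing choices (25 ms Hamming windows with a 10 ms shift, voiced frames only), the snippet below computes MFCC and pitch with the librosa library. The file name, the energy-based voicing rule and the pitch search range are illustrative assumptions, not the study's actual preprocessing.

```python
import numpy as np
import librosa

# Load a recording at its native sampling rate (file name is illustrative).
y, sr = librosa.load("patient_recording.wav", sr=None)

frame_len = int(0.025 * sr)   # 25 ms frame
hop_len = int(0.010 * sr)     # 10 ms shift

# Crude energy-based voicing mask, standing in for the step that kept
# only frames containing vocal speech (an assumption, not openSMILE's method).
frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop_len)
energy = (frames ** 2).mean(axis=0)
voiced = energy > 0.1 * energy.mean()

# 13 MFCCs per frame with a Hamming window; center=False keeps the frame
# count aligned with the energy mask above.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=frame_len,
                            hop_length=hop_len, window="hamming", center=False)
mfcc_voiced = mfcc[:, voiced[:mfcc.shape[1]]]

# Pitch (F0) per frame, here with the pYIN estimator; 65-400 Hz is an
# assumed search range for adult speech.
f0, _, _ = librosa.pyin(y, fmin=65, fmax=400, sr=sr,
                        frame_length=frame_len, hop_length=hop_len)
```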

Classifiers

The LIBSVM toolbox (developed by Chih-Chung Chang and Chih-Jen Lin of National Taiwan University, Taipei, Taiwan) was used for SVM modeling [22], and the HTK toolbox (Speech Vision and Robotics Group, Cambridge University Engineering Department) [23] was used for GMM modeling. In the implementation, the expectation-maximization (EM) algorithm was used to estimate the mean, covariance and mixture weight of each Gaussian component in the GMM. The complete SVM and GMM algorithms were described in detail in a previously published study [24]. In this work, SVM and GMM were used to classify patients' mental states as "manic" or "euthymic," and their manic state detection accuracies were compared.
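As a small, hedged illustration of this step (using scikit-learn rather than HTK, with synthetic frames standing in for the real cepstral features, and with the number and type of components assumed), fitting a GMM by EM yields exactly the three parameter sets named above:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
manic_frames = rng.normal(size=(1000, 12))   # placeholder LPCC/MFCC frames

# EM fitting estimates the mixture weight, mean and covariance of each
# Gaussian component (8 diagonal-covariance components assumed here).
gmm_manic = GaussianMixture(n_components=8, covariance_type="diag",
                            max_iter=200, random_state=1).fit(manic_frames)

print(gmm_manic.weights_.shape)      # (8,)     mixture weights
print(gmm_manic.means_.shape)        # (8, 12)  component means
print(gmm_manic.covariances_.shape)  # (8, 12)  diagonal covariances

# An analogous model trained on euthymic frames would complete the
# "manic" vs. "euthymic" decision by comparing average log-likelihoods,
# e.g., gmm_manic.score(test_frames) > gmm_euthymic.score(test_frames).
```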

Statistical analysis

Three patients were included in the single-patient experiments, whereas 21 patients were included in the multiple-patient experiments. Statistical analysis was performed using SPSS for Windows version 21.0 (IBM Corp., Armonk, NY, USA). The manic state detection accuracies of SVM and GMM for single patients and for multiple patients were compared using Student's t-test. All p values were two-tailed, and statistical significance was accepted at p<0.05.
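For reference, the comparison of per-patient accuracies can be reproduced outside SPSS with a few lines of SciPy; the values below are the first five rows of Table 5, used purely for illustration, and whether the paired or the independent form of the test was applied is not stated in the text.

```python
from scipy import stats

# Per-patient manic state detection accuracies (first five rows of Table 5).
acc_svm = [44.63, 63.44, 36.38, 42.07, 78.82]
acc_gmm = [68.47, 68.25, 71.39, 65.31, 75.18]

# Two-tailed Student's t-test; significance threshold p < 0.05.
t, p = stats.ttest_ind(acc_svm, acc_gmm)
# stats.ttest_rel(acc_svm, acc_gmm) would be the paired alternative, since
# both classifiers were evaluated on the same patients.
print(f"t = {t:.2f}, p = {p:.3f}")
```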

RESULTS

Speech features with high ratios were used for manic state detection

To minimize the influence of noise and maximize detection accuracy, manic state-related features were selected first. In this work, optimal feature selection was based on the between-class variance (δb²) and within-class variance (δw²) [25]. In general, well-discriminating features have a larger δb² and a smaller δw². Considering the large fluctuation of the features, feature-level normalization was performed before calculating the variances. In the present study, the following features were selected: LPCC, the first six formants (frequency, amplitude), MFCC (mean and variance), GFCC and pitch. The ratios for the speech features were calculated according to Equation (1), and the results are shown in Table 2.

Ratios of using single features for manic state discrimination

(1) Ratio = log10(δb²/δw²)

δb², between-class variance; δw², within-class variance.
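A minimal NumPy sketch of the computation in Equation (1) is given below; the concrete definitions of the between-class and within-class variances (variance of the per-class means, and average of the per-class variances, for one feature dimension) are a common convention assumed here rather than taken verbatim from reference [25].

```python
import numpy as np

def discrimination_ratio(manic_vals, euthymic_vals):
    """Equation (1): log10 of between-class over within-class variance for
    one normalized feature dimension. Variance definitions are assumptions."""
    class_means = np.array([manic_vals.mean(), euthymic_vals.mean()])
    var_between = class_means.var()                                 # δb²
    var_within = 0.5 * (manic_vals.var() + euthymic_vals.var())     # δw²
    return np.log10(var_between / var_within)

# Synthetic, normalized feature values for the two mood states.
rng = np.random.default_rng(2)
manic = rng.normal(loc=0.8, scale=0.3, size=300)
euthymic = rng.normal(loc=-0.8, scale=0.3, size=300)
print(round(discrimination_ratio(manic, euthymic), 2))
```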

We also hypothesized that a single speech feature may not capture all characteristics of the manic state and therefore may not be suitable for detection on its own. To verify this hypothesis, we compared the manic state detection accuracies of SVM and GMM using speech features extracted from a 3-minute speech sample of a single patient in the manic state. As shown in Table 3, for both SVM and GMM, the detection accuracy of single features was lower than that of multiple features.

Manic state detection accuracies of speech features for single patients (%)

SVM performed better in manic state detection for single patients

We randomly chose the speech of three patients to test the manic state detection accuracies of GMM and SVM for single patients. The speech features were further optimized by a genetic algorithm after being processed and normalized [26]. The selected patients' IDs were anonymized to protect privacy. The GMM and SVM detection accuracies are presented in Table 4. The SVM classifier showed better performance (88.56±5.26%) in detecting the manic state for single patients than the GMM classifier (84.46±1.85%).
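A compact sketch of such genetic-algorithm feature selection is shown below. It is a simplified stand-in rather than the configuration of reference [26]: the fitness function is cross-validated SVM accuracy on toy data, and the population size, rates and data are all illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(3)

def ga_select(X, y, n_gen=20, pop_size=20, p_mut=0.05):
    """Tiny genetic algorithm over binary feature masks.
    Fitness = 3-fold cross-validated SVM accuracy on the selected columns."""
    n_feat = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat))

    def fitness(mask):
        if mask.sum() == 0:
            return 0.0
        return cross_val_score(SVC(kernel="rbf"),
                               X[:, mask.astype(bool)], y, cv=3).mean()

    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        # Tournament selection: each parent is the better of two random candidates.
        winners = [max(rng.choice(pop_size, 2, replace=False),
                       key=lambda i: scores[i]) for _ in range(pop_size)]
        parents = pop[winners]
        # One-point crossover between consecutive parent pairs.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n_feat)
            children[i, cut:] = parents[i + 1, cut:]
            children[i + 1, cut:] = parents[i, cut:]
        # Bit-flip mutation.
        flip = rng.random(children.shape) < p_mut
        children[flip] = 1 - children[flip]
        pop = children

    scores = np.array([fitness(m) for m in pop])
    return pop[scores.argmax()].astype(bool)

# Toy usage: 40 utterance-level vectors with 10 candidate features.
X = rng.normal(size=(40, 10))
y = rng.integers(0, 2, size=40)
print("selected feature indices:", np.where(ga_select(X, y))[0])
```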

Manic state detection accuracies of SVM and GMM for single patients (%)

GMM performed better in manic state detection for multiple patients

The above results showed that SVM was highly effective at discriminating the manic state from the euthymic state for single patients. However, it remained to be determined whether SVM was also adept at detecting the manic state for multiple patients. Therefore, we trained SVM and GMM using the speech of the 3 patients chosen in the above experiment and compared the manic state detection performance for all 21 patients. The manic state detection accuracies of SVM and GMM for multiple patients are summarized in Table 5. The accuracy of GMM (72.27±6.90%) was higher than that of SVM (60.87±18.90%), indicating that GMM was more effective than SVM for detecting the manic state of multiple patients.

Manic state detection accuracies of SVM and GMM for 21 patients (%)

DISCUSSION

Here we presented a preliminary study of manic state detection in BD by selecting representative speech features and utilizing SVM and GMM classifiers with spontaneous speech. The results showed that SVM performed well in manic state detection for single patients, while GMM was effective in manic state detection for multiple patients.

GMM and SVM classifiers have been popular partly because of their ability to handle smaller or sparse datasets robustly and their relatively low computational cost. The two classifiers have been shown to be well suited to depression detection in a previous study [27]. In this study, the experiments showed that SVM had higher manic state detection accuracy for a single patient than GMM, suggesting the effectiveness of SVM for a small sample size. To further explore the performance of SVM and GMM, the manic state detection accuracies for multiple patients were investigated, and GMM proved more accurate in this case. Our results demonstrated that the manic state could be effectively differentiated from the euthymic state using speech-based classifiers trained on unstructured smartphone recordings. This is in agreement with observations reported by Karam et al. that spontaneous speech could effectively differentiate the manic state from the euthymic state [17,18]. Evidence has shown that spontaneous speech (such as family conversation or clinical interviews) has more variability and can yield higher depressive and manic mood state detection accuracies than fixed-session speech (such as text reading or picture commenting) [9-12,18,28]. In addition, speech collection in a natural environment highlights the applicability of autonomous, ecologically valid monitoring for BD. The optimal features were selected according to the ratio of between-class variance to within-class variance, together with the genetic algorithm [29-31]. Five features (pitch, LPCC, the first six formants, MFCC and GFCC) were extracted and formed the feature set. The ratios of single features for manic state discrimination suggested that LPCC and GFCC, the latter of which simulates human cochlear auditory characteristics, contained more important mood information about the manic state than the other features.

The results of our study could assist clinicians and patients in diagnosis and in monitoring mood state changes. Nevertheless, certain limitations exist. First, the study cohort was small and only a limited number of speech feature types were examined; both issues can be addressed in future studies. Second, although BD is a severe mental illness characterized by cyclic mood variations among manic, depressive and euthymic states, we only investigated the differentiation of the manic state from the euthymic state; further studies could explore how to differentiate the depressive state from the euthymic state. In addition, further analysis could verify whether other features, such as vocal tract features and glottal features, can be utilized to diagnose BD.

Our system is not intended to replace professional expertise, but to supplement it. In this respect, our results showed promising accuracy in determining the BD manic state using SVM. Based on the clinical findings and previous work, spontaneous speech could serve as an effective tool for determining the manic state of BD. Specifically, SVM was adept at detecting the manic state for a single patient or a small sample, whereas GMM was more suitable for detecting the manic state for multiple patients.

Acknowledgements

This project was supported by the National Natural Science Foundation of China (NSFC) under Grant (No. 61271349, 61371147 and 11433002); the Shanghai Jiao Tong University Joint Research Fund for Biomedical Engineering under Grant (No. YG2012ZD04); the Shanghai Key Laboratory of Psychotic Disorders under Grant (No. 13dz2260500); and the National Key Research and Development Program of China under Grant (No. 2017YFC0909200).

References

1. Schmitt A, Malchow B, Hasan A, Falkai P. The impact of environmental factors in severe psychiatric disorders. Front Neurosci 2014;8:19.
2. Boland EM, Alloy LB. Sleep disturbance and cognitive deficits in bipolar disorder: toward an integrated examination of disorder maintenance and functional impairment. Clin Psychol Rev 2013;33:33–44.
3. Inoue T, Inagaki Y, Kimura T, Shirakawa O. Prevalence and predictors of bipolar disorders in patients with a major depressive episode: the Japanese epidemiological trial with latest measure of bipolar disorder (JET-LMBP). J Affect Disord 2015;174:535–541.
4. Park CK, Lee S, Park HJ, Baik YS, Park YB, Park YJ. Autonomic function, voice, and mood states. Clin Auton Res 2011;21:103–110.
5. Cummins N, Scherer S, Krajewski J, Schnieder S, Epps J, Quatieri TF. A review of depression and suicide risk assessment using speech analysis. Speech Commun 2015;71:10–49.
6. Mitra V, Shriberg E. Effects of feature type, learning algorithm and speaking style for depression detection from speech. Acoustics, Speech and Signal Processing (ICASSP). In : 2015 IEEE International Conference on IEEE; 2015 April 19-24; Brisbane, QLD. Australia: IEEE; 2015. 4774–4778.
7. Scherer S, Hammal Z, Yang Y, Morency LP, Cohn JF. Dyadic behavior analysis in depression severity assessment interviews. In : Proceedings of the 16th International Conference on Multimodal Interaction; 2014 November 12-16; Istanbul, Turkey. NY: ACM; 2014. 112–119.
8. Barlow RB Jr. What the brain tells the eye. Sci Am 1990;262:90–95.
9. Low LSA, Maddage NC, Lech M, Sheeber L, Allen N. Influence of acoustic low-level descriptors in the detection of clinical depression in adolescents. In : 2010 IEEE International Conference on Acoustics, Speech and Signal Processing; 2010 March 14-19; Dallas. TX, USA: IEEE; 2010. 5154–5457.
10. Moore E 2nd, Clements MA, Peifer JW, Weisser L. Critical analysis of the impact of glottal features in the classification of clinical depression in speech. IEEE Trans Biomed Eng 2008;55:96–107.
11. Alghowinem S, Goecke R, Wagner M, Epps J. A comparative study of different classifiers for detecting depression from spontaneous speech. In : IEEE International Conference on Acoustics, Speech and Signal Processing; 2013 May 26-31; Vancouver. BC, Canada: IEEE; 2013. p. 8022–8026.
12. Low LS, Maddage NC, Lech M, Sheeber LB, Allen NB. Detection of clinical depression in adolescents’ speech during family interactions. IEEE Trans Biomed Eng 2011;58:574–586.
13. Vanello N, Guidi A, Gentili C, Werner S, Bertschy G, Valenza G, et al. Speech analysis for mood state characterization in bipolar patients. Engineering in Medicine and Biology Society (EMBC). In : 2012 Annual International Conference of the IEEE; 2012 Aug 28 - Sep 1; San Diego. CA, USA: IEEE; 2012. 2104–2107.
14. Nardelli M, Valenza G, Gentili C, Lanata A, Scilingo EP. Temporal trends of neuro-autonomic complexity during severe episodes of bipolar disorders. Engineering in Medicine and Biology Society (EMBC). In : 2014 36th Annual International Conference of the IEEE; 2014 Aug 26-30; Chicago. IL, USA: IEEE; 2014. 2948–2951.
15. Greco A, Valenza G, Lanata A, Rota G, Scilingo EP. Electrodermal activity in bipolar patients during affective elicitation. IEEE J Biomed Health Inform 2014;18:1865–1873.
16. Lam KY, Wang J, Ng KY, Han S, Zheng L, Kam CH, et al. SmartMood: toward pervasive mood tracking and analysis for manic episode detection. IEEE Trans Human-Mach Syst 2015;45:126–131.
17. Grunerbl A, Muaremi A, Osmani V, Bahle G, Ohler S, Troster G, et al. Smartphone-based recognition of states and state changes in bipolar disorder patients. IEEE J Biomed Health Inform 2015;19:140–148.
18. Karam ZN, Provost EM, Singh S, Montgomery J, Archer C, Harrington G, et al. Ecologically Valid Long-Term Mood Monitoring of Individuals with Bipolar Disorder Using Speech. Acoustics, Speech and Signal Processing (ICASSP). In : 2014 IEEE International Conference on; 2014 May 4-9; Florence. Italy: IEEE; 2014. 4858–4862.
19. Bech P, Rafaelsen OJ, Kramp P, Bolwig TG. The mania rating scale: scale construction and inter-observer agreement. Neuropharmacology 1978;17:430–431.
20. Bech P, Bolwig TG, Kramp P, Rafaelsen OJ. The Bech-Rafaelsen Mania Scale and the Hamilton Depression Scale. Acta Psychiatr Scand 1979;59:420–430.
21. Eyben F, Wöllmer M, Schuller B. Opensmile: the munich versatile and fast open-source audio feature extractor. In : Proceedings of the 18th ACM International Conference on Multimedia; 2010 October 25-29; Firenze, Italy. New York: ACM; 2010. 1459–1462.
22. Chang CC, Lin CJ. LIBSVM: a library for support vector machines. Acm Trans Intell Syst Tech 2011;2:1–27.
23. Young S, Jansen J, Odell J, Ollason D, Woodland P. HTK-Hidden Markov Model Toolkit USA: User Manual; 2006.
24. Gui C, Li W, Pan Z, Zhang J, Zhu J, Cui D. A classifier for diagnosis of manic psychosis state based on SVM-GMM Sydney: The 10th International Conference on Information Technology and Applications (ICITA2015); 2015.
25. Dehak N, Kenny PJ, Dehak R, Dumouchel P, Ouellet P. Front-end factor analysis for speaker verification. IEEE Transactions on Audio Speech & Language Processing 2011;19:788–798.
26. Huang CL, Wang CJ. A GA-based feature selection and parameters optimizationfor support vector machines. Expert Syst Appl 2006;31:231–240.
27. Andreassen OA, Harbo HF, Wang Y, Thompson WK, Schork AJ, Mattingsdal M, et al. Genetic pleiotropy between multiple sclerosis and schizophrenia but not bipolar disorder: differential involvement of immune-related gene loci. Mol Psychiatry 2015;20:207–214.
28. Alghowinem S, Goecke R, Wagner M, Epps J, Breakspear M, Parker G. Detecting depression: a comparison between spontaneous and read speech. In : Acoustics, Speech and Signal Processing (ICASSP); 2013 May 26-31; Vancouver. BC, Canada: IEEE; 2013. 7547–7551.
29. Gonzalez S, Brookes M. A Pitch Estimation Filter robust to high levels of noise (PEFAC). In : Signal Processing Conference, 2011 European; 2011 Aug 29 -Sept 2; Barcelona. Spain: IEEE; 2011. 451–455.
30. Broad DJ, Clermont F. Formant estimation by linear transformation of the LPC cepstrum. J Acoust Soc Am 1989;86:2013–2017.
31. Dave N. Feature Extraction Methods LPC, PLP and MFCC In Speech Recognition. IJARET 2013;1


Table 1.

Clinical and sociodemographic characteristics of patients

Variables % or mean±SD
Gender, female 66.67
Marriage, unmarried 71.43
Age, years 34.52±15.32
Educational level, years 10.48±2.96
Age at onset, years 20.52±7.74
Duration of course, years 14.02±11.71
Total no. of episodes 7.05±3.38
No. of hospitalizations 3.67±3.32
BRMS total score in manic episode 32.33±7.04
Type of medication
 Lithium 33.33
 Sodium valproate 61.90
 Other anticonvulsants 4.77
 Antipsychotic 100
Lithium dose (mg/d) 957.13±308.80
Sodium valproate dose (mg/d) 661.54±236.43

SD: standard deviation, BRMS: the Bech-Rafaelsen Mania Rating Scale

Table 2.

Ratios of using single features for manic state discrimination

Features LPCC First six formants MFCC GFCC Pitch
Ratio 4.86 3.85 3.55 4.82 3.47

Ratio = log10(δb²/δw²)

LPCC: Linear Prediction Cepstrum Coefficient, MFCC: Mel-Frequency Cepstrum Coefficient, GFCC: Gammatone Frequency Cepstral Coefficient

Table 3.

Manic state detection accuracies of speech features for single patients (%)

Speech features SVM (%) GMM (%)
First six formants 73.62 66.71
LPCC 87.66 80.70
MFCC 74.46 68.68
GFCC 75.53 71.80
Multiple features 90.57 85.24

SVM: Support Vector Machine, GMM: Gaussian Mixture Model, LPCC: Linear Prediction Cepstrum Coefficient, MFCC: Mel-Frequency Cepstrum Coefficient, GFCC: Gammatone Frequency Cepstral Coefficient

Table 4.

Manic state detection accuracies of SVM and GMM for single patients (%)

Patient No. SVM (%) GMM (%)
1 92.41 82.66
2 82.57 86.35
3 90.70 84.38
Overall 88.56±5.26 84.46±1.85

SVM: Support Vector Machine, GMM: Gaussian Mixture Model

Table 5.

Manic state detection accuracies of SVM and GMM for 21 patients (%)

Patient No. SVM (%) GMM (%)
BD-1 44.63 68.47
BD-2 63.44 68.25
BD-3 36.38 71.39
BD-4 42.07 65.31
BD-5 78.82 75.18
BD-6 55.31 64.36
BD-7 90.70 72.64
BD-8 74.28 75.24
BD-9 82.57 70.03
BD-10 73.40 66.32
BD-11 64.06 82.66
BD-12 60.21 73.93
BD-13 38.75 76.86
BD-14 60.36 84.38
BD-15 70.87 86.35
BD-16 72.35 78.8
BD-17 92.41 69.2
BD-18 30.86 71.16
BD-19 72.01 70.7
BD-20 40.44 58.02
BD-21 34.34 68.38
Overall 60.87±18.90 72.27±6.90*
* p<0.05, compared with the SVM group. BD: bipolar disorder, SVM: Support Vector Machine, GMM: Gaussian Mixture Model