Seok, Yang, Han, Lim, Kim, Kim, and Kim: Reliability and Validity of a Tablet-Based Neuropsychological Test (the Hellocog) for Screening Dementia

Abstract

Objective

To address the gap in timely diagnosis of dementia caused by limited screening tools, we investigated the validity and reliability of the Hellocog, a computerized, tablet-based neuropsychological test for screening dementia. The higher the probability score on the Hellocog, the higher the likelihood of dementia.

Methods

This study included 100 patients with dementia and 100 individuals with normal cognition who were aged 60 years or older and free of other major psychiatric, neurological, or medical conditions. They self-administered the Hellocog on a tablet under the supervision of a neuropsychologist. To determine test-retest reliability, 20 participants took the Hellocog again after 4 weeks. Diagnostic performance was assessed using receiver operating characteristic (ROC) analysis.

Results

The Hellocog showed adequate internal consistency (Cronbach’s alpha=0.69) and good test-retest reliability (intraclass correlation coefficient=0.86, p<0.001). Participants with dementia scored higher on the Hellocog than those with normal cognition (p<0.001), supporting its criterion validity. Strong correlations with the Mini-Mental Status Examination (MMSE) score and the total score of the Consortium to Establish a Registry for Alzheimer’s Disease Neuropsychological Assessment Battery (CERAD-TS) support the concurrent validity of the Hellocog. The area under the ROC curve of the Hellocog for dementia was excellent (0.972) and comparable to that of the MMSE and CERAD-TS. The sensitivity and specificity for dementia were 0.945 and 0.872, respectively, which were slightly better than those of the MMSE and CERAD-TS.

Conclusion

Hellocog stands out as a valid and reliable tool for self-administered dementia screening, with promise for improving early detection of dementia.

INTRODUCTION

The number of people with dementia is increasing rapidly worldwide [1], including in South Korea [2,3]. Although disease-modifying therapies for dementia have recently been developed, they may be effective only in the prodromal or early stages of the disease [4]. Furthermore, these therapies are specific to Alzheimer’s disease (AD), and there are still no cures for other forms of dementia. Thus, early diagnosis and intervention remain crucial.
However, the current diagnosis rate for dementia is approximately 50% [5], which may be due to a complex interaction of a variety of factors, including low public awareness [6], low accessibility to diagnostic services [7], and lack of post-diagnostic services [8]. Among these, the lack of optimal screening tools is identified as a key contributing factor [9,10].
An ideal screening test for dementia should be sensitive and specific enough to identify those people with cognitive impairment who need further comprehensive diagnostic evaluation for dementia. In addition, it should be quick and easy to administer by a range of health professionals or self-administered without any assistance from a health professional. In this sense, the Mini-Mental Status Examination (MMSE), the most popular screening test for dementia in both clinical and research settings [11], has many drawbacks [12]. Although many other cognitive tests have been developed to screen for dementia, their disadvantages are not very different from those of the MMSE [13].
We have therefore developed the Hellocog, which is a brief, self-administered, tablet-based neuropsychological test. We developed the Hellocog based on our previous deep learning model for diagnosing dementia using demographic information, subjective memory complaints, depressive symptoms and the results of comprehensive neuropsychological tests [14]. The Hellocog consists of a questionnaire section (Hellocog-Q) and a cognitive test section (Hellocog-T). The Hellocog-Q consists of five questions about age, years of formal education, presence of subjective memory complaints (yes or no), presence of depressive mood (yes or no), and loss of interest or pleasure (yes or no). The Hellocog-T consists of the time orientation test (month and day of the week), the seven word list memory test (WLMT), the 13-digit trail making test (TMT), the seven word list recall test (WLRT), the seven word list recognition test (WLRcT), the one-minute verbal fluency test (VFT) for animal category, and the five-item confrontational naming test.
The foundation of this study, including the development of the Hellocog, is informed by research conducted by Choi et al. [14], which incorporated the Korean version of the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD-K) neuropsychological assessment. As such, the Hellocog shares some structural similarities with the CERAD-K in terms of the cognitive domains assessed. However, it diverges significantly in its approach and implementation. From the CERAD-K neuropsychological assessment, only items identified by Choi et al.’s [14] research as beneficial for dementia risk screening were selected for inclusion in the Hellocog. Moreover, the specific content of the cognitive task items in Hellocog differs from those in CERAD-K, with the Hellocog designed to complete assessments in about one-third the time required for CERAD-K. Additionally, unlike the comprehensive neuropsychological evaluation provided by CERAD-K, which offers scores across individual cognitive domains, the Hellocog focuses solely on calculating a weighted total score for the purpose of dementia screening. This streamlined approach, facilitated by digital technology, allows for self-administration, immediate data processing, and an interactive testing experience, thus making Hellocog a distinct and efficient tool for dementia screening in contrast to the traditional paper-based CERAD-K.
In this study, we examined the validity and reliability of the Hellocog for screening for dementia and compared its diagnostic performance for dementia with that of the MMSE [15] and the CERAD-K Neuropsychological Assessment Battery total score (CERAD-TS) [16] administered by neuropsychologists.

METHODS

Participants

We enrolled 100 participants with dementia from visitors to the dementia clinic at Seoul National University Bundang Hospital from 2019 to 2021, and 100 participants with normal cognition from participants in the Korean Longitudinal Study on Cognitive Aging and Dementia [17] from 2019 to 2021. Task failures occurred in 9% of the dementia group and 6% of the control group. Consequently, the final dataset used for data analysis included 91 participants in the dementia group and 94 participants in the control group. There were no dropouts in either group.
All participants were aged 60 years or older, lived in the community, and had normal or corrected-to-normal vision and hearing. All participants were free of major psychiatric, neurological, or medical conditions other than dementia that could affect cognitive function. The study was approved by the Institutional Review Board of the Seoul National University Bundang Hospital (B-1905-540-302). All participants were fully informed and provided written informed consent by themselves or their legal guardians.

Diagnostic assessment

Psychiatrists specializing in geriatric psychiatry and dementia research conducted a standardized face-to-face diagnostic interview, physical and neurological examinations and laboratory tests using the CERAD-K Clinical Assessment Battery [15] and the Korean version of the Mini International Neuropsychiatric Interview [18].
Neuropsychologists or trained research nurses assessed the severity of subjective cognitive complaints and depressive symptoms using the Subjective Memory Complaints Questionnaire [19] and the Korean version of the Geriatric Depression Scale [20]. They assessed cognitive function using the CERAD-K Neuropsychological Assessment Battery [15,21], the Digit Span Test [22], and the Frontal Assessment Battery [23]. The CERAD-K Neuropsychological Assessment Battery consists of nine neuropsychological tests: VFT, 15-item Boston Naming Test, MMSE, WLMT, Constructional Praxis Test (CPT), WLRT, WLRcT, constructional recall test, TMT A and TMT B [15]. We defined objective cognitive impairment as a score of -1.5 standard deviation (SD) or below on any of these neuropsychological tests, except MMSE, compared with age-, sex-, and education-adjusted norms of elderly Koreans [16,21-23]. We obtained the CERAD-TS according to the equation proposed in our previous work [16].
A panel of psychiatrists specializing in geriatric psychiatry and dementia then made a diagnosis of cognitive disorder and determined the global severity of dementia using the Global Deterioration Scale (GDtS) [24] at the consensus diagnostic conference. We diagnosed dementia according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition diagnostic criteria [25]. We diagnosed AD according to the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) diagnostic criteria [26], vascular dementia (VD) according to the National Institute of Neurological Disorders and Stroke-Association Internationale pour la Recherche et l’Enseignement en Neurosciences (NINDS-AIREN) diagnostic criteria [27], dementia with Lewy bodies (DLB) and Parkinson’s disease with dementia (PDD) according to the consensus guideline proposed by McKeith [28], and frontotemporal dementia (FTD) according to the Lund-Manchester consensus diagnostic criteria [29].

Administration of the Hellocog

A neuropsychologist explained to each participant how to administer the Hellocog. Each participant then self-administered the Hellocog, which was installed on a tablet, without any assistance. Although the Hellocog was designed to automatically recognize speech, the accuracy of its speech recognition was limited in elderly Koreans. Therefore, in this validation study, research transcribers converted the recorded voices of the participants into text and manually entered them into the Hellocog. To examine the test-retest reliability of the Hellocog, we asked 20 participants (10 with dementia and 10 with normal cognition) to self-administer the Hellocog twice with a test-retest interval of four weeks.
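As an illustration of how test-retest agreement can be quantified from paired scores, the following is a minimal sketch of single-measure intraclass correlation coefficients derived from a two-way analysis of variance. The paper does not state which ICC form was used, so both the absolute-agreement and the consistency forms are returned; the function name `icc_single` and the toy scores are hypothetical and are not study data.

```python
def icc_single(scores):
    """Single-measure ICCs from a two-way ANOVA.

    scores: one row per participant, e.g. [test_score, retest_score].
    Returns (absolute_agreement, consistency).
    Which form the study used is not stated; both are shown as an assumption.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants (rows), sessions (columns), and total.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return agreement, consistency


# Toy data (hypothetical): every retest score is exactly 1 point higher.
toy = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
agreement, consistency = icc_single(toy)
```

In this toy case the ranking of participants is perfectly preserved, so the consistency form equals 1, while the systematic one-point shift lowers the absolute-agreement form.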

Methodology and scoring of the Hellocog

Choi et al. [14] initially identified 43 variables. Of these, 27 were common to both that study and the current study of the Hellocog. These 27 variables were used to develop a logistic regression model identical to the one used by Choi et al. [14] to predict the risk of dementia. The Hellocog scoring system does not provide separate scores for individual cognitive domains such as memory, language, attention, and executive function. Instead, it calculates a weighted composite score that reflects the overall risk of dementia. The composite score combines the responses to the cognitive tasks with factors such as age and education in a single scoring equation. Although the Hellocog does not offer domain-specific scores, its coverage of a broad range of cognitive tasks is intended to provide a more accurate indication of dementia risk than tests that evaluate a more limited set of cognitive areas, and to minimize the influence of the type or stage of dementia on the test’s accuracy. Interpreting Hellocog scores against validated thresholds is intended to facilitate early detection and monitoring of cognitive changes associated with dementia. Based on our previous research [14], the Hellocog score was calculated using the following equation. The higher the Hellocog score, the higher the likelihood of having dementia.
Hellocog score=4.556+0.009×age+0.19×education+2.004×SMC-1.2815×DM2+0.8955×DM-0.7905×LIP2+2.1245×LIP-0.806×TOM-0.908×TOD-0.755×WLLR+0.065×WLLC-0.126×WLLRE-3.620×WLLP-0.312×WLL1+0.032×WLL2-0.095×WLL3+0.006×TMT-0.703×WLR+0.140×WLRI+0.237×WLRCB-0.321×WLRC-0.286×VF1-0.302×VF2-0.205×VF3-0.356×VF4+0.037×VFC+0.250×VFIS+0.104×VFS+0.043×NT
(SMC, subjective memory complaints; DM, depressive mood; LIP, loss of interest or pleasure; TOM, time orientation to month; TOD, time orientation to day; WLLR, recency index of word list learning; WLLC, consistency index of word list learning; WLLRE, number of repetition errors in word list learning; WLLP, primacy index of word list learning; WLL1, number of correct recalls in the first trial of the word list learning; WLL2, number of correct recalls in the second trial of the word list learning; WLL3, number of correct recalls in the third trial of the word list learning; TMT, time for completing the trail making; WLR, number of correct recalls in the delayed word list recall; WLRI, number of intrusion errors in the delayed word list recall; WLRCB, response bias index of word list recognition; WLRC, total score of word list recognition; VF1, number of correct responses during the first 15 seconds of the categorical verbal fluency; VF2, number of correct responses during the second 15 seconds of the categorical verbal fluency; VF3, number of correct responses during the third 15 seconds of the categorical verbal fluency; VF4, number of correct responses during the fourth 15 seconds of the categorical verbal fluency; VFC, clustering index of the categorical verbal fluency; VFIS, ineffective switching index of the categorical verbal fluency; VFS, switching index of the categorical verbal fluency; NT, number of correct confrontational naming on middle-frequency objects)
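The scoring equation above is a weighted sum, which can be sketched as follows. The coefficients are copied from the equation; everything else is an assumption made for illustration. In particular, DM2 and LIP2 appear in the equation without a definition in the legend and are treated here as opaque numeric inputs, the coding of the yes/no items is not specified in the text, and the helper `implied_probability` assumes the standard logistic link, which the text does not state explicitly.

```python
import math

# Intercept and coefficients copied from the scoring equation above.
# DM2 and LIP2 are not defined in the legend; they are kept as opaque inputs.
INTERCEPT = 4.556
COEFFICIENTS = {
    "age": 0.009, "education": 0.19, "SMC": 2.004,
    "DM2": -1.2815, "DM": 0.8955, "LIP2": -0.7905, "LIP": 2.1245,
    "TOM": -0.806, "TOD": -0.908,
    "WLLR": -0.755, "WLLC": 0.065, "WLLRE": -0.126, "WLLP": -3.620,
    "WLL1": -0.312, "WLL2": 0.032, "WLL3": -0.095,
    "TMT": 0.006,
    "WLR": -0.703, "WLRI": 0.140, "WLRCB": 0.237, "WLRC": -0.321,
    "VF1": -0.286, "VF2": -0.302, "VF3": -0.205, "VF4": -0.356,
    "VFC": 0.037, "VFIS": 0.250, "VFS": 0.104,
    "NT": 0.043,
}


def hellocog_score(features):
    """Weighted composite score: intercept plus coefficient times input, summed."""
    missing = sorted(set(COEFFICIENTS) - set(features))
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    return INTERCEPT + sum(
        weight * features[name] for name, weight in COEFFICIENTS.items()
    )


def implied_probability(score):
    """Assumption: map the score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-score))
```

Under this reading, a score of 0 would correspond to a probability of 0.5, and the positive and negative scores reported for the two groups would fall on either side of that point.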

Statistical analysis

We compared continuous and categorical variables between groups using Student’s t-tests and chi-squared tests, respectively.
We examined the internal consistency of the Hellocog using Cronbach’s alpha. We examined the test-retest reliability of the Hellocog using the intraclass correlation coefficient (ICC) between the test and retest scores. We assessed the concurrent validity of the Hellocog by examining its correlations with the MMSE, CERAD-TS, and GDtS using Pearson’s correlation analysis adjusting for age and education. We examined the criterion validity of the Hellocog by comparing Hellocog scores between participants with dementia and those with normal cognition. We evaluated the diagnostic accuracy for dementia of the Hellocog, MMSE, CERAD-TS and GDtS using receiver operating characteristic (ROC) analysis and determined the optimal cutoff score for dementia as the score that maximized the Youden index (sensitivity+specificity-1) [30]. We compared the diagnostic accuracy for dementia of the Hellocog with that of the MMSE and CERAD-TS by comparing their areas under the ROC curve (AUC) using the z-test [31].
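The cutoff selection described above can be sketched in a few lines. This is an illustrative implementation, not the MedCalc procedure: it assumes that a score at or above the cutoff is classified as dementia (consistent with higher Hellocog scores indicating higher likelihood), takes candidate cutoffs at midpoints between adjacent observed scores, and computes the AUC as the proportion of case-control pairs in which the case scores higher. The function names and the toy scores are hypothetical, and the z-test for comparing AUCs [31] is not reproduced.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Classify score >= cutoff as positive; return (sensitivity, specificity)."""
    true_pos = sum(s >= cutoff for s in case_scores)
    true_neg = sum(s < cutoff for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def youden_optimal_cutoff(case_scores, control_scores):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = sens + spec - 1.

    Ties are resolved in favor of the lowest cutoff (an arbitrary choice).
    """
    values = sorted(set(case_scores) | set(control_scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for cutoff in candidates:
        sens, spec = sensitivity_specificity(case_scores, control_scores, cutoff)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = sum(
        (x > y) + 0.5 * (x == y) for x in case_scores for y in control_scores
    )
    return wins / (len(case_scores) * len(control_scores))


# Toy scores (hypothetical, not study data).
cases = [3.0, 1.0, 2.0, -1.0]
controls = [-3.0, -2.0, 0.0, -4.0]
best_cutoff, best_j, best_sens, best_spec = youden_optimal_cutoff(cases, controls)
toy_auc = auc(cases, controls)
```

For instruments on which lower scores indicate impairment, such as the MMSE, the direction of the comparison would need to be reversed.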
We performed ROC analyses using the MedCalc Statistical Software version 19.1 (MedCalc Software, Ostend, Belgium; https://www.medcalc.org; 2019) and all other analyses using the SPSS version 18.0 (SPSS Inc., Chicago, IL, USA).

RESULTS

As summarized in Table 1, of the 200 participants, 185 (91 with dementia and 94 with normal cognition) were included in the final analysis after excluding participants whose voice was not properly recorded due to a program error. Of the 91 patients with dementia, 73 (80.2%) had AD. Of the 18 (19.8%) patients with non-AD dementia, 9, 4, and 5 had VD, DLB/PDD, and FTD, respectively. As also shown in Table 1, participants with dementia were older and less educated than those with normal cognition (p<0.001). However, the distribution of sex was comparable between the two groups (p=0.106). Participants with dementia had lower MMSE scores and CERAD-TS but higher GDtS scores than those with normal cognition (p<0.001).
The Hellocog showed acceptable internal consistency (Cronbach’s alpha=0.69) and good test-retest reliability (ICC=0.86, p<0.001). As shown in Table 1, the participants with dementia scored higher on the Hellocog than those with normal cognition (p<0.001), indicating that the Hellocog has good criterion validity. As shown in Table 2, the Hellocog score correlated well with the MMSE score (r=-0.73, p<0.001), the CERAD-TS (r=-0.82, p<0.001), and the GDtS score (r=0.76, p<0.001), indicating that the Hellocog has good concurrent validity.
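For readers unfamiliar with the internal consistency statistic reported above, Cronbach’s alpha can be computed from a participants-by-items score matrix as sketched below. Which Hellocog components were treated as items is not stated in the text, so the function operates on a generic matrix; the function name and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a matrix with one row per participant, one column per item.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    item_variances = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): three participants, two items.
alpha_identical = cronbach_alpha([[1, 1], [2, 2], [3, 3]])  # items agree perfectly
alpha_partial = cronbach_alpha([[1, 2], [2, 1], [3, 3]])    # items agree partly
```

Alpha approaches 1 when the items vary together across participants and decreases as they diverge.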
As summarized in Table 3, the Hellocog showed excellent diagnostic performance for dementia (AUC=0.972). Its diagnostic performance was better than that of the MMSE and the CERAD-TS, but the differences were not statistically significant (p=0.352 for the MMSE; p=0.504 for the CERAD-TS). At the optimal cut-off score, the sensitivity and specificity of the Hellocog for dementia were 0.945 and 0.872, respectively. When the patients with AD were analyzed separately (Table 4), the results did not change. The Hellocog also showed excellent diagnostic performance for AD (AUC=0.971, 95% confidence interval=0.928–0.992). Although its diagnostic performance for AD was better than that of the MMSE and the CERAD-TS, the differences were not statistically significant (p=0.330 for the MMSE; p=0.846 for the CERAD-TS). At the optimal cut-off score, the sensitivity and specificity of the Hellocog for AD were 0.915 and 0.918, respectively.
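The relationship between the reported sensitivity, specificity, and predictive values can be checked with a small confusion-matrix sketch. The cell counts below are not reported in the paper; they are inferred here from the group sizes (91 and 94) and the sensitivity and specificity at the optimal cutoff, and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden_j": sensitivity + specificity - 1,
    }


# Inferred counts (assumption): 86 of 91 dementia cases above the cutoff,
# 82 of 94 controls below it.
metrics = diagnostic_metrics(tp=86, fn=5, tn=82, fp=12)
```

With these counts, the sensitivity, specificity, positive predictive value, and negative predictive value agree with the Hellocog row at the optimal cutoff in Table 3 to within 0.001. Because the predictive values depend on the proportion of cases in the sample, they reflect the roughly balanced case-control composition of this study.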

DISCUSSION

The utilization of mobile technology among older adults presents a promising avenue for facilitating convenient and cost-effective assessments aimed at early detection of dementia. Mobile-based tests offer numerous advantages over traditional pen-and-paper tests, including higher completion rates, lower administration costs, automated scoring, immediate access to results, and effortless tracking of patient data [32-35]. In addition, mobile technology allows monitoring of cognitive function through repeated assessments outside the clinical setting, providing valuable insight into daily cognitive fluctuation and enabling early detection of subtle signs of cognitive decline [36,37].
To improve the accuracy and convenience of mobile cognitive testing, two categories of strategies can be explored: 1) adapting established neuropsychological tests into mobile versions, 2) designing cognitive tests specifically for mobile platforms [38]. The development of the Hellocog aligns with these strategies. First, the Hellocog is built upon our prior research aimed at creating an ideal battery of neuropsychological tests for dementia diagnosis from comprehensive neuropsychological assessments [14]. Second, all test items have been modified for a mobile platform and can be completed using touch screen or voice input.
The results of this study provide substantial evidence that the Hellocog is a promising screening tool for dementia when compared with popular cognitive tests such as the MMSE and CERAD-TS. In addition, the high concurrent validity of the Hellocog with both the CERAD-TS and MMSE supports its validity as a screening tool for dementia. Furthermore, the Hellocog has several advantages over traditional pen-and-paper cognitive tests such as the MMSE and the CERAD Neuropsychological Assessment Battery as an early screening tool for dementia. First, its primary advantage is that it can be self-administered at home without a human examiner, which reduces the need for hospital visits, eliminates teaching bias and evaluation disparities between examiners, and minimizes infection risks. Second, not only the administration of the Hellocog but also its scoring and reporting are automated. It is therefore robust to human error and can provide immediate results. Because it contains a larger set of questions, the Hellocog takes around 15 minutes to administer, which is longer than other mobile applications used for screening cognitive disorders. The Hellocog attempts to offset this longer administration time with an efficient and user-friendly interface.
Several mobile applications have been developed to assess cognitive function and screen for cognitive disorders (Table 5) [39]. Most of these applications have demonstrated diagnostic performance similar to that of the Hellocog. However, some of them have limitations in terms of the cognitive domains they assess. It is important to emphasize that dementia involves a persistent and progressive decline in multiple cognitive domains, not limited to memory. These domains encompass executive function, complex attention, language, learning, memory, perceptual-motor skills, and social cognition [40,41]. Therefore, a comprehensive evaluation of these cognitive domains is essential when screening for dementia. For example, executive function, which is critical because it involves a wide range of active cognitive processes such as verbal reasoning, problem solving, planning, sustained attention, resistance to distraction, multitasking, cognitive flexibility, and adaptability to novelty, is particularly impaired in the early stages of AD [42]. While instruments such as the Addenbrooke’s Cognitive Examination III evaluate a wide range of cognitive domains, the need for administration by healthcare professionals may hinder their widespread use in clinical settings [39]. On the other hand, instruments such as the BrainTest lack sufficient evaluation of their psychometric properties and applicability to diverse populations, raising concerns about their reliability and validity for accurate cognitive assessment [43]. As we strive to improve dementia screening and early detection, it is imperative to address these limitations and work towards the development of mobile applications that provide comprehensive cognitive assessments, including executive function, while being easily accessible and validated for use across diverse populations.
This study has several limitations. First, age and years of education differed between the patient and control groups. Although these variables were controlled for when comparing means or variances between groups and when examining correlations between variables, they were not accounted for in the ROC analysis, potentially leading to an overestimation of the diagnostic efficacy of the tests. Second, the study did not take into account the participants’ familiarity with or skill in using tablets or smartphones. The control group, being younger and more educated, may have been more familiar with such devices and therefore may have performed better on the Hellocog than the patient group. This potential difference in experience with tablets or smartphones may have influenced the difference in test performance between the groups. Third, the current study did not develop norms for the Hellocog. Population-specific norms will need to be developed in subsequent studies for wider use of the Hellocog. Fourth, the current study employed a cross-sectional case-control design and is therefore susceptible to selection bias and lack of rater blinding. However, because the Hellocog was self-administered by the participants themselves, the lack of blinding is likely to have had a minimal effect on the results. Although the current study did not restrict the type of dementia at enrollment, the sample ultimately included in the analysis overrepresented patients with AD by about 10% compared with the prevalence of AD reported in epidemiologic studies. Therefore, the validity of the Hellocog needs to be examined further in a follow-up study of patients with non-AD dementia. Fifth, the number of participants involved in the test-retest reliability assessment of the Hellocog was relatively small (20). DeVet et al. [44] advise using a sample size of 50 “as a starting point for negotiations”. Future research should include a larger number of participants in the test-retest process to further validate these findings.
In conclusion, the Hellocog is a valid and reliable tool for screening for dementia and may help improve early detection of dementia.

Notes

Availability of Data and Material

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

Ki Woong Kim, a contributing editor of the Psychiatry Investigation, was not involved in the editorial evaluation or decision to publish this article. All remaining authors have declared no conflicts of interest.

Author Contributions

Conceptualization: Eun Young Kim, Ki Woong Kim. Data curation: Hee Won Yang, Daniel Hahnsam Seok. Formal analysis: Hee Won Yang, Daniel Hahnsam Seok. Funding acquisition: Ji Won Han, Ki Woong Kim. Investigation: Hee Won Yang, Daniel Hahnsam Seok, Ki Woong Kim. Methodology: Hee Won Yang, Eun Young Kim. Project administration: Hee Won Yang, Eun Young Kim. Resources: Eun Young Kim, Seon Hyeok Kim, Jin Hwan Lim. Software: Seon Hyeok Kim. Supervision: Eun Young Kim, Ki Woong Kim. Validation: Ji Won Han, Ki Woong Kim. Visualization: Daniel Hahnsam Seok. Writing—original draft: Daniel Hahnsam Seok, Hee Won Yang, Ki Woong Kim. Writing—review & editing: all authors.

Funding Statement

This work was supported by the ATC (Advanced Technology Center) Program (10076733, Development of the cognitive rehabilitation solution for patients with cognitive impairment using speech recognition and eye tracking technology) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).

ACKNOWLEDGEMENTS

None

Table 1.
Demographic and clinical characteristics of the participants
Dementia (N=91) Control (N=94) p
Age (yr) 77.9±6.4 74.1±4.3 <0.001*
Men 31 (34.1) 43 (45.7) 0.106*
Education (yr) 9.8±5.2 12.8±4.2 <0.001*
MMSE (points) 20.16±5.0 27.3±2.3 <0.001†
CERAD-TS (points) 45.8±13.1 74.2±10.7 <0.001†
GDtS (points) 4.2±0.8 1.8±0.8 <0.001†
Hellocog (points) 3.5±3.1 -3.8±2.3 <0.001†

Data are presented as mean±standard deviation or N (%).

* Student’s t-test for continuous variables and chi-squared test for categorical variables; † analysis of variance adjusted for age, sex, and education.

MMSE, Mini-Mental State Examination; CERAD-TS, Consortium to Establish a Registry for Alzheimer’s Disease Neuropsychological Assessment Battery total score; GDtS, Global Deterioration Scale

Table 2.
Concurrent validity of the Hellocog
Hellocog MMSE CERAD-TS GDtS
Hellocog 1
MMSE -0.73* 1
CERAD-TS -0.82* 0.83* 1
GDtS 0.76* -0.79* -0.80* 1

* p<0.001 by Pearson’s correlation analyses adjusted for age and education.

MMSE, Mini-Mental State Examination; CERAD-TS, Consortium to Establish a Registry for Alzheimer’s Disease Neuropsychological Assessment Battery total score; GDtS, Global Deterioration Scale

Table 3.
Comparison of diagnostic performance for dementia between the Hellocog and other instruments
Instrument  Cutoff   Sens   Spec   PPV    NPV    AUC (95% CI)
Hellocog                                         0.972 (0.937–0.991)
            -1.557   0.945  0.851  0.860  0.941
            -1.524   0.945  0.862  0.868  0.942
            -1.494*  0.945  0.872  0.877  0.942
            -1.464   0.934  0.872  0.876  0.932
            -1.355   0.934  0.883  0.885  0.932
MMSE                                             0.958 (0.918–0.982)
            27.5     0.670  0.967  0.952  0.752
            26.5     0.862  0.933  0.926  0.875
            25.5*    0.926  0.900  0.900  0.926
            24.5     0.979  0.822  0.842  0.976
            21.5     0.979  0.600  0.703  0.967
CERAD-TS                                         0.964 (0.925–0.986)
            67.0     0.830  0.933  0.923  0.850
            64.5     0.904  0.922  0.918  0.908
            62.5*    0.936  0.922  0.921  0.937
            60.5     0.947  0.911  0.912  0.947
            58.5     0.957  0.900  0.903  0.956

* optimal cutoff score.

Sens, sensitivity; Spec, specificity; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve; CI, confidence interval; MMSE, Mini-Mental State Examination; CERAD-TS, Consortium to Establish a Registry for Alzheimer’s Disease Neuropsychological Assessment Battery total score

Table 4.
Comparison of diagnostic performance for dementia due to Alzheimer’s disease between the Hellocog and other instruments
Instrument  Cutoff   Sens   Spec   PPV    NPV    AUC (95% CI)
Hellocog                                         0.971 (0.928–0.992)
            -1.02    0.894  0.918  0.933  0.870
            -0.93    0.904  0.918  0.934  0.882
            -0.63*   0.915  0.918  0.935  0.893
            -0.53    0.915  0.904  0.925  0.892
            -0.42    0.926  0.904  0.926  0.904
MMSE                                             0.954 (0.905–0.982)
            23.0     0.979  0.753  0.836  0.965
            24.0     0.979  0.836  0.885  0.968
            25.0*    0.926  0.904  0.926  0.904
            26.0     0.862  0.932  0.942  0.840
            27.0     0.670  0.973  0.969  0.696
CERAD-TS                                         0.968 (0.924–0.990)
            57.0     0.957  0.896  0.947  0.915
            59.0     0.947  0.917  0.957  0.898
            62.0*    0.936  0.938  0.967  0.882
            63.0     0.926  0.938  0.967  0.865
            64.0     0.904  0.938  0.966  0.833

* optimal cutoff score.

Sens, sensitivity; Spec, specificity; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve; CI, confidence interval; MMSE, Mini-Mental State Examination; CERAD-TS, Consortium to Establish a Registry for Alzheimer’s Disease Neuropsychological Assessment Battery total score

Table 5.
Comparison of the characteristics between mobile applications for screening dementia
Columns: OS (AOS, IOS); Covered cognitive domains* (Att, Exec, Mem, Lang, Perc, Soc); TFA (min); Absolute diagnostic performance (AUC, Sens, Spec); Relative* diagnostic performance (AUC, Sens, Spec)
MOBI-COG 3 0.87 88 85 0.99 0.94 1.05
Dementia test 5
 6CIT 78–90 100
 SCIDS 75–100 100
MOCA 10 0.91 71 93 0.96 0.80 0.97
eSLUMS 7–10 0.98 98 100 1.05 1.10 1.20
ACE-III 5–10 0.95 94 83 1 1.06 0.94
Dementia screener 5 -
 SDS 78 84
 AD8 85 86
Brain test 10–15 0.90 81 88 0.95 0.90 1.00
Hellocog 15– 0.97 95 87 1.01 1.02 0.97

* relative ratio compared to the co-administered Mini Mental Status Examination.

OS, operating system; AOS, Android operating system; IOS, iPhone operating system; Att, attention; Exec, executive function; Mem, learning and memory; Lang, language; Perc, perceptuomotor; Soc, social cognition; TFA, time for administration; AUC, area under receiver operating characteristic curve; Sens, sensitivity; Spec, specificity; 6CIT, 6-item cognitive impairment test; SCIDS, Structured Clinical Interview; SDS, Symptoms of Dementia Screener; AD8, Ascertain Dementia 8; MOCA, the Montreal cognitive assessment; eSLUMS, a digital version of St Louis University Mental Status Exam; ACE, Addenbrooke’s Cognitive Examination
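The relative values in the Hellocog row of Table 5 can be reproduced from Table 3 as a simple ratio. The sketch below assumes that the MMSE comparison values are those at the MMSE optimal cutoff in Table 3; the paper does not state this explicitly, and the variable names are illustrative.

```python
# Values taken from Table 3 (Hellocog and MMSE at their optimal cutoffs).
hellocog = {"auc": 0.972, "sens": 0.945, "spec": 0.872}
mmse = {"auc": 0.958, "sens": 0.926, "spec": 0.900}

# Relative performance = Hellocog value divided by the co-administered MMSE value.
relative = {name: round(hellocog[name] / mmse[name], 2) for name in hellocog}
```

A ratio above 1 indicates that the application performed better than the co-administered MMSE on that metric, and a ratio below 1 indicates that it performed worse.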

REFERENCES

1. Scheltens P, De Strooper B, Kivipelto M, Holstege H, Chételat G, Teunissen CE, et al. Alzheimer's disease. Lancet 2021;397:1577–1590.
2. Kim KW, Park JH, Kim MH, Kim MD, Kim BJ, Kim SK, et al. A nationwide survey on the prevalence of dementia and mild cognitive impairment in South Korea. J Alzheimers Dis 2011;23:281–291.
3. Suh SW, Kim YJ, Kwak KP, Kim K, Kim MD, Kim BS, et al. A 9-year comparison of dementia prevalence in Korea: results of NaSDEK 2008 and 2017. J Alzheimers Dis 2021;81:821–831.
4. Cummings J, Osse AML, Cammann D, Powell J, Chen J. Anti-amyloid monoclonal antibodies for the treatment of Alzheimer's disease. BioDrugs 2024;38:5–22.
5. Amjad H, Roth DL, Sheehan OC, Lyketsos CG, Wolff JL, Samus QM. Underdiagnosis of dementia: an observational study of patterns in diagnosis and awareness in US older adults. J Gen Intern Med 2018;33:1131–1138.
6. Thoits T, Dutkiewicz A, Raguckas S, Lawrence M, Parker J, Keeley J, et al. Association between dementia severity and recommended lifestyle changes: a retrospective cohort study. Am J Alzheimers Dis Other Demen 2018;33:242–246.
7. Mansfield E, Bryant J, Nair BR, Zucca A, Pulle RC, Sanson-Fisher R. Optimising diagnosis and post-diagnostic support for people living with dementia: geriatricians' views. BMC Geriatr 2022;22:143.
8. Frost R, Rait G, Aw S, Brunskill G, Wilcock J, Robinson L, et al. Implementing post diagnostic dementia care in primary care: a mixed-methods systematic review. Aging Ment Health 2021;25:1381–1394.
9. Bradford A, Kunik ME, Schulz P, Williams SP, Singh H. Missed and delayed diagnosis of dementia in primary care: prevalence and contributing factors. Alzheimer Dis Assoc Disord 2009;23:306–314.
10. Dubois B, Padovani A, Scheltens P, Rossi A, Dell’Agnello G. Timely diagnosis for Alzheimer’s disease: a literature review on benefits and challenges. J Alzheimers Dis 2016;49:617–631.
11. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975;12:189–198.
12. Kim KW, Lee DY, Jhoo JH, Youn JC, Suh YJ, Jun YH, et al. Diagnostic accuracy of mini-mental status examination and revised Hasegawa dementia scale for Alzheimer’s disease. Dement Geriatr Cogn Disord 2005;19:324–330.
13. Larner AJ. Cognitive screening instruments for dementia: comparing metrics of test limitation. Dement Neuropsychol 2021;15:458–463.
14. Choi HS, Choe JY, Kim H, Han JW, Chi YK, Kim K, et al. Deep learning based low-cost high-accuracy diagnostic framework for dementia using comprehensive neuropsychological assessment profiles. BMC Geriatr 2018;18:234.
15. Lee JH, Lee KU, Lee DY, Kim KW, Jhoo JH, Kim JH, et al. Development of the Korean version of the consortium to establish a registry for Alzheimer’s disease assessment packet (CERAD-K): clinical and neuropsychological assessment batteries. J Gerontol B Psychol Sci Soc Sci 2002;57:P47–P53.
16. Seo EH, Lee DY, Lee JH, Choo IH, Kim JW, Kim SG, et al. Total scores of the CERAD neuropsychological assessment battery: validation for mild cognitive impairment and dementia patients with diverse etiologies. Am J Geriatr Psychiatry 2010;18:801–809.
17. Han JW, Kim TH, Kwak KP, Kim K, Kim BJ, Kim SG, et al. Overview of the Korean longitudinal study on cognitive aging and dementia. Psychiatry Investig 2018;15:767–774.
18. Yoo SW, Kim YS, Noh JS, Oh KS, Kim CH, NamKoong K, et al. [Validity of Korean version of the mini-international neuropsychiatric interview]. Anxiety Mood 2006;2:50–55. Korean.

19. Youn JC, Kim KW, Lee DY, Jhoo JH, Lee SB, Park JH, et al. Development of the subjective memory complaints questionnaire. Dement Geriatr Cogn Disord 2009;27:310–317.
20. Kim JY, Park JH, Lee JJ, Huh Y, Lee SB, Han SK, et al. Standardization of the Korean version of the geriatric depression scale: reliability, validity, and factor structure. Psychiatry Investig 2008;5:232–238.
21. Lee DY, Lee KU, Lee JH, Kim KW, Jhoo JH, Kim SY, et al. A normative study of the CERAD neuropsychological assessment battery in the Korean elderly. J Int Neuropsychol Soc 2004;10:72–81.
22. Choi HJ, Lee DY, Seo EH, Jo MK, Sohn BK, Choe YM, et al. A normative study of the digit span in an educationally diverse elderly population. Psychiatry Investig 2014;11:39–43.
23. Kim TH, Huh Y, Choe JY, Jeong JW, Park JH, Lee SB, et al. Korean version of frontal assessment battery: psychometric properties and normative data. Dement Geriatr Cogn Disord 2010;29:363–370.
24. Reisberg B, Ferris SH, de Leon MJ, Crook T. The global deterioration scale for assessment of primary degenerative dementia. Am J Psychiatry 1982;139:1136–1139.
25. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (4th ed). Washington, DC: American Psychiatric Association; 1994.

26. McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology 1984;34:939–944.
27. Roman GC, Tatemichi TK, Erkinjuntti T, Cummings JL, Masdeu JC, Garcia JH, et al. Vascular dementia: diagnostic criteria for research studies. Report of the NINDS-AIREN International Workshop. Neurology 1993;43:250–260.
28. McKeith IG. Consensus guidelines for the clinical and pathologic diagnosis of dementia with Lewy bodies (DLB): report of the Consortium on DLB International Workshop. J Alzheimers Dis 2006;9:417–423.
29. Neary D, Snowden JS, Gustafson L, Passant U, Stuss D, Black S, et al. Frontotemporal lobar degeneration: a consensus on clinical diagnostic criteria. Neurology 1998;51:1546–1554.
30. Youden WJ. Index for rating diagnostic tests. Cancer 1950;3:32–35.
31. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–845.
32. Berg JL, Durant J, Léger GC, Cummings JL, Nasreddine Z, Miller JB. Comparing the electronic and standard versions of the Montreal cognitive assessment in an outpatient memory disorders clinic: a validation study. J Alzheimers Dis 2018;62:93–97.
33. Mielke MM, Machulda MM, Hagen CE, Edwards KK, Roberts RO, Pankratz VS, et al. Performance of the CogState computerized battery in the Mayo Clinic Study on Aging. Alzheimers Dement 2015;11:1367–1376.
34. Ruggeri K, Maguire Á, Andrews JL, Martin E, Menon S. Are we there yet? Exploring the impact of translating cognitive tests for dementia using mobile technology in an aging population. Front Aging Neurosci 2016;8:21.
35. Scharre DW, Chang SI, Nagaraja HN, Vrettos NE, Bornstein RA. Digitally translated self-administered gerocognitive examination (eSAGE): relationship with its validated paper version, neuropsychological evaluations, and clinical assessments. Alzheimers Res Ther 2017;9:44.
36. Allard M, Husky M, Catheline G, Pelletier A, Dilharreguy B, Amieva H, et al. Mobile technologies in the early detection of cognitive decline. PLoS One 2014;9:e112197.
37. Lange S, Süß HM. Measuring slips and lapses when they occur – ambulatory assessment in application to cognitive failures. Conscious Cogn 2014;24:1–11.
38. Koo BM, Vizer LM. Mobile technology for cognitive assessment of older adults: a scoping review. Innov Aging 2019;3:igy038.
39. Thabtah F, Peebles D, Retzler J, Hathurusingha C. Dementia medical screening using mobile applications: a systematic review with a new mapping model. J Biomed Inform 2020;111:103573.
40. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (5th ed). Arlington: American Psychiatric Association; 2013.

41. Emmady PD, Schoo C, Tadi P. Major neurocognitive disorder (dementia). Treasure Island: StatPearls Publishing; 2022.

42. Levy G, Jacobs DM, Tang MX, Côté LJ, Louis ED, Alfaro B, et al. Memory and executive function impairment predict dementia in Parkinson’s disease. Mov Disord 2002;17:1221–1226.
43. Kansagara D, Freeman M. A systematic evidence review of the signs and symptoms of dementia and brief cognitive tests available in VA. Washington, DC: Department of Veterans Affairs (US); 2010.

44. de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. Cambridge: Cambridge University Press; 2011.