INTRODUCTION
Major depressive disorder (MDD) is a debilitating disease affecting 350 million people globally. MDD is the second leading contributor to global disease burden, as expressed in disability-adjusted life years (DALYs), in both developed and developing countries [1]. Antidepressant (AD) medications, with over 40 compounds belonging to different classes available on the market, are the mainstay of moderate to severe MDD treatment [2-4]. Response to ADs, however, shows significant inter-individual variability: around one third of patients attain full symptom remission after the first AD treatment, one third need treatment augmentation or a switch in order to reach remission, and one third do not respond to two or more ADs prescribed at an adequate dose (treatment resistance) [5,6]. The current paradigm guiding the search for the most effective treatment for each patient is one of trial-and-error (heuristic) prescription. As a consequence, delayed recovery and undesired side effects are frequently encountered along the way, until the optimal treatment is identified [7]. In this scenario, the prediction of treatment response has become a crucial but daunting task for clinicians and has spawned the search for biomarkers of treatment response. On the premise that AD response frequently clusters in families [8,9], supporting the hypothesis of a genetic component, and that common genetic variants were estimated to explain 42% of variance in AD response [10], pharmacogenetics aims to match the pharmacological profile of the medication to the patient’s genetic profile in order to optimize prescription. On this account, pharmacogenetics represents a key contributor to the implementation of precision medicine.
The greatest part of human genetic variability is explained by single nucleotide polymorphisms (SNPs), i.e. single base-pair changes in the DNA sequence, which is why most pharmacogenetic studies focus on this type of variant. Less common and larger variations include deletions, insertions and copy number variations, which have been implicated in psychiatric disorders with a neurodevelopmental component, such as schizophrenia and autism [11,12], but they probably have no relevant role in AD response and tolerability [13,14]. The first pharmacogenetic studies investigated candidate variants in genes thought to play a key role in AD mechanisms of action, such as the serotonin transporter gene, or in AD metabolism, but their results revealed that most variants considered individually have no reproducible effect. Another limitation of candidate gene studies is linked to their hypothesis-driven approach: the molecular actors mediating AD effects are only partially known. In the last decade, advances in genotyping technologies and analysis approaches have given fresh momentum to genetic association studies and shifted the conceptual framework from candidate genes to the whole genome and from single polymorphisms to multi-marker approaches. Over 20 years of pharmacogenetic research have provided promising but also inconsistent findings that call for careful evaluation. The present review aims to summarize and discuss the main findings of AD pharmacogenetics as well as the available clinical applications.
CANDIDATE GENE STUDIES: USEFUL FOR STUDYING AD PHARMACOKINETICS
The findings of candidate gene studies were for the most part inconclusive because they investigated individual polymorphisms with small effect sizes (OR usually <2.0 [15]) in underpowered samples. This often led to non-replication or to the identification of a much smaller effect size in replication studies than in the initial ones. Accordingly, a number of historic candidate genes for MDD were not replicated in samples much larger than those of the original studies [16]. The usefulness of candidate gene studies has been called into doubt since the emergence of the hypothesis that MDD and AD response are highly polygenic and that part of the involved loci presumably lie in genes without a known link with MDD or AD action.
An exception is represented by cytochrome P450 (CYP450) genes, which show consistent associations with AD side effects and in some cases response and were endorsed by guidelines as validated predictors of AD clinical outcomes [17,18]. The CYP450 superfamily is a class of enzymes with a major role in the oxidation and reduction of both endogenous and xenobiotic substances (phase I reactions). The isoenzymes involved in AD metabolism and endorsed by guidelines as biomarkers of AD clinical outcomes are CYP2D6 and CYP2C19. The genes coding for these isoforms are highly polymorphic, and alleles may show normal, partially or totally defective, or increased activity, defining functional groups according to the observed allele combination. Subjects carrying two completely defective alleles are defined as poor metabolizers (PMs); ultra-rapid metabolizers (UMs) carry two alleles with increased activity or gene duplications; extensive metabolizers (EMs) do not show any variant allele and have normal enzymatic function. Two intermediate groups are represented by intermediate metabolizers (IMs) and EMs+. These functional groups show a replicated correlation with antidepressant pharmacokinetics (e.g. drug and metabolite plasma levels, plasma half-life, oral clearance), which is stronger for PMs and UMs, the two groups showing the largest differences compared to EMs [19]. Given the known relationship between plasma levels of tricyclic ADs (TCAs) and response/side effects, the strongest evidence of association between PM/UM status and clinical outcomes was found for this AD class, and the current guidelines include prescription recommendations for seven TCAs [20]. Results were more inconsistent for other AD classes, probably because of the non-linear relationship between drug plasma levels and efficacy/side effects, which makes differences between EMs and PMs or UMs difficult to detect, particularly since PMs and UMs are relatively rare in the general population. 
Large samples or meta-analyses are needed to capture differences in response/side effects in these groups, as recently demonstrated for citalopram/escitalopram and CYP2C19 [21]. This meta-analysis showed that CYP2C19 PMs have a higher risk of side effects during the first month of treatment but also higher remission rates and better symptom improvement after 2–3 months of treatment. The contribution of CYP2D6/CYP2C19 and other candidate genes to the current clinical applications of AD pharmacogenetics is further discussed in the section on clinical applications.
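The grouping rules described above can be sketched in a few lines of Python. This is only an illustration of the definitions given in the text, not a clinical genotype-to-phenotype translation: the `Activity` categories, the function name and the `gene_duplication` flag are hypothetical, and because the text does not define IMs and EMs+ in terms of allele combinations, every remaining combination is simply returned as "intermediate".

```python
from enum import Enum


class Activity(Enum):
    """Functional activity of a single CYP2D6 or CYP2C19 allele (hypothetical labels)."""
    NONE = "completely defective"
    REDUCED = "partially defective"
    NORMAL = "normal"
    INCREASED = "increased"


def metabolizer_group(allele_1, allele_2, gene_duplication=False):
    """Assign a functional group from two alleles, following the text's definitions.

    Assumption: combinations not explicitly defined in the text are lumped
    together as 'intermediate (IM or EM+)' rather than split further.
    """
    alleles = (allele_1, allele_2)
    # PM: two completely defective alleles.
    if all(a is Activity.NONE for a in alleles):
        return "PM"
    # UM: two increased-activity alleles, or a gene duplication.
    if gene_duplication or all(a is Activity.INCREASED for a in alleles):
        return "UM"
    # EM: no variant allele, normal enzymatic function.
    if all(a is Activity.NORMAL for a in alleles):
        return "EM"
    return "intermediate (IM or EM+)"


if __name__ == "__main__":
    print(metabolizer_group(Activity.NONE, Activity.NONE))
    print(metabolizer_group(Activity.NORMAL, Activity.REDUCED))
```

In practice, guideline bodies publish their own allele function tables and translation rules, which should be preferred over a simplified mapping of this kind.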
GENOME-WIDE ASSOCIATION STUDIES
Genome-wide association studies (GWAS) followed the introduction of chip-based microarray technology capable of interrogating up to a few million polymorphisms, ushering in a new term, pharmacogenomics, which encapsulates the shift from single genes to virtually the whole genome. GWAS were hailed as a major breakthrough for a number of reasons. Moving beyond candidate gene studies, they dispense with the need for an a priori hypothesis, which is quite convenient considering that the most significantly associated polymorphisms in behavioral GWAS occur in non-coding sequence [22] and that the mechanisms of AD action are not fully elucidated. They provide extensive coverage and they make it possible to analyze sets of variants that are included in a gene or molecular pathway, which set the stage for multi-marker statistical models such as pathway analysis and polygenic risk scores (PRS) (see the section on multi-marker approaches). However, while anticipated to mark a considerable advance relative to candidate gene studies, on the whole GWAS of AD response and their meta-analyses have yet to establish any replicated genetic variant [23-36]. Only a few polymorphisms met the threshold for genome-wide significance (p<5e-08). In GWAS of samples of Caucasian descent, a few significant variants were identified, namely rs1908557, a SNP located within an intron of human spliced expressed sequence tags on chromosome 4, and rs10769025, a potentially regulatory SNP in the ALX4 gene [32]. ALX4, while most commonly associated with bone development [37], may act indirectly through different signaling cascades to decrease serotonin availability and increase neuroinflammation and aberrant neuronal signaling, all of which are implicated in MDD pathogenesis and AD response [24]. The picture does not change much for GWAS based on Asian samples. 
Only one study reported a couple of positive genomewide significant associations with non-response in the AUTS2 gene (rs7785360 and rs12698828 SNPs) [38]. AUTS2 is implicated in neurodevelopmental disorders including autism [39] but also schizoaffective or bipolar affective disorders [40,41].
While GWAS most often point to non-coding sequence for association with complex traits, some studies focused on exons (the coding segments of the genome) since variants there may directly affect protein function/levels. In the first whole-exome sequencing study of AD response, the rs41271330 A allele in the bone morphogenetic protein 5 (BMP5) gene was associated with worse treatment response [42]. This finding seems to echo one of the top findings of a previous GWAS (rs6127921 in the BMP7 gene) [29]. A subsequent study limited to a functional exome array [43] revealed an exome-wide significant finding in a methylated DNA immunoprecipitation sequencing site; furthermore, a combination of three exome variants could reportedly predict AD response with an area under the receiver operating characteristic (ROC) curve of 0.95. However, the lack of replication of the three-locus model in three Caucasian GWAS [44] disappointingly hints at some specific characteristics of the Mexican-American sample or at a false positive.
The overview of GWAS findings above reveals that, while shedding some light, these studies fell short of providing revolutionary findings. The main weakness of GWAS is that, by focusing on genome-wide significant hits, they do not really model the polygenic nature of AD response. Polymorphisms with a genuine but small effect size may not reach the genome-wide significance threshold in samples of hundreds to a few thousand subjects. In fact, considering that odds ratios of 1.1–1.2 were found for significant GWAS findings, samples running into tens of thousands are probably required. Further, it is difficult to know which non-significant findings may be of some value because their biological role is often unknown. Lastly, genome-wide microarrays include a limited number of pre-selected variants, while over 39 million are reported in the first release of the Haplotype Reference Consortium data, a reference panel of human genetic variation [45]. The variants included in genome-wide microarrays are common variants (observed in >1% of the population), and for this reason GWAS do not investigate the effect of rare genetic polymorphisms, which may nevertheless play a role in AD response. In fact, the heritability of several common traits remains partly unexplained when the results of GWAS are compared with those of twin studies, suggesting that this missing heritability may be explained by rare single nucleotide variants, which however require very large sample sizes to be studied with sufficient power [46].
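The sample-size argument can be made concrete with a rough calculation. The sketch below is not a formal GWAS power analysis: it uses the standard normal approximation for comparing an allele frequency between equally sized case and control groups, and it assumes a two-sided threshold of 5e-08, 80% power and a 5% frequency of the tested allele in controls. The function name and default values are illustrative choices.

```python
import math
from statistics import NormalDist


def cases_needed(odds_ratio, control_allele_freq, alpha=5e-8, power=0.80):
    """Approximate number of cases (with as many controls) for an allelic test.

    Assumptions: two-sided two-proportion z-test on allele counts, equal group
    sizes, two alleles per subject, and the odds ratio applied per allele.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)

    p_controls = control_allele_freq
    # Allele frequency in cases implied by the per-allele odds ratio.
    odds_cases = odds_ratio * p_controls / (1 - p_controls)
    p_cases = odds_cases / (1 + odds_cases)

    variance = p_controls * (1 - p_controls) + p_cases * (1 - p_cases)
    alleles_per_group = (z_alpha + z_power) ** 2 * variance / (p_cases - p_controls) ** 2
    # Each subject contributes two alleles.
    return math.ceil(alleles_per_group / 2)


if __name__ == "__main__":
    for odds_ratio in (1.1, 1.2):
        print(odds_ratio, cases_needed(odds_ratio, control_allele_freq=0.05))
```

Under these assumptions the required number of cases comes out in the tens of thousands for both odds ratios, in line with the order of magnitude stated above; more common alleles or larger effects would lower the requirement.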
MULTI-MARKER APPROACHES
The availability of genome-wide data and more recently sequencing data enabled analysis approaches that gauge the aggregated effect of variants at the gene and pathway level, i.e. gene and pathway analysis respectively. The conceptual foundation for such approaches is that SNPs do not act as single units but interact with each other, within the same gene and across different genes constituting a pathway. From a statistical standpoint, by reducing the number of tests performed (~20,000 genes are known in the human genome in comparison to tens of millions of SNPs) and consequently relaxing the multiple-testing correction, these methods provide higher power than single-variant analysis. More to the point, since heterogeneity is expected to affect individual polymorphisms more than pathways, these methods can increase the replicability of findings in independent samples. Available pathway analyses consistently pointed to pathways involved in neuroplasticity-neurogenesis and inflammation, even though there was no replication at the individual pathway level. In particular, the long-term potentiation (LTP) pathway [47], the inorganic cation transmembrane transporter activity pathway [28], the GAP43 pathway [48], the cAMP signaling pathway and the chromatin silencing pathway [23] underscore the importance of hippocampal plasticity and neurogenesis, which are indeed mechanisms known to mediate the AD effect [49]. As regards inflammation pathways, the KEGG B cell receptor signaling pathway [50], the antigen processing and presentation pathway and the tumor necrosis factor pathway [47] were suggested to affect antidepressant response. Further, genes involved in extracellular matrix remodeling (e.g., ADAMTSL1, CD36, PON2, APOB, and PIK3R1) and thus modulating the release of inflammatory factors were associated with AD efficacy [30]. 
These findings suggested a key role for abnormalities in inflammatory cytokine production and immune cell activation in MDD pathogenesis, upon which antidepressants were shown to act [51,52]. Despite preliminary promising results, there are unmet methodological challenges. First, the development of next-generation pathway analysis methods is hindered by low resolution pathway data, missing condition- and cell-specific information, and incomplete annotations. Second, the inability to integrate the dynamic nature of a biological system into the analysis limits the utility of existing methods [53].
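The statistical argument made earlier in this section, that testing genes instead of single variants relaxes the multiple-testing correction, can be illustrated with a simple Bonferroni calculation. The figure of one million independent common-variant tests is an assumption (it is the convention usually cited behind the 5e-08 genome-wide threshold, not a number given in this review), and the function name is illustrative; the gene count of about 20,000 comes from the text.

```python
def bonferroni_threshold(n_tests, family_wise_alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_wise_alpha / n_tests


if __name__ == "__main__":
    # Assumed number of effectively independent common-variant tests.
    snp_level = bonferroni_threshold(1_000_000)
    # Approximate number of human genes, as stated in the text.
    gene_level = bonferroni_threshold(20_000)
    print(f"SNP-level threshold:  {snp_level:.1e}")
    print(f"Gene-level threshold: {gene_level:.1e}")
    print(f"Gene-level threshold is {gene_level / snp_level:.0f} times less stringent")
```

The less stringent gene-level threshold is one source of the power gain; it does not by itself address the annotation and resolution problems of pathway data noted above.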
PRS is another aggregated approach, recently developed to take into account the concept that single variants may not have appreciable effects per se but only in conjunction with other variants. PRS is designed to estimate an individual’s propensity to a particular phenotype, and it condenses into a single metric the cumulative effect of a number of variants on a complex trait, calculated as the sum of the trait-associated alleles carried by the individual, each weighted by its effect size on the trait. The typical approach of studies using PRS is to estimate the polygenic score in a training sample and then test it in a validation or target sample in order to replicate the predictive value of the PRS. This method is also suited to investigating the shared genetic etiology between complex phenotypes; in particular, risk scores for a number of conditions can be tested for association with AD response. PRS did not demonstrate any genetic overlap between AD response and liability to schizophrenia, bipolar disorder or MDD, while they showed a genetic overlap between AD response and openness, conscientiousness and neuroticism [26,36,54]. However, in another study the association between neuroticism and less favorable response to SSRIs did not reach significance even though a consistent direction of effect was shown [55]. PRS analysis was also used to explore the relevance of systemic inflammation to AD response, revealing that a PRS associated with C-reactive protein (CRP) levels predicted better response to escitalopram but worse response to nortriptyline, in line with the hypothesis that ADs of different classes may interact differently with inflammatory markers [56]. Lastly, modest genetic overlap in predictors of response to ketamine and scopolamine, two rapid-acting experimental ADs, was suggested in a PRS analysis [25].
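The weighted sum that defines a PRS can be sketched as follows. This is a minimal illustration, assuming that weights are per-allele effect sizes (for example log odds ratios) estimated in a training sample and that genotypes are coded as counts of the effect allele (0, 1 or 2). The variant identifiers, weights and function name are hypothetical, and real pipelines add steps such as linkage disequilibrium pruning and p-value thresholding that are omitted here.

```python
import math


def polygenic_score(dosages, weights):
    """Sum of effect-allele counts weighted by per-allele effect sizes.

    dosages: variant ID -> number of effect alleles carried (0, 1 or 2)
    weights: variant ID -> effect size estimated in a training sample
    Variants missing from either input are skipped.
    """
    shared_variants = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared_variants)


if __name__ == "__main__":
    # Hypothetical weights from a training sample (log odds ratios).
    training_weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.02}
    # Hypothetical genotype of one subject in the target sample.
    subject_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

    score = polygenic_score(subject_dosages, training_weights)
    # 2 * 0.10 + 1 * (-0.05) + 0 * 0.02, i.e. approximately 0.15
    print(round(score, 4))
    assert math.isclose(score, 0.15)
```

Scores computed this way for each subject in the target sample are then tested for association with the phenotype of interest, such as AD response.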
Although PRS are expected to be well suited to address the polygenic nature of AD response, they have not yet yielded a reliable predictor of AD response. There seems however to be room for honing PRS. While the main shortcoming has been the use of underpowered samples, a number of strategies could in fact be deployed to optimize the power of existing samples, in particular SNP prioritization based on annotation, other corrections such as winner’s curse adjustments [57], analysis of clinically more homogeneous groups of patients or of specific symptom dimensions, and study of the additive effect between PRS and stressful life events [58].
USEFULNESS OF PHARMACOGENOMICS FOR DRUG REPOSITIONING
The multi-marker approaches described in the previous section can provide valuable results not only in terms of discovering genetic predictors of AD response, but also in terms of information on the biological mechanisms involved in AD response, which is helpful for identifying druggable targets and for drug repositioning. Indeed, drugs developed under genetic guidance are reported to be twice as likely to be clinically approved compared to drugs with no genetically supported link to disease traits [59]. Another advantage of genetics for drug repositioning is cost saving. Conventional drug development is an expensive and lengthy process: on average 13–15 years and US$2–3 billion are allotted to the development of a new drug [60], with only a ~10% chance of successful approval by government regulatory agencies [61]. Drug repositioning using molecular targets identified by genetic studies serves as a complementary, cost-effective approach to extend existing drugs to the treatment of conditions for which they were not initially intended. While sifting through top GWAS hits and checking for overlap with known drug target genes appears most straightforward, this approach comes with a number of limitations. Top GWAS genes may not be easily targeted by a drug; the majority of top hits occur within non-coding regions and may thus be difficult to relate to the proteins they regulate; lastly, variants with a smaller but genuine effect size may go undetected in GWAS. Multi-marker approaches do a better job of leveraging genome-wide data for drug repositioning since they provide a better fit for complex polygenic traits [62]. Pathway analysis revealed that gene-sets targeted by antipsychotics, as well as ADs and anxiolytics, were significantly enriched in GWAS of MDD and anxiety [63]. 
Another study took a complementary approach and compared drug-induced transcriptome against GWAS-imputed expression profiles suggesting that repositioning candidates for a number of disorders were also significantly enriched for known psychiatric medications or therapies considered in clinical trials [64]. Using polygenic analyses, a significantly shared genetic basis of MDD with various cardiometabolic traits was demonstrated and a number of repositioning candidates were revealed at the intersection between the two conditions [65].
CLINICAL APPLICATIONS
Dozens of pharmacogenetic tests aiming to predict antidepressant response and side effects are commercially available. They represent the clinical translation of candidate gene studies which, as discussed above, de facto proved to be unsuited to capture the highly polygenic nature of AD response. According to the Clinical Pharmacogenetics Implementation Consortium (CPIC), the Dutch Pharmacogenetic Working Group (DPWG) and drug regulatory agencies, the only variants that have a sufficient level of support are those within the CYP2C19 and CYP2D6 genes which affect the functional level of the encoded enzyme [17,18,20,66]. Functional variants in these genes are included in all the commercial pharmacogenetic tests, but many tests also include polymorphisms in genes involved in AD pharmacodynamics that do not have enough support in clinical guidelines and have poor experimental support for cost-effectiveness [67].
First-generation pharmacogenetic decision support tools adopted an individual gene testing (IGT) approach which can be conceptualized as testing pharmacogenomic markers ad hoc and using each gene results separately to predict treatment outcomes. Second-generation tools took instead a combinatorial gene testing (CPGx) approach, i.e. they simultaneously assess the combined effects of multiple pharmacokinetic and pharmacodynamic genes for a given medication to provide a cumulative prediction. CPGx is based on the evidence that most ADs and other psychiatric medications interact with multiple pharmacodynamic and pharmacokinetic pathways; in this manner, synergies between some genes and not just single genes are captured [68].
As with any new intervention, randomized controlled trials (RCTs) are needed to establish pharmacogenetic testing superiority over standard of care. The RCTs published to date presented encouraging results but they were not free from potential sources of bias [69-75]. Of note, some were not double-blind, had power issues and were heterogeneous in their design and methods. These RCTs investigated five different pharmacogenetic tests, each with specific characteristics in terms of included variants and algorithm used to predict treatment outcomes. Therefore, their results may not be generalized and interpreted as broad indicators of pharmacogenetics usefulness in predicting treatment outcomes [67]. A meta-analysis including five RCTs concluded that pharmacogenetic-guided prescribing improved the likelihood of achieving symptom remission. However, this benefit may be limited to individuals with moderate to severe depression and a history of inadequate response or intolerability to previous psychotropic medications, given the inclusion criteria of the considered RCTs. In point of fact, developers of pharmacogenetic tools initially put them forward for pre-emptive use (i.e. prior to prescribing) but they may be more useful in patients who failed at least one previous treatment. Indeed, patients failing to respond to multiple medication trials or experiencing a high side-effect burden may carry more clinically actionable genetic variation [76].
At present, pharmacogenetic testing is sometimes reimbursed by health insurance companies or national health services for patients who did not respond to at least one previous treatment or did not tolerate its side effects [77].
However, there is still no definitive demonstration of a favorable cost/benefit ratio even when testing only the variants in CYP2C19 and CYP2D6 as recommended by guidelines. Large multi-center projects such as Ubiquitous Pharmacogenomics are expected to clarify this point using a randomized design [78].
DISCUSSION
The final aim of pharmacogenetics is to complement clinical criteria used in medication choice with information on the individual’s genetic profile to personalize treatment prescription. While the basic premise appears sound, i.e. matching response/side effects of each AD to the individual’s genetic make-up, results are still inconclusive apart from CYP2D6 and CYP2C19 genes.
The first surge of pharmacogenetic research used the candidate gene approach for more than a decade. It later became clear that, while helping to shed some light on the role of a limited number of genes, the candidate gene design is not suitable to uncover the complexity of AD pharmacogenomics. As a matter of fact, the involved signals are on average very weak and broadly spread across the whole genome, as opposed to a few hits with a large effect size. Upon the introduction of chip-based microarray technology, GWAS and multi-marker tests (pathway analysis and PRS) took over and provided potential solutions to some of the shortcomings of the candidate gene studies. Indeed, multi-marker tests arguably represent to date the most suitable approach to capture the highly polygenic nature of AD response and they will probably represent the future of AD pharmacogenomics. The use of machine learning algorithms is a promising option to unravel the complexity of the multiple non-linear interactions among the involved genetic variants. Some studies have already tested this approach, although convincing independent replication is lacking [79-81]. Improvements in statistical analysis methods and in genotyping technologies will both contribute to the evolution of pharmacogenomics in the coming years. For example, the cost of genotyping has shown a more than exponential decrease after 2007, and the cost of sequencing a human genome dropped from $95,263,072 in 2001 to $1,121 in 2017 [82].
Some unsolved issues which may account for the often inconclusive results of pharmacogenomic research should be considered. One important limitation of GWAS has been the use of underpowered samples. The key variables influencing power are the prevalence of the trait of interest (e.g., AD response) and sample size. Testing a few million markers under the assumption of an odds ratio of 1.1–1.2, 5% disease prevalence, 5% minor allele frequency (MAF) and complete linkage disequilibrium (LD) indeed requires tens of thousands of cases and controls [83]. The rise of consortia, such as the Psychiatric Genomics Consortium (PGC), aims to collect large cohorts and in this way overcome the power issue of previous studies. An alternative way to increase power is by reducing the prevalence of the trait under study, i.e. by selecting more homogeneous subgroups. As a matter of fact, different studies used different definitions of AD response and different assessment scales, and these were usually quite unspecific. For example, an improvement of more than 50% on a certain assessment scale was frequently used as the response phenotype, without taking into account other sources of heterogeneity such as the number of failed AD treatments during the current episode. Treatment-resistant depression (TRD) is an example of a phenotype which could serve the purpose of reducing heterogeneity and focusing on a clinically relevant group of patients. The reduction of clinical heterogeneity has been successfully applied to improve power in the study of MDD genetics by the CONVERGE consortium, which recruited a sample of Chinese women with recurrent MDD and homogeneous clinical characteristics [84]. Despite a limited sample size (5,303 cases), the study was able to identify two replicated risk loci. Another strategy to address clinical heterogeneity is represented by the use of dimensional classifications over categorical ones. 
In line with this approach, the US National Institute of Mental Health (NIMH) has launched the Research Domain Criteria (RDoC) project to create a framework for research on pathophysiology. Traditional diagnostic categories, such as those of the Diagnostic and Statistical Manual of Mental Disorders (DSM) [85] and the International Classification of Diseases (ICD) [86], generally appear not well suited to findings emerging from clinical neuroscience and genetics. RDoC seeks to revisit psychopathology with the tools of clinical neuroscience by defining basic dimensions of function that cut across and beyond disorders as traditionally defined and can be studied across multiple layers of analysis, from genes to neural circuits to behaviors [87].
Finally, there are categories of biomarkers that are complementary to genomic ones, particularly those studied by epigenomics, transcriptomics and proteomics, and that should therefore be integrated in a broader framework. Indeed, genetic polymorphisms represent only a first level of biomarker, considering that gene expression depends on several regulatory mechanisms and that the final protein levels are affected not only by gene expression level but also by protein metabolism.
In the precision medicine era, pharmacogenetic testing for psychotropic medication use is increasing among physicians in the United States [88] and Canada [89]. Clinicians should anticipate patients asking about pharmacogenetic testing and inform them about the current clinical applications. It is therefore important for clinicians to keep abreast of this evolving area to best facilitate informed discussions with their patients. The pharmacogenetic tests currently commercialized, like the candidate gene studies they are grounded on, do not account for the polygenic structure of AD response, and polymorphisms within the CYP2C19 and CYP2D6 genes are the only tested variants backed by a good level of evidence. Still, based on the meta-analytic validation of RCTs, a ‘tipping point’ of evidence has been reached, so that pharmacogenetic decision support tools merit consideration by clinicians treating patients who have not responded to or have not been able to tolerate one or more psychotropic medications.