DOI: https://doi.org/10.4414/smw.2019.20151
One of the most paradoxical phenomena of human behaviour is that, in many situations, individuals will invest a considerable amount of effort to obtain an object of their desire; however, once they obtain the object, they experience it as only slightly pleasant compared with the immense effort they mobilised. This happens in nonpathological behaviours such as occasional overeating, but it becomes emblematic in the case of substance addiction, where individuals are willing to go to extraordinary lengths in order to obtain a substance, even though after a period of time the substance itself elicits less and less pleasurable feelings during its consumption [1].
This kind of pursuit of rewards, such as drugs or alcohol, has been characterised as compulsive because it not only persists despite the total absence of positive consequences but it also persists despite the presence of adverse consequences [2–5]. Compulsive reward-seeking behaviours play an important role in several psychiatric disorders (fig. 1), but in this review we will focus on substance addiction (or substance use disorder). In particular, we will illustrate how stress exacerbates compulsive reward-seeking behaviours and in turn triggers relapses in substance addiction [6].
Strikingly, very large individual differences exist when it comes to the development of compulsive reward-seeking behaviours in substance addiction [7]. Decades of research in affective neuroscience have been devoted to understanding how and why some individuals are more vulnerable than others to situations where choice behaviours are hijacked in the service of outcomes that are no longer – or very little – valued by the individual [1, 4, 5, 8–11]. This research has aimed at identifying the underlying neuropsychological mechanisms that can lead to compulsive reward-seeking behaviours where the amount of effort mobilised to obtain an outcome is no longer justified by its value.
Here, we summarise findings from this literature and we illustrate how the neuropsychological mechanisms implicated in compulsive reward-seeking behaviours can elucidate how and why individuals are more likely to relapse under stress in the case of substance addiction. We summarise literature suggesting that the balance between two fundamental learning systems – goal-directed and habitual learning – is critical for the understanding of compulsive reward-seeking behaviours. We present findings suggesting that individual differences observed during Pavlovian learning could be promising to identify risk profiles for compulsive reward-seeking behaviours. Critically, at each step, we illustrate how these mechanisms can be applied to better understand substance addiction and vulnerabilities to relapse under stress.
The presence of compulsive reward-seeking behaviours is one of the key characteristics of substance addiction. These behaviours are defined by the persistent pursuit of a reward, despite that reward being no longer valued by the individual: there is a disparity between the choice behaviour and the amount of mobilised effort on one hand, and the actual pleasure the individual experiences once that reward is obtained on the other [1, 4, 5]. Affective neuroscience suggests that these kinds of behaviours arise from the existence of different learning mechanisms exerting parallel controls on behaviour [12–15]. Classically, three learning systems controlling behaviour are distinguished: goal-directed control (driven by goals), and habitual control and Pavlovian control (which are both driven by cues).
Pavlovian control is one of the oldest learning mechanisms shared across the animal kingdom and has a deep influence on human physiology [16–18], behaviour [19–22] and cognition [23, 24]. In Pavlovian conditioning, reflexive conditioned behaviours [14, 15] (e.g., salivating) or motivational states [25–27] can be triggered by an environmental stimulus (e.g., a ticking sound from a metronome) that predicts the delivery of an affectively significant outcome (e.g., food). During Pavlovian conditioning, the organism learns the association between a stimulus and an outcome that occurs independently of its behaviour. Conditioned Pavlovian behaviours (e.g., salivating) help the organism prepare for the outcome delivery (e.g., arrival of food), but do not allow the organism to actively increase the probability of obtaining the rewarding outcome [14]. In contrast, goal-directed control and habitual control involve learning about an instrumental action (e.g., pressing a lever) that actively increases the probability of obtaining a reward (e.g., food).
Goal-directed control involves learning the association between a specific action (e.g., ordering a glass of wine) and the current value of its outcome (e.g., pleasure while drinking the glass of wine; fig. 2). This class of behaviours (i.e., goal-directed actions) is driven by a rich representation of the outcome and its causal relationship with the action. Therefore, behaviour under goal-directed control is flexibly proportional to the predicted value of the outcome that it is leading to. Importantly, this flexibility comes at a cost: goal-directed control demands significantly more cognitive resources to implement than habits [9, 14, 28].
Habitual control is highly efficient and can be executed with minimal cognitive effort. Unlike goal-directed actions, habits are driven by environmental cues (e.g., seeing a bar) and are executed reflexively (e.g., automatically ordering a glass of wine) regardless of the current value of the outcome the action is leading to (fig. 2). However, this implies that actions can come to be selected even if the action’s outcome is no longer relevant to or valued by the individual (e.g., ordering a drink even though drinking it is not pleasurable or watching television at night even though it is not liked) [14, 29, 30]. Typically, habits are distinguished from goal-directed actions based on outcome devaluation tests. In classical animal experiments, the outcome (e.g., a food pellet) is devalued either by it being fed until satiation or by it being associated with a state of gastric illness (e.g., induced by lithium chloride injections). If a behaviour persists even after the action’s outcome is no longer valued (e.g., ordering a glass of wine the day after being intoxicated from binge alcohol consumption), then it is considered to be under habitual control.
The psychological distinction between Pavlovian, habitual and goal-directed control is rooted in distinct neuronal networks. At a computational level, the interactions between Pavlovian control and the other two controllers remain relatively poorly investigated [31–34]. However, the interactions and distinctions between the goal-directed and habitual computational strategies have been extensively studied [14, 29, 30, 35, 36]. Two algorithms accounting for reinforcement learning (RL) have been used to define the computations and the learning signals of goal-directed versus habitual control. The fundamental difference between these two computational strategies is whether they rely on the representation of an “internal model” of the environmental contingencies. The strategy that relies on an internal model is defined as model-based RL and is used to approximate goal-directed control. Model-based RL is computationally sophisticated: it builds a cognitive map of the environment (its states, actions, and the transition probabilities between them) to plan behaviour prospectively. Therefore, if an outcome has been devalued, its associated cues and actions are immediately devalued as well. Model-based computations are mainly implemented in the dorsolateral prefrontal, the orbitofrontal and the posterior parietal cortices [14, 37].
The strategy that does not rely on an internal model is defined as model-free RL and is used to approximate habitual control. Model-free RL is computationally simple as it ascribes value to an action only by integrating past reinforcement experiences: actions that have been rewarded in the past are more likely to be repeated in the present. These algorithms learn retrospectively and are merely updated after an experience with the environment (i.e., reward prediction errors). Therefore, if an outcome has been devalued, its associated cues and actions will be devalued only if they are paired again with the now devalued outcome: to be updated they require a direct associative experience. These computations are mainly implemented in the dopaminergic activity of the striatum [14, 37].
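The contrast between the two strategies in an outcome devaluation test can be sketched with a toy example. This is a minimal illustration, not a model taken from the studies cited above: the single lever-press action, the learning rate, the trial counts and all function names are assumptions chosen for clarity.

```python
# Toy outcome-devaluation test: one action (lever press) yields one outcome (food).
# All names and parameter values are illustrative assumptions.

ALPHA = 0.2  # assumed learning rate of the model-free learner


def model_free_update(q, reward, alpha=ALPHA):
    """Update a cached action value from one reward prediction error."""
    prediction_error = reward - q
    return q + alpha * prediction_error


def model_based_value(p_outcome_given_action, outcome_value):
    """Plan prospectively: transition probability times current outcome value."""
    return p_outcome_given_action * outcome_value


# Training phase: the food is worth 1.0 and always follows the lever press.
q_mf = 0.0
for _ in range(50):
    q_mf = model_free_update(q_mf, reward=1.0)
q_mb = model_based_value(p_outcome_given_action=1.0, outcome_value=1.0)

# Devaluation phase: the food loses its value away from the lever.
devalued_value = 0.0
q_mb_after = model_based_value(1.0, devalued_value)  # drops immediately
q_mf_after = q_mf  # unchanged: no new action-outcome experience yet

# Re-exposure phase: the cached value only falls once the action is
# paired again with the now devalued outcome.
q_mf_reexposed = q_mf_after
for _ in range(10):
    q_mf_reexposed = model_free_update(q_mf_reexposed, reward=devalued_value)

print(q_mb_after, q_mf_after, q_mf_reexposed)
```

In this sketch the model-based value tracks the devaluation at once, whereas the model-free value stays high until the action is experienced again with the devalued outcome, which is the behavioural signature used to classify a response as habitual.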
Habits and their corresponding model-free computations are not in and of themselves pathological: the synergy between habitual control and goal-directed control allows individuals to be flexible and efficient in order to face the constant challenges present in the environment and to optimise reward pursuit. For instance, we can habitually and effortlessly always choose to go to the same restaurant for lunch, but if that restaurant is closed, we are able to invest more effort in evaluating different restaurants and to choose the best option (e.g., a restaurant that is in the neighbourhood serving a type of cuisine we like).
Initially, behaviour is predominantly under goal-directed control so that it can be constantly adjusted, but the more the behaviour is repeated, the more habitual control becomes predominant so that optimised actions can be executed with much less effort [9]. Another factor determining the predominance of goal-directed behaviour is the cost of cognitive control [38–40]: because the organism has limited resources, it should engage in costly processes such as goal-directed control only when this strategy is clearly superior to the other types of control [41, 42]. If a task is very easy, or too hard to be solved even through sophisticated model-based computations, then it is unlikely that effortful goal-directed processes will be engaged [43].
The degree to which the habitual, goal-directed and Pavlovian systems control behaviour varies over time and context. An influential theoretical idea is that a number of compulsive behaviours in humans and other animals can be understood as emerging from interactions between two or more of these controllers [12]. In particular, a prominent hypothesis suggests that the imbalance between habitual and goal-directed control is a critical trans-diagnostic neurocognitive mechanism underlying compulsive symptoms in several mental health disorders [44] ranging from substance addiction to eating disorders (e.g., binge eating) and behavioural addictions (e.g., gambling; problematic internet use or gaming disorder [45]). This imbalance favours the rigid habitual control over the flexible goal-directed control, thereby stereotyping behaviour that is triggered by environmental cues and is insensitive to its consequences [3, 44, 46–50]. Importantly, this imbalance is still not fully understood: there are several mechanisms that could drive it such as (1) a potentiation of the habitual responses, (2) an impairment of the goal-directed action, or (3) a dysregulation in the arbitration between the habitual and the goal-directed control (e.g., a problem in the inhibition of the habitual responses) [14, 51].
The role of the imbalance between habits and goal-directed actions has been particularly prominent among theories of substance addiction: initially the substance consumption is mediated by the hedonic experiences associated with the consumption; the transition to addiction is characterised by a reduction of the goal-directed system’s activity with a concomitant predominant engagement of the habitual system in the brain [3, 48]. This makes the individual invest energy in the pursuit of a substance even if it is no longer experienced as pleasurable [1, 52]. A large corpus of animal studies showed that a wide variety of substance-seeking behaviours (i.e., cocaine- [53], alcohol- [54] and nicotine- [55] seeking behaviours) became very rapidly resistant to outcome devaluation procedures: animals continue to pursue these substances even if they have been devalued. Similarly, studies combining neuroimaging and computational modelling in humans suggest that the impairment of model-based representation mediated by the caudate and medial orbitofrontal cortex might be responsible for the excessive habits observed in substance addiction [44, 47].
Though imbalance between goal-directed and habitual control mechanisms is critical in substance addiction, it is not the sole mechanism underlying the transition from voluntary to compulsive consumption [3]: motivational processes involved in Pavlovian learning are also thought to be fundamental in this transition [1, 3, 52, 56]. From the very beginning of research on substance addiction, it has been clear that environmental stimuli associated with substance consumption are able to precipitate relapses even after a prolonged period of abstinence [3, 57–59].
The incentive sensitisation hypothesis of addiction attempts to model the role of Pavlovian learning in substance addiction. A central tenet of this hypothesis is that the neuronal network underlying the cue-triggered motivation for substances (i.e., “wanting”) becomes sensitised and hyper-reactive [1, 52, 60]. This cue-triggered motivation has been investigated for decades in affective neuroscience through a paradigm called Pavlovian-to-instrumental transfer [25–27, 61]. Classical studies using this paradigm show that the perception of a Pavlovian cue (e.g., a sound) that has been previously associated with a rewarding outcome (e.g., food) increases the amount of energy invested in the instrumental action (e.g., pressing the lever to obtain food). This increase of motivation appears upon the presentation of the Pavlovian cue and disappears when the Pavlovian cue is no longer presented; therefore, it is sometimes referred to as a motivational “burst” [62, 63].
This cue-triggered motivation mainly relies on striatal dopamine neurons and is at least partly distinct from the network underlying hedonic pleasure (i.e., “liking”) that relies on a collection of opioid hotspots distributed in cortical and subcortical areas. This neuro-anatomical distinction makes it possible that, in substance addiction, the motivational hyper-reactivity to substance-associated cues is not accompanied by a parallel increase of the hedonic pleasure associated with the substance consumption. According to the incentive sensitisation hypothesis, the longer the consumption continues over time, the more the cue-triggered motivation for substances increases while the actual pleasure during the substance consumption decreases. This mechanism leads to the situation we described earlier wherein an individual becomes more and more motivated to obtain a substance even though they enjoy its consumption less and less.
Critically, in substance addiction, the neuronal network underlying motivation is not constantly hyperactive, but it is hyper-reactive to cues associated with the substance. For these motivational bursts to occur, an interaction between a sensitised brain and the perception of an environmental substance-associated cue is necessary [1, 52]. Initially, it was thought that the prolonged use of substances directly stimulating dopamine release (e.g., cocaine, amphetamines) was essential for the sensitisation process. However, it has been shown that neuronal sensitisation of cue-triggered motivational networks can occur even without substances that directly stimulate dopamine release. Indeed, recent evidence suggests that gambling and binge eating [59, 64–67] might rely on similar neuronal sensitisation resulting in hyper-reactivity to cues related to these addictions. These processes are still poorly understood, but it is hypothesised that sensitisation changes of motivational networks could occur without the need of external substances in vulnerable individuals [1]. Note that in the case of binge eating, the nature of the addiction-like subtype is still debated: whereas some authors propose that it could be conceptualised as a behavioural addiction to eating (similar to gambling or compulsive gaming), others propose that it should rather be considered as an addiction to high fat/sugar foods (similar to substance addiction) [68].
Similar to habits, Pavlovian motivational mechanisms are also triggered by environmental cues. This common characteristic led many authors to think that the interaction between the habitual and the Pavlovian control could be key for understanding the emergence of compulsive reward-seeking behaviours [3, 12, 15, 69, 70]. Despite this hypothesis’ potential, a clear computational model underlying this interaction is still lacking.
It is important to note that theories of addiction have long postulated the existence of parallel and potentially independent processes with different degrees of automaticity and awareness, such as automatic action schemata and the subjective feeling of urge [71]. Recent theoretical accounts distinguish several separable entities such as craving and relapse [49, 72]. Although the cue-triggered mechanisms we described are thought to lead to relapse, they are not necessarily involved in craving. Inversely, although goal-directed mechanisms could lead to craving and explicit desire for a substance, they will not necessarily lead to relapse [49, 72].
Addiction has been defined as a chronically relapsing disorder [56]: relapse after periods of abstinence is intrinsic to its nature [73]. A classical observation in clinical studies is that stress is a critical factor associated with relapse [6, 73–75] in individuals who abuse cocaine [76], alcohol [77] and nicotine [78]. The consistency of this observation has given rise to a recent line of research aiming to understand the neuropsychological mechanisms underlying stress-induced relapse [6, 73–75].
Stress has typically been conceptualised in the framework of emotion research [79], with current conceptualisations in affective neuroscience considering that each emotion is composed of two phases: (1) an emotion elicitation process, mainly driven by appraisal mechanisms, that elicits (2) the emotion response composed of a psychophysiological bodily reaction, a motor expression, an action tendency and a feeling [80, 81]. Although there are theoretical debates concerning whether stress can be considered a typical emotion or not, in particular because of its relatively long duration compared with typical emotions, it is conceptually useful to distinguish between these two phases. In terms of elicitation, an event is usually stressful when the individual (1) appraises the event as threatening to their physiological and psychological integrity and (2) appraises their resources as insufficient to successfully cope with such an event [82]. Generally, in everyday life, powerful stressors are situations representing a threat to the self because of a form of social evaluation such as being judged by other people [83]. The stress response includes cognitive, affective, and physiological components. Research on stress and reward-seeking behaviours mainly focuses on the physiological component that is characterised by the activation of dopaminergic and noradrenergic systems [84, 85], the sympathetic nervous system as well as the hypothalamic-pituitary-adrenal axis (HPA) [86]. The activation of the HPA leads to the secretion of glucocorticoids (cortisol in humans) and numerous other hormones, neuropeptides and neurotransmitters [87]. Important individual differences exist with respect not only to stress elicitation but also to how people adapt after a stressful episode, for instance, as a function of resilience factors [88].
Strikingly, recent evidence suggests that stress might amplify the control that cue-triggered mechanisms exert on behaviour (i.e., habitual and Pavlovian controls), which are thought to underlie the transition from voluntary to compulsive reward-seeking behaviour [1, 3] and to play a critical role in relapse [49].
Growing evidence shows that stress shifts the control toward stimulus-response striatal mechanisms increasing the imbalance between habitual and goal-directed behavioural controls. As previously described, this imbalance is believed to underlie the transition from voluntary to compulsive substance consumption [3, 48]. Animal literature has demonstrated that stress reduces rodents’ sensitivity to outcome devaluation – the classical marker of habitual behaviour. Stressed rodents kept performing an instrumental action (e.g., pressing on a lever) even if it led to an outcome (i.e., food) that had been devalued, whereas non-stressed rodents adapted their behaviour to the new outcome value [89]. Translational studies on a human population show similar findings. A behavioural induction of a stressful state decreased the individuals’ sensitivity to outcome devaluation: stressed participants kept performing an instrumental action even when it was leading to a food outcome that had been previously devalued [85, 90]. This bias toward habitual behaviour appears to be induced by the physiological stress response, particularly by the concurrent activation of the glucocorticoid and noradrenergic systems [91–94]. Research on the influence of stress on habits and goal-directed actions provided the foundation for a new line of studies investigating the neuro-computational mechanisms underlying this stress-induced bias toward habitual behaviour. Evidence from these studies suggests that stress might impair the prefrontal circuitry supporting the model-based computation underlying the goal-directed actions and might increase the striatal activity supporting the model-free computation underlying habits [95–97].
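One way to picture this stress-induced shift is a weighted mixture of model-based and model-free values, a formalisation that is common in the reinforcement learning literature but is introduced here only as an assumption. The weight `w`, the response cost, the inverse temperature `beta` and the specific numbers are hypothetical; the sketch merely shows that lowering the model-based weight is sufficient to make responding persist after devaluation.

```python
import math


def hybrid_value(q_mb, q_mf, w):
    """Mix model-based and model-free values; w is the model-based weight (0 to 1)."""
    return w * q_mb + (1.0 - w) * q_mf


def p_respond(value, cost=0.3, beta=5.0):
    """Logistic probability of performing the action given its net value."""
    return 1.0 / (1.0 + math.exp(-beta * (value - cost)))


# After devaluation: the model-based value is already 0, while the
# model-free cache still holds the old, high value.
q_mb, q_mf = 0.0, 1.0

# Hypothetical weights: stress is assumed to reduce the model-based weight.
w_control, w_stress = 0.8, 0.3

p_control = p_respond(hybrid_value(q_mb, q_mf, w_control))
p_stress = p_respond(hybrid_value(q_mb, q_mf, w_stress))

print(round(p_control, 3), round(p_stress, 3))
```

Under these assumed values the simulated stressed agent keeps responding for the devalued outcome far more often than the control agent, mirroring the reduced sensitivity to outcome devaluation described above.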
This promising line of research suggests an interesting mechanism might underlie stress-induced relapse in substance addiction: Under stressful conditions, the organism mainly relies on striatal model-free mechanisms, with environmental cues triggering habitual substance-seeking behaviour, no matter whether the substance is currently liked or not, and independently of the organism’s current goals (e.g., the goal of not consuming the substance). The habitual control determines what kind of behavioural routine is executed (i.e., whether a specific substance-seeking behaviour is initiated upon cue perception). However, it does not appear to determine the motivational intensity with which these behavioural routines are executed (i.e., the amount of effort mobilised in a specific substance-seeking behaviour). The motivational intensity determining effort mobilisation seems to be rather controlled by Pavlovian mechanisms [3, 58, 98, 99].
Pavlovian cues have a powerful motivational influence on behaviour [19–22, 100–103]. The motivational influence of Pavlovian cues (e.g., food-associated sounds) critically depends on the physiological state (e.g., hunger) of the organism at the moment of the Pavlovian cue perception [104–110]. A famous series of studies conducted on rodents demonstrated that increased mesolimbic dopamine activity amplifies the motivational control exerted by the Pavlovian cues [103, 107, 108]. Interestingly, manipulations of stress hormones may have behavioural consequences similar to those of manipulations of mesolimbic dopamine. In one experiment, researchers injected the nucleus accumbens of a rodent population with corticotrophin-releasing factor (CRF) – a hormone critically involved in the physiological stress response. The rodents with elevated CRF neurotransmission invested three times more energy into reward-seeking behaviours after perceiving Pavlovian cues than the rodents that did not receive the CRF injections [111]. Moreover, consistent results have been found in humans: participants who had an elevated cortisol level after a behavioural induction of stress mobilised more energy in reward-seeking behaviours upon the presentation of Pavlovian cues than participants who were not stressed [102]. These findings are in line with the incentive salience hypothesis: stress does not globally increase reward-seeking behaviours but rather makes the organisms more reactive to cues that have been associated with a reward (i.e., Pavlovian cues). The effects of stress are therefore contingent on the perception of Pavlovian cues, meaning that stress is not expected to increase spontaneous reward-seeking behaviours when a cue is not present. The amplified cue-triggered motivational bursts decay quickly after the Pavlovian cue is removed but re-appear quickly when the Pavlovian cue is re-encountered [112].
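The cue-contingent nature of this effect can be expressed as a simple interaction: stress is assumed to scale a gain on cue-triggered motivation rather than to raise baseline effort. The function below is a deliberately simplified sketch; the additive form, the parameter names and the numerical values are all illustrative assumptions and are not estimates from the cited experiments.

```python
def mobilised_effort(baseline, cue_present, cue_value, gain):
    """Effort = baseline responding plus a cue-triggered burst scaled by a gain."""
    burst = gain * cue_value if cue_present else 0.0
    return baseline + burst


BASELINE = 1.0      # assumed effort without any cue
CUE_VALUE = 1.0     # assumed learned value of the Pavlovian cue
GAIN_CONTROL = 0.5  # hypothetical gain without stress
GAIN_STRESS = 1.5   # hypothetical gain under stress

for label, gain in (("control", GAIN_CONTROL), ("stress", GAIN_STRESS)):
    without_cue = mobilised_effort(BASELINE, False, CUE_VALUE, gain)
    with_cue = mobilised_effort(BASELINE, True, CUE_VALUE, gain)
    print(label, without_cue, with_cue)
```

In this toy formulation, effort without the cue is identical in both conditions, and the difference between stressed and non-stressed agents appears only while the cue is present.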
Research showing that stress increases habitual and Pavlovian controls, but not goal-directed control, suggests that stress-induced relapse might critically depend on the interaction between stress and the encounter of some environmental substance-associated cues. These cues can trigger stimulus-response habits and/or Pavlovian motivational bursts. Crucially, cues associated with previous substance consumption are particularly difficult to ignore: these cues are perceptually salient and the individual’s attention is rapidly and involuntarily oriented toward them (i.e., attentional bias) [24, 113–116]. Evidence suggests that stress amplifies this attentional bias toward substance-associated cues in individuals reporting problematic substance consumption [117, 118]. Such research illustrates how a stressed person is more likely to detect environmental cues associated with the substance they are addicted to (fig. 3).
The typical “relief” assumption proposes that stressed individuals (e.g., after a conflict with a significant other) seek substances (e.g., consuming cannabis) in the hope that the reward (e.g., the cannabis) will provide some relief from the negative affective state related to the stress. Although this might be the case, the stress amplification of cue-triggered control we described in the previous section relies on mechanisms different from the attempt to reduce a stress-induced negative affective state. Relief-motivated substance-seeking behaviours would be driven by the goal of reducing a state of distress, therefore relying on the representation of some of the substance’s value properties (e.g., providing relief). On the other hand, substance-seeking habits are possibly completely independent of any kind of substance value representation: cues trigger reward-seeking behaviours, even if there is no explicit goal to consume the reward, and even if the reward is currently devalued by the individual [3, 14, 58]. Similar to habits, the stress amplification of Pavlovian motivation appears to be independent of the reward’s hedonic properties as such. Preliminary evidence indeed suggests that stress might activate the neuronal network underlying cue-triggered motivation without altering the network underlying pleasure [98, 111]. Stressed participants mobilise more effort upon perception of Pavlovian cues, but they do not appear to enjoy the reward more when they finally obtain it [102]. Therefore, it is possible that under stress the substance is sought because the individual thinks that it would reduce negative affect; however, such a phenomenon relies on mechanisms different from the amplification of cue-triggered controllers: notably, it relies on goal-directed processes.
This implies that these two phenomena (cue-triggered substance-seeking behaviour versus a substance sought with the goal of decreasing negative affect) can co-exist but also exist independently from each other, so that it is possible that stress increases substance-seeking behaviours independently of the hedonic properties of the substance and the intention of down-regulating stress.
An interesting recent account proposes that the influence of stress on substance-seeking behaviours could be circular [119, 120]. We reviewed evidence suggesting that the stress-induced shift from goal-directed to habitual control relies on the activation of the glucocorticoid system [91–94], but strikingly, the consumption of addictive substances (e.g., nicotine, cannabis, alcohol, stimulants and opioids) increases the level of glucocorticoids (for a review see [119]). Although this substance-induced increase of glucocorticoid does not correlate with a subjective feeling of negative affect classically experienced under stress, it has been proposed that it contributes to the consolidation of cue-response memories supporting habitual learning and cue-affect memories supporting Pavlovian learning. The actual substance consumption may thereby reinforce the strength of the Pavlovian and habitual controls through neuronal mechanisms shared with those involved in stress-induced relapse. This implies that relapsing and consuming could make the individual more vulnerable to relapse under stress in the future.
In sum, converging evidence from animal and human literature suggests that stress strengthens the cue-driven controllers (Pavlovian and habitual) at the expense of goal-directed control. Recent neuropharmacological studies shed some light on the neuronal mechanisms underlying this shift. They suggest that the combined increase of glucocorticoids and noradrenaline induced by acute stress increases the activity of the striatal brain region known to be involved in cue-driven control and decreases the activity of brain areas such as the hippocampus and the orbitofrontal cortex that are known to be involved in goal-directed processes [121, 122]. This shift is thought to be orchestrated by the amygdala: the increase of noradrenaline and glucocorticoids acts through mineralocorticoid receptors, increasing the amygdala-dorsal striatum connectivity and reducing amygdala-hippocampus connectivity [122]. The stress-induced shift of the neuronal network activity could potentially create a situation in which individuals are vulnerable to relapse in substance addiction. Indeed, it could make individuals (1) more prone to perceive substance-associated cues, (2) more likely to react upon their perception with substance-seeking habits invigorated by Pavlovian motivation and (3) less likely to inhibit these processes through voluntary and effortful goal-directed control.
Similar to most clinical observations, stress-induced relapse is subject to large individual variability. For instance, the stress-induced shift from model-based to model-free dominant computations is modulated by the individual’s working memory capacities. Model-based computations seem to rely on working memory [96] and stress depletes working memory resources. In stressful situations, individuals with large working memory resources tend to choose model-based computations more often than individuals with lower working memory resources [96, 123]. Given the clinical relevance and the large variability of relapses under stress, there is a strong interest in investigating whether there are specific markers of relapse risk. The identification of vulnerability profiles could account for the individual variability observed in clinical practice and, even more importantly, represent a target for new treatments to prevent relapse. Several approaches have been taken in the quest to identify risk profiles. Traditionally, this line of research aimed at finding biological markers, and identified two potential markers of vulnerability: the downregulation of D2 dopamine receptors in the ventral striatum [124, 125] and the individual’s physiological responsivity to stressors (e.g., measured by adrenocorticotropic hormones) [120]. A recent approach has proposed to identify vulnerabilities through computational phenotypes [126]. In this new approach, a cognitive process (e.g., instrumental learning) is formalised in a mathematical model that contains a set of free parameters (e.g., a learning rate that defines how quickly an individual learns). These free parameters are estimated based on the neuronal, the peripheral physiological, or the behavioural data collected from an individual performing a particular task (e.g., a learning task).
A computational phenotype is thus a set of parameters, derived from an individual’s performance, that characterises their cognitive style [126]. This approach suggests that an individual’s predisposition to a model-free rather than a model-based cognitive style may represent a computational phenotype underlying several disorders involving compulsions, from substance addiction to obsessive-compulsive disorder [44].
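The logic of a computational phenotype can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions, not the method of any study cited here: it assumes a two-option learning task, a Rescorla-Wagner value update with a softmax choice rule, and a coarse grid search for parameter estimation. All function names, the simulated agent, and the parameter ranges are hypothetical.

```python
import math
import random


def negative_log_likelihood(choices, rewards, alpha, beta):
    """Negative log-likelihood of a choice sequence under a two-option
    Rescorla-Wagner learner with a softmax choice rule (assumed model)."""
    q = [0.0, 0.0]  # initial value estimates for the two options
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        # Softmax probability of the option actually chosen (logistic form).
        diff = beta * (q[1 - choice] - q[choice])
        p_choice = 1.0 / (1.0 + math.exp(diff))
        nll -= math.log(max(p_choice, 1e-12))
        # Prediction-error update, scaled by the learning rate alpha.
        q[choice] += alpha * (reward - q[choice])
    return nll


def simulate_agent(alpha, beta, reward_probs, n_trials, rng):
    """Generate synthetic choices and rewards from the same assumed model."""
    q = [0.0, 0.0]
    choices, rewards = [], []
    for _ in range(n_trials):
        p_option_1 = 1.0 / (1.0 + math.exp(beta * (q[0] - q[1])))
        choice = 1 if rng.random() < p_option_1 else 0
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        q[choice] += alpha * (reward - q[choice])
        choices.append(choice)
        rewards.append(reward)
    return choices, rewards


def fit_phenotype(choices, rewards, grid_size=21):
    """Grid-search the (alpha, beta) pair that minimises the NLL.
    The ranges (alpha in [0, 1], beta in [0, 10]) are illustrative choices."""
    best = None
    for i in range(grid_size):
        alpha = i / (grid_size - 1)
        for j in range(grid_size):
            beta = 10.0 * j / (grid_size - 1)
            nll = negative_log_likelihood(choices, rewards, alpha, beta)
            if best is None or nll < best[0]:
                best = (nll, alpha, beta)
    return {"alpha": best[1], "beta": best[2], "nll": best[0]}


# Synthetic data only: a hypothetical agent with alpha = 0.3 and beta = 5.
rng = random.Random(0)
choices, rewards = simulate_agent(0.3, 5.0, (0.2, 0.8), 200, rng)
phenotype = fit_phenotype(choices, rewards)
print(phenotype)
```

With real data, `choices` and `rewards` would come from an individual's task log, and the fitted values would form part of that individual's phenotype. Grid search is used here only because it is transparent; richer models, such as those that weight model-based against model-free values, add further free parameters but follow the same estimation logic.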
Moreover, individual differences in cognitive styles have also been proposed as a vulnerability factor for addiction in the literature investigating Pavlovian conditioning in rodents [127, 128]. Surprisingly, despite the ubiquity of Pavlovian conditioning in the animal kingdom, the exact value computations underlying Pavlovian learning have not been elucidated. Traditionally, the neurobiological substrates of Pavlovian conditioning have been modelled through model-free reinforcement learning [129, 130]. More recently, it has been suggested that Pavlovian learning could also be supported by model-based mechanisms [62, 109, 110, 131, 132], or even by other mechanisms that are not easily classified within the model-based versus model-free taxonomy [131]. Such heterogeneity of Pavlovian learning mechanisms might be the basis of individual differences in one of the most basic forms of cognitive processing. In animal studies, such individual differences have consistently been observed during the presentation of the Pavlovian cue (e.g., a cue associated with food delivery in a cup) [127, 128, 133]. Some animals approach and engage with the Pavlovian cue itself, whereas other animals approach the location where the food will be delivered (e.g., the cup). Animals that approach the Pavlovian cue are considered sign-trackers and animals that approach the location of the reward delivery are considered goal-trackers [127, 128, 133]. These behavioural differences seem to rest on distinct neuro-computations: goal-trackers appear to use cortical model-based mechanisms, whereas sign-trackers mainly rely on model-free striatal dopaminergic signals [133, 134] (fig. 3).
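As a rough illustration of how the sign-tracking versus goal-tracking distinction can be turned into a number, the sketch below computes a simple response-bias score from counts of cue-directed and reward-site-directed approaches. This is a hypothetical, simplified index written for this example; the function names, the 0.5 threshold, and the counts are assumptions and are not taken from the studies cited above.

```python
def response_bias(cue_approaches, site_approaches):
    """Score in [-1, 1]: +1 means only cue-directed approaches,
    -1 means only approaches to the reward-delivery location."""
    total = cue_approaches + site_approaches
    if total == 0:
        return 0.0  # no conditioned approach was observed
    return (cue_approaches - site_approaches) / total


def classify(bias, threshold=0.5):
    """Label an individual from its bias score.
    The threshold is an illustrative assumption, not a published cut-off."""
    if bias >= threshold:
        return "sign-tracker"
    if bias <= -threshold:
        return "goal-tracker"
    return "intermediate"


# Invented counts: (approaches to the cue, approaches to the food cup).
counts = {"animal_a": (45, 5), "animal_b": (4, 36), "animal_c": (20, 20)}
labels = {name: classify(response_bias(*pair)) for name, pair in counts.items()}
print(labels)
```

A continuous score of this kind keeps intermediate individuals visible instead of forcing every animal or participant into one of two groups.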
In rodents, findings suggest that these individual differences observed during Pavlovian learning can predict compulsive substance-seeking behaviours [135–140]. This has been tested by identifying sign- versus goal-tracking propensities in Pavlovian conditioning involving food rewards, and subsequently exposing the sign- and goal-tracking animals to an environment in which they could self-administer the substance. Animals identified as sign-trackers showed several compulsive substance-seeking behaviours that animals identified as goal-trackers did not show. Compared with goal-trackers, sign-trackers had a higher preference for cocaine over food [140]. Strikingly, sign-trackers exhibited very robust cue-triggered substance-seeking behaviours, whereas goal-trackers appeared to be less susceptible to the influence of environmental cues. In sign-trackers, the presentation of a substance-associated cue triggered strong substance-seeking behaviours even after a long period of time (e.g., several days) during which the substance was no longer available [135]. Moreover, upon the presentation of a substance-associated cue, sign-trackers showed reward-seeking behaviours that persisted despite the presence of aversive consequences [136, 138, 139]. Sign-trackers’ increased responsiveness to substance-associated cues could represent an important vulnerability to relapse under stress. As previously mentioned, stress amplifies the control of cue-triggered mechanisms such as Pavlovian motivation and habits. Individuals with sign-tracking tendencies, being attracted to environmental cues, could be particularly vulnerable to the influence of stress on their substance-seeking behaviours (fig. 4). Therefore, individual differences in Pavlovian learning appear to represent a promising phenotype for the vulnerability to relapse under stress in addiction.
Relapses under stress are typically triggered by substance-associated cues, and sign-trackers’ behaviour appears to be highly sensitive to environmental cues. Although individual differences in Pavlovian learning have so far been directly investigated only in relation to compulsive substance-seeking behaviours, they could nonetheless represent a more general trans-diagnostic phenotype for the diverse kinds of compulsive reward-seeking behaviours that underlie symptoms in a variety of psychiatric disorders such as gambling or binge eating.
Despite the potential contribution that sign- versus goal-tracking propensities could make to identifying risk profiles for substance addiction, the development of paradigms measuring these individual differences in humans is just beginning [141–143]. A substantial body of research has shown that, similar to animals, humans can also be attracted to Pavlovian cues themselves and that human attention is typically oriented very rapidly toward Pavlovian cues [24, 116]. However, to date, only a few studies have systematically investigated individual differences in human sign- versus goal-tracking behaviours [142, 143]. These new translational studies are developing paradigms that measure conditioned approach toward the Pavlovian cue or toward the location of reward delivery in human populations. This first evidence suggests that the interindividual differences observed in animals are also present in humans. If these differences are consistently replicated in future studies, it will be important to test how human sign-trackers react to cues under stress compared with human goal-trackers and whether, like animals, they also rely differently on striatal networks in decision-making processes. Importantly, studies in human populations would also allow us to directly test whether individuals who show sign-tracking behaviours in Pavlovian conditioning paradigms are also the ones who report a higher rate of relapse under stress.
In this review, we specifically focused on the role of stress and relapse in compulsive reward-seeking behaviours. However, a multitude of other mechanisms have been described as being involved in addiction, such as cravings underpinned by goal-directed processes [49], and consumption driven by the motivation to reduce withdrawal (i.e., the negative emotional state that emerges when access to the substance is prevented) [144]. In sum, there are many paths that can lead to substance addiction, and very different mechanisms could be at play in different individuals. If future translational research succeeds in identifying sign- versus goal-tracking propensities in humans, this could provide a promising target for personalised clinical interventions to prevent a particular kind of relapse in substance addiction as well as in other disorders characterised by compulsive reward-seeking behaviours.
Recent research in human affective neuroscience has shed light on the mechanisms that might underlie stress-induced relapse in addiction. Stress appears to shift the balance between the different behavioural controllers, favouring cue-driven over goal-driven mechanisms. As a result of this shift, substance-associated cues are more likely to trigger habitual behaviours as well as Pavlovian motivation, thereby eliciting rigid, motivationally intense habitual behaviours that are very difficult to inhibit.
These mechanisms, however, do not appear to be specific to addiction but rather underlie compulsive reward-seeking symptoms in a variety of psychiatric disorders such as gambling or binge eating. Models drawn from affective neuroscience could provide a trans-diagnostic framework for the investigation of the large individual differences in stress-induced relapses that are observed in clinical practice. This line of research aims at identifying individual risk profiles for disorders characterised by compulsive reward-seeking behaviours. This could help the development of personalised, evidence-based treatments targeting specific neuro-computational mechanisms rather than diagnostic categories.
The authors thank Sacha di Poi for the illustrations and Dr Vanessa Sennwald for her insightful comments.
The work was supported by the Swiss Centre for Affective Sciences, University of Geneva.
No potential conflict of interest relevant to this article was reported.
1 Berridge KC, Robinson TE. Liking, wanting, and the incentive-sensitization theory of addiction. Am Psychol. 2016;71(8):670–9. doi: https://doi.org/10.1037/amp0000059
2 American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). Arlington (VA): American Psychiatric Publishing; 2013.
3 Everitt BJ, Robbins TW. Drug addiction: Updating actions to habits to compulsions ten years on. Annu Rev Psychol. 2016;67(1):23–50. doi: https://doi.org/10.1146/annurev-psych-122414-033457
4 Giuliano C, Peña-Oliver Y, Goodlett CR, Cardinal RN, Robbins TW, Bullmore ET, et al. Evidence for a long-lasting compulsive alcohol seeking phenotype in rats. Neuropsychopharmacology. 2018;43(4):728–38. doi: https://doi.org/10.1038/npp.2017.105
5 Koob GF, Volkow ND. Neurocircuitry of addiction. Neuropsychopharmacology. 2010;35(1):217–38. doi: https://doi.org/10.1038/npp.2009.110
6 Sinha R. Chronic stress, drug use, and vulnerability to addiction. Ann N Y Acad Sci. 2008;1141(1):105–30. doi: https://doi.org/10.1196/annals.1441.030
7 Chen C-Y, O’Brien MS, Anthony JC. Who becomes cannabis dependent soon after onset of use? Epidemiological evidence from the United States: 2000-2001. Drug Alcohol Depend. 2005;79(1):11–22. doi: https://doi.org/10.1016/j.drugalcdep.2004.11.014
8 Everitt BJ, Robbins TW. Drug addiction: updating actions to habits to compulsions ten years on. Annu Rev Psychol. 2016;67(1):23–50. doi: https://doi.org/10.1146/annurev-psych-122414-033457
9 Balleine BW, O’Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology. 2010;35(1):48–69. doi: https://doi.org/10.1038/npp.2009.131
10 Balleine B. Instrumental performance following a shift in primary motivation depends on incentive learning. J Exp Psychol Anim Behav Process. 1992;18(3):236–50. doi: https://doi.org/10.1037/0097-7403.18.3.236
11 Balleine BW, Dickinson A. Instrumental performance following reinforcer devaluation depends upon incentive learning. Q J Exp Psychol. 1991;43:279–96.
12 Dayan P, Niv Y, Seymour B, Daw ND. The misbehavior of value and the discipline of the will. Neural Netw. 2006;19(8):1153–60. doi: https://doi.org/10.1016/j.neunet.2006.03.002
13 Dickinson A, Balleine B, Watt A, Gonzalez F, Boakes RA. Motivational control after extended instrumental training. Anim Learn Behav. 1995;23(2):197–206. doi: https://doi.org/10.3758/BF03199935
14 O’Doherty JP, Cockburn J, Pauli WM. Learning, reward, and decision making. Annu Rev Psychol. 2017;68(1):73–100. doi: https://doi.org/10.1146/annurev-psych-010416-044216
15 Rangel A. Regulation of dietary choice by the decision-making circuitry. Nat Neurosci. 2013;16(12):1717–24. doi: https://doi.org/10.1038/nn.3561
16 Stussi Y, Delplanque S, Coraj S, Pourtois G, Sander D. Measuring Pavlovian appetitive conditioning in humans with the postauricular reflex. Psychophysiology. 2018;55(8):e13073. doi: https://doi.org/10.1111/psyp.13073
17 Stussi Y, Pourtois G, Sander D. Enhanced Pavlovian aversive conditioning to positive emotional stimuli. J Exp Psychol Gen. 2018;147(6):905–23. doi: https://doi.org/10.1037/xge0000424
18 Pauli WM, Larsen T, Collette S, Tyszka JM, Seymour B, O’Doherty JP. Distinct contributions of ventromedial and dorsolateral subregions of the human substantia nigra to appetitive and aversive learning. J Neurosci. 2015;35(42):14220–33. doi: https://doi.org/10.1523/JNEUROSCI.2277-15.2015
19 Bray S, Rangel A, Shimojo S, Balleine B, O’Doherty JP. The neural mechanisms underlying the influence of pavlovian cues on human decision making. J Neurosci. 2008;28(22):5861–6. doi: https://doi.org/10.1523/JNEUROSCI.0897-08.2008
20 Prévost C, Liljeholm M, Tyszka JM, O’Doherty JP. Neural correlates of specific and general Pavlovian-to-Instrumental Transfer within human amygdalar subregions: a high-resolution fMRI study. J Neurosci. 2012;32(24):8383–90. doi: https://doi.org/10.1523/JNEUROSCI.6237-11.2012
21 Talmi D, Seymour B, Dayan P, Dolan RJ. Human pavlovian-instrumental transfer. J Neurosci. 2008;28(2):360–8. doi: https://doi.org/10.1523/JNEUROSCI.4028-07.2008
22 Trick L, Hogarth L, Duka T. Prediction and uncertainty in human Pavlovian to instrumental transfer. J Exp Psychol Learn Mem Cogn. 2011;37(3):757–65. doi: https://doi.org/10.1037/a0022310
23 Bucker B, Theeuwes J. Stimulus-driven and goal-driven effects on Pavlovian associative reward learning. Vis Cogn. 2018;26(2):131–48. doi: https://doi.org/10.1080/13506285.2017.1399948
24 Pool E, Brosch T, Delplanque S, Sander D. Where is the chocolate? Rapid spatial orienting toward stimuli associated with primary rewards. Cognition. 2014;130(3):348–59. doi: https://doi.org/10.1016/j.cognition.2013.12.002
25 Bindra D. A motivational view of learning, performance, and behavior modification. Psychol Rev. 1974;81(3):199–213. doi: https://doi.org/10.1037/h0036330
26 Bolles RC. Reinforcement, expectancy, and learning. Psychol Rev. 1972;79(5):394–409. doi: https://doi.org/10.1037/h0033120
27 Toates F. The interaction of cognitive and stimulus-response processes in the control of behaviour. Neurosci Biobehav Rev. 1997;22(1):59–83. doi: https://doi.org/10.1016/S0149-7634(97)00022-5
28 Otto AR, Gershman SJ, Markman AB, Daw ND. The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol Sci. 2013;24(5):751–61. doi: https://doi.org/10.1177/0956797612463080
29 Daw ND, Niv Y, Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci. 2005;8(12):1704–11. doi: https://doi.org/10.1038/nn1560
30 Pauli WM, Cockburn J, Pool ER, Pérez OD, O’Doherty JP. Computational approaches to habits in a model-free world. Curr Opin Behav Sci. 2018;20:104–9. doi: https://doi.org/10.1016/j.cobeha.2017.12.001
31 Guitart-Masip M, Duzel E, Dolan R, Dayan P. Action versus valence in decision making. Trends Cogn Sci. 2014;18(4):194–202. doi: https://doi.org/10.1016/j.tics.2014.01.003
32 Guitart-Masip M, Huys QJM, Fuentemilla L, Dayan P, Duzel E, Dolan RJ. Go and no-go learning in reward and punishment: interactions between affect and effect. Neuroimage. 2012;62(1):154–66. doi: https://doi.org/10.1016/j.neuroimage.2012.04.024
33 Huys QJM, Cools R, Gölzer M, Friedel E, Heinz A, Dolan RJ, et al. Disentangling the roles of approach, activation and valence in instrumental and pavlovian responding. PLOS Comput Biol. 2011;7(4):e1002028. doi: https://doi.org/10.1371/journal.pcbi.1002028. Correction in: PLOS Comput Biol. 2014;10(1):e619f5526.
34 Lindström B, Golkar A, Olsson A. A clash of values: Fear-relevant stimuli can enhance or corrupt adaptive behavior through competition between Pavlovian and instrumental valuation systems. Emotion. 2015;15(5):668–76. doi: https://doi.org/10.1037/emo0000075
35 Robbins TW, Costa RM. Habits. Curr Biol. 2017;27(22):R1200–6. doi: https://doi.org/10.1016/j.cub.2017.09.060
36 Niv Y, Daw ND, Dayan P. Choice values. Nat Neurosci. 2006;9(8):987–8. doi: https://doi.org/10.1038/nn0806-987
37 Gläscher J, Daw N, Dayan P, O’Doherty JP. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron. 2010;66(4):585–95. doi: https://doi.org/10.1016/j.neuron.2010.04.016
38 Shenhav A, Musslick S, Lieder F, Kool W, Griffiths TL, Cohen JD, et al. Toward a rational and mechanistic account of mental effort. Annu Rev Neurosci. 2017;40(1):99–124. doi: https://doi.org/10.1146/annurev-neuro-072116-031526
39 Botvinick M, Braver T. Motivation and cognitive control: from behavior to neural mechanism. Annu Rev Psychol. 2015;66(1):83–113. doi: https://doi.org/10.1146/annurev-psych-010814-015044
40 Kool W, McGuire JT, Wang GJ, Botvinick MM. Neural and behavioral evidence for an intrinsic cost of self-control. PLoS One. 2013;8(8):e72626. doi: https://doi.org/10.1371/journal.pone.0072626
41 Lieder F, Shenhav A, Musslick S, Griffiths TL. Rational metareasoning and the plasticity of cognitive control. PLOS Comput Biol. 2018;14(4):e1006043. doi: https://doi.org/10.1371/journal.pcbi.1006043
42 Shenhav A, Botvinick MM, Cohen JD. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron. 2013;79(2):217–40. doi: https://doi.org/10.1016/j.neuron.2013.07.007
43 Westbrook A, Braver TS. Dopamine does double duty in motivating cognitive effort. Neuron. 2016;89(4):695–710. doi: https://doi.org/10.1016/j.neuron.2015.12.029
44 Voon V, Derbyshire K, Rück C, Irvine MA, Worbe Y, Enander J, et al. Disorders of compulsivity: a common bias towards learning habits. Mol Psychiatry. 2015;20(3):345–52. doi: https://doi.org/10.1038/mp.2014.44
45 Billieux J, Flayelle M, Rumpf HJ, Stein DJ. High involvement versus pathological involvement in video games: a crucial distinction for ensuring the validity and utility of gaming disorder. Curr Addict Rep. 2019;6(3):323–30. doi: https://doi.org/10.1007/s40429-019-00259-x
46 Gillan CM, Fineberg NA, Robbins TW. A trans-diagnostic perspective on obsessive-compulsive disorder. Psychol Med. 2017;47(9):1528–48. doi: https://doi.org/10.1017/S0033291716002786
47 Gillan CM, Kosinski M, Whelan R, Phelps EA, Daw ND. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife. 2016;5:e11305. doi: https://doi.org/10.7554/eLife.11305
48 Lucantonio F, Caprioli D, Schoenbaum G. Transition from ‘model-based’ to ‘model-free’ behavioral control in addiction: Involvement of the orbitofrontal cortex and dorsolateral striatum. Neuropharmacology. 2014;76(Pt B):407–15. doi: https://doi.org/10.1016/j.neuropharm.2013.05.033
49 Redish AD, Jensen S, Johnson A. A unified framework for addiction: vulnerabilities in the decision process. Behav Brain Sci. 2008;31(4):415–37, discussion 437–87. doi: https://doi.org/10.1017/S0140525X0800472X
50 Byrne KA, Otto AR, Pang B, Patrick CJ, Worthy DA. Substance use is associated with reduced devaluation sensitivity. Cogn Affect Behav Neurosci. 2019;19(1):40–55. doi: https://doi.org/10.3758/s13415-018-0638-9
51 Robbins TW, Vaghi MM, Banca P. Obsessive-Compulsive Disorder: puzzles and prospects. Neuron. 2019;102(1):27–47. doi: https://doi.org/10.1016/j.neuron.2019.01.046
52 Robinson TE, Berridge KC. Incentive-sensitization and addiction. Addiction. 2001;96(1):103–14. doi: https://doi.org/10.1046/j.1360-0443.2001.9611038.x
53 Miles FJ, Everitt BJ, Dickinson A. Oral cocaine seeking by rats: action or habit? Behav Neurosci. 2003;117(5):927–38. doi: https://doi.org/10.1037/0735-7044.117.5.927
54 Corbit LH, Nie H, Janak PH. Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biol Psychiatry. 2012;72(5):389–95. doi: https://doi.org/10.1016/j.biopsych.2012.02.024
55 Clemens KJ, Castino MR, Cornish JL, Goodchild AK, Holmes NM. Behavioral and neural substrates of habit formation in rats intravenously self-administering nicotine. Neuropsychopharmacology. 2014;39(11):2584–93. doi: https://doi.org/10.1038/npp.2014.111
56 Koob GF, Volkow ND. Neurobiology of addiction: a neurocircuitry analysis. Lancet Psychiatry. 2016;3(8):760–73. doi: https://doi.org/10.1016/S2215-0366(16)00104-8
57 Childress AR, Mozley PD, McElgin W, Fitzgerald J, Reivich M, O’Brien CP. Limbic activation during cue-induced cocaine craving. Am J Psychiatry. 1999;156(1):11–8. doi: https://doi.org/10.1176/ajp.156.1.11
58 Everitt BJ, Dickinson A, Robbins TW. The neuropsychological basis of addictive behaviour. Brain Res Brain Res Rev. 2001;36(2-3):129–38. doi: https://doi.org/10.1016/S0165-0173(01)00088-1
59 O’Brien CP, Childress AR, Ehrman R, Robbins SJ. Conditioning factors in drug abuse: can they explain compulsion? J Psychopharmacol. 1998;12(1):15–22. doi: https://doi.org/10.1177/026988119801200103
60 Robinson TE, Berridge KC. The neural basis of drug craving: an incentive-sensitization theory of addiction. Brain Res Brain Res Rev. 1993;18(3):247–91. doi: https://doi.org/10.1016/0165-0173(93)90013-P
61 Lovibond PF. Facilitation of instrumental behavior by a Pavlovian appetitive conditioned stimulus. J Exp Psychol Anim Behav Process. 1983;9(3):225–47. doi: https://doi.org/10.1037/0097-7403.9.3.225
62 Berridge KC, O’Doherty JP. Chapter 18 - From experienced utility to decision utility. In: Glimcher PW, Fehr E, editors. Neuroeconomics (Second Edition). San Diego: Academic Press; 2014. p. 335–51.
63 Berridge KC. Wanting and liking: observations from the neuroscience and psychology laboratory. Inquiry (Oslo). 2009;52(4):378–98. doi: https://doi.org/10.1080/00201740903087359
64 Gearhardt AN, Yokum S, Orr PT, Stice E, Corbin WR, Brownell KD. Neural correlates of food addiction. Arch Gen Psychiatry. 2011;68(8):808–16. doi: https://doi.org/10.1001/archgenpsychiatry.2011.32
65 O’Sullivan SS, Wu K, Politis M, Lawrence AD, Evans AH, Bose SK, et al. Cue-induced striatal dopamine release in Parkinson’s disease-associated impulsive-compulsive behaviours. Brain. 2011;134(Pt 4):969–78. doi: https://doi.org/10.1093/brain/awr003
66 Robinson MJF, Fischer AM, Ahuja A, Lesser EN, Maniates H. Roles of “Wanting” and “Liking” in motivating behavior: gambling, food, and drug addictions. In: Simpson EH, Balsam PD, editors. Behavioral Neuroscience of Motivation. Cham. New York: Springer International Publishing; 2016. p. 105–36.
67 Voon V, Mole TB, Banca P, Porter L, Morris L, Mitchell S, et al. Neural correlates of sexual cue reactivity in individuals with and without compulsive sexual behaviours. PLoS One. 2014;9(7):e102419. doi: https://doi.org/10.1371/journal.pone.0102419
68 Schulte EM, Potenza MN, Gearhardt AN. How much does the Addiction-Like Eating Behavior Scale add to the debate regarding food versus eating addictions? Int J Obes. 2018;42(4):946. doi: https://doi.org/10.1038/ijo.2017.265
69 Watson P, de Wit S. Current limits of experimental research into habits and future directions. Curr Opin Behav Sci. 2018;20:33–9. doi: https://doi.org/10.1016/j.cobeha.2017.09.012
70 Watson P, Wiers RW, Hommel B, de Wit S. Working for food you don’t desire. Cues interfere with goal-directed food-seeking. Appetite. 2014;79:139–48. doi: https://doi.org/10.1016/j.appet.2014.04.005
71 Tiffany ST. A cognitive model of drug urges and drug-use behavior: role of automatic and nonautomatic processes. Psychol Rev. 1990;97(2):147–68. doi: https://doi.org/10.1037/0033-295X.97.2.147
72 Kurth-Nelson Z, Redish AD. Modeling decision-making systems in addiction. In: Gutkin B, Ahmed SH, editors. Computational neuroscience of drug addiction. New York: Springer; 2012. p. 163–87.
73 Milivojevic V, Sinha R. Central and peripheral biomarkers of stress response for addiction risk and relapse vulnerability. Trends Mol Med. 2018;24(2):173–86. doi: https://doi.org/10.1016/j.molmed.2017.12.010
74 Sinha R. How does stress increase risk of drug abuse and relapse? Psychopharmacology (Berl). 2001;158(4):343–59. doi: https://doi.org/10.1007/s002130100917
75 Sinha R. How does stress lead to risk of alcohol relapse? Alcohol Res. 2012;34(4):432–40.
76 Sinha R, Garcia M, Paliwal P, Kreek MJ, Rounsaville BJ. Stress-induced cocaine craving and hypothalamic-pituitary-adrenal responses are predictive of cocaine relapse outcomes. Arch Gen Psychiatry. 2006;63(3):324–31. doi: https://doi.org/10.1001/archpsyc.63.3.324
77 Adinoff B, Junghanns K, Kiefer F, Krishnan-Sarin S. Suppression of the HPA axis stress-response: implications for relapse. Alcohol Clin Exp Res. 2005;29(7):1351–5. doi: https://doi.org/10.1097/01.ALC.0000176356.97620.84
78 al’Absi M. Hypothalamic-pituitary-adrenocortical responses to psychological stress and risk for smoking relapse. Int J Psychophysiol. 2006;59(3):218–27. doi: https://doi.org/10.1016/j.ijpsycho.2005.10.010
79 Lazarus RS, Folkman S. Stress, appraisal, and coping. New York: Springer Publishing Company; 1984.
80 Sander D, Grandjean D, Scherer KR. An appraisal-driven componential approach to the emotional brain. Emot Rev. 2018;10(3):219–31. doi: https://doi.org/10.1177/1754073918765653
81 Sander D. Models of emotion: the affective neuroscience approach. In: Armony J, Vuilleumier P, editors. The Cambridge handbook of human affective neuroscience. New York: Cambridge University Press; 2013. p. 5–53.
82 Lazarus RS, Folkman S. Stress, Appraisal, and Coping. New York: Springer; 1984.
83 Dickerson SS , Kemeny ME . Acute stressors and cortisol responses: a theoretical integration and synthesis of laboratory research. Psychol Bull. 2004;130(3):355–91. doi:.https://doi.org/10.1037/0033-2909.130.3.355
84 Cabib S , Puglisi-Allegra S . The mesoaccumbens dopamine in coping with stress. Neurosci Biobehav Rev. 2012;36(1):79–89. doi:.https://doi.org/10.1016/j.neubiorev.2011.04.012
85 Schwabe L , Dickinson A , Wolf OT . Stress, habits, and drug addiction: a psychoneuroendocrinological perspective. Exp Clin Psychopharmacol. 2011;19(1):53–63. doi:.https://doi.org/10.1037/a0022212
86 Koolhaas JM , Bartolomucci A , Buwalda B , de Boer SF , Flügge G , Korte SM , et al. Stress revisited: a critical evaluation of the stress concept. Neurosci Biobehav Rev. 2011;35(5):1291–301. doi:.https://doi.org/10.1016/j.neubiorev.2011.02.003
87 Herman JP , Figueiredo H , Mueller NK , Ulrich-Lai Y , Ostrander MM , Choi DC , et al. Central mechanisms of stress integration: hierarchical circuitry controlling hypothalamo-pituitary-adrenocortical responsiveness. Front Neuroendocrinol. 2003;24(3):151–80. doi:.https://doi.org/10.1016/j.yfrne.2003.07.001
88 Kalisch R , Baker DG , Basten U , Boks MP , Bonanno GA , Brummelman E , et al. The resilience framework as a strategy to combat stress-related disorders. Nat Hum Behav. 2017;1(11):784–90. doi:.https://doi.org/10.1038/s41562-017-0200-8
89 Dias-Ferreira E , Sousa JC , Melo I , Morgado P , Mesquita AR , Cerqueira JJ , et al. Chronic stress causes frontostriatal reorganization and affects decision-making. Science. 2009;325(5940):621–5. doi:.https://doi.org/10.1126/science.1171203
90 Schwabe L , Wolf OT . Stress prompts habit behavior in humans. J Neurosci. 2009;29(22):7191–8. doi:.https://doi.org/10.1523/JNEUROSCI.0979-09.2009
91 Schwabe L , Höffken O , Tegenthoff M , Wolf OT . Preventing the stress-induced shift from goal-directed to habit action with a β-adrenergic antagonist. J Neurosci. 2011;31(47):17317–25. doi:.https://doi.org/10.1523/JNEUROSCI.3304-11.2011
92 Schwabe L , Tegenthoff M , Höffken O , Wolf OT . Concurrent glucocorticoid and noradrenergic activity shifts instrumental behavior from goal-directed to habitual control. J Neurosci. 2010;30(24):8190–6. doi:.https://doi.org/10.1523/JNEUROSCI.0734-10.2010
93 Schwabe L , Tegenthoff M , Höffken O , Wolf OT . Simultaneous glucocorticoid and noradrenergic activity disrupts the neural basis of goal-directed action in the human brain. J Neurosci. 2012;32(30):10146–55. doi:.https://doi.org/10.1523/JNEUROSCI.1304-12.2012
94 Vogel S , Klumpers F , Schröder TN , Oplaat KT , Krugers HJ , Oitzl MS , et al. Stress induces a shift towards striatum-dependent stimulus-response learning via the mineralocorticoid receptor. Neuropsychopharmacology. 2017;42(6):1262–71. doi:.https://doi.org/10.1038/npp.2016.262
95 Maier SU , Makwana AB , Hare TA . Acute stress impairs self-control in goal-directed choice by altering multiple functional connections within the brain’s decision circuits. Neuron. 2015;87(3):621–31. doi:.https://doi.org/10.1016/j.neuron.2015.07.005
96 Otto AR , Raio CM , Chiang A , Phelps EA , Daw ND . Working-memory capacity protects model-based learning from stress. Proc Natl Acad Sci USA. 2013;110(52):20941–6. doi:.https://doi.org/10.1073/pnas.1312011110
97 Radenbach C , Reiter AMF , Engert V , Sjoerds Z , Villringer A , Heinze HJ , et al. The interaction of acute and chronic stress impairs model-based behavioral control. Psychoneuroendocrinology. 2015;53:268–80. doi:.https://doi.org/10.1016/j.psyneuen.2014.12.017
98 Pool ER , Delplanque S , Coppin G , Sander D . Is comfort food really comforting? Mechanisms underlying stress-induced eating. Food Res Int. 2015;76:207–15. doi:.https://doi.org/10.1016/j.foodres.2014.12.034
99 Pool E , Sennwald V , Delplanque S , Brosch T , Sander D . Measuring wanting and liking from animals to humans: A systematic review. Neurosci Biobehav Rev. 2016;63:124–42. doi:.https://doi.org/10.1016/j.neubiorev.2016.01.006
100 Allman MJ , DeLeon IG , Cataldo MF , Holland PC , Johnson AW . Learning processes affecting human decision making: An assessment of reinforcer-selective Pavlovian-to-instrumental transfer following reinforcer devaluation. J Exp Psychol Anim Behav Process. 2010;36(3):402–8. doi:.https://doi.org/10.1037/a0017876
101 Corbit LH , Balleine BW . The general and outcome-specific forms of Pavlovian-instrumental transfer are differentially mediated by the nucleus accumbens core and shell. J Neurosci. 2011;31(33):11786–94. doi:.https://doi.org/10.1523/JNEUROSCI.2711-11.2011
102 Pool E , Brosch T , Delplanque S , Sander D . Stress increases cue-triggered “wanting” for sweet reward in humans. J Exp Psychol Anim Learn Cogn. 2015;41(2):128–36. doi:.https://doi.org/10.1037/xan0000052
103 Wassum KM , Ostlund SB , Balleine BW , Maidment NT . Differential dependence of Pavlovian incentive motivation and instrumental incentive learning processes on dopamine signaling. Learn Mem. 2011;18(7):475–83. doi:.https://doi.org/10.1101/lm.2229311
104 Dickinson A , Balleine B . Motivational control of instrumental performance following a shift from thirst to hunger. Q J Exp Psychol B. 1990;42(4):413–31.
105 Dickinson A , Dawson GR . Pavlovian processes in the motivational control of instrumental performance. Q J Exp Psychol. 1987;39:201–13.
106 Peciña S, Cagniard B, Berridge KC, Aldridge JW, Zhuang X. Hyperdopaminergic mutant mice have higher “wanting” but not “liking” for sweet rewards. J Neurosci. 2003;23(28):9395–402. doi: https://doi.org/10.1523/JNEUROSCI.23-28-09395.2003
107 Wyvell CL, Berridge KC. Intra-accumbens amphetamine increases the conditioned incentive salience of sucrose reward: enhancement of reward “wanting” without enhanced “liking” or response reinforcement. J Neurosci. 2000;20(21):8122–30. doi: https://doi.org/10.1523/JNEUROSCI.20-21-08122.2000
108 Wyvell CL, Berridge KC. Incentive sensitization by previous amphetamine exposure: increased cue-triggered “wanting” for sucrose reward. J Neurosci. 2001;21(19):7831–40. doi: https://doi.org/10.1523/JNEUROSCI.21-19-07831.2001
109 Dayan P, Berridge KC. Model-based and model-free Pavlovian reward learning: revaluation, revision, and revelation. Cogn Affect Behav Neurosci. 2014;14(2):473–92. doi: https://doi.org/10.3758/s13415-014-0277-8
110 Robinson MJF, Berridge KC. Instant transformation of learned repulsion into motivational “wanting”. Curr Biol. 2013;23(4):282–9. doi: https://doi.org/10.1016/j.cub.2013.01.016
111 Peciña S, Schulkin J, Berridge KC. Nucleus accumbens corticotropin-releasing factor increases cue-triggered motivation for sucrose reward: paradoxical positive incentive effects in stress? BMC Biol. 2006;4(1):8. doi: https://doi.org/10.1186/1741-7007-4-8
112 Berridge KC. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl). 2007;191(3):391–431. doi: https://doi.org/10.1007/s00213-006-0578-x
113 Anderson BA, Kim H. Mechanisms of value-learning in the guidance of spatial attention. Cognition. 2018;178:26–36. doi: https://doi.org/10.1016/j.cognition.2018.05.005
114 Hickey C, Peelen MV. Neural mechanisms of incentive salience in naturalistic human vision. Neuron. 2015;85(3):512–8. doi: https://doi.org/10.1016/j.neuron.2014.12.049
115 Pool E, Brosch T, Delplanque S, Sander D. Attentional bias for positive emotional stimuli: A meta-analytic investigation. Psychol Bull. 2016;142(1):79–106. doi: https://doi.org/10.1037/bul0000026
116 Sennwald V, Pool E, Brosch T, Delplanque S, Bianchi-Demicheli F, Sander D. Emotional attention for erotic stimuli: Cognitive and brain mechanisms. J Comp Neurol. 2016;524(8):1668–75. doi: https://doi.org/10.1002/cne.23859
117 Field M, Powell H. Stress increases attentional bias for alcohol cues in social drinkers who drink to cope. Alcohol Alcohol. 2007;42(6):560–6. doi: https://doi.org/10.1093/alcalc/agm064
118 Field M, Quigley M. Mild stress increases attentional bias in social drinkers who drink to cope: a replication and extension. Exp Clin Psychopharmacol. 2009;17(5):312–9. doi: https://doi.org/10.1037/a0017090
119 Goldfarb EV, Sinha R. Drug-induced glucocorticoids and memory for substance use. Trends Neurosci. 2018;41(11):853–68. doi: https://doi.org/10.1016/j.tins.2018.08.005
120 Milivojevic V, Sinha R. Central and peripheral biomarkers of stress response for addiction risk and relapse vulnerability. Trends Mol Med. 2018;24(2):173–86. doi: https://doi.org/10.1016/j.molmed.2017.12.010
121 de Quervain D, Schwabe L, Roozendaal B. Stress, glucocorticoids and memory: implications for treating fear-related disorders. Nat Rev Neurosci. 2017;18(1):7–19. doi: https://doi.org/10.1038/nrn.2016.155
122 Wirz L, Bogdanov M, Schwabe L. Habits under stress: mechanistic insights across different types of learning. Curr Opin Behav Sci. 2018;20:9–16. doi: https://doi.org/10.1016/j.cobeha.2017.08.009
123 Quaedflieg CWEM, Stoffregen H, Sebalo I, Smeets T. Stress-induced impairment in goal-directed instrumental behaviour is moderated by baseline working memory. Neurobiol Learn Mem. 2019;158:42–9. doi: https://doi.org/10.1016/j.nlm.2019.01.010
124 Volkow ND, Fowler JS, Wang G-J, Swanson JM, Telang F. Dopamine in drug abuse and addiction: results of imaging studies and treatment implications. Arch Neurol. 2007;64(11):1575–9. doi: https://doi.org/10.1001/archneur.64.11.1575
125 Dalley JW, Fryer TD, Brichard L, Robinson ES, Theobald DE, Lääne K, et al. Nucleus accumbens D2/3 receptors predict trait impulsivity and cocaine reinforcement. Science. 2007;315(5816):1267–70. doi: https://doi.org/10.1126/science.1137073
126 Patzelt EH, Hartley CA, Gershman SJ. Computational phenotyping: using models to understand individual differences in personality, development, and mental illness. Personal Neurosci. 2018;1:e18. doi: https://doi.org/10.1017/pen.2018.14
127 Flagel SB, Robinson TE. Neurobiological basis of individual variation in stimulus-reward learning. Curr Opin Behav Sci. 2017;13:178–85. doi: https://doi.org/10.1016/j.cobeha.2016.12.004
128 Flagel SB, Watson SJ, Robinson TE, Akil H. Individual differences in the propensity to approach signals vs goals promote different adaptations in the dopamine system of rats. Psychopharmacology (Berl). 2007;191(3):599–607. doi: https://doi.org/10.1007/s00213-006-0535-8
129 Pearce JM, Hall G. A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol Rev. 1980;87(6):532–52. doi: https://doi.org/10.1037/0033-295X.87.6.532
130 Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical conditioning II: Current research and theory. New York: Appleton-Century-Crofts; 1972. p. 64–99.
131 Pool ER, Pauli WM, Kress CS, O’Doherty JP. Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans. Nat Hum Behav. 2019;3(3):284–96. doi: https://doi.org/10.1038/s41562-018-0527-9
132 Prévost C, McNamee D, Jessup RK, Bossaerts P, O’Doherty JP. Evidence for model-based computations in the human amygdala during Pavlovian conditioning. PLOS Comput Biol. 2013;9(2):e1002918. doi: https://doi.org/10.1371/journal.pcbi.1002918
133 Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, et al. A selective role for dopamine in stimulus-reward learning. Nature. 2011;469(7328):53–7. doi: https://doi.org/10.1038/nature09588
134 Huys QJM, Tobler PN, Hasler G, Flagel SB. Chapter 3 - The role of learning-related dopamine signals in addiction vulnerability. In: Diana M, Di Chiara G, Spano P, editors. Dopamine Progress in Brain Research. Elsevier; 2014. p. 31–77.
135 Saunders BT, Robinson TE. A cocaine cue acts as an incentive stimulus in some but not others: implications for addiction. Biol Psychiatry. 2010;67(8):730–6. doi: https://doi.org/10.1016/j.biopsych.2009.11.015
136 Saunders BT, Robinson TE. Individual variation in the motivational properties of cocaine. Neuropsychopharmacology. 2011;36(8):1668–76. doi: https://doi.org/10.1038/npp.2011.48
137 Saunders BT, Robinson TE. Individual variation in resisting temptation: implications for addiction. Neurosci Biobehav Rev. 2013;37(9 Pt A):1955–75. doi: https://doi.org/10.1016/j.neubiorev.2013.02.008
138 Saunders BT, Yager LM, Robinson TE. Preclinical studies shed light on individual variation in addiction vulnerability. Neuropsychopharmacology. 2013;38(1):249–50. doi: https://doi.org/10.1038/npp.2012.161
139 Saunders BT, Yager LM, Robinson TE. Cue-evoked cocaine “craving”: role of dopamine in the accumbens core. J Neurosci. 2013;33(35):13989–4000. doi: https://doi.org/10.1523/JNEUROSCI.0450-13.2013
140 Tunstall BJ, Kearns DN. Sign-tracking predicts increased choice of cocaine over food in rats. Behav Brain Res. 2015;281:222–8. doi: https://doi.org/10.1016/j.bbr.2014.12.034
141 Joyner MA, Gearhardt AN, Flagel SB. A translational model to assess sign-tracking and goal-tracking behavior in children. Neuropsychopharmacology. 2018;43(1):228–9. doi: https://doi.org/10.1038/npp.2017.196
142 Garofalo S, di Pellegrino G. Individual differences in the influence of task-irrelevant Pavlovian cues on human behavior. Front Behav Neurosci. 2015;9:163. doi: https://doi.org/10.3389/fnbeh.2015.00163
143 Schad DJ, Rapp MA, Garbusow M, Nebe S, Sebold M, Obst E, et al. Dissociating neural learning signals in human sign- and goal-trackers. Nat Hum Behav. 2019. Epub ahead of print. doi: https://doi.org/10.1038/s41562-019-0765-5
144 Koob GF. Dynamics of neuronal circuits in addiction: reward, antireward, and emotional memory. Pharmacopsychiatry. 2009;42(S 01, Suppl 1):S32–41. doi: https://doi.org/10.1055/s-0029-1216356