ARTICLE | OPEN
https://doi.org/10.1038/s41467-022-31040-w

Hippocampal representations switch from errors to predictions during acquisition of predictive associations

Fraser Aitken 1,2 & Peter Kok 1✉

We constantly exploit the statistical regularities in our environment to help guide our perception. The hippocampus has been suggested to play a pivotal role in both learning environmental statistics, as well as exploiting them to generate perceptual predictions. However, it is unclear how the hippocampus balances encoding new predictive associations with the retrieval of existing ones. Here, we present the results of two high-resolution human fMRI studies (N = 24 for both experiments) directly investigating this. Participants were exposed to auditory cues that predicted the identity of an upcoming visual shape (with 75% validity). Using multivoxel decoding analysis, we find that the hippocampus initially preferentially represents unexpected shapes (i.e., those that violate the cue regularities), but later switches to representing the cue-predicted shape regardless of which was actually presented. These findings demonstrate that the hippocampus is involved in both acquiring and exploiting predictive associations, and is dominated by either errors or predictions depending on whether learning is ongoing or complete.

1 Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, University College London, 12 Queen Square, London WC1N 3AR, UK. 2 School of Biomedical Engineering and Imaging Sciences, King’s College London, St Thomas’ Hospital, London SE1 7EH, UK. ✉ Corresponding author: Peter Kok.

NATURE COMMUNICATIONS | (2022)13:3294 | https://doi.org/10.1038/s41467-022-31040-w | www.nature.com/naturecommunications

We constantly exploit the statistical regularities in our environment to help guide our perception [1]. For instance, hearing a particular jingle will prime our sensory systems for the sight (and taste!) of ice cream. But how does the brain acquire and exploit knowledge about such regularities in a changing environment?

The hippocampus has been suggested to play a pivotal role in this process. That is, the hippocampus has been shown to be involved in learning novel associations between arbitrary stimuli [2–9], especially when stimuli are discontiguous in space and time [10–12], as is the case for many predictive contextual cues. In fact, learning of such relationships is strongly impaired when the hippocampus is damaged [13–18]. At the same time, the hippocampus has also been suggested to play a role in exploiting such predictive associations once learning is complete [1,19,20]. Specifically, one of the main computational functions of the hippocampus is to retrieve associated items from memory based on partial information, a process known as pattern completion [21–23]. This function has mostly been studied in the context of memory recall, but is also ideally suited for retrieving perceptual predictions based on contextual cues [5,24–28].

This raises the question of how the hippocampus balances the encoding of new associations with the retrieval of existing ones [29,30]. One way to achieve this would be to emphasise prediction errors when an environment is novel, since these can serve to update one’s internal model of the world [31]. On the other hand, once an environment (and its statistical regularities) has become familiar, prediction errors may be downweighted and predictions (i.e., retrieval of existing associations) may dominate. That is, once the statistical regularities of an environment are fully learnt, the hippocampus becomes more resilient to prediction errors caused by random fluctuations (i.e., expected uncertainty), since these are no longer considered model-updating (‘newsworthy’) events.

Indeed, many previous studies have reported prediction error signals (i.e., a response evoked by a mismatch between representations retrieved from memory and current sensory inputs) in the hippocampus [32–37], while others have instead revealed prediction signals (i.e., a representation of a predicted stimulus, regardless of whether it is actually presented) [5,26,27,38]. Potentially, this seeming contradiction may arise from the fact that mismatch signals have mostly been reported in the context of episodic memory-like paradigms, where individual stimuli are only repeated a few times, whereas studies revealing prediction signals have generally involved a longer training phase to fully establish predictive associations before measuring neural signals. That is, when stimuli or associations are novel, the hippocampus is mainly driven by sensory signals that provide the opportunity to update our model of the world, i.e., prediction errors [28,39–42]. However, once learning is complete and environmental contingencies are no longer novel, hippocampal processing is dominated by retrieving predicted stimuli based on contextual cues to optimally guide perception [1,27,43,44].

In line with this idea, recent work has shown that novel prediction errors can bias the human hippocampus towards encoding [45], increasing sensory processing (i.e., EC to CA1 connectivity) and decreasing mnemonic retrieval (CA3/DG to CA1 connectivity). In addition, behavioural evidence suggests that expectation violation [46] and novelty [47] can bias the hippocampus towards performing pattern separation, proposed to underlie prediction error computations [28]. Indeed, in the context of episodic memory, it has been proposed that the hippocampus operates in two distinct modes, namely an encoding mode that prioritises processing of novel sensory signals and promotes plasticity, and a retrieval mode that prioritises memory retrieval through pattern completion [48,49]. The hippocampus is thought to be biased towards encoding by novelty-induced increases in neuromodulators such as acetylcholine (ACh) and norepinephrine (NE) [50–52] and hippocampal theta phase resets [53–55]. However, a proper test of this proposal requires establishing whether the hippocampus switches from representing errors to predictions as learning progresses. Here, we present the results of two high-resolution fMRI studies (N = 24 for both experiments) directly testing this hypothesis. Participants were exposed to auditory cues that predicted the identity of an upcoming visual shape (with 75% validity). To preface our findings, we found that the hippocampus initially preferentially represented unexpected shapes, but later switched to representing the cue-predicted shape regardless of whether it was actually presented. Furthermore, in this latter phase we observed increased informational connectivity between the posterior subiculum and early visual cortex (V1), in line with hippocampal predictions being relayed to the sensory cortex. These findings demonstrate that hippocampal representations switch from being dominated by errors to predictions as associative learning proceeds.

Results

We present the results of two human fMRI studies (N = 24 participants in both experiments) in which human participants were exposed to auditory cues that predicted the identity of an upcoming visual shape (with 75% validity) (Fig. 1a, b). On each block of trials (n = 32 trials per block in Experiment 1 and n = 128 in Experiment 2) new auditory cues were presented, such that novel associations would have to be learnt.

Participants performed a shape discrimination task that was orthogonal to the predictive cues. Specifically, on each trial, the first shape (validly predicted, 75% of trials, or invalidly predicted, 25%) was followed by a second shape that was either identical to the first (50% of trials) or very slightly warped (50%; see “Methods” for details). Participants’ task was to indicate whether the two shapes were the same or different. This task was designed to encourage participants to pay attention to the shapes while keeping the cue-shape contingencies task-irrelevant. In fact, participants were not informed that the auditory cues predicted the identity of the upcoming shape, and debriefing revealed that they did not become aware of this during the experiments. In other words, any learning of cue-shape associations was incidental and implicit.

Multivoxel decoding analyses (Supplementary Fig. 1), trained on data from separate shape-only runs in which no predictive cues were presented (Fig. 1c, d), were used to reveal hippocampal shape representations on valid and invalid trials (Fig. 1e). If the hippocampus were to represent prediction errors, valid trials should not result in a shape representation, since the predicted and presented shapes are identical and should cancel each other out (Fig. 1f, top left). On invalid trials on the other hand, if shape B is predicted but shape A is presented, unexpected shape A should be represented in the hippocampus (Fig. 1f, middle left). If, instead, the hippocampus were to represent predictions rather than errors, on invalid trials where shape B is predicted but shape A is presented, shape B should be represented in the hippocampus (Fig. 1f, middle right). Further, on valid trials, the shape that is both predicted and presented should be represented (Fig. 1f, top right).

Both of these types of patterns have been observed in the hippocampus [56], and the aim of the current study was to investigate how they develop over the course of learning. Note that the temporal resolution afforded by fMRI did not allow us to investigate any potential fast within-trial dynamics of these hypothesised prediction and prediction error signals. Rather, the shape representations revealed here reflect a temporal integration of neural signals over the course of a trial. It seems likely that both predictions and prediction errors play a role in hippocampal computations; the question addressed here is whether the relative weighting of the two is affected by novelty and uncertainty.

Fig. 1 Experimental paradigm and analysis. a During prediction runs, an auditory cue preceded the presentation of two consecutive shape stimuli. On each trial, the second shape was either identical to the first or slightly warped with respect to the first along an orthogonal dimension, and participants’ task was to report whether the two shapes were the same or different. b The auditory cues predicted whether the first shape on a given trial would be shape 2 or shape 4 (of 5 shapes). The cue was valid on 75% of trials, whereas in the other 25% of (invalid) trials the unpredicted shape was presented. c During shape-only runs, no auditory cues were presented. As in the prediction runs, two shapes were presented on each trial, and participants’ task was to report whether they were the same or different. d All five shapes appeared with equal (20%) likelihood during shape-only runs. e Subtracting the response evoked by invalidly from validly predicted shapes isolated the effect of the predictive cues. f Hypothesised shape decoding results if the hippocampus represents either prediction errors (left column) or predictions (right column).

The clearest way to dissociate the effects of the predictive cues from the effects of the presented shapes is to subtract decoding evidence for the invalidly predicted shapes from evidence for the validly predicted shapes (Fig. 1e), since the presented shapes were identical in both types of trials. Under a prediction error hypothesis, this would result in a negative signal (subtracting a positive signal on invalid trials from a zero signal on valid trials; Fig. 1f left column). Under a prediction hypothesis on the other hand this would result in a positive signal (subtracting a negative signal on invalid trials from a positive one on valid trials; Fig. 1f right column). This subtraction, therefore, constitutes our main effect of interest. In addition, averaging the evidence for validly and invalidly predicted shapes allowed us to quantify evidence for the shape as presented on the screen, regardless of the cues [27].

Experiment 1: short blocks. In Experiment 1, participants completed 16 blocks of 32 trials, with two novel auditory cues being presented in each block, while the same two visual shapes were presented throughout. Each auditory cue predicted which of the two shapes would be presented with 75% validity (Fig. 1b).

Experiment 1: behavioural results. Participants were able to detect small differences in the shapes, during both the shape-only runs (67.7 ± 1.7% correct; 29.7 ± 1.8% modulation of the 3.18 Hz radial frequency component, mean ± SEM) and during the prediction runs (69.0 ± 1.4% correct; 28.7 ± 1.9% modulation). Accuracy and reaction times (RTs) did not differ significantly between valid (68.9 ± 1.5% correct; RT = 592 ± 19 ms) and invalid (69.3 ± 1.6% correct; RT = 595 ± 20 ms; both p > 0.10) trials. Task accuracy was stable over trials and no difference between valid and invalid trials emerged over time (Supplementary Fig. 2a). This is as expected, since the discrimination task was orthogonal to the prediction manipulation (see “Methods” for details), and in line with previous results [27].

Experiment 1: fMRI decoding results. The dynamics of hippocampal shape representations over trials were investigated using a sliding window approach (see Methods for details). In the second half of the blocks, hippocampal activity patterns started to reflect unexpected (i.e., invalidly predicted) visual shapes (significant cluster from trial 22 to 32, p = 0.024; Fig. 2a, red line). However, there was no significant representation of validly predicted shapes (no clusters with p < 0.05; Fig. 2a, green line). In fact, there was a significant difference between invalidly and validly
predicted shape decoding in the hippocampus (valid–invalid, significant cluster from trial 24 to 32, p = 0.028; Fig. 2a). In other words, hippocampal activity patterns reflected shapes that were unexpectedly presented (e.g., decoding evidence for shape A when shape B was predicted but A was presented) but not shapes that were presented as expected (e.g., no evidence for shape A when shape A was predicted and presented). In sum, towards the end of the blocks, activity patterns in the hippocampus reflected a prediction error-like signal, representing unexpected but not expected shapes.

Fig. 2 Experiment 1 shape decoding over trials. a Decoding evidence for validly (green) and invalidly (red) predicted shapes in the hippocampus. b Decoding evidence for predicted (valid–invalid) shapes in the hippocampus. c Decoding evidence for validly (green) and invalidly (red) predicted shapes in hippocampal subfields. d Decoding evidence for predicted (valid–invalid) shapes in hippocampal subfields. Time courses were temporally smoothed using a sliding window approach (see “Methods” for details). Horizontal lines indicate significant clusters. N = 24 participants, shaded regions indicate SEM.

Segmenting the hippocampus into its subfields revealed that this effect was significantly present in CA2-3-DG (significant cluster for invalidly predicted shapes from trial 20 to 32, p = 0.014; significant cluster for valid–invalid from 23 to 32, p = 0.045), but not in CA1 and the subiculum (no clusters with p < 0.05), suggesting that this effect may have been driven by CA2-3-DG. However, decoding evidence for the predicted shape (i.e., valid–invalid, Fig. 2d) in the last bin was not significantly different between the different subfields (F(2,46) = 0.83, p = 0.44). Given the recent interest in potential functional differences along the long axis of the hippocampus [57–59], we also compared decoding evidence for the predicted shape in the last bin between the posterior and anterior hippocampus, but found no significant difference (t(23) = 0.91, p = 0.37). However, decoding evidence for the predicted shape was significant in the posterior (t(23) = −3.83, p = 0.00086) but not the anterior (t(23) = −1.36, p = 0.19) hippocampus, suggesting the posterior hippocampus may be driving the prediction error-like effects.

In order to quantify the emergence of these signals over trials, we fit sigmoid functions, or S-curves, to the decoding evidence for predicted shapes in the hippocampus (Fig. 3a; see “Methods” for details). In line with the results from the non-parametric cluster-based permutation tests reported earlier, the best fitting sigmoids had a significantly negative amplitude in the hippocampus (t(23) = −2.17, p = 0.041) and CA2-3-DG (t(23) = −2.90, p = 0.0080), but not in CA1 (t(23) = −0.31, p = 0.76) and the subiculum (t(23) = −1.38, p = 0.18). Finally, in a control analysis, to quantify the representational change over time without making any assumptions about the shape of this change, we calculated the derivative of the decoding evidence for the predicted shape over trials. In line with the previous analyses, in hippocampus (t(23) = −2.72, p = 0.012) and CA2-3-DG (t(23) = −2.84, p = 0.0092), but not in CA1 (t(23) = −0.58, p = 0.57) and the subiculum (t(23) = −1.55, p = 0.13), the average derivative over the course of the blocks was significantly negative.

Fig. 3 Quantification of hippocampal learning curve in Experiment 1. a Sigmoid learning curve fit to predicted shape decoding in hippocampus. N = 24 participants, shaded regions indicate SEM. b Amplitude parameter of the sigmoid curve. N = 24 participants, error bars indicate SEM. P value reflects two-sided one-sample t-test (df = 23) against zero. Dots indicate individual participants. Source data are provided as a Source Data file.

It is noteworthy that, visually, an early positive predicted shape effect seemed to be present in the hippocampus (Fig. 2b), especially in the subiculum (Fig. 2d). This effect was not significant according to the cluster-based permutation tests, but in an exploratory post hoc analysis we investigated whether this early positivity was significant by fitting two sigmoids, rather than one, to the predicted shape evidence (see Methods for details). There was no significantly positive early sigmoid in hippocampus as a whole (t(23) = 1.13, p = 0.27), nor in CA1 (t(23) = 0.77, p = 0.45) or CA2-3-DG (t(23) = 0.66, p = 0.52), but there was in the subiculum (t(23) = 3.43, p = 0.002; Supplementary Fig. 3).

Based on previous findings of predictive signals in the caudate nucleus [4,27,60], we also tested these effects in the caudate, and found that like the hippocampus, caudate activity patterns reflected unexpected (significant cluster from trial 20 to 29, p = 0.0062) but not expected (no clusters with p < 0.05) shapes towards the end of the blocks, with a significant difference between the two conditions (valid–invalid, significant cluster from trial 20 to 29, p = 0.033; Supplementary Fig. 4).

The fact that the hippocampus displayed a prediction error-like pattern (cf. Figs. 2a and 1f, left column) is striking given that several previous studies have reported prediction-like effects [5,26,38]. Specifically, a previous study with a virtually identical design [27] revealed evidence for the shape predicted by the cue, regardless of which shape was actually presented (as in Fig. 1f, right column). The crucial difference is that in these previous studies participants were exposed to the predictive associations for many trials before the fMRI session, whereas here participants learned novel predictive associations every block. Based on this, we hypothesised that the hippocampus may switch from representing prediction errors (early in learning) to representing predictions (once learning is complete) as learning progresses (Fig. 4). In order to test this hypothesis, we performed a second fMRI experiment, in which participants (N = 24) were exposed to the same cues for longer, and tested for potential switches in dynamics by fitting sigmoid learning curves to the decoding evidence over trials.

[Figure 4 schematic: decoding signal over trials; labelled elements are "Experiment 1 results", "Kok & Turk-Browne 2018 data", "H1: Hippocampus continues to represent PE" and "H2: Hippocampus switches to representing P".]

Fig. 4 Hypotheses for Experiment 2. Experiment 1 revealed a negative shape decoding signal (solid line). Lengthening the learning phase may either result in this effect continuing (H1, dotted line), or lead to a switch towards positive prediction signals once learning is complete (H2, dashed line). Square indicates results from ref. 27, where participants were acquainted with the predictive cues before the fMRI session.

Experiment 2: long blocks. In Experiment 2, participants were exposed to 4 blocks of 128 trials (compared to 16 blocks of 32 trials in Experiment 1), with new auditory predictive cues being presented in each block. In all other regards, Experiment 2 was identical to Experiment 1.

Experiment 2: behavioural results. As in Experiment 1, participants were able to detect small differences in the shapes, during both the shape-only runs (69.5 ± 1.7% correct; 30.1 ± 2.0% modulation of the 3.18 Hz radial frequency component, mean ± SEM) and during the prediction runs (68.5 ± 1.8% correct; 24.5 ± 2.0% shape modulation). Accuracy and reaction times (RTs) again did not differ significantly between valid (68.6 ± 1.9% correct; RT = 651 ± 18 ms) and invalid (68.2 ± 1.9% correct; RT = 654 ± 18 ms; both p > 0.10) trials. Task accuracy was stable over trials and no difference between valid and invalid trials emerged over time (Supplementary Fig. 2b).

Experiment 2: fMRI decoding results. As in Experiment 1, dynamics of hippocampal shape representations over trials were investigated using a sliding window approach (see Methods for details). Reflections of validly and invalidly predicted shapes in hippocampal activity patterns displayed striking dynamics over time (Fig. 5a). These dynamics were quantified by fitting two sigmoid curves to the decoding evidence for predicted shapes (Fig. 5b), one with an inflection point in the first half of the blocks (trials 1–64) and the other with an inflection point in the second half (trials 65–128). This analysis revealed that an initial negative curve (amplitude parameter of early sigmoid; t(23) = −2.26, p = 0.033), reflecting evidence for unexpected but not expected shapes (i.e., prediction error, as in Experiment 1) was followed by a positive curve (amplitude parameter of the late sigmoid curve; t(23) = 2.45, p = 0.022) about halfway through the blocks. This switch was most striking for invalidly predicted shapes (Supplementary Fig. 5a). Initially, approximately halfway through the blocks, the unexpectedly presented shape was represented (Supplementary Fig. 5b). However, at the end of the blocks, the hippocampus instead represented the shape predicted by the auditory cue, rather than the shape presented on the screen (Supplementary Fig. 5b).

Segmenting the hippocampus into its subfields revealed that both the early negative and later positive learning curves were also significant in CA1 (early: t(23) = −2.75, p = 0.011; late: t(23) = 2.96, p = 0.0070), but not in CA2-3-DG (early: t(23) = −0.95, p = 0.35; late: t(23) = 0.54, p = 0.60), while in the subiculum the early negative curve was marginal (t(23) = −2.07, p = 0.05) whereas the later positive one was significant (t(23) = 2.56, p = 0.018).

As in Experiment 1, we performed a control analysis that did not make any assumptions about the shapes of the learning curves, in which we calculated the derivative of the decoding evidence for the predicted shape (Fig. 5b, e), separately for the first and second half of the blocks. In line with the curve fitting results, the derivative was significantly different in the first versus the second halves of the blocks in hippocampus (t(23) = −2.67, p = 0.014) and CA1 (t(23) = −2.41, p = 0.024), while this difference was marginal in CA2-3-DG (t(23) = −2.06, p = 0.051) and not significant in the subiculum (t(23) = −1.34, p = 0.19). This was driven by the derivative being significantly positive in the second half of the blocks in hippocampus (t(23) = 2.25, p = 0.034) and CA1 (t(23) = 2.24, p = 0.035), but marginally negative in the first half (hippocampus: t(23) = −2.00, p = 0.057; CA1: t(23) = −1.98, p = 0.060). In CA2-3-DG, the derivative was significantly negative in the first half (t(23) = −2.73, p = 0.012) but not the second half (t(23) = 0.87, p = 0.39), while neither half was significant in the subiculum (first half: t(23) = −0.51, p = 0.61; second half: t(23) = 1.49, p = 0.15). There was no significant difference between the hippocampal subfields in terms of the derivative of the decoding time courses in either the first (F(2,46) = 0.58, p = 0.56) or second halves (F(2,46) = 0.79, p = 0.46) of the blocks.

However, there was a significant difference between posterior and anterior hippocampus, with the positive derivative in the second half of the blocks being stronger in posterior than anterior hippocampus (t(23) = 3.00, p = 0.0064; Supplementary Fig. 6). In fact, the early negative (posterior: t(23) = −2.43, p = 0.024; anterior: t(23) = −0.86, p = 0.40) and late positive (posterior: t(23) = 2.70, p = 0.013; anterior: t(23) = 0.98, p = 0.34) sigmoids, as well as the difference in the derivative between the first and second half of the blocks (posterior: t(23) = 2.74, p = 0.012; anterior: t(23) = 1.42, p = 0.17), were significant in the posterior, but not anterior hippocampus.

A positive slope in the second half of the blocks might also be observed if the early prediction error signal simply gradually disappeared, rather than hippocampal representations switching to a positive prediction signal. To resolve this, we tested whether decoding evidence for the predicted shape at the end of the blocks (i.e., the final bin) was significantly larger than zero. While this signal was not significantly positive for the hippocampus as a whole (t(23) = 1.75, p = 0.094), it was in the posterior hippocampus (t(23) = 2.15, p = 0.042), which was also the driver of the results reported above. In fact, decoding evidence for the predicted shape at the end of the blocks was stronger in the posterior than the anterior (t(23) = 0.31, p = 0.76; posterior vs. anterior: t(23) = 2.14, p = 0.043) hippocampus. The significant effect in posterior hippocampus was reflected by significant evidence for the predicted shape in the posterior subiculum (t(23) = 2.55, p = 0.018) and CA1 (t(23) = 2.17, p = 0.041), but not CA2-3-DG (t(23) = 1.31, p = 0.20). There was no significant evidence for the presented shape (quantified by averaging evidence for valid and invalid shapes, thereby averaging out the effect of the cues [27]; Fig. 1e) at the end of the blocks in the hippocampus (t(23) = −1.30, p = 0.21) or any of its subdivisions (all p > 0.2). In other words, once predictions were learnt, hippocampal representations were determined by the predictive cues, not by which shape was actually presented on screen. Note that this analysis was based on a relatively small subset of trials (i.e., only those in the last bin of each block), but the positive evidence for predicted shapes and absence of evidence for the presented shapes is in line with a study employing a virtually identical paradigm where participants learnt the cue-shape predictions before being tested in the scanner [27].
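As a rough illustration of the assumption-free control analysis and the final-bin test reported in this section, the following Python sketch computes predicted-shape evidence (valid minus invalid), the mean derivative in each half of a block, and a one-sample t statistic against zero. It is not the authors' analysis code: the function names, the use of simple first differences as the "derivative", and the toy numbers are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def predicted_shape_evidence(valid, invalid):
    """Valid minus invalid decoding evidence, one value per time bin."""
    if len(valid) != len(invalid):
        raise ValueError("valid and invalid time courses must have equal length")
    return [v - i for v, i in zip(valid, invalid)]


def mean_derivative(series):
    """Average first difference of a time course (no curve shape assumed)."""
    return mean(b - a for a, b in zip(series, series[1:]))


def half_derivatives(series):
    """Mean derivative in the first and in the second half of a block."""
    mid = len(series) // 2
    return mean_derivative(series[:mid]), mean_derivative(series[mid:])


def one_sample_t(values):
    """t statistic for testing the mean of `values` against zero (df = n - 1)."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n))


# Toy, made-up time course for one hypothetical participant (not real data):
# predicted-shape evidence first dips below zero, then climbs above it.
valid = [0.0] * 8
invalid = [0.0, 1.0, 2.0, 3.0, 2.0, 0.0, -2.0, -4.0]
evidence = predicted_shape_evidence(valid, invalid)
early_slope, late_slope = half_derivatives(evidence)
print(early_slope, late_slope)
```

In a group analysis along these lines, each participant would contribute one early and one late slope (or one final-bin value), and `one_sample_t` would then be applied across participants; with 24 participants this gives df = 23, matching the tests reported in the text.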
There was no significant difference in predicted shape evidence between the posterior subfields (F(2,46) = 1.09, p = 0.34), but it is worth noting that the effect was numerically largest in the posterior subiculum (0.13; Supplementary Fig. 7), in line with the previous work [27].

Fig. 5 Experiment 2 shape decoding over trials. a Decoding evidence for validly (green) and invalidly (red) predicted shapes in the hippocampus. b Decoding evidence for predicted (valid–invalid) shapes in the hippocampus (yellow) with the double sigmoid fit (grey). c Amplitude parameters of early (midpoint between trials 1 and 64) and late (midpoint between trials 65 and 128) sigmoid curves in the hippocampus. P value reflects two-sided one-sample t-test against zero. d Decoding evidence for validly (green) and invalidly (red) predicted shapes in hippocampal subfields. e Decoding evidence for predicted (valid–invalid) shapes in hippocampal subfields (yellow) with the double sigmoid fit (grey). f Amplitude parameters of early (midpoint between trials 1 and 64) and late (midpoint between trials 65 and 128) sigmoid curves in hippocampal subfields. Time courses were temporally smoothed using a sliding window approach (see Methods for details). N = 24 participants in all panels. Shaded regions and error bars indicate SEM. Dots indicate individual participants. P values reflect two-sided one-sample t-tests (df = 23) against zero. Source data are provided as a Source Data file.

Since the subiculum is a major output relay from the hippocampus to neocortex [61,62], we speculate that this may be in line with hippocampal prediction signals being communicated to the sensory cortex, in order to guide perception. If this were the case, one would expect functional connectivity between hippocampus and neocortex to increase as predictive associations are established [63,64]. We tested this hypothesis in an exploratory analysis of informational connectivity between the posterior subiculum and entorhinal cortex (EC; a major interface between hippocampus and cortex) as well as the visual cortex (V1, V2 and LO) (see “Methods” for details). This analysis revealed increased informational connectivity at the end of the blocks (final bin) versus at the start of the blocks (first bin) between the posterior subiculum and EC (t(23) = 2.40, p = 0.025; Fig. 6) and V1 (t(23) = 2.88, p = 0.0084), but not V2 (t(23) = 1.73, p = 0.097) and LO (t(23) = −0.28, p = 0.78).
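The idea of time-resolved informational connectivity can be illustrated with a small Python sketch: a Pearson correlation between the shape-evidence time courses of two regions, computed within a sliding window of trials. This is only a minimal illustration under assumptions made for this example; the window length, the function names and the toy values are hypothetical, and the authors' exact procedure is the one described in their Methods.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / sqrt(sxx * syy)


def windowed_connectivity(evidence_a, evidence_b, window):
    """Correlate shape evidence from two regions within each sliding window."""
    if len(evidence_a) != len(evidence_b):
        raise ValueError("both regions need the same number of trials")
    if not 2 <= window <= len(evidence_a):
        raise ValueError("window must be between 2 and the number of trials")
    n_windows = len(evidence_a) - window + 1
    return [
        pearson_r(evidence_a[s:s + window], evidence_b[s:s + window])
        for s in range(n_windows)
    ]


# Toy, made-up evidence values for two hypothetical regions (not real data):
# the regions disagree early in the block and agree late in the block.
region_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
region_b = [3.0, 2.0, 1.0, 4.0, 5.0, 6.0]
connectivity = windowed_connectivity(region_a, region_b, window=3)
print(connectivity[0], connectivity[-1])
```

Comparing the first and last window per participant, and then testing that difference across participants against zero, would mirror the first-bin versus final-bin comparison reported in the text.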
NATURE COMMUNICATIONS | (2022)13:3294 | https://doi.org/10.1038/s41467-022-31040-w | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-022-31040-w Connectivity between posterior Subiculum and Entorhinal Connectivity between posterior Subiculum and V1 0.3 0.16 Informative connectivity 0.2 0.12 p = 0.025 p = 0.0084 0.08 0.1 0.04 0.0 0.00 0 20 40 60 80 100 120 0 20 40 60 80 100 120 Trials Trials Fig. 6 Experiment 2 informational connectivity between posterior subiculum and neocortex over trials. Time-resolved Pearson correlation between shape evidence in the posterior subiculum and entorhinal cortex (left panel) and V1 (right panel). Time courses were temporally smoothed using a sliding window approach (see Methods for details). N = 24 participants, shaded regions indicate SEM. P values reﬂect two-sided one-sample t-tests (df = 23) against zero. The caudate nucleus displayed a qualitatively similar pattern of The early bias towards prediction errors is in line with recent results as the hippocampus, with an initial negative sigmoid curve demonstrations of hippocampal mode switches induced by novel (t23 = −3.11, p = 0.0049) being followed by a positive curve prediction errors in humans45. Mechanistically, this switch may (t23 = 3.65, p = 0.0013; Supplementary Fig. 8a). The control occur since novelty leads to an increase of neuromodulators like analysis also revealed a signiﬁcant difference in derivative in the ACh and NE50–52, which suppress retrieval-related connections ﬁrst versus second half of the blocks (t23 = 2.90, p = 0.0081), (CA3’s autorecurrence and CA3 -> CA1) relative to encoding- driven by positive derivative in the second half (t23 = 3.94, related ones (EC -> CA1)69–71. 
Alternatively, novelty may pro- p = 0.00066) but not the ﬁrst half (t23 = −0.33, p = 0.74), and a mote encoding on a faster time-scale by inducing a hippocampal signiﬁcantly positive representation of the predicted shape in the theta phase reset49,53–55. Further research is needed to determine ﬁnal bin of the blocks (t23 = 3.06, p = 0.0055). whether the switch demonstrated here was indeed driven by To investigate the speciﬁcity of these effects, we also analysed hippocampal mode changes or by a different mechanism that regions in the early visual cortex (Supplementary Fig. 8b–d) and upweights novel prediction errors, such as attention72–74. For the amygdala (Supplementary Fig. 8e), a region adjacent to the instance, methods with higher temporal resolution such as EEG/ hippocampus. These regions did not signiﬁcantly reﬂect the MEG or invasive electrophysiology could be used to investigate learning of predictive associations (no signiﬁcantly negative or whether there is a relationship between hippocampal theta phase positive sigmoid curves, all p > 0.05), indicating that the effects in and error vs. prediction representations in the hippocampus. In the hippocampus and caudate were region-speciﬁc27. either case, as learning progresses and novelty diminishes, a bias towards encoding prediction errors is abolished and retrieval of predictive associations dominates. Discussion As prediction signals emerged in the hippocampus, functional In two human fMRI experiments, we ﬁnd that as learning of connectivity increased between the posterior subiculum and the associative predictions progresses, the hippocampus switches entorhinal cortex and the primary visual cortex, demonstrating a from preferentially representing unexpected stimuli (i.e., predic- potential route for relaying predictions to the sensory tion errors) to representing predicted shapes. These ﬁndings cortex26,62,63,75,76. 
This relaying of predictions likely involves the demonstrate that the hippocampus is involved in both acquiring same mechanisms that are responsible for hippocampus-mediated and exploiting predictive associations, and is dominated by either cortical reinstatement of memories77–80. Of course, fMRI con- errors or predictions depending on whether learning is ongoing nectivity analyses cannot determine directionality given the slow (i.e., when prediction errors are informative28) or complete (only nature of the BOLD signal, so future research using expected uncertainty remains). Concretely, what this suggests in electrophysiology81,82 or layer-speciﬁc fMRI76,83,84 will be required the context of the current study is that prediction errors caused by to test this hypothesis further. Similarly, the slow nature of the early cue violations, when learning is still very much ongoing, BOLD signal prevents investigating fast within-trial dynamics of dominate processing in the hippocampus, leading to the repre- prediction and prediction error signals. For instance, it may be that sentation of the unexpectedly presented shape. On the other prediction signals always precede prediction error signals in the hand, once the 75–25% cue contingencies are ﬁrmly learnt, the hippocampus. The hippocampal representations revealed here 25% cue violations are no longer treated as model updating reﬂect a temporal integration of neural signals over the course of a (‘newsworthy’) events28, are therefore no longer upweighted, and trial, and thus indicate whether predictions or prediction errors the retrieval of the cued shape dominates. Note that we are not dominate. Future studies with millisecond temporal resolution are suggesting that this is an all-or-nothing switch; it is likely that the required9,82 to reveal the dynamic interplay of predictions and hippocampus always represents both predictions (through pat- errors within the hippocampus. 
tern completion in CA321,24,65,66) and errors (potentially through An exploratory post hoc analysis of Experiment 1 additionally mismatch comparison in CA134,36,67), but that the balance revealed early prediction-like signals in the subiculum, before the between the two depends on contextual factors such as novelty prediction error-dominated signals emerged (Supplementary and unexpected uncertainty. An analogous switch between pre- Fig. 3). This initial positive signal could potentially reﬂect early, diction vs. surprise dominated representations has recently been imprecise predictions, which lead to strong prediction errors on proposed in the realm of perception, albeit on a sub-second time- subsequent invalid trials. This explanation is currently spec- scale68. ulative, especially given the post hoc nature of the analysis. Future 8 NATURE COMMUNICATIONS | (2022)13:3294 | https://doi.org/10.1038/s41467-022-31040-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-022-31040-w ARTICLE research is needed to investigate the early build-up of prediction or is it sufﬁcient for only their predictive values to change, i.e., for signals further, for instance using a paradigm with many blocks there to be unexpected uncertainty52? with only a few cue repetitions each. In addition, whether the hippocampus signals predictions or It appears that the prediction error-like signals emerged later in prediction errors may also depend on the type of predicted sti- Experiment 2 than in Experiment 1 (cf. Figs. 2 and 5). This is likely mulus. For instance, in previous work, we reported hippocampal the result of differences in sliding window length and smoothing prediction signals for complex shapes, but prediction error-like between the two experiments, combined with the fact that the signals for low-level features, i.e., the predicted orientation of a prediction error signal may still be building by the end of the grating stimulus56. 
Future work systematically manipulating the blocks in Experiment 1 (Fig. 2b). More speculatively, the later complexity of visual stimuli may shed light on this by exploring negative peak in Experiment 2 may also partly have resulted from the relationship between hippocampal computations and stimu- averaging together the small early positive peak discussed above lus complexity93. (~trial 16–20) with the subsequent negative signal (~trial 24 In sum, the current ﬁndings demonstrate a role for the hip- onwards) (Supplementary Fig. 3). While the sliding window in pocampus in both acquiring and exploiting predictive associa- Experiment 1 was short enough to separately resolve these two tions, bridging the ﬁelds of learning and perception. These ﬁelds signals, the longer sliding window in Experiment 2 was not, have separately made progress in investigating the roles of pre- resulting in these two signals cancelling each other out early in the diction, novelty and uncertainty1,52, but have until now largely blocks. At present this explanation is highly speculative since the remained segregated literatures, despite great promise to inform initial positive signal in Experiment 1 was detected using post hoc one another68,94. Ultimately, weighting predictions and errors analyses and needs to be investigated further, as discussed above. according to their reliability is crucial to optimally perceive and It is noteworthy that the predictive associations studied here engage with our environment, and the current ﬁndings suggest were fully implicit. Participants were not informed that there that the hippocampus plays a crucial role in this process. were any such associations, the predictions were incidental to the task, and debrieﬁng indicated that participants did not become aware of them over the course of the experiments8,44. The fact Methods Participants. 
Both experiments aimed to recruit 24 healthy, right-handed, MR- that such implicit associations still involved the hippocampus is compatible participants with normal or corrected-to-normal vision. All partici- in line with theories of hippocampal processing based on the pants provided informed consent through a protocol reviewed by the University types of computations required, rather than whether they are College London (UCL) Research Ethics Committee and were compensated a total explicit or implicit23,40,85. In fact, it has even been suggested that of £27.50 for their time. Twenty-nine individuals completed Experiment 1, of which ﬁve were excluded due to our strict head motion criteria (ﬁve or more the hippocampus may engage in error-driven conjunction movements larger than 1.5 mm in any direction between successive functional learning speciﬁcally when associations are incidental to the task volumes). The ﬁnal sample consisted of 24 participants (12 female; age 25.6 ± 7.2, participants perform86. mean ± SD). Twenty-nine individuals completed Experiment 2, of which two were Despite the predictive associations being implicit, hippocampal excluded for not performing the task above chance, and three due to excessive head signals may still have been affected by ﬂuctuations in the level of motion (see criteria above). The ﬁnal sample consisted of 24 participants (19 female; age 26.2 ± 7.0, mean ± SD). attention paid to the cues over the course of the blocks. That is, if participants pay more (less) attention to the cues over time, this might increase (decrease) the strength of the prediction signals in Stimuli. Visual and auditory stimuli were generated using MATLAB (Mathworks, the hippocampus. Future research might dissociate learning Natick, MA, USA) and the Psychophysics Toolbox95. 
In the MR scanner, visual dynamics and attentional ﬂuctuations by changing the reliability stimuli were displayed on a rear projection screen using a projector (1600 × 1200 resolution, 60 Hz refresh rate) against a grey background. Participants viewed the of predictive cues between blocks. More reliable (e.g., 90% valid) visual display through a mirror that was mounted on the head coil. The visual cues would be expected to lead to faster learning rates than less stimuli consisted of complex shapes deﬁned by radial frequency components reliable (e.g., 60% valid) ones, without affecting non-speciﬁc (RFCs)96,97, identical to the shapes used in Kok & Turk-Browne27 (Fig. 1). The ﬂuctuations in attention due to time spent on task. contours of the stimuli were deﬁned by seven RFCs, and a one-dimensional shape space was created by varying the amplitude of three out of the seven RFCs27. In the current study, both prediction error and prediction sig- Speciﬁcally, the amplitudes of the 1.11, 1.54 and 4.94 Hz components increased nals seem to have been driven by the posterior rather than the together, ranging from 0 to 36 (ﬁrst two components), and from 15.58 to 33.58 anterior hippocampus. This ﬁnding is in line with suggestions that (third component). Note that we chose to vary three RFCs simultaneously, rather hippocampal representations increase in complexity and scale than one, to increase the perceptual (and neural) discriminability of the shapes. Five shapes (Fig. 1d) were selected from this continuum such that they represented along the long axis57; simple cue-stimulus associations as studied a perceptually symmetrical sample of this shape space (see Kok & Turk-Browne27 here may therefore be encoded in the posterior hippocampus58, for details). 
In addition, a fourth RFC (the 3.18 Hz component) was used to create whereas more complex representations such as narratives57 and slightly warped versions of the ﬁve shapes, to enable the same/different shape scenes87–89 are encoded in the anterior hippocampus. discrimination cover task (see below). Experiments 1 and 2 presented identical shapes (black, subtending 4.5°), centred on ﬁxation. Analysis of the caudate nucleus revealed similar prediction In the scanner, auditory stimuli were presented using MR-compatible ear buds signals as in the hippocampus, in line with previous work (E-A-RTONE 3 A, 10 Ohm, Etymotic Research, Elk Grove Village, IL, USA). The employing a highly similar experimental design27, as well as other auditory stimuli consisted of sequences of pure tones, ranging in frequency from studies revealing the involvement of the caudate in predictive 261.36 Hz (C4) to 987.77 (B5) Hz (set of 14 tones: C4, D4, E4, F4, G4, A4, B4, C5, D5, processing4,60. Recently, it has been suggested that perceptual E5, F5, G5, A5, B5; duration = 100 ms; 10 ms linear rise and fall ramps). Seventeen sequences of ﬁve tones (500 ms) were created by selecting the least correlated expectation signals in the tail of the striatum play a role in gen- sequences from all permutations of {1, 2, 3, 4, 5}. These 17 sequences (e.g., 1-2-3-4- erating hallucination-like percepts in mice90. Future research is 5, 1-5-4-3-2, 3-1-5-4-2, etc.) were further differentiated by assigning them different needed to establish whether the caudate and hippocampus play starting tones, in the step of 3. For instance, if sequence 1 was 1-2-3-4-5, sequence 2 different or complementary roles in the processing of predictive was 4-8-7-6-5, sequence 3 was 9-7-11-10-8, etc. Since the maximum starting tone was 10, given the set of 14 tones, every ﬁfth sequence started with starting tone 1 associations91,92. again. 
For each sequence, a mirrored sequence was generated in order to create 17 In the current study, novel predictive cues were introduced on pairs of easily distinguishable sequences consisting of the same tones (e.g., sequence each block of the experiment. It is an open question whether 1-2-3-4-5 was paired with 5-4-3-2-1, sequence 4-8-7-6-5 was paired with 8-4-5-6-7, similar hippocampal dynamics would occur if the cue identities etc.). In Experiment 1, 16 of these pairs were assigned in random order to the sixteen blocks, while the seventeenth pair was used during the practice block remained the same throughout the experiment, but the predictive outside the scanner (see below). In Experiment 2, only the ﬁrst four pairs were contingencies switched. In other words, does the hippocampal used, since this experiment contained only four blocks (see below), while the ﬁfth switch observed here depend on the cues themselves being novel, pair was used for practice. NATURE COMMUNICATIONS | (2022)13:3294 | https://doi.org/10.1038/s41467-022-31040-w | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-022-31040-w Experimental procedure. The trial structure was identical in both experiments. multiband factor = 6). This sequence produced a partial volume for each partici- The start of each trial was signalled by the presentation of a ﬁxation bullseye pant, which covered the occipital and temporal lobes, including and parallel to the (diameter, 0.7°). During prediction runs, an auditory cue (sequence of ﬁve tones; hippocampus. Field map data were acquired using a Siemens Field Map sequence 500 ms) was presented 100 ms after the trial onset (Fig. 1a). Following a 500 ms (TR = 1020.0 ms; short TE = 10.00 ms; long TE = 12.46 ms; voxel size = 3.0 × 3.0 delay, two consecutive shapes were presented for 250 ms each, separated by a × 2.0 mm, 64 transverse slices, ﬂip angle = 90°). Anatomical images were acquired 500 ms ﬁxation screen. 
The auditory cues predicted whether the ﬁrst shape on that using a T1-weighted Magnetisation Prepared Rapid Gradient Echo (MPRAGE), trial would be shape 2 or shape 4 (out of ﬁve shapes; Fig. 1b, d). The cue was valid using a Generalized Auto calibrating Partially Parallel Acquisition (GRAPPA) in 75% of trials, whereas in the other 25% of trials the unpredicted shape would be factor of 2 (TR = 2530 ms; TE = 3.34 ms; 176 sagittal slices; voxel size = 1.0 × 1.0 presented. For instance, a speciﬁc auditory cue might be followed by shape 2 in × 1.0 mm; ﬂip angle = 7°). To enable hippocampal segmentation, a T2-weighted 75% of trials and by shape 4 in the remaining 25% of trials. On each trial, the turbo spin-echo (TSE) image (TR = 12650 ms; TE = 45 ms; voxel size = 0.4 × 0.4 second shape either was identical to the ﬁrst (50%), or slightly warped (50%), by × 1.5 mm; 54 coronal slices perpendicular to the long axis of the hippocampus; ﬂip modulating the amplitude of the 3.18 Hz RFC component deﬁning the shape. This angle = 122°) was acquired. modulation could be either positive or negative (counterbalanced over conditions) and the participants’ task was to indicate whether the two shapes on a given trial fMRI preprocessing. Images for both experiments were preprocessed using Sta- were the same or different, using an MR-compatible button box (750 ms response tistical Parametric Mapping (SPM12, http://www.ﬁl.ion.ucl.ac.uk/spm, Wellcome interval). This task was designed to encourage participants to attend the visual Centre for Human Neuroimaging, London, UK). The ﬁrst six volumes of each shapes, while avoiding a relationship between the perceptual prediction and the functional run were discarded to allow T1 equilibration. For each run, the task response. 
Furthermore, by modulating one of the RFCs that was not used to remaining functional images were spatially realigned to correct for head motion, deﬁne our one-dimensional shape space, we ensured that the shape change on and simultaneously supplied to B0 unwarping, using SPM’s realign and unwarp which the task was performed was orthogonal to the changes that deﬁned the shape function. The functional data were temporally high-pass ﬁltered with a 128 s period space, and thus orthogonal to the shape features predicted by the auditory cues. cut-off. No spatial smoothing was applied, and all analyses were performed in the The size of the shape modulation was determined by a staircasing procedure98, participants’ native space. The T1 and T2-weighted structural scans were co- updated after every trial to ensure sufﬁcient task difﬁculty (~75% correct). The end registered and subsequently co-registered to the mean functional scan. of each trial was signalled by replacing the ﬁxation bullseye with a single ﬁxation dot, encouraging participants to continue to ﬁxate (inter trial interval jittered between 1.25 and 4.25 s). Regions of interest. The hippocampus and its subﬁelds, CA1, CA2-3-DG, and the Experiment 1 consisted of 16 blocks of 32 trials, presented in four prediction subiculum, were deﬁned based on the structural T2 and T1 images using the runs (4 blocks per run, 30 s breaks between runs, ~12 min per run). In each block, a automatic segmentation of hippocampal subﬁelds (ASHS)99 machine learning different pair of cues were presented. For each trial number (1–32) we toolbox, in conjunction with a database of manual medial temporal lobe (MTL) counterbalanced (1) which cue was presented, (2) whether the cue was valid (75%) segmentations from a separate set of 51 participants100,101. Consistent with previous or invalid (25%), and (3) whether the two shapes were the same or different. 
studies, CA2, CA3 and DG were combined into a single region of interest (ROI) Experiment 2 consisted of 4 blocks of 128 trials (1 block per prediction run, 30 s since these subﬁelds are difﬁcult to distinguish at our functional resolution (1.5 mm break halfway, ~12 min per run), with a different pair of cues presented in each isotropic). This method also yielded an entorhinal cortex (EC) ROI for our infor- block. As in Experiment 1, cue validity was counterbalanced for every trial position, mative connectivity analysis (see below). Results of the automated segmentation were but given the smaller number of blocks, the presented cue and shape modulation inspected visually for each participant. In addition, a caudate region of interest were counterbalanced over groups of four trial positions (trials 1–4, 5–8, etc.) (ROI), as well as visual cortex ROIs for our informational connectivity analysis—V1, rather than for every trial position. This was reﬂected in the analyses by a fourfold V2 lateral occipital cortex (LO)—were automatically deﬁned in each participant’s increase in the trial averaging window; see below. T1-weighted anatomical scan using FreeSurfer (http://surfer.nmr.mgh.harvard.edu/). In both experiments, which pair of cues was assigned to which block, as well as The visual cortex ROIs were restricted to the 500 most active voxels during the which member of each pair predicted which shape, was counterbalanced across shape-only runs, to ensure that we were measuring responses in the retinotopic participants. locations corresponding to our visual stimuli. Since no clear retinotopic organization In addition to the four prediction runs, both experiments also contained two is present in the other ROIs, cross-validated feature selection was used instead (see shape-only runs, ﬂanking the prediction runs, constituting the ﬁrst and last (sixth) below). All ROIs were collapsed over the left and right hemispheres, as we had no runs of the experiments. 
In these runs (120 trials per run, ~12 min) no auditory hypotheses regarding hemispheric differences. cues were presented (Fig. 1c). As in the prediction runs, each trial started with the appearance of a ﬁxation bullseye followed 1100 ms later by two shapes (250 ms fMRI data modelling. For both experiments, the pattern of activity evoked by each, 500 ms interval). On each trial, one of the ﬁve possible shapes was presented, every single trial of the prediction runs, in each ROI, was estimated using the Least- with equal (20%) likelihood (Fig. 1d). As in the prediction runs, the participants’ Squares-Separate method102,103. That is, a separate GLM was created for every trial, task was to indicate whether the two shapes were the same or different. The size of such that each trial is modelled once as a regressor of interest, with all other trials the shape modulations was controlled by a staircase separate from that of the combined into a single nuisance regressor. Delta functions were inserted at the prediction runs, to equate task difﬁculty in these runs with ﬁve instead of two onset of the trial of interest (ﬁrst regressor) and all other trials (second regressor) possible initial shapes. The shape-only runs acted as the training data for our shape and convolved with a double-gamma hemodynamic response function (HRF) and decoding model, see below. its temporal derivative104. The voxel-wise parameter estimates for the trial-of- Before both experiments, participants completed an instruction and practice interest HRF regressor constituted the estimated BOLD activity pattern for each session to acquaint them with the task (~30 min). During practice, participants trial. This method has been shown to improve the estimation of single-trial BOLD completed 100 shape-only trials and 16 prediction trials. The pair of auditory cues responses, compared with a GLM with one regressor for each trial102. 
In addition used during the short prediction run was not included in the main experiments. to these regressors, the GLMs included nuisance regressors consisting of the head After the experiments, participants completed a short questionnaire that motion parameters resulting from spatial realignment, their derivatives, and the indicated whether or not they became aware of the predictive nature of the auditory square of these derivatives (i.e., 18 motion parameters in total). The data from the cues. The responses to both an open-ended question (“Can you tell us what the shape-only runs were analysed using a more conventional GLM, with one regressor meaning of the sounds was during the experiment?”) as well as a guided one for each of the ﬁve shapes and 18 head motion nuisance regressors. (“During every block of the experiment, two different sounds were played. These sounds predicted which shapes would appear. For instance, a series of rising tones might predict that you’ll see shape A, and falling tones might predict you’ll see Shape decoding. In order to probe neural shape representations, a forward shape B. These predictions were 75% valid, so on 25% of trials they were incorrect. modelling approach was used to decode the shapes from the patterns of BOLD Did you realise this?”) indicated that the vast majority of participants did not activity in each ROI27,105. The decoding algorithm was identical to that used in become aware of the predictions in either experiment (Experiment 1: 1 out of 22 Kok & Turk-Browne27, and will be outlined here brieﬂy (see Supplementary Fig. 1 participants indicated that they realised the cues predicted which shape would for a visual depiction). 
appear, no data for 2 participants; Experiment 2: 0 out of 22 participants indicated The shape selectivity of each voxel was characterised as a weighted sum of ﬁve that they realised the cues predicted which shape would appear, no data for 2 hypothetical channels, each with an idealised shape tuning curve (or basis participants). function), consisting of a halfwave-rectiﬁed sinusoid raised to the ﬁfth power. In the ﬁrst stage of the analysis, the parameter estimates obtained from the two shape- only runs were used to estimate the weights on the ﬁve hypothetical channels separately for each voxel, using linear regression. Speciﬁcally, let k be the number of MRI acquisition. In both experiments structural and functional MRI data were channels, m the number of voxels, and n the number of measurements (i.e., the ﬁve collected on a 3 T Siemens Prisma scanner with a 64-channel head coil at the shapes). The matrix of estimated response amplitudes for the different shapes Wellcome Centre for Human Neuroimaging (WCHN). Note that two different during the shape-only runs (Btrain, m × n) was related to the matrix of hypothetical scanners with identical speciﬁcations were used for the two experiments, for channel outputs (Ctrain, k × n) by a weight matrix (W, m × k): availability reasons. Functional images for both experiments were acquired using a T2*-weighted multiband echo-planar imaging sequence (TR = 1000 ms; TE = Btrain ¼ WCtrain þ N ð1Þ 33.0 ms; 60 transverse slices; voxel size = 1.5 × 1.5 × 1.5 mm; ﬂip angle = 55°, 10 NATURE COMMUNICATIONS | (2022)13:3294 | https://doi.org/10.1038/s41467-022-31040-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-022-31040-w ARTICLE The weight matrix was estimated by least squares estimation: With a midpoint x0 between trials 1 and 32, slope k between 0.01 and 1, and amplitude A between −1 and 1. 
These parameters were ﬁtted using Matlab’s b ¼ Btrain CTtrain ðCtrain CTtrain Þ1 W ð2Þ fmincon function, wrapped in GlobalSearch. We ran 100 iterations with random Using these weights, the second stage of analysis consisted of reconstructing the parameter starting values (within their prescribed ranges), in order to avoid local channel outputs associated with the pattern of activity across voxels evoked by each minima. The amplitude parameter was submitted to simple t-tests to test whether trial in the prediction runs (Btest), again using linear regression: learning curves signiﬁcantly deviated from zero. In Experiment 2, the ﬁtted curves consisted of a combination of two sigmoids, to test whether dynamics 1 b T WÞ b test ¼ ðW C b Wb T Btest ð3Þ changed as learning progressed: Aa Ab where C b test are the estimated channel outputs. These channel outputs were used to þ ð5Þ 1 þ eka ðxx0a Þ 1 þ ekb ðxx0b Þ compute a weighted average of the ﬁve basic functions, reﬂecting a neural shape tuning curve (Supplementary Fig. 1). Note that, during the main experiment (i.e., with slopes ka and kb between 0.01 and 1, amplitudes Aa and Ab between −1 and 1. the prediction runs), only shapes 2 and 4 were presented. Decoding performance The ﬁrst sigmoid had a midpoint x0a between trial 1 and 64, while the second had a was quantiﬁed by subtracting the amplitude of the shape tuning curve at the midpoint x0b between trial 65 and 128, allowing them to capture potential presented shape (e.g., shape 2) from the amplitude at the non-presented shape differences between the ﬁrst and second half of the blocks. Note that both sigmoids’ (shape 4). This procedure yielded a measure of decoding evidence for the presented amplitudes were free to range between −1 and 1, meaning that this analysis shape on each trial, in each ROI. imposed no priors on the signs of the curves. 
As in Experiment 1, the amplitude parameters were submitted to simple t-tests to test whether the learning curves deviated significantly from zero. Since Experiment 2 was motivated by a specific hypothesis on the nature of change of the hippocampal signal over trials (Fig. 4), we relied on these tests of the dynamics of the signal, rather than cluster-based permutation tests as in Experiment 1.

In an exploratory post hoc analysis, we also fitted two sigmoids to the data of Experiment 1. As in the analysis of Experiment 2, the first sigmoid was constrained to have a midpoint in the first half of the blocks (here, between trial 1 and 16), while the second had a midpoint in the second half (here, between trial 17 and 32), allowing them to capture potential differences between the first and second half of the blocks.

In a control analysis that made no assumptions on the shapes of the time courses, we calculated the average derivatives of the decoding time courses. For Experiment 2, this was done separately for the first (trials 1–64) and second (trials 65–128) half of the blocks, to investigate whether dynamics changed as learning progressed.

All analyses were initially performed on the hippocampus ROI as a whole, and when significant these were followed up by investigating hippocampal subfields and comparing the anterior and posterior hippocampi. This hierarchical approach, where significant effects in the ROI as a whole were followed up with tests of its subdivisions, rather than simply examining all possible comparisons, helped control the false positive rate. All statistical tests performed in this paper were two-sided.

Informational connectivity. In an exploratory analysis, we investigated whether functional connectivity between regions (specifically, between the posterior subiculum and EC, V1, V2, and LO) changed over trials in Experiment 2. Specifically, the Pearson correlation in decoding evidence over trials between two regions was calculated107, within the sliding windows described above. This analysis yielded time courses of correlation values, with a positive value indicating that whenever region A represents shape 2 (rather than shape 4), region B is likely to do so as well. Changes in informational connectivity over time were tested by comparing r values at the end of the blocks (i.e., the final window, containing trials 113–128) with the start of the blocks (the first window, containing trials 1–16), using paired-sample t-tests.

Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All region-specific fMRI time course data are available on the OSF platform (https://osf.io/48xjf/). Source data are provided with this paper.

Code availability
All analysis scripts are available on the OSF platform (https://osf.io/48xjf/).

Received: 21 September 2021; Accepted: 11 May 2022

For all ROIs, voxel selection was based on data from the shape-only runs, in which no predictions were present, to ensure voxel selection was independent of the data in which we tested our effects of interest (i.e., the prediction runs). In visual cortex ROIs, we selected the 500 most active voxels during the shape-only runs. However, the hippocampus and caudate did not show a clear evoked response to visual stimuli, as defined by a lack of significant fit of a regressor of stimulus onset times convolved with a canonical haemodynamic response to the mean hippocampal time course. Therefore, we applied a different method of voxel selection for these ROIs. Voxels were first sorted by their informativeness, that is, how different the weights for the different channels were from each other, as indexed by the standard deviation of the weights. Second, the decoding model was trained and tested on different subsets of these voxels (between 10 and 100%, in 10% increments), within the shape-only runs (trained on one run and tested on the other). For all iterations, decoding performance on shapes 2 and 4 was quantified as described above, and the number of voxels that yielded the highest performance was selected. This procedure was used for voxel selection in the hippocampus (Experiment 1: 1068 voxels selected; Experiment 2: 970 voxels; group average), CA1 (Experiment 1: 271 voxels; Experiment 2: 313 voxels), CA2-3-DG (Experiment 1: 374 voxels; Experiment 2: 433 voxels), subiculum (Experiment 1: 273 voxels; Experiment 2: 249 voxels), and caudate (Experiment 1: 1292 voxels; Experiment 2: 1249 voxels).

Quantifying time courses of shape representations. A sliding window approach was used to investigate how shape representations evolved over trials. In Experiment 1, this window consisted of 4 trial positions (i.e., trials 1–4 of all 16 blocks, followed by trials 2–5, trials 3–6, etc.), while for Experiment 2 the window was four times as wide (16 trial positions; trials 1–16 of all 4 blocks, trials 2–17, trials 3–18, etc.) to compensate for the fourfold decrease in the number of blocks (i.e., the number of trials per position). Within each window, we averaged the decoding evidence for validly and invalidly predicted shapes separately. In order to quantify evidence for the shape predicted by the cue, controlling for the actually presented shape, evidence for validly and invalidly predicted shapes was subtracted (i.e., averaging (1 - evidence) for the invalidly predicted shapes with evidence for the validly predicted shapes) (Fig. 1e, f). Finally, the decoding time courses were smoothed by averaging over a sliding window. In Experiment 1 each bin was averaged with the previous and subsequent 4 bins, yielding a window size of 9 bins. In Experiment 2 the window size was 33 bins, containing the previous and subsequent 16 bins. Note that the results presented here do not critically depend on these parameters, as qualitatively identical effects were present when the length of the sliding window was doubled and subsequent smoothing was omitted. In the current study, analysing time courses without applying either a sliding window or temporal smoothing was not feasible, as fMRI responses to individual trials are not sufficiently robust. Future work could potentially address this by conducting multiple (e.g., four or more) fMRI sessions per participant, increasing the amount of data per trial position.

Initially, in Experiment 1, in a fully assumption-free analysis, we performed non-parametric cluster-based permutation tests106 on the time courses, to test whether the decoding signals differed significantly from zero at any timepoint. Specifically, univariate t statistics were calculated for all timepoints, and neighbouring elements that passed a threshold value corresponding to a p value of 0.05 (two-tailed) were collected into clusters. Cluster-level test statistics consisted of the sum of t values within each cluster, which were compared to a null distribution created by drawing 10,000 random permutations of the observed data. A cluster was considered significant when its p value was below 0.05 (i.e., a cluster of its size occurred in fewer than 5% of the null distribution clusters).
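The sliding-window averaging and the cluster-based permutation test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code (their scripts are on OSF). All function names are hypothetical. The sketch assumes one decoding time course per participant, builds the null distribution by randomly flipping the sign of each participant's time course (one common way to permute data in a one-sample test against zero), and compares each observed cluster with the largest cluster of every permutation, following the general approach of ref. 106. The default threshold of 2.069 is the two-tailed critical t value at p = 0.05 for 23 degrees of freedom (N = 24).

```python
import math
import random
from statistics import mean, stdev


def sliding_window_mean(values, width):
    """Average consecutive windows of `width` trial positions (1-4, 2-5, ...)."""
    return [mean(values[i:i + width]) for i in range(len(values) - width + 1)]


def t_values(data):
    """One-sample t statistic against zero at every timepoint.

    `data` holds one time course (list of floats) per participant.
    """
    n = len(data)
    tvals = []
    for column in zip(*data):
        sd = stdev(column)
        tvals.append(mean(column) / (sd / math.sqrt(n)) if sd > 0 else 0.0)
    return tvals


def cluster_sums(tvals, t_crit):
    """Sum t values over runs of neighbouring supra-threshold timepoints."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = 1 if t > t_crit else -1 if t < -t_crit else 0
        if s != 0 and s == sign:
            current += t          # extend the running cluster
        else:
            if sign != 0:
                sums.append(current)  # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums


def cluster_permutation_test(data, t_crit=2.069, n_perm=10000, seed=0):
    """Return (cluster_sum, p_value) for every observed cluster.

    Assumption: the null distribution is built by sign-flipping whole
    participants and keeping the largest absolute cluster sum per permutation.
    """
    rng = random.Random(seed)
    observed = cluster_sums(t_values(data), t_crit)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in data:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * v for v in row])
        sums = cluster_sums(t_values(flipped), t_crit)
        null_max.append(max((abs(s) for s in sums), default=0.0))
    results = []
    for c in observed:
        p = sum(m >= abs(c) for m in null_max) / n_perm
        results.append((c, p))
    return results
```

Calling `cluster_permutation_test(data)` on a list of 24 per-participant time courses returns one `(cluster_sum, p_value)` pair per observed cluster; clusters with a p value below 0.05 would be reported as significant.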
These non-parametric tests were specifically conceived to test effects in data with non-zero independence across time (and space), by generating null distributions with the same smoothness as the original data106.

Subsequently, the obtained time courses of decoding evidence for the predicted shapes were quantified by fitting sigmoid curves to them. In Experiment 1, this consisted of a single sigmoid:

$$\frac{A}{1 + e^{-k(x - x_0)}} \qquad (4)$$

References
1. De Lange, F. P., Heilbron, M. & Kok, P. How do expectations shape perception? Trends Cogn. Sci. 22, 764–779 (2018).
2. Cohen, N. J. & Eichenbaum, H. Memory, Amnesia, and the Hippocampal System (The MIT Press, 1993).
3. Davachi, L. Item, context and relational episodic encoding in humans. Curr. Opin. Neurobiol. 16, 693–700 (2006).
4. Turk-Browne, N. B., Scholl, B. J., Chun, M. M. & Johnson, M. K. Neural evidence of statistical learning: efficient detection of visual regularities without awareness. J. Cogn. Neurosci. 21, 1934–1945 (2009).
5. Schapiro, A. C., Kustner, L. V. & Turk-Browne, N. B. Shaping of object representations in the human medial temporal lobe based on temporal regularities. Curr. Biol. 22, 1622–1627 (2012).
6. Davachi, L. & DuBrow, S. How the hippocampus preserves order: the role of prediction and context. Trends Cogn. Sci. 19, 92–99 (2015).
7. Garvert, M. M., Dolan, R. J. & Behrens, T. E. A map of abstract relational knowledge in the human hippocampal–entorhinal cortex. eLife 6, e17086 (2017).
8. Spaak, E. & De Lange, F. P. Hippocampal and prefrontal theta-band mechanisms underpin implicit spatial context learning. J. Neurosci. 40, 191–202 (2020).
9. Henin, S. et al. Learning hierarchical sequence representations across human cortex and hippocampus. Sci. Adv. 7, eabc4530 (2021).
10. Solomon, P. R., Vander Schaaf, E. R., Thompson, R. F. & Weisz, D. J. Hippocampus and trace conditioning of the rabbit’s classically conditioned nictitating membrane response. Behav. Neurosci. 100, 729–744 (1986).
11. Wallenstein, G. V., Hasselmo, M. E. & Eichenbaum, H. The hippocampus as an associator of discontiguous events. Trends Neurosci. 21, 317–323 (1998).
12. Staresina, B. P. & Davachi, L. Mind the gap: binding experiences across space and time in the human hippocampus. Neuron 63, 267–276 (2009).
13. Sutherland, R. J., McDonald, R. J., Hill, C. R. & Rudy, J. W. Damage to the hippocampal formation in rats selectively impairs the ability to learn cue relationships. Behav. Neural Biol. 52, 331–356 (1989).
14. Chun, M. M. & Phelps, E. A. Memory deficits for implicit contextual information in amnesic subjects with hippocampal damage. Nat. Neurosci. 2, 844–847 (1999).
15. Hannula, D. E., Tranel, D. & Cohen, N. J. The long and the short of it: relational memory impairments in amnesia, even at short lags. J. Neurosci. 26, 8352–8359 (2006).
16. Konkel, A., Warren, D. E., Duff, M. C., Tranel, D. & Cohen, N. J. Hippocampal amnesia impairs all manner of relational memory. Front. Hum. Neurosci. 2, 15 (2008).
17. Schapiro, A. C., Gregory, E., Landau, B., McCloskey, M. & Turk-Browne, N. B. The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26, 1736–1747 (2014).
18. Finnie, P. S. B., Komorowski, R. W. & Bear, M. F. The spatiotemporal organization of experience dictates hippocampal involvement in primary visual cortical plasticity. Curr. Biol. 31, 3996–4008.e6 (2021).
19. Stachenfeld, K. L., Botvinick, M. M. & Gershman, S. J. The hippocampus as a predictive map. Nat. Neurosci. 20, 1643–1653 (2017).
20. Whittington, J. C. R. et al. The Tolman-Eichenbaum machine: unifying space and relational memory through generalization in the hippocampal formation. Cell 183, 1249–1263.e23 (2020).
21. Treves, A. & Rolls, E. T. Computational analysis of the role of the hippocampus in memory. Hippocampus 4, 374–391 (1994).
22. McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457 (1995).
23. Henke, K. A model for memory systems based on processing modes rather than consciousness. Nat. Rev. Neurosci. 11, 523–532 (2010).
24. Eichenbaum, H. & Fortin, N. J. The neurobiology of memory based predictions. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 364, 1183–1191 (2009).
25. Turk-Browne, N. B., Scholl, B. J., Johnson, M. K. & Chun, M. M. Implicit perceptual anticipation triggered by statistical learning. J. Neurosci. 30, 11177–11187 (2010).
26. Hindy, N. C., Ng, F. Y. & Turk-Browne, N. B. Linking pattern completion in the hippocampus to predictive coding in visual cortex. Nat. Neurosci. 19, 665–667 (2016).
27. Kok, P. & Turk-Browne, N. B. Associative prediction of visual shape in the hippocampus. J. Neurosci. 38, 6888–6899 (2018).
28. Barron, H. C., Auksztulewicz, R. & Friston, K. Prediction and memory: a predictive coding account. Prog. Neurobiol. 192, 101821 (2020).
29. Brunec, I. K., Robin, J., Olsen, R. K., Moscovitch, M. & Barense, M. D. Integration and differentiation of hippocampal memory traces. Neurosci. Biobehav. Rev. 118, 196–208 (2020).
30. Lee, H., GoodSmith, D. & Knierim, J. J. Parallel processing streams in the hippocampus. Curr. Opin. Neurobiol. 64, 127–134 (2020).
31. Sinclair, A. H. & Barense, M. D. Prediction error and memory reactivation: how incomplete reminders drive reconsolidation. Trends Neurosci. 42, 727–739 (2019).
32. Kumaran, D. & Maguire, E. A. An unexpected sequence of events: mismatch detection in the human hippocampus. PLoS Biol. 4, e424 (2006).
33. Axmacher, N. et al. Intracranial EEG correlates of expectancy and memory formation in the human hippocampus and nucleus accumbens. Neuron 65, 541–549 (2010).
34. Chen, J., Olsen, R. K., Preston, A. R., Glover, G. H. & Wagner, A. D. Associative retrieval processes in the human medial temporal lobe: Hippocampal retrieval success and CA1 mismatch detection. Learn. Mem. 18, 523–528 (2011).
35. Chen, J., Cook, P. A. & Wagner, A. D. Prediction strength modulates responses in human area CA1 to sequence violations. J. Neurophysiol. 114, 1227–1238 (2015).
36. Duncan, K., Ketz, N., Inati, S. J. & Davachi, L. Evidence for area CA1 as a match/mismatch detector: a high-resolution fMRI study of the human hippocampus. Hippocampus 22, 389–398 (2012).
37. Long, N. M., Lee, H. & Kuhl, B. A. Hippocampal mismatch signals are modulated by the strength of neural predictions and their similarity to outcomes. J. Neurosci. 36, 12677–12687 (2016).
38. Barron, H. C. et al. Neuronal computation underlying inferential reasoning in humans and mice. Cell 183, 228–243.e21 (2020).
39. Strange, B. A. & Dolan, R. J. Adaptive anterior hippocampal responses to oddball stimuli. Hippocampus 11, 690–698 (2001).
40. Kumaran, D. & Maguire, E. A. Novelty signals: a window into hippocampal information processing. Trends Cogn. Sci. 13, 47–54 (2009).
41. Liu, K., Sibille, J. & Dragoi, G. Generative predictive codes by multiplexed hippocampal neuronal tuplets. Neuron 99, 1329–1341.e6 (2018).
42. Sinclair, A. H., Manalili, G. M., Brunec, I. K., Adcock, R. A. & Barense, M. D. Prediction errors disrupt hippocampal representations and update episodic memories. Proc. Natl Acad. Sci. USA 118, e2117625118 (2021).
43. Bar, M. The proactive brain: using analogies and associations to generate predictions. Trends Cogn. Sci. 11, 280–289 (2007).
44. Aitken, F., Turner, G. & Kok, P. Prior expectations of motion direction modulate early sensory processing. J. Neurosci. 40, 6389–6397 (2020).
45. Bein, O., Duncan, K. & Davachi, L. Mnemonic prediction errors bias hippocampal states. Nat. Commun. 11, 3451 (2020).
46. Frank, D., Montemurro, M. A. & Montaldi, D. Pattern separation underpins expectation-modulated memory. J. Neurosci. 40, 3455–3464 (2020).
47. Duncan, K., Sadanand, A. & Davachi, L. Memory’s penumbra: episodic memory decisions induce lingering mnemonic biases. Science 337, 485–487 (2012).
48. Hasselmo, M. E., Wyble, B. P. & Wallenstein, G. V. Encoding and retrieval of episodic memories: role of cholinergic and GABAergic modulation in the hippocampus. Hippocampus 6, 693–708 (1996).
49. Hasselmo, M. E., Bodelón, C. & Wyble, B. P. A proposed function for hippocampal theta rhythm: separate phases of encoding and retrieval enhance reversal of prior learning. Neural Comput. 14, 793–817 (2002).
50. Giovannini, M. G. et al. Effects of novelty and habituation on acetylcholine, GABA, and glutamate release from the frontal cortex and hippocampus of freely moving rats. Neuroscience 106, 43–53 (2001).
51. Gu, Q. Neuromodulatory transmitter systems in the cortex and their role in cortical plasticity. Neuroscience 111, 815–835 (2002).
52. Yu, A. J. & Dayan, P. Uncertainty, neuromodulation, and attention. Neuron 46, 681–692 (2005).
53. Rizzuto, D. S., Madsen, J. R., Bromfield, E. B., Schulze-Bonhage, A. & Kahana, M. J. Human neocortical oscillations exhibit theta phase differences between encoding and retrieval. NeuroImage 31, 1352–1358 (2006).
54. Manns, J. R., Zilli, E. A., Ong, K. C., Hasselmo, M. E. & Eichenbaum, H. Hippocampal CA1 spiking during encoding and retrieval: relation to theta phase. Neurobiol. Learn. Mem. 87, 9–20 (2007).
55. Douchamps, V., Jeewajee, A., Blundell, P., Burgess, N. & Lever, C. Evidence for encoding versus retrieval scheduling in the hippocampus by theta phase and acetylcholine. J. Neurosci. 33, 8689–8704 (2013).
56. Kok, P., Rait, L. I. & Turk-Browne, N. B. Content-based dissociation of hippocampal involvement in prediction. J. Cogn. Neurosci. 32, 527–545 (2020).
57. Collin, S. H. P., Milivojevic, B. & Doeller, C. F. Memory hierarchies map onto the hippocampal long axis in humans. Nat. Neurosci. 18, 1562–1564 (2015).
58. Bone, M. B. & Buchsbaum, B. R. Detailed episodic memory depends on concurrent reactivation of basic visual features within the posterior hippocampus and early visual cortex. Cereb. Cortex Commun. https://doi.org/10.1093/texcom/tgab045 (2021).
59. Poppenk, J., Evensmoen, H. R., Moscovitch, M. & Nadel, L. Long-axis specialization of the human hippocampus. Trends Cogn. Sci. 17, 230–240 (2013).
60. Den Ouden, H. E. M., Friston, K. J., Daw, N. D., McIntosh, A. R. & Stephan, K. E. A dual role for prediction error in associative learning. Cereb. Cortex 19, 1175–1185 (2009).
61. Lavenex, P. & Amaral, D. G. Hippocampal-neocortical interaction: a hierarchy of associativity. Hippocampus 10, 420–430 (2000).
62. Roy, D. S. et al. Distinct neural circuits for the formation and retrieval of episodic memories. Cell 170, 1000–1012.e19 (2017).
63. Hindy, N. C., Avery, E. W. & Turk-Browne, N. B. Hippocampal-neocortical interactions sharpen over time for predictive actions. Nat. Commun. 10, 1–13 (2019).
64. Günseli, E. & Aly, M. Preparation for upcoming attentional states in the hippocampus and medial prefrontal cortex. eLife 9, e53191 (2020).
65. Schapiro, A. C., Turk-Browne, N. B., Botvinick, M. M. & Norman, K. A. Complementary learning systems within the hippocampus: a neural network modelling approach to reconciling episodic memory with statistical learning. Philos. Trans. R. Soc. B 372, 20160049 (2017).
66. Grande, X. et al. Holistic recollection via pattern completion involves hippocampal subfield CA3. J. Neurosci. 39, 8100–8111 (2019).
67. Lisman, J. E. & Grace, A. A. The hippocampal-VTA loop: controlling the entry of information into long-term memory. Neuron 46, 703–713 (2005).
68. Press, C., Kok, P. & Yon, D. The perceptual prediction paradox. Trends Cogn. Sci. 24, 13–24 (2020).
69. Hasselmo, M. E. & Schnell, E. Laminar selectivity of the cholinergic suppression of synaptic transmission in rat hippocampal region CA1: computational modeling and brain slice physiology. J. Neurosci. 14, 3898–3914 (1994).
70. Hasselmo, M. E., Schnell, E. & Barkai, E. Dynamics of learning and recall at excitatory recurrent synapses and cholinergic modulation in rat hippocampal region CA3. J. Neurosci. 15, 5249–5262 (1995).
71. Meeter, M., Murre, J. M. J. & Talamini, L. M. Mode shifting between storage and recall based on novelty detection in oscillating hippocampal circuits. Hippocampus 14, 722–741 (2004).
72. Feldman, H. & Friston, K. J. Attention, uncertainty, and free-energy. Front. Hum. Neurosci. 4, 215 (2010).
73. Kok, P., Rahnev, D., Jehee, J. F. M., Lau, H. C. & De Lange, F. P. Attention reverses the effect of prediction in silencing sensory signals. Cereb. Cortex 22, 2197–2206 (2012).
74. Jiang, J., Summerfield, C. & Egner, T. Attention sharpens the distinction between expected and unexpected percepts in the visual brain. J. Neurosci. 33, 18438–18447 (2013).
75. Kok, P., Mostert, P. & De Lange, F. P. Prior expectations induce prestimulus sensory templates. Proc. Natl Acad. Sci. USA 114, 10473–10478 (2017).
76. Aitken, F. et al. Prior expectations evoke stimulus-specific activity in the deep layers of the primary visual cortex. PLoS Biol. 18, e3001023 (2020).
77. Bosch, S. E., Jehee, J. F. M., Fernandez, G. & Doeller, C. F. Reinstatement of associative memories in early visual cortex is signaled by the hippocampus. J. Neurosci. 34, 7493–7500 (2014).
78. Gordon, A. M., Rissman, J., Kiani, R. & Wagner, A. D. Cortical reinstatement mediates the relationship between content-specific encoding activity and subsequent recollection decisions. Cereb. Cortex 24, 3350–3364 (2014).
79. Horner, A. J., Bisby, J. A., Bush, D., Lin, W.-J. & Burgess, N. Evidence for holistic episodic recollection via hippocampal pattern completion. Nat. Commun. 6, 7462 (2015).
80. Staresina, B. P. & Wimber, M. A neural chronometry of memory recall. Trends Cogn. Sci. 23, 1071–1085 (2019).
81. Makino, H. & Komiyama, T. Learning enhances the relative impact of top-down processing in the visual cortex. Nat. Neurosci. 18, 1116–1122 (2015).
82. Staresina, B. P. et al. Recollection in the human hippocampal-entorhinal cell circuitry. Nat. Commun. 10, 1503 (2019).
83. Lawrence, S. J. D., Formisano, E., Muckli, L. & De Lange, F. P. Laminar fMRI: Applications for cognitive neuroscience. NeuroImage 197, 785–791 (2019).
84. Maass, A. et al. Laminar activity in the hippocampus and entorhinal cortex related to novelty and episodic encoding. Nat. Commun. 5, 5547 (2014).
85. Turk-Browne, N. B. The hippocampus as a visual area organized by space and time: a spatiotemporal similarity hypothesis. Vis. Res. 165, 123–130 (2019).
86. O’Reilly, R. C. & Rudy, J. W. Computational principles of learning in the neocortex and hippocampus. Hippocampus 10, 389–397 (2000).
87. Zeidman, P. & Maguire, E. A. Anterior hippocampus: the anatomy of perception, imagination and episodic memory. Nat. Rev. Neurosci. 17, 173–182 (2016).
88. Cooper, R. A. & Ritchey, M. Progression from feature-specific brain activity to hippocampal binding during episodic encoding. J. Neurosci. 40, 1701–1709 (2020).
89. McCormick, C., Dalton, M. A., Zeidman, P. & Maguire, E. A. Characterising the hippocampal response to perception, construction and complexity. Cortex 137, 1–17 (2021).
90. Schmack, K., Bosc, M., Ott, T., Sturgill, J. F. & Kepecs, A. Striatal dopamine mediates hallucination-like perception in mice. Science 372, eabf4740 (2021).
91. Poldrack, R. A. et al. Interactive memory systems in the human brain. Nature 414, 546–550 (2001).
92. Shohamy, D. & Turk-Browne, N. B. Mechanisms for widespread hippocampal involvement in cognition. J. Exp. Psychol.: Gen. 142, 1159–1170 (2013).
93. Wammes, J., Norman, K. A. & Turk-Browne, N. Increasing stimulus similarity drives nonmonotonic representational change in hippocampus. eLife 11, e68344 (2022).
94. Press, C., Kok, P. & Yon, D. Learning to perceive and perceiving to learn. Trends Cogn. Sci. 24, 260–261 (2020).
95. Brainard, D. H. The psychophysics toolbox. Spat. Vis. 10, 433–436 (1997).
96. Zahn, C. T. & Roskies, R. Z. Fourier descriptors for plane closed curves. IEEE Trans. Computers C-21, 269–281 (1972).
97. Op de Beeck, H., Wagemans, J. & Vogels, R. Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Nat. Neurosci. 4, 1244–1252 (2001).
98. Watson, A. B. & Pelli, D. G. Quest: a Bayesian adaptive psychometric method. Percept. Psychophys. 33, 113–120 (1983).
99. Yushkevich, P. A. et al. Automated volumetry and regional thickness analysis of hippocampal subfields and medial temporal cortical structures in mild cognitive impairment: automatic morphometry of MTL subfields in MCI. Hum. Brain Mapp. 36, 258–287 (2015).
100. Aly, M. & Turk-Browne, N. B. Attention stabilizes representations in the human hippocampus. Cereb. Cortex 26, 783–796 (2016).
101. Aly, M. & Turk-Browne, N. B. Attention promotes episodic encoding by stabilizing hippocampal representations. Proc. Natl Acad. Sci. USA 113, E420–E429 (2016).
102. Mumford, J. A., Turner, B. O., Ashby, F. G. & Poldrack, R. A. Deconvolving BOLD activation in event-related designs for multivoxel pattern classification analyses. NeuroImage 59, 2636–2643 (2012).
103. St. John-Saaltink, E., Kok, P., Lau, H. C. & De Lange, F. P. Serial dependence in perceptual decisions is reflected in activity patterns in primary visual cortex. J. Neurosci. 36, 6186–6192 (2016).
104. Friston, K. J. et al. Event-related fMRI: characterizing differential responses. NeuroImage 7, 30–40 (1998).
105. Brouwer, G. J. & Heeger, D. J. Decoding and reconstructing color from responses in human visual cortex. J. Neurosci. 29, 13992 (2009).
106. Maris, E. & Oostenveld, R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods 164, 177–190 (2007).
107. Koster, R. et al. Big-loop recurrence within the hippocampal system supports integration of information across episodes. Neuron 99, 1342–1354 (2018).

Acknowledgements
The authors would like to thank Patricia Andrea Cabiles, Victoire Martignac and Ellis Langford for assistance with data collection, and Anna Schapiro for helpful discussion of these findings. This work was supported by a Wellcome/Royal Society Sir Henry Dale Fellowship [218535/Z/19/Z] and a European Research Council (ERC) Starting Grant [948548] to P.K. The Wellcome Centre for Human Neuroimaging is supported by core funding from the Wellcome Trust [203147/Z/16/Z].

Author contributions
P.K. designed the study; F.A. collected the data; F.A. and P.K. analysed the data and wrote the manuscript.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41467-022-31040-w.

Correspondence and requests for materials should be addressed to Peter Kok.

Peer review information Nature Communications thanks Brice Kuhl and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Reprints and permission information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2022
