Attention, Perception, & Psychophysics
2009, 71 (2), 352-362
doi:10.3758/APP.71.2.352
© 2009 The Psychonomic Society, Inc.

Auditory event files: Integrating auditory perception and action planning

SHARON ZMIGROD AND BERNHARD HOMMEL
Leiden University, Leiden, The Netherlands

The features of perceived objects are processed in distinct neural pathways, which call for mechanisms that integrate the distributed information into coherent representations (the binding problem). Recent studies of sequential effects have demonstrated feature binding not only in perception, but also across (visual) perception and action planning. We investigated whether comparable effects can be obtained in and across auditory perception and action. The results from two experiments revealed effects indicative of spontaneous integration of auditory features (pitch and loudness, pitch and location), as well as evidence for audio–manual stimulus–response integration. Even though integration takes place spontaneously, features related to task-relevant stimulus or response dimensions are more likely to be integrated. Moreover, integration seems to follow a temporal overlap principle, with features coded close in time being more likely to be bound together. Taken together, the findings are consistent with the idea of episodic event files integrating perception and action plans.

The perceived features of visual (Zeki & Bartels, 1999) and auditory (Kaas & Hackett, 1999; Lee & Winer, 2005; Wessinger et al., 2001) objects are processed in distinct neural pathways, which calls for processes that integrate this distributed information into coherent representations. This so-called binding problem and the mechanisms solving it have been studied extensively in recent years (e.g., Allport, Tipper, & Chmiel, 1985; Hall, Pastore, Acker, & Huang, 2000; Hommel, 2004; Treisman & Gelade, 1980). One of the leading theories in this field, Treisman's feature integration theory (FIT), holds that primary visual features are processed in parallel and represented in separate feature maps. Through spatial selection via a master map of locations, an episodic representation is created: an object file, which is updated as the object changes and can be addressed by location (Kahneman, Treisman, & Gibbs, 1992; Treisman, 1990; Treisman & Gelade, 1980).

Hommel (1998, 2004, 2005) extended Treisman's object file concept to include not only stimulus features, but also response-related feature information. A number of studies have provided evidence for this extension. In these studies, participants carried out two responses in a row. First, they were cued by a response cue signaling the first response, which, however, was carried out only after a visual trigger stimulus was presented. After 1 sec, another visual stimulus appeared, and the participants had to perform a binary-choice response to one of its features. As was expected, main effects of stimulus feature repetition were obtained. But more interestingly, stimulus and response repetition effects interacted: Repeating a stimulus feature sped up reaction time (RT) only if the response also repeated, whereas stimulus feature repetition slowed down RT if the response alternated. Apparently, stimulus features were bound to response features, so that repeating one retrieved the other. This created conflict in partial repetition trials, that is, when the retrieved stimulus or response feature did not match the present one. Hence, facing a particular combination of stimulus and response features seems to create a multimodal event file (Hommel, 1998, 2004), which is retrieved if at least one of the features it includes is encountered again.

The existing theories in feature integration have been based largely on experiments using visual information, but it makes sense to assume that feature integration takes place in auditory perception as well. The auditory system allows us to perceive events based on the sound produced by them. And yet, an acoustic event is commonly made up of several features, among them pitch, timbre, loudness, and spatial position. Numerous studies have examined how these features are perceived; however, in everyday life, we do not perceive features in isolation but, rather, perceive coherent, integrated acoustic events. Given that these features are processed in different areas of the auditory cortex (Kaas & Hackett, 1999; Wessinger et al., 2001), there should be a mechanism that integrates the auditory features into a coherent acoustic perception. Indeed, there is preliminary evidence for the existence of auditory binding. For instance, Hall et al. (2000) examined auditory feature integration of spatially distributed musical tones by having participants search for either a cued conjunction of pitch and timbre or a single cued value (pitch or timbre) in arrays of simultaneous tones in different lateralized positions. Their findings revealed more frequent illusory conjunctions when pitch and timbre features were separately presented, suggesting that, like the visual system, the auditory system differentiates the auditory features from the sound field and then integrates them according to their source. The investigators concluded that the auditory system binds its features with reference to their location, just as FIT (Treisman & Gelade, 1980) assumes to occur for the visual system. In addition, Leboe, Mondor, and Leboe (2006), who investigated different sources of auditory negative priming effects, found that repeated sounds in opposite locations were categorized more slowly than repeated sounds in the same location. At the interface of auditory perception and action, Mondor, Hurlburt, and Thorne (2003) found interactions between pitch and response repetition effects, which may indicate the integration of sound features and action.

Another important research question that has been addressed concerns the role of attention in auditory feature binding. Previous studies have shown contradictory evidence. Hall et al. (2000) suggested that reliable integration of auditory features might require focused attention in order to avoid illusory feature conjunctions when multiple sounds exist. However, this suggestion is inconsistent with recent findings of Takegata et al. (2005). They conducted an EEG study in which participants performed a visual working memory task while ignoring a background of two sounds. The two sounds, varying in timbre and pitch, were played simultaneously. Regardless of the task load, the pitch–timbre combinations elicited similar amplitudes and latencies in the ERP component mismatch negativity. According to the investigators, these results provided evidence that feature integration in the auditory modality can occur without focused attention. In line with this view, Hommel (2005) demonstrated that even irrelevant visual stimuli may be bound to a response.

Although there is ample evidence for the existence of event files in and across visual perception and action planning, the event file concept has not been systematically applied to auditory perception and action planning. Only a few studies have examined the binding mechanism in the auditory modality, and there is contradictory evidence regarding the role of attention in this mechanism. The aim of the present study was to investigate the feature binding mechanism in and across auditory perception and action planning. More specifically, we addressed three research issues: whether evidence for feature integration in a standard object file can be observed for different auditory dimensions; whether evidence for stimulus–response integration effects can be obtained between the auditory modality and action planning; and whether these integration effects rely on or are mediated by attention.

EXPERIMENT 1

Experiment 1 was performed to determine whether auditory features are integrated into a coherent object representation and whether response-related features are also integrated with auditory features to produce an event file similar to Hommel's (1998, 2004) findings in the visual domain. The task followed Hommel's (1998) design, except that the stimuli were pure tone sounds. Participants were cued to prepare a response (left or right mouse button click), which they carried out (R1) after the first stimulus (S1). One second later, the second sound (S2) was played, and the participants had to respond to the value of the relevant auditory feature by carrying out response R2 (left or right mouse button click; see Figure 1).

The auditory features that were chosen for this experiment were pitch and loudness. Neuhoff, Kramer, and Wayand (2002) demonstrated that pitch and loudness have an interactive effect; that is, changes in one of these dimensions influence the other. On the basis of these results and the object file concept, we hypothesized that pitch and loudness features of S1 are integrated and are still bound when S2 is processed. If so, repeating the feature in one dimension should produce better performance if the feature in the other dimension is also repeated, whereas alternating the feature in one dimension should produce better performance if the feature in the other dimension is also alternated. In addition, we hypothesized that the features making up S1 are integrated with R1 and are still bound to it when S2 is responded to, on the basis of the suggested event-file mechanism that posits that a specific combination of stimulus and response creates an episodic trace that is retrieved in case of any feature repetition (Hommel, 1998, 2004). If so, a response to S2 should be better with a complete match or a complete mismatch between the previous response and a given auditory feature than with partial matches. Moreover, previous observations showed that pitch repetition interacts with response repetition (Mondor et al., 2003). To investigate the role of attention in auditory feature integration, we manipulated the feature that was relevant for responding to S2. In one block of trials, only one of the two auditory features (pitch or loudness) was relevant, whereas in another block the other auditory feature was relevant. The task relevance of S2 features (and the amount of attention consequently devoted to them) has been shown to affect the size of integration-related effects with visual stimuli (e.g., Hommel, 1998), and we were interested to see whether it would also modify such effects with auditory stimuli.

[Figure 1 timeline, in temporal order: ITI (1,000 msec); R1 cue (1,500 msec); blank (1,000 msec); S1 (50 msec) → R1; blank (950 msec); S2 (50 msec; wait ≤ 2,000 msec) → R2.]
Figure 1. Sequence of events in the experiments. A response cue signaled a left or right mouse button click (R1) that was to be delayed until presentation of the first stimulus (S1). S2 appeared 1,000 msec later. S2 signaled R2, a speeded left or right mouse button click, according to the task.

Method

Participants. Fourteen participants were recruited by advertisement for this experiment and were paid or received a course credit for a 40-min session. Two participants were excluded from the analysis due to a high error rate (around the chance level of 50%) and very long RTs in the pitch task, reflecting their difficulty in identifying low versus high pitch (see Neuhoff, Knight, & Wayand, 2002). The remaining 12 participants (4 of them male; mean age, 23 years; range, 18–38 years) reported not having any known hearing problem. The participants were naive as to the purpose of the experiment.

Apparatus and Stimuli. The experiment was controlled by a Targa Pentium 3, attached to a Targa TM 1769-A 17-in. monitor. The participants faced the monitor at a distance of about 60 cm. The loudspeakers were located on both sides of the screen (approximately 25°) at a distance of 70 cm. The stimuli S1 and S2 were composed from two pure tones of 1000 and 3000 Hz with a duration of 50 msec, each presented at either 65 or 75 dB SPL. Visual response cues were presented in the middle of the screen (see Figure 1), with a right or left arrow indicating a right or left response (R1), respectively. Responses were made by clicking on the left or the right mouse button with the index and middle fingers of the dominant hand.

Procedure and Design. The experiment was composed of two sessions: In one session, pitch was the relevant dimension for the task, and the participants had to respond to whether pitch was high or low; in the other session, loudness was the relevant dimension for the task, and the participants had to respond to whether loudness was high or low. The sessions were counterbalanced between participants. Each session contained a practice block with 10 practice trials and an experimental block with 128 experimental trials. The order of the trials was randomized. The participants had to carry out two responses per trial. R1 was a simple reaction with a left or right mouse click, as indicated by the direction of an arrow in the response cue. It had to be carried out as soon as S1 appeared, regardless of its pitch or its loudness. R2 was a binary-choice reaction to S2. In the pitch-relevant session, half of the participants responded to the high pitch (3000 Hz) and the low pitch (1000 Hz) by pressing on the left and right mouse buttons, respectively, whereas the other half received the opposite mapping. In the loudness-relevant task, half of the participants responded to the loud sound (75 dB SPL) and to the soft sound (65 dB SPL) by pressing on the left and right mouse buttons, respectively, whereas the other half received the opposite mapping. The participants were asked to respond as quickly and accurately as possible.

The sequence of events in each trial is shown in Figure 1. A response cue with a right or left arrow visually presented for 1,500 msec signaled response R1, which was to be carried out after S1 had been played. S2 was played 1 sec after the response to S1, with the pitch (in the pitch session) or loudness (in the loudness session) signaling the second response (R2). In case of incorrect or absent responses, an error message was presented. R2 speed (RT) and accuracy (percentage of errors, or PE) were analyzed for all the trials with correct R1 responses as a function of session (pitch/loudness), repetition versus alternation of the response, and repetition versus alternation of the stimulus dimensions pitch and loudness.

Results

Trials with incorrect R1 responses (1.7%), as well as missing or anticipatory (RT < 100 msec) R2 responses (0.7%), were excluded from analysis. The mean RT for R1 was 270 msec (SD = 88). From the remaining data, mean RTs and PEs for R2 were analyzed as a function of the four variables: the task-relevant stimulus feature (loudness vs. pitch), or task for short; the relationship between the responses R1 and R2 (alternation vs. repetition); the relationship between S1 and S2 on the pitch dimension (alternation vs. repetition); and the relationship between S1 and S2 on the loudness dimension (alternation vs. repetition). ANOVAs were performed by using a four-way design for repeated measures. Table 1 provides an overview of the RT and PE means obtained for R2 performance.

Table 1
Experiment 1: Means and Standard Errors of Mean Reaction Times (RTs, in Milliseconds) and Percentages of Errors (PEs) for Responses to the Second Stimulus As a Function of the Attended Dimension, the Relationship Between the Stimuli (Repetition vs. Alternation), and the Relationship Between the Responses (Repetition vs. Alternation)

                                 Response Repeated            Response Alternated
Attended    Stimulus Feature     RT           PE              RT           PE
Dimension   Repeated             M     SE     M      SE       M     SE     M      SE
Loudness    Neither              553   41     8.8    3.0      472   31     4.1    1.9
            Loudness             557   24     12.4   2.8      553   36     9.8    3.7
            Pitch                553   31     11.0   4.5      517   27     3.8    2.6
            Both                 486   32     6.3    3.2      541   35     15.8   4.9
Pitch       Neither              574   41     11.3   2.2      502   39     5.6    2.3
            Loudness             564   38     14.6   3.9      521   40     8.2    2.7
            Pitch                548   42     19.1   5.3      604   38     18.4   5.5
            Both                 507   42     7.9    2.7      545   45     21.6   4.5

First, we will report findings of lesser theoretical importance. The analysis yielded a main effect of pitch in PEs [F(1,11) = 5.22, p < .05], with higher error rates for pitch repetition than for alternation. This effect was further modified by task [F(1,11) = 8.54, p < .05], indicating that it was more pronounced in the pitch task [F(1,11) = 11.13, p < .01] than in the loudness task (F < 1). Similarly, an interaction between loudness and task in PEs was obtained [F(1,11) = 5.28, p < .05], which was also more pronounced in the pitch task [F(1,11) = 7.42, p < .05] than in the loudness task (F < 1).

Second, we will address the stimulus integration effect by examining the interactions between repetition and alternation of the stimulus features: There was an interaction between pitch repetition (vs. alternation) and loudness repetition [F(1,11) = 11.07, p < .01], indicating that, with pitch repetition, performance was quicker if loudness was also repeated than if loudness alternated, whereas with pitch alternation, performance was quicker if loudness alternated than if it was repeated (see Figure 2). This result provides support for auditory feature integration between pitch and loudness.

[Figure 2 plot: reaction time (msec; axis 400 to 600) against pitch (repeated vs. alternated), with separate lines for loudness repeated and loudness alternated.]
Figure 2. Reaction times in Experiment 1 as a function of repetition versus alternation of pitch and loudness.

Third, we will consider stimulus–response integration effects by examining the interactions between repetition and alternation of the response and the stimulus features. There were interactions between response repetition and pitch repetition in RTs [F(1,11) = 42.45, p < .0001] and PEs [F(1,11) = 8.90, p < .05], showing that response repetition facilitates performance if the pitch repeats but impairs performance if the pitch alternates. Furthermore, there was an interaction between response repetition and loudness repetition in RTs [F(1,11) = 5.14, p < .05] and PEs [F(1,11) = 9.30, p < .05], showing that the responses were faster and more accurate for total repetition or total alternation of the response and the loudness than for partial repetition. In addition, a three-way interaction among task, response, and loudness in RTs [F(1,11) = 6.63, p < .05] was obtained, indicating sensitivity to the task-relevant feature in this stimulus–response effect. Separate ANOVAs confirmed that response interacted significantly in RTs only with loudness in the loudness task [F(1,11) = 7.38, p < .05] and not in the pitch task (F < 1). These interactions show stimulus–response effects between the response and the auditory stimuli. In the case of loudness, it was modulated by task relevance (see Figure 3).

[Figure 3 plots: two panels, Pitch Task and Loudness Task; reaction time (msec; axis 400 to 600) against response (repeated vs. alternated), with separate lines for pitch repeated, pitch alternated, loudness repeated, and loudness alternated.]
Figure 3. Reaction times in Experiment 1 for the repetition versus alternation of relevant and irrelevant stimuli (pitch and loudness) as a function of response (repetition vs. alternation) in pitch and loudness tasks.

Discussion

Experiment 1 was successful in providing evidence for event file creation in auditory perception and action planning. It demonstrated spontaneous integration of pitch and loudness, even when only one of the dimensions was task relevant and the other could be ignored. In addition, we observed stimulus–response integration effects for pitch and loudness, which were more pronounced for the task-relevant feature. This is in line with findings from visual studies, where integration was also spontaneous (i.e., occurred even if unnecessary for the task) but was mediated by the task relevance of the feature dimensions (see Hommel, 2004, for an overview).

Our findings seem consistent with those of a recent auditory study by Mondor and Leboe (2008). These authors observed that the impact of pitch repetition on tone detection performance depends on response repetition, which seems to fit with our present stimulus–response integration effects. In particular, they found pitch repetition benefits if both the prime and the probe tone were to be detected and, thus, accompanied by the same response, and pitch-repetition costs if the prime was to be ignored and, thus, not accompanied by a response. This outcome pattern bears similarities with our findings: good performance if both pitch and response repeat or both alternate, but bad performance if one repeats but not the other. However, Mondor and Leboe manipulated the response requirements between participants, which may have induced different attentional sets and strategies in the two tasks. For instance, ignoring primes in a detection task may lead to inhibition of return (Posner & Cohen, 1984), which may explain stimulus repetition costs without referring to response requirements; indeed, ignoring the prime and omitting a response to it led to a 70-msec increase in RT. Accordingly, it is not clear whether the observations of Mondor and Leboe reflect the same mechanisms that underlie the pitch × response interactions obtained in the present study.

Our findings reveal an interesting dissociation between the integration of stimulus features and the integration of stimulus and response features, a dissociation in which attention induced by task relevance plays a major role. Stimulus–response integration seems to be restricted mainly to the stimulus features that are task relevant: pitch in the pitch task and loudness in the loudness task. In contrast, different features of the same stimulus seem to be integrated irrespective of task relevance, as evidenced by the reliable interaction between pitch and loudness under conditions that rendered only one of them relevant at any given time. One possible explanation might be that the physical attributes of the features influence one another; that is, loudness is known to be affected by frequency and pitch by intensity. It is also interesting to see that, in stimulus–response integration, task relevance was more effective in excluding irrelevant loudness information than irrelevant pitch information. In other words, in the present study, loudness was more sensitive to task relevance than was pitch.

We think that all these aspects of our findings point to the same integration principle: Features of events (whether they refer to stimuli or responses) are integrated to the degree that the activations of their codes overlap in time. This principle underlies the concept of conditioning (Pavlov, 1927) and seems crucial for the hippocampal integration of episodic stimulus and action events (Bangasser, Waxler, Santollo, & Shors, 2006). First, consider the respective roles that this principle plays in the integration of stimulus features versus the integration of stimulus and response features. As is indicated in Figure 4A, the activations of stimulus feature codes are likely to overlap in time even if they are peaking at different time points, that is, even if stimulus features are registered asynchronously. Accordingly, they are likely to be bound to each other, thus producing a partial-overlap cost. However, the earlier a feature is coded, the earlier its code decays, suggesting that quickly coded features are less likely to overlap in time with response code activation. In our study, we found that RTs were shorter in the loudness task than in the pitch task (see Figure 3), probably due to the greater saliency of loudness and/or the better discriminability of the loudness values we chose, suggesting that, in this experiment, loudness was coded more quickly than pitch.¹ With respect to the temporal relations depicted in Figure 4, this implies that response code activation started earlier, in our experiment, in the loudness task than it did in the pitch task. On top of that, there is evidence that loudness codes decay more quickly than pitch codes (Clement, Demany, & Semal, 1999), which would further work against the integration of loudness and response. We can thus conclude that the code overlap principle accounts for both the observation that task relevance did not affect stimulus integration and the finding that it did affect stimulus–response integration.

Making a feature dimension relevant to a task is likely to increase the weights (or gain) of that dimension's codes (Bundesen, 1990; Found & Müller, 1996; Hommel, Müsseler, Aschersleben, & Prinz, 2001), which again may result in stronger and/or more enduring activation (see Figure 4B). This means that task-relevant features induce activations that are more likely to overlap with the response activation. As a consequence, task-relevant features should be more likely to be integrated with the response than should task-irrelevant features, just as we observed in Experiment 1.

[Figure 4 sketch: Panel A shows activation functions labeled Loudness, Pitch, and Response; Panel B shows activation functions labeled Loudness and Response, with Attention.]
Figure 4. Sketch of the hypothetical activation functions of stimulus codes. (A) In our experiment, loudness was coded more quickly than pitch was, so that the activation of pitch codes (even as the irrelevant dimension) was more likely to overlap with response code activation. (B) Task relevance of a given feature increased the duration of code activation, so that even codes that were activated early in time now overlapped with response code activation.

EXPERIMENT 2

Experiment 1 suggests that pitch and loudness are spontaneously integrated both with each other and with the response, at least if the given feature is task relevant. In Experiment 2, we investigated whether these observations can be extended to stimulus location. Many authors have emphasized the possibly crucial role of stimulus location in feature integration (in vision, Treisman & Gelade, 1980; in audition, Hall et al., 2000; Leboe et al., 2006).

On the one hand, this could mean that spatial location is so important for feature integration that it does not matter whether location information is nominally relevant or irrelevant for a given task. This would still be consistent with the feature overlap principle, assuming that location features are strongly weighted irrespective of the task, but it would imply that the proposed relationship between task relevance and weighting does not apply to location. On the other hand, however, it is true that many tasks that are taken to demonstrate the crucial role of location have used spatial responses. Assuming that responses are represented, prepared, and planned in terms of their perceptual features (Hommel, 1996; Hommel et al., 2001), it is possible that defining a response set in terms of spatial features (e.g., by characterizing responses as left and right) attracts attention to the spatial dimension(s) and, thus, induces a stronger weighting of spatial codes. Indeed, Fagioli, Hommel, and Schubotz (2007) found evidence that preparing for particular types of actions (grasping vs. pointing) attracts attention to the features that are relevant for defining these actions (size vs. location). Along the same lines, Hommel (2007) observed that the integration of visual stimulus location and the response is much more pronounced when the response alternatives are spatially defined (left vs. right) than when they are not (pressing a key once vs. twice). Hence, it is possible that the previous findings of integration of (nominally) irrelevant location information and the response reflect not so much a central role of stimulus location in feature integration as the fact that defining responses spatially makes location task-relevant.

The aim of Experiment 2 was to examine this possible interpretation of the role of location information, apart from studying the integration-related effects of the auditory location as such. We did so by manipulating the pitch and location of auditory stimuli and by using two different types of response sets. One set was spatially defined, just as in Experiment 1, and the other consisted of a nonspatial go/no-go response. We expected to replicate the findings from Experiment 1 with regard to pitch and to obtain comparable findings for location. However, the location-related findings should vary with the response set, with the spatial set producing stronger integration of location codes than would the nonspatial set.

Method

Participants. Thirty participants were recruited by advertisement for this experiment and were paid or received a course credit for a 40-min session. One participant was excluded from the analysis due to a high PE (around the chance level of 50%) and a very long RT in the pitch task. The remaining 29 students (3 of them male; mean age, 22 years; range, 18–34 years) reported not having any known hearing problem. They were randomly assigned to two groups: a spatial response set group (n = 14) and a nonspatial response set group (n = 15).

Procedure and Design. The procedure was the same as that in Experiment 1, with the following exceptions. The loudspeakers were placed at an upper and lower position at 45° from the center of the screen. The stimuli S1 and S2 were composed from two pure tones of 1000 and 3000 Hz, with durations of 50 msec, presented at approximately 70 dB SPL. The experiment was composed of two sessions: In one session, pitch was relevant for responding to S2; in the other session, location was relevant to S2, requiring a response to the top versus bottom location. The sessions were counterbalanced between participants. Each task contained a practice block with 15 practice trials and an experimental block with 96 experimental trials. The order of the trials was randomized.

The spatial response set group saw a left or right arrow indicating a left or right mouse click, respectively; responses to S1 and to S2 were made as in Experiment 1. The nonspatial response set group saw the word GO or NO GO, indicating whether to emit or withhold the response, respectively. Responses on the GO trials were made by clicking on the left mouse button; the NO-GO trials for S1 lasted 500 msec.

Results and Discussion

Trials with incorrect R1 responses (1%) as well as missing or anticipatory R2 responses (RT < 100 msec: 0.1%) were excluded from the analysis. The mean RTs for R1 were 330 msec (SD = 78) for the spatial response set group and 341 msec (SD = 114) for the nonspatial response set group. From the remaining data, mean RTs and PEs for R2 were analyzed as a function of five variables: the task (pitch vs. location as the relevant S2 feature), the relationship (repetition vs. alternation) between S1 and S2 with regard to pitch and location, the relationship (repetition vs. alternation) between responses R1 and R2, and the response set (spatial vs. nonspatial) (see Table 2 for mean RTs and PEs). ANOVAs were performed by using a mixed design with repeated measures on four variables and with response set as a between-groups variable (see Table 3 for the outcomes).

Table 2
Experiment 2: Means and Standard Errors of Mean Reaction Times (RTs, in Milliseconds) and Percentages of Errors (PEs) for Responses to the Second Stimulus As a Function of the Response Set (Spatial or Nonspatial), the Attended Dimension, the Relationship Between the Stimuli (Repetition vs. Alternation), and the Relationship Between the Responses (Repetition vs. Alternation)

                                             Response Repeated            Response Alternated
Response     Attended    Stimulus Feature    RT           PE              RT           PE
Set          Dimension   Repeated            M     SE     M      SE       M     SE     M      SE
Spatial      Location    Neither             496   33     12.8   3.1      453   23     1.8    1.4
                         Location            534   28     17.1   2.8      542   24     7.6    2.7
                         Pitch               506   30     8.0    2.3      496   25     7.9    2.4
                         Both                443   22     5.4    2.0      505   26     11.3   2.7
             Pitch       Neither             502   30     15.0   2.2      437   29     4.9    2.9
                         Location            474   29     12.0   3.2      470   30     13.6   2.9
                         Pitch               508   30     11.8   3.1      482   30     9.0    2.6
                         Both                426   26     8.2    2.3      513   29     13.1   4.2
Nonspatial   Location    Neither             432   32     15.0   3.0      383   22     7.4    1.3
                         Location            480   28     11.9   2.7      453   23     13.3   2.6
                         Pitch               448   29     11.8   2.2      420   24     9.0    2.3
                         Both                396   21     8.2    1.9      410   25     13.1   2.6
             Pitch       Neither             417   29     11.7   2.2      396   28     11.0   2.8
                         Location            452   28     10.8   3.1      388   29     8.9    2.8
                         Pitch               436   29     7.1    3.0      480   29     14.0   2.5
                         Both                387   25     7.2    2.2      406   28     14.4   4.0

Let us consider the outcomes according to their theoretical implications. First, we will address the task effects that reflect the impact of the task on the stimulus dimensions and the response. Second, we will consider the stimulus integration effects; these effects are revealed by interactions between the stimulus features, showing that repetition of a particular feature enhances performance if the other feature is also repeated and hinders performance if the other feature is alternated. Third, we will discuss stimulus–response integration effects by examining the interactions between repetition and alternation of the response and the stimulus features. Finally, we will address response set effects.

Task effects. There were two significant interactions in RT between task and location and between task and pitch, showing that performance was facilitated in the location task by repeating a feature on the task-irrelevant dimension (439 vs. 470 msec, respectively) or alternating the feature on the task-relevant dimension in the pitch task (441 vs. 471 msec, respectively). In addition, the response interacted with the task in such a way that, in the pitch task, responses were more accurate when they were repeated than when they alternated (PEs, 8.7% vs. 11.2%, respectively), whereas in the location task, alternation was more beneficial than repetition (PEs, 8.9% vs. 11.3%, respectively).

Stimulus integration effects. Pitch repetition interacted with location repetition, reflecting the standard crossover pattern with slower responses for trials in which one feature repeats while the other alternates; interestingly, it was more prominent when the relevant feature was repeated rather than alternated, which may point to the role of attention in the process (see Figure 5). This interaction was also modified by task, suggesting that the pitch × location interaction was somewhat more pronounced in the location task than in the pitch task, but it was clearly reliable in both [F(1,27) = 66.44, p < .0001, and F(1,27) = 16.54, p < .0001, respectively].

The latter observation seems inconsistent with the findings of Mondor and Leboe (2008), who failed to obtain interactions between pitch and location repetition when using a nonspatial response set. However, as was pointed out earlier, they used a detection task that did not require the discrimination of any stimulus feature. This design choice was likely to prevent feature bindings from affecting performance in several ways. For one, it yielded average RTs of less than 300 msec, which may have been too short to allow for the complete retrieval of the binding from the previous trial. Indeed, when Mondor and Leboe shortened the interval between the prime and the probe (a manipulation that they considered would facilitate binding retrieval and that effectively increased RTs), a close-to-significant interaction between pitch and location repetition effects was obtained. Moreover, a detection task is likely to induce rather shallow perceptual-coding processes, which again is likely to hamper the feature-matching process necessary to retrieve a particular binding. In any case, the present findings suggest that evidence for pitch–location binding can be obtained under favorable conditions.

To summarize, we were able to extend our observation of spontaneous pitch–loudness integration from Experiment 1 to the integration of pitch and location. Again, features from the two auditory dimensions involved were bound even though only one dimension was relevant at a time, suggesting that the mere temporal overlap of code activation is sufficient for integration.

Stimulus–response integration effects. Analogously to Experiment 1, pitch and location repetition entered two-way interactions with response repetition, in both RTs and PEs, reflecting worse performance if a stimulus-feature repetition was accompanied by an alternation of the re-

Table 3
Results of ANOVA on Mean Reaction Times (RTs) of Correct Responses and Percentages of Errors (PEs) for the Second Response in Experiment 2

                     RT                                  PE
Effect               MSe          F          p           MSe        F          p
Response set (S)     456,360.06   3.89       .059        373.38     0.79       .381
Task (T)             23,002.41    0.96       .335        1.99       0.01       .909
Response (R)         5,032.01     1.36       .254        0.03       0.00       .989
Pitch (P)            1,022.49     0.49       .491        95.27      2.40       .133
Location (L)         56.20        0.02       .887        297.88     3.97       .056
T × R                816.12       0.17       .683        647.12     7.67**     .010
T × P                28,779.30    13.54***   .001        46.96      0.58       .453
T × L                33,450.01    11.16**    .002        3.25       0.06       .816
R × P                82,731.54    33.22***   .000        1,730.53   17.32***   .000
T × R × P            7,570.75     3.25       .082        26.24      0.34       .566
R × L                39,527.31    15.10***   .001        1,153.42   13.44***   .001
T × R × L            2,687.12     1.39       .249        2.47       0.05       .824
P × L                145,149.32   73.60***   .000        69.45      1.30       .264
T × P × L            10,938.12    5.18*      .031        56.30      1.01       .323
R × P × L            5,761.72     3.35       .078        108.75     1.25       .273
T × R × P × L        403.56       0.21       .649        37.01      0.49       .489
T × S                4,988.43     0.21       .651        23.15      0.15       .698
R × S                6,499.33     1.76       .196        94.31      0.66       .423
P × S                47.70        0.02       .881        7.82       0.20       .660
L × S                2,038.26     0.75       .394        255.25     3.40       .076
T × R × S            4,101.46     0.86       .363        18.92      0.22       .640
T × P × S            28.65        0.01       .908        1.30       0.02       .900
R × P × S            40.02        0.02       .900        59.32      0.59       .448
T × R × P × S        836.78       0.36       .554        271.64     3.49       .073
T × L × S            550.30       0.18       .672        34.54      0.58       .451
R × L × S            40,724.50    15.55***
.001 164.97 1.92 .177 T฀3฀R฀3฀L฀3฀S 14,871.09 7.70** .010 641.79 13.10*** .001 P฀3฀L฀3฀S 3,782.71 1.92 .177 93.51 1.75 .197 T฀3฀P฀3฀L฀3฀S 4,032.12 1.91 .178 0.29 0.01 .943 R฀3฀P฀3฀L฀3฀S 521.03 0.30 .586 103.76 1.20 .284 T฀3฀R฀3฀P฀3฀L฀3฀S 639.43 0.33 .568 4.68 0.06 .805 Note—df 5 (1,27). *p , .05. **p , .01. ***p , .001. Pitch Task Location repeated Location Task Pitch repeated 600 Location alternated Pitch alternated 550 Reaction Time (msec) 500 450 400 Repeated Alternated Repeated Alternated Pitch Location Figure 5. Reaction times in Experiment 2 as a function of repetition versus alternation of pitch and location in the pitch task (left panel) and the location task (right panel). 360 ZMIGROD AND HOMMEL Pitch Task the contributions to the four-way interaction, we analyzed 600 Pitch repeated Pitch alternated the two tasks separately. In the location task, location and Location repeated response repetition interacted significantly both in RTs Reaction Time (msec) 550 Location alternated [F(1,27) 5 15.41, p , .001] and in PEs [F(1,27) 5 8.52, p , .01], with no modulation by response set (see Figure 7, lower panel). However, in the pitch task, the location 3 500 response interaction was further modified by response set both in RTs [F(1,27) 5 20.86, p , .0001] and in PEs 450 [F(1,27) 5 9.94, p , .005]. Separate analyses of the pitch task by response set revealed significant location 3 re- sponse interactions only for the spatial response set both 400 in RTs [F(1,13) 5 19.16, p , .001] and in PEs [F(1,13) 5 24.47, p , .0001; see Figure 7, upper panel] and not in 600 Location Task the nonspatial response set (F , 1). This pattern is in line with our expectation that a spatial response set amounts Reaction Time (msec) 550 to making location task relevant, even with respect to stimulus coding. If location is task relevant by requiring discrimination of S2 locations, location codes are strongly 500 weighted anyway. 
As a consequence, stimulus location and responses are integrated, no matter whether the response 450 set is spatially defined or not. However, when location is irrelevant with regard to S2 (i.e., in the pitch task), location codes are weighted strongly only if location is relevant for 400 discriminating the two responses, but not if a nonspatial response set is used. Repeated Alternated Response 600 Pitch Task Spatial: location repeated Figure 6. Reaction times in Experiment 2 in the pitch task Spatial: location alternated (upper panel) and the location task (lower panel) for relevant and Nonspatial: location repeated Reaction Time (msec) irrelevant stimuli (repetition vs. alternation) as a function of re- 550 Nonspatial: location alternated sponse (repetition vs. alternation). 500 sponse or vice versa (see Figure 6). The pitch 3 response interaction was unaffected by task, and a separate analysis confirmed that it was still reliable in the location task for 450 both RTs [F(1,27) 5 9.04, p , .01] and PEs [F(1,27) 5 12.70, p , .001], as well as in the pitch task for both RTs 400 [F(1,27) 5 27.10, p , .0001] and PEs [F(1,27) 5 7.25, p , .05]. The location 3 response interaction was also Location Task 600 unaffected by task. A reliable effect between location and response was observed in the pitch task for both RTs Reaction Time (msec) [F(1,27) 5 4.3, p , .05] and PEs [F(1,27) 5 8.614, p , 550 .01], as well as in the location task for both RTs [F(1,27) 5 15.41, p , .001] and PEs [F(1,27) 5 8.56, p , .01]. 500 Response set effects. The response set manipulation did not yield a reliable main effect in RTs or PEs, even though participants tended to respond more quickly with 450 a nonspatial than with a spatial set—presumably reflect- ing the reduced response uncertainty in the nonspatial go/ 400 no-go task. 
There were two reliable effects: The interaction between location and response was modified by response set (in RTs), and this three-way interaction was further Repeated Alternated modified by task (in RTs and PEs). Separate ANOVAs re- Response vealed that the location 3 response interaction was reliable only in the spatial response set condition [F(1,13) 5 39.43, Figure 7. Reaction times in Experiment 2 in the pitch task (upper panel) and the location task (lower panel) for stimulus p , .0001] and not in the nonspatial response set condi- location (repetition vs. alternation) as a function of response tion (F , 1), indicating stronger activation when the re- (repetition vs. alternation) and response set group (spatial vs. sponse included spatial features. Moreover, to disentangle nonspatial). INTEGRATING AUDITORY PERCEPTION AND ACTION PLANNING 361 GENERAL DISCUSSION acteristics. Stimuli that are closer in time to execution of a response seem to be more likely to be integrated with The aim of our study was to investigate the binding it. This fits with earlier observations of Hommel (2005), mechanism in and across auditory perception and action. who found stimulus–response integration for stimuli pre- In both experiments, we found evidence for the sponta- sented briefly before, concurrently with, or even after the neous integration of auditory features: pitch and loud- execution of the corresponding response, but no integra- ness in Experiment 1 and pitch and location in Experi- tion for stimuli presented during the planning of that re- ment 2. Even though our participants were not instructed sponse. Apparently, then, response execution provides the or required to create any feature conjunction, and even information necessary to trigger the integration process. 
though nothing could be gained by doing so, the features A plausible candidate for pulling the trigger is the success of S1—a mere go signal—were apparently integrated into of the response, which may signal that integrating the re- a coherent representation. This outcome is in line with sponse with the apparently suitable context conditions is previous findings in visual perception, where feature inte- useful (Schultz, 2002). This possibility is strengthened by gration effects were obtained between shape and color or the finding that the integration of visual stimulus features shape and location (Hommel, 1998; Hommel & Colzato, and manual responses in a task such as ours is facilitated 2004), and with findings from auditory studies, where evi- by presenting positively toned pictures after the execu- dence of integration was found for pitch and timbre (Hall tion of R1 (Colzato, van Wouwe, & Hommel, 2007). Taken et al., 2000; Takegata et al., 2005). We can conclude that together, our findings provide evidence for the existence feature-binding processes are not restricted to visual ob- of temporary feature binding in auditory perception and ject perception, the modality targeted by FIT (Kahneman action, suggesting a general principle of how events are et al., 1992; Treisman & Gelade, 1980), but follow compa- cognitively represented—presumably, in terms of event rable principles in integrating auditory information. files, as proposed by Hommel (1998, 2004). Moreover, both experiments revealed interactions between stimulus and response that were indicative of AUTHOR NOTE stimulus–response feature binding. Again, these effects Correspondence concerning this article should be addressed to S. Zmig- were obtained for all the auditory dimensions investi- rod, Department of Psychology, Cognitive Psychology Unit, Leiden gated—that is, pitch, loudness, and location. 
These ef- University, Postbus 9555, 2300 RB Leiden, The Netherlands (e-mail: fects followed the same pattern as that observed between

[email protected]

). stimulus features: Repeating one member of a pair, but not the other, results in performance costs, usually in terms REFERENCES of RT and, often, in errors as well. This supports the idea Allport, D. A., Tipper, S. P., & Chmiel, N. R. J. (1985). Perceptual that feature integration creates episodic links between the integration and postcategorical filtering. In M. I. Posner & O. S. M. respective elements, which are retrieved as a whole when Marin (Eds.), Attention and performance XI (pp. 107-132). Hillsdale, at least one element is encountered again (Hommel, 1998, NJ: Erlbaum. Bangasser. D. A., Waxler, D. E., Santollo, J., & Shors, T. J. 2004). This retrieval process does not take place if the rel- (2006). Trace conditioning and the hippocampus: The importance evant stimulus feature and the response are different from of contiguity. Journal of Neuroscience, 26, 8702-8706. doi:10.1523/ the previous ones, and it does not create any particular JNEUROSCI.1742-06.2006 problem if all the elements of the binding are repeated. In Bundesen, C. (1990). A theory of visual attention. Psychological Re- the case of partial repetitions (either the response or the view, 97, 523-547. Clement, S., Demany, L., & Semal, C. (1999). Memory for pitch relevant stimulus feature), retrieval results in the reactiva- versus memory for loudness. Journal of the Acoustical Society of tion of currently incorrect, conflicting information and, America, 106, 2805-2811. thus, prolongs stimulus and/or response processing. Colzato, L. S., van Wouwe, N. C., & Hommel, B. (2007). Fea- The fact that evidence for feature integration processes ture binding and affect: Emotional modulation of visuomo- tor integration. Neuropsychologia, 45, 440-446. doi:10.1016/j was obtained even under conditions in which the process- .neuropsychologia.2006.06.032 ing of simple features would be sufficient supports the Fagioli, S., Hommel, B., & Schubotz, R. I. (2007). 
Intentional control idea that integration occurs rather automatically. And yet, of attention: Action planning primes action-related stimulus dimen- which information is integrated seems to be determined sions. Psychological Research, 71, 22-29. doi:10.1007/s00426-005 by the action goal. In particular, features that vary on di- -0033-3 Found, A., & Müller, H. J. (1996). Searching for unknown feature tar- mensions that are relevant for defining a target stimulus gets on more than one dimension: Investigating a “dimension weight- or a response alternative are more likely to become part of ing” account. Perception & Psychophysics, 58, 88-101. bindings than are features unrelated to such dimensions. Hall, M. D., Pastore, R. E., Acker, B. E., & Huang, W. (2000). Evi- Another principle underlying feature integration seems dence for auditory feature integration with spatially distributed items. Perception & Psychophysics, 62, 1243-1257. to be a temporal overlap of code activation. Codes of Hommel, B. (1996). The cognitive representation of action: Automatic stimulus features seem to be processed sufficiently close integration of perceived action effects. Psychological Research, 59, in time to produce overlapping activations, even if the 176-186. doi:10.1007/BF00425832 time needed to process them differs (see Experiment 1) Hommel, B. (1998). Event files: Evidences for automatic integra- and even if only one of them is task relevant. That is, fea- tion of stimulus–response episodes. Visual Cognition, 5, 183-216. doi:10.1080/713756773 tures belonging to the same physical stimulus are likely Hommel, B. (2004). Event files: Feature binding in and across perception to become part of the same object file. The integration of and action. Trends in Cognitive Sciences, 8, 494-500. doi:10.1016/j stimuli and responses is more sensitive to temporal char- .tics.2004.08.007 362 ZMIGROD AND HOMMEL Hommel, B. (2005). How much attention does an event file need? Jour- Pavlov, I. P. (1927). 
Conditioned reflexes (G. V. Anrep, Trans.). London: nal of Experimental Psychology: Human Perception & Performance, Oxford University Press. 31, 1067-1082. doi:10.1037/0096-1523.31.5.1067 Posner, M. I., & Cohen, Y. A. (1984). Components of visual orient- Hommel, B. (2007). Feature integration across perception and action: ing. In: H. Bouma & D. G. Bouwhuis (Eds.), Attention and perfor- Event files affect response choice. Psychological Research, 71, 42-63. mance: X. Control of language processes (pp. 531-556). Hillsdale, doi:10.1007/s00426-005-0035-1 NJ: Erlbaum. Hommel, B., & Colzato, L. S. (2004). Visual attention and the tem- Schultz, W. (2002). Getting formal with dopamine and reward. Neu- poral dynamics of feature integration. Visual Cognition, 11, 483-521. ron, 36, 241-263. doi:10.1080/13506280344000400 Takegata, R., Brattico, E., Tervaniemi, M., Varyagina, O., Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001). Näätänen, R., & Winkler, I. (2005). Preattentive representation The theory of event coding (TEC): A framework for perception and of feature conjunctions for concurrent spatially distributed audi- action planning. Behavioral & Brain Sciences, 24, 849-937. tion objects. Cognitive Brain Research, 25, 169-179. doi:10.1016/j Kaas, J. H., & Hackett, T. A. (1999). “What” and “where” processing .cogbrainres.2005.05.006 in auditory cortex. Nature Neuroscience, 2, 1045-1047. Treisman A. M. (1990). Variations on the theme of feature integration: Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing Reply to Navon. Psychological Review, 97, 460-463. of object files: Object-specific integration of information. Cognitive Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of Psychology, 24, 175-219. attention. Cognitive Psychology, 12, 97-136. Leboe, J. P., Mondor, T. A., & Leboe, L. C. (2006). Feature mismatch Wessinger, C. 
M., VanMeter, J., Tian, B., Van Lare, J., Pekar, J., & effects in auditory negative priming: Interference as dependent on Rauschecker, J. P. (2001). Hierarchical organization of the human salient aspects of prior episodes. Perception & Psychophysics, 68, auditory cortex revealed by functional magnetic resonance imaging. 897-910. Journal of Cognitive Neuroscience, 13, 1-7. Lee, C. C., & Winer, J. A. (2005). Principles governing auditory cortex Zeki, S., & Bartels, A. (1999). Toward a theory of visual conscious- connections. Cerebral Cortex, 15, 1804-1814. doi:10.1093/cercor/ ness. Consciousness & Cognition, 8, 225-259. bhi057 Mondor, T. A., Hurlburt, J., & Thorne, L. (2003). Categorizing NOTE sounds by pitch: Effects of stimulus similarity and response repeti- tion. Perception & Psychophysics, 65, 107-114. 1. The temporal overlap scenario sketched in Figure 4 refers to the Mondor, T. A., & Leboe, L. C. (2008). Stimulus and response rep- hypothetical temporal relations between coding processes in Experi- etition effects in the detection of sounds: Evidence of obligatory re- ment 1. These relations depend on the particular stimuli and the stimulus trieval and use of a prior event. Psychological Research, 72, 183-191. parameters chosen and, thus, may look very different for other stimuli, doi:10.1007/s00426-006-0095-x intensities, and pitch values. Thus, we do not suggest that loudness is Neuhoff, J. G., Knight, R., & Wayand, J. (2002). Pitch change, soni- always coded more quickly than pitch or that loudness and pitch coding fication, and musical expertise: Which way is up? In R. Nakatsu & or pitch and response coding always overlap in time; we suggest only H. Kawahara (Eds.), Proceedings of the 8th International Conference that features that happen to be coded by overlapping processes are more on Auditory Display, Kyoto, Japan. likely to be integrated. Neuhoff, J. G., Kramer, G., & Wayand, J. (2002). 
Pitch and loudness interact in auditory displays: Can the data get lost in the map? Journal of Experimental Psychology: Applied, 8, 17-25. doi:10.1037//1076 (Manuscript received June 2, 2008; -898X8.1.17 revision accepted for publication August 22, 2008.)
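The Experiment 2 analysis classifies every trial by whether pitch, location, and the response repeat or alternate from S1/R1 to S2/R2. The following minimal Python sketch illustrates that coding step only. The `Event` class, the `code_transition` function, and the feature labels ("low", "left", and so on) are hypothetical names chosen for illustration; they are not taken from the authors' materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One stimulus-response episode (hypothetical container, for illustration)."""
    pitch: str      # e.g. "low" or "high" (assumed labels)
    location: str   # e.g. "left" or "right" (assumed labels)
    response: str   # e.g. "left" or "right" click (assumed labels)


def code_transition(first: Event, second: Event) -> dict:
    """Label a trial by which stimulus features and whether the response repeat."""
    pitch_repeated = first.pitch == second.pitch
    location_repeated = first.location == second.location

    # Same four row labels as the "Stimulus Feature Repeated" column of Table 2.
    if pitch_repeated and location_repeated:
        feature = "Both"
    elif pitch_repeated:
        feature = "Pitch"
    elif location_repeated:
        feature = "Location"
    else:
        feature = "Neither"

    response = "Repeated" if first.response == second.response else "Alternated"
    return {"feature_repeated": feature, "response": response}


# Pitch repeats, location and response alternate: a partial repetition.
example = code_transition(
    Event(pitch="low", location="left", response="left"),
    Event(pitch="low", location="right", response="right"),
)
print(example)
```

Crossing the four stimulus labels with the two response labels gives the eight cells reported per task and response set in Table 2.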
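The crossover pattern described for Experiment 2 (slower responses when one element repeats while the other alternates) can be made concrete with simple arithmetic on the Table 2 cell means. The sketch below uses the RT means for the spatial response set in the location task. The contrast itself (mean of partial repetitions minus mean of complete repetitions and complete alternations) is an assumption made here for illustration, not a statistic reported in the article, and the unweighted cell averages will not reproduce the ANOVA results. All function names are hypothetical.

```python
# RT cell means (msec) from Table 2: spatial response set, location task.
# Each tuple is (response repeated, response alternated).
RT_SPATIAL_LOCATION_TASK = {
    "Neither": (496, 453),
    "Location": (534, 542),
    "Pitch": (506, 496),
    "Both": (443, 505),
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def stimulus_binding_cost(cells):
    """Pitch x location contrast: partial repetitions minus complete ones."""
    partial = mean(cells["Location"] + cells["Pitch"])
    complete = mean(cells["Neither"] + cells["Both"])
    return partial - complete


def response_binding_cost(cells, feature):
    """Feature x response contrast for feature = "Pitch" or "Location"."""
    repeated_rows = [feature, "Both"]
    alternated_rows = [row for row in cells if row not in repeated_rows]
    # Partial: feature repeats with response alternation, or vice versa.
    partial = [cells[r][1] for r in repeated_rows] + [cells[r][0] for r in alternated_rows]
    # Complete: feature and response both repeat, or both alternate.
    complete = [cells[r][0] for r in repeated_rows] + [cells[r][1] for r in alternated_rows]
    return mean(partial) - mean(complete)


print(stimulus_binding_cost(RT_SPATIAL_LOCATION_TASK))              # 45.25
print(response_binding_cost(RT_SPATIAL_LOCATION_TASK, "Pitch"))     # 21.75
print(response_binding_cost(RT_SPATIAL_LOCATION_TASK, "Location"))  # 30.75
```

Positive values indicate that partial repetitions were slower than complete repetitions or alternations in these cells, which is the direction of the interactions reported in the text.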
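The General Discussion states the retrieval logic of event files verbally: a binding is retrieved when at least one of its elements recurs, retrieval is harmless when every element repeats, and it produces conflict on partial repetitions. A toy Python sketch of that logic follows. It is a deliberately simplified illustration under the assumption that a binding is just a mapping from dimension names to values; the function name and outcome strings are hypothetical, and the sketch makes no claim about the underlying cognitive mechanism or its time course.

```python
def predict_retrieval(previous: dict, current: dict) -> str:
    """Predict the qualitative outcome of encountering `current` after `previous`.

    Both arguments map dimension names (e.g. "pitch", "response") to values.
    Only dimensions present in the previous binding are compared.
    """
    matching = [dim for dim, value in previous.items() if current.get(dim) == value]

    if not matching:
        # Complete alternation: nothing cues the old binding.
        return "no retrieval"
    if len(matching) == len(previous):
        # Complete repetition: the retrieved binding fits the current event.
        return "retrieval without conflict"
    # Partial repetition: retrieved elements mismatch the current event.
    return "retrieval with conflict"


binding = {"pitch": "high", "response": "left"}
print(predict_retrieval(binding, {"pitch": "high", "response": "left"}))
print(predict_retrieval(binding, {"pitch": "high", "response": "right"}))
print(predict_retrieval(binding, {"pitch": "low", "response": "right"}))
```

Under this reading, only the second call corresponds to the condition that the article associates with performance costs.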