Can one alter their auditory perception?


I'm coming from the idea that the way we perceive sound is a reaction to a signal sent to our brain by our ears. Of course, this feeling intensifies as the volume of the sound increases and our ears send a stronger signal.

Now I am wondering whether we can augment that signal in our brain, so that when the volume stays the same we perceive it as louder. Can that be done without any kind of chemical or surgical intrusion, just by sheer force of will? Do such perception "controllers" even exist, and we just haven't learned to control them? This question is about the auditory system, but since it touches on how the brain interprets signals, you could explain this from a different perspective (visual or even tactile).

This isn't part of the question, but I also wonder: if the previous is possible, then can the same be done to mentally create sounds in your head and hear them as if they were real?


Imagine a music recording studio, with a band playing in the soundproof room. Now imagine 2 mic->speaker connections: A mic inside the room records the music, and plays it on a speaker outside, and then another mic records the sound coming out of that speaker, and delivers it to the recording equipment. You can imagine the loss of quality involved in that set up I'm sure, but bear with me. Both mic->speaker connections are set to 50% volume. Say we want to increase the volume: We can increase the volume on either connection (or both). If we increase the volume to 100% on only one of the 2 connections, then the volume of the recorded sound will increase by approximately the same amount whichever one we choose. However, increasing the volume of the first connection is better - it results in a better quality recording. This is because each connection can only lose quality, not add it.
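The analogy above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that each mic->speaker connection multiplies its input by a gain and then adds a fixed amount of independent noise. The function name, the noise powers, and the gains (1.0 standing in for "50%" and 2.0 for "100%") are hypothetical choices, not measurements of any real setup.

```python
def output_snr(g1, g2, signal_power=1.0, noise1=0.01, noise2=0.01):
    """Signal-to-noise ratio after two cascaded connections.

    Assumed model (not from the original answer):
        stage 1 output = g1 * signal + noise1
        stage 2 output = g2 * (stage 1 output) + noise2
    All powers are in arbitrary units.
    """
    # The signal passes through both gains.
    signal_out = (g1 * g2) ** 2 * signal_power
    # Noise from stage 1 is amplified by stage 2; noise from stage 2 is not.
    noise_out = g2 ** 2 * noise1 + noise2
    return signal_out / noise_out


# Double the gain on the first connection only...
boost_first = output_snr(g1=2.0, g2=1.0)
# ...or on the second connection only.
boost_second = output_snr(g1=1.0, g2=2.0)

print(f"boost first connection:  SNR = {boost_first:.1f}")
print(f"boost second connection: SNR = {boost_second:.1f}")
```

Under these assumptions both choices raise the signal by the same factor, but boosting the first connection leaves a cleaner result, because the second connection also amplifies whatever noise the first one added.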

Now, supposing that the first mic->speaker connection represents the auditory nerve (ear -> primary auditory cortex), and the second connection represents "perception" (auditory cortex -> rest of brain), then we can discuss increasing the "volume" at either "connection" via conscious control (I'm assuming that this is what you mean by "force of will").

The auditory nerve passes through the central auditory system, an area of the brain considered unconscious:

Damage to the Primary Auditory Cortex in humans leads to a loss of any awareness of sound, but an ability to react reflexively to sounds remains as there is a great deal of subcortical processing in the auditory brainstem and midbrain.

In other words, so far as we know, there is no conscious control over this "connection".

Neuroplasticity of "perception" is much greater. A variety of studies have shown certain improved hearing capabilities in blind people, correlating with how early vision loss occurs (the earlier the better) and involving neuroplasticity, as the visual cortex can be usurped for auditory processing. Whether or not this corresponds to a change in "volume" is unclear, and this effect does not appear to be "consciously" controlled.

However, more recent research has demonstrated that similar effects can be induced in non-blind adults through temporary blindness (as little as 90 minutes with a blindfold). Corresponding research on mice has shown that this may be the result of increased "volume" in perception:

The cells fired faster and more powerfully in response to sounds and were more sensitive to quiet sounds.

So not exactly "sheer will", but temporary blindness, which you can easily induce in yourself consciously, may improve certain aspects of hearing, possibly including "volume".

Hope this helps.


We can easily alter the signal that arrives at the eardrum by changing our position or orientation relative to the sound source (and to any reflections and noise sources). If we limit ourselves to cases where the sound pressure at the eardrum is identical, there are still things that can be done. For example, it might be possible to voluntarily control the middle ear reflex, which changes the transfer function of the middle ear and can change the sound level by approximately 20 dB. If we further limit ourselves to cases where the pressure at the stapes is identical, there are still things that affect the transmission properties of the ear. The olivocochlear system is thought to control the gain of the cochlear amplifier. While I am not aware of any studies that have demonstrated conscious control of the olivocochlear system, altering it is definitely possible through surgery and likely through drugs. Our auditory perception is also driven by the chemical composition of the fluid in the cochlea and by neurotransmitter reserves in the auditory nerve. It seems less likely that we could consciously control these chemical states, but surgery and drugs definitely can alter them.
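To put the "approximately 20 dB" figure in perspective: a level difference in decibels relates to a sound pressure ratio by the standard formula dB = 20 * log10(ratio). A quick sketch follows; the function names are mine, for illustration only.

```python
import math


def db_to_pressure_ratio(level_change_db):
    """Convert a level change in dB to the matching sound pressure ratio."""
    return 10 ** (level_change_db / 20)


def pressure_ratio_to_db(ratio):
    """Convert a sound pressure ratio to a level change in dB."""
    return 20 * math.log10(ratio)


reflex_change_db = 20  # the approximate figure quoted above
ratio = db_to_pressure_ratio(reflex_change_db)
print(f"{reflex_change_db} dB corresponds to a pressure ratio of about {ratio:.1f}")
```

So a 20 dB change corresponds to roughly a tenfold change in sound pressure amplitude.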


REVIEW article

  • 1 Pain and Perception Lab, IIMPACT in Health, The University of South Australia, Adelaide, SA, Australia
  • 2 Neuroscience Research Australia, Randwick, NSW, Australia
  • 3 Crossmodal Research Laboratory, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom

The sounds that result from our movement and that mark the outcome of our actions typically convey useful information concerning the state of our body and its movement, as well as providing pertinent information about the stimuli with which we are interacting. Here we review the rapidly growing literature investigating the influence of non-veridical auditory cues (i.e., inaccurate in terms of their context, timing, and/or spectral distribution) on multisensory body and action perception, and on motor behavior. Inaccurate auditory cues provide a unique opportunity to study cross-modal processes: the ability to detect the impact of each sense is greater when the senses provide slightly different messages. Additionally, given that similar cross-modal processes likely occur regardless of the accuracy or inaccuracy of sensory input, studying incongruent interactions is also likely to help us predict interactions between congruent inputs. The available research convincingly demonstrates that perceptions of the body, of movement, and of surface contact features (e.g., roughness) are influenced by the addition of non-veridical auditory cues. Moreover, auditory cues impact both motor behavior and emotional valence, the latter showing that sounds that are highly incongruent with the performed movement induce feelings of unpleasantness (perhaps associated with lower processing fluency). Such findings are relevant to the design of auditory cues associated with product interaction, and to the use of auditory cues in sport performance and therapeutic situations, given the impact on motor behavior.


4. Varieties of Auditory Perception

4.1 Musical Listening

Musical listening is a topic that bears on questions about the relationship between hearing sounds and hearing sources. While the philosophy of music has its own vast literature (see the entry on the philosophy of music), musical experience has not been used as extensively to explore general philosophical questions about auditory perception. This section discusses links that are relevant to advancing philosophical work on auditory perception.

4.1.1 Acousmatic Experience

An account of listening to pure or non-vocal music should capture the aesthetic significance of musical listening. Appreciating music is appreciating sounds and sequences, arrangements, or structures of sounds. Thus, the temporal aspects of auditory experiences are critical to appreciatively listening to music.

One might go further and hold that sounds are all that matters in music. In particular, some have argued that appreciatively listening to music demands listening in a way that abstracts from the environmental significance, and thus from the specific sources, of the sounds it includes (Scruton 1997, 2–3). Such acousmatic listening involves experiencing sounds in a way that is “detached from the circumstances of their production,” rather than “as having a certain worldly cause” (Hamilton 2007, 58; see also Hamilton 2009). Listening to music and being receptive to its aesthetically relevant features requires not listening to violins, horns, or brushes on snare drums. It requires hearing sounds and grasping them in a way removed from their common sources. Hearing a high fidelity recording thus furnishes an aesthetically identical musical experience despite having a speaker cone rather than a violin as source. “The acousmatic experience of sound is precisely what is exploited by the art of music” (Scruton 1997, 3).

This suggests an intuitive difference between music and visual arts such as painting and sculpture. As Kivy (1991) explains, it is difficult even with the most abstract paintings and sculptures to see them in a way that takes them to be entirely formal or abstract. That is, it is difficult to avoid seeing pictures and sculptures as representational. In contrast, it seems easier to listen attentively to the formal acoustical features of musical sounds, without being compelled to think of what makes them.

Musical listening thus may be thought to provide a prima facie argument against the claim that in hearing sounds one typically hears sound sources such as the strumming of guitars and bowing of violins. If such “interested” audition were the rule, musical listening would be far more challenging.

4.1.2 Acousmatic Listening as Attention to Sounds

Acousmatic experience, however, may be a matter of attention. Nothing prevents focusing one’s attention on the sounds and audible qualities without attending to the instruments, acts, and events that are their sources, even if each is auditorily available. That musical listening requires effort and training supports the idea that one can direct attention differently in auditory experience, depending on one’s interests. Caring for an infant and safely crossing the street require attending to sound sources, while listening with aesthetic appreciation to a symphony may require abstracting from the circumstances of its production, such as the finger movements of the oboist. This response holds that musical listening is a matter of auditorily attending in a certain way. It is attending to features of sounds themselves, but does not imply failing to hear sound sources.

The acousmatic thesis is a limited view about which aspects of the things one can auditorily experience are aesthetically significant. These include audible aspects of sounds themselves, but exclude, for example, other contents of auditory experience. However, room exists for debate over the aesthetically significant aspects of what you hear (see Hamilton 2007, 2009). For example, one might argue that live performances have aesthetic advantages over recordings because one hears the performance of the sounds and songs, rather than their reproduction by loudspeakers (cf. Mag Uidhir 2007). Circumstances of sound production, such as that skillful gestures generate a certain passage, or that a particularly rare wood accounts for a violin’s sounds, might be aesthetically relevant in a way that outstrips the sounds, and some such features may be audible in addition to sounds. For instance, hearing the spatial characteristics of a performance may hold aesthetic significance beyond the tones and structures admitted by traditional accounts of musical listening. Composers may even intend “spatial gestures” among aspects essential for the appreciation of a piece (see, e.g., Solomon 2007). To imagine auditorily experiencing the spatial characteristics of music in a way entirely divorced from the environmental significance of the sounds is difficult. Appreciating the relationship between experiences of sounds and of sources makes room for a view of the aesthetic value of musical listening that is more liberal than acousmatic experience allows.

4.2 Speech Perception

4.2.1 Is Speech Special?

Speech perception presents uniquely difficult twists, and few philosophers have confronted it directly (Appelbaum 1999, Trout 2001a, Matthen 2005, ch. 9, and Remez and Trout 2009 are recent exceptions). Something striking and qualitatively distinctive, perhaps uniquely human, seems to set the perception of speech apart from ordinary hearing. The main philosophical issues about speech perception concern versions of the question, Is speech special? (See O’Callaghan 2015 for a comprehensive review and discussion.)

How does perceiving speech differ from perceiving ordinary non-linguistic sounds? Listening to music and listening to speech each differ from listening to other environmental sounds in the following respect. In each case, one’s interest in listening is to some degree distanced from the specific environmental happenings involved in the production of sounds.

But this is true of listening to music and of listening to speech for different reasons. In music, it is plausible that one’s interest is in the sounds themselves, rather than in the sources of their production. However, speech is a vehicle for conventional linguistic meaning. In listening to speech, one’s main interest is in the meanings, rather than in the sources of sound. Ultimately, the information conveyed is what matters.

Nevertheless, according to the most common philosophical understanding, perceiving spoken utterances is just a matter of hearing sounds. The sounds of speech are complex audible sound structures. Listening to speech in a language you know typically involves grasping meanings, but grasping meanings requires first hearing the sounds of speech. According to this account, grasping meanings itself is a matter of extra-perceptual cognition.

The commonplace view, that perceiving speech is a variety of ordinary auditory perception that just involves hearing the sounds of speech, has been challenged in a number of ways. The challenges differ in respect of how speech perception is held to differ from non-linguistic audition.

4.2.2 The Objects of Speech Perception

First, consider the objects of speech perception. What are the objects of speech perception, and do they differ from those of ordinary or non-linguistic auditory perception? According to the commonplace understanding, hearing speech involves hearing sounds. Thus, hearing spoken language shares perceptual objects with ordinary audition. Alternatively, one might hold that the objects of speech perception are not ordinary sounds at all. Perhaps they are language-specific entities, such as phonemes or words. Perhaps, as some have argued, perceiving speech involves perceiving articulatory gestures or movements of the mouth and vocal organs (see the supplement on Speech Perception: Empirical and Theoretical Considerations). Note that if audition’s objects typically include distal events, speech in this respect is not special, since its objects do not belong to an entirely different kind from ordinary sounds.

4.2.3 The Contents of Speech Perception

Second, consider the contents of speech perception. Does the content of speech perception differ from that of ordinary audition? If it does, how does the experience of perceiving speech differ from that of hearing ordinary sounds? Perceiving speech might involve hearing ordinary sounds but auditorily ascribing distinctive features to them. These features might simply be, or comprise, finer grained qualitative and temporal acoustical details than non-linguistic sounds audibly possess. But perceiving speech also might involve perceiving sounds as belonging to language-specific types, such as phonemes, words, or other syntactic categories.

Furthermore, speech perception’s contents might differ in a more dramatic way from those of non-linguistic audition. Listening with understanding to speech involves grasping meanings. The commonplace view is conservative. It holds that grasping meanings is an act of the understanding rather than of audition. Thus, the difference between the experience of listening to speech in a language you know and the experience of listening to speech in a language you do not know is entirely cognitive.

But one might think that there also is a perceptual difference. A liberal account of this perceptual difference holds that perceiving speech in a language you know may involve hearing sounds as meaningful or auditorily representing them as having semantic properties (see, e.g., Siegel 2006, Bayne 2009, Azzouni 2013, Brogaard 2018; cf. O’Callaghan 2011b, Reiland 2015). Alternatively, a moderately liberal account holds that the perceptual experience of speech in a language you know involves perceptually experiencing language-specific but nevertheless non-semantic features. For instance, O’Callaghan (2011b) argues that listening to speech in a familiar language typically involves perceiving its phonological features.

4.2.4 Is Speech Perception Auditory?

Third, consider the processes responsible for speech perception. To what extent does perceiving speech implicate processes that are continuous with those of ordinary or general audition, and to what extent does perceiving speech involve separate, distinctive, or modular processes? While some defend general auditory accounts of speech perception (see, e.g., Holt and Lotto 2008), some argue that perceiving speech involves dedicated perceptual resources, or even an encapsulated perceptual system distinct from ordinary non-linguistic audition (see, e.g., Fodor 1983, Pinker 1994, Liberman 1996, Trout 2001b). These arguments typically are grounded in several types of phenomena, including the multimodality of speech perception: visual cues about the movements of the mouth and tongue impact the experience of speech, as demonstrated by the McGurk effect (see section 4.3, Crossmodal Influences); duplex perception, in which a particular stimulus sometimes contributes simultaneously both to the experience of an ordinary sound and to that of a speech sound (Rand 1974); and the top-down influence of linguistic knowledge upon the experience of speech. A reasonable challenge is that each of these characteristics (multimodality, duplex perception, and top-down influence) also is displayed in general audition.

4.3 Crossmodal Influences

4.3.1 Crossmodal Illusions

Auditory perception of speech is influenced by cues from vision and touch (see Gick et al. 2008). The McGurk effect in speech perception leads to an illusory auditory experience caused by a visual stimulus (McGurk and Macdonald 1976). Do such multimodal effects occur in ordinary audition? Visual and tactile cues commonly do shape auditory experience. The ventriloquist illusion is an illusory auditory experience of location that is produced by an apparent visible sound source (see, e.g., Bertelson 1999). Audition even impacts experience in other modalities. The sound-induced flash effect involves a visual illusion as of seeing two consecutive flashes that is produced when a single flash is accompanied by two consecutive beeps (Shams et al. 2000, 2002). Such crossmodal illusions demonstrate that auditory experience is impacted by other modalities and that audition influences other modalities. In general, experiences associated with one perceptual modality are influenced by stimulation to other sensory systems.

4.3.2 Causal or Constitutive?

An important question is whether the impact is merely causal, or whether perception in one modality is somehow constitutively tied to other modalities. If, for instance, vision merely causally impacts your auditory experience of a given sound, then processes associated with audition might be proprietary and characterizable in terms that do not appeal to other modalities. Relying on information from vision or touch could simply improve the existing capacity to perceive space, time, or spoken language auditorily. On the other hand, coordination between audition and other senses could enable a new perceptual capacity. In that case, audition might rely constitutively on another sense.

A first step in resolving this question is recognizing that crossmodal illusions are not mere accidents. Instead, they are intelligible as the results of adaptive perceptual strategies. In ordinary circumstances, crossmodal processes serve to reduce or resolve apparent conflicts in information drawn from several senses. In doing so, they tend to make perception more reliable overall. Thus, crossmodal illusions differ from synaesthesia. Synaesthesia is just a kind of accident. It results from mere quirks of processing, and it always involves illusion (or else is accidentally veridical). Crossmodal recalibrations, in contrast, are best understood as attempts “to maintain a perceptual experience consonant with a unitary event” (Welch and Warren 1980, 638).

In the first place, the principled reconciliation of information drawn from different sensory sources suggests that audition is governed by extra-auditory perceptual constraints. Moreover, since conflict requires a common subject matter, such constraints must concern common sources of stimulation to multiple senses. If so, audition and vision share a perceptual concern for a common subject matter. And that concern is reflected in the organization of auditory experience. But this by itself does not establish constitutive dependence of audition on another sense.

However, the perceptual concern for a common subject matter could be reflected as such in certain forms of auditory experience. For instance, the commonality may be experientially evident in jointly perceiving shared spatio-temporal features, or in the perceptual experience of audio-visual intermodal feature binding. If so, some forms of auditory perceptual experience may share with vision a common multimodal or amodal content or character (see O’Callaghan 2008b, Clark 2011). More to the point, if coordination with another sense enables a new auditory capacity, then vision or touch could have a constitutive rather than merely causal impact upon corresponding auditory experiences.

4.3.3 Multimodality in Perception

What hangs on this? First, it bears on questions about audition’s content. If we cannot exhaustively characterize auditory experience in terms that are modality-specific or distinctive to audition, then we might hear as of things we can see or experience with other senses. This is related to one puzzling question about hearing sound sources: How could you hear as of something you could see? Rather than just a claim about audition’s content that requires further explanation, we now have a story about why things like sound sources figure in the content of auditory experience. Second, all of this may bear on how to delineate what counts as auditory perception, as opposed to visual or even amodal perception. If hearing is systematically impacted by visual processes, and if it shares content and phenomenology with other sense experiences, what are the boundaries of auditory perception? Multimodal perception may bear on the question of whether there are clear and significant distinctions among the sense modalities (cf. Nudds 2003). Finally, multimodal perceptual experiences, illusions, and explanatory strategies may illuminate the phenomenological unity of experiences in different modalities, or the sense in which, for instance, an auditory experience and a visual experience of some happening comprise a single encompassing experience (see the entry on the unity of consciousness).

We can ask questions about the relationships among modalities in different areas of explanatory concern. Worthwhile areas for attention include the objects, contents, and phenomenology of perception, as well as perceptual processes and their architecture. Crossmodal and multimodal considerations might shed doubt on whether vision-based theorizing alone can deliver a complete understanding of perception and its contents. This approach constitutes an important methodological advance in the philosophical study of perception (for further discussion, see O’Callaghan 2012, 2019, Matthen 2015, Stokes et al. 2015).


T[h]e [ear] of the duck

– THEORISING SOUND IMAGERY IN PSYCHOLOGY

Michael A. Forrester. Department of Psychology, Keynes College, Canterbury

Abstract: The study of sound in psychology has been dominated by the auditory perception view of psycho-acoustics. This paper considers the nature of the relationship between sound as event and associated processes of imagery, imagination and memory. Through a consideration of sound(s) as ecological event(s), the role of sound in film and radio, and our earliest experiences of sound as language, the discussion centres on whether psychology can contribute to our understanding of sound imagery. Concluding comments touch on the observation that when hearing a sound, our imagination often plays an important part in recognising what it might be.

* Sections of this paper are to appear in a forthcoming book ‘Psychology of the Image’ published by Routledge.

AUDITORY PERCEPTION AND SOUND AS EVENT: THEORISING SOUND IMAGERY IN PSYCHOLOGY

Within psychology the study of sound falls under the umbrella term ‘auditory perception’ where the research focus is centred upon the presumed relationships between the psychophysics of sound and associated cognitive processes of recognition and interpretation. While the benefits of such an approach can be identified in certain specific applied areas, such as in neuropsychology, it can be argued that there remains something of a theoretical vacuum in our understanding of the relationship between hearing sound and the images or imagery that is conjured up by our experience. This paper asks whether psychology can develop a theoretical outlook which moves beyond the ‘stimulus driven’ orientation of the traditional approach, an orientation which helps highlight the role of imagery in our everyday perception of sound(s) as event.

The emphasis on the visual in Western culture makes it difficult for those not visually impaired to recognise that the world of sound is an event-world while the world of sight is an object world (Ong, 1971). Reflecting on the relationship between sound and imagery provokes the observation that ours is a visually dominant representational culture. There is no reason to believe, however, that sound perception is any less complicated than visual perception, where the relationship between perception and discursive representations of perceptual experience remains philosophically problematic (Sharrock and Coulter, 1998). Although we understand scientific descriptions of auditory perception, phenomenally we don’t ‘hear’ acoustic signals or sound waves, we hear events: the sounds of people and things moving, changing, beginning and ending, forever interdependent with the dynamics of the present moment. We ‘hear’ the sound of silence.

From an evolutionary perspective sound has at least two distinct qualitative dimensions: one nurturing, supportive and indicative of comfort, care and safety; the other dissonant, disruptive and likely to provoke anxiety. Nurturing sounds might include blood flow (from our time in the womb), rhythm, intonational prominence and all those many sounds associated with the presence of others involved in our care. The preference new-born infants display for their own mother’s voices has been well documented (DeCasper and Fifer, 1980). Parents in many cultures spontaneously produce ‘baby-talk’ when soothing infants, a form of speech characterised by rhythmic intonational patterns and short sentences, often spoken softly (Snow and Ferguson, 1977). In adult life the beneficial effect of meditative or calming mood music is promoted as an aid to reducing stress, and sufferers of insomnia know the value of listening to music or a late-night radio discussion show in order to lull themselves to sleep. The inherent rhythm to the sound of speech can have a comforting or soothing effect on us when we’re anxious (although not all the time, e.g., Baker et al., 1993).

In contrast, it makes evolutionary sense that we are highly sensitive to those sounds that might indicate the presence of potential predators, not dissimilar to our keen visual sensitivity to the detection of movement in peripheral vision. Some sounds appear to be intrinsically appealing and pleasurable, others discomforting and annoying. We are very easily disturbed by loud and disruptive noises. In particular, sounds in our environment which presuppose danger in some way, e.g., screeching car-tyres from behind as we walk on the highway, are exceptionally attention grabbing, and for good reason. In what sense, however, do we ‘imagine’ the cause of the sound or the sound-event? When woken in the night by a scratching noise we might quickly decide that we are listening to the sound of a mouse or rat under the floorboards or behind the wall. But consider: it is on hearing the noise that we then imagine that the sound is the kind of noise a rodent might make when scraping or scratching around for food. Our knowledge of such sounds has come from the cultural repertoire of all those available imaginable sounds, i.e., we don’t in reality have to have seen a rat or mouse making such a sound; a great deal of our knowledge comes from the available cultural discourses about sounds and their causes. Again, in the same way that visual perception of an event is interdependently linked with labels, names and discourses about that event, so it is for sound. We might even say that there is no such thing as silence, except an imaginary silence – a pure, abstract absence of sound; arguably we cannot jump out of our discursive representational knowledge of sound into a ‘soundless’ void.

Here, I want to begin by comparing the traditional approach to sound (auditory) perception within psychology with more recent attempts inspired by Gibson’s (1979) realist metaphor, and which focus on sound as event. After some discussion on the differences between these approaches, I then consider the relationship between sound, affect and our earliest experiences, followed by a look at specific contexts where sound effects are deliberately manipulated in service of the imagination, e.g., film and radio. Reflecting on our response to sound in such contexts provokes a brief look at the role of affect and sounds that evoke particular meaning or significance for us. By way of conclusion, towards the end of the paper a number of comments are made regarding the cultural basis of auditory perception, i.e., sound as ‘meaning and event’ within a particular social-discursive context.

SOUND AS PSYCHOPHYSICAL OBJECT

Psychology studies the nature of sound as the psychophysics of wave form analysis. The essential focus is on the nature of the computation said to take place as a result of sound waves creating vibrations in our eardrums. In line with other areas of sensory perception, the more dominant theories of auditory perception focus on how the cognitive system constructs appropriate auditory representations, that is, given the potentially confusing, degraded or redundant information made available to the ears. In light of the observation that sound waves from any source will reach each ear at a different time, the question of how sound is located is normally framed within a ‘deprivation’ model. The established practice of viewing auditory perception in terms of sound waves underlies the rather curious image we have where humans can only ‘hear’ sounds within a certain frequency range, while dogs, bats, porpoises and other mammals are able to hear much higher frequencies. As sound wave frequency increases, pitch increases, providing the template for Western musical scales, and interestingly one of the earliest theories of pitch perception (pitch is described as the prime quality of sound measured on a scale of high to low) proposed that the ear contained a structure formed like a stringed instrument:

Different parts of this structure are tuned to different frequencies, so that when a frequency is presented to the ear, the corresponding part of the structure vibrates, just as when a tuning fork is struck near a piano, the piano string that is tuned to the frequency of the fork will begin to vibrate. This idea proved to be essentially correct; the structure turned out to be the basilar membrane, which, unlike a set of strings, is continuous (Atkinson et al., 1990: 143).

Even such a cursory examination of the images, metaphors and ideas informing current theory in auditory perception reminds us that the scientific study of sound is linked in a very particular way with what is said to constitute, subjectively, our perception of sound events in the first place. Consider, for example, what must influence the calibration of any instrument for measuring the intensity of sounds in decibels (Table 1).
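To make the point about calibration concrete, here is a minimal sketch of how a sound pressure level in decibels is conventionally computed. The scale is anchored to a reference pressure of 20 micropascals, a value chosen to approximate the quietest sound a typical young listener can detect, so the subjective threshold of hearing is built into the physical measurement. The function name `spl_db` and the constant name `P_REF` are illustrative choices, not taken from the paper, and the pressures used in the usage note are hypothetical values.

```python
import math

# Conventional reference pressure for airborne sound: 20 micropascals.
# It approximates the human threshold of hearing, so 0 dB is a perceptual anchor.
P_REF = 20e-6  # pascals


def spl_db(pressure_pa: float) -> float:
    """Return the sound pressure level in dB for an RMS pressure in pascals."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    # A tenfold increase in pressure adds 20 dB.
    return 20.0 * math.log10(pressure_pa / P_REF)
```

Calling `spl_db(20e-6)` gives 0 dB, the level assigned to the reference pressure itself, and a pressure a thousand times larger (`spl_db(0.02)`) comes out at about 60 dB.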


Auditory Perception

Auditory Perception: A New Synthesis focuses on the effort to show the connections between key areas in hearing. The book offers a review of classical problems, and then presents interpretations and evidence of this topic. A short introduction to the physical nature of sound and the way sound is transmitted and changed within the ear is provided. The book discusses the importance of being able to identify the source of a sound, and then presents processes in this regard. The text provides information on the organs involved in the identification of sound and discusses pitch and infrapitch and the manner by which their loudness can be measured. Scales are presented to show the loudness of sound. The relationship of hearing with other senses is also discussed. The text also outlines how speech is produced, taking into consideration the organs involved in the process. The book is a valuable source of data for research scientists and other professionals who are involved in hearing and speech.


The writing of this paper was supported by National Science Foundation grant BCS1026023 and a summer research stipend from the College of Liberal Arts at the University of Nevada Las Vegas awarded to Joel S. Snyder, and a Canadian Institute for Health Research grant awarded to Claude Alain.

Keywords: auditory scene analysis, multistability, change deafness, informational masking, priming, attentional blink

Citation: Snyder JS, Gregg MK, Weintraub DM and Alain C (2012) Attention, awareness, and the perception of auditory scenes. Front. Psychology 3:15. doi: 10.3389/fpsyg.2012.00015

Received: 14 October 2011 Paper pending published: 07 November 2011
Accepted: 11 January 2012 Published online: 07 February 2012.

Alexander Gutschalk, Universität Heidelberg, Germany
Hirohito M. Kondo, Nippon Telegraph and Telephone Corporation, Japan

Copyright: © 2012 Snyder, Gregg, Weintraub and Alain. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.


Perception *Psychology*

FIGURE/GROUND: Objects tend to stand out from a background. There is also something called Reversible Figure/Ground, where the background can instead stand out as the object. In other words, the figure and the ground switch roles.

GROUPING:
SIMILARITY: We group elements that look alike. Say a grid alternates circles and squares down each column (e.g. circle, square, circle, square). Because of similarity, you see it as horizontal rows of like shapes rather than as columns (e.g. ROW 1: circle, circle, circle. ROW 2: square, square, square).

PROXIMITY: We perceive elements that are closer together as grouped together. As a result, we tend to see pairs of dots rather than a row of single dots.

CONTINUATION: We tend to see elements as following a smooth, continuing path. When there is an intersection between two or more objects, people tend to perceive each object as a single uninterrupted object. This allows differentiation of stimuli even when they overlap visually. We have a tendency to group and organize lines or curves that follow an established direction over those defined by sharp and abrupt changes in direction.

CLOSURE: We usually group elements to form enclosed or complete figures rather than open ones. We tend to ignore the breaks and concentrate on the overall form.

COMMON FATE: We tend to group together objects that move together. For example, birds may be distinguished from their background as a single flock because they are moving in the same direction and at the same velocity, even when each bird is seen, from a distance, as little more than a dot. The moving 'dots' appear to be part of a unified whole.

SIMPLICITY: When we observe a pattern, we perceive it in the most basic, straightforward manner that we can. For example, there is a picture of a diamond with two lines connected to the outside corners. People could perceive it as that, or a W and an M, or some could see a dinner plate with silverware.

(DONE WITH GROUPING SECTION)
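
The proximity principle above can be sketched in a few lines of code. This is only an illustrative toy, not a model of the visual system: the function name `group_by_proximity` and the `max_gap` threshold are assumptions made up for this example. Dots on a line are put in the same group whenever the gap to the previous dot is small enough.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D dot positions into groups wherever a gap exceeds max_gap.

    Hypothetical helper for illustration only; max_gap is an assumed threshold.
    """
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap (or first dot): start a new group
    return groups


# Six dots spaced as three close pairs
dots = [0, 1, 4, 5, 8, 9]
print(group_by_proximity(dots, max_gap=1.5))  # [[0, 1], [4, 5], [8, 9]]
```

With unevenly spaced dots the sketch returns pairs, which mirrors the note that we see pairs of dots rather than a row of single dots. With evenly spaced dots and the same threshold, every dot ends up alone or all together, so no pairs emerge.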

CONTOURS: Boundaries between the figure and the ground that give it shape and meaning (camouflage). There are also subjective contours, which have no distinguishable physical boundary. (We see boundaries that aren't there.)

CONTEXT: The setting in which the object occurs. The setting creates expectations about the object, and those expectations shape how you perceive it.

CONSTANCY: We perceive an object's brightness, color, shape, and size as staying the same, even though the retinal image changes.

-If perception were based primarily on breaking down a stimulus into its most basic elements, understanding the sentence, as well as other ambiguous stimuli, would not be possible. The fact that you were probably able to recognize such an imprecise stimulus illustrates that perception proceeds along two different avenues, called top-down processing and bottom-up processing.

Top-down Processing: Perception is guided by higher-level knowledge, experience, expectations, and motivations. You were able to figure out the meaning of the sentence with the missing letters because of your prior reading experience and because written English contains redundancies. Top-down processing is illustrated by the importance of context in determining how we perceive objects. However, it cannot occur on its own. Even though it allows us to fill in the gaps in ambiguous and out-of-context stimuli, we would be unable to perceive the meaning of such stimuli without bottom-up processing.

Bottom-up processing: Consists of recognizing and processing information from the individual components of a stimulus and moving to the perception of the whole. We would make no headway in our recognition of the sentence without being able to perceive the individual shapes that make up the letters. Some perception, then, occurs at the level of the patterns and features of each of the separate letters.
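
The interplay of the two avenues can be illustrated with a toy sketch of reading a word with missing letters. Everything here is an assumption for illustration: the tiny `VOCABULARY` list stands in for prior reading experience (top-down), the visible letters in the fragment stand in for the sensory input (bottom-up), and `_` marks a missing letter. It is not a model of how the brain actually does this.

```python
import re

# Assumed stand-in for the reader's prior knowledge (top-down information)
VOCABULARY = ["perception", "attention", "sensation", "reception"]


def candidates(fragment, vocabulary=VOCABULARY):
    """Return known words consistent with the visible letters in fragment.

    '_' marks a missing letter; visible letters are the bottom-up evidence.
    """
    pattern = re.compile(
        "^" + "".join("." if ch == "_" else re.escape(ch) for ch in fragment) + "$"
    )
    return [word for word in vocabulary if pattern.match(word)]


print(candidates("p_rc_pt__n"))  # ['perception']
print(candidates("_e_e_tion"))   # ['reception']
print(candidates("_________"))   # ['attention', 'sensation', 'reception']
```

With some letters visible, prior knowledge narrows the gaps down to one word. With no letters visible at all (the last call), the vocabulary alone cannot decide, which matches the point that top-down processing cannot occur on its own.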


Q&A with the Editors of Auditory Perception & Cognition

(MH)Discussions about the journal stemmed from a long-standing satellite meeting of the Psychonomic Society called The Auditory Perception Cognition and Action Meeting, which brings together researchers from different theoretical perspectives who study different levels of processing in auditory perception and cognition. This meeting has grown into a core group of researchers who want to reach beyond their narrowly defined research areas.

The need for the journal became apparent from the fact that there was an exodus of auditory research from what were traditionally considered general perceptual and cognitive journals that would publish auditory work. We collected some data and found that the editorial boards were changing in complexion. Simultaneously, a bunch of auditory work was showing up in journals that were targeted at a more narrow audience. We felt that was bad for auditory science. The way to address this problem is to create an outlet where you bring research areas together without restrictions concerning levels of processing, particular theoretical approaches, or the types of stimuli.

(MR)To expand on that, one of the problems currently facing auditory researchers is where to publish.

You go with the venue where you have the highest readership, but then you’re publishing one auditory research article among 10 or more vision articles. The end result is that you’re publishing in a journal that’s not really devoted to auditory research, and the repercussion could be that your manuscript does not get much attention.

On the flipside, you can publish in a journal that is specific to one area of auditory research. Now your manuscript is among other articles related to auditory research. But the problem has not changed. Your research is in a journal that has a limited audience, and again, you’ve reduced readership. One thing I believe we all want is to have our manuscripts read by the largest audience possible.

One of the core aims of Auditory Perception & Cognition is to have a venue where all auditory researchers can come together and publish their research in a journal that has a high degree of readership. That’s the thing that we all are striving for. We are doing great research and we want as many people as possible to know about it.

(MH)The one thing I’d add is a shared desire to reach a readership that would help researchers realise the general implications of their work.

(MR)I completely agree. Individuals with different theoretical backgrounds and methodological backgrounds all come together reading research that can influence them. It just gives you this big-picture perspective, which doesn’t exist currently with other journals.

What sets Auditory Perception & Cognition apart from its peers in the broader field of cognitive auditory work?

(MH)We don’t see another journal that brings together auditory work in a very general sense. Most auditory journals cater to a specific level of processing, for example either low-level perceptual processing or cognitive processing. Furthermore, a more general journal might only publish one or two auditory articles per issue, and much of the readership might not be focused on auditory work. The combined specificity of an auditory focus and great breadth in content is what really sets it apart.

(MR)One thing that sets this journal apart is that it is not just limited to cognition. It’s basic psychology. It’s applied psychology. It’s language. It’s music. It’s psychophysics. There are no limits in terms of the content area that we accept, on theoretical orientation, or for that matter, word- or page count.

(MH)We’re thinking in terms of how the system works. Like many auditory researchers, I was trained to think that cognition begins where perception ends. As a result, if people ask me if I’m a cognitive psychologist, I say “yes”; if they ask me if I’m a perceptual psychologist, I say “yes”. If these processes do fall along a continuum, then why continue to break up these pieces of the puzzle into different journals? We should have them in one place.

(MR)Exactly, and I’m completely the opposite of you based on my theoretical background. I’m a Gibsonian, an ecological psychologist. My field literally cuts out cognitive psychology: there’s perception and there’s action, and that’s the only thing we have to consider. When somebody says, “Are you a cognitive psychologist?” I say no. We don’t have to do that. If we get our science right, then we can just understand perception and action and they’re related. As Michael said, that’s the big picture. We have to consider all the pieces together, and that’s the beauty of AP&C.


How would you characterise the work you’re looking to publish in the journal?

(MH)There is a wide array of content areas that are considered by the journal. Submissions can be distinguished by the type of stimuli or methods employed. We accept behavioural work. We accept neuroscience. We’re interested in research on music perception and cognition, concerning either lower-level or higher-order explanations of speech and language processing, environmental noise processing, both basic and applied research projects, as well as animal work that has implications for processing in humans.

(MR)We need a venue where different researchers from different theoretical backgrounds and with different subject matter can all come together to determine the best way to serve science and make progress in the field.


So in terms of the article types that you’re accepting, there are no restrictions there?

(MR)There are no restrictions on any kind of manuscript we consider for publication: literature reviews, theoretical articles, empirical articles, or brief reports.

(MH)It could re-evaluate existing work, or reflect a new empirical focus that could point out bigger theoretical issues.

(MR)We’re not limited to purely auditory research, but are interested in cross-modal research as well. We can’t consider how we think, perceive and act in the world in terms of just one sensory system, in terms of one modality.

(MH)For instance, speech processing is not solely auditory. In conversation we also rely on the visual system. How can you understand one aspect of speech perception fully without considering another with which it interacts?

(MR)To understand why we think and act the way that we do, we have to consider the individual as a single system that is composed of multiple sensory systems, multiple sources of information, different cognitive processes, etc. That’s the beauty of AP&C.


What advice would you give to someone who was thinking about submitting to the journal?

(MH)Do your work the way in which you intend it to be done and seen. We just want the work to be good no matter how it's presented. This should liberate researchers to concentrate on doing the job that they think should be done.

(MR)I completely agree. One of the best parts about AP&C is the lack of limitations or restrictions. Researchers have the opportunity to delve deep into what they found and what they’re trying to express to readers. I would say it’s an opportunity for authors to convey their thoughts in the best way possible.

(MH)Also, researchers should be aware that this is a hybrid journal that has both a traditional subscription component and an open-access option. The journal far undercuts the typical cost of open-access publication fees, thereby making it accessible to a wider variety of researchers. Furthermore, if authors become members of our sister organisation, the Auditory Perception & Cognition Society, then they can get those fees reduced to the point that they could publish four open-access articles in AP&C for the cost of one article in some other outlets. The review process is the same regardless of whether the article is ultimately destined for open-access or a subscription issue, and that decision is made by the author at the conclusion of the review process.


What are the emerging topics and themes that you’re expecting to see in the field in the next two to three years?

(MR) I am sure everybody in psychology, as well as in all the other scientific fields, is going to agree that our fields are constantly changing. They’re morphing, new areas are being developed, and other areas are going by the wayside. One thing I think that we’re moving in the direction of is the bigger picture. We used to think about cognition as being one very separate field and perception being another field and action being another field. We used to be incredibly specialised.

I think we’re going to move, in due time, to the bigger global picture of seeing all of these areas as being intimately interconnected. So as we’re moving in that particular direction, we’re going to find that specialisation is no longer always the best way to be thinking about things or to be doing research, and thus, isn’t the best way to develop a true theory about how we think and act.

In AP&C we already have a journal in place that allows the field to progress in whatever way it’s going to progress without having to worry about changing its name or its aim.

(MH)Any time I’ve ever thought that I could anticipate where the field’s going, ultimately my anticipation was incorrect. That’s part of the beauty of it, right? We need publications that permit us to see where connections are possible.

For example, I am seeing more instances of people doing cross-stimulus comparisons, such as between speech and non-speech processing for analogous conditions. Are these acting as shared resources or separate domains? I would expect this journal to provide an excellent forum for having those kinds of conversations.

(MR)As the field is emerging, new topics and avenues of research will arise and some will fall by the wayside. Despite that, AP&C is still going to exist. We are not limited to a particular theoretical approach nor are we limited to a particular domain of research or a particular methodological approach. We don’t have to change our aims, because the aim of this journal is simply to truly understand how it is that we perceive and act in the world based on sound.


Imagination and mind wandering

Mario Villena-González, Diego Cosmelli, in Creativity and the Wandering Mind, 2020

The role of sensory and memory cortices in thoughts

When we remember something or someone, e.g., the face of a loved person, there is the subjective impression of “seeing with the mind's eye.” The phenomenological similarity between visual imagery and visual perception was noted long ago and suggests that at least some of the same neural bases underlie both processes.

Studies in mental imagery have shown that visual imagery and visual perception share specialized brain areas (Kosslyn et al., 2001). Results have even shown that visual imagery activates most of the same areas that visual perception does, although some sensory processes may be engaged differently (Ganis, Thompson, & Kosslyn, 2004). Even when the content is specific, as in the case of a face compared with an object, the content-related activation in the ventral extrastriate visual cortex follows the same patterns when people are imagining faces or objects compared with when they are being perceived (Ishai, Ungerleider, & Haxby, 2000).

In the case of auditory imagery, neuroimaging studies have shown that, during inner speech and auditory imagery, the same brain regions related to auditory perception are activated, but visual areas show no activation (McGuire et al., 1996; Shergill et al., 2001).

The abovementioned evidence shows that thoughts use different cortical processing resources depending on their modality. Therefore, sensory cortices are actively involved in thoughts, but the question about how they are integrated to give rise to what we experience as a coherent mental representation in the form of imagery or mind wandering is something that only recently is starting to be elucidated.

In the same way that sensory cortices are involved in self-generated activity, memory systems also play an important role in the generation, maintenance, and manipulation of mental representations during mental imagery and mind wandering.

In the case of mental imagery, an important finding has shown that imagining events in the future shares neural substrates with remembering past events (Addis et al., 2007). This finding that a common brain network underlies both memory and imagination may well depend on temporal orientation or other nontemporal factors, such as the subprocesses of scene construction, taking the perspective of another person, or spatial navigation (Schacter et al., 2012). In any case, memory integration is key in highly integrated states such as mental imagery (Schlichting & Preston, 2015). Likewise, the generation of mental representations during self-generated thoughts relies on both episodic and semantic memories (Wang, Yue, & Huang, 2016), whereas working memory is important for the maintenance of these processes (Levinson, Smallwood, & Davidson, 2012).

Therefore, beyond future thinking, the hippocampus has an important role in scene construction, which is a pillar of mental imagery in general (Palombo, Hayes, Peterson, Keane, & Verfaellie, 2018). Besides, memory transformation is also important, and studies have found that this process involves the posterior hippocampus (connected to perceptual and spatial representational systems in the posterior neocortex) and the anterior hippocampus (connected to conceptual systems such as medial prefrontal cortex) (Sekeres, Winocur, & Moscovitch, 2018).

In the case of spontaneous mind wandering, the hippocampus has been shown to be a key structure in the neural architecture of mind wandering, shaping the phenomenology of self-generated thoughts. One study examined the frequency and phenomenology of mind wandering in patients with selective bilateral hippocampal damage. It found that hippocampal damage changed the form and content of mind wandering from flexible, episodic, and scene based to abstract, non-scene based, and verbal (McCormick, Rosenthal, Miller, & Maguire, 2018).

In the following, we will discuss the role of the DMN in the integration of sensory information, semantic and episodic memory, and processes of scene construction to generate self-generated thoughts.


Psychology of auditory perception

Audition is often treated as a ‘secondary’ sensory system behind vision in the study of cognitive science. In this review, we focus on three seemingly simple perceptual tasks to demonstrate the complexity of perceptual–cognitive processing involved in everyday audition. After providing a short overview of the characteristics of sound and their neural encoding, we present a description of the perceptual task of segregating multiple sound events that are mixed together in the signal reaching the ears. Then, we discuss the ability to localize the sound source in the environment. Finally, we provide some data and theory on how listeners categorize complex sounds, such as speech. In particular, we present research on how listeners weigh multiple acoustic cues in making a categorization decision. One conclusion of this review is that it is time for auditory cognitive science to be developed to match what has been done in vision in order for us to better understand how humans communicate with speech and music. WIREs Cogni Sci 2011 2 479–489 DOI: 10.1002/wcs.123