Auditory Processing


Sound is encoded as trains of action potentials by the cochlea and transmitted along the eighth nerve to the cochlear nucleus. Central auditory processing exemplifies several general principles of sensory neuronal conduction and control. Information is processed both in parallel, with subdivisions of the auditory pathway adapted for specialised feature extraction, and hierarchically. Because sound is inherently temporal, the preservation of timing information during auditory processing is of fundamental importance. Timing cues are especially important for encoding the horizontal position of a sound source. Interaural timing differences are encoded by brainstem nuclei with specialised neuronal features. Characteristics of the auditory system have also evolved to encode features present in social vocalisations. The complex interaction between excitation and inhibition in the auditory midbrain enables selective encoding of particular vocalisations by different populations of neurons. Thus, auditory processing has evolved such that behaviourally relevant sounds are efficiently encoded.

Key Concepts

  • The fundamental organising principle of the auditory system is tonotopy, which is a systematic representation of frequency that starts at the level of the basilar membrane.
  • The ascending auditory pathway comprises many nuclei that enable both parallel and serial information processing.
  • The inferior colliculus (the main auditory midbrain nucleus) is the main point of convergence of ascending and descending information.
  • Interaural timing differences are encoded in the medial superior olive (in mammals) and depend on coincidence detection, delay lines and specialised neuronal characteristics.
  • Interaural level differences are encoded in the lateral superior olive (in mammals) and depend on interactions of excitation and inhibition.
  • The barn owl is an exceptional animal that can precisely localise sound in the horizontal and vertical plane by using interaural timing and interaural level differences.
  • A fundamental function of the auditory system is to detect, discriminate and categorise complex sounds such as social vocalisations.
  • Neurons in the inferior colliculus are selective to vocalisations such that a unique neural representation occurs for different vocalisations.
  • The complex interplay between excitation and inhibition in the inferior colliculus results in heterogeneous responses to vocalisations.
  • The mouse auditory system has evolved to utilise cochlear distortions created by overlapping ultrasonic frequencies to encode social vocalisations that have spectral content much higher than the tuning properties of the neurons.
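
The last key concept can be illustrated with simple arithmetic. When two tones f1 < f2 overlap, a nonlinear cochlea generates distortion products, the best known being the quadratic difference tone f2 − f1 and the cubic difference tone 2f1 − f2. The sketch below is only an illustration: the function names, the example tone pair and the 50 kHz tuning limit are assumptions chosen for the example, not values reported for any particular neuron.

```python
def distortion_products(f1_khz, f2_khz):
    """Return the two lowest-order distortion products (kHz) of a tone pair."""
    low, high = sorted((f1_khz, f2_khz))
    return {
        "f2-f1": high - low,        # quadratic difference tone
        "2f1-f2": 2 * low - high,   # cubic difference tone
    }


def falls_within_tuning(freq_khz, upper_limit_khz=50.0):
    """True if a frequency lies below the (illustrative) upper tuning limit."""
    return 0.0 < freq_khz < upper_limit_khz


# Two overlapping ultrasonic components (hypothetical values).
products = distortion_products(60.0, 80.0)
for name, freq in products.items():
    print(name, freq, "kHz; within tuning range:", falls_within_tuning(freq))
```

Neither 60 kHz nor 80 kHz would fall inside the assumed tuning range, but both distortion products do, which is the sense in which distortions could carry information about vocalisations whose spectral content lies above neuronal tuning.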

Keywords: hearing; auditory pathway; sound localisation; vocalisation processing; frequency representation

Figure 1. (a) Section through the human head showing auditory canal, middle ear and inner ear. (b) The cochlea is a fluid‐filled tube coiled into a three‐turn spiral. (c) In cross‐section it has three compartments, with the hair cells arranged in rows running the length of the cochlea within the organ of Corti. Sound vibrations cause displacement of hair cell stereocilia relative to the overlying tectorial membrane.
Figure 2. Scanning electron micrograph of the organ of Corti with the tectorial membrane removed. Three rows of outer hair cells and the single row of inner hair cells can be seen, which extend for the length of the cochlea. Each hair cell is capped by a cluster of stereocilia; small movements of these ‘hairs’ are responsible for sensing sound vibrations. Inner hair cells sense sound and transmit this information via the type I spiral ganglion afferents; the outer hair cells transmit information via smaller type II afferent fibres and are thought to serve as the cochlear amplifier. Micrograph kindly provided by Prof. David Furness, Keele University, UK.
Figure 3. The encoding of sound by the cochlea. (a) Sensory transduction. This figure shows the sequence of events leading to the transduction of sound into a train of action potentials. Sound enters the auditory canal, moving the tympanic membrane, which vibrates the three ossicular bones in the middle ear. The stapes transmits this amplified signal to the oval window in the cochlea. The resulting travelling wave displaces the basilar membrane and causes bending of the inner hair cell (IHC) stereocilia. This movement changes the activity of ion channels, causing depolarisation or hyperpolarisation of the hair cell membrane potential, which in turn modulates calcium influx and transmitter release onto the type I spiral ganglion cell afferents. The resultant action potentials propagate into the cochlear nucleus via the eighth nerve. (b) Tonotopy. An uncoiled cochlea is shown with two travelling waves represented above. A high‐frequency (orange) sound resonates the base of the cochlea and generates action potentials in the spiral ganglion cells projecting to that region. A low‐frequency sound (green) activates a region of the cochlea close to the apex and this stimulates a different group of afferents. The action potential response is similar in both fibres (below), but the different cochlear positions of the hair cells that respond to a given frequency of sound allow this information to be encoded into a particular set of afferent fibres. This tonotopic relationship is preserved at many levels of auditory processing. (c) Characteristic or best frequency. The tonotopic relationship means that each afferent fibre responds best (with the lowest threshold) to sound of a given frequency, its characteristic frequency. (d) The volume of a sound is thought to be encoded by the number of action potentials. This probably includes action potentials both within and around a particular frequency range; that is, as the volume increases, additional afferents are recruited (volley principle).
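
The place‐to‐frequency relationship described in panel (b) is often summarised with the Greenwood function, f = A(10^(ax) − k), where x is the fractional distance along the basilar membrane from apex (0) to base (1). The sketch below uses the constants commonly quoted for the human cochlea (A = 165.4, a = 2.1, k = 0.88); these values and the function name are assumptions brought in for illustration and do not come from the figure itself.

```python
def greenwood_frequency_hz(x, A=165.4, a=2.1, k=0.88):
    """Characteristic frequency (Hz) at fractional distance x from the apex.

    x = 0 is the apex (low frequencies), x = 1 is the base (high frequencies).
    Default constants are the commonly quoted human values (an assumption here).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (apex) and 1 (base)")
    return A * (10 ** (a * x) - k)


# Sample the map at five evenly spaced positions along the cochlea.
for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {position:.2f}: {greenwood_frequency_hz(position):8.0f} Hz")
```

With these constants the map runs from roughly 20 Hz at the apex to roughly 20 kHz at the base, and equal steps along the membrane correspond to approximately equal ratios of frequency over most of its length.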
Figure 4. Ascending auditory pathway. The three levels of central auditory processing are illustrated. The brainstem auditory pathway consists of the cochlear nucleus and superior olivary complex; the midbrain contains the inferior and superior colliculi; and the auditory thalamic relay (the medial geniculate nucleus) projects to the primary auditory cortex (AI) and the surrounding secondary auditory centres. Abbreviations: CN, cochlear nucleus; DCN, dorsal cochlear nucleus; aVCN, anteroventral cochlear nucleus; DAS, dorsal acoustic stria; IAS, intermediate acoustic stria; SOC, superior olivary complex; LSO, lateral superior olive; MNTB, medial nucleus of the trapezoid body; MSO, medial superior olive; PON, periolivary nuclei; LL, lateral lemniscus; DNLL, dorsal nucleus of the lateral lemniscus; INLL, intermediate nucleus of the lateral lemniscus; VNLL, ventral nucleus of the lateral lemniscus; IC, inferior colliculus; CIC, central nucleus of the inferior colliculus; BIC, brachium of the inferior colliculus; DCx, dorsal cortex of the inferior colliculus; LN, lateral nucleus of the inferior colliculus; SC, superior colliculus; MGN, medial geniculate nucleus; dMGN, mMGN, vMGN, dorsal, medial and ventral MGN, respectively; AI, primary auditory cortex; Py, pyramids; Tz, trapezoid body.
Figure 5. The cochlear nucleus. The cochlear nucleus (CN) is located on the lateral edges of the brainstem, below the cerebellum; this diagram shows the nucleus as viewed from the side. The spiral ganglion axons that carry the input from the cochlea enter the CN via the eighth nerve. The axons maintain their tonotopic relationship and bifurcate into anterior‐ventral and rostrodorsal pathways. The dorsal cochlear nuclei (DCN) and ventral cochlear nuclei (VCN) form the first stage of central processing of the auditory input. The DCN is a laminar structure concerned with the spectral properties of a sound, while the VCN provides the major inputs to the binaural brainstem pathways concerned with sound localisation. The locations of the principal cells are indicated by the symbols and by the dashed lines.
Figure 6. Cells of the anteroventral cochlear nucleus have characteristic morphologies and firing properties. Firing properties are depicted here as post‐stimulus time histograms (PSTH). These represent the rate of firing plotted against time from the start of a sound stimulus (as indicated by the filled bar). (a) The primary afferents or spiral ganglion axons give a short burst of action potentials at the start of a sound and then a sustained discharge for the duration of the sound. This is known as a ‘primary’ response pattern. (b) Bushy cells are so called because they have just a few dendrites, which tend to form a bush‐like appearance. They receive their primary afferent input from the endbulbs of Held and follow the firing pattern of the primary afferents very closely; hence, they are often referred to as having a ‘primary‐like’ firing pattern. (c) Stellate or multipolar cells have a more conventional neuronal appearance; they respond to a tone with short bursts of action potentials with distinct, precisely timed pauses, which are called ‘chopper’ responses. (d) Octopus cells possess thick sparsely branched dendrites that tend to originate from one pole of the soma. They fire predominantly at the onset of a stimulus.
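
A post‐stimulus time histogram of the kind described in this caption can be computed by counting spikes in fixed time bins after stimulus onset and averaging across repeated presentations. The sketch below is a minimal illustration; the function name, the bin width and the spike times are all hypothetical and are not data from the figure.

```python
def psth(trials, bin_ms, window_ms):
    """Mean firing rate (spikes/s) per time bin, averaged over trials.

    trials    : list of trials, each a list of spike times in ms after onset
    bin_ms    : width of each histogram bin in ms
    window_ms : analysis window in ms, starting at stimulus onset
    """
    if not trials:
        raise ValueError("at least one trial is required")
    n_bins = int(window_ms // bin_ms)
    counts = [0] * n_bins
    for spike_times in trials:
        for t in spike_times:
            if 0 <= t < n_bins * bin_ms:
                # Clamp guards against floating-point edge cases at the last bin.
                counts[min(int(t // bin_ms), n_bins - 1)] += 1
    # Convert counts to spikes per second per trial.
    scale = 1000.0 / (bin_ms * len(trials))
    return [count * scale for count in counts]


# Three hypothetical trials from an onset-dominated unit.
example_trials = [[1.0, 2.5, 12.0], [0.8, 3.1], [1.5, 22.0]]
rates = psth(example_trials, bin_ms=5.0, window_ms=25.0)
print([round(rate, 1) for rate in rates])
```

In this made‐up example most spikes fall in the first bin, giving the strong onset peak that the caption associates with onset responders; a chopper unit would instead show regularly spaced peaks separated by pauses.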
Figure 7. The brainstem binaural auditory pathway. This diagram shows a transverse section of the brainstem at the level of the seventh nerve. The eighth nerve enters the aVCN and excites the bushy cells. The globular bushy cells send a large diameter axon that crosses the brainstem in a tract called the trapezoid body. The terminals of these axons form the calyx of Held giant synapses on the cell bodies of principal cells in the medial nucleus of the trapezoid body (MNTB). The MNTB, in turn, gives an inhibitory projection to both the medial superior olive (MSO) and the lateral superior olive (LSO). The MSO receives a bilateral excitatory input from the spherical bushy cells and is responsible for interaural time difference computation. The LSO receives an excitatory input from the ipsilateral bushy cells, which is integrated with the inhibitory input from the MNTB as the first stage of interaural level difference computation.
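
The integration of ipsilateral excitation with MNTB‐derived inhibition in the LSO can be caricatured as a subtraction. The toy model below is a deliberately crude sketch: firing rate rises with the level at the ipsilateral ear and falls with the level at the contralateral ear. The function name, gain, baseline and ceiling are hypothetical parameters chosen for illustration, not measured properties of LSO neurons.

```python
def lso_rate(ipsi_db, contra_db, gain=2.0, baseline=20.0, max_rate=100.0):
    """Toy LSO output (spikes/s) from the sound level at each ear (dB SPL).

    Ipsilateral input is excitatory; contralateral input, relayed through the
    MNTB, is inhibitory. All parameter values are illustrative assumptions.
    """
    ild_db = ipsi_db - contra_db          # interaural level difference
    drive = baseline + gain * ild_db      # excitation minus inhibition
    return min(max(drive, 0.0), max_rate)  # rates cannot be negative or unbounded


for ipsi, contra in [(70, 50), (60, 60), (50, 70)]:
    print(f"ipsi {ipsi} dB, contra {contra} dB -> {lso_rate(ipsi, contra):.0f} spikes/s")
```

The output is highest when the sound is louder at the ipsilateral ear and is silenced when the contralateral ear dominates, so the rate varies with the interaural level difference rather than with overall loudness.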
Figure 8. Descending auditory pathways. Each level of the auditory pathway possesses both ascending and descending pathways. The descending projection to the hair cells originates in the superior olivary complex (SOC) of the brainstem (from the same regions involved in sound localisation – see below) and forms an important part of the sensitivity control of the cochlear sense organ. These efferent axons form the olivocochlear bundle (OCB) and originate from either medial or lateral divisions of the SOC. The lateral division makes presynaptic connections with primary afferent terminals on the inner hair cells, while the medial division makes direct contact with the outer hair cells.
Figure 9. The Jeffress model for interaural time difference computation in the medial superior olive (MSO). Three MSO cells are shown. Each receives excitatory synaptic inputs from both cochleae onto opposite dendrites. The path length of the axons varies along the length of the nucleus; the longer path lengths will give a longer latency response and hence act as a delay line. The cell labelled ‘Right’ receives the longest delay‐line response from the right ear and the shortest delay‐line response from the left ear. At zero time, a sound originating from the far right‐hand side is heard; it generates an excitatory postsynaptic potential (EPSP) in all the MSO neurons, but with a range of latencies. About 600 µs later, the same sound reaches the other ear and triggers EPSPs from the left side. This signal crosses the brainstem and activates the same population of cells. The latency of the left and right EPSPs is the sum of their respective conduction time (+c, for the sound to cross the head) and the delay introduced by the delay line (+dl). Only those MSO cells that receive coincident left and right EPSPs will be able to generate an action potential (AP), as shown in the lower inset. Cell 3 will only receive a coincident input when the sound originates from the right; this is signalled by firing an AP. By the same mechanism, cell 1 will only receive coincident EPSPs when the sound originates from the far left. By setting up a complete range of conduction delays, the 180° range of azimuth locations can be specified in this model.
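
The coincidence logic of the Jeffress model can be sketched in a few lines. In the illustration below, each model cell has a pair of delay‐line latencies, and it fires only if the EPSPs from the two ears arrive within a short coincidence window. The 600 µs maximum delay follows the caption; the 50 µs window, the function names and the use of three cells are assumptions made for the example.

```python
def make_detectors(n_cells, max_delay_us=600.0):
    """Return (left_delay, right_delay) in microseconds for each model cell.

    Cell 1 has the longest left delay; the last cell has the longest right delay.
    """
    step = max_delay_us / (n_cells - 1)
    return [(max_delay_us - i * step, i * step) for i in range(n_cells)]


def responding_cells(itd_us, detectors, window_us=50.0):
    """Return the 1-based indices of cells that receive coincident EPSPs.

    itd_us > 0 means the sound reaches the right ear first (source on the right).
    """
    active = []
    for index, (left_delay, right_delay) in enumerate(detectors, start=1):
        right_arrival = right_delay           # right ear hears the sound at t = 0
        left_arrival = itd_us + left_delay    # left ear hears it itd_us later
        if abs(left_arrival - right_arrival) <= window_us:
            active.append(index)
    return active


cells = make_detectors(3)
for itd in (600.0, 0.0, -600.0):
    print(f"ITD {itd:+.0f} us -> cells firing: {responding_cells(itd, cells)}")
```

With three cells, a sound from the far right (ITD of +600 µs) activates only cell 3, a sound from straight ahead activates only cell 2, and a sound from the far left activates only cell 1, matching the behaviour described in the caption. Increasing the number of cells fills in the intermediate azimuths.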
Figure 10. Sound localisation in the barn owl. The barn owl uses ITDs to discriminate azimuth location and ILD to detect vertical location. The auditory field of ‘view’ is shown for timing isotherms (ITD, upper graph measured in microseconds), and intensity (ILD, lower graph measured in dB). Combining these two maps of auditory space pinpoints the origin of the sound as indicated in the overlaid map (centre).
Figure 11. Spectrograms (plots of frequency over time) showing the spectrotemporal features of social vocalisations from a mouse, a bat and two songbird species. Colour indicates intensity: red is high and blue is low. Modified and reproduced with permission from Woolley and Portfors (2013) © Elsevier.
Figure 12. Neurons in the mouse inferior colliculus (IC) respond selectively to mouse vocalisations. The raster plots of spike trains of six neurons with similar frequency tuning show that each neuron responds differently to the individual vocalisations, and each neuron responds differently from the other neurons. Modified and reproduced with permission from Woolley and Portfors (2013) © Elsevier.
Figure 13. Inhibition shapes selectivity to vocalisations in the mouse IC. Blocking GABAergic and glycinergic inhibition decreases neuronal selectivity to vocalisations. The responses of a single unit to six vocalisations (USV1, USV2, USV3, USV7, USV8, USV10) are shown before (control) and after (Bic+Str) blocking inhibition. Reproduced from Mayko et al. (2012) © US National Library of Medicine (open access article).
Figure 14. Mismatch between the frequency representation of the mouse auditory system and the spectral energy contained in mouse social vocalisations, and the excitatory and inhibitory frequency tuning of single auditory midbrain neurons. (a) Tonotopic organisation of the mouse IC. Frequencies <50 kHz are over‐represented and frequencies >50 kHz are under‐represented. Coronal section of mouse IC with circles marking electrophysiological recording locations superimposed. Circles are colour coded according to the characteristic frequency (CF) of the neuron recorded at that location. The average power spectrum of mouse social vocalisations is shown to the right of the image; the majority of power in vocalisations falls above the frequency tuning of most IC neurons. (b) Examples of frequency tuning curves in mouse IC with excitatory and inhibitory regions colour coded. Excitatory tuning is lower than the spectral energy in mouse social vocalisations. Modified and reproduced with permission from Woolley and Portfors (2013) © Elsevier.

References

von Békésy G (1960) Experiments in Hearing. New York: McGraw‐Hill.

Brand A, Behrend O, Marquardt T, McAlpine D and Grothe B (2002) Precise inhibition is essential for microsecond interaural time difference coding. Nature 417: 543–547.

Gale JE and Ashmore JF (1997) An intrinsic frequency limit to the cochlear amplifier. Nature 389: 63–66.

van der Heijden M, Lorteije JA, Plauška A, et al. (2013) Directional hearing by linear summation of binaural inputs at the medial superior olive. Neuron 78: 936–948.

Helfert RH and Schwartz IR (1986) Morphological evidence for the existence of multiple neuronal classes in the cat lateral superior olivary nucleus. Journal of Comparative Neurology 244: 533–549.

Köppl C (2012) Auditory neuroscience: how to encode microsecond differences. Current Biology 22: R56–R58.

Klug A, Bauer EE, Hanson JT, et al. (2002) Response selectivity for species‐specific calls in the inferior colliculus of Mexican free‐tailed bats is generated by inhibition. Journal of Neurophysiology 88: 1941–1954.

Konishi M (1993) Listening with two ears. Scientific American 268: 66–73.

Kotak VC, Korada S, Schwartz IR and Sanes DH (1998) A developmental shift from GABAergic to glycinergic transmission in the central auditory system. Journal of Neuroscience 18: 4646–4655.

Mayko ZM, Roberts PD and Portfors CV (2012) Inhibition shapes selectivity to vocalizations in the inferior colliculus of awake mice. Frontiers in Neural Circuits 6: 73.

Oertel D (1983) Synaptic responses and electrical properties of cells in brain slices of the mouse aVCN. Journal of Neuroscience 3: 2043–2053.

Olsen JF, Knudsen EI and Esterly S (1989) Neural maps of interaural time and intensity differences in the optic tectum of the barn owl. Journal of Neuroscience 9: 2591–2605.

Pecka M, Brand A, Behrend O and Grothe B (2008) Interaural time difference processing in the mammalian medial superior olive: the role of glycinergic inhibition. Journal of Neuroscience 28: 6914–6925.

Plomp R (1965) Detectability threshold for combination tones. Journal of the Acoustical Society of America 37: 1110–1123.

Pollak GD (2013) The dominant role of inhibition in creating response selectivities for communication calls in the brainstem auditory system. Hearing Research 305: 86–101.

Portfors CV, Roberts PD and Jonson K (2009) Over‐representation of species‐specific vocalizations in the awake mouse inferior colliculus. Neuroscience 162: 486–500.

Roberts MT, Seeman SC and Golding NL (2013) A mechanistic understanding of the role of feedforward inhibition in the mammalian sound localization circuitry. Neuron 78: 923–935.

Seidl AH, Rubel EW and Harris DM (2010) Mechanisms for adjusting interaural time differences to achieve binaural coincidence detection. Journal of Neuroscience 30: 70–80.

Warren B, Gibson G and Russell IJ (2009) Sex recognition through midflight mating duets in Culex mosquitoes is mediated by acoustic distortion. Current Biology 19: 485–491.

Woolley SMN and Portfors CV (2013) Conserved mechanisms of vocalization coding in mammalian and songbird auditory midbrain. Hearing Research 305: 45–56.

Further Reading

Altschuler RA, Bobbin RP, Clopton BM and Hoffman DW (eds) (1991) Neurobiology of Hearing: The Central Auditory System. New York: Raven Press.

Bregman AS (1994) Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Mass. and London: The MIT Press.

Ehret G and Romand R (eds) (1997) The Central Auditory System. New York: Oxford University Press.

Geisler CD (1998) From Sound to Synapse: Physiology of the Mammalian Ear. New York: Oxford University Press.

Kaas JH, Hackett TA and Tramo MJ (1999) Auditory processing in primate cerebral cortex. Current Opinion in Neurobiology 9: 164–170.

Lorente de Nó R (1981) The Primary Acoustic Nuclei. New York: Raven Press.

Oertel D (1999) The role of timing in the brain stem auditory nuclei of vertebrates. Annual Review of Physiology 61: 497–519.

Trussell LO (1999) Synaptic mechanisms for coding timing in auditory neurons. Annual Review of Physiology 61: 477–496.

Young E (1998) Cochlear nucleus. In: Shepherd GM (ed) The Synaptic Organization of the Brain, pp. 121–158. New York: Oxford University Press.

How to Cite

Portfors, Christine V (Feb 2016) Auditory Processing. In: eLS. John Wiley & Sons Ltd, Chichester. [doi: 10.1002/9780470015902.a0000017.pub2]