Year : 2016  |  Volume : 30  |  Issue : 2  |  Page : 28-39

Effect of localization training in horizontal plane on auditory spatial processing skills in listeners with normal hearing

Department of Audiology, All India Institute of Speech and Hearing, Mysuru, Karnataka, India

Date of Web Publication: 27-Jun-2017

Correspondence Address:
K V Nisha
Department of Audiology, All India Institute of Speech and Hearing, Naimisham Campus, Manasagangothri, Mysuru - 570 006, Karnataka

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/jisha.JISHA_2_17


Introduction: Source localization depends on processing of monaural and binaural spatial cues. Although difficulties arising due to the deficits in the processing spatial cues are well documented in the literature, remediation programs aimed at resolving spatial deficits are scanty. The present study is a preliminary research aimed at exploring the changes in the spatial performance of normal-hearing listeners using localization training in a horizontal plane. Methods: Twelve normal-hearing listeners aged 18–25 years participated in the study. The study was conducted in three phases including pretraining, training, and posttraining phase. At the pre- and post-training phase, three tests of spatial skills, namely, test of localization in free-field, test of lateralization ability under headphones (i.e., virtual auditory space (VAS) identification test), and tests for binaural processing ability (i.e., interaural level difference [ILD] and interaural time difference [ITD]), were administered. The training phase consisted of structured localization regimen spanning eight sessions spread over 2 weeks. Results: Paired t-test revealed that root mean square error, ITD threshold, and VAS scores in the post-training phase were significantly better than pretraining condition, indicative of the benefit derived from training. ILD did not alter significantly in posttraining phase owing to the ceiling effect in pretraining phase. Conclusion: The localization training protocol used in the present study on a preliminary basis proves to be effective in normal-hearing listeners and its implications can be extended to other clinical populations as well.

Keywords: Binaural cue processing, interaural threshold differences, localization training, root mean square error, spatial acuity, virtual auditory space identification

How to cite this article:
Nisha K V, Kumar U A. Effect of localization training in horizontal plane on auditory spatial processing skills in listeners with normal hearing. J Indian Speech Language Hearing Assoc 2016;30:28-39

How to cite this URL:
Nisha K V, Kumar U A. Effect of localization training in horizontal plane on auditory spatial processing skills in listeners with normal hearing. J Indian Speech Language Hearing Assoc [serial online] 2016 [cited 2023 Feb 5];30:28-39. Available from: https://www.jisha.org/text.asp?2016/30/2/28/208990

  Introduction

Spatial hearing refers to the ability of the auditory system to relate points of physical space with that of internal auditory space.[1] Localization is strongly dependent on two kinds of cues, namely, the monaural and binaural spatial processing cues. Binaural cues arise from subtle differences in the arrival time or level of the signal at one ear relative to the other ear. The terms interaural time difference (ITD) and interaural level difference (ILD) are used to quantify such differences in the time of arrival and in the received intensity of the signal, respectively. These cues aid right-left sound localization in the horizontal plane. On the other hand, front-back localization (comparison of the same sound between two surfaces of the same ear) is widely explained by the monaural spectral filtering effects of the pinna.[1],[2]

Spatial acuity and the difficulties arising from deficits in the processing of spatial cues are well documented in the literature.[3],[4],[5] The most commonly occurring errors in auditory space are those that arise from confusion about whether a source is ahead or behind.[6] Such errors in localization are termed “front-to-back” errors and are objectively quantified using root mean square (rms) localization error scores.[7] These errors are commonly attributed to the symmetric shape of the human torso. The ITD value computed using Woodworth's formula (Δt = a[φ + sin φ]/c, where a is the radius of the head, φ is the incident angle in radians, and c is the velocity of sound in air) for a sound source in the frontal plane at a particular left or right angle will be exactly the same as for a source at the mirror-image position behind.[8] Given its simplicity, Woodworth's formula is commonly used to compute ITDs from azimuths,[6] and the results of such computations are essentially the same for a source at a particular azimuth in the front hemifield and for one at the mirror-image azimuth in the rear hemifield.
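The front-back ambiguity of the ITD cue can be illustrated with a minimal sketch. It assumes the standard form of Woodworth's formula with a head-radius term, Δt = (a/c)(φ + sin φ); the head radius (8.75 cm), the speed of sound (343 m/s), and the function name are illustrative assumptions rather than values taken from the study.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """ITD in seconds from Woodworth's formula, dt = (a / c) * (phi + sin(phi)).

    head_radius_m and speed_of_sound are assumed, illustrative values.
    """
    # Fold the azimuth onto the lateral angle in [-90, 90] degrees: a source at
    # 150 deg (behind) has the same lateral angle as one at 30 deg (in front).
    phi = math.asin(math.sin(math.radians(azimuth_deg)))
    return head_radius_m / speed_of_sound * (phi + math.sin(phi))

front = woodworth_itd(30)   # 30 deg to the right, in front
back = woodworth_itd(150)   # mirror-image position behind
print(f"front: {front * 1e6:.0f} us, back: {back * 1e6:.0f} us")
```

Because both positions map to the same lateral angle, the two ITDs agree, which is why the ITD alone cannot separate front from back.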

Front-back localization errors, though rare, do occur in normal-hearing individuals and are far more apparent in individuals with hearing impairment. Best et al.[9] showed that normal-hearing listeners exhibited a front-to-back error in about 5% of the trials, whereas listeners with hearing impairment did so in about 12% of the trials in the unaided condition and in 25%–45% of the trials in the aided condition. Although front-to-back errors in localization experiments are empirically proven, they are not so predominantly experienced in real-world situations, especially by normal-hearing listeners. This is attributed to the role of head movements in resolving such localization confusions.[10] However, deficits in sound localization generally impair speech communication to a larger extent.[11] Localization errors can occur in difficult-to-localize conditions where visual cues are absent and can have serious consequences in professions that require precise location detection, such as navigation or vehicle driving.[12] Another major consequence of spatial acuity deficit is often seen in understanding group conversations, especially when the conversation switches from one person to another. In such situations, the listener must locate the new speaker instantly or miss the first part of each segment of the conversation, which may seriously reduce understanding. In addition, localization difficulties can pose serious limitations in understanding speech in noise [13] and reverberation,[14] apart from hindering the exchange of ideas with the communication partner.[11] Furthermore, spatial acuity deficits can seriously impair the segregation of sound sources and thereby adversely affect auditory scene analysis.[15]

Although the literature contains numerous reports on the nature and consequences of spatial acuity deficits,[3],[4],[5] remediation programs aimed at ameliorating spatial difficulties are scanty. Some of the notable attempts at enhancing spatial acuity have used interaural difference training [16],[17],[18],[19],[20] or head-related transfer function (HRTF)-generated virtual acoustic stimuli.[12],[21],[22] Although minimal improvements were noted in these studies,[12],[16],[17],[18],[19],[20],[21],[22] the clinical applicability of such remedial programs is questionable owing to a number of factors, including those related to study design (heterogeneous outcome measures and randomization) and those related to technical aspects such as the length of the training programs and the cost–benefit ratio. These limitations in the clinical efficacy of the training studies reported in the literature can stem at least partially from the absence of natural stimuli in training, since these studies focused on manipulating one or another parameter (ITD, ILD, or HRTFs) related to spatial hearing. In contrast, the current study takes advantage of naturally occurring binaural and monaural spectral cues in free-field, arranged in a systematically graded hierarchy, to make the training program maximally effective.

Given the overwhelming importance of localization skills in everyday listening situations and the abundant evidence documenting the effects of spatial deficits, it is apt that efforts be directed toward ameliorating these deficits. Hence, there is a strong need for an easy yet effective auditory training program aimed at enhancing spatial acuity.

The present study was a preliminary research aimed at exploring the changes in the spatial performance of normal-hearing listeners using a localization training regimen in the horizontal plane. The specific objectives of the study were to document and compare the pre- and post-training performance of the normal-hearing listeners on the following spatial acuity measures:

  1. Rms localization error
  2. Interaural difference thresholds (ITD and ILD)
  3. Virtual auditory space identification (VASI) scores.

  Materials and Methods


The study enrolled 12 healthy undergraduate and postgraduate student volunteers (9 females; 3 males, mean age: 20.83 ± 4.49 years, age range 18–25 years) with normal-hearing sensitivity. All participants had air conduction pure tone hearing thresholds ≤15 dB HL at octave intervals from 250 to 8000 Hz in both ears, as measured by pure tone audiometry using the modified Hughson–Westlake procedure.[23] A structured interview ensured that none of the participants had otological problems, speech and language disorders, neurologic disorders, cognitive deficits, or auditory processing problems.

Informed consent and ethics

Informed consent was obtained from all individual participants included in the study. All procedures performed in this study were in accordance with the ethical guidelines of bio-behavioral research involving human subjects [24] of the All India Institute of Speech and Hearing, Mysore. Before the start of the research, the approval from the Institutional Ethics Committee (ref: WOF0417/2014-15 dated 12/11/2015) was obtained.


The study was conducted in three phases, i.e., pretraining, training, and posttraining phase.

Phase I: Pretraining evaluation phase

To comprehensively evaluate spatial processing skills of the participants in horizontal plane, multiple spatial acuity measurements were carried out. These measurements included assessment of localization skills in free-field and under headphones (using virtual auditory space). In addition, interaural difference thresholds (both level and time) were also measured to evaluate the binaural processing abilities. To account for order effect, counterbalancing of tests was done across participants. The overall duration of testing for all the tests was approximately 50–60 min. Free-field localization skills were assessed on all participants, but interaural difference thresholds and localization under headphones could be assessed only in ten participants due to attrition.

Test of localization ability in free-field

Test material

White noise bursts of 250 ms duration (including 5 ms rise and 5 ms fall times), generated using AUX [25] software at 16-bit resolution and 44,100 Hz sampling frequency, served as stimuli. The stimuli were loaded onto a personal computer running Cubase software (Steinberg Media Technologies GmbH, Hamburg). Eighteen audio tracks were created and the stimuli were randomly assigned to these tracks to enable stimulus presentation through 18 different loudspeakers (Genelec 8020B bi-amplified monitoring system, Finland). [Figure 1] shows the schematic representation of the loudspeaker setup. Noise bursts were presented five times from each loudspeaker, leading to a total of 90 presentations (18 loudspeaker locations × 5 repetitions). The order of presentation was randomized using a custom-made sequencing file in Cubase. Sequenced test stimuli were delivered through two Lynx Aurora 16 mixers (Lynx Studio Technology Inc.) to the calibrated loudspeakers at 65 dB SPL. The output of the loudspeakers was calibrated using a sound level meter (Bruel and Kjaer, type 2270, in a KEMAR manikin) at the beginning of the experiment. The interstimulus interval (ISI) varied adaptively depending on the responses of the participants: each stimulus in the sequence file was played only after the participant had responded to the preceding stimulus or after the inbuilt ISI of 3 s, whichever occurred later.
Figure 1: Schematic representation of the localization testing chamber with the listener in a seated position. The figure represents the cross section of the spherical array of loudspeakers placed at 0° elevation used in Phase I and Phase III of the study
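The free-field test material described above can be sketched as follows. This is not the AUX or Cubase workflow used in the study; it is a minimal standard-library illustration in which the linear ramp shape, the peak normalization, and the function names are assumptions.

```python
import random

def noise_burst(duration_s=0.250, ramp_s=0.005, fs=44100, seed=None):
    """White noise burst with onset/offset ramps, returned as 16-bit integers.

    The linear ramp shape and peak normalization are assumptions.
    """
    rng = random.Random(seed)
    n = round(duration_s * fs)
    n_ramp = int(ramp_s * fs)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for i in range(n_ramp):
        gain = i / n_ramp
        samples[i] *= gain            # rise
        samples[n - 1 - i] *= gain    # fall
    peak = max(abs(s) for s in samples)
    return [round(32767 * s / peak) for s in samples]

def presentation_order(n_speakers=18, repetitions=5, seed=None):
    """Randomized list of loudspeaker numbers, each occurring `repetitions` times."""
    order = [spk for spk in range(1, n_speakers + 1) for _ in range(repetitions)]
    random.Random(seed).shuffle(order)
    return order

burst = noise_burst(seed=1)
order = presentation_order(seed=1)
print(len(burst), "samples;", len(order), "presentations")
```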



Testing was carried out in a sound-treated room free from visual distractions. Participants were seated in the center of the room within the 18-loudspeaker array, in which the loudspeakers were spaced 20° azimuth apart in a circular fashion covering the entire 360° spatial field. The loudspeaker facing the participant's head was placed at 0° azimuth, while the loudspeaker located exactly behind the participant was aligned at 180° azimuth. For ease of identification, each loudspeaker was numbered in a clockwise manner. The head of the participant was always aligned at 2 m distance from the loudspeaker positioned at 0° azimuth as depicted in [Figure 1]. The participants were instructed to look at the front (0°) loudspeaker at all times and not to move during each run. They were asked to respond verbally with the digit corresponding to the loudspeaker that delivered the sound, while the tester manually entered the responses. Head movements were allowed only when the participants had to indicate their responses. No feedback was given. Testing spatial acuity in free-field took approximately 10 min.


Rms localization error [7] was calculated by running a program written as a Python script implemented in paradigm experimenter builder software.[26] Rms error is a commonly used measure for quantifying the overall precision of localization performance. It represents the standard deviation for each participant of the differences between target locations and the localization response.[7] The overall rms error (rms of localization errors for all the 18 loudspeakers) as well as the rms error for each individual speaker (rms error quantified for each speaker location) was calculated using the following formula:

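$$\text{rms error} = \sqrt{\frac{\sum_{i=1}^{n} \left(\text{stimulus}_i - \text{response}_i\right)^2}{n}}$$

where n is the number of stimuli presented. Stimulus represents the target loudspeaker location, while response denotes the loudspeaker location to which the participant localized the sound.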
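A minimal sketch of the rms error computation is given here for illustration; it is not the authors' script. Locations are taken in degrees azimuth, and wrapping the angular difference around the circle (so that 340° versus 0° counts as 20°) is an assumption, as are the function names.

```python
import math

def angular_difference(target_deg, response_deg):
    """Signed response-minus-target difference, wrapped to [-180, 180) degrees.

    The wrap-around is an assumption; the source does not state how
    differences across the 0/360 deg boundary were handled.
    """
    return (response_deg - target_deg + 180.0) % 360.0 - 180.0

def rms_error(targets_deg, responses_deg):
    """Root mean square of the target-response differences, in degrees."""
    diffs = [angular_difference(t, r) for t, r in zip(targets_deg, responses_deg)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical responses: one of three trials is off by one loudspeaker (20 deg).
print(rms_error([0, 20, 40], [0, 40, 40]))
```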

Further, a confusion matrix showing the correctly identified and the misjudged spatial locations was also computed for each participant by modifying a confusion matrix script for syllable identification [27] running in the MATLAB version 7.10.0 (R2010a) (The MathWorks Inc., Natick)[28] environment. The confusion matrix provides an opportunity to study the perceptual confusions arising from limitations in the perception of the target loudspeaker location. In addition, an attempt was made to carefully examine the potential types of localization errors in the horizontal plane (front-back errors and errors in the cone of confusion plane) occurring in the test of sound direction perception. For this analysis, the spatial field spanning 360° azimuth was divided into four spatial quadrants [Figure 1], and the rms errors quantified at the speakers located in each quadrant were averaged. This resulted in rms errors for the four spatial planes, i.e., front, back, right, and left spatial fields.
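The target-by-response tally behind such a confusion matrix can be sketched in a few lines. This is an illustration in Python, not the MATLAB script used in the study; the trial data and the function name are hypothetical.

```python
def confusion_matrix(targets, responses, n_locations=18):
    """Count responses per target; rows are targets, columns are responses.

    Loudspeakers are numbered 1..n_locations, as in the test setup.
    """
    matrix = [[0] * n_locations for _ in range(n_locations)]
    for target, response in zip(targets, responses):
        matrix[target - 1][response - 1] += 1
    return matrix

# Hypothetical trials: loudspeaker 1 (front) confused once with loudspeaker 10 (back).
targets = [1, 1, 1, 10, 10]
responses = [1, 1, 10, 10, 10]
matrix = confusion_matrix(targets, responses)
print(matrix[0][0], matrix[0][9], matrix[9][9])
```

Counts on the diagonal are correct identifications; off-diagonal counts show which locations were confused with which.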

Test of lateralization ability under headphones (virtual auditory space identification test)

The aim of generating virtual auditory space (VAS) is to create the illusion of a natural free-field sound using a closed-field sound system such as headphones.[29] VAS stimuli are sound percepts created within the closed-field involving lateralization of the sound image within head. Generation of VAS stimuli involves direction-dependent measures as reflected in HRTFs. HRTF involves complex interaction of sound impinging on eardrum with the head and the torso.[30] VAS results can complement the results of free-field spatial accuracy skills assessed by localization test.

Test material

The stimuli for the illusory effect of VAS were created using sound lab (Slab sound module of SLAB3d).[31] These stimuli were formed by convolving 250 ms white noise with slab3d's default nonindividualized HRTF database “jdm.slh.” This default HRTF used in SLAB3d reproduces virtual environments well, comparably to international HRTF databases such as LISTEN (HRTF indices: 1, 2, 3, 4, 6, 9, 11, 14, 16, 17, 18, 21, 22, 30, 33, 34, 35, 36, 39, 40, 42, 44, 46, 48, 50) and CIPIC (HRTF indices: 5, 12, 28, 32, 40, 45).[32] The stimuli thus generated were used to create five spatial perceptions, i.e., mid-line front (0° azimuth), 45° azimuth toward the right ear (R45), 90° azimuth toward the right ear (R90), 45° azimuth toward the left ear (L45), and 90° azimuth toward the left ear (L90). All the stimuli were routed through a professional soundcard (MOTU MICROBOOK II) connected to a laptop and played through Sennheiser HD 280 PRO (Wedemark, Germany) headphones.
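The principle of rendering a virtual source, convolving a mono signal with a left-ear and a right-ear impulse response, can be sketched as below. This does not use slab3d or the “jdm.slh” database; the toy impulse responses and function names are hypothetical and only mimic a source on the right (direct at the near ear, delayed and attenuated at the far ear).

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_virtual_source(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one virtual location."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy, hypothetical impulse responses for a source toward the right ear.
mono = [1.0, 0.5, -0.25]
hrir_right = [1.0]             # near ear: unchanged
hrir_left = [0.0, 0.0, 0.5]    # far ear: 2-sample delay, halved amplitude
left, right = render_virtual_source(mono, hrir_left, hrir_right)
print(left, right)
```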


After familiarization of stimuli and task, the test of VASI commenced. The stimuli were presented using paradigm [26] experimental builder software at 65 dB SPL. Before the initiation of the test, the VAS stimuli were calibrated using SLM (Bruel and Kjaer 2270, type 2270 in a KEMAR Manikin) in a 2CC coupler. In familiarization runs, a dummy head with five locations representing the five virtual auditory stimuli was displayed on the monitor. The participants were asked to click the mouse pointer on the position of dummy head and the audio file corresponding to the virtual location was played.

After the completion of the familiarization phase, the test run was initiated. In this phase, the stimulus corresponding to each virtual location was presented ten times in random sequence. The participants were asked to attend to the stimuli and click the mouse pointer on the position of the dummy head [Figure 2] corresponding to the perceived location in the head. The test was terminated after the completion of 50 trials (5 VAS locations × 10 repetitions). The time for administration of this test was approximately 20 min.
Figure 2: Pictographical representation of dummy head used in Phase I and Phase III of virtual auditory space identification test. The alphanumerical code represents the location of stimulus lateralization. 0° - at the midline front, R45-45° azimuth toward the right ear; R90-90° azimuth toward the right ear, L45-45° azimuth toward the left ear, L90-90° azimuth toward the left ear



The accuracy scores of identification of each virtual space apart from overall accuracy score were computed.
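As an illustration of this scoring, the per-location and overall accuracy can be tallied as in the sketch below. Expressing the scores as proportions, the function name, and the example trials are assumptions; the source does not state the score unit.

```python
from collections import Counter

LOCATIONS = ["L90", "L45", "0", "R45", "R90"]

def vasi_scores(targets, responses):
    """Return (per-location accuracy, overall accuracy) as proportions correct."""
    presented = Counter(targets)
    correct = Counter(t for t, r in zip(targets, responses) if t == r)
    per_location = {loc: correct[loc] / presented[loc]
                    for loc in LOCATIONS if presented[loc]}
    overall = sum(correct.values()) / len(targets)
    return per_location, overall

# Hypothetical trials: R45 is mistaken for R90 once.
targets = ["0", "R45", "R45", "L90"]
responses = ["0", "R45", "R90", "L90"]
per_location, overall = vasi_scores(targets, responses)
print(per_location, overall)
```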

Test for binaural processing ability: Interaural time difference and interaural level difference thresholds

Test material

Three 250 ms perceptually interaurally correlated white noise bursts (stereo, 16 bit, 44,100 Hz sampling frequency) with 5 ms onset and offset ramps were presented in each run. Interaural threshold difference tests (ITD and ILD) were conducted using white noise stimuli routed to Sennheiser headphones (HD 280 PRO 499) through a professional soundcard (MOTU MICROBOOK II) connected to the laptop.


The stimulus generation for interaural thresholds varied adaptively using a modified program in psychoacoustics toolbox [33] running in MATLAB version 7.10.0 (R2010a) (The MathWorks Inc., Natick) environment. The experiments of interaural threshold difference were conducted using a three-interval forced choice method with three down one up transformed up-down staircase method [34] converging at 75% of psychometric function.

In one run, three bursts of white noise, including two standard stimuli and one variable stimulus, were presented. The standard stimulus was a 250 ms interaurally correlated white noise presented at 65 dB SPL (calibrated using SLM, Bruel and Kjaer type 2270, in a KEMAR manikin). The variable stimulus was similar to the standard stimulus except that it was presented earlier in time or at a higher intensity level in one ear, resulting in an ITD or ILD, respectively. As a result, the noise leads or is louder in one ear and is thus lateralized to that side. The variable stimulus always led or was heard louder in the right ear. The starting value of the variable stimulus was 30 ms for the ITD task and 20 dB for the ILD task. A step size of 2 ms was used for the ITD task and 2 dB for the ILD task.
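How an ITD or ILD favoring the right ear can be imposed on an interaurally identical signal is sketched below. This is not the psychoacoustics toolbox code; delaying the left channel by whole samples and boosting only the right channel are assumptions about the implementation, and the function names are hypothetical.

```python
def apply_itd(mono, itd_s, fs=44100):
    """Return (left, right) with the right channel leading by itd_s seconds.

    The left channel is delayed by a whole number of samples (assumption).
    """
    delay = round(itd_s * fs)
    left = [0.0] * delay + list(mono)
    right = list(mono) + [0.0] * delay   # pad so both channels have equal length
    return left, right

def apply_ild(mono, ild_db):
    """Return (left, right) with the right channel ild_db decibels more intense."""
    gain = 10 ** (ild_db / 20)
    return list(mono), [gain * s for s in mono]

mono = [0.1, -0.2, 0.05]
left, right = apply_itd(mono, 0.030)     # starting ITD of 30 ms
print(len(left) - len(mono), "samples of delay")
left, right = apply_ild(mono, 20)        # starting ILD of 20 dB
print(right[0] / left[0])
```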

The participants were instructed to indicate the interval in which the variable stimulus (the interval in which the sound leads or is heard louder in the right ear) was presented by pressing the corresponding number on the keyboard. The time or level of the variable stimulus was varied adaptively in accordance with the response of the participant. Feedback on the accuracy of the response (correct or wrong) was given for each trial. The testing was terminated at 10 reversals and the last four reversals were averaged to get the converged value of the interaural time and intensity difference thresholds. The duration of each of the binaural processing tests was approximately 10 min.
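The adaptive track described above (three-down one-up rule, termination at 10 reversals, mean of the last four reversals) can be sketched as follows. This is a simplified illustration, not the toolbox implementation; the `respond` callback, the floor at zero, the trial cap, and the deterministic simulated listener are assumptions.

```python
def run_staircase(respond, start, step, n_down=3, n_reversals=10,
                  n_average=4, floor=0.0, max_trials=1000):
    """Transformed up-down staircase; returns (threshold, reversal levels).

    respond(level) must return True for a correct response. The level drops
    after n_down consecutive correct responses and rises after each error.
    """
    level, streak, direction = start, 0, 0
    reversals = []
    for _ in range(max_trials):
        if len(reversals) == n_reversals:
            break
        if respond(level):
            streak += 1
            if streak < n_down:
                continue              # not enough consecutive correct responses yet
            move = -1
        else:
            move = +1
        streak = 0
        if direction and move != direction:
            reversals.append(level)   # the track changed direction at this level
        direction = move
        level = max(floor, level + move * step)
    if len(reversals) < n_reversals:
        raise RuntimeError("staircase did not reach the required reversals")
    return sum(reversals[-n_average:]) / n_average, reversals

# Simulated listener (hypothetical): always correct at or above 10 ms ITD.
threshold, reversals = run_staircase(lambda itd_ms: itd_ms >= 10, start=30, step=2)
print(threshold, reversals)
```

With this idealized listener the track descends from 30 ms and then alternates between 8 and 10 ms, so the estimate settles between the last level failed and the lowest level passed.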

Phase II: Training phase

Phase II consisted of criterion- and duration-based training aimed at improving localization skills through a structured hierarchy of stimulus presentation. Training continued for a maximum of eight sessions (20 min/day) or until the criterion was achieved, whichever occurred earlier. Participants were seated at the center of a circle of eight loudspeakers spaced 45° azimuth apart and kept at 1 m distance from the listener, as shown in [Figure 3]. The loudspeakers were numbered in a clockwise direction from 1, assigned to the loudspeaker positioned at 0° azimuth, to 8. The participants were instructed to verbally name the number of the loudspeaker from which the sound emanated. Guessing the loudspeaker location, if uncertain, was encouraged.
Figure 3: Loudspeaker setup for localization training


The training regimen proposed by Kuk et al.[35] was adapted. The stimuli were three environmental sounds, i.e., bus horn, speech sound/da/, and telephone ring, and spectra corresponding to these stimuli are given in [Figure 4]. These stimuli were chosen so as to represent low (<1.5 kHz), mid (1.5–3 kHz), and high (3–5 kHz) frequency, respectively. The level of the stimuli was then calibrated to 65 dB SPL using SLM (B&K, 2270).
Figure 4: Spectra of stimuli used in the training phase of the study (a) bus horn, (b) speech stimulus/da/, and (c) telephone ring


Training progressed from easy to difficult tasks. Difficulty levels of the stimuli were varied by (i) changing the duration of the signal and (ii) changing the attenuation provided for the stimuli coming from back. As longer stimuli are easier to localize compared to shorter stimuli, four different stimulus durations were used with increasing level of difficulty – 1000, 800, 500, and 300 ms. Similarly, it has also been shown that localization may be made easy by attenuating the sounds that were presented from back speaker.[35] Therefore, stimuli presented from back loudspeakers were attenuated by four levels from easy to difficult – 8, 4, 2, and 0 dB SPL with reference to front presentation. The back attenuation used in the study is consistent with the literature on pinna effects on front-back localization. The evidence on localization of broadband signals (BBSs) shows that human listeners most dominantly use the ITD cue of the temporal fine structure.[36],[37] From a review of these studies, it can be inferred that temporal cues dominate sound localization of BBS compared to intensity cues. Furthermore, a pilot study on localization of BBS varying in temporal (300 vs. 500 ms duration) and intensity (300 ms at 0 dB attenuation vs. 300 ms with 2 dB SPL back attenuation) parameter was conducted on five participants. The participants were asked to perceptually rate the signal from easy to difficult to localize sounds. Four out of the five participants in the pilot study rated BBS with longer duration (500 ms) as easily localizable compared to 300 ms with 2 dB SPL back attenuation. Based on the above two insights from literature and our pilot study, the intensity/attenuation factor was considered as a nested parameter within the temporal parameter while designing the hierarchy of stimulus presentation in the localization training paradigm.

During training, each stimulus was presented for two runs to obtain mean rms error measure of localization accuracy. Each run comprised randomly occurring stimuli presented thrice through each speaker (8 speakers × 3 repetitions) using Cubase software (Steinberg Media Technologies GmbH, Hamburg) and Lynx mixer (Lynx Studio Technology Inc.). The stimuli were grouped into four levels of hierarchy (easy to difficult) as shown in [Figure 5]. The easiest level in training was 1000 ms with 8 dB SPL back attenuation, and the most difficult level was 300 ms duration with zero back attenuation.
Figure 5: Hierarchy of stimulus (duration and attenuation parameters) presentation in the training phase. S represents stage and L represents level. Progression from left to right represents easy to difficult conditions


Training commenced with 1000 ms stimuli with 8 dB SPL back attenuation. Two runs of stimuli were presented. Verbal feedback on correct and incorrect responses was given. In case of an incorrect response, positional feedback on the expected/target response was provided. Feedback encouraged attention and helped maintain the motivation level of the participants. After the completion of the two runs of the stimuli at this training level (1000 ms, 8 dB SPL attenuation), rms localization error was calculated. If the rms error at this level was <10° azimuth, the task was made more difficult by decreasing the back attenuation, i.e., the stimuli were presented for two more runs with 1000 ms stimulus duration at 4 dB SPL back attenuation. The criterion of <10° error was based on a pilot study on normal-hearing listeners. With a 0° error criterion, only three of the 12 participants achieved >70% correct identification. With a 10° criterion, more than half of the participants, i.e., seven of the 12, achieved >70% correct identification. Thus, a 10° error criterion was chosen as it may avoid reaching floor performance.

After the completion of two runs in this level, rms error was calculated, and if it was <10° azimuth, back attenuation was decreased to 2 dB SPL. The first stage of the training terminated when the participants successfully performed localization of 1000 ms duration stimuli with 0 dB SPL back attenuation. Following the successful completion of Stage I training, Stage II training was started with 800 ms stimuli and the same criterion was employed to advance the training. The training was terminated when the rms error did not show a decline or reached a plateau at any level, as indexed by lower precision in identifying correct loudspeaker location (rms errors >10° azimuth). All the participants of the study completed training within a span of 8 days, with either one session each day/at least one session of training in every 2 days.
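The progression through the training hierarchy can be sketched as below, with attenuation nested within duration and a <10° rms criterion for advancing. Repeating a level when the criterion is not met is an assumption made for illustration; the names are hypothetical.

```python
DURATIONS_MS = [1000, 800, 500, 300]   # stages, easy to difficult
BACK_ATTENUATIONS_DB = [8, 4, 2, 0]    # levels within each stage, easy to difficult
CRITERION_DEG = 10.0                   # advance when mean rms error is below this

# Attenuation is nested within duration: (1000, 8), (1000, 4), ..., (300, 0).
LEVELS = [(d, a) for d in DURATIONS_MS for a in BACK_ATTENUATIONS_DB]

def next_level(index, mean_rms_error_deg):
    """Return the index of the next level, or None when the hierarchy is complete.

    Staying on the same level when the criterion is not met is an assumption.
    """
    if mean_rms_error_deg < CRITERION_DEG:
        return index + 1 if index + 1 < len(LEVELS) else None
    return index

index = next_level(0, mean_rms_error_deg=6.5)   # criterion met at (1000 ms, 8 dB)
print(LEVELS[index])
```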

Phase III: Posttraining evaluation phase

Phase III included the readministration of all the spatial acuity measures used in Phase I to quantify the changes (if any) in the spatial acuity skills of normal-hearing listeners subsequent to localization training in the horizontal plane. All the posttraining evaluations were done immediately after training, within a span of 1 day.

  Results

To understand the time-course of spatial learning through localization training regimen, graphical representation of spatial performance of each individual participant was studied using learning curves. The analyses of pattern of spatial learning across training sessions resulted in three types of learning curves (apart from the mean learning curve) which are depicted in [Figure 6].
Figure 6: Learning curves plotted as a function of performance of participants (training stage in ms) across time. The top panel (a) slow learner (b) average learners (c) rapid learners and (d) mean learning curve with error bars representing ± standard error of mean


As shown in [Figure 6]a,[Figure 6]b,[Figure 6]c,[Figure 6]d, the learning curves depict variability in the amount of learning across participants. All except one participant (P1) completed the training paradigm in 4–5 sessions, whereas P1 could do so only after eight sessions. While most participants (P2, P4, P6, P7, P8, P9, and P11) were able to complete the training in four sessions, a few others (P3, P5, P10, and P12) took around five sessions to reach the final stage of training, i.e., 300 ms, 0 dB attenuation.

The effect of localization training regimen was studied by comparison of the pre- versus post-training performance of the participants on the spatial acuity measures employed in the study. The data obtained at the two measurement phases of the study were analyzed using IBM Statistical Package Social Sciences version 20.0. Paired t-test was employed to compare the rms localization error, VASI scores, and ILD thresholds (normally distributed measures), whereas nonparametric Wilcoxon test was used to analyse ITD data as this measure was not normally distributed. The results of the study are discussed under the following headings:

  1. Effect of localization training regimen on spatial acuity in free-field
  2. Effect of localization training regimen on spatial acuity in closed-field and on binaural processing.

Effect of localization training regimen on spatial acuity in free-field

Pre- and post-training spatial acuity of the participants in free-field was compared through their performance in the localization test (rms error) using a paired t-test. [Figure 7] shows the mean and one standard deviation along with the pairwise comparison of overall rms error scores obtained in the pre- and post-training conditions. The rms errors were significantly lower in the posttraining condition than in the pretraining condition (t(11) = 7.187, P < 0.001, effect size (r) = 0.86).
Figure 7: Mean localization error (root mean square) as a function of localization training in horizontal plane. The error bars indicate ± standard error of mean. ***Stands for P < 0.001


To elucidate further, the rms error scores for each loudspeaker in pre- and post-training conditions were compared using a confusion matrix. [Table 1] shows confusion matrices created separately for each individual by denoting the target location-response location relationship in the form of 18 × 18 grid (target locations × possible response locations).
Table 1: Confusion matrix denoting the overall response provided for each loudspeaker location in pre-training (bottom light gray panels) and post-training (top white panels) phases. The diagonal (dark gray) represents accurate spatial judgment scores (maximum score: 12 subjects × 5 repetitions = 60) for each spatial location


From [Table 1], it can be seen that rms error scores reduced for the majority of speaker locations, indicating that the participants' ability to localize sounds improved at these locations. This is further complemented by the decline in perceptual confusions in target location identification after training. Furthermore, the training effect on rms errors (along with one standard deviation) was analyzed across the four quadrants using paired t-tests, and the results are depicted in [Figure 8].

As indicated in [Figure 8], there was a significant decline in rms localization errors in the front (t(11) = 3.191, P < 0.01), back (t(11) = 2.691, P < 0.05), and left (t(11) = 2.743, P < 0.05) spatial hemifields. Although the change was not statistically significant (t(11) = 1.51, P > 0.05), mean cone of confusion errors in the right hemifield also decreased in the posttraining phase.
Figure 8: Root mean square error across four spatial hemifields as a function of localization training in horizontal plane. The error bars indicate ± standard error of mean. **Stands for P < 0.01 while *stands for P < 0.05
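
A hemifield-wise breakdown of this kind can be sketched as below. The azimuth convention (0° straight ahead, angles increasing clockwise so that 90° is to the right) and the rule that a target lying exactly on a boundary is excluded from that pair of hemifields are assumptions made for this example; the article does not specify how loudspeakers were assigned to hemifields.

```python
import math


def hemifields(azimuth_deg):
    """Return the hemifields containing a target azimuth.

    Assumed convention (hypothetical): 0 = front, 90 = right, 180 = back,
    270 = left. Targets exactly on a boundary are left out of that pair.
    """
    az = azimuth_deg % 360
    labels = set()
    if az < 90 or az > 270:
        labels.add("front")
    if 90 < az < 270:
        labels.add("back")
    if 0 < az < 180:
        labels.add("right")
    if 180 < az < 360:
        labels.add("left")
    return labels


def rms_by_hemifield(target_az, response_az):
    """RMS localization error (degrees) pooled within each hemifield."""
    squared = {"front": [], "back": [], "left": [], "right": []}
    for t, r in zip(target_az, response_az):
        err = (r - t + 180) % 360 - 180  # signed error wrapped to [-180, 180)
        for label in hemifields(t):
            squared[label].append(err * err)
    return {k: math.sqrt(sum(v) / len(v)) for k, v in squared.items() if v}


# Hypothetical target and response azimuths in degrees; NOT the study's data.
print(rms_by_hemifield([40, 200, 320, 100], [60, 160, 320, 80]))
```

Each target contributes to one front or back hemifield and to one left or right hemifield, so the four pooled values overlap in the trials they use.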

Effect of localization training regimen on spatial acuity in closed-field and binaural processing

The changes in mean and error bars for the other spatial acuity measures of the study measured at pre- and post-training conditions are depicted in box plots shown in [Figure 9], indicating the improved performance of the participants on all the measures in the posttraining phase.
Figure 9: (a) Virtual auditory space identification scores, (b) interaural time difference thresholds and (c) interaural level difference thresholds as a function of localization training in horizontal plane. The error bars indicate ± standard error of mean

The results of the paired t-test for VASI scores and the Wilcoxon test for ITD revealed statistically significant posttraining improvement in the participants' ability to judge spatial location in closed-field (t(9) = −3.602, P < 0.01, r = 0.77) and in binaural time processing skill (Z = −2.266, P < 0.01, r = −0.8431). On the other hand, ILD thresholds in the posttraining condition did not differ significantly (paired t-test, P > 0.05) from those in the pretraining condition.
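
For readers unfamiliar with the Wilcoxon signed-rank statistic, the sketch below computes a normal-approximation Z and an effect size r = Z / sqrt(N). It is an illustration under stated assumptions rather than the authors' procedure: zero differences are dropped, no tie or continuity correction is applied, N is taken as the number of non-zero pairs (conventions differ), and the threshold values are hypothetical.

```python
import math


def wilcoxon_z(pre, post):
    """Return (Z, n) for the Wilcoxon signed-rank test, normal approximation.

    Differences are post minus pre, so a negative Z means values fell after
    training. Zero differences are dropped; no tie or continuity correction.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w, n


# Hypothetical ITD thresholds (microseconds) for 12 listeners; NOT study data.
pre_itd = [42, 55, 38, 61, 47, 50, 44, 58, 36, 49, 53, 40]
post_itd = [35, 41, 39, 48, 40, 37, 45, 46, 30, 38, 44, 33]

z, n_pairs = wilcoxon_z(pre_itd, post_itd)
print(f"Z = {z:.3f}, r = {z / math.sqrt(n_pairs):.3f}")
```

Statistical packages may apply tie and continuity corrections or use the exact distribution for small samples, so their Z and P values can differ from this approximation.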

  Discussion

The current study explored the effects of localization training in the horizontal plane on spatial acuity measures in listeners with normal hearing. The learning curves (stages completed across sessions) reflected variability in performance across participants. While 7 of the 12 participants completed the training in four sessions, four others took five sessions. This finding shows that spatial cues are learnt at a rapid pace in the initial period of training. Two important conclusions can be drawn from these findings. First, there is a rapid phase of initial learning of spatial cues in the auditory domain. Similar initial rapid improvements were reported by Wright and Zhang,[38] who found that learning to localize altered spatial cues can occur rapidly, within 1–2 h. Second, despite noticeable variability across subjects in learning patterns, the extent of this variability is characteristically small (only one participant took eight sessions to complete training, while the rest did so in 4–5 sessions; [Figure 6]a-d). This finding can be attributed to the equivalence of the participants' spatial processing ability before training, as all the participants considered in the study had normal hearing. Hence, all the participants who underwent training benefited maximally from it in few sessions.

In the current study, the impact of localization training was examined using various psychophysical measures that not only assessed spatial accuracy but also provided complementary evidence on different aspects of spatial processing. Among these measures, the localization error index reflects the participants' ability to distinguish closely placed sound sources. The rms error scores in identifying the spatial locations of the speakers decreased significantly following training. The decline in rms error score is indicative of the benefit derived as a consequence of training. Furthermore, standard deviations in rms error reduced following training, indicating reduced variability in localization. Similar findings were reported by Kuk et al.[35] in adults with hearing impairment who were trained using a home-based localization training regimen.

Analysis of quadrant-wise localization errors revealed that the spatial errors in the front and back hemifields were significantly lower in the posttraining phase than in the pretraining assessment phase. Thus, the training protocol used in the current study remediated front-back confusions, which are the most commonly occurring spatial errors demonstrated by normal listeners.[9] The resolution of front-back reversals reported in the current study is in agreement with the findings of Zahorik et al.,[12] who used localization training with HRTFs. Zahorik et al.[12] reported that a multimodal feedback training procedure leads to enhanced processing of spatial information in the front-back dimension. The authors reported that front-back reversals resolved following brief (two 30-min sessions) training, resulting in rapid perceptual recalibration of auditory space. The improved spatial skill in participants of their study was correlated with improved processing of spectral information, especially in the 3–7 kHz region (using nonindividualized HRTFs). Furthermore, the reduction of errors in the left hemifield was statistically significant in the posttraining condition, whereas that in the right hemifield was not. This asymmetry in the rectification of cone of confusion errors, favoring the left hemifield, supports previous research findings reported in the literature.[38],[39] The research evidence on localization of altered spatial cues delineates differences in spatial adaptation between the hemifields, wherein the adaptations appear to be more complete for stimuli presented in the left than the right hemifield.

VASI and interaural difference thresholds assessed the extent to which localization training can be generalized to other spatial acuity skills. In the current investigation, participants were trained for localization of sound sources in free-field. VASI assesses the participants' ability to localize sounds under headphones. It has been observed that azimuth judgments made under headphones in closed-field by listeners with normal hearing correlate reasonably well with sound location judgments made in free-field.[40] An individual's ability to detect changes in location in VAS reveals his or her sensitivity to cues related to lateralization (HRTF). VASI scores in the posttraining measurement phase were significantly better than those in the pretraining measurement. These results indicate generalization of localization skills to an untrained environment. Generalization of localization skills to untrained situations is reported by other investigators too.[12],[16],[22] Majdak et al.[22] employed spectrally warped and band-limited HRTF stimuli for training sound localization in normal listeners. The results of their study showed that training can improve sound localization in free-field even when altered HRTF-based, spectrally modified (reduced by band-limiting or remapped by warping) cues are employed. On similar lines, Zahorik et al.[12] evaluated the efficacy of an HRTF-based VAS training regimen on sound localization in listeners with normal hearing sensitivity and showed that VAS training (lateralization training) reduced spatial errors (mean unsigned errors) in localization of sound sources in free-field. On the other hand, Ortiz and Wright [16] showed that training on binaural cues (i.e., ITD and ILD) improved not only the trained task (ITD/ILD thresholds) but also generalized to a temporal acuity task (gap detection).

The impact of training on binaural processing skills was assessed by pre- versus post-training comparison of interaural time difference (ITD) and interaural level difference (ILD) thresholds. ITD thresholds significantly improved following localization training. Improvements in ITD thresholds indicate improvement in listeners' sensitivity to temporal information. Similar observations are also reported by other researchers,[17],[19],[41] who found a reduction in listeners' ITD thresholds following successful lateralization training. Research evidence also supports a good relationship between ITD thresholds and localization skills in normal-hearing individuals.[42] Therefore, the benefit of training evidenced as improvement of localization accuracy in free-field is also complemented by changes in ITD thresholds measured in the posttraining evaluation phase. In addition to the positive outcomes of localization training on different spatial acuity measures, the effect size measured in the study was also high, which in turn serves as evidence of the clinical applicability of the training regimen used in the study.

On comparison of the various spatial measures employed in the study, the ITD threshold was the most responsive to localization training (as reflected by its high effect size, r). This was followed by the VASI test and the localization test in free-field. Hence, from the findings of the study, ITD proves to be the most sensitive index for the assessment of changes in spatial skills subsequent to localization training in the horizontal plane.

In concluding, we highlight the research findings from the present study and contrast them with those reported in the literature so that readers can take a closer look at the efficacy of the current training paradigm in remediating spatial acuity deficits. The effect of training on perception of sound direction in the free-field has been examined in several experiments, which showed variable outcomes in sound localization performance. Shinn-Cunningham [43] reported that training using a broadband noise (BBN) stimulus decreased localization errors in the cone of confusion by approximately 1.5° azimuth, which is comparable to the improvement seen with the current training regimen (~1° azimuth). On the other hand, Abel and Paik [44] used BBN and filtered noise to train ten normal-hearing individuals and found minimal improvement in mean correct identification scores, which was not statistically significant. Quadrant-wise analysis of errors in their study showed that percentage correct responses for stimuli originating in the front plane were higher than for those originating from the back plane, while stimuli in the left hemifield were judged more accurately than those presented in the right hemifield. On similar lines, the localization training regimen used in the current study reduced localization errors not only in the front and left hemifields but also in the back plane. In addition, the current training program proves to be more effective than certain long-term training programs such as the one put forth by Recanzone et al.,[45] who observed no improvement on either localization or minimum-audible-angle tasks in two naive listeners over multiple sessions of testing, which adds to the benefits of the short-term training protocol used in the study.

There was a general benefit of the training sessions on almost all the spatial measures considered in the study. The richness of the training regimen, based on systematic variation of a graded hierarchy of stimuli using both temporal and intensity parameters, might have contributed to the observed effect sizes in most of the spatial acuity measures in such a small amount of time. From the current findings, we speculate that the protocol used in the present study can be clinically effective and time efficient. In addition, the current localization training protocol can be advocated to remediate spatial deficits in clinical populations such as individuals with sensorineural hearing loss (SNHL), central auditory processing disorder (CAPD), and auditory neuropathy spectrum disorder (ANSD), who primarily face a multitude of difficulties in day-to-day conversation stemming from spatial processing deficits. According to Jerger,[46] the spatial processing deficits in individuals with CAPD would manifest as loss of ability to separate auditory foreground from auditory background (figure-ground discrimination) and/or as failure to code fine temporal structures necessary for the analysis of speech. In most instances, spatial deficits in individuals with SNHL result in poor intensity and spectral discrimination of elements of speech, whereas spatial deficits in ANSD (which is thought to affect the timing of neural activity in the auditory pathway) usually disrupt aspects of auditory perception based on temporal cues.[47],[48] Therefore, the current training program can be a promising avenue of research in remediating the perceptual consequences of poor spatial encoding in such clinical populations. The present study is a preliminary attempt toward remediating spatial deficits. Although the protocol is effective in normal-hearing listeners (as inferred from generalization of training effects to localization in free- and closed-fields, apart from binaural cue processing), its applicability in clinical populations should be further probed. Further, the findings of the study should be viewed with caution as the study lacks a control group.

  Conclusion

The findings of this study suggest that, even in listeners with normal hearing, the protocol used for training in the current study can bring about improvement in spatial acuity and that training-induced plasticity can further refine auditory spatial capabilities. The localization training protocol used in the present study, on a preliminary basis, proves to be effective in remediating localization errors, especially in the front-back spatial hemifields. Although spatial accuracy improved significantly only in the front-back hemifields, the current training regimen can be advocated in professions involving orientation and/or navigation, where front-back reversals can lead to catastrophic effects.

In addition, the training protocol implemented in the current study also showed a positive impact on multiple facets of spatial performance, implying its utility in resolving different aspects of auditory spatial processing. The localization training protocol used in the current study can be extended to clinical populations such as individuals with SNHL, CAPD, and ANSD, who are bound to have spatial difficulties.


We thank the Director, All India Institute of Speech and Hearing, Mysore, and participants of the study. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.

  References

1. Blauert J. Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge: MIT Press; 1997.
2. King AJ, Schnupp JW, Doubell TP. The shape of ears to come: Dynamic coding of auditory space. Trends Cogn Sci 2001;5:261-70.
3. Koehnke J, Culotta CP, Hawley ML, Colburn HS. Effects of reference interaural time and intensity differences on binaural performance in listeners with normal and impaired hearing. Ear Hear 1995;16:331-53.
4. Noble W, Byrne D, Lepage B. Effects on sound localization of configuration and type of hearing impairment. J Acoust Soc Am 1994;95:992-1005.
5. Rakerd B, Vander Velde TJ, Hartmann WM. Sound localization in the median sagittal plane by listeners with presbyacusis. J Am Acad Audiol 1998;9:466-79.
6. Akeroyd MA. An overview of the major phenomena of the localization of sound sources by normal-hearing, hearing-impaired, and aided listeners. Trends Hear 2014;18. pii: 2331216514560442.
7. Rakerd B, Hartmann WM. Localization of sound in rooms, III: Onset and duration effects. J Acoust Soc Am 1986;80:1695-706.
8. Woodworth RS. Experimental Psychology. New York: Holt; 1938.
9. Best V, Kalluri S, McLachlan S, Valentine S, Edwards B, Carlile S. A comparison of CIC and BTE hearing aids for three-dimensional localization of speech. Int J Audiol 2010;49:723-32.
10. Brimijoin WO, Akeroyd MA. The role of head movements and signal spectrum in an auditory front/back illusion. Iperception 2012;3:179-82.
11. Byrne D, Noble W. Optimizing sound localization with hearing aids. Trends Amplif 1998;3:51-73.
12. Zahorik P, Bangayan P, Sundareswaran V, Wang K, Tam C. Perceptual recalibration in human sound localization: Learning to remediate front-back reversals. J Acoust Soc Am 2006;120:343-59.
13. Kidd G Jr., Arbogast TL, Mason CR, Gallun FJ. The advantage of knowing where to listen. J Acoust Soc Am 2005;118:3804-15.
14. Takahashi T. A novel view of hearing in reverberation. Neuron 2009;62:6-7.
15. Bregman AS. Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, Massachusetts: MIT Press; 1994.
16. Ortiz JA, Wright BA. Contributions of procedure and stimulus learning to early, rapid perceptual improvements. J Exp Psychol Hum Percept Perform 2009;35:188-94.
17. Rowan D, Lutman ME. Learning to discriminate interaural time differences: An exploratory study with amplitude-modulated stimuli. Int J Audiol 2006;45:513-20.
18. Spierer L, Tardif E, Sperdin H, Murray MM, Clarke S. Learning-induced plasticity in auditory spatial representations revealed by electrical neuroimaging. J Neurosci 2007;27:5474-83.
19. Wright BA, Fitzgerald MB. Different patterns of human discrimination learning for two interaural cues to sound-source location. Proc Natl Acad Sci U S A 2001;98:12307-12.
20. Zhang Y, Wright BA. Similar patterns of learning and performance variability for human discrimination of interaural time differences at high and low frequencies. J Acoust Soc Am 2007;121:2207-16.
21. Honda A, Shibata H, Gyoba J, Saitou K, Iwaya Y, Suzuki Y. Transfer effects on sound localization performances from playing a virtual three-dimensional auditory game. Appl Acoust 2007;68:885-96.
22. Majdak P, Walder T, Laback B. Effect of long-term training on sound localization performance with spectrally warped and band-limited head-related transfer functions. J Acoust Soc Am 2013;134:2148-59.
23. Carhart R, Jerger J. Preferred method for clinical determination of pure-tone thresholds. J Speech Hear Disord 1959;24:330-45.
24. Venkatesan S. Ethical Guidelines for Bio-Behavioural Research Involving Human Subjects. Mysore: All India Institute of Speech and Hearing; 2009.
25. Kwon BJ. AUX: A scripting language for auditory signal processing and software packages for psychoacoustic experiments and education. Behav Res Methods 2012;44:361-73.
26. Perception Research Systems. Paradigm Stimulus Presentation; 2007. Available from: http://www.paradigmexperiments.com. [Last accessed on 2015 Jun 05].
27. Gnanateja N. Consonant Confusion Matrix; 2014. Available from: https://www.in.mathworks.com/matlabcentral/fileexchange/46461-consonant-confusion-matrix. [Last accessed on 2016 Sep 14].
28. Matlab version 7.10 (R2010a). Natick, Massachusetts, United States. The Mathworks Inc.; 2010.
29. Pralong D, Carlile S. Generation and validation of virtual auditory space. Virtual Auditory Space: Generation and Application. Berlin Heidelberg: Springer; 1996. p. 109-51.
30. Rothbucher M, Kronmüller D, Durkovic M, Habigt T, Diepold K. HRTF sound localization. In: Strumillo P, editor. Advances in Sound Localization. Rijeka, Croatia: InTech; 2011. p. 79-94.
31. Spatial Auditory Displays Lab. Sound Lab (SLAB3d); 2012. Available from: http://www.slab3d.sonisphere.com/ and http://www.humansystems.arc.nasa.gov/SLAB/. [Last accessed on 2015 Apr 17].
32. Miller JD, Godfroy-Cooper M, Wenzel EM. Using Published HRTFS with Slab3D: Metric-Based Database Selection and Phenomena Observed. 20th International Conference on Auditory Display; 2014.
33. Soranzo A, Grassi M. PSYCHOACOUSTICS: A comprehensive MATLAB toolbox for auditory testing. Front Psychol 2014;5:712.
34. Levitt H. Transformed up-down methods in psychoacoustics. J Acoust Soc Am 1971;49 Suppl 2:467-77.
35. Kuk F, Keenan DM, Lau C, Crose B, Schumacher J. Evaluation of a localization training program for hearing impaired listeners. Ear Hear 2014;35:652-66.
36. Freigang C, Richter N, Rübsamen R, Ludwig AA. Age-related changes in sound localisation ability. Cell Tissue Res 2015;361:371-86.
37. Wightman FL, Kistler DJ. The dominant role of low-frequency interaural time differences in sound localization. J Acoust Soc Am 1992;91:1648-61.
38. Wright BA, Zhang Y. A review of learning with normal and altered sound-localization cues in human adults. Int J Audiol 2006;45 Suppl 1:S92-8.
39. Shinn-Cunningham B. Applications of Virtual Auditory Displays. Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; Vol. 20. 1998. p. 1105-8.
40. Zhou ZH. Sound Localization and Virtual Auditory Space. Project Report University of Toronto. Vol. 19. Toronto; 2002. p. 1-6.
41. Rowan D, Lutman ME. Learning to discriminate interaural time differences at low and high frequencies. Int J Audiol 2007;46:585-94.
42. Tobias JV, Stanley Z. Lateralization threshold as a function of stimulus duration. J Acoust Soc Am 1959;31:1591-4.
43. Shinn-Cunningham B. Learning Reverberation: Considerations for Spatial Auditory Displays. International Conference on Auditory Display: Proceedings of the 2000 International Conference on Auditory Display; Atlanta, GA: 2000. p. 126-34.
44. Abel SM, Paik JE. The benefit of practice for sound localization without sight. Appl Acoust 2004;65:229-41.
45. Recanzone GH, Makhamra SD, Guard DC. Comparison of relative and absolute sound localization ability in humans. J Acoust Soc Am 1998;103:1085-97.
46. Jerger J. Controversial issues in central auditory processing disorders. Semin Hear 1998;19:393-8.
47. Rance G. Auditory neuropathy/dys-synchrony and its perceptual consequences. Trends Amplif 2005;9:1-43.
48. Rance G, Barker E, Mok M, Dowell R, Rincon A, Garratt R. Speech perception in noise for children with auditory neuropathy/dys-synchrony type hearing loss. Ear Hear 2007;28:351-60.




