Journal of the Acoustical Society of America

May 2012

Volume 131, Issue 5, pp. EL355-4232

Sparse regularized regression identifies behaviorally-relevant stimulus features from psychophysical data

Vinzenz H. Schönfelder and Felix A. Wichmann

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 3953-3969 (2012); (17 pages)

As a prerequisite to quantitative psychophysical models of sensory processing it is necessary to learn to what extent decisions in behavioral tasks depend on specific stimulus features, the perceptual cues. Based on relative linear combination weights, this study demonstrates how stimulus-response data can be analyzed in this regard relying on an L1-regularized multiple logistic regression, a modern statistical procedure developed in machine learning. This method prevents complex models from over-fitting to noisy data. In addition, it enforces “sparse” solutions, a computational approximation to the postulate that a good model should contain the minimal set of predictors necessary to explain the data. In simulations, behavioral data from a classical auditory tone-in-noise detection task were generated. The proposed method is shown to precisely identify observer cues from a large set of covarying, interdependent stimulus features—a setting where standard correlational and regression methods fail. The proposed method succeeds for a wide range of signal-to-noise ratios and for deterministic as well as probabilistic observers. Furthermore, the detailed decision rules of the simulated observers were reconstructed from the estimated linear model weights allowing predictions of responses on the basis of individual stimuli.
PACS:
43.66.Ba Models and theories of auditory processes
43.66.Dc Masking
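
The idea in the abstract above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the simulated observer, the feature count, the cue weights (1.5), the penalty strength `lam`, and the proximal-gradient solver are all assumptions chosen for illustration. The sketch simulates a probabilistic observer who relies on only two of six covarying features, then fits a logistic regression with an L1 penalty, which tends to shrink the weights of unused features toward zero.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_observer(n_trials=600, n_features=6, seed=0):
    """Hypothetical probabilistic observer that uses only features 0 and 1.

    Every feature contains a shared component, so the unused features
    still covary with the cues and therefore with the responses.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)
        x = [rng.gauss(0.0, 1.0) + 0.5 * shared for _ in range(n_features)]
        p_yes = sigmoid(1.5 * x[0] + 1.5 * x[1])  # assumed decision rule
        X.append(x)
        y.append(1 if rng.random() < p_yes else 0)
    return X, y


def fit_l1_logistic(X, y, lam=0.02, n_iter=200):
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    # Step size 1/L, where L is bounded by 0.25 * (mean squared feature norm + 1);
    # the "+ 1" accounts for the unpenalised intercept b.
    mean_sq_norm = sum(sum(v * v for v in x) for x in X) / n
    step = 4.0 / (mean_sq_norm + 1.0)
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= step * grad_b / n
        # Gradient step followed by soft-thresholding enforces sparsity in w.
        w = [soft_threshold(w[j] - step * grad_w[j] / n, step * lam) for j in range(d)]
    return w, b
```

Calling `fit_l1_logistic(*simulate_observer())` returns one weight per feature; in this toy setting the two largest weights in magnitude should belong to the two features the simulated observer actually used. In practice `lam` would be chosen by cross-validation rather than fixed by hand.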

Comparing models of the combined-stimulation advantage for speech recognition

Christophe Micheyl and Andrew J. Oxenham

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 3970-3980 (2012); (11 pages) | Cited 1 time

The “combined-stimulation advantage” refers to an improvement in speech recognition when cochlear-implant or vocoded stimulation is supplemented by low-frequency acoustic information. Previous studies have been interpreted as evidence for “super-additive” or “synergistic” effects in the combination of low-frequency and electric or vocoded speech information by human listeners. However, this conclusion was based on predictions of performance obtained using a suboptimal high-threshold model of information combination. The present study shows that a different model, based on Gaussian signal detection theory, can predict surprisingly large combined-stimulation advantages, even when performance with either information source alone is close to chance, without involving any synergistic interaction. A reanalysis of published data using this model reveals that previous results, which have been interpreted as evidence for super-additive effects in perception of combined speech stimuli, are actually consistent with a more parsimonious explanation, according to which the combined-stimulation advantage reflects an optimal combination of two independent sources of information. The present results do not rule out the possible existence of synergistic effects in combined stimulation; however, they emphasize the possibility that the combined-stimulation advantages observed in some studies can be explained simply by non-interactive combination of two information sources.
PACS:
43.66.Ba Models and theories of auditory processes
43.66.Ts Auditory prostheses, hearing aids
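
The contrast drawn in the abstract above can be sketched numerically. This is a textbook equal-variance Gaussian signal-detection sketch, not necessarily the exact model used in the paper. The m-alternative task structure, the function names, and the use of the probability-summation rule as a stand-in for a high-threshold prediction are all assumptions. Under these assumptions, two independent sources combine optimally as d'_combined = sqrt(d'_a^2 + d'_b^2).

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pc_mafc(d_prime, m, half_width=8.0, n_steps=1600):
    """Proportion correct in an m-alternative task (equal-variance Gaussian observer).

    Midpoint-rule integration of pdf(x - d') * cdf(x) ** (m - 1).
    """
    lo = d_prime - half_width
    dx = 2.0 * half_width / n_steps
    total = 0.0
    for i in range(n_steps):
        x = lo + (i + 0.5) * dx
        total += norm_pdf(x - d_prime) * norm_cdf(x) ** (m - 1)
    return total * dx


def d_prime_from_pc(pc, m, upper=10.0):
    """Invert pc_mafc by bisection; pc_mafc increases monotonically with d'."""
    lo, hi = 0.0, upper
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if pc_mafc(mid, m) < pc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def combine_independent(pc_a, pc_b, m):
    """Predicted proportion correct for optimal combination of two independent sources."""
    d_a = d_prime_from_pc(pc_a, m)
    d_b = d_prime_from_pc(pc_b, m)
    return pc_mafc(math.hypot(d_a, d_b), m)  # sqrt(d_a**2 + d_b**2)


def probability_summation(pc_a, pc_b):
    """Probability-summation rule, used here as an assumed high-threshold-style prediction."""
    return pc_a + pc_b - pc_a * pc_b
```

For given single-source scores and a hypothetical number of alternatives `m`, comparing `combine_independent(pc_a, pc_b, m)` with `probability_summation(pc_a, pc_b)` shows how much the two prediction rules can differ without assuming any interaction between the sources.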

Binaural loudness summation for speech presented via earphones and loudspeaker with and without visual cues

Michael Epstein and Mary Florentine

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 3981-3988 (2012); (8 pages)

Preliminary data [M. Epstein and M. Florentine, Ear. Hear. 30, 234–237 (2009)] obtained using speech stimuli from a visually present talker heard via loudspeakers in a sound-attenuating chamber indicate little difference in loudness when listening with one or two ears (i.e., significantly reduced binaural loudness summation, BLS), which is known as “binaural loudness constancy.” These data challenge current understanding drawn from laboratory measurements that indicate a tone presented binaurally is louder than the same tone presented monaurally. Twelve normal listeners were presented recorded spondees, monaurally and binaurally across a wide range of levels via earphones and a loudspeaker with and without visual cues. Statistical analyses of binaural-to-monaural ratios of magnitude estimates indicate that the amount of BLS is significantly less for speech presented via a loudspeaker with visual cues than for stimuli with any other combination of test parameters (i.e., speech presented via earphones or a loudspeaker without visual cues, and speech presented via earphones with visual cues). These results indicate that the loudness of a visually present talker in daily environments is little affected by switching between binaural and monaural listening. This supports the phenomenon of binaural loudness constancy and underscores the importance of ecological validity in loudness research.
PACS:
43.66.Cb Loudness, absolute threshold
43.66.Pn Binaural hearing
43.66.Ba Models and theories of auditory processes
43.66.Lj Perceptual effects of sound

Further evidence that fundamental-frequency difference limens measure pitch discrimination

Christophe Micheyl, Claire M. Ryan, and Andrew J. Oxenham

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 3989-4001 (2012); (13 pages) | Cited 2 times

Difference limens for complex tones (DLCs) that differ in F0 are widely regarded as a measure of periodicity-pitch discrimination. However, because F0 changes are inevitably accompanied by changes in the frequencies of the harmonics, DLCs may actually reflect the discriminability of individual components. To test this hypothesis, DLCs were measured for complex tones, the component frequencies of which were shifted coherently upward or downward by ΔF = 0%, 25%, 37.5%, or 50% of the F0, yielding fully harmonic (ΔF = 0%), strongly inharmonic (ΔF = 25%, 37.5%), or odd-harmonic (ΔF = 50%) tones. If DLCs truly reflect periodicity-pitch discriminability, they should be larger (worse) for inharmonic tones than for harmonic and odd harmonic tones because inharmonic tones have a weaker pitch. Consistent with this prediction, the results of two experiments showed a non-monotonic dependence of DLCs on ΔF, with larger DLCs for ΔF’s of ±25% or ±37.5% than for ΔF’s of 0 or ±50% of F0. These findings are consistent with models of pitch perception that involve harmonic templates or with an autocorrelation-based model provided that more than just the highest peak in the summary autocorrelogram is taken into account.
PACS:
43.66.Hg Pitch
43.66.Fe Discrimination: intensity and frequency
43.66.Ba Models and theories of auditory processes
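
The stimulus manipulation described in the abstract above can be made concrete with a short worked example. It is a minimal sketch that assumes each component sits at (n + ΔF) × F0, with ΔF expressed as a fraction of F0; the F0 value and harmonic numbers used below are hypothetical, not the study's parameters.

```python
def shifted_components(f0_hz, shift_fraction, harmonic_numbers):
    """Component frequencies of a complex tone shifted coherently by shift_fraction * F0.

    shift_fraction = 0.0 gives a harmonic tone; 0.5 gives odd harmonics of F0 / 2;
    negative values shift the components downward.
    """
    return [(n + shift_fraction) * f0_hz for n in harmonic_numbers]


# Hypothetical example: F0 = 200 Hz, harmonics 1 to 3.
harmonic = shifted_components(200.0, 0.0, [1, 2, 3])   # [200.0, 400.0, 600.0]
odd_only = shifted_components(200.0, 0.5, [1, 2, 3])   # [300.0, 500.0, 700.0]
```

With a shift of 50% of F0, the components at 300, 500, and 700 Hz are the 3rd, 5th, and 7th harmonics of 100 Hz, which is why the abstract calls that condition odd-harmonic.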

Identification of walked-upon materials in auditory, kinesthetic, haptic, and audio-haptic conditions

Bruno L. Giordano, Yon Visell, Hsin-Yun Yao, Vincent Hayward, Jeremy R. Cooperstock, and Stephen McAdams

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4002-4012 (2012); (11 pages) | Cited 1 time

Locomotion generates multisensory information about walked-upon objects. How perceptual systems use such information to get to know the environment remains unexplored. The ability to identify solid (e.g., marble) and aggregate (e.g., gravel) walked-upon materials was investigated in auditory, haptic or audio-haptic conditions, and in a kinesthetic condition where tactile information was perturbed with a vibromechanical noise. Overall, identification performance was better than chance in all experimental conditions and for both solids and the better identified aggregates. Despite large mechanical differences between the response of solids and aggregates to locomotion, for both material categories discrimination was at its worst in the auditory and kinesthetic conditions and at its best in the haptic and audio-haptic conditions. An analysis of the dominance of sensory information in the audio-haptic context supported a focus on the most accurate modality, haptics, but only for the identification of solid materials. When identifying aggregates, response biases appeared to produce a focus on the least accurate modality—kinesthesia. When walking on loose materials such as gravel, individuals do not perceive surfaces by focusing on the most accurate modality, but by focusing on the modality that would most promptly signal postural instabilities.
PACS:
43.66.Lj Perceptual effects of sound
43.66.Wv Vibration and tactile senses
43.66.Jh Timbre, timbre in musical acoustics
43.66.Ba Models and theories of auditory processes

Temporal predictions based on a gradual change in tempo

Thomas E. Cope, Manon Grube, and Timothy D. Griffiths

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4013-4022 (2012); (10 pages) | Cited 1 time

Previous studies investigating sensitivity to step changes in tempo and prediction of tone onset time have generally utilized isochronous sequences. This study investigates subjects’ ability to detect deviations from a gradual change in the tempo of a tone sequence (experiment 1) and their judgment of the perceptually optimal timing of this tone (experiment 2). In experiment 1, inter-onset-intervals within pairs of eight-tone sequences followed a geometric progression to create a gradual tempo change. In one sequence, the final tone was presented either earlier or later than specified by the progression. Subjects performed well at detecting deviations that exaggerated the tempo progression but poorly when it was counteracted. Experiment 2 used similar pairs except that the final tone was always presented earlier in one sequence than the other. Final interval length was adaptively adjusted to subjects’ judgments; it was adjudged in best agreement with the progression when its length was roughly half way between the mathematically correct value and the length of the penultimate interval. The data support “multiple-look” and entrainment models of tempo sensitivity and suggest that temporal prediction is based less on the tempo contour of a whole sequence than on the duration of the preceding interval.
PACS:
43.66.Mk Temporal and sequential aspects of hearing; auditory grouping in relation to music
43.75.Cd Music perception and cognition
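
The sequence construction described in the abstract above can be sketched in a few lines. This is an illustration only: the starting interval and ratio below are hypothetical values, not the study's parameters, and `halfway_final_interval` simply computes the midpoint that the abstract reports as roughly matching listeners' judgments.

```python
def geometric_iois(first_ioi_ms, ratio, n_tones=8):
    """Inter-onset intervals for n_tones tones; each interval is ratio times the previous."""
    return [first_ioi_ms * ratio ** k for k in range(n_tones - 1)]


def onset_times(iois):
    """Tone onset times (ms), starting at 0, from a list of inter-onset intervals."""
    onsets = [0.0]
    for ioi in iois:
        onsets.append(onsets[-1] + ioi)
    return onsets


def halfway_final_interval(iois):
    """Midpoint between the progression's final interval and the penultimate interval."""
    return 0.5 * (iois[-1] + iois[-2])


# Hypothetical accelerating eight-tone sequence: 500 ms start, each interval 5% shorter.
iois = geometric_iois(500.0, 0.95)
print(onset_times(iois))
print(iois[-1], halfway_final_interval(iois))
```

The last line prints the mathematically correct final interval next to the halfway value, so the size of the reported bias toward the penultimate interval can be read off directly for any chosen ratio.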

The three-channel model of sound localization mechanisms: Interaural level differences

Rachel N. Dingle, Susan E. Hall, and Dennis P. Phillips

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4023-4029 (2012); (7 pages) | Cited 2 times

The current understanding of mammalian sound localization is that azimuthal (horizontal) position assignments are dependent upon the relative activation of two populations of broadly-tuned hemifield neurons with overlapping medial borders. Recent psychophysical work has provided evidence for a third channel of low-frequency interaural time difference (ITD)-sensitive neurons tuned to the azimuthal midline. However, the neurophysiological data on free-field azimuth receptive fields, especially of cortical neurons, has primarily studied high-frequency cells whose receptive fields are more likely to have been shaped by interaural level differences (ILDs) than ITDs. In four experiments, a selective adaptation paradigm was used to probe for the existence of a midline channel in the domain of ILDs. If no midline channel exists, symmetrical adaptation of the lateral channels should not result in a shift in the perceived intracranial location of subsequent test tones away from the adaptors because the relative activation of the two channels will remain unchanged. Instead, results indicate a shift in perceived test tone location away from the adaptors, which supports the existence of a midline channel in the domain of ILDs. Interestingly, this shift occurs not only at high frequencies, traditionally associated with ILDs in natural settings, but at low frequencies as well.
PACS:
43.66.Qp Localization of sound sources
43.66.Pn Binaural hearing
43.66.Ba Models and theories of auditory processes
43.66.Ed Auditory fatigue, temporary threshold shift

Across-site patterns of modulation detection: Relation to speech recognition

Soha N. Garadat, Teresa A. Zwolan, and Bryan E. Pfingst

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4030-4041 (2012); (12 pages) | Cited 1 time

The aim of this study was to identify across-site patterns of modulation detection thresholds (MDTs) in subjects with cochlear implants and to determine if removal of sites with the poorest MDTs from speech processor programs would result in improved speech recognition. Five hundred millisecond trains of symmetric-biphasic pulses were modulated sinusoidally at 10 Hz and presented at a rate of 900 pps using monopolar stimulation. Subjects were asked to discriminate a modulated pulse train from an unmodulated pulse train for all electrodes in quiet and in the presence of an interleaved unmodulated masker presented on the adjacent site. Across-site patterns of masked MDTs were then used to construct two 10-channel MAPs such that one MAP consisted of sites with the best masked MDTs and the other MAP consisted of sites with the worst masked MDTs. Subjects’ speech recognition skills were compared when they used these two different MAPs. Results showed that MDTs were variable across sites and were elevated in the presence of a masker by various amounts across sites. Better speech recognition was observed when the processor MAP consisted of sites with best masked MDTs, suggesting that temporal modulation sensitivity has important contributions to speech recognition with a cochlear implant.
PACS:
43.66.Ts Auditory prostheses, hearing aids
43.66.Fe Discrimination: intensity and frequency
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech

Beneficial acoustic speech cues for cochlear implant users with residual acoustic hearing

Anisa S. Visram, Mahan Azadpour, Karolina Kluk, and Colette M. McKay

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4042-4050 (2012); (9 pages) | Cited 2 times

This study investigated which acoustic cues within the speech signal are responsible for bimodal speech perception benefit. Seven cochlear implant (CI) users with usable residual hearing at low frequencies in the non-implanted ear participated. Sentence tests were performed in near-quiet (some noise on the CI side to reduce scores from ceiling) and in a modulated noise background, with the implant alone and with the addition, in the hearing ear, of one of four types of acoustic signals derived from the same sentences: (1) a complex tone modulated by the fundamental frequency (F0) and amplitude envelope contours; (2) a pure tone modulated by the F0 and amplitude contours; (3) a noise-vocoded signal; (4) unprocessed speech. The modulated tones provided F0 information without spectral shape information, whilst the vocoded signal presented spectral shape information without F0 information. For the group as a whole, only the unprocessed speech condition provided significant benefit over implant-alone scores, in both near-quiet and noise. This suggests that, on average, F0 or spectral cues in isolation provided limited benefit for these subjects in the tested listening conditions, and that the significant benefit observed in the full-signal condition was derived from implantees’ use of a combination of these cues.
PACS:
43.66.Ts Auditory prostheses, hearing aids
43.71.Ky Speech perception by the hearing impaired

Estimating head-related transfer functions of human subjects from pressure–velocity measurements

Marko Hiipakka, Teemu Kinnari, and Ville Pulkki

J. Acoust. Soc. Am. Volume 131, Issue 5, pp. 4051-4061 (2012); (11 pages)

Direct measurements of individual head-related transfer functions (HRTFs) with a probe microphone at the eardrum are unpleasant, risky, and unreliable and therefore have not been widely used. Instead, the HRTFs are commonly measured from the blocked ear canal entrance, which excludes the effects of the individual ear canals and eardrums. This paper presents a method that allows obtaining individually correct magnitude frequency responses of HRTFs at the eardrum from pressure–velocity (PU) measurements at the ear canal entrance with a miniature PU sensor. The HRTFs of 25 test subjects with nine directions of sound incidence were estimated using real anechoic measurements and an energy-based estimation method. To validate the approach, measurements were also conducted with probe microphones near the eardrums as well as at blocked ear canal entrances. Comparisons between the different methods show that the method presented is a valid and reliable technique for obtaining magnitude frequency responses of HRTFs. The HRTF filters designed using the PU measurements are also shown to yield more correct frequency responses at the eardrum than the filters designed using measurements from the blocked ear canal entrance.
PACS:
43.66.Yw Instruments and methods related to hearing and its measurement
43.58.Fm Sound level meters, level recorders, sound pressure, particle velocity, and sound intensity measurements, meters, and controllers
43.20.Mv Waveguides, wave propagation in tubes and ducts