Effects of temporally fluctuating maskers on speech production and communication
Several decades of research have investigated how speech produced in noise varies as a function of noise level, task type, the spectral content of the noise, and other properties. Recent work on how listeners track speech in noise has found that neural oscillations entrain to the envelope modulations of the attended stream, and that speakers produce more pronounced modulations when speaking in noise. To date, however, no study has investigated how different noise types affect amplitude modulations in speech, or whether talkers can adapt those modulations to the noise environment.
The aim of this study is to expand on the finding that speakers produce more pronounced modulations when speaking in noise, and to test more generally whether speakers can adapt to temporal fluctuations in the masker. Pairs of normal-hearing adults aged 18 to 35 (N=40) are audio recorded while completing a sentence repetition task. The talkers are seated in adjacent booths and communicate via headset in a ‘virtual room’ that simulates the acoustics of a real room, including reverberation and sound/speaker locality (www.phon.ucl.ac.uk/resource/audio3d/). One talker (‘Talker A’) reads Harvard sentences to Talker B, who repeats back what they heard in 4 noise conditions (speech-shaped noise modulated by 1 Hz, 4 Hz, and 8 Hz square waves for an ‘on-off’ effect) presented at 80 dB, as well as in a quiet condition. Several global acoustic-phonetic measures are taken of Talker A’s speech, including f0 median and range, articulation rate, and mean energy. Two temporal measures are also taken: a comparison of speech energy in masker ‘on’ periods with energy in masker ‘off’ periods, and the modulation spectrum of Talker A’s speech. It is predicted that speech produced in the presence of the 1 Hz and 4 Hz maskers will show more pronounced modulations in the amplitude envelope at 1 Hz and 4 Hz respectively, as talkers adjust to speak in the ‘gaps’, whereas speech produced with the 8 Hz masker, whose rate is faster than a normal syllable rate, will not show this effect. The results will further understanding of how, and to what extent, talkers interact with temporally fluctuating noise.
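As an illustration only, the two temporal measures described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the study's analysis pipeline: the function names, the 50% duty cycle, the 8 kHz sampling rate, and the synthetic tone standing in for Talker A's speech are all hypothetical, and the single-bin envelope measure is a simplified stand-in for a full modulation spectrum.

```python
import math

def square_gate(n_samples, fs, rate_hz):
    """0/1 gate marking masker 'on' (1) and 'off' (0) samples.
    Assumes a 50% duty cycle; the abstract does not state one."""
    period = fs / rate_hz  # samples per on-off cycle
    return [1 if (i % period) < period / 2 else 0 for i in range(n_samples)]

def on_off_energy_db(speech, gate):
    """Mean speech energy in masker-on samples relative to masker-off samples, in dB."""
    on = [s * s for s, g in zip(speech, gate) if g]
    off = [s * s for s, g in zip(speech, gate) if not g]
    return 10 * math.log10((sum(on) / len(on)) / (sum(off) / len(off)))

def modulation_magnitude(signal, fs, f_mod):
    """Magnitude of the rectified envelope at one modulation frequency
    (a single DFT bin; a simplification of a modulation spectrum)."""
    env = [abs(s) for s in signal]
    mean = sum(env) / len(env)
    re = sum((e - mean) * math.cos(2 * math.pi * f_mod * i / fs) for i, e in enumerate(env))
    im = sum((e - mean) * math.sin(2 * math.pi * f_mod * i / fs) for i, e in enumerate(env))
    return 2 * math.hypot(re, im) / len(env)

# Toy data (hypothetical): a 200 Hz tone stands in for speech. The "talker"
# speaks at full amplitude in the masker gaps and at half amplitude otherwise.
fs = 8000
gate = square_gate(2 * fs, fs, rate_hz=4)
speech = [(0.5 if g else 1.0) * math.sin(2 * math.pi * 200 * i / fs)
          for i, g in enumerate(gate)]

ratio_db = on_off_energy_db(speech, gate)        # negative: less energy while masker is on
mag_4hz = modulation_magnitude(speech, fs, 4.0)  # envelope energy at the masker rate
mag_8hz = modulation_magnitude(speech, fs, 8.0)  # comparison rate
print(ratio_db, mag_4hz, mag_8hz)
```

In this toy case a negative on/off ratio and a larger envelope magnitude at 4 Hz than at 8 Hz correspond to the predicted pattern of speaking in the ‘gaps’. With real recordings, the gate would need to be time-aligned with the masker as actually presented.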