The effect of cognitive noise on duration, intensity and pitch discrimination of a synthesised vocoid
Research on sub-optimal listening conditions has been concerned primarily with speech-in-noise, where the acoustic signal is affected by energetic and/or informational masking. Less is known about “cognitive noise” (CN), a load-based adverse condition in which listening takes place alongside a concurrent non-auditory attentional or mnemonic task. Although the concurrent task creates no acoustic interference with the speech signal itself, it is thought to place demands on cognitive resources and thereby deplete those available for speech perception. The effect of CN on sentence comprehension, word segmentation, and phoneme identification has been documented. Its effect on basic hearing processes, however, is poorly understood. Our study investigates the effect of CN on low-level auditory perception, specifically the discrimination of duration, intensity, and pitch cues.
Ninety-six participants were randomly assigned to one of three discrimination tasks: Duration, Intensity, or Pitch (n = 32 in each). Discrimination was assessed with a 3I-2AFC test, which adaptively estimated the just-noticeable difference (jnd) for the assigned dimension. The stimuli were derived from a 500-ms Klatt-synthesised base vocoid resembling /ɑ/, with F0 150 Hz, F1 836 Hz, F2 1152 Hz, F3 2741 Hz, played at 60 dB. Deviant stimuli varied in Duration (500–800 ms), Intensity (60–70 dB), or Pitch (150–153 Hz) across 60 equidistant steps. CN was imposed via a secondary visual n-back task implementing two types of load, Rhyme and Image. Rhyme CN stimuli were written monosyllabic nonwords; Image CN stimuli were un-nameable, meaningless characters. In the Rhyme condition, participants pressed a key every time they saw a nonword that rhymed with the preceding nonword (1-back, low CN) or with the nonword two steps earlier in the sequence (2-back, high CN). In the Image condition, they responded to an image that repeated either the immediately preceding image (1-back) or the image two steps earlier, that is, with one intervening image (2-back).
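As an illustration only, the stimulus continua and the n-back target rule described above might be sketched as follows. The function names `deviant_continuum` and `n_back_targets` are hypothetical, and the sketch assumes that the 60 steps exclude the base value and end exactly at the stated maximum; the abstract does not specify this, nor the adaptive rule used to estimate the jnd, which is therefore not modelled here.

```python
def deviant_continuum(base, maximum, n_steps=60):
    """Return n_steps equally spaced deviant values above `base`.

    Assumption (not stated in the abstract): the base value itself is
    excluded and the last step equals `maximum`.
    """
    step = (maximum - base) / n_steps
    return [base + k * step for k in range(1, n_steps + 1)]


def n_back_targets(sequence, n, match=lambda a, b: a == b):
    """Flag each position whose item matches the item n positions earlier.

    `match` defaults to identity (the Image condition). For the Rhyme
    condition a rhyme judgement would have to be supplied by the caller;
    none is implemented here.
    """
    return [
        i >= n and match(sequence[i], sequence[i - n])
        for i in range(len(sequence))
    ]


# Continua under the stated assumption.
duration_ms = deviant_continuum(500.0, 800.0)   # 5 ms per step
intensity_db = deviant_continuum(60.0, 70.0)    # about 0.17 dB per step
pitch_hz = deviant_continuum(150.0, 153.0)      # 0.05 Hz per step

# Toy image sequence; the letters stand in for the un-nameable characters.
images = ["A", "B", "B", "A", "B"]
low_cn_targets = n_back_targets(images, n=1)    # 1-back: consecutive repeats
high_cn_targets = n_back_targets(images, n=2)   # 2-back: one item in between
```

Under these assumptions the smallest available deviant differs from the base by 5 ms, roughly 0.17 dB, or 0.05 Hz, which sets the resolution at which a jnd could be measured on each dimension.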
CN significantly increased jnds for Duration and Intensity, but not for Pitch. This pattern held for both Image and Rhyme CN. The results show that CN, just like physical noise, affects the precision with which the primary dimensions of sound are processed. The apparent encapsulation of pitch from cognitive load suggests that this dimension is processed more automatically than duration and intensity. Consequences for theories of speech perception in adverse conditions will be discussed.