Task-dependent decoding and encoding of sounds in natural auditory scenes based on modulation transfer functions
Recent studies have used decoding and encoding analyses to examine the neural processing of natural, continuous sounds measured with EEG and MEG in multi-speaker environments. A robust finding is that the (attended) sound envelope is strongly represented in these cortical signals. However, it remains unclear whether some acoustic features of the sound are represented more strongly in cortex than others, and if so, which.
In this study, participants (N = 17) were presented with natural listening situations comprising two speakers and music while performing selective attention tasks, and EEG signals were acquired. We used a decoding and encoding approach based on sound descriptions by modulation transfer functions (MTFs), specifically temporal modulations of 2-32 Hz (logarithmically spaced).
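For illustration, the sketch below shows one way such a temporal modulation decomposition could be computed: the broadband envelope is band-pass filtered around log-spaced centre rates between 2 and 32 Hz. The Hilbert-based envelope, Butterworth filters, and half-octave bandwidth are assumptions for the example, not the exact filterbank used in the study.

```python
# Minimal sketch (assumed pipeline, not the authors' implementation):
# decompose a sound envelope into temporal modulation bands at 2-32 Hz.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def temporal_modulations(audio, fs, rates=None, bandwidth_octaves=0.5):
    """Band-pass filter the broadband envelope around each modulation rate."""
    if rates is None:
        rates = np.geomspace(2.0, 32.0, num=5)      # 2, 4, 8, 16, 32 Hz (log-spaced)
    envelope = np.abs(hilbert(audio))               # one simple envelope estimate
    bands = []
    for rate in rates:
        lo = rate * 2.0 ** (-bandwidth_octaves)     # assumed half-octave bandwidth
        hi = rate * 2.0 ** (+bandwidth_octaves)
        b, a = butter(2, [lo, hi], btype="bandpass", fs=fs)
        bands.append(filtfilt(b, a, envelope))
    return rates, np.stack(bands)                   # (n_rates, n_samples)
```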
Temporal response functions (TRFs) were estimated to reconstruct each temporal modulation band of the sound from the EEG data (decoding) and to predict EEG responses from the MTFs (encoding).
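As a rough illustration of the TRF framework, the following sketch estimates a ridge-regularised linear mapping over time-lagged regressors; with stimulus features as input it acts as an encoding (forward) model, and with EEG channels as input it acts as a decoding (backward) model. The lag range, regularisation parameter, and analysis rate in the usage note are hypothetical, not the study's parameters.

```python
# Minimal sketch (assumed implementation, not the authors' code): ridge-regularised
# TRF estimation with time-lagged regressors, usable for encoding (stimulus -> EEG)
# or, with the roles of x and y swapped, for decoding (EEG -> stimulus band).
import numpy as np

def lagged_design(x, lags):
    """Stack time-shifted copies of x (n_samples, n_features), one block per lag (in samples)."""
    n, f = x.shape
    X = np.zeros((n, f * len(lags)))
    for i, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, i * f:(i + 1) * f] = x[:n - lag]
        else:
            X[:lag, i * f:(i + 1) * f] = x[-lag:]
    return X

def fit_trf(x, y, lags, alpha=1.0):
    """Closed-form ridge regression: weights mapping lagged x to y."""
    X = lagged_design(x, lags)
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ y)

# Hypothetical usage: predict EEG (n_samples, n_channels) from modulation bands
# (n_samples, n_rates) using lags of 0-400 ms at a 64 Hz analysis rate.
# lags = np.arange(0, int(0.4 * 64))
# weights = fit_trf(modulation_bands, eeg, lags, alpha=10.0)
# eeg_pred = lagged_design(modulation_bands, lags) @ weights
```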
Decoding results show better reconstruction when based on temporal modulations than when based on sound envelopes, at all temporal rates. Attention effects for speech and music decoding were strongest at faster rates (4-8 Hz).
Encoding models based on temporal rates performed similarly to envelope-based encoding. For speech, the encoding models suggest strong differences between attended and unattended sounds at specific rates and delays. For music, the models were independent of attentional state.
These results shed more light on previously reported envelope-based descriptions of EEG data and point towards distinct processing of attended and unattended speech sounds, as characterized by temporal modulation rates and delays. Ongoing analyses including spectral modulations will provide additional insights.