On the articulation between acoustic and semantic uncertainty in speech perception: Investigating the interaction between sources of information in perceptual classification.
Listeners processing speech signals have to deal with two main sources of uncertainty: acoustic-contextual information and linguistic knowledge. On the acoustic side, Ladefoged & Broadbent (1957; see also Sjerps & McQueen, 2013) showed that modifying the resonant frequencies of a carrier sentence affects the interpretation of a given vowel. For example, in the sentence "Please say what this word is: bet", raising F1 across the whole sentence, excluding the final word, leads listeners to identify this final word as "bit" rather than "bet". Listeners also take linguistic information into account, including lexical hypotheses based on word co-occurrence probabilities and semantic relations (e.g. McClelland & Elman, 1986). A tension can then arise between these two sources of information, and the uncertainty associated with each of them must be resolved in order to reach a perceptual decision. It is not clear, however, how these sources of uncertainty are weighted in perception.
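One way to make this weighting problem concrete, purely as an illustration and not as a model the present study commits to, is Bayesian cue combination: the posterior probability of each lexical candidate is proportional to the acoustic likelihood of the signal times a semantic prior derived from the context. A minimal sketch (all probability values are invented for illustration):

```python
# Illustrative sketch of Bayesian weighting of acoustic and semantic evidence.
# The numbers are hypothetical; the study does not presuppose this model.

def combine(acoustic_likelihood, semantic_prior):
    """Posterior over word candidates: normalised product of the two cues."""
    unnorm = {w: acoustic_likelihood[w] * semantic_prior[w]
              for w in acoustic_likelihood}
    total = sum(unnorm.values())
    return {w: p / total for w, p in unnorm.items()}

# A fully ambiguous token halfway between /bal/ and /bEl/ ...
acoustic = {"bal": 0.5, "bEl": 0.5}
# ... in a context that semantically favours "balle"
semantic = {"bal": 0.8, "bEl": 0.2}

posterior = combine(acoustic, semantic)
print(posterior)  # with an uninformative acoustic cue, the prior decides
```

When the acoustic cue is uninformative, the semantic prior carries the decision; conversely, a sharply unambiguous acoustic token would swamp the prior, which is exactly the trade-off the experiment is designed to probe.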
We are currently setting up an experiment in which we plan to investigate this issue by independently manipulating (1) semantic relationships between words and (2) acoustic relations between a contextual part and the final word of the sentence. For example, based on word pairs that contrast only in their target vowel and whose vowels are close to each other in articulatory / acoustic space (e.g. French "balle" vs. "belle", pronounced /bal/ vs. /bEl/ – eng. "ball" vs. "beauty"), three types of sentences are generated: (1) a sentence that semantically "primes" the word /bal/ ("Le joueur a dévié la", eng. "The player deflected the"), (2) a sentence that favours the word /bEl/ ("Le prince a charmé la", eng. "The prince charmed the"), and (3) a sentence that is neutral and / or semantically incongruous with both words ("Le journaliste a parlé de la", eng. "The journalist talked about the").
We will present listeners with fully ambiguous final words (acoustically located between e.g. /bal/ and /bEl/) in contexts where semantic influence varies (sentence types 1/2/3) and is balanced against acoustic manipulations of formant frequencies favouring one word or the other. This will allow us to describe how these sources of information interact in vowel classification. We are currently selecting sentence and word materials using French word-embedding models (Mikolov et al., 2013) in order to obtain a set of materials that meets our design constraints. We would greatly benefit from discussing these materials and their modes of selection during the conference.
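The embedding-based selection step can be sketched as ranking candidate context words by cosine similarity to each target word. The 3-dimensional vectors below are invented for illustration; in practice they would come from French word2vec models trained on large corpora:

```python
from math import sqrt

# Hypothetical toy embeddings; real vectors would be 100-300 dimensions,
# loaded from a pre-trained French word2vec model.
vectors = {
    "balle":       (0.9, 0.1, 0.0),
    "belle":       (0.1, 0.9, 0.0),
    "joueur":      (0.8, 0.2, 0.1),
    "prince":      (0.2, 0.8, 0.1),
    "journaliste": (0.4, 0.4, 0.8),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Rank candidate context words by similarity to each target vowel word.
for target in ("balle", "belle"):
    ranking = sorted(
        (w for w in vectors if w not in ("balle", "belle")),
        key=lambda w: cosine(vectors[target], vectors[w]),
        reverse=True,
    )
    print(target, "->", ranking)
# balle -> ['joueur', 'journaliste', 'prince']
# belle -> ['prince', 'journaliste', 'joueur']
```

A ranking of this kind would let one pick "priming" context words that are close to one target but distant from the other, and "neutral" words that are roughly equidistant from both.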