Auditory speech can be difficult to understand, but seeing the articulatory movements of a speaker can drastically improve spoken-word recognition and, in the longer term, helps listeners adapt to acoustically distorted speech. Given that individuals with developmental dyslexia (DD) have sometimes been reported to rely less on lip-read speech than typical readers, we examined lip-read-driven adaptation to distorted speech in a group of adults with DD (N = 29) and a comparison group of typical readers (N = 29). Participants were presented with acoustically distorted Dutch words (six-channel noise-vocoded speech, NVS) in audiovisual training blocks (where the speaker could be seen) interspersed with audio-only test blocks.
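For readers unfamiliar with noise-vocoded speech, the sketch below illustrates the general channel-vocoding technique (band-pass filtering, envelope extraction, and envelope-modulated noise). The filter orders, logarithmic channel spacing, envelope cutoff, and frequency range are illustrative assumptions, not the parameters used in the study above.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(signal, fs, n_channels=6, f_lo=100.0, f_hi=8000.0):
        """Replace the fine structure in each frequency band with noise,
        keeping only the band's amplitude envelope (channel vocoding)."""
        signal = np.asarray(signal, dtype=float)
        f_hi = min(f_hi, 0.45 * fs)                       # stay below Nyquist
        edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
        noise = np.random.randn(len(signal))
        out = np.zeros_like(signal)
        env_sos = butter(2, 30.0, btype="lowpass", fs=fs, output="sos")
        for lo, hi in zip(edges[:-1], edges[1:]):
            band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            band = sosfiltfilt(band_sos, signal)
            env = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed envelope
            out += np.clip(env, 0.0, None) * sosfiltfilt(band_sos, noise)
        # Roughly match the loudness of the original signal.
        return out * np.sqrt(np.mean(signal ** 2) / np.mean(out ** 2))

With n_channels=6 the output keeps the slow amplitude cues listeners can learn to use, while discarding spectral fine structure.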
Trust is critical to human social interaction, and research has identified many cues that contribute to the perception of this social trait. Two of these cues are the pitch of the voice and the width-to-height ratio of the face (fWHR). Additionally, research has indicated that the content of a spoken sentence itself affects perceived trustworthiness, a finding that has not yet been brought into multisensory research.
Our percept of the world is not solely determined by what we perceive and process at a given moment in time, but also depends on what we processed recently. In the present study, we investigate whether the perceived emotion of a spoken sentence is contingent upon the emotion of an auditory stimulus on the preceding trial (i.e.
When listening to distorted speech, does one become a better listener by looking at the face of the speaker or by reading subtitles that are presented along with the speech signal? We examined this question in two experiments in which we presented participants with spectrally distorted speech (4-channel noise-vocoded speech). During short training sessions, listeners received auditorily distorted words or pseudowords that were partially disambiguated by concurrently presented lipread information or text. After each training session, listeners were tested with new degraded auditory words.
Humans' extraordinary ability to understand speech in noise relies on multiple processes that develop with age. Using magnetoencephalography (MEG), we characterize the underlying neuromaturational basis by quantifying how cortical oscillations in 144 participants (aged 5-27 years) track phrasal and syllabic structures in connected speech mixed with different types of noise. While the extraction of prosodic cues from clear speech was stable during development, its maintenance in a multi-talker background matured rapidly up to age 9 and was associated with speech comprehension.
We investigated how aging modulates lexico-semantic processes in the visual (seeing written items), auditory (hearing spoken items) and audiovisual (seeing written items while hearing congruent spoken items) modalities. Participants were young and older adults who performed a delayed lexical decision task (LDT) presented in blocks of visual, auditory, and audiovisual stimuli. Event-related potentials (ERPs) revealed differences between young and older adults despite older adults' ability to identify words and pseudowords as accurately as young adults.
Spoken language comprehension is a fundamental component of our cognitive skills. We are quite proficient at deciphering words from the auditory input despite the fact that the speech we hear is often masked by noise such as background babble originating from talkers other than the one we are attending to. To perceive spoken language as intended, we rely on prior linguistic knowledge and context.
Background: One potentially relevant neurophysiological marker of internalizing problems (anxiety/depressive symptoms) is the late positive potential (LPP), as it is related to processing of emotional stimuli. For the first time, to our knowledge, we investigated the value of the LPP as a neurophysiological marker for internalizing problems and specific anxiety and depressive symptoms, at preschool age.
Method: At age 4 years, children (N = 84) passively viewed a series of neutral, pleasant, and unpleasant pictures selected from the International Affective Picture System.
Humans quickly adapt to variations in the speech signal. Adaptation may surface as recalibration, a learning effect driven by error-minimisation between a visual face and an ambiguous auditory speech signal, or as selective speech adaptation, a contrastive aftereffect driven by the acoustic clarity of the sound. Here, we examined whether these aftereffects occur for vowel identity and voice gender.
Lip-reading is crucial for understanding speech in challenging conditions. But how the brain extracts meaning from silent visual speech is still under debate. Lip-reading in silence activates the auditory cortices, but it is not known whether such activation reflects immediate synthesis of the corresponding auditory stimulus or imagery of unrelated sounds.
In two experiments, we investigated the relationship between lexical access processes and processes that are specifically related to making lexical decisions. In Experiment 1, participants performed a standard lexical decision task in which they had to respond as quickly and as accurately as possible to visual (written), auditory (spoken) and audiovisual (written + spoken) items. In Experiment 2, a different group of participants performed the same task but were required to make responses after a delay.
Speech perception is influenced by vision through a process of audiovisual integration. This is demonstrated by the McGurk illusion where visual speech (for example /ga/) dubbed with incongruent auditory speech (such as /ba/) leads to a modified auditory percept (/da/). Recent studies have indicated that perception of the incongruent speech stimuli used in McGurk paradigms involves mechanisms of both general and audiovisual speech specific mismatch processing and that general mismatch processing modulates induced theta-band (4-8 Hz) oscillations.
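As a purely illustrative aside, induced (non-phase-locked) theta power of the kind referred to above is commonly estimated by removing the trial-averaged evoked response before computing single-trial power. The sketch below shows one common filter-Hilbert variant; the array layout, filter order, and band edges are assumptions, not the analysis pipeline of the study.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def induced_theta_power(epochs, fs, band=(4.0, 8.0)):
        """epochs: array of shape (n_trials, n_samples) for one channel.
        Returns the trial-averaged induced theta power over time."""
        epochs = np.asarray(epochs, dtype=float)
        # Subtract the evoked (phase-locked) response from every trial.
        induced = epochs - epochs.mean(axis=0, keepdims=True)
        # Band-pass the residual single trials in the theta range.
        sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
        filtered = sosfiltfilt(sos, induced, axis=1)
        # Power of the analytic signal, averaged across trials.
        return (np.abs(hilbert(filtered, axis=1)) ** 2).mean(axis=0)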
Although the default state of the world is that we see and hear other people talking, there is evidence that seeing and hearing ourselves rather than someone else may lead to visual (i.e., lip-read) or auditory "self" advantages.
Hyperscanning refers to obtaining simultaneous neural recordings from more than one person (Montague et al., 2002 [1]) that can be used to study interactive situations. In particular, hyperscanning with electroencephalography (EEG) is becoming increasingly popular since it allows researchers to explore the interactive brain with a high temporal resolution.
Perception of vocal affect is influenced by the concurrent sight of an emotional face. We demonstrate that the sight of an emotional face also can induce recalibration of vocal affect. Participants were exposed to videos of a 'happy' or 'fearful' face in combination with a slightly incongruous sentence with ambiguous prosody.
Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was similar to suppression obtained with fully congruent stimuli, whereas P2 suppression for combinations was larger.
Perceiving linguistic input is vital for human functioning, but the process is complicated by the fact that the incoming signal is often degraded. However, humans can compensate for unimodal noise by relying on simultaneous sensory input from another modality. Here, we investigated noise-compensation for spoken and printed words in two experiments.
Lip-read speech suppresses and speeds up the auditory N1 and P2 peaks, but these effects are not always observed or reported. Here, the robustness of lip-read-induced N1/P2 suppression and facilitation in phonetically congruent audiovisual speech was assessed by analyzing peak values that were taken from published plots and individual data. To determine whether adhering to the additive model of AV integration (i.
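The additive model mentioned above compares the audiovisual (AV) response against the sum of the unimodal auditory (A) and visual (V) responses, with amplitude suppression and latency facilitation quantified at the N1 and P2 peaks. The sketch below is a minimal illustration of that comparison; the peak windows (N1 roughly 80-140 ms, P2 roughly 160-240 ms) and the array layout are assumptions rather than the paper's exact procedure.

    import numpy as np

    def peak_in_window(erp, times, t_min, t_max, polarity):
        """Return (latency, amplitude) of the most extreme point in a window."""
        mask = (times >= t_min) & (times <= t_max)
        win_erp, win_times = erp[mask], times[mask]
        idx = np.argmin(win_erp) if polarity == "neg" else np.argmax(win_erp)
        return win_times[idx], win_erp[idx]

    def additive_model_peaks(erp_av, erp_a, erp_v, times):
        """Compare AV peaks against the additive (A + V) prediction."""
        erp_sum = erp_a + erp_v
        n1_av, n1_sum = (peak_in_window(x, times, 0.08, 0.14, "neg")
                         for x in (erp_av, erp_sum))
        p2_av, p2_sum = (peak_in_window(x, times, 0.16, 0.24, "pos")
                         for x in (erp_av, erp_sum))
        return {
            # Positive values mean a smaller (suppressed) or earlier (speeded)
            # peak for AV relative to the A + V prediction.
            "N1_suppression": n1_av[1] - n1_sum[1],   # N1 is negative-going
            "N1_facilitation": n1_sum[0] - n1_av[0],
            "P2_suppression": p2_sum[1] - p2_av[1],
            "P2_facilitation": p2_sum[0] - p2_av[0],
        }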
Although infant speech perception is often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., speech sounds they hear are accompanied by articulating faces).
Auditory lexical processing starts within 200 ms after onset of the critical stimulus. Here, we used electroencephalography (EEG) to investigate whether (1) the so-called N200 effect can be triggered by single-item lexical context, and (2) such effects are robust against temporal violations of the signal. We presented items in which lexical status (i.
The correspondence between auditory speech and lip-read information can be detected based on a combination of temporal and phonetic cross-modal cues. Here, we determined the point in developmental time at which children start to effectively use phonetic information to match a speech sound with one of two articulating faces. We presented 4- to 11-year-olds (N = 77) with three-syllabic sine-wave speech replicas of two pseudo-words that were perceived as non-speech and asked them to match the sounds with the corresponding lip-read video.
Neuropsychologia, January 2014
Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read-induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech mode), while the other half was in non-speech mode.
Infants and adults are well able to match auditory and visual speech, but the cues on which they rely (viz. temporal, phonetic and energetic correspondence in the auditory and visual speech streams) may differ. Here we assessed the relative contribution of the different cues using sine-wave speech (SWS).
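Sine-wave speech, used in several of the studies above, replaces the natural speech signal with a few time-varying sinusoids that follow the formant tracks, stripping away natural voice quality while preserving much of the phonetic information. The sketch below shows one simple way to synthesize SWS from pre-estimated formant frequency and amplitude tracks; the frame rate, the normalisation step, and the assumption that formants are estimated elsewhere (e.g., with a tool such as Praat) are illustrative choices, not the stimulus-generation procedure of these studies.

    import numpy as np

    def synthesize_sws(formant_freqs, formant_amps, fs, frame_rate=100.0):
        """formant_freqs, formant_amps: arrays of shape (n_formants, n_frames),
        giving frequency (Hz) and linear amplitude per analysis frame."""
        formant_freqs = np.asarray(formant_freqs, dtype=float)
        formant_amps = np.asarray(formant_amps, dtype=float)
        n_formants, n_frames = formant_freqs.shape
        n_samples = int(n_frames * fs / frame_rate)
        t_frames = np.arange(n_frames) / frame_rate
        t_samples = np.arange(n_samples) / fs
        out = np.zeros(n_samples)
        for k in range(n_formants):
            # Upsample the frame-rate tracks to the audio sample rate.
            f = np.interp(t_samples, t_frames, formant_freqs[k])
            a = np.interp(t_samples, t_frames, formant_amps[k])
            # Integrate instantaneous frequency to obtain a continuous phase.
            phase = 2.0 * np.pi * np.cumsum(f) / fs
            out += a * np.sin(phase)
        return out / np.max(np.abs(out))   # normalise to avoid clipping

Typically the first three formants (F1-F3) are used, which is enough for listeners in "speech mode" to hear the replicas as speech.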
Auditory phoneme categories are less well-defined in readers with developmental dyslexia than in fluent readers. Here, we examined whether poor recalibration of phonetic boundaries might be associated with this deficit. Twenty-two adult dyslexic readers were compared with 22 fluent readers on a phoneme identification task and a task that measured phonetic recalibration by lipread speech (Bertelson, Vroomen, & De Gelder, 2003).