When visual speech (mouth movements) is mismatched with auditory speech (sounds), perception can be altered and listeners may understand an entirely different message. New research suggests a potential mechanism behind this phenomenon, known as the McGurk effect.
The study, published in the journal PLOS Computational Biology, details the algorithm used by neuroscience researchers who investigated this perceptual illusion.
McGurk Effect, Explained By Researchers
The research could provide a better understanding of patients who suffer from speech perception deficits and could also help in building computers able to understand both visual and auditory speech.
Because virtually everyone grows up exposed to countless examples of speech, the human brain generally learns how likely particular auditory and visual speech cues are to occur together. In conversation, it also picks out the relevant speech patterns, extracting the message without any perceived effort.
"In everyday situations we are frequently confronted with multiple talkers emitting auditory and visual speech cues, and the brain must decide whether or not to integrate a particular combination of voice and face," noted Dr. Michael Beauchamp, professor of neurosurgery at Baylor College of Medicine and senior author of the study.
The McGurk effect describes a situation in which understanding a message does not come so naturally. When subjects watch mouth movements that do not match the accompanying sounds, the brain can distort what it hears and perceive a completely different syllable. Yet when the same subjects close their eyes and rely on auditory speech alone, they hear the intended message correctly.
The team of scientists also designed an algorithmic model of multisensory speech perception that operates on the principle of causal inference. When a visual syllable and an auditory syllable arrive together, the model calculates whether the stimuli are more likely to come from a single talker or from separate sources, and determines the final percept on the basis of that likelihood.
The scientists compared their model with an alternative that was identical in every respect except one: it always integrated the available cues, applying no causal inference when interpreting speech.
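To make the contrast concrete, here is a minimal sketch, in Python, of this kind of causal inference over a pair of noisy cues. It is not the study's published model: reducing each syllable to a single number on a one-dimensional feature axis, the Gaussian noise model, and every parameter value are illustrative assumptions, and the forced_fusion_percept function merely stands in for the comparison model described above.

```python
"""Illustrative sketch of audiovisual causal inference, NOT the study's code.

Assumptions (for readability only): each syllable is reduced to a point on a
1-D feature axis, and the auditory and visual cues are noisy Gaussian
measurements of that point, with a Gaussian prior of mean 0 over syllables.
"""
import math


def gauss(x, mean, var):
    """Gaussian probability density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def causal_inference_percept(x_aud, x_vis,
                             var_aud=1.0, var_vis=1.0, var_prior=25.0,
                             p_common=0.5):
    """Estimate the perceived syllable feature, weighing 'one talker'
    (integrate the cues) against 'separate sources' (keep the auditory cue)."""
    # Likelihood that both cues arose from a single underlying syllable.
    denom = var_aud * var_vis + var_aud * var_prior + var_vis * var_prior
    like_one = math.exp(-0.5 * ((x_aud - x_vis) ** 2 * var_prior
                                + x_aud ** 2 * var_vis
                                + x_vis ** 2 * var_aud) / denom) \
        / (2 * math.pi * math.sqrt(denom))

    # Likelihood that the cues came from two independent sources.
    like_two = gauss(x_aud, 0.0, var_aud + var_prior) \
        * gauss(x_vis, 0.0, var_vis + var_prior)

    # Posterior probability that there is one talker (a common cause).
    post_one = p_common * like_one / (p_common * like_one
                                      + (1 - p_common) * like_two)

    # Reliability-weighted fusion of the cues (used if one talker);
    # the prior term drops from the numerator because its mean is 0.
    w_aud, w_vis, w_pri = 1 / var_aud, 1 / var_vis, 1 / var_prior
    fused = (x_aud * w_aud + x_vis * w_vis) / (w_aud + w_vis + w_pri)

    # Auditory-only estimate (used if the cues are judged unrelated).
    aud_only = (x_aud * w_aud) / (w_aud + w_pri)

    # Final percept: average the two strategies by their posterior weights.
    return post_one * fused + (1 - post_one) * aud_only, post_one


def forced_fusion_percept(x_aud, x_vis,
                          var_aud=1.0, var_vis=1.0, var_prior=25.0):
    """Comparison model: always integrates the cues, with no causal inference."""
    w_aud, w_vis, w_pri = 1 / var_aud, 1 / var_vis, 1 / var_prior
    return (x_aud * w_aud + x_vis * w_vis) / (w_aud + w_vis + w_pri)


if __name__ == "__main__":
    # A McGurk-like pairing: the auditory and visual cues disagree
    # (the feature values here are arbitrary).
    x_aud, x_vis = 1.0, 3.0
    percept, p_one = causal_inference_percept(x_aud, x_vis)
    print(f"causal inference: percept={percept:.2f}, P(one talker)={p_one:.2f}")
    print(f"forced fusion:    percept={forced_fusion_percept(x_aud, x_vis):.2f}")
```

In this toy version, as the cues become more strongly mismatched the causal-inference model shifts weight toward the auditory estimate, because a single talker becomes less plausible, while the forced-fusion model blends the two cues regardless.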
"Although there is variability in the frequency with which different subjects report the illusory McGurk percept and the efficacy of different stimuli in evoking it, we are not aware of any reports of the inverse McGurk stimuli evoking an illusory percept, as predicted by the non-CIMS model," noted the research.
One Step Of Many To Follow
A better understanding of how the human brain combines information perceived through multiple senses could help address the decline in speech perception that typically accompanies aging. Further research building on this discovery could even lead to a device that improves hearing throughout a person's entire life.
"These results are important because speech is the most important form of human communication and is fundamentally multisensory, making use of both visual information from the talker's face and the auditory information from the talker's voice," also noted the research.