Does a bag of potato chips in a silent video make a sound? It does, if you're an engineer -- researchers at MIT, Microsoft and Adobe have been working out how to reverse-engineer video to recreate the sound that accompanied it. Vibrations captured on high-resolution video can be used to reconstruct the sound of an event, even if no audio was recorded.
The technique can recover sound from vibrations as slight as the rustling of a bag of potato chips, the researchers say.
Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham Mysore, Frédo Durand, and William T. Freeman collaborated on a paper that will be presented in August at SIGGRAPH 2014. The paper is currently available online on MIT's website. They found that, using only high-resolution video, they were able to recreate the sound around objects in the video from the vibrations captured onscreen. They shot the video at 6,000 frames per second, but they found that the technique could also work, though not as well, on consumer cameras shooting at 60 frames per second.
Using nothing but a bag of potato chips with a cell phone next to it, the research team was able to recreate a person singing "Mary Had a Little Lamb" on the cell phone by tracking the vibrations of the bag. The technique could be used to discreetly reconstruct phone conversations after they take place, without any audio or video of the person talking.
This "allows us to turn everyday objects -- a glass of water, a potted plant, a box of tissues, or a bag of chips -- into visual microphones," according to the paper. "Remarkably, it is possible to recover comprehensible speech and music in a room from just a video of a bag of chips."
The technology needs considerable refinement before it can be marketed, but it could be of interest for espionage and eavesdropping. Some fairly sophisticated surveillance methods already exist, including a patent for a "Technique and device for through-the-wall audio surveillance" based on technology developed at NASA. Still, the new technology developed at MIT has the potential to be a very useful tool for harvesting information.
"The motion of this vibration creates a very subtle visual signal that's usually invisible to the naked eye," Davis wrote in the paper. "People didn't realize that this information was there."