DIVID: New Tool to Detect AI-Generated Videos

Researchers at Columbia University's School of Engineering and Applied Science have introduced DIVID, a new tool designed to identify AI-generated videos with 93.7% accuracy.
The research team led by Computer Science Professor Junfeng Yang developed DIVID, short for DIffusion-generated VIdeo Detector, in response to the increasing sophistication of AI-generated content.
AI-generated videos have become remarkably realistic, making it difficult for both human observers and existing detection systems to distinguish real footage from artificially created content.
Earlier AI models such as generative adversarial networks (GANs) could be detected through visible anomalies such as pixel irregularities and unnatural movements. Newer techniques such as diffusion models, by contrast, produce videos of such fidelity that distinguishing them from real footage has become exceptionally difficult, according to the research team.
DIVID builds upon the team's previous work with Raidar, a tool designed to detect AI-generated text by analyzing linguistic patterns rather than delving into the mechanics of AI models like GPT-4 or Gemini.
Raidar works by asking another AI model to rewrite a given text and counting how many edits it makes. Fewer edits indicate a higher likelihood of machine generation, because AI models tend to judge AI-written text as already coherent and leave it largely untouched.
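To make the idea concrete, here is a minimal sketch of that edit-counting heuristic in Python. It is not Raidar's published implementation: the rewritten text is assumed to come from prompting some LLM to rewrite the input, and the 0.15 decision threshold is an arbitrary placeholder.

```python
import difflib

def modification_ratio(original: str, rewritten: str) -> float:
    """Fraction of the original text that the rewriting model changed."""
    matcher = difflib.SequenceMatcher(None, original.split(), rewritten.split())
    # ratio() returns word-level similarity in [0, 1]; 1 - ratio() approximates the edit rate.
    return 1.0 - matcher.ratio()

def looks_ai_generated(original: str, rewritten: str, threshold: float = 0.15) -> bool:
    """Few edits by the rewriting model -> the input was likely machine-generated."""
    return modification_ratio(original, rewritten) < threshold

# Usage: `rewritten` would come from prompting an LLM to rewrite or "improve" the input.
original = "The committee has decided to postpone the vote until next week."
rewritten = "The committee decided to postpone the vote until next week."
print(modification_ratio(original, rewritten))  # small value -> few edits -> likely AI-written input
```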
The DIRE Technique
Applying a similar principle, DIVID employs the DIRE (DIffusion Reconstruction Error) technique to scrutinize diffusion-generated videos. This method measures the disparity between an input video and its reconstruction by a pre-trained diffusion model: diffusion-generated videos sit close to the model's learned distribution and are reconstructed with low error, whereas real footage is not, so a low reconstruction error flags a video as likely AI-generated.
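The scoring step can be sketched roughly as follows. This is an illustrative outline rather than the authors' code: `noise_and_reconstruct` stands in for the invert-and-denoise pass of whatever pre-trained diffusion model is used, and the decision threshold is a placeholder that would need to be calibrated on real data.

```python
import torch

def dire_score(frames: torch.Tensor, noise_and_reconstruct) -> torch.Tensor:
    """
    frames: (T, C, H, W) video frames scaled to [0, 1].
    noise_and_reconstruct: callable that inverts each frame into the diffusion
        model's noise space and denoises it back (placeholder for a real model).
    Returns the mean per-frame reconstruction error (the DIRE value).
    """
    with torch.no_grad():
        reconstructed = noise_and_reconstruct(frames)            # (T, C, H, W)
        per_frame_error = (frames - reconstructed).abs().mean(dim=(1, 2, 3))
    return per_frame_error.mean()

def flag_as_ai_generated(frames, noise_and_reconstruct, threshold=0.05) -> bool:
    # Low reconstruction error -> the frames lie close to the diffusion model's
    # learned distribution, the signature of diffusion-generated video.
    return dire_score(frames, noise_and_reconstruct).item() < threshold
```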
DIVID aims to enhance the detection capabilities necessary to combat the proliferation of deceptive visual content by focusing on the inherent differences between AI-generated and real videos.
Yang, co-director of the Software Systems Lab, emphasized that the insight behind Raidar is not limited to text and carries over to visual media. With AI-generated video becoming increasingly realistic, the team set out to apply that insight to build a tool that detects AI-generated videos accurately.
The evolution of AI technology, particularly in video synthesis through diffusion models like Sora by OpenAI and Runway Gen-2, underscores the urgency for robust detection mechanisms like DIVID.
These models progressively refine each video frame from random noise, achieving unprecedented realism and challenging conventional detection methodologies reliant on surface-level anomalies.
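For intuition, that refinement amounts to a denoising loop: a frame begins as pure Gaussian noise and is repeatedly nudged toward the model's estimate of a clean image. The toy loop below only illustrates the idea; the `denoiser` callable and the linear noise schedule are simplifications, not the mechanics of any specific system such as Sora or Runway Gen-2.

```python
import torch

def generate_frame(denoiser, shape=(3, 64, 64), steps=50) -> torch.Tensor:
    """Toy reverse-diffusion loop: start from random noise, refine step by step."""
    x = torch.randn(shape)                         # pure Gaussian noise
    for t in reversed(range(steps)):
        noise_level = (t + 1) / steps              # simple linear schedule (illustrative only)
        predicted_clean = denoiser(x, noise_level)  # model's estimate of the clean frame
        # Move the current sample a small step toward the predicted clean frame.
        x = x + (predicted_clean - x) / (t + 1)
    return x.clamp(0, 1)
```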
Through DIVID, Columbia's researchers ultimately aim to mitigate the risks posed by AI-generated videos in various contexts, including fraud prevention and maintaining the integrity of digital content.
"The insight in Raidar-that the output from an AI is often considered high-quality by another AI so it will make fewer edits-is really powerful and extends beyond just text," Yang said in a statement.
"Given that AI-generated video is becoming more and more realistic, we wanted to take the Raidar insight and create a tool that can detect AI-generated videos accurately," he added.
The research team's findings were published on the arXiv preprint server.