Vidu: China Reveals Revolutionary Text-to-Video Generator to Rival OpenAI's Sora

With the unveiling of this text-to-video generator, Shengshu Technology and Tsinghua University have demonstrated their commitment to pushing the boundaries of AI technology.

This partnership highlights the growing importance of AI research and development in China and its potential impact on various industries worldwide.


China's Next Step in AI Innovation

Vidu, developed jointly by Shengshu Technology and Tsinghua University, represents a significant milestone in China's AI innovation journey.

This collaboration brings together the expertise of a tech startup and an esteemed academic institution to create a cutting-edge text-to-video generator.

Unveiled at the Zhongguancun Forum in Beijing, Vidu has garnered attention as a noteworthy competitor to OpenAI's Sora.

Interesting Engineering reported that, whereas Sora can produce videos up to 60 seconds long, Vidu lets users generate shorter, high-definition 16-second clips with a single click.

While Vidu's functionality may seem limited compared to Sora, its introduction marks a significant step forward in China's AI technology landscape.

As the country continues to invest in AI research and development, Vidu exemplifies China's commitment to innovation and technological advancement.

Zhu Jun, the chief scientist at Shengshu and deputy dean at Tsinghua's Institute for AI, described Vidu as a significant advancement in self-reliant innovation, boasting breakthroughs in various domains.

Vidu is characterized by its imaginative capabilities, ability to simulate the physical world, and capacity to generate 16-second videos with consistent characters, scenes, and timelines.

Furthermore, Zhu highlighted Vidu's proficiency in understanding "Chinese elements." During the model's debut, Shengshu Technology presented several demonstrations, including scenarios such as a panda playing a guitar on grass and a puppy swimming in a pool.

Advancements in Vidu's Architectural Framework

Vidu is built on a proprietary vision transformer architecture called the Universal Vision Transformer (U-ViT). According to its developers, this architecture combines two AI modeling approaches: diffusion and the Transformer.

Furthermore, this architectural framework facilitates the creation of lifelike videos featuring dynamic camera movements, intricate facial expressions, and authentic lighting and shadow effects.

Zhu noted that the introduction of Sora resonated with their technical direction, intensifying their resolve to continue their research efforts.


Unlike the many Chinese counterparts to OpenAI's ChatGPT that appeared soon after its November 2022 launch, Chinese competitors to Sora have only recently begun to catch up to its capabilities.

Experts in the industry attribute this delay to the significant challenge of insufficient computing power for Chinese companies.

According to Li Yangwei, a Beijing-based technical consultant specializing in intelligent computing, running Sora requires eight NVIDIA A100 graphics processing units (GPUs) for over three hours to generate a one-minute video clip.

Li notes that Sora demands extensive computing power for inference.
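To put Li's figures in perspective, the short Python sketch below converts them into GPU-hours. It relies only on the numbers quoted above (eight GPUs, three hours, a one-minute clip). Because Li said "over three hours," the result is a lower bound. The function and constant names are illustrative and do not come from any Sora or Vidu tooling.

```python
# Back-of-the-envelope estimate using the figures attributed to Li Yangwei.
# These numbers are as reported in the article, not independently measured.

GPUS = 8              # NVIDIA A100 GPUs cited by Li
HOURS_PER_CLIP = 3.0  # "over three hours", so treat as a lower bound
CLIP_SECONDS = 60     # one-minute video clip


def gpu_hours(gpus: int, hours: float) -> float:
    """Total GPU-hours consumed when `gpus` devices run for `hours`."""
    return gpus * hours


def gpu_hours_per_video_second(gpus: int, hours: float, clip_seconds: int) -> float:
    """GPU-hours needed for each second of generated video."""
    return gpu_hours(gpus, hours) / clip_seconds


total = gpu_hours(GPUS, HOURS_PER_CLIP)
per_second = gpu_hours_per_video_second(GPUS, HOURS_PER_CLIP, CLIP_SECONDS)

print(f"At least {total:.0f} GPU-hours per one-minute clip")
print(f"At least {per_second:.1f} GPU-hours per second of video")
```

By this estimate, a single one-minute clip takes at least 24 A100 GPU-hours, or roughly 0.4 GPU-hours for every second of footage.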
