ByteDance unveils 2 new video-generation AI models to narrow gap with OpenAI’s Sora

By South China Morning Post | Created at 2024-09-24 11:43:16 | Updated at 2024-09-30 21:25:12

TikTok owner ByteDance has launched two new large language models (LLMs) – the technology underpinning generative artificial intelligence (AI) applications like ChatGPT – designed for creating videos based on text and image prompts, as Chinese tech firms look to catch up with the advances made by OpenAI’s Sora.

The new Doubao-PixelDance and Doubao-Seaweed LLMs – part of the Doubao family of AI models, which share the same name as the Doubao chatbot that ByteDance introduced last year – will be available early this October, according to Tan Dai, president of ByteDance cloud unit Volcano Engine.

The Doubao-PixelDance model, which is able to handle complex and sequential motions, can produce 10-second videos, while the Doubao-Seaweed model can generate clips of up to 30 seconds, according to Volcano Engine’s website.

The addition of video-generation AI models to the Doubao LLM family “has benefited from the capabilities of understanding videos accumulated by Douyin and Jianying over the years”, Tan said at an event in Shenzhen on Tuesday, referring respectively to the Chinese version of TikTok and to ByteDance’s popular video-editing app, which is known as CapCut outside mainland China.

Tan’s demonstration at the event showed that both new AI models were able to generate videos that simulate real-life scenes, like a first-person view of driving a car, as well as fictional clips such as a winged frog flying and a floating island.
